From dholmes at openjdk.java.net Mon Mar 1 02:42:39 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 1 Mar 2021 02:42:39 GMT
Subject: RFR: 8262443: GenerateOopMap::do_interpretation can spin for a long time.
In-Reply-To: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com>
References: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com>
Message-ID:

On Fri, 26 Feb 2021 08:50:38 GMT, Robbin Ehn wrote:

> With Safepoint/Handshake timeout enabled in rare cases this methods spins for a long time, blocking safepoints/handshakes, so timeout (with a long delay) is triggered.
>
> In some cases we are in native while executing this method and in some in vm.
> That's why there is an check for state in vm.
>
> Tested with other changes in t-1-7 this specific case of timeout is no longer an issue.
> This change-set passes T1 stand alone.

Looks good. Minor request below.

Thanks,
David

src/hotspot/share/oops/generateOopMap.cpp line 918:

> 916:       ThreadBlockInVM tbivm(thread->as_Java_thread());
> 917:     }
> 918:   }

Can you add a comment as to why this is necessary please.

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2742

From dholmes at openjdk.java.net Mon Mar 1 03:03:39 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 1 Mar 2021 03:03:39 GMT
Subject: RFR: JDK-8262472: Buffer overflow in UNICODE::as_utf8 for zero length output buffer
In-Reply-To:
References:
Message-ID:

On Fri, 26 Feb 2021 17:47:12 GMT, Thomas Stuefe wrote:

> This one is trivial and probably inconsequential, but lets fix it anyway.
>
> There is a buffer overflow in both variants of UNICODE::as_utf8, where in case of truncation due to a zero length output buffer the terminating zero still gets written.
>
> Added fix + gtest. Ran gtest.

Hi Thomas,

I'd rather treat passing a zero-length buffer as a programming error and assert the length is non-zero, rather than penalizing every correct call with an unnecessary precondition check.

Cheers,
David

-------------

PR: https://git.openjdk.java.net/jdk/pull/2753

From stuefe at openjdk.java.net Mon Mar 1 05:28:08 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Mon, 1 Mar 2021 05:28:08 GMT
Subject: RFR: JDK-8262472: Buffer overflow in UNICODE::as_utf8 for zero length output buffer
In-Reply-To:
References:
Message-ID:

On Mon, 1 Mar 2021 03:01:13 GMT, David Holmes wrote:

> Hi Thomas,
>
> I'd rather treat passing a zero-length buffer as a programming error and assert the length is non-zero, rather than penalizing every correct call with an unnecessary precondition check.
>
> Cheers,
> David

Hi David,

okay, I changed it to an assert. I looked at the callers and think this should be okay, but I am not perfectly sure. Let's hope we hit all cases with our tests.

Cheers, Thomas

-------------

PR: https://git.openjdk.java.net/jdk/pull/2753

From stuefe at openjdk.java.net Mon Mar 1 05:28:08 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Mon, 1 Mar 2021 05:28:08 GMT
Subject: RFR: JDK-8262472: Buffer overflow in UNICODE::as_utf8 for zero length output buffer [v2]
In-Reply-To:
References:
Message-ID:

> This one is trivial and probably inconsequential, but lets fix it anyway.
>
> There is a buffer overflow in both variants of UNICODE::as_utf8, where in case of truncation due to a zero length output buffer the terminating zero still gets written.
>
> Added fix + gtest. Ran gtest.

Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:

  assert instead

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/2753/files
  - new: https://git.openjdk.java.net/jdk/pull/2753/files/598212e0..ebc602cd

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2753&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2753&range=00-01

  Stats: 8 lines in 2 files changed: 0 ins; 4 del; 4 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2753.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2753/head:pull/2753

PR: https://git.openjdk.java.net/jdk/pull/2753

From dholmes at openjdk.java.net Mon Mar 1 05:33:39 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Mon, 1 Mar 2021 05:33:39 GMT
Subject: RFR: JDK-8262472: Buffer overflow in UNICODE::as_utf8 for zero length output buffer [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 1 Mar 2021 05:28:08 GMT, Thomas Stuefe wrote:

>> This one is trivial and probably inconsequential, but lets fix it anyway.
>>
>> There is a buffer overflow in both variants of UNICODE::as_utf8, where in case of truncation due to a zero length output buffer the terminating zero still gets written.
>>
>> Added fix + gtest. Ran gtest.
>
> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
>
>   assert instead

Fine by me.

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2753 From lucy at openjdk.java.net Mon Mar 1 08:14:59 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Mon, 1 Mar 2021 08:14:59 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: Message-ID: On Thu, 11 Feb 2021 08:34:09 GMT, Tobias Hartmann wrote: >> Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: >> >> comment changes requested by TheRealMDoerr > > Changes requested by thartmann (Reviewer). @TobiHartmann @dholmes-ora @veresov May I kindly ask you to revisit your comments and potentially approve my changes? I believe I have addressed all you concerns and suggestions (as per my comment from Feb 16). Thank you! Lutz ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From akozlov at openjdk.java.net Mon Mar 1 08:15:40 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 1 Mar 2021 08:15:40 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list In-Reply-To: References: Message-ID: On Sat, 27 Feb 2021 02:11:48 GMT, Yasumasa Suenaga wrote: > HotSpot generates CPU description when it is started. We can see it `jdk.CPUInformation` JFR event as below: > > $ jfr print --events jdk.CPUInformation raspi4.jfr > jdk.CPUInformation { > startTime = 22:57:13.521 > cpu = "AArch64" > description = "AArch64 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > `description` contains "AArch64", it is fixed value, we cannot guess the process was run on what machine (SoC). > > In Linux, we can use `compatible`property in device tree to guess the machine. The 'compatible' property contains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with sorted from most compatible to least. 
>
> After this change, we can get the description as below:
>
> jdk.CPUInformation {
>   startTime = 00:32:49.767
>   cpu = "AArch64"
>   description = "raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc"
>   sockets = 4
>   cores = 4
>   hwThreads = 4
> }
>
> In Linux on AMD64, we can see as following, then we can guess the CPU model from it. The same should do for AArch64.
>
> jdk.CPUInformation {
>   startTime = 17:28:03.907
>   cpu = "AMD (null) (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 SSE4A AMD64"
>   description = "Brand: AMD Ryzen 3 3300X 4-Core Processor , Vendor: AuthenticAMD
>   Family: (0x17), Model: (0x71), Stepping: 0x0
>   Ext. family: 0x8, Ext. model: 0x7, Type: 0x0, Signature: 0x00870f10
>   Features: ebx: 0x01020800, ecx: 0xfed83203, edx: 0x178bfbff
>   Ext. features: eax: 0x00870f10, ebx: 0x20000000, ecx: 0x004003f3, edx: 0x2fd3fbff
>   Supports: On-Chip FPU, Virtual Mode Extensions, Debugging Extensions, Page Size Extensions, Time Stamp Counter, Model Specific Registers, Physical Address Extension, Machine Check Exceptions, CMPXCHG8B Instruction, On-Chip APIC, Fast System Call, Memory Type Range Registers, Page Global Enable, Machine Check Architecture, Conditional Mov Instruction, Page Attribute Table, 36-bit Page Size Extension, CLFLUSH Instruction, Intel Architecture MMX Technology, Fast Float Point Save and Restore, Streaming SIMD extensions, Streaming SIMD extensions 2, Hyper Threading, Streaming SIMD Extensions 3, PCLMULQDQ, Supplemental Streaming SIMD Extensions 3, Fused Multiply-Add, CMPXCHG16B, Streaming SIMD extensions 4.1, Streaming SIMD extensions 4.2, MOVBE, Popcount instruction, AESNI, XSAVE, OSXSAVE, AVX, F16C, LAHF/SAHF instruction support, Core multi-processor leagacy mode, Advanced Bit Manipulations: LZCNT, SSE4A: MOVNTSS, MOVNTSD, EXTRQ, INSERTQ, Misaligned SSE mode, SYSCALL/SYSRET, Execute Disable Bit, RDTSCP, Intel 64 Architecture"
>   sockets = 1
>   cores = 2
>   hwThreads = 2
> }

Changes requested by akozlov (no project role).

src/hotspot/cpu/aarch64/vm_version_ext_aarch64.cpp line 55:

> 53:   snprintf(_cpu_name, CPU_TYPE_DESC_BUF_SIZE - 1, "AArch64");
> 54:
> 55:   int fd = open("/proc/device-tree/compatible", O_RDONLY);

This should not be done here in os-independent `src/hotspot/cpu/aarch64`. `src/hotspot/os_cpu/linux_aarch64` looks like a better place for this. Hotspot supports at least Windows/AArch64 in addition to Linux/AArch64.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2759

From thartmann at openjdk.java.net Mon Mar 1 08:29:59 2021
From: thartmann at openjdk.java.net (Tobias Hartmann)
Date: Mon, 1 Mar 2021 08:29:59 GMT
Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8]
In-Reply-To:
References:
Message-ID:

On Thu, 25 Feb 2021 09:01:10 GMT, Lutz Schmidt wrote:

>> Dear community,
>> may I please request reviews for this fix, improving the usefulness of method invocation counters.
>> - aggregation counters are retyped as uint64_t, shifting the overflow probability way out (> 500 years in case of a 1 GHz counter update frequency).
>> - counters for individual methods are interpreted as (unsigned int), in contrast to their declaration as int. This gives us a factor of two before the counters overflow.
>> - as a special case, "compiled_invocation_counter" is retyped as long, because it has a higher update frequency than other counters.
>> - before/after sample output is attached to the bug description.
>>
>> Thank you!
>> Lutz
>
> Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision:
>
>   comment changes requested by TheRealMDoerr

This looks good to me.

-------------

Marked as reviewed by thartmann (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2511 From lucy at openjdk.java.net Mon Mar 1 08:34:46 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Mon, 1 Mar 2021 08:34:46 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: Message-ID: On Mon, 1 Mar 2021 08:26:41 GMT, Tobias Hartmann wrote: >> Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: >> >> comment changes requested by TheRealMDoerr > > This looks good to me. Thank you for your review, Tobias! I'll delay integration for a while to give David and Igor a chance to react. ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From ngasson at openjdk.java.net Mon Mar 1 09:58:48 2021 From: ngasson at openjdk.java.net (Nick Gasson) Date: Mon, 1 Mar 2021 09:58:48 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list In-Reply-To: References: Message-ID: On Mon, 1 Mar 2021 08:12:26 GMT, Anton Kozlov wrote: >> HotSpot generates CPU description when it is started. We can see it `jdk.CPUInformation` JFR event as below: >> >> $ jfr print --events jdk.CPUInformation raspi4.jfr >> jdk.CPUInformation { >> startTime = 22:57:13.521 >> cpu = "AArch64" >> description = "AArch64 0x41:0x0:0xd08:3, simd, crc" >> sockets = 4 >> cores = 4 >> hwThreads = 4 >> } >> >> `description` contains "AArch64", it is fixed value, we cannot guess the process was run on what machine (SoC). >> >> In Linux, we can use `compatible`property in device tree to guess the machine. The 'compatible' property contains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with sorted from most compatible to least. 
>>
>> After this change, we can get the description as below:
>>
>> jdk.CPUInformation {
>>   startTime = 00:32:49.767
>>   cpu = "AArch64"
>>   description = "raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc"
>>   sockets = 4
>>   cores = 4
>>   hwThreads = 4
>> }
>>
>> In Linux on AMD64, we can see as following, then we can guess the CPU model from it. The same should do for AArch64.
>>
>> jdk.CPUInformation {
>>   startTime = 17:28:03.907
>>   cpu = "AMD (null) (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 SSE4A AMD64"
>>   description = "Brand: AMD Ryzen 3 3300X 4-Core Processor , Vendor: AuthenticAMD
>>   Family: (0x17), Model: (0x71), Stepping: 0x0
>>   Ext. family: 0x8, Ext. model: 0x7, Type: 0x0, Signature: 0x00870f10
>>   Features: ebx: 0x01020800, ecx: 0xfed83203, edx: 0x178bfbff
>>   Ext. features: eax: 0x00870f10, ebx: 0x20000000, ecx: 0x004003f3, edx: 0x2fd3fbff
>>   Supports: On-Chip FPU, Virtual Mode Extensions, Debugging Extensions, Page Size Extensions, Time Stamp Counter, Model Specific Registers, Physical Address Extension, Machine Check Exceptions, CMPXCHG8B Instruction, On-Chip APIC, Fast System Call, Memory Type Range Registers, Page Global Enable, Machine Check Architecture, Conditional Mov Instruction, Page Attribute Table, 36-bit Page Size Extension, CLFLUSH Instruction, Intel Architecture MMX Technology, Fast Float Point Save and Restore, Streaming SIMD extensions, Streaming SIMD extensions 2, Hyper Threading, Streaming SIMD Extensions 3, PCLMULQDQ, Supplemental Streaming SIMD Extensions 3, Fused Multiply-Add, CMPXCHG16B, Streaming SIMD extensions 4.1, Streaming SIMD extensions 4.2, MOVBE, Popcount instruction, AESNI, XSAVE, OSXSAVE, AVX, F16C, LAHF/SAHF instruction support, Core multi-processor leagacy mode, Advanced Bit Manipulations: LZCNT, SSE4A: MOVNTSS, MOVNTSD, EXTRQ, INSERTQ, Misaligned SSE mode, SYSCALL/SYSRET, Execute Disable Bit, RDTSCP, Intel 64 Architecture"
>>   sockets = 1
>>   cores = 2
>>   hwThreads = 2
>> }
>
> Changes requested by akozlov (no project role).

Many server-class AArch64 machines use ACPI instead of Device Tree so won't have `/proc/device-tree`.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2759

From aph at openjdk.java.net Mon Mar 1 10:33:59 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 1 Mar 2021 10:33:59 GMT
Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v21]
In-Reply-To:
References:
Message-ID:

On Fri, 26 Feb 2021 19:17:12 GMT, Anton Kozlov wrote:

>> Please review the implementation of JEP 391: macOS/AArch64 Port.
>>
>> It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64.
>>
>> Major changes are in:
>> * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818)
>> * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819)
>> * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough.
>> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941)
>
> Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Merge remote-tracking branch 'origin/jdk/jdk-macos' into jdk-macos
>  - Minor fixes

src/hotspot/cpu/aarch64/globalDefinitions_aarch64.hpp line 62:

> 60:
> 61: #if defined(__APPLE__) || defined(_WIN64)
> 62: #define R18_RESERVED

```
#define R18_RESERVED true
```

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200

From stuefe at openjdk.java.net Mon Mar 1 11:09:03 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Mon, 1 Mar 2021 11:09:03 GMT
Subject: RFR: JDK-8261552: s390: MacroAssembler::encode_klass_not_null() may produce wrong results for non-zero values of narrow klass base [v3]
In-Reply-To:
References:
Message-ID:

> If Compressed class pointer base has a non-zero value it may cause MacroAssembler::encode_klass_not_null() to encode a Klass pointer to a wrong narrow pointer.
>
> This can be reproduced by starting the VM with
> -Xshare:dump -XX:HeapBaseMinAddress=2g -Xmx128m
> but CDS is not involved. It is only relevant insofar as this is the only way to get the following combination:
> - heap is allocated at 0x8000_0000. It is small and ends at 0x8800_0000.
> - class space follows at 0x8800_0000
> - the narrow klass pointer base points to the start of the class space at 0x8800_0000.
>
> In MacroAssembler::encode_klass_not_null(), there is the following section:
>
>   if (base != NULL) {
>     unsigned int base_h = ((unsigned long)base)>>32;
>     unsigned int base_l = (unsigned int)((unsigned long)base);
>     if ((base_h != 0) && (base_l == 0) && VM_Version::has_HighWordInstr()) {
>       lgr_if_needed(dst, current);
>       z_aih(dst, -((int)base_h)); // Base has no set bits in lower half.
>     } else if ((base_h == 0) && (base_l != 0)) {     (A)
>       lgr_if_needed(dst, current);
>       z_agfi(dst, -(int)base_l);                     (B)
>     } else {
>       load_const(Z_R0, base);
>       lgr_if_needed(dst, current);
>       z_sgr(dst, Z_R0);
>     }
>     current = dst;
>   }
>
> We enter the condition at (A) if the narrow klass pointer base is non-zero but fits into 32bit. At (B), we want to subtract the base from the Klass pointer; we do this by calculating the 32bit two's complement of the base and adding it with AGFI. AGFI adds a 32bit immediate to a 64bit register. In this case, it produces the wrong result if the base is >0x8000_0000:
>
> In the case of the crash, we have:
> base: 8800_0000
> klass pointer: 8804_1040
> 32bit two's complement of base: 7800_0000
> added to the klass pointer: 1_0004_1040
>
> So the result of the "subtraction" is 1_0004_1040, it should be 4_1040, which would be the correct offset of the Klass* pointer within the ccs.
>
> This bug has been dormant; it was activated by JDK-8250989, which changed the way class space reservation happens at CDS dump time. It surfaced first as a crash in a CDS-specific jtreg test (JDK-8261552).
>
> ================
>
> Fix:
>
> I changed the AGFI instruction to a pure 32bit add (AFI). That works as long as the Klass pointer also fits into 32bit. So I narrowed the condition at (A) to only fire if it can be ensured that both narrow base and Klass* pointers fit into 32bit.
>
> I also added a runtime verification in that case that any Klass pointer passed down is indeed a 32bit pointer. However, I am not really sure this is useful, or that this is the best way to do this (using TMHH and TMHL). I was looking for something like TMH or TML to check whole 32bit words but could not find any.
>
> ----
>
> Tests:
>
> I manually tested that the crash disappears, which it does. I stepped through the encoding code and the values now look right.
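
The arithmetic in the crash scenario above can be checked off-target with plain integer operations. The sketch below is not s390 code: `negated_low_word`, `agfi_like` and `afi_like` are hypothetical helpers that only model the behavior described in this message (a sign-extended 32-bit immediate added to a 64-bit register, versus a pure 32-bit add), and the constants are the base and Klass pointer values quoted from the crash.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the 32-bit two's complement of the low word of the
// base, i.e. the value that -(int)base_l is meant to produce.
static int32_t negated_low_word(uint32_t base_l) {
  return static_cast<int32_t>(0u - base_l);
}

// Hypothetical model of AGFI as described above: the 32-bit immediate is
// sign-extended to 64 bits and added to the whole 64-bit register.
static uint64_t agfi_like(uint64_t reg, int32_t imm) {
  return reg + static_cast<uint64_t>(static_cast<int64_t>(imm));
}

// Hypothetical model of a pure 32-bit add: only the low word takes part,
// so the carry out of bit 31 is discarded.
static uint32_t afi_like(uint32_t reg_low, int32_t imm) {
  return reg_low + static_cast<uint32_t>(imm);
}

// Values from the crash described in this message.
static const uint32_t kBaseLow = 0x88000000u;  // narrow klass base
static const uint64_t kKlass   = 0x88041040u;  // Klass pointer
```

With these inputs, `negated_low_word(kBaseLow)` is 0x7800_0000, a positive number, so sign extension adds nothing to the upper half and `agfi_like(kKlass, ...)` yields 1_0004_1040. `afi_like` on the low word drops the carry and yields 4_1040. The 32-bit variant is only valid while the Klass pointer itself fits into 32 bits, which matches the narrowed condition at (A).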
> > I also did build a VM with the ability to override both class space start address and the narrow klass pointer base to exact values (see https://github.com/openjdk/jdk/compare/master...tstuefe:override-ccs-start-and-base). > > I used this method to test various combinations: > - narrow klass pointer base > 0 < 4g + ccs end < 4g (we hit our branch doing AFI) > - narrow klass pointer base > 0 < 4g + ccs end > 4g (we hit the fallback doing SGR with r0) > - narrow klass pointer base = 0 (we dont do anything) > > (would this override-feature be useful? We could do better testing). > > Thanks, Thomas Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: update ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2595/files - new: https://git.openjdk.java.net/jdk/pull/2595/files/e096f09c..b3cbd715 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2595&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2595&range=01-02 Stats: 26 lines in 1 file changed: 15 ins; 3 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/2595.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2595/head:pull/2595 PR: https://git.openjdk.java.net/jdk/pull/2595 From aph at openjdk.java.net Mon Mar 1 11:09:55 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Mon, 1 Mar 2021 11:09:55 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v21] In-Reply-To: References: Message-ID: <5l97Ac1W-JbLiCBJMoWut-c-RZEPN6wBHABym2nHeRE=.6d060249-e315-486b-a0ab-1992de6f54d5@github.com> On Fri, 26 Feb 2021 19:17:12 GMT, Anton Kozlov wrote: >> Please review the implementation of JEP 391: macOS/AArch64 Port. >> >> It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. 
>> >> Major changes are in: >> * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) >> * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) >> * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. >> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) > > Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: > > - Merge remote-tracking branch 'origin/jdk/jdk-macos' into jdk-macos > - Minor fixes Thanks. With this, I think we're done. ------------- Changes requested by aph (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2200

From aph at openjdk.java.net Mon Mar 1 11:09:56 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 1 Mar 2021 11:09:56 GMT
Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v10]
In-Reply-To:
References:
Message-ID:

On Thu, 4 Feb 2021 23:01:52 GMT, Gerard Ziemski wrote:

>> Anton Kozlov has updated the pull request incrementally with six additional commits since the last revision:
>>
>>  - Merge remote-tracking branch 'origin/jdk/jdk-macos' into jdk-macos
>>  - Add comments to WX transitions
>>
>>    + minor change of placements
>>  - Use macro conditionals instead of empty functions
>>  - Add W^X to tests
>>  - Do not require known W^X state
>>  - Revert w^x in gtests
>
> src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 652:
>
>> 650:
>> 651: void os::setup_fpu() {
>> 652: }
>
> Is there really nothing to do here, or does still need to be implemented? A clarification comment here would help.

There is really nothing to do here.

> src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 198:
>
>> 196:
>> 197: NOINLINE frame os::current_frame() {
>> 198:   intptr_t *fp = *(intptr_t **)__builtin_frame_address(0);
>
> In the bsd_x86 counter part we initialize `fp` to `_get_previous_fp()` - do we need to implement it on aarch64 as well (and using address 0 is just a temp workaround) or is it doing the right thing here?

```__builtin_frame_address(0)``` looks right to me.

> src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 291:
>
>> 289:         bool is_unsafe_arraycopy = (thread->doing_unsafe_access() && UnsafeCopyMemory::contains_pc(pc));
>> 290:         if ((nm != NULL && nm->has_unsafe_access()) || is_unsafe_arraycopy) {
>> 291:           address next_pc = pc + NativeCall::instruction_size;
>
> Replace
>
> address next_pc = pc + NativeCall::instruction_size;
>
> with
>
> address next_pc = Assembler::locate_next_instruction(pc);
>
> there is at least one other place that needs it.

Why is this change needed?
AFAICS ```locate_next_instruction()``` is an x86 thing for variable-length instructions, and no other port uses it.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200

From aph at openjdk.java.net Mon Mar 1 11:09:57 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 1 Mar 2021 11:09:57 GMT
Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]
In-Reply-To:
References:
Message-ID:

On Fri, 5 Feb 2021 17:20:55 GMT, Anton Kozlov wrote:

>> src/hotspot/cpu/aarch64/vm_version_aarch64.hpp line 93:
>>
>>> 91:     CPU_MARVELL = 'V',
>>> 92:     CPU_INTEL = 'i',
>>> 93:     CPU_APPLE = 'a',
>>
>> The `ARM Architecture Reference Manual ARMv8, for ARMv8-A architecture profile` has 8538 pages, can we be more specific and point to the particular section of the document where the CPU codes are defined?
>
> They are defined in 13.2.95. MIDR_EL1, Main ID Register. Apple's code is not there, but "Arm can assign codes that are not published in this manual. All values not assigned by Arm are reserved and must not be used.". I assume the value was obtained by digging around https://github.com/apple/darwin-xnu/blob/main/osfmk/arm/cpuid.h#L62

Anton, this paragraph looks like an excellent comment.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200

From aph at openjdk.java.net Mon Mar 1 11:09:58 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Mon, 1 Mar 2021 11:09:58 GMT
Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v10]
In-Reply-To:
References: <9Nasu4m7orJoGYjX4EYCuz5-aevYNno3Ru3jPHgwkvc=.168cfdf0-648b-46e4-9cb4-b24956eeba7d@github.com>
Message-ID:

On Fri, 12 Feb 2021 11:42:59 GMT, Vladimir Kempik wrote:

>> src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 194:
>>
>>> 192: // may get turned off by -fomit-frame-pointer.
>>> 193: frame os::get_sender_for_C_frame(frame* fr) {
>>> 194:   return frame(fr->link(), fr->link(), fr->sender_pc());
>>
>> Why is it
>>
>> return frame(fr->link(), fr->link(), fr->sender_pc());
>>
>> and not
>>
>> return frame(fr->sender_sp(), fr->link(), fr->sender_pc());
>>
>> like in the bsd-x86 counter part?
>
> bsd_aarch64 was based on linux_aarch64, with addition of bsd-specific things from bsd_x86
> You think the bsd-x86 way is better here ?

There's no point copying x86. We don't have any way to know what the sender's SP was in a C frame without using unwind info. I think this is just used when trying to print the stack in a crash dump.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200

From lucy at openjdk.java.net Mon Mar 1 11:25:41 2021
From: lucy at openjdk.java.net (Lutz Schmidt)
Date: Mon, 1 Mar 2021 11:25:41 GMT
Subject: RFR: JDK-8261552: s390: MacroAssembler::encode_klass_not_null() may produce wrong results for non-zero values of narrow klass base [v3]
In-Reply-To:
References:
Message-ID:

On Mon, 1 Mar 2021 11:09:03 GMT, Thomas Stuefe wrote:

>> If Compressed class pointer base has a non-zero value it may cause MacroAssembler::encode_klass_not_null() to encode a Klass pointer to a wrong narrow pointer.
>>
>> This can be reproduced by starting the VM with
>> -Xshare:dump -XX:HeapBaseMinAddress=2g -Xmx128m
>> but CDS is not involved. It is only relevant insofar as this is the only way to get the following combination:
>> - heap is allocated at 0x8000_0000. It is small and ends at 0x8800_0000.
>> - class space follows at 0x8800_0000
>> - the narrow klass pointer base points to the start of the class space at 0x8800_0000.
>> >> In MacroAssembler::encode_klass_not_null(), there is the following section: >> >> if (base != NULL) { >> unsigned int base_h = ((unsigned long)base)>>32; >> unsigned int base_l = (unsigned int)((unsigned long)base); >> if ((base_h != 0) && (base_l == 0) && VM_Version::has_HighWordInstr()) { >> lgr_if_needed(dst, current); >> z_aih(dst, -((int)base_h)); // Base has no set bits in lower half. >> } else if ((base_h == 0) && (base_l != 0)) { (A) >> lgr_if_needed(dst, current); >> z_agfi(dst, -(int)base_l); (B) >> } else { >> load_const(Z_R0, base); >> lgr_if_needed(dst, current); >> z_sgr(dst, Z_R0); >> } >> current = dst; >> } >> >> We enter the condition at (A) if the narrow klass pointer base is non-zero but fits into 32bit. At (B), we want to substract the base from the Klass pointer; we do this by calculating the 32bit twos-complement of the base and add it with AGFI. AGFI adds a 32bit immediate to a 64bit register. In this case, it produces the wrong result if the base is >0x800_0000: >> >> In the case of the crash, we have: >> base: 8800_0000 >> klass pointer: 8804_1040 >> 32bit two's complement of base: 7800_0000 >> added to the klass pointer: 1_0004_1040 >> >> So the result of the "substraction" is 1_0004_1040, it should be 4_1040, which would be the correct offset of the Klass* pointer within the ccs. >> >> This bug has been dormant; was activated by JDK-8250989 which changed the way class space reservation happens at CDS dump time. It surfaced first as crash in a CDS-specific jtreg test (JDK-8261552). >> >> ================ >> >> Fix: >> >> I changed the AGFI instruction to a pure 32bit add (AFI). That works as long as the Klass pointer also fits into 32bit. So I narrowed the condition at (A) to only fire if it can be ensured that both narrow base and Klass* pointers fit into 32bit. >> >> I also added a runtime verification in that case that any Klass pointer passed down is indeed a 32bit pointer. 
>> However, I am not really sure this is useful, or that this is the best way to do this (using TMHH and TMHL). I was looking for something like TMH or TML to check whole 32bit words but could not find any.
>>
>> ----
>>
>> Tests:
>>
>> I manually tested that the crash disappears, which it does. I stepped through the encoding code and the values now look right.
>>
>> I also did build a VM with the ability to override both class space start address and the narrow klass pointer base to exact values (see https://github.com/openjdk/jdk/compare/master...tstuefe:override-ccs-start-and-base).
>>
>> I used this method to test various combinations:
>> - narrow klass pointer base > 0 < 4g + ccs end < 4g (we hit our branch doing AFI)
>> - narrow klass pointer base > 0 < 4g + ccs end > 4g (we hit the fallback doing SGR with r0)
>> - narrow klass pointer base = 0 (we dont do anything)
>>
>> (would this override-feature be useful? We could do better testing).
>>
>> Thanks, Thomas
>
> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
>
>   update

LGTM, still.

-------------

Marked as reviewed by lucy (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2595

From ysuenaga at openjdk.java.net Mon Mar 1 12:36:55 2021
From: ysuenaga at openjdk.java.net (Yasumasa Suenaga)
Date: Mon, 1 Mar 2021 12:36:55 GMT
Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list
In-Reply-To:
References:
Message-ID:

On Mon, 1 Mar 2021 09:55:58 GMT, Nick Gasson wrote:

> Many server-class AArch64 machines use ACPI instead of Device Tree so won't have `/proc/device-tree`.

Which file should we read to detect the SoC on a server-class machine? Can we detect the SoC in the same way (e.g. via sysfs)?
If we cannot implement it in the same way, I would like to fix it for Device Tree first.
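
To make the two cases in this exchange concrete, here is a minimal sketch (not the code from the PR) of reading the `compatible` property and falling back to the fixed string when the file is absent, as on ACPI-based machines. `join_compatible` and `board_description` are hypothetical names, and the sketch assumes the property is stored as consecutive NUL-terminated strings, which is how Device Tree string lists are encoded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: joins the NUL-separated entries of a 'compatible'
// property into one space-separated string, preserving their order
// (most specific board first).
static std::string join_compatible(const char* buf, size_t len) {
  assert(buf != nullptr);
  std::string out;
  size_t pos = 0;
  while (pos < len) {
    size_t end = pos;
    while (end < len && buf[end] != '\0') {
      end++;
    }
    if (end > pos) {              // skip empty entries
      if (!out.empty()) {
        out += ' ';
      }
      out.append(buf + pos, end - pos);
    }
    pos = end + 1;                // step over the separator
  }
  return out;
}

// Hypothetical helper: returns the joined board list read from 'path', or
// 'fallback' if the file cannot be opened or is empty (for example on
// machines that use ACPI and have no /proc/device-tree).
static std::string board_description(const char* path, const char* fallback) {
  assert(path != nullptr && fallback != nullptr);
  FILE* f = fopen(path, "rb");
  if (f == nullptr) {
    return fallback;
  }
  char buf[256];
  size_t n = fread(buf, 1, sizeof(buf), f);
  fclose(f);
  std::string joined = join_compatible(buf, n);
  return joined.empty() ? std::string(fallback) : joined;
}
```

Calling `board_description("/proc/device-tree/compatible", "AArch64")` would keep today's description on systems without a device tree, so the Device Tree case could be handled first without changing behavior elsewhere.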
------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From mdoerr at openjdk.java.net Mon Mar 1 12:45:52 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 1 Mar 2021 12:45:52 GMT Subject: RFR: JDK-8261552: s390: MacroAssembler::encode_klass_not_null() may produce wrong results for non-zero values of narrow klass base [v3] In-Reply-To: References: Message-ID: On Mon, 1 Mar 2021 11:09:03 GMT, Thomas Stuefe wrote: >> If Compressed class pointer base has a non-zero value it may cause MacroAssembler::encode_klass_not_null() to encode a Klass pointer to a wrong narrow pointer. >> >> This can be reproduced by starting the VM with >> -Xshare:dump -XX:HeapBaseMinAddress=2g -Xmx128m >> but CDS is not involved. It is only relevant insofar as this is the only way to get the following combination: >> - heap is allocated at 0x800_0000. It is small and ends at 0x8800_0000. >> - class space follows at 0x8800_0000 >> - the narrow klass pointer base points to the start of the class space at 0x8800_0000. >> >> In MacroAssembler::encode_klass_not_null(), there is the following section: >> >> if (base != NULL) { >> unsigned int base_h = ((unsigned long)base)>>32; >> unsigned int base_l = (unsigned int)((unsigned long)base); >> if ((base_h != 0) && (base_l == 0) && VM_Version::has_HighWordInstr()) { >> lgr_if_needed(dst, current); >> z_aih(dst, -((int)base_h)); // Base has no set bits in lower half. >> } else if ((base_h == 0) && (base_l != 0)) { (A) >> lgr_if_needed(dst, current); >> z_agfi(dst, -(int)base_l); (B) >> } else { >> load_const(Z_R0, base); >> lgr_if_needed(dst, current); >> z_sgr(dst, Z_R0); >> } >> current = dst; >> } >> >> We enter the condition at (A) if the narrow klass pointer base is non-zero but fits into 32bit. At (B), we want to substract the base from the Klass pointer; we do this by calculating the 32bit twos-complement of the base and add it with AGFI. AGFI adds a 32bit immediate to a 64bit register. 
In this case, it produces the wrong result if the base is > 0x8000_0000: >> >> In the case of the crash, we have: >> base: 8800_0000 >> klass pointer: 8804_1040 >> 32bit two's complement of base: 7800_0000 >> added to the klass pointer: 1_0004_1040 >> >> So the result of the "subtraction" is 1_0004_1040, but it should be 4_1040, which would be the correct offset of the Klass* pointer within the ccs. >> >> This bug had been dormant; it was activated by JDK-8250989, which changed the way class space reservation happens at CDS dump time. It surfaced first as a crash in a CDS-specific jtreg test (JDK-8261552). >> >> ================ >> >> Fix: >> >> I changed the AGFI instruction to a pure 32bit add (AFI). That works as long as the Klass pointer also fits into 32bit. So I narrowed the condition at (A) to only fire if it can be ensured that both narrow base and Klass* pointers fit into 32bit. >> >> I also added a runtime verification in that case that any Klass pointer passed down is indeed a 32bit pointer. However, I am not really sure this is useful, or that this is the best way to do this (using TMHH and TMHL). I was looking for something like TMH or TML to check whole 32bit words but could not find any. >> >> ---- >> >> Tests: >> >> I manually tested that the crash disappears, which it does. I stepped through the encoding code and the values now look right. >> >> I also built a VM with the ability to override both class space start address and the narrow klass pointer base to exact values (see https://github.com/openjdk/jdk/compare/master...tstuefe:override-ccs-start-and-base). >> >> I used this method to test various combinations: >> - narrow klass pointer base > 0 < 4g + ccs end < 4g (we hit our branch doing AFI) >> - narrow klass pointer base > 0 < 4g + ccs end > 4g (we hit the fallback doing SGR with r0) >> - narrow klass pointer base = 0 (we don't do anything) >> >> (would this override-feature be useful? We could do better testing).
>> >> Thanks, Thomas > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > update Marked as reviewed by mdoerr (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2595 From jbachorik at openjdk.java.net Mon Mar 1 12:56:41 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 1 Mar 2021 12:56:41 GMT Subject: RFR: 8258414: OldObjectSample events too expensive In-Reply-To: <1_LsNBt-Yy5NlHbfwtRSRNvGa2AbTuhMGYuiw3Hy8gU=.3b79e283-87fe-451e-8e60-25b59c5e837a@github.com> References: <1_LsNBt-Yy5NlHbfwtRSRNvGa2AbTuhMGYuiw3Hy8gU=.3b79e283-87fe-451e-8e60-25b59c5e837a@github.com> Message-ID: On Fri, 19 Feb 2021 14:45:00 GMT, Florian David wrote: > The purpose of this change is to reduce the size of JFR recordings when the OldObjectSample event is enabled. > > ## Problem > JFR recordings size blows up when the OldObjectSample is enabled. The memory allocation events are known to be very high traffic and will cause a lot of data, just the sheer number of events produced, and if stacktraces are added to this, the associated metadata can be huge as well. Sampled object are stored in a priority queue and their associated stack traces stored in JFRStackTraceRepository. When sample candidates are removed from the priority queue, their stacktraces remain in the repository, which will be later written at chunk rotation even if the sample has been removed. > > ## Implementation > This PR adds a JFRStackTraceRepository dedicated to store stack traces for the OldObjectSample event. At chunk rotation, every sample stack trace is looked up in this repository and is serialized. Other stack traces are simply removed. 
> > ## Benchmarks > On an AWS c5.metal instance (96 cores, 192 Gib), running SPECjvm2008 with default profile.jfc configuration with OldObjectSample event enabled gives: > - a recording size of 20.73Mb without the PR fix > - a recording size of 2.78Mb with the PR fix Ok, kicking off the review. The implementation is doing the right thing (AFAICT) and I have no strong objections, just minor things regarding slightly more detailed comments. src/hotspot/share/jfr/recorder/jfrRecorder.cpp line 284: > 282: return false; > 283: } > 284: if (!create_leak_profiler_stacktrace_repository()) { Can you add a comment here explaining the purpose of having a separate leak profiler stacktrace repository? src/hotspot/share/jfr/recorder/stacktrace/jfrStackTraceRepository.cpp line 27: > 25: #include "precompiled.hpp" > 26: #include "jfr/leakprofiler/sampling/objectSample.hpp" > 27: #include "jfr/leakprofiler/sampling/objectSampler.hpp" Are those 2 extra includes needed? src/hotspot/share/jfr/leakprofiler/checkpoint/objectSampleCheckpoint.cpp line 200: > 198: } > 199: > 200: class StackTraceChunkWriter { Perhaps, a detailed comment explaining how this is working with the regular stacktrace writer would be good to have here. ------------- PR: https://git.openjdk.java.net/jdk/pull/2645 From jbachorik at openjdk.java.net Mon Mar 1 13:15:49 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 1 Mar 2021 13:15:49 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Thu, 18 Feb 2021 10:18:03 GMT, Aleksey Shipilev wrote: >> The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in worst case 'at last full GC') in form of a periodically emitted JFR event. 
>> >> ## Introducing new JFR event >> >> While there is already 'GC Heap Summary' JFR event it does not fit the requirements as it is closely tied to GC cycle so eg. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events being present in the JFR recording at all. >> Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information abut the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. >> >> ## Implementation >> >> The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value. >> >> The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct. >> >> ### Epsilon GC >> >> Trivial implementation - just return `used()` instead. >> >> ### Serial GC >> >> Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). >> >> ### Parallel GC >> >> For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). 
>> >> ### G1 GC >> >> Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. >> >> ### Shenandoah >> >> In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. >> This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. >> >> ### ZGC >> >> `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. > > src/hotspot/share/gc/shared/genCollectedHeap.cpp line 1144: > >> 1142: _old_gen->prepare_for_compaction(&cp); >> 1143: _young_gen->prepare_for_compaction(&cp); >> 1144: > > Stray newline? ?? 
------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Mon Mar 1 13:15:48 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 1 Mar 2021 13:15:48 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: <30TKeg0tEzRXVOG5fcZp_Zsb9dTxU__Cn1VnpUUZJY4=.d7efe1f7-ebec-4b22-a5ac-9ba4b3d732eb@github.com> On Mon, 22 Feb 2021 16:50:48 GMT, Thomas Schatzl wrote: >> src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 4578: >> >>> 4576: >>> 4577: void G1CollectedHeap::set_live(size_t bytes) { >>> 4578: Atomic::release_store(&_live_size, bytes); >> >> I don't think this requires `release_store`, regular `store` would be enough. G1 folks can say for sure. > > Not required. ?? >> src/hotspot/share/gc/shared/genCollectedHeap.hpp line 183: >> >>> 181: size_t live = _live_size; >>> 182: return live > 0 ? live : used(); >>> 183: }; >> >> I think the implementation belongs to `genCollectedHeap.cpp`. > > +1. Does not seem to be performance sensitive. ?? ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From hseigel at openjdk.java.net Mon Mar 1 13:28:50 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 1 Mar 2021 13:28:50 GMT Subject: RFR: 8262028: Make InstanceKlass::implementor return InstanceKlass In-Reply-To: References: <7ME0ALE4x-SV0Lh8Yrb04OaUCIWcNmft6jkALr3CdyQ=.89947c5e-b039-4fd3-9387-576295a7f9f7@github.com> Message-ID: On Fri, 26 Feb 2021 20:27:09 GMT, Vladimir Ivanov wrote: >> Please review this small fix to change the parameter and return types from Klass* to InstanceKlass* in the InstanceKlass::*implementor() functions. >> >> The fix was tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows, and tiers 3-5 on Linux x64. >> >> Thanks, Harold > > Looks good! Thanks Coleen, Calvin. and Vladimir for the reviews! 
------------- PR: https://git.openjdk.java.net/jdk/pull/2755 From hseigel at openjdk.java.net Mon Mar 1 13:28:51 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 1 Mar 2021 13:28:51 GMT Subject: Integrated: 8262028: Make InstanceKlass::implementor return InstanceKlass In-Reply-To: <7ME0ALE4x-SV0Lh8Yrb04OaUCIWcNmft6jkALr3CdyQ=.89947c5e-b039-4fd3-9387-576295a7f9f7@github.com> References: <7ME0ALE4x-SV0Lh8Yrb04OaUCIWcNmft6jkALr3CdyQ=.89947c5e-b039-4fd3-9387-576295a7f9f7@github.com> Message-ID: On Fri, 26 Feb 2021 18:23:34 GMT, Harold Seigel wrote: > Please review this small fix to change the parameter and return types from Klass* to InstanceKlass* in the InstanceKlass::*implementor() functions. > > The fix was tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows, and tiers 3-5 on Linux x64. > > Thanks, Harold This pull request has now been integrated. Changeset: 75bf1061 Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/75bf1061 Stats: 36 lines in 4 files changed: 0 ins; 0 del; 36 mod 8262028: Make InstanceKlass::implementor return InstanceKlass Reviewed-by: coleenp, ccheung, vlivanov ------------- PR: https://git.openjdk.java.net/jdk/pull/2755 From ysuenaga at openjdk.java.net Mon Mar 1 13:39:00 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Mon, 1 Mar 2021 13:39:00 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v2] In-Reply-To: References: Message-ID: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> > HotSpot generates CPU description when it is started. 
We can see it `jdk.CPUInformation` JFR event as below: > > $ jfr print --events jdk.CPUInformation raspi4.jfr > jdk.CPUInformation { > startTime = 22:57:13.521 > cpu = "AArch64" > description = "AArch64 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > `description` contains "AArch64", it is fixed value, we cannot guess the process was run on what machine (SoC). > > In Linux, we can use `compatible`property in device tree to guess the machine. The 'compatible' property contains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with sorted from most compatible to least. > > After this change, we can get the description as below: > > jdk.CPUInformation { > startTime = 00:32:49.767 > cpu = "AArch64" > description = "raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > In Linux on AMD64, we can see as following, then we can guess the CPU model from it. The same should do for AArch64. > > jdk.CPUInformation { > startTime = 17:28:03.907 > cpu = "AMD (null) (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 SSE4A AMD64" > description = "Brand: AMD Ryzen 3 3300X 4-Core Processor , Vendor: AuthenticAMD > Family: (0x17), Model: (0x71), Stepping: 0x0 > Ext. family: 0x8, Ext. model: 0x7, Type: 0x0, Signature: 0x00870f10 > Features: ebx: 0x01020800, ecx: 0xfed83203, edx: 0x178bfbff > Ext. 
features: eax: 0x00870f10, ebx: 0x20000000, ecx: 0x004003f3, edx: 0x2fd3fbff > Supports: On-Chip FPU, Virtual Mode Extensions, Debugging Extensions, Page Size Extensions, Time Stamp Counter, Model Specific Registers, Physical Address Extension, Machine Check Exceptions, CMPXCHG8B Instruction, On-Chip APIC, Fast System Call, Memory Type Range Registers, Page Global Enable, Machine Check Architecture, Conditional Mov Instruction, Page Attribute Table, 36-bit Page Size Extension, CLFLUSH Instruction, Intel Architecture MMX Technology, Fast Float Point Save and Restore, Streaming SIMD extensions, Streaming SIMD extensions 2, Hyper Threading, Streaming SIMD Extensions 3, PCLMULQDQ, Supplemental Streaming SIMD Extensions 3, Fused Multiply-Add, CMPXCHG16B, Streaming SIMD extensions 4.1, Streaming SIMD extensions 4.2, MOVBE, Popcount instruction, AESNI, XSAVE, OSXSAVE, AVX, F16C, LAHF/SAHF instruction support, Core multi-processor leagacy mode, Advanced Bit Manipulations: LZCNT, SSE4A: MOVNTSS, MOVNTSD, EXTRQ, INSERTQ, Misaligned SSE mode, SYSCALL/SYSRET, Execute Disable Bit, RDTSCP, Intel 64 Architecture" > sockets = 1 > cores = 2 > hwThreads = 2 > } Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: refactoring ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2759/files - new: https://git.openjdk.java.net/jdk/pull/2759/files/0929c855..8c895361 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2759&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2759&range=00-01 Stats: 59 lines in 4 files changed: 36 ins; 17 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/2759.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2759/head:pull/2759 PR: https://git.openjdk.java.net/jdk/pull/2759 From ysuenaga at openjdk.java.net Mon Mar 1 13:39:01 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Mon, 1 Mar 2021 13:39:01 GMT Subject: RFR: 8262491:
AArch64: CPU description should contain compatible board list [v2] In-Reply-To: References: Message-ID: On Mon, 1 Mar 2021 12:33:29 GMT, Yasumasa Suenaga wrote: >> Many server-class AArch64 machines use ACPI instead of Device Tree so won't have `/proc/device-tree`. > > What file should we refer to detect SoC on server-class machine? Can we detect SoC in same way? (e.g. sysfs) > If we cannot implement it in same way, I want to fix it for device tree at first. I refactored the change to separate Linux AArch64 code from common code in a new commit. If we cannot open `/proc/device-tree`, it falls back to "AArch64", like the current implementation. ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From harold.seigel at oracle.com Mon Mar 1 14:00:55 2021 From: harold.seigel at oracle.com (Harold Seigel) Date: Mon, 1 Mar 2021 09:00:55 -0500 Subject: RFR: 8262426: Change TRAPS to Thread* for find_constrained_instance_or_array_klass() In-Reply-To: <593e9dcb-f466-db91-eced-bd75f8172af4@oracle.com> References: <7k_fqiZv2p_eRynkwwZodqwGauVk086pE5A4TdqLUXM=.c5dd976c-64bb-4562-a6fc-13580e9f54eb@github.com> <593e9dcb-f466-db91-eced-bd75f8172af4@oracle.com> Message-ID: <084285a5-57b2-c5ef-b48e-00bbd78b6c93@oracle.com> Hi David, There are multiple places with "Thread* THREAD" parameters. Thanks for cleaning them up. Harold On 2/28/2021 5:40 PM, David Holmes wrote: > On 1/03/2021 8:36 am, David Holmes wrote: >> On Fri, 26 Feb 2021 15:55:28 GMT, Harold Seigel >> wrote: >> >>>> This looks good and trivial. >>> >>> Thanks Coleen for reviewing this! >> >> Hi Harold, >> >> When we remove TRAPS we should replace with "Thread* thread", not >> "Thread* THREAD". The THREAD variable is only used for traps-related >> exception processing. (I'll be cleaning these up as part of my >> TRAPS/JavaThread work anyway). > > I messed that in 8261127 as well.
> > David > >> Cheers, >> David >> >> ------------- >> >> PR: https://git.openjdk.java.net/jdk/pull/2746 >> From jbachorik at openjdk.java.net Mon Mar 1 14:06:49 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 1 Mar 2021 14:06:49 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Thu, 18 Feb 2021 10:19:31 GMT, Aleksey Shipilev wrote: >> The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in worst case 'at last full GC') in form of a periodically emitted JFR event. >> >> ## Introducing new JFR event >> >> While there is already 'GC Heap Summary' JFR event it does not fit the requirements as it is closely tied to GC cycle so eg. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events being present in the JFR recording at all. >> Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information abut the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. >> >> ## Implementation >> >> The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value. 
>> >> The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct. >> >> ### Epsilon GC >> >> Trivial implementation - just return `used()` instead. >> >> ### Serial GC >> >> Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). >> >> ### Parallel GC >> >> For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). >> >> ### G1 GC >> >> Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. >> >> ### Shenandoah >> >> In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. >> This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. >> >> ### ZGC >> >> `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. 
> > src/hotspot/share/gc/shared/generation.hpp line 140: > >> 138: virtual size_t used() const = 0; // The number of used bytes in the gen. >> 139: virtual size_t free() const = 0; // The number of free bytes in the gen. >> 140: virtual size_t live() const = 0; > > Needs a comment to match the lines above? Say, `// The estimate of live bytes in the gen.` ?? > src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 579: > >> 577: event.set_heapLive(heap->live()); >> 578: event.commit(); >> 579: } > > On the first sight, this belongs in `ShenandoahConcurrentMark::finish_mark()`. Placing the event here would fire the event when concurrent GC is cancelled, which is not what you want. ?? ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From akozlov at openjdk.java.net Mon Mar 1 14:15:42 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 1 Mar 2021 14:15:42 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v2] In-Reply-To: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> References: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> Message-ID: <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com> On Mon, 1 Mar 2021 13:39:00 GMT, Yasumasa Suenaga wrote: >> HotSpot generates CPU description when it is started. We can see it `jdk.CPUInformation` JFR event as below: >> >> $ jfr print --events jdk.CPUInformation raspi4.jfr >> jdk.CPUInformation { >> startTime = 22:57:13.521 >> cpu = "AArch64" >> description = "AArch64 0x41:0x0:0xd08:3, simd, crc" >> sockets = 4 >> cores = 4 >> hwThreads = 4 >> } >> >> `description` contains "AArch64", it is fixed value, we cannot guess the process was run on what machine (SoC). >> >> In Linux, we can use `compatible`property in device tree to guess the machine. 
The 'compatible' property contains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with sorted from most compatible to least. >> >> After this change, we can get the description as below: >> >> jdk.CPUInformation { >> startTime = 00:32:49.767 >> cpu = "AArch64" >> description = "raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc" >> sockets = 4 >> cores = 4 >> hwThreads = 4 >> } >> >> In Linux on AMD64, we can see as following, then we can guess the CPU model from it. The same should do for AArch64. >> >> jdk.CPUInformation { >> startTime = 17:28:03.907 >> cpu = "AMD (null) (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 SSE4A AMD64" >> description = "Brand: AMD Ryzen 3 3300X 4-Core Processor , Vendor: AuthenticAMD >> Family: (0x17), Model: (0x71), Stepping: 0x0 >> Ext. family: 0x8, Ext. model: 0x7, Type: 0x0, Signature: 0x00870f10 >> Features: ebx: 0x01020800, ecx: 0xfed83203, edx: 0x178bfbff >> Ext. features: eax: 0x00870f10, ebx: 0x20000000, ecx: 0x004003f3, edx: 0x2fd3fbff >> Supports: On-Chip FPU, Virtual Mode Extensions, Debugging Extensions, Page Size Extensions, Time Stamp Counter, Model Specific Registers, Physical Address Extension, Machine Check Exceptions, CMPXCHG8B Instruction, On-Chip APIC, Fast System Call, Memory Type Range Registers, Page Global Enable, Machine Check Architecture, Conditional Mov Instruction, Page Attribute Table, 36-bit Page Size Extension, CLFLUSH Instruction, Intel Architecture MMX Technology, Fast Float Point Save and Restore, Streaming SIMD extensions, Streaming SIMD extensions 2, Hyper Threading, Streaming SIMD Extensions 3, PCLMULQDQ, Supplemental Streaming SIMD Extensions 3, Fused Multiply-Add, CMPXCHG16B, Streaming SIMD extensions 4.1, Streaming SIMD extensions 4.2, MOVBE, Popcount instruction, AESNI, XSAVE, OSXSAVE, AVX, F16C, LAHF/SAHF instruction support, Core multi-processor leagacy mode, Advanced Bit Manipulations: LZCNT, SSE4A: MOVNTSS, 
MOVNTSD, EXTRQ, INSERTQ, Misaligned SSE mode, SYSCALL/SYSRET, Execute Disable Bit, RDTSCP, Intel 64 Architecture" >> sockets = 1 >> cores = 2 >> hwThreads = 2 >> } > > Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: > > refactoring Looks better, thanks for addressing. Please consider a few notes from someone not in the reviewer role. src/hotspot/os_cpu/linux_aarch64/vm_version_linux_aarch64.cpp line 172: > 170: > 171: void VM_Version::get_compatible_board(char *buf, int buflen) { > 172: const char *aarch64_label = "AArch64"; All platforms seem to declare themselves `AArch64`, so this can probably live in the shared aarch64 code. src/hotspot/os_cpu/linux_aarch64/vm_version_linux_aarch64.cpp line 180: > 178: fstat(fd, &statbuf); > 179: if (buflen < statbuf.st_size) { > 180: strncpy(buf, aarch64_label, buflen); This line is duplicated multiple times in this function. Please consider reorganizing the code so that the string is always copied before the function returns. ------------- Marked as reviewed by akozlov (no project role). PR: https://git.openjdk.java.net/jdk/pull/2759 From shade at openjdk.java.net Mon Mar 1 14:21:40 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 1 Mar 2021 14:21:40 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Mon, 22 Feb 2021 17:20:49 GMT, Thomas Schatzl wrote: >> The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in worst case 'at last full GC') in form of a periodically emitted JFR event. >> >> ## Introducing new JFR event >> >> While there is already 'GC Heap Summary' JFR event it does not fit the requirements as it is closely tied to GC cycle so eg.
for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events being present in the JFR recording at all. >> Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information abut the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. >> >> ## Implementation >> >> The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value. >> >> The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct. >> >> ### Epsilon GC >> >> Trivial implementation - just return `used()` instead. >> >> ### Serial GC >> >> Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). >> >> ### Parallel GC >> >> For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). >> >> ### G1 GC >> >> Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. 
Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. >> >> ### Shenandoah >> >> In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. >> This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. >> >> ### ZGC >> >> `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. > > The change also misses liveness update after G1 Full GC: it should at least reset the internal liveness counter to 0 so that `used()` is used. > I think there is the same issue for Parallel Full GC. Serial seems to be handled. Another general comment about Shenandoah. It would seem easier to piggyback liveness summarization on region iteration that heuristics does at the end of mark anyway. See `ShenandoahHeuristics::choose_collection_set`. I can do that when you are done with your changes, or try it yourself. 
-------------

PR: https://git.openjdk.java.net/jdk/pull/2579

From jbachorik at openjdk.java.net Mon Mar 1 14:21:42 2021
From: jbachorik at openjdk.java.net (Jaroslav Bachorik)
Date: Mon, 1 Mar 2021 14:21:42 GMT
Subject: RFR: 8258431: Provide a JFR event with live set size estimate
In-Reply-To:
References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
Message-ID:

On Thu, 18 Feb 2021 10:22:58 GMT, Aleksey Shipilev wrote:

>> The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in worst case 'at last full GC') in form of a periodically emitted JFR event.
>>
>> ## Introducing new JFR event
>>
>> While there is already 'GC Heap Summary' JFR event it does not fit the requirements as it is closely tied to GC cycle so e.g. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events being present in the JFR recording at all.
>> Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information about the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time.
>>
>> ## Implementation
>>
>> The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value.
>>
>> The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct.
>>
>> ### Epsilon GC
>>
>> Trivial implementation - just return `used()` instead.
>>
>> ### Serial GC
>>
>> Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can use these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects).
>>
>> ### Parallel GC
>>
>> For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK).
>>
>> ### G1 GC
>>
>> Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application.
>>
>> ### Shenandoah
>>
>> In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context.
>> This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code.
>>
>> ### ZGC
>>
>> `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method.
>
> src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.cpp line 265:
>
>> 263: ShenandoahHeap* const heap = ShenandoahHeap::heap();
>> 264: heap->set_concurrent_mark_in_progress(false);
>> 265: heap->mark_finished();
>
> Let's not rename this method. Introduce a new method, `ShenandoahHeap::update_live`, and call it every time after `mark_complete_marking_context()` is called.

??

> src/hotspot/share/gc/shenandoah/shenandoahHeap.inline.hpp line 494:
>
>> 492: mark_complete_marking_context();
>> 493:
>> 494: class ShenandoahCollectLiveSizeClosure : public ShenandoahHeapRegionClosure {
>
> We don't usually use the in-method declarations like these, pull it out of the method.

??

> src/hotspot/share/gc/epsilon/epsilonHeap.hpp line 80:
>
>> 78: virtual size_t capacity() const { return _virtual_space.committed_size(); }
>> 79: virtual size_t used() const { return _space->used(); }
>> 80: virtual size_t live() const { return used(); }
>
> I'd prefer to call `_space->used()` directly here. Minor optimization, I know.

??

-------------

PR: https://git.openjdk.java.net/jdk/pull/2579

From jbachorik at openjdk.java.net Mon Mar 1 14:21:43 2021
From: jbachorik at openjdk.java.net (Jaroslav Bachorik)
Date: Mon, 1 Mar 2021 14:21:43 GMT
Subject: RFR: 8258431: Provide a JFR event with live set size estimate
In-Reply-To:
References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
Message-ID:

On Mon, 22 Feb 2021 08:44:25 GMT, Per Liden wrote:

> src/hotspot/share/gc/z/zStat.hpp line 549:
>
>> 547: static size_t used_at_mark_start();
>> 548: static size_t used_at_relocate_end();
>> 549: static size_t live();
>
> Please call this `live_at_mark_end()` to match the names of the neighboring functions.

??

-------------

PR: https://git.openjdk.java.net/jdk/pull/2579

From jbachorik at openjdk.java.net Mon Mar 1 14:21:43 2021
From: jbachorik at openjdk.java.net (Jaroslav Bachorik)
Date: Mon, 1 Mar 2021 14:21:43 GMT
Subject: RFR: 8258431: Provide a JFR event with live set size estimate
In-Reply-To:
References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
Message-ID:

On Fri, 19 Feb 2021 08:21:36 GMT, Albert Mingkun Yang wrote:

> src/hotspot/share/jfr/periodic/jfrPeriodic.cpp line 649:
>
>> 647: TRACE_REQUEST_FUNC(HeapUsageSummary) {
>> 648: EventHeapUsageSummary event;
>> 649: if (event.should_commit()) {
>
> I believe the `should_commit` check is not needed; the period check is handled by the caller.

??
-------------

PR: https://git.openjdk.java.net/jdk/pull/2579

From jbachorik at openjdk.java.net Mon Mar 1 14:27:41 2021
From: jbachorik at openjdk.java.net (Jaroslav Bachorik)
Date: Mon, 1 Mar 2021 14:27:41 GMT
Subject: RFR: 8258431: Provide a JFR event with live set size estimate
In-Reply-To:
References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
Message-ID: <6L-2nr_YRy2MikNW7fhxblbLswAZD3L05fliGk36WTM=.822e958d-3cc1-43cd-88fb-d4e6bbbed012@github.com>

On Mon, 22 Feb 2021 17:12:43 GMT, Thomas Schatzl wrote:

> src/hotspot/share/gc/shared/genCollectedHeap.cpp line 683:
>
>> 681: }
>> 682: // update the live size after last GC
>> 683: _live_size = _young_gen->live() + _old_gen->live();
>
> I would prefer if that code were placed into `gc_epilogue`.

??

-------------

PR: https://git.openjdk.java.net/jdk/pull/2579

From ysuenaga at openjdk.java.net Mon Mar 1 14:50:00 2021
From: ysuenaga at openjdk.java.net (Yasumasa Suenaga)
Date: Mon, 1 Mar 2021 14:50:00 GMT
Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v2]
In-Reply-To: <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com>
References: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com>
Message-ID:

On Mon, 1 Mar 2021 13:57:45 GMT, Anton Kozlov wrote:

>> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision:
>>
>> refactoring
>
> src/hotspot/os_cpu/linux_aarch64/vm_version_linux_aarch64.cpp line 172:
>
>> 170:
>> 171: void VM_Version::get_compatible_board(char *buf, int buflen) {
>> 172: const char *aarch64_label = "AArch64";
>
> All platforms seem to declare themselves `AArch64`, this probably can be in the shared aarch64 code.

VM_Version is not inherited, and platform-specific functions are declared in os_linux and os_windows. So I think it is difficult to declare shared (default) function for this purpose.
It is the best if we declare shared function, and override it like a virtual function. But it seems to be difficult. Do you have any idea?
-------------

PR: https://git.openjdk.java.net/jdk/pull/2759

From jbachorik at openjdk.java.net Mon Mar 1 15:11:41 2021
From: jbachorik at openjdk.java.net (Jaroslav Bachorik)
Date: Mon, 1 Mar 2021 15:11:41 GMT
Subject: RFR: 8258431: Provide a JFR event with live set size estimate
In-Reply-To:
References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
Message-ID:

On Mon, 1 Mar 2021 14:03:37 GMT, Jaroslav Bachorik wrote:

>> src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 579:
>>
>>> 577: event.set_heapLive(heap->live());
>>> 578: event.commit();
>>> 579: }
>>
>> On the first sight, this belongs in `ShenandoahConcurrentMark::finish_mark()`. Placing the event here would fire the event when concurrent GC is cancelled, which is not what you want.
>
> ??

Actually, this shouldn't even be here. `EventGCHeapSummary` is emitted via `trace_heap*` calls which should already be hooked into Shenandoah. Let me remove this.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2579

From jbachorik at openjdk.java.net Mon Mar 1 15:27:24 2021
From: jbachorik at openjdk.java.net (Jaroslav Bachorik)
Date: Mon, 1 Mar 2021 15:27:24 GMT
Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v2]
In-Reply-To:
References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
Message-ID: <8Q_OEFTu-Npp_Pr74VPWQ6DrQUzcaDtVZy7nru2RCbU=.7e80a5b9-be94-4442-beaf-9593b067241d@github.com>

On Mon, 1 Mar 2021 14:17:17 GMT, Aleksey Shipilev wrote:

>> The change also misses liveness update after G1 Full GC: it should at least reset the internal liveness counter to 0 so that `used()` is used.
>> I think there is the same issue for Parallel Full GC. Serial seems to be handled.
>
> Another general comment about Shenandoah. It would seem easier to piggyback liveness summarization on region iteration that heuristics does at the end of mark anyway. See `ShenandoahHeuristics::choose_collection_set`. I can do that when you are done with your changes, or try it yourself.

I have addressed comments with trivial fixes. Will take a look at the remainder of more complex ones next.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2579

From jbachorik at openjdk.java.net Mon Mar 1 15:27:22 2021
From: jbachorik at openjdk.java.net (Jaroslav Bachorik)
Date: Mon, 1 Mar 2021 15:27:22 GMT
Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v2]
In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
Message-ID:

Jaroslav Bachorik has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision:

 - Merge remote-tracking branch 'origin/master' into jb/live_set_1
 - Change dead space calculation
 - Common PR fixes
 - Minor G1 related PR fixes
 - Epsilon related PR fixes
 - Shenandoah related PR fixes
 - Rename ZStatHeap::live() to live_at_mark_end()
 - Update event definition and emission
 - 8258431: Provide a JFR event with live set size estimate

-------------

Changes:
 - all: https://git.openjdk.java.net/jdk/pull/2579/files
 - new: https://git.openjdk.java.net/jdk/pull/2579/files/ddc5b5c1..03a8617e

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=00-01

Stats: 45701 lines in 1355 files changed: 27365 ins; 10881 del; 7455 mod
Patch: https://git.openjdk.java.net/jdk/pull/2579.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/2579/head:pull/2579

PR: https://git.openjdk.java.net/jdk/pull/2579

From jbachorik at openjdk.java.net Mon Mar 1 15:27:26 2021
From: jbachorik at openjdk.java.net (Jaroslav Bachorik)
Date: Mon, 1 Mar 2021 15:27:26 GMT
Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v2]
In-Reply-To:
References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
Message-ID:

On Thu, 18 Feb 2021 10:27:21 GMT, Aleksey Shipilev wrote:

>> src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 627:
>>
>>> 625:
>>> 626: size_t ShenandoahHeap::live() const {
>>> 627: size_t live = Atomic::load_acquire(&_live);
>>
>> I understand you copy-pasted from the same file. We have removed `_acquire` with #2504. Do `Atomic::load` here.
>
> ...which also means you want to merge from master to get recent changes?

Yep. Done.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2579

From jbachorik at openjdk.java.net Mon Mar 1 15:27:32 2021
From: jbachorik at openjdk.java.net (Jaroslav Bachorik)
Date: Mon, 1 Mar 2021 15:27:32 GMT
Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v2]
In-Reply-To:
References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
Message-ID:

On Thu, 18 Feb 2021 10:23:53 GMT, Aleksey Shipilev wrote:

> src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 655:
>
>> 653:
>> 654: void ShenandoahHeap::set_live(size_t bytes) {
>> 655: Atomic::release_store_fence(&_live, bytes);
>
> Same, do `Atomic::store` here.

??
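The two review comments above ask for plain atomic accesses to `_live` instead of acquire/release ones. A minimal sketch of that idea in standard C++ follows. It is not HotSpot code: `LiveCounter` is a hypothetical class, and mapping HotSpot's `Atomic::load` / `Atomic::store` to `std::memory_order_relaxed` is an assumption made for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical model of a single published liveness value.
// std::atomic with relaxed ordering stands in for plain atomic
// load/store; that mapping is an assumption, not HotSpot's API.
class LiveCounter {
 public:
  // Writer side: called once marking has produced a total.
  void set_live(std::size_t bytes) {
    _live.store(bytes, std::memory_order_relaxed);
  }

  // Reader side: called by the code that emits the periodic event.
  std::size_t live() const {
    return _live.load(std::memory_order_relaxed);
  }

 private:
  std::atomic<std::size_t> _live{0};
};
```

Relaxed ordering is sufficient in this sketch because the reader only consumes the number itself; no other data is published through it, so there is nothing for an acquire/release pair to order.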
-------------

PR: https://git.openjdk.java.net/jdk/pull/2579

From jbachorik at openjdk.java.net Mon Mar 1 15:27:38 2021
From: jbachorik at openjdk.java.net (Jaroslav Bachorik)
Date: Mon, 1 Mar 2021 15:27:38 GMT
Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v2]
In-Reply-To:
References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
Message-ID:

On Mon, 22 Feb 2021 17:16:46 GMT, Thomas Schatzl wrote:

> src/hotspot/share/gc/shared/space.inline.hpp line 189:
>
>> 187: oop obj = oop(cur_obj);
>> 188: size_t obj_size = obj->size();
>> 189: live_offset += obj_size;
>
> It seems more natural to me to put this counting into the `DeadSpacer` as this is what this change does. Also, the actual dead space "used" can be calculated from the difference between the `_allowed_deadspace_words` and the maximum (calculated in the constructor of `DeadSpacer`) afaict at the end of evacuation. So there is no need to incur per-object costs during evacuation at all.

Something like https://github.com/openjdk/jdk/pull/2579/commits/b28e7a0959a4655173bf1599bd035ab668196af6 would be ok?
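The calculation suggested in the quoted comment (dead space actually used = initial budget minus remaining budget, read once at the end) can be illustrated with a small model. This is only a sketch under stated assumptions: `DeadSpaceBudget`, `try_keep_dead`, `consumed_words` and `live_words` are hypothetical names and do not reproduce the HotSpot `DeadSpacer`; only the subtraction mirrors the comment.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a dead-space budget measured in words.
// Not the HotSpot DeadSpacer; it only illustrates the arithmetic.
class DeadSpaceBudget {
 public:
  explicit DeadSpaceBudget(std::size_t max_words)
      : _max_words(max_words), _allowed_words(max_words) {}

  // Try to leave a dead range of `words` uncompacted.
  // Succeeds only while enough budget remains.
  bool try_keep_dead(std::size_t words) {
    if (words > _allowed_words) {
      return false;
    }
    _allowed_words -= words;
    return true;
  }

  // Dead words left in place = initial budget - remaining budget.
  // Computed once on demand, so no per-object counter is needed.
  std::size_t consumed_words() const { return _max_words - _allowed_words; }

 private:
  const std::size_t _max_words;
  std::size_t _allowed_words;
};

// Live estimate: used words minus the dead words left uncompacted.
inline std::size_t live_words(std::size_t used_words,
                              const DeadSpaceBudget& budget) {
  assert(budget.consumed_words() <= used_words);
  return used_words - budget.consumed_words();
}
```

The point of the design is that the budget is already decremented as part of deciding whether a dead range may stay, so the live estimate costs one subtraction per space rather than one addition per object.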
-------------

PR: https://git.openjdk.java.net/jdk/pull/2579

From jbachorik at openjdk.java.net Mon Mar 1 15:30:02 2021
From: jbachorik at openjdk.java.net (Jaroslav Bachorik)
Date: Mon, 1 Mar 2021 15:30:02 GMT
Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v2]
In-Reply-To:
References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
Message-ID:

On Mon, 22 Feb 2021 19:37:37 GMT, Erik Gahlin wrote:

> src/hotspot/share/jfr/metadata/metadata.xml line 205:
>
>> 203:
>> 204:
>> 205:
>
> I think it would be good to mention in the description that it is an estimate, i.e. "Estimate of live bytes ....".

??

-------------

PR: https://git.openjdk.java.net/jdk/pull/2579

From jbachorik at openjdk.java.net Mon Mar 1 15:37:06 2021
From: jbachorik at openjdk.java.net (Jaroslav Bachorik)
Date: Mon, 1 Mar 2021 15:37:06 GMT
Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v3]
In-Reply-To:
References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
Message-ID:

On Mon, 22 Feb 2021 17:08:08 GMT, Thomas Schatzl wrote:

>> src/hotspot/share/gc/parallel/parallelScavengeHeap.hpp line 79:
>>
>>> 77: size_t _young_live;
>>> 78: size_t _eden_live;
>>> 79: size_t _old_live;
>>
>> It's only the sum that's ever exposed, right? I wonder if it makes sense to merge them into one var to only track the sum.
>
> I agree because they seem to be always read and written at the same time.

The original idea was that the sum might be computed from different areas depending on the GC phase. But, apparently, that's not the case so we can have just one common sum. ??

-------------

PR: https://git.openjdk.java.net/jdk/pull/2579

From jbachorik at openjdk.java.net Mon Mar 1 15:37:05 2021
From: jbachorik at openjdk.java.net (Jaroslav Bachorik)
Date: Mon, 1 Mar 2021 15:37:05 GMT
Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v3]
In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
Message-ID:

Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision:

  Do not track young, eden and old live size separately

-------------

Changes:
 - all: https://git.openjdk.java.net/jdk/pull/2579/files
 - new: https://git.openjdk.java.net/jdk/pull/2579/files/03a8617e..01c22ce6

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=01-02

Stats: 13 lines in 3 files changed: 1 ins; 7 del; 5 mod
Patch: https://git.openjdk.java.net/jdk/pull/2579.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/2579/head:pull/2579

PR: https://git.openjdk.java.net/jdk/pull/2579

From jbachorik at openjdk.java.net Mon Mar 1 15:41:11 2021
From: jbachorik at openjdk.java.net (Jaroslav Bachorik)
Date: Mon, 1 Mar 2021 15:41:11 GMT
Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v4]
In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com>
Message-ID:

> The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in worst case 'at last full GC') in form of a periodically emitted JFR event.
>
> ## Introducing new JFR event
>
> While there is already 'GC Heap Summary' JFR event it does not fit the requirements as it is closely tied to GC cycle so e.g.
for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events being present in the JFR recording at all. > Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information about the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. > > ## Implementation > > The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value. > > The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct. > > ### Epsilon GC > > Trivial implementation - just return `used()` instead. > > ### Serial GC > > Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can use these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). > > ### Parallel GC > > For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). > > ### G1 GC > > Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects.
Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. > > ### Shenandoah > > In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. > This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. > > ### ZGC > > `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Fix dangling space ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2579/files - new: https://git.openjdk.java.net/jdk/pull/2579/files/01c22ce6..6a1aa73e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/2579.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2579/head:pull/2579 PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Mon Mar 1 16:34:04 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 1 Mar 2021 16:34:04 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v5] In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> References: 
<7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: > The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in worst case 'at last full GC') in form of a periodically emitted JFR event. > > ## Introducing new JFR event > > While there is already 'GC Heap Summary' JFR event it does not fit the requirements as it is closely tied to GC cycle so eg. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events being present in the JFR recording at all. > Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information abut the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. > > ## Implementation > > The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value. > > The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct. > > ### Epsilon GC > > Trivial implementation - just return `used()` instead. 
> > ### Serial GC > > Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can use these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). > > ### Parallel GC > > For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). > > ### G1 GC > > Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. > > ### Shenandoah > > In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. > This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. > > ### ZGC > > `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method.
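As a rough illustration of the common part described above (a cached liveness value that falls back to `used()` until a GC cycle has produced an estimate), here is a minimal C++ sketch. `SketchHeap` and the `kNotComputed` sentinel are hypothetical names chosen for this example; this is not HotSpot's `CollectedHeap`, and the actual patch may represent the "not yet calculated" state differently. The `set_live` name only mirrors the call quoted in the G1 review comment.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a heap; NOT HotSpot's CollectedHeap.
class SketchHeap {
 public:
  explicit SketchHeap(size_t used_bytes) : _used(used_bytes) {}

  // Bytes currently occupied, live or not.
  size_t used() const { return _used; }

  // Assumed to be called by the collector once a cycle has produced
  // a liveness estimate.
  void set_live(size_t live_bytes) {
    _live.store(live_bytes, std::memory_order_release);
  }

  // Returns the cached estimate, or used() if no estimate exists yet.
  size_t live() const {
    size_t cached = _live.load(std::memory_order_acquire);
    return cached == kNotComputed ? used() : cached;
  }

 private:
  // Sentinel meaning "no GC cycle has reported liveness yet" (an assumption
  // of this sketch).
  static constexpr size_t kNotComputed = SIZE_MAX;

  size_t _used;
  std::atomic<size_t> _live{kNotComputed};
};
```

The atomic field is one way to let a periodic event thread read the value while a GC thread writes it; the description above mentions a volatile field for Shenandoah, so this choice is an assumption of the sketch.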
Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Fix syntax error ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2579/files - new: https://git.openjdk.java.net/jdk/pull/2579/files/6a1aa73e..dd204d8c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/2579.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2579/head:pull/2579 PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Mon Mar 1 17:37:11 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 1 Mar 2021 17:37:11 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v6] In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: > The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in worst case 'at last full GC') in form of a periodically emitted JFR event. > > ## Introducing new JFR event > > While there is already 'GC Heap Summary' JFR event it does not fit the requirements as it is closely tied to GC cycle so e.g. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events being present in the JFR recording at all. > Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information about the heap capacity, the used and live bytes.
This information is available from all GC implementations and can be provided at literally any time. > > ## Implementation > > The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value. > > The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct. > > ### Epsilon GC > > Trivial implementation - just return `used()` instead. > > ### Serial GC > > Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can use these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). > > ### Parallel GC > > For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). > > ### G1 GC > > Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. > > ### Shenandoah > > In Shenandoah, the regions are keeping the liveness info.
However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. > This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. > > ### ZGC > > `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Attempt to fix G1 live set size computation ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2579/files - new: https://git.openjdk.java.net/jdk/pull/2579/files/dd204d8c..08c715ab Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=04-05 Stats: 19 lines in 1 file changed: 7 ins; 10 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/2579.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2579/head:pull/2579 PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Mon Mar 1 17:37:12 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 1 Mar 2021 17:37:12 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v6] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Mon, 22 Feb 2021 17:00:19 GMT, Thomas Schatzl wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: >> >> Attempt to fix G1 live set size computation > > src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 1114: > >> 1112: >> 1113: _g1h->set_live(live_size * HeapWordSize); >> 1114: > > This code is 
located in the wrong place. It will return only the live words for the areas that have been marked, not eden or objects allocated in old gen after the marking started. > > Further it iterates over all regions, which can be large compared to actually active regions. > > A better place is in `G1UpdateRemSetTrackingBeforeRebuild::do_heap_region()` after the last method call - at that point, `HeapRegion::live_bytes()` contains the per-region number of live data for all regions. > > `G1UpdateRemSetTrackingBeforeRebuild` is instantiated and then called by multiple threads. It's probably best that that `HeapClosure` locally sums up the live byte estimates and then in the caller `G1UpdateRemSetTrackingBeforeRebuildTask::work()` sums up the per thread results like is done for `G1UpdateRemSetTrackingBeforeRebuildTask::_total_selected_for_rebuild`, which is then set in the caller of the `G1UpdateRemSetTrackingBeforeRebuildTask`. Would something along the line of https://github.com/openjdk/jdk/pull/2579/commits/08c715abccddbd04ced58706b7a705670843b43a be ok? ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From mseledtsov at openjdk.java.net Mon Mar 1 18:42:20 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Mon, 1 Mar 2021 18:42:20 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v2] In-Reply-To: References: Message-ID: <8PYYCG3F0i8fbS3usJ9ij5Dc5DdQJrPIWJdaxX0gUbc=.32039234-1800-4bbc-b9a4-fce202e3df55@github.com> > This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. > > Here is what I did so far: > - created a UnitTestThread and a main test runner, based on gtests with similar needs > - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) > to the newly created gtest while wrapping it into a TestReservedSpace class. 
I did not change the code of the test. > - removed invocations from whitebox.cpp > > Testing: > - ran GTestWrapper on usual platforms - All PASS > - ensured that ReservedSpaceConcurrent is in the logs and passed > > After gathering the feedback my plan is: > Plan: > - move the remaining internal Memory/VirtualSpace tests into a gTest > - I am thinking about using separate files for each test > - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code Mikhailo Seledtsov has updated the pull request incrementally with three additional commits since the last revision: - Adding const qualifiers - Renaming concurrentTestRunner.inline.hpp to follow hotspot_naming_convention for c/cpp/hpp files - Making TestRunnable to be pure virtual ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2436/files - new: https://git.openjdk.java.net/jdk/pull/2436/files/4076519f..8c626c88 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=00-01 Stats: 203 lines in 5 files changed: 96 ins; 100 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/2436.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2436/head:pull/2436 PR: https://git.openjdk.java.net/jdk/pull/2436 From mseledtsov at openjdk.java.net Mon Mar 1 19:58:19 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Mon, 1 Mar 2021 19:58:19 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v3] In-Reply-To: References: Message-ID: > This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. 
> > Here is what I did so far: > - created a UnitTestThread and a main test runner, based on gtests with similar needs > - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) > to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. > - removed invocations from whitebox.cpp > > Testing: > - ran GTestWrapper on usual platforms - All PASS > - ensured that ReservedSpaceConcurrent is in the logs and passed > > After gathering the feedback my plan is: > Plan: > - move the remaining internal Memory/VirtualSpace tests into a gTest > - I am thinking about using separate files for each test > - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: Fixed memory leak ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2436/files - new: https://git.openjdk.java.net/jdk/pull/2436/files/8c626c88..4cc3b975 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=01-02 Stats: 14 lines in 4 files changed: 8 ins; 2 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/2436.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2436/head:pull/2436 PR: https://git.openjdk.java.net/jdk/pull/2436 From mseledtsov at openjdk.java.net Mon Mar 1 19:58:19 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Mon, 1 Mar 2021 19:58:19 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v3] In-Reply-To: References: Message-ID: On Fri, 26 Feb 2021 00:48:29 GMT, Igor Ignatyev wrote: >> Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed memory leak > > Changes requested by 
iignatyev (Reviewer). I believe I have addressed the 2nd round of review feedback. @iignatev Igor, could you please take a look when you have a chance? ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From dcubed at openjdk.java.net Mon Mar 1 21:08:49 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 1 Mar 2021 21:08:49 GMT Subject: RFR: 8262443: GenerateOopMap::do_interpretation can spin for a long time. In-Reply-To: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> References: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> Message-ID: On Fri, 26 Feb 2021 08:50:38 GMT, Robbin Ehn wrote: > With Safepoint/Handshake timeout enabled in rare cases this method spins for a long time, blocking safepoints/handshakes, so timeout (with a long delay) is triggered. > > In some cases we are in native while executing this method and in some in vm. > That's why there is a check for state in vm. > > Tested with other changes in t-1-7 this specific case of timeout is no longer an issue. > This change-set passes T1 stand alone. Looks good. Only a couple of minor suggestions. src/hotspot/share/oops/generateOopMap.cpp line 914: > 912: int i = 0; > 913: do { > 914: if (i != 0 && thread->is_Java_thread()) { Perhaps add: `JavaThread* jt = thread->as_Java_thread();` and use it twice below: ------------- Changes requested by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2742 From dcubed at openjdk.java.net Mon Mar 1 21:08:50 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 1 Mar 2021 21:08:50 GMT Subject: RFR: 8262443: GenerateOopMap::do_interpretation can spin for a long time.
In-Reply-To: References: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> Message-ID: On Mon, 1 Mar 2021 02:39:41 GMT, David Holmes wrote: >> With Safepoint/Handshake timeout enabled in rare cases this method spins for a long time, blocking safepoints/handshakes, so timeout (with a long delay) is triggered. >> >> In some cases we are in native while executing this method and in some in vm. >> That's why there is a check for state in vm. >> >> Tested with other changes in t-1-7 this specific case of timeout is no longer an issue. >> This change-set passes T1 stand alone. > > src/hotspot/share/oops/generateOopMap.cpp line 918: > >> 916: ThreadBlockInVM tbivm(thread->as_Java_thread()); >> 917: } >> 918: } > > Can you add a comment as to why this is necessary please. Perhaps something like this above L916: // Since this JavaThread has looped at least once and is _thread_in_vm, // we honor any pending blocking request. ------------- PR: https://git.openjdk.java.net/jdk/pull/2742 From iklam at openjdk.java.net Mon Mar 1 21:24:42 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 1 Mar 2021 21:24:42 GMT Subject: RFR: JDK-8262472: Buffer overflow in UNICODE::as_utf8 for zero length output buffer [v2] In-Reply-To: References: Message-ID: On Mon, 1 Mar 2021 05:28:08 GMT, Thomas Stuefe wrote: >> This one is trivial and probably inconsequential, but let's fix it anyway. >> >> There is a buffer overflow in both variants of UNICODE::as_utf8, where in case of truncation due to a zero length output buffer the terminating zero still gets written. >> >> Added fix + gtest. Ran gtest. > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > assert instead LGTM ------------- Marked as reviewed by iklam (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/2753 From iignatyev at openjdk.java.net Mon Mar 1 21:33:41 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Mon, 1 Mar 2021 21:33:41 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v3] In-Reply-To: References: Message-ID: On Mon, 1 Mar 2021 19:58:19 GMT, Mikhailo Seledtsov wrote: >> This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. >> >> Here is what I did so far: >> - created a UnitTestThread and a main test runner, based on gtests with similar needs >> - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) >> to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. >> - removed invocations from whitebox.cpp >> >> Testing: >> - ran GTestWrapper on usual platforms - All PASS >> - ensured that ReservedSpaceConcurrent is in the logs and passed >> >> After gathering the feedback my plan is: >> Plan: >> - move the remaining internal Memory/VirtualSpace tests into a gTest >> - I am thinking about using separate files for each test >> - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Fixed memory leak Changes requested by iignatyev (Reviewer). test/hotspot/gtest/concurrent_test_runner.inline.hpp line 89: > 87: > 88: private: > 89: const TestRunnable* unitTestRunnable; you made it a pointer to const TestRunnable, not a const pointer to TestRunnable. 
Suggestion: TestRunnable* const unitTestRunnable; test/hotspot/gtest/memory/test_virtualspace.cpp line 681: > 679: ConcurrentTestRunner testRunner(runnable, 30, 15000); > 680: testRunner.run(); > 681: delete runnable; wouldn't it be easier to allocate TestRunnable on stack and pass a pointer? Suggestion: VirtualSpaceRunnable runnable(); ConcurrentTestRunner testRunner(&runnable, 30, 15000); testRunner.run(); ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From dholmes at openjdk.java.net Mon Mar 1 22:56:48 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 1 Mar 2021 22:56:48 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: Message-ID: On Mon, 1 Mar 2021 08:31:40 GMT, Lutz Schmidt wrote: >> This looks good to me. > > Thank you for your review, Tobias! > I'll delay integration for a while to give David and Igor a chance to react. Hi @RealLucy , I didn't do an actual review just made a passing comment. ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From mseledtsov at openjdk.java.net Mon Mar 1 23:37:07 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Mon, 1 Mar 2021 23:37:07 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v4] In-Reply-To: References: Message-ID: > This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. > > Here is what I did so far: > - created a UnitTestThread and a main test runner, based on gtests with similar needs > - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) > to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. 
> - removed invocations from whitebox.cpp > > Testing: > - ran GTestWrapper on usual platforms - All PASS > - ensured that ReservedSpaceConcurrent is in the logs and passed > > After gathering the feedback my plan is: > Plan: > - move the remaining internal Memory/VirtualSpace tests into a gTest > - I am thinking about using separate files for each test > - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: Using const pointer instead of a pointer to a const value for TestRunnable ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2436/files - new: https://git.openjdk.java.net/jdk/pull/2436/files/4cc3b975..f46b5ffd Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=02-03 Stats: 6 lines in 1 file changed: 0 ins; 0 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/2436.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2436/head:pull/2436 PR: https://git.openjdk.java.net/jdk/pull/2436 From mseledtsov at openjdk.java.net Tue Mar 2 00:00:17 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Tue, 2 Mar 2021 00:00:17 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v5] In-Reply-To: References: Message-ID: > This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. > > Here is what I did so far: > - created a UnitTestThread and a main test runner, based on gtests with similar needs > - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) > to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. 
> - removed invocations from whitebox.cpp > > Testing: > - ran GTestWrapper on usual platforms - All PASS > - ensured that ReservedSpaceConcurrent is in the logs and passed > > After gathering the feedback my plan is: > Plan: > - move the remaining internal Memory/VirtualSpace tests into a gTest > - I am thinking about using separate files for each test > - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: Simplified uses of TestRunnable ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2436/files - new: https://git.openjdk.java.net/jdk/pull/2436/files/f46b5ffd..f8db673b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=03-04 Stats: 12 lines in 3 files changed: 0 ins; 4 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/2436.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2436/head:pull/2436 PR: https://git.openjdk.java.net/jdk/pull/2436 From mseledtsov at openjdk.java.net Tue Mar 2 00:00:18 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Tue, 2 Mar 2021 00:00:18 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v3] In-Reply-To: References: Message-ID: On Mon, 1 Mar 2021 21:30:58 GMT, Igor Ignatyev wrote: >> Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed memory leak > > test/hotspot/gtest/memory/test_virtualspace.cpp line 681: > >> 679: ConcurrentTestRunner testRunner(runnable, 30, 15000); >> 680: testRunner.run(); >> 681: delete runnable; > > wouldn't it be easier to allocate TestRunnable on stack and pass a pointer? 
> Suggestion: > > VirtualSpaceRunnable runnable(); > ConcurrentTestRunner testRunner(&runnable, 30, 15000); > testRunner.run(); Thanks Igor for these suggestions. I have updated the code accordingly. ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From iignatyev at openjdk.java.net Tue Mar 2 00:05:51 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Tue, 2 Mar 2021 00:05:51 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v5] In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 00:00:17 GMT, Mikhailo Seledtsov wrote: >> This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. >> >> Here is what I did so far: >> - created a UnitTestThread and a main test runner, based on gtests with similar needs >> - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) >> to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. >> - removed invocations from whitebox.cpp >> >> Testing: >> - ran GTestWrapper on usual platforms - All PASS >> - ensured that ReservedSpaceConcurrent is in the logs and passed >> >> After gathering the feedback my plan is: >> Plan: >> - move the remaining internal Memory/VirtualSpace tests into a gTest >> - I am thinking about using separate files for each test >> - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Simplified uses of TestRunnable test/hotspot/gtest/concurrent_test_runner.inline.hpp line 67: > 65: unitTestRunnable{runnableArg}, > 66: nrOfThreads{nrOfThreadsArg}, > 67: testDurationMillis{testDurationMillisArg} {} why did you use `{}` instead of `()` to init these (as well as UnitTestThread's) fields? 
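To make the review points above concrete (pointer-to-const versus const pointer, a stack-allocated runnable, and `{}` versus `()` in a member initializer list), here is a minimal C++ sketch. `Runnable`, `CountingRunnable`, and `Runner` are hypothetical stand-ins, not the `TestRunnable` or `ConcurrentTestRunner` classes from the PR, and the runner calls the runnable sequentially instead of starting threads.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for illustration; NOT the classes from the PR.
struct Runnable {
  virtual void run() = 0;
  virtual ~Runnable() = default;
};

struct CountingRunnable : Runnable {
  int calls = 0;
  void run() override { ++calls; }
};

class Runner {
 public:
  // For these scalar and pointer members, `{}` and `()` initialize the same
  // way; braces additionally reject narrowing conversions.
  Runner(Runnable* runnable, int threads, long duration_ms)
      : _runnable{runnable}, _threads(threads), _duration_ms{duration_ms} {}

  // Sequential stand-in for a multi-threaded run: one call per "thread".
  void run() const {
    for (int i = 0; i < _threads; i++) {
      _runnable->run();
    }
  }

  long duration_ms() const { return _duration_ms; }

 private:
  Runnable* const _runnable;  // const pointer: not reseatable, pointee mutable
  const int _threads;
  const long _duration_ms;
};

// `const Runnable*` is a pointer to const (the pointer itself is not const);
// `Runnable* const` is a const pointer.
static_assert(!std::is_const<const Runnable*>::value, "pointer to const");
static_assert(std::is_const<Runnable* const>::value, "const pointer");
```

With these definitions, `CountingRunnable r; Runner runner(&r, 3, 10); runner.run();` leaves `r.calls` at 3. Note that writing `CountingRunnable r();` would declare a function rather than an object, so a stack-allocated runnable is declared without the empty parentheses (or with `{}`).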
------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From ysuenaga at openjdk.java.net Tue Mar 2 00:12:02 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Tue, 2 Mar 2021 00:12:02 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v3] In-Reply-To: References: Message-ID: > HotSpot generates CPU description when it is started. We can see it `jdk.CPUInformation` JFR event as below: > > $ jfr print --events jdk.CPUInformation raspi4.jfr > jdk.CPUInformation { > startTime = 22:57:13.521 > cpu = "AArch64" > description = "AArch64 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > `description` contains "AArch64", it is fixed value, we cannot guess the process was run on what machine (SoC). > > In Linux, we can use `compatible`property in device tree to guess the machine. The 'compatible' property contains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with sorted from most compatible to least. > > After this change, we can get the description as below: > > jdk.CPUInformation { > startTime = 00:32:49.767 > cpu = "AArch64" > description = "raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > In Linux on AMD64, we can see as following, then we can guess the CPU model from it. The same should do for AArch64. > > jdk.CPUInformation { > startTime = 17:28:03.907 > cpu = "AMD (null) (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 SSE4A AMD64" > description = "Brand: AMD Ryzen 3 3300X 4-Core Processor , Vendor: AuthenticAMD > Family: (0x17), Model: (0x71), Stepping: 0x0 > Ext. family: 0x8, Ext. model: 0x7, Type: 0x0, Signature: 0x00870f10 > Features: ebx: 0x01020800, ecx: 0xfed83203, edx: 0x178bfbff > Ext. 
features: eax: 0x00870f10, ebx: 0x20000000, ecx: 0x004003f3, edx: 0x2fd3fbff > Supports: On-Chip FPU, Virtual Mode Extensions, Debugging Extensions, Page Size Extensions, Time Stamp Counter, Model Specific Registers, Physical Address Extension, Machine Check Exceptions, CMPXCHG8B Instruction, On-Chip APIC, Fast System Call, Memory Type Range Registers, Page Global Enable, Machine Check Architecture, Conditional Mov Instruction, Page Attribute Table, 36-bit Page Size Extension, CLFLUSH Instruction, Intel Architecture MMX Technology, Fast Float Point Save and Restore, Streaming SIMD extensions, Streaming SIMD extensions 2, Hyper Threading, Streaming SIMD Extensions 3, PCLMULQDQ, Supplemental Streaming SIMD Extensions 3, Fused Multiply-Add, CMPXCHG16B, Streaming SIMD extensions 4.1, Streaming SIMD extensions 4.2, MOVBE, Popcount instruction, AESNI, XSAVE, OSXSAVE, AVX, F16C, LAHF/SAHF instruction support, Core multi-processor leagacy mode, Advanced Bit Manipulations: LZCNT, SSE4A: MOVNTSS, MOVNTSD, EXTRQ, INSERTQ, Misaligned SSE mode, SYSCALL/SYSRET, Execute Dis able Bit, RDTSCP, Intel 64 Architecture" > sockets = 1 > cores = 2 > hwThreads = 2 > } Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: refactoring ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2759/files - new: https://git.openjdk.java.net/jdk/pull/2759/files/8c895361..f62fa768 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2759&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2759&range=01-02 Stats: 9 lines in 1 file changed: 1 ins; 8 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/2759.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2759/head:pull/2759 PR: https://git.openjdk.java.net/jdk/pull/2759 From ysuenaga at openjdk.java.net Tue Mar 2 00:12:03 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Tue, 2 Mar 2021 00:12:03 GMT Subject: RFR: 8262491: AArch64: 
CPU description should contain compatible board list [v2] In-Reply-To: <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com> References: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com> Message-ID: On Mon, 1 Mar 2021 13:59:41 GMT, Anton Kozlov wrote: >> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: >> >> refactoring > > src/hotspot/os_cpu/linux_aarch64/vm_version_linux_aarch64.cpp line 180: > >> 178: fstat(fd, &statbuf); >> 179: if (buflen < statbuf.st_size) { >> 180: strncpy(buf, aarch64_label, buflen); > > This line is duplicated multiple times in this function, please consider reorganizing the code so we certainly copy the string before return from this function. I fixed them in new commit. ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From mseledtsov at openjdk.java.net Tue Mar 2 00:55:16 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Tue, 2 Mar 2021 00:55:16 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: Message-ID: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> > This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. > > Here is what I did so far: > - created a UnitTestThread and a main test runner, based on gtests with similar needs > - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) > to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. 
> - removed invocations from whitebox.cpp > > Testing: > - ran GTestWrapper on usual platforms - All PASS > - ensured that ReservedSpaceConcurrent is in the logs and passed > > After gathering the feedback my plan is: > Plan: > - move the remaining internal Memory/VirtualSpace tests into a gTest > - I am thinking about using separate files for each test > - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: Using regular brackets in initializer list instead of curly brackets ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2436/files - new: https://git.openjdk.java.net/jdk/pull/2436/files/f8db673b..d9f618c2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=04-05 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/2436.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2436/head:pull/2436 PR: https://git.openjdk.java.net/jdk/pull/2436 From iignatyev at openjdk.java.net Tue Mar 2 01:00:50 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Tue, 2 Mar 2021 01:00:50 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> Message-ID: On Tue, 2 Mar 2021 00:55:16 GMT, Mikhailo Seledtsov wrote: >> This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. 
>> >> Here is what I did so far: >> - created a UnitTestThread and a main test runner, based on gtests with similar needs >> - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) >> to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. >> - removed invocations from whitebox.cpp >> >> Testing: >> - ran GTestWrapper on usual platforms - All PASS >> - ensured that ReservedSpaceConcurrent is in the logs and passed >> >> After gathering the feedback my plan is: >> Plan: >> - move the remaining internal Memory/VirtualSpace tests into a gTest >> - I am thinking about using separate files for each test >> - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Using regular brackets in initializer list instead of curly brackets Marked as reviewed by iignatyev (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From stuefe at openjdk.java.net Tue Mar 2 04:31:45 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 2 Mar 2021 04:31:45 GMT Subject: RFR: JDK-8262472: Buffer overflow in UNICODE::as_utf8 for zero length output buffer [v2] In-Reply-To: References: Message-ID: On Mon, 1 Mar 2021 21:21:49 GMT, Ioi Lam wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> assert instead > > LGTM Thanks Ioi and David! 
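
The "assert instead" approach for JDK-8262472 can be sketched as follows. This is a minimal illustration, not `UNICODE::as_utf8` itself: `copy_truncated` is a hypothetical helper that copies plain bytes, and it only assumes the same contract (the terminating zero is always written, so a zero-length buffer is a caller error).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies as much of src as fits and always writes a
// terminating zero, so buflen must be at least 1.
char* copy_truncated(const char* src, char* buf, std::size_t buflen) {
  assert(buflen > 0 && "zero-length output buffer");
  std::size_t n = std::strlen(src);
  if (n > buflen - 1) {
    n = buflen - 1;           // truncate, leaving room for the terminator
  }
  std::memcpy(buf, src, n);
  buf[n] = '\0';              // in bounds only because buflen > 0
  return buf;
}
```

In a release build the assert compiles away, which is the trade-off discussed in the review: correct callers pay nothing, and incorrect callers are expected to be caught by debug-build testing.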
------------- PR: https://git.openjdk.java.net/jdk/pull/2753 From stuefe at openjdk.java.net Tue Mar 2 04:31:46 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 2 Mar 2021 04:31:46 GMT Subject: Integrated: JDK-8262472: Buffer overflow in UNICODE::as_utf8 for zero length output buffer In-Reply-To: References: Message-ID: On Fri, 26 Feb 2021 17:47:12 GMT, Thomas Stuefe wrote: > This one is trivial and probably inconsequential, but let's fix it anyway. > > There is a buffer overflow in both variants of UNICODE::as_utf8: when the output is truncated because the output buffer has zero length, the terminating zero still gets written. > > Added fix + gtest. Ran gtest. This pull request has now been integrated. Changeset: f5ab7f68 Author: Thomas Stuefe URL: https://git.openjdk.java.net/jdk/commit/f5ab7f68 Stats: 58 lines in 2 files changed: 57 ins; 0 del; 1 mod 8262472: Buffer overflow in UNICODE::as_utf8 for zero length output buffer Reviewed-by: dholmes, iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/2753 From stuefe at openjdk.java.net Tue Mar 2 04:33:39 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 2 Mar 2021 04:33:39 GMT Subject: RFR: JDK-8261552: s390: MacroAssembler::encode_klass_not_null() may produce wrong results for non-zero values of narrow klass base [v3] In-Reply-To: References: Message-ID: On Mon, 1 Mar 2021 12:42:33 GMT, Martin Doerr wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> update > > Marked as reviewed by mdoerr (Reviewer). Thanks @TheRealMDoerr and @RealLucy for advice and reviews!
------------- PR: https://git.openjdk.java.net/jdk/pull/2595 From stuefe at openjdk.java.net Tue Mar 2 04:33:41 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 2 Mar 2021 04:33:41 GMT Subject: Integrated: JDK-8261552: s390: MacroAssembler::encode_klass_not_null() may produce wrong results for non-zero values of narrow klass base In-Reply-To: References: Message-ID: On Tue, 16 Feb 2021 20:49:49 GMT, Thomas Stuefe wrote: > If the compressed class pointer base has a non-zero value, it may cause MacroAssembler::encode_klass_not_null() to encode a Klass pointer to a wrong narrow pointer. > > This can be reproduced by starting the VM with > -Xshare:dump -XX:HeapBaseMinAddress=2g -Xmx128m > but CDS is not involved. It is only relevant insofar as this is the only way to get the following combination: > - heap is allocated at 0x8000_0000. It is small and ends at 0x8800_0000. > - class space follows at 0x8800_0000 > - the narrow klass pointer base points to the start of the class space at 0x8800_0000. > > In MacroAssembler::encode_klass_not_null(), there is the following section: > > if (base != NULL) { > unsigned int base_h = ((unsigned long)base)>>32; > unsigned int base_l = (unsigned int)((unsigned long)base); > if ((base_h != 0) && (base_l == 0) && VM_Version::has_HighWordInstr()) { > lgr_if_needed(dst, current); > z_aih(dst, -((int)base_h)); // Base has no set bits in lower half. > } else if ((base_h == 0) && (base_l != 0)) { (A) > lgr_if_needed(dst, current); > z_agfi(dst, -(int)base_l); (B) > } else { > load_const(Z_R0, base); > lgr_if_needed(dst, current); > z_sgr(dst, Z_R0); > } > current = dst; > } > > We enter the condition at (A) if the narrow klass pointer base is non-zero but fits into 32bit. At (B), we want to subtract the base from the Klass pointer; we do this by calculating the 32bit two's complement of the base and adding it with AGFI. AGFI adds a 32bit immediate to a 64bit register.
In this case, it produces the wrong result if the base is >0x8000_0000: > > In the case of the crash, we have: > base: 8800_0000 > klass pointer: 8804_1040 > 32bit two's complement of base: 7800_0000 > added to the klass pointer: 1_0004_1040 > > So the result of the "subtraction" is 1_0004_1040; it should be 4_1040, which would be the correct offset of the Klass* pointer within the ccs. > > This bug has been dormant; it was activated by JDK-8250989, which changed the way class space reservation happens at CDS dump time. It surfaced first as a crash in a CDS-specific jtreg test (JDK-8261552). > > ================ > > Fix: > > I changed the AGFI instruction to a pure 32bit add (AFI). That works as long as the Klass pointer also fits into 32bit. So I narrowed the condition at (A) to only fire if it can be ensured that both narrow base and Klass* pointers fit into 32bit. > > I also added a runtime verification in that case that any Klass pointer passed down is indeed a 32bit pointer. However, I am not really sure this is useful, or that this is the best way to do this (using TMHH and TMHL). I was looking for something like TMH or TML to check whole 32bit words but could not find any. > > ---- > > Tests: > > I manually tested that the crash disappears, which it does. I stepped through the encoding code and the values now look right. > > I also did build a VM with the ability to override both class space start address and the narrow klass pointer base to exact values (see https://github.com/openjdk/jdk/compare/master...tstuefe:override-ccs-start-and-base). > > I used this method to test various combinations: > - narrow klass pointer base > 0 < 4g + ccs end < 4g (we hit our branch doing AFI) > - narrow klass pointer base > 0 < 4g + ccs end > 4g (we hit the fallback doing SGR with r0) > - narrow klass pointer base = 0 (we don't do anything) > > (would this override-feature be useful? We could do better testing). > > Thanks, Thomas This pull request has now been integrated.
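
The arithmetic from the description above (base 8800_0000, klass pointer 8804_1040) can be reproduced with a small host-side model. This is a sketch under simplifying assumptions: `add_imm32_to_64` models AGFI as "sign-extend the 32bit immediate, then do a 64bit add", and `add_imm32_to_32` models the 32bit add as touching only the low word. All three function names are made up for illustration and none of this is s390 code.

```cpp
#include <cassert>
#include <cstdint>

// 32bit two's complement of the low half of the base, as in -(int)base_l.
// Computed in unsigned arithmetic to avoid signed overflow.
int32_t negated_base(uint32_t base_l) {
  return static_cast<int32_t>(0u - base_l);
}

// Model of AGFI: the immediate is sign-extended to 64 bits before the add.
uint64_t add_imm32_to_64(uint64_t reg, int32_t imm) {
  return reg + static_cast<uint64_t>(static_cast<int64_t>(imm));
}

// Model of a pure 32bit add: only the low word changes, and the carry out
// of bit 31 is dropped.
uint64_t add_imm32_to_32(uint64_t reg, int32_t imm) {
  uint32_t low = static_cast<uint32_t>(reg) + static_cast<uint32_t>(imm);
  return (reg & 0xFFFFFFFF00000000ull) | low;
}
```

For a base above 0x8000_0000 the negated value is positive (here 7800_0000), so sign extension adds nothing in the upper word and the 64bit add keeps the carry; the 32bit add drops it, which is what the subtraction needs as long as the Klass pointer itself fits into 32 bits.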
Changeset: fdd10932 Author: Thomas Stuefe URL: https://git.openjdk.java.net/jdk/commit/fdd10932 Stats: 79 lines in 1 file changed: 62 ins; 13 del; 4 mod 8261552: s390: MacroAssembler::encode_klass_not_null() may produce wrong results for non-zero values of narrow klass base Co-authored-by: Lutz Schmidt Reviewed-by: mdoerr, lucy ------------- PR: https://git.openjdk.java.net/jdk/pull/2595 From minqi at openjdk.java.net Tue Mar 2 05:03:38 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Tue, 2 Mar 2021 05:03:38 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS In-Reply-To: References: Message-ID: On Fri, 26 Feb 2021 22:05:06 GMT, Calvin Cheung wrote: >> Hi, Please review >> >> Added jcmd option for dumping CDS archive during application runtime. Before this change, user has to dump shared archive in two steps: first run application with >> `java -XX:DumpLoadedClassList= .... ` >> to collect shareable class names and saved in file `` , then >> `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` >> With this change, user can use jcmd to dump CDS without going through above steps. Also user can choose a moment during the app runtime to dump an archive. >> The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. >> New added jcmd option: >> `jcmd VM.cds static_dump ` >> or >> `jcmd VM.cds dynamic_dump ` >> To dump dynamic archive, requires start app with newly added flag `-XX:+RecordDynamicDumpInfo`, with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changed some object memory locations so for dumping dynamic archive, can only done once for a running app. For static dump, user can dump multiple times against same process. >> The file name is optional, if the file name is not supplied, the file name will take format of `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. 
The `` is the application process ID. >> >> Tests: tier1,tier2,tier3,tier4 >> >> Thanks >> Yumin > > Below are my comments... @calvinccheung @iklam @tstuefe Thanks for review! I will use CDS.java to implement the dumping for next update. This way, we deal with CDS related code in a central place. Also using Runtime.exec will clear your concern, plus the code will be more readable though it will add some bridge functions between java/vm. ------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From akozlov at openjdk.java.net Tue Mar 2 07:29:40 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 07:29:40 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v2] In-Reply-To: References: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com> Message-ID: On Mon, 1 Mar 2021 14:47:04 GMT, Yasumasa Suenaga wrote: >> src/hotspot/os_cpu/linux_aarch64/vm_version_linux_aarch64.cpp line 172: >> >>> 170: >>> 171: void VM_Version::get_compatible_board(char *buf, int buflen) { >>> 172: const char *aarch64_label = "AArch64"; >> >> All platforms seem to declare themselves `AArch64`, this probably can be in the shared aarch64 code. > > VM_Version is not inherited, and platform-specific functions are declared in os_linux and os_windows. So I think it is difficult to declare shared (default) function for this purpose. > It is the best if we declare shared function, and override it like a virtual function. But it seems to be difficult. Do you have any idea? Probably we can assume this function to optionally return a board name. IMHO ideally an empty string should a valid output. Now the output from this function is the prefix and the features string is the suffix of the complete string CPU description. 
What if we start with the constant `AArch64` prefix in the shared CPU code, append output from this function, and then append the features string? ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From akozlov at openjdk.java.net Tue Mar 2 07:48:33 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 07:48:33 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v22] In-Reply-To: References: Message-ID: > Please review the implementation of JEP 391: macOS/AArch64 Port. > > It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. > > Major changes are in: > * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) > * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) > * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. 
> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: Update comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2200/files - new: https://git.openjdk.java.net/jdk/pull/2200/files/663cb4a1..e42b82db Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2200&range=21 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2200&range=20-21 Stats: 4 lines in 2 files changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/2200.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2200/head:pull/2200 PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Tue Mar 2 07:48:34 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 07:48:34 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: References: Message-ID: On Mon, 1 Mar 2021 10:46:34 GMT, Andrew Haley wrote: >> They are defined in 13.2.95. MIDR_EL1, Main ID Register. Apple's code is not there, but "Arm can assign codes that are not published in this manual. All values not assigned by Arm are reserved and must not be used.". I assume the value was obtained by digging around https://github.com/apple/darwin-xnu/blob/main/osfmk/arm/cpuid.h#L62 > > Anton, this paragraph looks like an excellent comment. Thanks, I've added the comment. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Tue Mar 2 07:48:36 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 07:48:36 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: References: Message-ID: On Mon, 15 Feb 2021 17:59:54 GMT, Vladimir Kempik wrote: >> src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp line 810: >> >>> 808: #ifdef __APPLE__ >>> 809: // Less-than word types are stored one after another. 
>>> 810: // The code unable to handle this, bailout. >> >> Perhaps: // The code is unable to handle this so bailout. > > Hello, we have updated PR, now this bailout is used only by the code which can handle it (native wrapper generator), for the rest it will cause guarantee failed if this bailout is triggered I've fixed the spelling. Sorry for it to take so long :( >> src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 195: >> >>> 193: frame os::get_sender_for_C_frame(frame* fr) { >>> 194: return frame(fr->link(), fr->link(), fr->sender_pc()); >>> 195: } >> >> Is this file going to be built by GCC or just macOS compilers? > > there is no support for compiling java with gcc on macos since about jdk11, only clang. > considering this and the absence of gcc for macos_m1, the answer is - just macOS compilers. I've fixed the comment. Now it states `JVM compiled with -fno-omit-frame-pointer, so RFP is saved on the stack.` ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Tue Mar 2 08:06:49 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 08:06:49 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: References: Message-ID: On Fri, 12 Feb 2021 12:40:09 GMT, Florian Weimer wrote: >> only macos comiplers > > The comment is also wrong for glibc: The AArch64 ABI requires a 64 KiB guard region independently of page size, otherwise `-fstack-clash-protection` is not reliable. Thanks, I deleted the comment. It describes implementation distinguishing initial and regular thread http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/548cb3b7b713#l12.7. I'm not sure why the comment was preserved for the bsd_x86, but I don't think it makes sense here. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Tue Mar 2 08:14:46 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 08:14:46 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: References: Message-ID: On Fri, 12 Feb 2021 13:32:52 GMT, Vladimir Kempik wrote: >> src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 486: >> >>> 484: } >>> 485: } >>> 486: } >> >> This appears to be a mix for Mavericks (10.9) and 10.12 >> work arounds. Is this code needed by this project? > > I wasn't able to replicate JDK-8020753 and JDK-8186286. So will remove these workaround > @gerard-ziemski, 8020753 was originally your fix, do you know if it still needed on intel-mac ? The x86_bsd still carries the workaround https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/bsd_x86/os_bsd_x86.cpp#L745. It's worth having macos ports to be aligned by features. I would propose to have this workaround for now, and decide on it later for macos/x86 and macos/aarch64 at once. Sorry for chiming in so late. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From ysuenaga at openjdk.java.net Tue Mar 2 08:26:46 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Tue, 2 Mar 2021 08:26:46 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v2] In-Reply-To: References: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com> Message-ID: On Tue, 2 Mar 2021 07:25:18 GMT, Anton Kozlov wrote: >> VM_Version is not inherited, and platform-specific functions are declared in os_linux and os_windows. So I think it is difficult to declare shared (default) function for this purpose. >> It is the best if we declare shared function, and override it like a virtual function. But it seems to be difficult. 
Do you have any idea? > > Probably we can assume this function to optionally return a board name. IMHO ideally an empty string should a valid output. Now the output from this function is the prefix and the features string is the suffix of the complete string CPU description. What if we start with the constant `AArch64` prefix in the shared CPU code, append output from this function, and then append the features string? Do you mean `VM_Version::get_compatible_board()` should return `NULL` instead of `AArch64` to add it as a prefix at the caller? I don't want to do so because we should return board name as possible like a x86. In Windows AArch64, it seems to have SoC name in registry. Now I do not have Windows AArch64, so I just set "AArch64" for it, but I hope someone work for it in future. https://stackoverflow.com/questions/60588765/how-to-get-cpu-brand-information-in-arm64 ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From dongbo at openjdk.java.net Tue Mar 2 08:36:00 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Tue, 2 Mar 2021 08:36:00 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code Message-ID: The aarch64 LSE atomic operations are introduced to C++ hotspot code in JDK-8261027 and optimized in JDK-8261649. For memory_order_conservative, the acquire semantics in atomic instructions, i.e. ldaddal, swpal, casal, ensure that no subsequent accesses can pass the atomic operations. We also have a trailing dmb to ensure barrier-ordered-after relationship, it can ensure what the acquire does. So the acquire semantics is no longer needed, {ldaddl, swpl, casl} would be enough. 
Checked by using the herd7 consistency model simulator with the test in comments before `gen_cas_entry`: AArch64 LseCasAfter { 0:X1=x; 0:X2=y; 1:X1=x; 1:X2=y; } P0 | P1 ; LDR W4, [X2] | MOV W3, #0 ; DMB LD | MOV W4, #1 ; LDR W3, [X1] | CASL W3, W4, [X1] ; | DMB ISH ; | STR W4, [X2] ; exists (0:X3=0 /\ 0:X4=1) No `X3 == 0 && X4 == 1` witnessed. Remove the acquire semantics does not allow prior accesses to pass the atomic operations, because the release semantics are still there. Just in case, checked by herd7 with the testcase below: AArch64 LseCasPrior { 0:X1=x; 0:X2=y; 1:X1=x; 1:X2=y; } P0 | P1 ; LDR W3, [X1] | MOV W3, #0 ; DMB LD | MOV W4, #1 ; LDR W4, [X2] | STR W4, [X2] ; | CASL W3, W4, [X1] ; | DMB ISH ; exists (0:X3=1 /\ 0:X4=0) No `X3 == 1 && X4 == 0` witnessed. Similarly, the default implementations of `atomic_fetch_add` and `atomic_xchg` via `ldaxr+stlxr+dmb` can be replaced by `ldxr+stlxr+dmb`. ------------- Commit messages: - 8262519: AArch64: Unnecessary acquire semantics of atomics in C++ Hotspot code Changes: https://git.openjdk.java.net/jdk/pull/2788/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2788&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8262519 Stats: 7 lines in 2 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/2788.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2788/head:pull/2788 PR: https://git.openjdk.java.net/jdk/pull/2788 From lucy at openjdk.java.net Tue Mar 2 08:37:50 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 2 Mar 2021 08:37:50 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: Message-ID: On Mon, 1 Mar 2021 22:54:15 GMT, David Holmes wrote: >> Thank you for your review, Tobias! >> I'll delay integration for a while to give David and Igor a chance to react. > > Hi @RealLucy , I didn't do an actual review just made a passing comment. @dholmes-ora OK then. 
I just didn't want to ignore anyone's opinion. I plan to integrate the change before my EOB (GMT+1), provided there are no objections. ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From akozlov at openjdk.java.net Tue Mar 2 09:02:10 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 09:02:10 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: References: Message-ID: On Tue, 2 Feb 2021 22:18:43 GMT, Daniel D. Daugherty wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> support macos_aarch64 in hsdis > > src/hotspot/os_cpu/bsd_aarch64/thread_bsd_aarch64.cpp line 43: > >> 41: assert(Thread::current() == this, "caller must be current thread"); >> 42: return pd_get_top_frame(fr_addr, ucontext, isInJava); >> 43: } > > Is AsyncGetCallTrace() being supported by this port? I assume the answer is yes (I have a little experience with AsyncGetCallTrace). This code is identical to the one in bsd/x86 and linux/aarch64. After a few changes in the build system, I got the jtreg/serviceability/AsyncGetCallTrace test to compile and pass. I've filed JDK-8262839 to enable the test.
------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From jbachorik at openjdk.java.net Tue Mar 2 09:07:55 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 2 Mar 2021 09:07:55 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v6] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Mon, 22 Feb 2021 17:10:28 GMT, Thomas Schatzl wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: >> >> Attempt to fix G1 live set size computation > > src/hotspot/share/gc/parallel/parallelScavengeHeap.inline.hpp line 49: > >> 47: _young_live = young_gen()->used_in_bytes(); >> 48: _eden_live = young_gen()->eden_space()->used_in_bytes(); >> 49: _old_live = old_gen()->used_in_bytes(); > > `_young_live` already seems to contain `_eden_live` looking at the implementation of `PSYoungGen::used_in_bytes()`: > > I.e. > > `size_t PSYoungGen::used_in_bytes() const { > return eden_space()->used_in_bytes() > + from_space()->used_in_bytes(); // to_space() is only used during scavenge > } > ` > > but maybe I'm wrong here. This seems like a correct summary - I have updated the implementation. ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Tue Mar 2 09:22:55 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 2 Mar 2021 09:22:55 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v6] In-Reply-To: <8Q_OEFTu-Npp_Pr74VPWQ6DrQUzcaDtVZy7nru2RCbU=.7e80a5b9-be94-4442-beaf-9593b067241d@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> <8Q_OEFTu-Npp_Pr74VPWQ6DrQUzcaDtVZy7nru2RCbU=.7e80a5b9-be94-4442-beaf-9593b067241d@github.com> Message-ID: On Mon, 1 Mar 2021 15:24:48 GMT, Jaroslav Bachorik wrote: >> Another general comment about Shenandoah. 
It would seem easier to piggyback liveness summarization on region iteration that heuristics does at the end of mark anyway. See `ShenandoahHeuristics::choose_collection_set`. I can do that when you are done with your changes, or try it yourself. > > I have addressed comments with trivial fixes. > Will take a look at the remainder of more complex ones next. Going over my initial changes I am starting to doubt the usefulness of defaulting to `used` value when `live` estimate is not available. It seems to be giving false information - perhaps it would be ok to return `0` as an invalid value to indicate that that particular information is not available and it would be up to the event consumer to fall back to using the `used` value or deal with the missing value by some other means. ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From aph at openjdk.java.net Tue Mar 2 09:35:43 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 2 Mar 2021 09:35:43 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 08:29:14 GMT, Dong Bo wrote: > The aarch64 LSE atomic operations are introduced to C++ hotspot code in JDK-8261027 and optimized in JDK-8261649. > For memory_order_conservative, the acquire semantics in atomic instructions, i.e. ldaddal, swpal, casal, ensure that no subsequent accesses can pass the atomic operations. > We also have a trailing dmb to ensure barrier-ordered-after relationship, it can ensure what the acquire does. So the acquire semantics is no longer needed, {ldaddl, swpl, casl} would be enough. 
> > Checked by using the herd7 consistency model simulator with the test in comments before `gen_cas_entry`: > AArch64 LseCasAfter > { 0:X1=x; 0:X2=y; 1:X1=x; 1:X2=y; } > P0 | P1 ; > LDR W4, [X2] | MOV W3, #0 ; > DMB LD | MOV W4, #1 ; > LDR W3, [X1] | CASL W3, W4, [X1] ; > | DMB ISH ; > | STR W4, [X2] ; > exists > (0:X3=0 /\ 0:X4=1) > No `X3 == 0 && X4 == 1` witnessed. > > Remove the acquire semantics does not allow prior accesses to pass the atomic operations, because the release semantics are still there. > Just in case, checked by herd7 with the testcase below: > AArch64 LseCasPrior > { 0:X1=x; 0:X2=y; 1:X1=x; 1:X2=y; } > P0 | P1 ; > LDR W3, [X1] | MOV W3, #0 ; > DMB LD | MOV W4, #1 ; > LDR W4, [X2] | STR W4, [X2] ; > | CASL W3, W4, [X1] ; > | DMB ISH ; > exists > (0:X3=1 /\ 0:X4=0) > No `X3 == 1 && X4 == 0` witnessed. > > Similarly, the default implementations of `atomic_fetch_add` and `atomic_xchg` via `ldaxr+stlxr+dmb` can be replaced by `ldxr+stlxr+dmb`. No. Try this with and without the acquire: Jon1 { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; 1:X4=z; } P0 | P1 ; MOV W0,#1 | LDR W0,[X1]; STR W0,[X1] | CASAL W5,W6,[X4]; MOV W2,#1 | LDR W2,[X3]; STLR W2,[X3] | ; exists (1:X0=1 /\ 1:X2=0)``` ------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From akozlov at openjdk.java.net Tue Mar 2 09:36:51 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 09:36:51 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: References: Message-ID: <0dfmwagb4EGbnQnZIzpQ_36c9_fpvvebrgnKDW4E9Ps=.e54e9fb2-7500-40d8-8202-ce4903fd106f@github.com> On Tue, 2 Feb 2021 22:47:04 GMT, Daniel D. 
Daugherty wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> support macos_aarch64 in hsdis > > src/hotspot/share/prims/nativeEntryPoint.cpp line 45: > >> 43: guarantee(status == JNI_OK && !env->ExceptionOccurred(), >> 44: "register jdk.internal.invoke.NativeEntryPoint natives"); >> 45: JNI_END > > I thought that jcheck caught a missing new-line? It seems it did not https://github.com/openjdk/jdk/commit/0fb31dbf3a2adc0f7fb2f9924083908724dcde5a#diff-f39cd3f794a337734adf30863f702725ee04182fee2345b2669e59ebed17a2ccR44. Anyway I reverted this change as it is the only change in the file, thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From shade at openjdk.java.net Tue Mar 2 09:54:40 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 2 Mar 2021 09:54:40 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v6] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> <8Q_OEFTu-Npp_Pr74VPWQ6DrQUzcaDtVZy7nru2RCbU=.7e80a5b9-be94-4442-beaf-9593b067241d@github.com> Message-ID: <850jkqk912ORnuVtTM7JhVB4gP5BlxuggFktpkmJVvU=.f076b561-4b3a-4378-83b7-99b14b75e099@github.com> On Tue, 2 Mar 2021 09:20:21 GMT, Jaroslav Bachorik wrote: >> I have addressed comments with trivial fixes. >> Will take a look at the remainder of more complex ones next. > > Going over my initial changes I am starting to doubt the usefulness of defaulting to `used` value when `live` estimate is not available. It seems to be giving false information - perhaps it would be ok to return `0` as an invalid value to indicate that that particular information is not available and it would be up to the event consumer to fall back to using the `used` value or deal with the missing value by some other means. 
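The consumer-side fallback suggested in the quoted paragraph might look roughly like the sketch below. It assumes `0` is used as the "not available" marker, as proposed there; the class, method and constant names are hypothetical and are not part of JFR or of this PR.

```java
public class LiveEstimateFallback {
    // Assumed sentinel (hypothetical name): 0 means no live estimate has been computed yet.
    static final long LIVE_UNAVAILABLE = 0L;

    /** True when the reported live value is a real estimate rather than the sentinel. */
    static boolean hasLiveEstimate(long live) {
        return live != LIVE_UNAVAILABLE;
    }

    /** Consumer-side choice: use the live estimate if present, otherwise fall back to used bytes. */
    static long liveOrUsed(long used, long live) {
        return hasLiveEstimate(live) ? live : used;
    }

    public static void main(String[] args) {
        // No estimate yet: the consumer falls back to the used value.
        System.out.println(liveOrUsed(600L, 0L));
        // Estimate available: the consumer reports it unchanged.
        System.out.println(liveOrUsed(600L, 250L));
    }
}
```

Keeping the fallback in the consumer means the event itself never reports `used` under the name `live`, which is the concern raised in the quoted text.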
Shenandoah parts are better like this (applies on top of your PR): https://cr.openjdk.java.net/~shade/8258431/shenandoah.patch ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From akozlov at openjdk.java.net Tue Mar 2 11:07:56 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 11:07:56 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: References: Message-ID: On Tue, 2 Feb 2021 23:10:17 GMT, Daniel D. Daugherty wrote: > For platform files that were copied from other ports to this port, if the file wasn't > changed I presume the copyright years are left alone. If the file required changes > for this port, I expect the year to be updated to 2021. How are you verifying that > these copyright years are being properly managed on the new files? There are no exact copies, based on git -c diff.renameLimit=10000000 diff --find-copies-harder -C75% --name-status upstream/master... So every file changed in the branch potentially needs the copyright update. All file diffs are not trivial, IMHO. I'll run the copyright update after we fix a few remaining issues with the PR, to avoid updating copyright and changing/reverting the actual content. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From github.com+10482586+therealeliu at openjdk.java.net Tue Mar 2 11:19:53 2021 From: github.com+10482586+therealeliu at openjdk.java.net (Eric Liu) Date: Tue, 2 Mar 2021 11:19:53 GMT Subject: RFR: 8261142: AArch64: Incorrect instruction encoding when right-shifting vectors with shift amount equals to the element width [v11] In-Reply-To: References: Message-ID: On Fri, 26 Feb 2021 06:10:01 GMT, Dong Bo wrote: >> In vectorAPI, when right-shifting a vector with a shift equals to the element width, the shift is transformed to zero, >> see `src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorOperators.java`: >> /** Produce {@code a>>>(n&(ESIZE*8-1))}. Integral only. 
*/ >> public static final /*bitwise*/ Binary LSHR = binary("LSHR", ">>>", VectorSupport.VECTOR_OP_URSHIFT, VO_SHIFT); >> >> The aarch64 assembler generates wrong or illegal instructions in this case, e.g. for the JAVA code below on aarch64, >> assembler call `__ ushr(dst, __ T8B, src, 0)`, the instruction generated is not `ushr dst.8B, src.8B, 0`, but `ushr dst.4H, src.4H, 16` instead. >> According to local tests, JVM gives wrong results for byte/short and crashes with SIGILL for integer/long. >> ByteVector vba = ByteVector.fromArray(byte64SPECIES, bytesA, 8 * i); >> vbb.lanewise(VectorOperators.ASHR, 8).intoArray(arrBytes, 8 * i); >> >> The legal right shift amount should be in the range 1 to the element width in bits on aarch64: >> https://developer.arm.com/documentation/dui0801/f/A64-SIMD-Vector-Instructions/USHR--vector-?lang=en >> >> This fix handles zero shift separately. If the shift is zero, it generates `orr` for right shift, `addv` for right shift and accumulate. >> Verified with linux-aarch64-server-fastdebug, tier1. Also created a jtreg to reproduce the issue and for regression tests. > > Dong Bo has updated the pull request incrementally with one additional commit since the last revision: > > refactor tests test/hotspot/jtreg/compiler/vectorapi/TestVectorShiftImm.java line 133: > 131: static int shift_with_op(VectorOperators.Binary op, ByteVector vbb, > 132: byte arr[][], int end, int ind) { > 133: vbb.lanewise(op, 1).intoArray(arr[end++], ind); How about adding case 0 in this test? Those codes were changed. 
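To illustrate the masking quoted from VectorOperators.java above, here is a minimal scalar sketch, assuming byte lanes (ESIZE = 1, so 8 bits per lane). The class and method names are hypothetical; the code only models the documented `n&(ESIZE*8-1)` shift count and is not the Vector API or the assembler code under review.

```java
public class ShiftMaskDemo {
    // Lane width in bits for a byte lane (ESIZE*8 with ESIZE = 1); an assumption for this sketch.
    static final int BYTE_LANE_BITS = 8;

    /** Effective shift count after masking, following the documented n & (ESIZE*8-1). */
    static int effectiveShift(int n, int laneBits) {
        return n & (laneBits - 1);
    }

    /** Scalar model of a logical right shift on a single byte lane. */
    static byte lshrByteLane(byte a, int n) {
        int s = effectiveShift(n, BYTE_LANE_BITS);
        // Zero-extend the lane first so the shift is logical, then narrow back to a byte.
        return (byte) ((a & 0xFF) >>> s);
    }

    public static void main(String[] args) {
        for (int n : new int[] {0, 1, 7, 8, 9, 16, 24}) {
            System.out.println("n=" + n + " -> effective shift "
                    + effectiveShift(n, BYTE_LANE_BITS));
        }
    }
}
```

With 8-bit lanes, shift counts of 8, 16 and 24 all mask to 0, which is how a zero shift amount can reach the backend even when the Java source never writes a literal 0.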
------------- PR: https://git.openjdk.java.net/jdk/pull/2472 From stuefe at openjdk.java.net Tue Mar 2 11:28:49 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 2 Mar 2021 11:28:49 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> Message-ID: On Tue, 2 Mar 2021 00:55:16 GMT, Mikhailo Seledtsov wrote: >> This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. >> >> Here is what I did so far: >> - created a UnitTestThread and a main test runner, based on gtests with similar needs >> - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) >> to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. >> - removed invocations from whitebox.cpp >> >> Testing: >> - ran GTestWrapper on usual platforms - All PASS >> - ensured that ReservedSpaceConcurrent is in the logs and passed >> >> After gathering the feedback my plan is: >> Plan: >> - move the remaining internal Memory/VirtualSpace tests into a gTest >> - I am thinking about using separate files for each test >> - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Using regular brackets in initializer list instead of curly brackets Hi, Have we decided that the STL can be used now? If yes, I must have missed this. If no, could you please change the ConcurrentTestRunner to use GrowableArray or just a simply a malloced array? Thanks! 
..Thomas ------------- Changes requested by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2436 From jbachorik at openjdk.java.net Tue Mar 2 11:44:12 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 2 Mar 2021 11:44:12 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v7] In-Reply-To: <850jkqk912ORnuVtTM7JhVB4gP5BlxuggFktpkmJVvU=.f076b561-4b3a-4378-83b7-99b14b75e099@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> <8Q_OEFTu-Npp_Pr74VPWQ6DrQUzcaDtVZy7nru2RCbU=.7e80a5b9-be94-4442-beaf-9593b067241d@github.com> <850jkqk912ORnuVtTM7JhVB4gP5BlxuggFktpkmJVvU=.f076b561-4b3a-4378-83b7-99b14b75e099@github.com> Message-ID: On Tue, 2 Mar 2021 09:50:20 GMT, Aleksey Shipilev wrote: >> Going over my initial changes I am starting to doubt the usefulness of defaulting to `used` value when `live` estimate is not available. It seems to be giving false information - perhaps it would be ok to return `0` as an invalid value to indicate that that particular information is not available and it would be up to the event consumer to fall back to using the `used` value or deal with the missing value by some other means. > > Shenandoah parts are better like this (applies on top of your PR): https://cr.openjdk.java.net/~shade/8258431/shenandoah.patch @shipilev Thanks! The patch has been applied. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Tue Mar 2 11:44:12 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 2 Mar 2021 11:44:12 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v7] In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: <5FpR9OM3pJ2JvMS0p9s7vHaft1oVb9wGdtmSqL7IlcM=.a5a51fde-ce07-4b9a-a4a1-5d9df52eb9df@github.com> > The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in the worst case 'at last full GC') in the form of a periodically emitted JFR event. > > ## Introducing new JFR event > > While there is already a 'GC Heap Summary' JFR event, it does not fit the requirements as it is closely tied to the GC cycle, so e.g. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of the heap summary events not being present in the JFR recording at all. > Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information about the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. > > ## Implementation > > The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is the `size_t live() const` method added to the `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning the 'used' value.
> > The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct. > > ### Epsilon GC > > Trivial implementation - just return `used()` instead. > > ### Serial GC > > Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). > > ### Parallel GC > > For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). > > ### G1 GC > > Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. > > ### Shenandoah > > In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. > This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. > > ### ZGC > > `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. 
Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Proper Shenandoah implementation of live size esitmate Co-authored-by: Aleksey Shipilev ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2579/files - new: https://git.openjdk.java.net/jdk/pull/2579/files/08c715ab..aa180d11 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=05-06 Stats: 45 lines in 6 files changed: 16 ins; 27 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/2579.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2579/head:pull/2579 PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Tue Mar 2 12:15:17 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 2 Mar 2021 12:15:17 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v8] In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: > The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in the worst case 'at last full GC') in the form of a periodically emitted JFR event. > > ## Introducing new JFR event > > While there is already a 'GC Heap Summary' JFR event, it does not fit the requirements as it is closely tied to the GC cycle, so e.g. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of the heap summary events not being present in the JFR recording at all. > Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information about the heap capacity, the used and live bytes.
This information is available from all GC implementations and can be provided at literally any time. > > ## Implementation > > The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is the `size_t live() const` method added to the `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning the 'used' value. > > The implementations are based on my (rather shallow) knowledge of the inner workings of the respective GC engines and I am open to suggestions to make them better/correct. > > ### Epsilon GC > > Trivial implementation - just return `used()` instead. > > ### Serial GC > > Here we utilize the fact that the mark-copy phase is naturally compacting, so the number of bytes after copy is 'live', and that the mark-sweep implementation keeps internal info about objects that are 'dead' but excluded from the compaction effort; we can use these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). > > ### Parallel GC > > For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). > > ### G1 GC > > Using the `G1ConcurrentMark::remark()` method, the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. > > ### Shenandoah > > In Shenandoah, the regions are keeping the liveness info.
However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. > This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. > > ### ZGC > > `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Use '0' to indicate unvailable live estimate ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2579/files - new: https://git.openjdk.java.net/jdk/pull/2579/files/aa180d11..6c6f8a8a Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=06-07 Stats: 21 lines in 11 files changed: 1 ins; 4 del; 16 mod Patch: https://git.openjdk.java.net/jdk/pull/2579.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2579/head:pull/2579 PR: https://git.openjdk.java.net/jdk/pull/2579 From dongbo at openjdk.java.net Tue Mar 2 12:24:54 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Tue, 2 Mar 2021 12:24:54 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: References: Message-ID: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> On Tue, 2 Mar 2021 09:33:12 GMT, Andrew Haley wrote: >> The aarch64 LSE atomic operations are introduced to C++ hotspot code in JDK-8261027 and optimized in JDK-8261649. >> For memory_order_conservative, the acquire semantics in atomic instructions, i.e. ldaddal, swpal, casal, ensure that no subsequent accesses can pass the atomic operations. 
>> We also have a trailing dmb to ensure barrier-ordered-after relationship, it can ensure what the acquire does. So the acquire semantics is no longer needed, {ldaddl, swpl, casl} would be enough. >> >> Checked by using the herd7 consistency model simulator with the test in comments before `gen_cas_entry`: >> AArch64 LseCasAfter >> { 0:X1=x; 0:X2=y; 1:X1=x; 1:X2=y; } >> P0 | P1 ; >> LDR W4, [X2] | MOV W3, #0 ; >> DMB LD | MOV W4, #1 ; >> LDR W3, [X1] | CASL W3, W4, [X1] ; >> | DMB ISH ; >> | STR W4, [X2] ; >> exists >> (0:X3=0 /\ 0:X4=1) >> No `X3 == 0 && X4 == 1` witnessed. >> >> Remove the acquire semantics does not allow prior accesses to pass the atomic operations, because the release semantics are still there. >> Just in case, checked by herd7 with the testcase below: >> AArch64 LseCasPrior >> { 0:X1=x; 0:X2=y; 1:X1=x; 1:X2=y; } >> P0 | P1 ; >> LDR W3, [X1] | MOV W3, #0 ; >> DMB LD | MOV W4, #1 ; >> LDR W4, [X2] | STR W4, [X2] ; >> | CASL W3, W4, [X1] ; >> | DMB ISH ; >> exists >> (0:X3=1 /\ 0:X4=0) >> No `X3 == 1 && X4 == 0` witnessed. >> >> Similarly, the default implementations of `atomic_fetch_add` and `atomic_xchg` via `ldaxr+stlxr+dmb` can be replaced by `ldxr+stlxr+dmb`. > > No. Try this with and without the acquire: > > Jon1 > { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; 1:X4=z; } > P0 | P1 ; > MOV W0,#1 | LDR W0,[X1]; > STR W0,[X1] | CASAL W5,W6,[X4]; > MOV W2,#1 | LDR W2,[X3]; > STLR W2,[X3] | ; > exists > (1:X0=1 /\ 1:X2=0)``` Without the acquire, `1:X0=1 /\ 1:X2=0` exists, then we do have a problem. 
But for `memory_order_conservative`, I guess there should be a trailing DMB right after the CASAL under current `gen_cas_entry` code: __ lse_cas(prev, exchange_val, ptr, size, acquire, release, /*not_pair*/true); if (order == memory_order_conservative) { __ membar(Assembler::StoreStore|Assembler::StoreLoad); } That is: AArch64 exper { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; 1:X4=z; } P0 | P1 ; MOV W0,#1 | LDR W0,[X1] ; STR W0,[X1] | CASAL W5,W6,[X4] ; MOV W2,#1 | DMB ISH ; STLR W2,[X3] | LDR W2,[X3] ; exists (1:X0=1 /\ 1:X2=0) With the DMB, herd7 produces same results with or without acquire. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From aph at openjdk.java.net Tue Mar 2 12:33:58 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 2 Mar 2021 12:33:58 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> Message-ID: On Tue, 2 Mar 2021 12:21:53 GMT, Dong Bo wrote: >> No. Try this with and without the acquire: >> >> Jon1 >> { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; 1:X4=z; } >> P0 | P1 ; >> MOV W0,#1 | LDR W0,[X1]; >> STR W0,[X1] | CASAL W5,W6,[X4]; >> MOV W2,#1 | LDR W2,[X3]; >> STLR W2,[X3] | ; >> exists >> (1:X0=1 /\ 1:X2=0)``` > > Without the acquire, `1:X0=1 /\ 1:X2=0` exists, then we do have a problem. 
> But for `memory_order_conservative`, I guess there should be a trailing DMB right after the CASAL under current `gen_cas_entry` code: > __ lse_cas(prev, exchange_val, ptr, size, acquire, release, /*not_pair*/true); > if (order == memory_order_conservative) { > __ membar(Assembler::StoreStore|Assembler::StoreLoad); > } > That is: > AArch64 exper > { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; 1:X4=z; } > P0 | P1 ; > MOV W0,#1 | LDR W0,[X1] ; > STR W0,[X1] | CASAL W5,W6,[X4] ; > MOV W2,#1 | DMB ISH ; > STLR W2,[X3] | LDR W2,[X3] ; > exists > (1:X0=1 /\ 1:X2=0) > With the DMB, herd7 produces same results with or without acquire. > > Thanks. I don't want to see this go in, for two reasons. Firstly, barrier-ordered-before only applies to atomic instructions with both acquire *and* release semantics, as the comment says. So we cannot rely on that guarantee, and we'd need some much more detailed analysis to make this changes. Secondly, the architecture specification is being revised, and the result will hopefully be that we can get a better version than this. So please, leave this alone for now. ------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From dongbo at openjdk.java.net Tue Mar 2 12:55:59 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Tue, 2 Mar 2021 12:55:59 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> Message-ID: On Tue, 2 Mar 2021 12:30:29 GMT, Andrew Haley wrote: >> Without the acquire, `1:X0=1 /\ 1:X2=0` exists, then we do have a problem. 
>> But for `memory_order_conservative`, I guess there should be a trailing DMB right after the CASAL under current `gen_cas_entry` code: >> __ lse_cas(prev, exchange_val, ptr, size, acquire, release, /*not_pair*/true); >> if (order == memory_order_conservative) { >> __ membar(Assembler::StoreStore|Assembler::StoreLoad); >> } >> That is: >> AArch64 exper >> { 0:X1=x; 0:X3=y; 1:X1=y; 1:X3=x; 1:X4=z; } >> P0 | P1 ; >> MOV W0,#1 | LDR W0,[X1] ; >> STR W0,[X1] | CASAL W5,W6,[X4] ; >> MOV W2,#1 | DMB ISH ; >> STLR W2,[X3] | LDR W2,[X3] ; >> exists >> (1:X0=1 /\ 1:X2=0) >> With the DMB, herd7 produces same results with or without acquire. >> >> Thanks. > > I don't want to see this go in, for two reasons. Firstly, barrier-ordered-before only applies to atomic instructions with both acquire *and* release semantics, as the comment says. So we cannot rely on that guarantee, and we'd need some much more detailed analysis to make this changes. Secondly, the architecture specification is being revised, and the result will hopefully be that we can get a better version than this. So please, leave this alone for now. OKAY, this make sense to us. If it is OK to keep the exclusive part of this patch? :-) As far as we know, the exclusive instructions are not being revised. And we see `ldxr+stxlr+dmb` have been used in linux kernel since 2014 [1], and still used by now [2]. 
[1] https://patchwork.kernel.org/project/linux-arm-kernel/patch/1391516953-14541-1-git-send-email-will.deacon@arm.com/ [2] https://github.com/torvalds/linux/blob/7a7fd0de4a9804299793e564a555a49c1fc924cb/arch/arm64/include/asm/atomic_ll_sc.h#L102 ------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From dongbo at openjdk.java.net Tue Mar 2 13:22:57 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Tue, 2 Mar 2021 13:22:57 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> Message-ID: On Tue, 2 Mar 2021 12:52:59 GMT, Dong Bo wrote: >> I don't want to see this go in, for two reasons. Firstly, barrier-ordered-before only applies to atomic instructions with both acquire *and* release semantics, as the comment says. So we cannot rely on that guarantee, and we'd need some much more detailed analysis to make this changes. Secondly, the architecture specification is being revised, and the result will hopefully be that we can get a better version than this. So please, leave this alone for now. > > OKAY, this make sense to us. > > If it is OK to keep the exclusive part of this patch? :-) > As far as we know, the exclusive instructions are not being revised. > And we see `ldxr+stxlr+dmb` have been used in linux kernel since 2014 [1], and still used by now [2]. > > [1] https://patchwork.kernel.org/project/linux-arm-kernel/patch/1391516953-14541-1-git-send-email-will.deacon@arm.com/ > [2] https://github.com/torvalds/linux/blob/7a7fd0de4a9804299793e564a555a49c1fc924cb/arch/arm64/include/asm/atomic_ll_sc.h#L102 BTW, the barrier-ordered-before applies with stlxr according to the architecture specification: ... any of the following cases apply: ... RW2 is a write W2 and either:
- RW1 is a write W1 appearing in program order before a DMB ST that appears in program order before W2. - **W2 is generated by an instruction with Release semantics.** ------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From dongbo at openjdk.java.net Tue Mar 2 13:46:05 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Tue, 2 Mar 2021 13:46:05 GMT Subject: RFR: 8261142: AArch64: Incorrect instruction encoding when right-shifting vectors with shift amount equals to the element width [v12] In-Reply-To: References: Message-ID: > In vectorAPI, when right-shifting a vector with a shift equals to the element width, the shift is transformed to zero, > see `src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorOperators.java`: > /** Produce {@code a>>>(n&(ESIZE*8-1))}. Integral only. */ > public static final /*bitwise*/ Binary LSHR = binary("LSHR", ">>>", VectorSupport.VECTOR_OP_URSHIFT, VO_SHIFT); > > The aarch64 assembler generates wrong or illegal instructions in this case, e.g. for the JAVA code below on aarch64, > assembler call `__ ushr(dst, __ T8B, src, 0)`, the instruction generated is not `ushr dst.8B, src.8B, 0`, but `ushr dst.4H, src.4H, 16` instead. > According to local tests, JVM gives wrong results for byte/short and crashes with SIGILL for integer/long. > ByteVector vba = ByteVector.fromArray(byte64SPECIES, bytesA, 8 * i); > vbb.lanewise(VectorOperators.ASHR, 8).intoArray(arrBytes, 8 * i); > > The legal right shift amount should be in the range 1 to the element width in bits on aarch64: > https://developer.arm.com/documentation/dui0801/f/A64-SIMD-Vector-Instructions/USHR--vector-?lang=en > > This fix handles zero shift separately. If the shift is zero, it generates `orr` for right shift, `addv` for right shift and accumulate. > Verified with linux-aarch64-server-fastdebug, tier1. Also created a jtreg to reproduce the issue and for regression tests.
Dong Bo has updated the pull request incrementally with one additional commit since the last revision: make zero shift amount obviously ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2472/files - new: https://git.openjdk.java.net/jdk/pull/2472/files/24d6e9f8..00745e33 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2472&range=11 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2472&range=10-11 Stats: 9 lines in 1 file changed: 8 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/2472.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2472/head:pull/2472 PR: https://git.openjdk.java.net/jdk/pull/2472 From dongbo at openjdk.java.net Tue Mar 2 13:49:57 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Tue, 2 Mar 2021 13:49:57 GMT Subject: RFR: 8261142: AArch64: Incorrect instruction encoding when right-shifting vectors with shift amount equals to the element width [v11] In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 11:14:39 GMT, Eric Liu wrote: >> Dong Bo has updated the pull request incrementally with one additional commit since the last revision: >> >> refactor tests > > test/hotspot/jtreg/compiler/vectorapi/TestVectorShiftImm.java line 133: > >> 131: static int shift_with_op(VectorOperators.Binary op, ByteVector vbb, >> 132: byte arr[][], int end, int ind) { >> 133: vbb.lanewise(op, 1).intoArray(arr[end++], ind); > > How about adding case 0 in this test? Those codes were changed. Thank you for watching this. It can be covered by `lanewise(op, 8)`, `lanewise(op, 16)` and `lanewise(op, 24)`. But I still added it, it looks clearer. @theRealAph could you please take a look at the newest version? The tests work well on our two very different aarch64 platforms. Or the tests still looks too complicated? Please let me know if you have any suggestions. Thanks. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2472 From jbachorik at openjdk.java.net Tue Mar 2 14:33:14 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 2 Mar 2021 14:33:14 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v9] In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: > The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in worst case 'at last full GC') in form of a periodically emitted JFR event. > > ## Introducing new JFR event > > While there is already 'GC Heap Summary' JFR event it does not fit the requirements as it is closely tied to GC cycle so eg. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events being present in the JFR recording at all. > Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information abut the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. > > ## Implementation > > The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value. 
> > The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct. > > ### Epsilon GC > > Trivial implementation - just return `used()` instead. > > ### Serial GC > > Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). > > ### Parallel GC > > For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). > > ### G1 GC > > Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. > > ### Shenandoah > > In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. > This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. > > ### ZGC > > `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. 
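The fallback rule described under "Implementation" (return the cached liveness value if one exists, otherwise return the used value) can be sketched as follows. This is an illustration under stated assumptions only: `SketchHeap` and `record_live_after_gc` are hypothetical names and do not reproduce the real `CollectedHeap` interface or any GC's bookkeeping.

```cpp
#include <cstddef>

// Hypothetical stand-in for a heap; not the real CollectedHeap interface.
class SketchHeap {
public:
    explicit SketchHeap(std::size_t used_bytes) : _used(used_bytes) {}

    std::size_t used() const { return _used; }

    // Would be called at the end of a (hypothetical) GC cycle to cache
    // the number of live bytes found by that cycle.
    void record_live_after_gc(std::size_t live_bytes) {
        _live = live_bytes;
        _live_valid = true;
    }

    // Estimate of the live set size; falls back to used() until a
    // cycle has recorded a value.
    std::size_t live() const { return _live_valid ? _live : used(); }

private:
    std::size_t _used;
    std::size_t _live = 0;
    bool _live_valid = false;
};
```

A periodic event emitter would then only read `used()` and `live()`, which is what makes the estimate cheap: no heap walk happens at emission time.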
Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Add tests for the heap usage summary event ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2579/files - new: https://git.openjdk.java.net/jdk/pull/2579/files/6c6f8a8a..f6954186 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=07-08 Stats: 24 lines in 5 files changed: 24 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/2579.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2579/head:pull/2579 PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Tue Mar 2 14:33:14 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 2 Mar 2021 14:33:14 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v9] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> <8Q_OEFTu-Npp_Pr74VPWQ6DrQUzcaDtVZy7nru2RCbU=.7e80a5b9-be94-4442-beaf-9593b067241d@github.com> <850jkqk912ORnuVtTM7JhVB4gP5BlxuggFktpkmJVvU=.f076b561-4b3a-4378-83b7-99b14b75e099@github.com> Message-ID: On Tue, 2 Mar 2021 11:41:19 GMT, Jaroslav Bachorik wrote: >> Shenandoah parts are better like this (applies on top of your PR): https://cr.openjdk.java.net/~shade/8258431/shenandoah.patch > > @shipilev Thanks! The patch has been applied. @pliden @tschatzl @albertnetymk @egahlin I believe I addressed all review comments. Can you, please, take a second look? 
------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Tue Mar 2 14:33:15 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 2 Mar 2021 14:33:15 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v9] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Thu, 18 Feb 2021 10:25:34 GMT, Aleksey Shipilev wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: >> >> Add tests for the heap usage summary event > > src/hotspot/share/gc/shenandoah/shenandoahHeap.inline.hpp line 511: > >> 509: >> 510: ShenandoahCollectLiveSizeClosure cl; >> 511: heap_region_iterate(&cl); > > I think you want `parallel_heap_region_iterate` on this path, and do `Atomic::add(&_live, r->get_live_data_bytes())` in the closure. We shall see if this makes sense to make fully concurrently... 
Outdated by the patch provided by @shipilev. ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From akozlov at openjdk.java.net Tue Mar 2 14:34:51 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 14:34:51 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v10] In-Reply-To: References: Message-ID: On Thu, 4 Feb 2021 22:15:47 GMT, Gerard Ziemski wrote: >> Anton Kozlov has updated the pull request incrementally with six additional commits since the last revision: >> >> - Merge remote-tracking branch 'origin/jdk/jdk-macos' into jdk-macos >> - Add comments to WX transitions >> >> + minor change of placements >> - Use macro conditionals instead of empty functions >> - Add W^X to tests >> - Do not require known W^X state >> - Revert w^x in gtests > > src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 322: > >> 320: #ifdef __APPLE__ >> 321: } else if (sig == SIGFPE && info->si_code == FPE_NOOP) { >> 322: Unimplemented(); > > Is there a follow up issue for this? Thanks, this is a leftover from the development phase; it will be removed. On macOS/x86, this looks like a workaround. We have never hit this condition, and it looks like recent Darwin kernels should correctly report the cause in si_code: https://github.com/apple/darwin-xnu/blob/33eb9835cd948dbbcdd8741aa52457cbe507c765/bsd/dev/arm/unix_signal.c#L436.
------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From aph at openjdk.java.net Tue Mar 2 14:56:41 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 2 Mar 2021 14:56:41 GMT Subject: RFR: 8261142: AArch64: Incorrect instruction encoding when right-shifting vectors with shift amount equals to the element width [v12] In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 13:46:05 GMT, Dong Bo wrote: >> In the Vector API, when right-shifting a vector with a shift equal to the element width, the shift is transformed to zero, >> see `src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorOperators.java`: >> /** Produce {@code a>>>(n&(ESIZE*8-1))}. Integral only. */ >> public static final /*bitwise*/ Binary LSHR = binary("LSHR", ">>>", VectorSupport.VECTOR_OP_URSHIFT, VO_SHIFT); >> >> The aarch64 assembler generates wrong or illegal instructions in this case. E.g. for the Java code below on aarch64, >> the assembler call is `__ ushr(dst, __ T8B, src, 0)`, but the instruction generated is not `ushr dst.8B, src.8B, 0` but `ushr dst.4H, src.4H, 16`. >> According to local tests, the JVM gives wrong results for byte/short and crashes with SIGILL for integer/long. >> ByteVector vba = ByteVector.fromArray(byte64SPECIES, bytesA, 8 * i); >> vbb.lanewise(VectorOperators.ASHR, 8).intoArray(arrBytes, 8 * i); >> >> The legal right shift amount on aarch64 is in the range 1 to the element width in bits: >> https://developer.arm.com/documentation/dui0801/f/A64-SIMD-Vector-Instructions/USHR--vector-?lang=en >> >> This fix handles zero shift separately. If the shift is zero, it generates `orr` for right shift, `addv` for right shift and accumulate. >> Verified with linux-aarch64-server-fastdebug, tier1. Also created a jtreg test to reproduce the issue and serve as a regression test. > > Dong Bo has updated the pull request incrementally with one additional commit since the last revision: > > make zero shift amount obviously Marked as reviewed by aph (Reviewer).
------------- PR: https://git.openjdk.java.net/jdk/pull/2472 From aph at openjdk.java.net Tue Mar 2 16:03:50 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 2 Mar 2021 16:03:50 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> Message-ID: On Tue, 2 Mar 2021 12:52:59 GMT, Dong Bo wrote: > OKAY, this makes sense to us. > > Is it OK to keep the exclusive part of this patch? :-) > As far as we know, the exclusive instructions are not being revised. > And we see `ldxr+stlxr+dmb` has been used in the Linux kernel since 2014 [1], and is still used now [2]. I know that, but the Linux definition of a "full barrier" isn't quite as strong as HotSpot's `memory_order_conservative`, so we'd need a much more detailed analysis of what behaviours we can permit. Also, we'd have to find a strong reason to invest time in AArch64 without LSE instructions. > BTW, the barrier-ordered-before applies with stlxr according to the architecture specification: Sure, but so what? This is about the entire ldxr/stlxr combination and `memory_order_conservative`, in which we try to mimic Intel's "Loads and Stores Are Not Reordered with Locked Instructions" specification.
------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From ayang at openjdk.java.net Tue Mar 2 16:04:40 2021 From: ayang at openjdk.java.net (Albert Mingkun Yang) Date: Tue, 2 Mar 2021 16:04:40 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v9] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Tue, 2 Mar 2021 14:33:14 GMT, Jaroslav Bachorik wrote: >> The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in worst case 'at last full GC') in form of a periodically emitted JFR event. >> >> ## Introducing new JFR event >> >> While there is already 'GC Heap Summary' JFR event it does not fit the requirements as it is closely tied to GC cycle so eg. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events being present in the JFR recording at all. >> Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information abut the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. >> >> ## Implementation >> >> The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value. >> >> The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct. 
>> >> ### Epsilon GC >> >> Trivial implementation - just return `used()` instead. >> >> ### Serial GC >> >> Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). >> >> ### Parallel GC >> >> For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). >> >> ### G1 GC >> >> Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. >> >> ### Shenandoah >> >> In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. >> This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. >> >> ### ZGC >> >> `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. > > Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: > > Add tests for the heap usage summary event Marked as reviewed by ayang (Author). 
------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From iveresov at openjdk.java.net Tue Mar 2 17:13:01 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 2 Mar 2021 17:13:01 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 08:34:29 GMT, Lutz Schmidt wrote: >> Hi @RealLucy , I didn't do an actual review just made a passing comment. > > @dholmes-ora OK then. I just didn't want to ignore anyone's opinion. > I plan to integrate the change before my EOB (GMT+1), provided there are no objections. I left a comment a while ago about the unsigned int casts (in ```Method::print_invocation_count()```) that I don't understand the reason for. Could you please comment on that? ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From kvn at openjdk.java.net Tue Mar 2 17:17:49 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 2 Mar 2021 17:17:49 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: Message-ID: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> On Thu, 25 Feb 2021 09:01:10 GMT, Lutz Schmidt wrote: >> Dear community, >> may I please request reviews for this fix, improving the usefulness of method invocation counters. >> - aggregation counters are retyped as uint64_t, shifting the overflow probability way out (> 500 years in case of a 1 GHz counter update frequency). >> - counters for individual methods are interpreted as (unsigned int), in contrast to their declaration as int. This gives us a factor of two before the counters overflow. >> - as a special case, "compiled_invocation_counter" is retyped as long, because it has a higher update frequency than other counters. >> - before/after sample output is attached to the bug description. >> >> Thank you! 
>> Lutz > > Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: > > comment changes requested by TheRealMDoerr I have few comments. src/hotspot/cpu/x86/vtableStubs_x86_32.cpp line 159: > 157: return NULL; > 158: } > 159: Why you did not update asm instruction to update `nof_megamorphic_calls` in this file? src/hotspot/share/oops/method.cpp line 530: > 528: > 529: if (method_data() != NULL) { > 530: unsigned int dcc = (unsigned int)method_data()->decompile_count(); decompile_count() returns `uint` why do cast and why you check decompile_count for overflow? It is very rare updated and limited by `PerMethodRecompilationCutoff` flag (400 by default): https://github.com/openjdk/jdk/blob/master/src/hotspot/share/oops/methodData.hpp#L2391 src/hotspot/share/oops/method.cpp line 518: > 516: // Print a "overflow" notification to create awareness. > 517: const char* addMsg; > 518: unsigned int maxInt = (1U<<31) - 1; Why not use INT_MAX? ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From lucy at openjdk.java.net Tue Mar 2 17:26:42 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 2 Mar 2021 17:26:42 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> References: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> Message-ID: On Tue, 2 Mar 2021 16:34:26 GMT, Vladimir Kozlov wrote: >> Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: >> >> comment changes requested by TheRealMDoerr > > src/hotspot/cpu/x86/vtableStubs_x86_32.cpp line 159: > >> 157: return NULL; >> 158: } >> 159: > > Why you did not update asm instruction to update `nof_megamorphic_calls` in this file? The reason is plain simple: there is no incrementq() for x86_32. 
I could emulate that with a few lines like:

    address ctrAddr = (address)SharedRuntime::nof_megamorphic_calls_addr();
    __ lea(rscratch1, ExternalAddress(ctrAddr));
    __ addl(Address(rscratch1, 0), 1);
    __ adcl(Address(rscratch1, 4), 0);

Not sure if that would be desirable here. Just let me know. As is, the code just updates the less significant half of the 8-byte counter. ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From lucy at openjdk.java.net Tue Mar 2 17:33:52 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 2 Mar 2021 17:33:52 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> References: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> Message-ID: On Tue, 2 Mar 2021 16:44:44 GMT, Vladimir Kozlov wrote: >> Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: >> >> comment changes requested by TheRealMDoerr > > src/hotspot/share/oops/method.cpp line 530: > >> 528: >> 529: if (method_data() != NULL) { >> 530: unsigned int dcc = (unsigned int)method_data()->decompile_count(); > > decompile_count() returns `uint`, so why the cast, and why do you check decompile_count for overflow? It is very rarely updated and limited by the `PerMethodRecompilationCutoff` flag (400 by default): > https://github.com/openjdk/jdk/blob/master/src/hotspot/share/oops/methodData.hpp#L2391 That uint escaped my attention. And with the cutoff parameter, overflow is no issue. I can remove the cast and the check. > src/hotspot/share/oops/method.cpp line 518: > >> 516: // Print a "overflow" notification to create awareness. >> 517: const char* addMsg; >> 518: unsigned int maxInt = (1U<<31) - 1; > > Why not use INT_MAX? Will change.
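For reference, the two spellings of the limit discussed above denote the same value wherever `int` is 32 bits wide. The sketch below assumes a 32-bit `int`, and `counter_overflowed` is a hypothetical name, not the code in `method.cpp`. It only illustrates the idea of viewing an `int` counter as unsigned and comparing it against `INT_MAX`.

```cpp
#include <climits>

static_assert(sizeof(int) * CHAR_BIT == 32, "sketch assumes a 32-bit int");
static_assert((1U << 31) - 1 == static_cast<unsigned int>(INT_MAX),
              "same limit, two spellings");

// Hypothetical check: a counter declared as int has wrapped past INT_MAX
// if its value, viewed as unsigned, exceeds INT_MAX.
constexpr bool counter_overflowed(int raw_count) {
    return static_cast<unsigned int>(raw_count) >
           static_cast<unsigned int>(INT_MAX);
}
```

Using `INT_MAX` states the intent directly and avoids repeating the bit width in the expression.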
------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From lucy at openjdk.java.net Tue Mar 2 17:37:43 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 2 Mar 2021 17:37:43 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> References: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> Message-ID: On Tue, 2 Mar 2021 17:14:34 GMT, Vladimir Kozlov wrote: >> Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: >> >> comment changes requested by TheRealMDoerr > > I have few comments. @veresov I can't see your comment re. the casts. The only comment I see is re. the *64 suffixes. Anyway, is your question/comment directly related to Vladimir's annotations? Or do you need further reasoning? Please let me know. ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From shade at openjdk.java.net Tue Mar 2 17:40:44 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 2 Mar 2021 17:40:44 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v9] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Tue, 2 Mar 2021 14:33:14 GMT, Jaroslav Bachorik wrote: >> The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in worst case 'at last full GC') in form of a periodically emitted JFR event. >> >> ## Introducing new JFR event >> >> While there is already 'GC Heap Summary' JFR event it does not fit the requirements as it is closely tied to GC cycle so eg. 
for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events being present in the JFR recording at all. >> Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information abut the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. >> >> ## Implementation >> >> The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value. >> >> The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct. >> >> ### Epsilon GC >> >> Trivial implementation - just return `used()` instead. >> >> ### Serial GC >> >> Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). >> >> ### Parallel GC >> >> For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). >> >> ### G1 GC >> >> Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. 
Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. >> >> ### Shenandoah >> >> In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. >> This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. >> >> ### ZGC >> >> `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. > > Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: > > Add tests for the heap usage summary event Shenandoah parts look good. I have a few minor stylistic comments. src/hotspot/share/gc/shared/space.inline.hpp line 190: > 188: oop obj = oop(cur_obj); > 189: size_t obj_size = obj->size(); > 190: compact_top = cp->space->forward(obj, obj_size, cp, compact_top); This change seems superfluous now. Inline `obj_size` back? src/hotspot/share/gc/shared/space.hpp line 555: > 553: size_t live() const { > 554: return used() - _dead_space; > 555: } Move it a few lines down, so `capacity`, `used`, `live` line up? src/hotspot/share/gc/shared/collectedHeap.hpp line 218: > 216: virtual size_t capacity() const = 0; > 217: virtual size_t used() const = 0; > 218: // Returns the estimate of live set size. Because live set changes over time, I believe a blank line is in order here, look at other comments in the same header. 
src/hotspot/share/gc/shared/space.inline.hpp line 90: > 88: > 89: public: > 90: size_t _dead_space; Should this really be "public"? Maybe `friend`-ing with the only user is better? ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2579 From kvn at openjdk.java.net Tue Mar 2 17:52:45 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 2 Mar 2021 17:52:45 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> Message-ID: On Tue, 2 Mar 2021 17:23:23 GMT, Lutz Schmidt wrote: >> src/hotspot/cpu/x86/vtableStubs_x86_32.cpp line 159: >> >>> 157: return NULL; >>> 158: } >>> 159: >> >> Why did you not update the asm instruction that updates `nof_megamorphic_calls` in this file? > > The reason is quite simple: there is no incrementq() for x86_32. I could emulate that with a few lines like > address ctrAddr = (address)SharedRuntime::nof_megamorphic_calls_addr(); > __ lea(rscratch1, ExternalAddress(ctrAddr)); > __ addl(Address(rscratch1, 0), 1); > __ adcl(Address(rscratch1, 4), 0); > Not sure if that would be desirable here. Just let me know. As is, the code just updates the less significant half of the 8-byte counter. Okay, let's keep it as it is. Then revert this file - the only change is a new empty line.
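The add-with-carry sequence quoted above (`addl` on the low word, `adcl` on the high word) can be modeled in portable C++ as a rough sketch. `SplitCounter`, `increment` and `value` are hypothetical names for illustration; the model assumes the low half is stored first, as on x86, and it ignores atomicity entirely.

```cpp
#include <cstdint>

// Hypothetical model of a 64-bit counter stored as two 32-bit halves,
// low half first (offsets 0 and 4 in the quoted assembly).
struct SplitCounter {
    std::uint32_t lo;
    std::uint32_t hi;
};

// Mirrors the addl/adcl pair: add 1 to the low half, then add the carry
// (1 if the low half wrapped around to 0) to the high half.
inline void increment(SplitCounter& c) {
    c.lo += 1u;
    const std::uint32_t carry = (c.lo == 0u) ? 1u : 0u;
    c.hi += carry;
}

// Reassemble the two halves into the 64-bit value they represent.
inline std::uint64_t value(const SplitCounter& c) {
    return (static_cast<std::uint64_t>(c.hi) << 32) | c.lo;
}
```

Without the carry step, which corresponds to updating only the less significant half as the thread describes, the visible count would wrap back to zero after 2^32 increments.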
------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From mseledtsov at openjdk.java.net Tue Mar 2 18:06:47 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Tue, 2 Mar 2021 18:06:47 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> Message-ID: On Tue, 2 Mar 2021 11:26:21 GMT, Thomas Stuefe wrote: >> Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: >> >> Using regular brackets in initializer list instead of curly brackets > > Hi, > > Have we decided that the STL can be used now? If yes, I must have missed this. If no, could you please change the ConcurrentTestRunner to use GrowableArray or just a simply a malloced array? > > Thanks! > > ..Thomas @tstuefe @iignatev On a question: Have we decided that the STL can be used now? Hi Thomas, I do most of my JDK work in Java, hence may have missed knowing the restriction of using STL. I wonder if this restriction is only for source code, and not the tests. I did a quick search under "test/hotspot/gtest", and found many uses of std::, including the data structures. For instance, jfr/test_networkUtilization.cpp uses std::map, std::list and std::vector. Thanks, Misha ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From iveresov at openjdk.java.net Tue Mar 2 18:56:55 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 2 Mar 2021 18:56:55 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> Message-ID: On Tue, 2 Mar 2021 17:35:18 GMT, Lutz Schmidt wrote: >> I have few comments. > > @veresov I can't see your comment re. the casts. The only comment I see is re. the *64 suffixes. 
> Anyway, is your question/comment directly related to Vladimir's annotations? Or do you need further reasoning? Please let me know. No it was a different question. I see my comments try clicking on the "Files changed" tab and scrolling down to method.cpp. I also left a comment about your changes to ```compare_methods()``` I don't think it's correct. ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From lucy at openjdk.java.net Tue Mar 2 19:28:41 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 2 Mar 2021 19:28:41 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> Message-ID: On Tue, 2 Mar 2021 18:53:47 GMT, Igor Veresov wrote: >> @veresov I can't see your comment re. the casts. The only comment I see is re. the *64 suffixes. >> Anyway, is your question/comment directly related to Vladimir's annotations? Or do you need further reasoning? Please let me know. > > No it was a different question. I see my comments try clicking on the "Files changed" tab and scrolling down to method.cpp. > I also left a comment about your changes to ```compare_methods()``` I don't think it's correct. Igor, I'm sorry, I can't see your comments. Neither those for method.cpp nor those for java.cpp. I checked the piper mail traffic as well. No mail from you after the *64 suffix comment. For simplicity, could you just paste your comment here, please? Or send them to me via mail (off-list, first.last at sap.com)? 
------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From iveresov at openjdk.java.net Tue Mar 2 19:50:42 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 2 Mar 2021 19:50:42 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> Message-ID: On Tue, 2 Mar 2021 19:25:26 GMT, Lutz Schmidt wrote: >> No it was a different question. I see my comments try clicking on the "Files changed" tab and scrolling down to method.cpp. >> I also left a comment about your changes to ```compare_methods()``` I don't think it's correct. > > Igor, I'm sorry, I can't see your comments. Neither those for method.cpp nor those for java.cpp. I checked the piper mail traffic as well. No mail from you after the *64 suffix comment. For simplicity, could you just paste your comment here, please? Or send them to me via mail (off-list, first.last at sap.com)? Ok, odd. I've sent you an email. ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From iveresov at openjdk.java.net Tue Mar 2 20:04:04 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 2 Mar 2021 20:04:04 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v2] In-Reply-To: <8B4V-lAbFWqYJzdcbVyz69U05GX14vFRx6nY-tKorNU=.4dff6316-f85e-4578-ae59-2dab8d804627@github.com> References: <8B4V-lAbFWqYJzdcbVyz69U05GX14vFRx6nY-tKorNU=.4dff6316-f85e-4578-ae59-2dab8d804627@github.com> Message-ID: <7wsUXkZzWxQxKZU2obDzcql7ADmt3pFpyaVUe3a0D_o=.d4f62585-cd23-4e58-9ccd-2c6ebe555f43@github.com> On Thu, 11 Feb 2021 21:14:04 GMT, Igor Veresov wrote: >> Lutz Schmidt has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. 
> > src/hotspot/share/oops/method.cpp line 516: > >> 514: // This is ok because counters are unsigned by nature, and it gives us >> 515: // another factor of 2 before the counter values become meaningless. >> 516: // Print a "overflow" notification to create awareness. > > What ```invocation_count()``` returns (which is currently equivalent to ```interpreter_invocation_count()``` btw) comes from the ```InvocationCounter::count()``` which cannot grow beyond 2^31, so all these counts are always positive. What exactly do these casts to unsigned do? So, why do we need the casts to unsigned in this method? ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From iveresov at openjdk.java.net Tue Mar 2 20:04:03 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 2 Mar 2021 20:04:03 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v2] In-Reply-To: References: Message-ID: <8B4V-lAbFWqYJzdcbVyz69U05GX14vFRx6nY-tKorNU=.4dff6316-f85e-4578-ae59-2dab8d804627@github.com> On Thu, 11 Feb 2021 17:47:54 GMT, Lutz Schmidt wrote: >> Dear community, >> may I please request reviews for this fix, improving the usefulness of method invocation counters. >> - aggregation counters are retyped as uint64_t, shifting the overflow probability way out (> 500 years in case of a 1 GHz counter update frequency). >> - counters for individual methods are interpreted as (unsigned int), in contrast to their declaration as int. This gives us a factor of two before the counters overflow. >> - as a special case, "compiled_invocation_counter" is retyped as long, because it has a higher update frequency than other counters. >> - before/after sample output is attached to the bug description. >> >> Thank you! >> Lutz > > Lutz Schmidt has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. Changes requested by iveresov (Reviewer). 
src/hotspot/share/oops/method.cpp line 516: > 514: // This is ok because counters are unsigned by nature, and it gives us > 515: // another factor of 2 before the counter values become meaningless. > 516: // Print a "overflow" notification to create awareness. What ```invocation_count()``` returns (which is currently equivalent to ```interpreter_invocation_count()``` btw) comes from the ```InvocationCounter::count()``` which cannot grow beyond 2^31, so all these counts are always positive. What exactly do these casts to unsigned do? ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From iveresov at openjdk.java.net Tue Mar 2 20:04:06 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 2 Mar 2021 20:04:06 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: Message-ID: On Thu, 25 Feb 2021 09:01:10 GMT, Lutz Schmidt wrote: >> Dear community, >> may I please request reviews for this fix, improving the usefulness of method invocation counters. >> - aggregation counters are retyped as uint64_t, shifting the overflow probability way out (> 500 years in case of a 1 GHz counter update frequency). >> - counters for individual methods are interpreted as (unsigned int), in contrast to their declaration as int. This gives us a factor of two before the counters overflow. >> - as a special case, "compiled_invocation_counter" is retyped as long, because it has a higher update frequency than other counters. >> - before/after sample output is attached to the bug description. >> >> Thank you! 
>> Lutz > > Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: > > comment changes requested by TheRealMDoerr src/hotspot/share/runtime/java.cpp line 100: > 98: int compare_methods(Method** a, Method** b) { > 99: return (int32_t)(((uint32_t)(*b)->invocation_count() + (*b)->compiled_invocation_count()) > 100: - ((uint32_t)(*a)->invocation_count() + (*a)->compiled_invocation_count())); Is this correct? The arithmetic look to be: (int32_t) (uint64_t - uint64_t). If the 64 values inside don't fit in 32, you'll get a negative number which would break the sorting logic. ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From iveresov at openjdk.java.net Tue Mar 2 20:04:06 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 2 Mar 2021 20:04:06 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> Message-ID: <39PPAynJ6dWRnCwIgxerBB2IiJm0BFwt3TQ2DqcUJLk=.345fcb18-0370-4f86-ad14-787529496fcc@github.com> On Tue, 2 Mar 2021 17:30:47 GMT, Lutz Schmidt wrote: >> src/hotspot/share/oops/method.cpp line 518: >> >>> 516: // Print a "overflow" notification to create awareness. >>> 517: const char* addMsg; >>> 518: unsigned int maxInt = (1U<<31) - 1; >> >> Why not use INT_MAX? > > Will change. I'd be better to change the logic to check if the counter is ```>= InvocationCounter::count_limit``` then it's in overflow. 
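The suggested check can be sketched as below. This is only an illustration, assuming the values stated further down this thread (```InvocationCounter::count()``` shifts the raw counter right by 1, and ```InvocationCounter::count_limit``` is 2^30). The names `kNonCountBits`, `kCountLimit`, `count_of` and `in_overflow` are hypothetical stand-ins, not HotSpot identifiers.

```cpp
#include <cstdint>

// Values as quoted in this thread; the names are hypothetical stand-ins,
// not the real InvocationCounter members.
constexpr int      kNonCountBits = 1;         // count() shifts right by 1
constexpr uint32_t kCountLimit   = 1u << 30;  // count_limit is 2^30

// Extract the count from the raw 32-bit counter word. After the shift the
// top bit is always 0, so the result always fits in a signed int.
constexpr uint32_t count_of(uint32_t raw) { return raw >> kNonCountBits; }

// Overflow check in the suggested form: compare against the limit
// instead of against INT_MAX.
constexpr bool in_overflow(uint32_t count) { return count >= kCountLimit; }
```

Under these assumptions a check against INT_MAX can never fire, while the comparison against the limit can.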
------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From akozlov at openjdk.java.net Tue Mar 2 20:32:49 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 20:32:49 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v10] In-Reply-To: <9bn80xM9g41OytCtH71-krOZtAgy27TbvTFJHmfmKrE=.e9d223d5-2922-4891-8f45-ee039d88ced6@github.com> References: <9bn80xM9g41OytCtH71-krOZtAgy27TbvTFJHmfmKrE=.e9d223d5-2922-4891-8f45-ee039d88ced6@github.com> Message-ID: On Thu, 4 Feb 2021 22:34:16 GMT, Gerard Ziemski wrote: >> Anton Kozlov has updated the pull request incrementally with six additional commits since the last revision: >> >> - Merge remote-tracking branch 'origin/jdk/jdk-macos' into jdk-macos >> - Add comments to WX transitions >> >> + minor change of placements >> - Use macro conditionals instead of empty functions >> - Add W^X to tests >> - Do not require known W^X state >> - Revert w^x in gtests > > src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 237: > >> 235: os::Posix::ucontext_set_pc(uc, StubRoutines::continuation_for_safefetch_fault(pc)); >> 236: return true; >> 237: } > > Isn't this case already handled by `JVM_HANDLE_XXX_SIGNAL()` ? Why do we need it here again? Good point, thanks. We are missing a few fixes in os_cpu/bsd_aarch64. This was moved out recently. I'm going to align bsd_aarch64 with the rest of platforms. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From iveresov at openjdk.java.net Tue Mar 2 20:50:44 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 2 Mar 2021 20:50:44 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: Message-ID: On Thu, 25 Feb 2021 16:36:38 GMT, Igor Veresov wrote: >> Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: >> >> comment changes requested by TheRealMDoerr > > src/hotspot/share/runtime/java.cpp line 100: > >> 98: int compare_methods(Method** a, Method** b) { >> 99: return (int32_t)(((uint32_t)(*b)->invocation_count() + (*b)->compiled_invocation_count()) >> 100: - ((uint32_t)(*a)->invocation_count() + (*a)->compiled_invocation_count())); > > Is this correct? The arithmetic looks to be: (int32_t) (uint64_t - uint64_t). If the 64-bit values inside don't fit in 32 bits, you'll get a negative number which would break the sorting logic. I see that you've fixed the types since the last comment, but I think it's still broken (and has been before). How about: int64_t diff = ((*b)->compiled_invocation_count() - (*a)->compiled_invocation_count()) + ((*b)->invocation_count() - (*a)->invocation_count()); if (diff > 0) return 1; else if (diff < 0) return -1; else return 0; It's kind of hacky too, because it assumes that the compiled_invocation_count() values are positive and haven't overflowed. But at least we'd get rid of a possible overflow during summation. What do you think?
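For illustration, the proposed three-way comparison can be written as a small self-contained sketch. `MethodCounts` and `compare_counts` are hypothetical stand-ins for ```Method``` and ```compare_methods()```, and the counter types are assumptions (a 32-bit invocation count, a 64-bit compiled invocation count), as are the value ranges noted in the comments.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for Method; only the two counters matter here.
struct MethodCounts {
  int32_t invocation_count;           // assumed to stay non-negative
  int64_t compiled_invocation_count;  // assumed non-negative, far below 2^63
};

// Three-way comparison that sorts by descending total count.
// The differences are taken per counter in 64-bit arithmetic, so the
// result is never truncated to 32 bits before its sign is inspected.
int compare_counts(const MethodCounts& a, const MethodCounts& b) {
  assert(a.compiled_invocation_count >= 0 && b.compiled_invocation_count >= 0);
  int64_t diff = (b.compiled_invocation_count - a.compiled_invocation_count)
               + (static_cast<int64_t>(b.invocation_count) - a.invocation_count);
  if (diff > 0) return 1;
  if (diff < 0) return -1;
  return 0;
}
```

If two totals differ by exactly 2^32, a difference truncated to 32 bits would typically come out as 0 and the two methods would compare as equal. Returning only the sign of the 64-bit difference avoids that.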
------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From lucy at openjdk.java.net Tue Mar 2 20:50:43 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 2 Mar 2021 20:50:43 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v2] In-Reply-To: <7wsUXkZzWxQxKZU2obDzcql7ADmt3pFpyaVUe3a0D_o=.d4f62585-cd23-4e58-9ccd-2c6ebe555f43@github.com> References: <8B4V-lAbFWqYJzdcbVyz69U05GX14vFRx6nY-tKorNU=.4dff6316-f85e-4578-ae59-2dab8d804627@github.com> <7wsUXkZzWxQxKZU2obDzcql7ADmt3pFpyaVUe3a0D_o=.d4f62585-cd23-4e58-9ccd-2c6ebe555f43@github.com> Message-ID: <3cDAX-RUCe2SA1O1Mq_Ge152dzwNUSDj3ZQB0ziYJxM=.e594bfe0-4d68-46fc-8d59-16b2dfa3abf1@github.com> On Thu, 25 Feb 2021 16:31:58 GMT, Igor Veresov wrote: >> src/hotspot/share/oops/method.cpp line 516: >> >>> 514: // This is ok because counters are unsigned by nature, and it gives us >>> 515: // another factor of 2 before the counter values become meaningless. >>> 516: // Print a "overflow" notification to create awareness. >> >> What ```invocation_count()``` returns (which is currently equivalent to ```interpreter_invocation_count()``` btw) comes from the ```InvocationCounter::count()``` which cannot grow beyond 2^31, so all these counts are always positive. What exactly do these casts to unsigned do? > > So, why do we need the casts to unsigned in this method? When you increment (2^31-1), you get 2^31 which is 0x80000000. When interpreted as signed int, it is MIN_INT. I don't want that. I want to treat the value as positive number - what it actually is. There is no negative count! 
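The bit pattern described here can be reproduced with a minimal sketch. The helper names `increment` and `as_signed` are hypothetical. The increment is done on an unsigned value, because incrementing a signed int past its maximum is undefined behaviour in C++.

```cpp
#include <cstdint>
#include <cstring>

// Increment in unsigned arithmetic, where wraparound is well defined.
uint32_t increment(uint32_t counter) { return counter + 1u; }

// Reinterpret the same 32 bits as a signed value (int32_t is two's complement).
int32_t as_signed(uint32_t bits) {
  int32_t value;
  std::memcpy(&value, &bits, sizeof value);
  return value;
}
```

Here `increment(0x7fffffffu)` yields `0x80000000u`, which `as_signed` reads back as `INT32_MIN`, while the unsigned interpretation of the same bits is 2^31.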
------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From lucy at openjdk.java.net Tue Mar 2 20:50:45 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 2 Mar 2021 20:50:45 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: <39PPAynJ6dWRnCwIgxerBB2IiJm0BFwt3TQ2DqcUJLk=.345fcb18-0370-4f86-ad14-787529496fcc@github.com> References: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> <39PPAynJ6dWRnCwIgxerBB2IiJm0BFwt3TQ2DqcUJLk=.345fcb18-0370-4f86-ad14-787529496fcc@github.com> Message-ID: On Tue, 2 Mar 2021 18:50:24 GMT, Igor Veresov wrote: >> Will change. > > It'd be better to change the logic to check whether the counter is ```>= InvocationCounter::count_limit```; if so, it's in overflow. By overflow I do not mean "counter overflow" as it is used to trigger compiler activities, but plainly and simply "range of signed int exceeded". Should I use different wording? E.g. "counter in signed int overflow"? ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From iveresov at openjdk.java.net Tue Mar 2 20:58:47 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 2 Mar 2021 20:58:47 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> <39PPAynJ6dWRnCwIgxerBB2IiJm0BFwt3TQ2DqcUJLk=.345fcb18-0370-4f86-ad14-787529496fcc@github.com> Message-ID: On Tue, 2 Mar 2021 20:47:25 GMT, Lutz Schmidt wrote: >> It'd be better to change the logic to check whether the counter is ```>= InvocationCounter::count_limit```; if so, it's in overflow. > > By overflow I do not mean "counter overflow" as it is used to trigger compiler activities, but plainly and simply "range of signed int exceeded". Should I use different wording? E.g. "counter in signed int overflow"? Then it will never happen.
These values come from ```InvocationCounter::count()```. And it will never return a value > 2^31 - 1. ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From iveresov at openjdk.java.net Tue Mar 2 21:02:39 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 2 Mar 2021 21:02:39 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> <39PPAynJ6dWRnCwIgxerBB2IiJm0BFwt3TQ2DqcUJLk=.345fcb18-0370-4f86-ad14-787529496fcc@github.com> Message-ID: On Tue, 2 Mar 2021 20:55:50 GMT, Igor Veresov wrote: >> With overflow I do not mean "counter overflow" as it is used to trigger compiler activities but plain simple "range of singed int exceeded". Should I use different wording? E.g. "counter in signed int overflow"? > > Then it will never happen. These values come from ```InvocationCounter::count()```. And it will never return a value > 2^31 - 1. Sorry, it can return 2^31. Which would be an indication of a counter overflow (InvocationCounter::count_limit == 2^31). Your code is correct in both senses. It is a counter overflow and a signed int overflow. ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From iveresov at openjdk.java.net Tue Mar 2 21:11:51 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 2 Mar 2021 21:11:51 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: <3mk54i0LQX0awCjtEnYDpf0C9QiCERjP7pOLI6GAvBU=.53cd8ad8-2237-46c9-afa4-72d5f5bbac06@github.com> <39PPAynJ6dWRnCwIgxerBB2IiJm0BFwt3TQ2DqcUJLk=.345fcb18-0370-4f86-ad14-787529496fcc@github.com> Message-ID: <8CM8mqTq7RmwR2sjk0J69O19UQXASFKghQyIy6lKduA=.101780ec-9039-4749-a27b-6dd0997bd398@github.com> On Tue, 2 Mar 2021 20:59:46 GMT, Igor Veresov wrote: >> Then it will never happen. These values come from ```InvocationCounter::count()```. 
And it will never return a value > 2^31 - 1. > > Sorry, it can return 2^31. Which would be an indication of a counter overflow (InvocationCounter::count_limit == 2^31). Your code is correct in both senses. It is a counter overflow and a signed int overflow. Oops, my arithmetic is bad again. InvocationCounter::count_limit is 2^30. So, I don't think there is ever going to be the overflow that you're looking for. The sign bit is always 0. Or am I missing something again? :) ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From lucy at openjdk.java.net Tue Mar 2 21:11:52 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 2 Mar 2021 21:11:52 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 20:46:01 GMT, Igor Veresov wrote: >> src/hotspot/share/runtime/java.cpp line 100: >> >>> 98: int compare_methods(Method** a, Method** b) { >>> 99: return (int32_t)(((uint32_t)(*b)->invocation_count() + (*b)->compiled_invocation_count()) >>> 100: - ((uint32_t)(*a)->invocation_count() + (*a)->compiled_invocation_count())); >> >> Is this correct? The arithmetic looks to be: (int32_t) (uint64_t - uint64_t). If the 64-bit values inside don't fit in 32 bits, you'll get a negative number which would break the sorting logic. > > I see that you've fixed the types since the last comment, but I think it's still broken (and has been before). > How about: > int64_t diff = ((*b)->compiled_invocation_count() - (*a)->compiled_invocation_count()) + ((*b)->invocation_count() - (*a)->invocation_count()); > if (diff > 0) return 1; > else if (diff < 0) return -1; > else return 0; > It's kind of hacky too, because it assumes that the compiled_invocation_count() values are positive and haven't overflowed. But at least we'd get rid of a possible overflow during summation. What do you think? Right. As soon as there is overflow, the original formula doesn't do the trick either.
We can fix it as long as either (unsigned int)invocation_count() does not wrap around from 2^32-1 to 0. The entire expression is calculated as int64_t, protecting us from overflow for the next few years. If we then calculate the return value as you propose, we are good. ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From akozlov at openjdk.java.net Tue Mar 2 21:14:41 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 21:14:41 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v2] In-Reply-To: References: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com> Message-ID: On Tue, 2 Mar 2021 08:23:45 GMT, Yasumasa Suenaga wrote: >> Probably we can assume this function to optionally return a board name. IMHO ideally an empty string should a valid output. Now the output from this function is the prefix and the features string is the suffix of the complete string CPU description. What if we start with the constant `AArch64` prefix in the shared CPU code, append output from this function, and then append the features string? > > Do you mean `VM_Version::get_compatible_board()` should return `NULL` instead of `AArch64` to add it as a prefix at the caller? > I don't want to do so because we should return board name as possible like a x86. > > In Windows AArch64, it seems to have SoC name in registry. Now I do not have Windows AArch64, so I just set "AArch64" for it, but I hope someone work for it in future. > https://stackoverflow.com/questions/60588765/how-to-get-cpu-brand-information-in-arm64 Yes, I meant `NULL` or just `""`. Got it, you mean "AArch64" to be a placeholder. But description now starts with `AArch64` prefix, probably worth to keep it. Although it's fine in the current implementation. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From iveresov at openjdk.java.net Tue Mar 2 21:18:50 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 2 Mar 2021 21:18:50 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v2] In-Reply-To: <3cDAX-RUCe2SA1O1Mq_Ge152dzwNUSDj3ZQB0ziYJxM=.e594bfe0-4d68-46fc-8d59-16b2dfa3abf1@github.com> References: <8B4V-lAbFWqYJzdcbVyz69U05GX14vFRx6nY-tKorNU=.4dff6316-f85e-4578-ae59-2dab8d804627@github.com> <7wsUXkZzWxQxKZU2obDzcql7ADmt3pFpyaVUe3a0D_o=.d4f62585-cd23-4e58-9ccd-2c6ebe555f43@github.com> <3cDAX-RUCe2SA1O1Mq_Ge152dzwNUSDj3ZQB0ziYJxM=.e594bfe0-4d68-46fc-8d59-16b2dfa3abf1@github.com> Message-ID: On Tue, 2 Mar 2021 20:43:53 GMT, Lutz Schmidt wrote: >> So, why do we need the casts to unsigned in this method? > > When you increment (2^31-1), you get 2^31 which is 0x80000000. When interpreted as signed int, it is MIN_INT. I don't want that. I want to treat the value as positive number - what it actually is. There is no negative count! I was trying to make a point that these counters are always < MAX_INT. ```InvocationCounter::count()``` shifts the counter right by 1, ensuring that the sign bit is 0. ```Method::{invocation, backedge, interpreter_invocation}_count()``` can also return ```InvocationCounter::count_limit```, but this one is 2^30, which is also positive. ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From akozlov at openjdk.java.net Tue Mar 2 21:19:18 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 21:19:18 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v23] In-Reply-To: References: Message-ID: > Please review the implementation of JEP 391: macOS/AArch64 Port. > > It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. 
> > Major changes are in: > * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) > * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) > * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. > * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) Anton Kozlov has updated the pull request incrementally with five additional commits since the last revision: - Fix after JDK-8259539, partially revert preconditions - JDK-8260471: bsd_aarch64 part - JDK-8259539: bsd_aarch64 part - JDK-8257828: bsd_aarch64 part - Cleanup os_bsd_aarch64 signal handling ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2200/files - new: https://git.openjdk.java.net/jdk/pull/2200/files/e42b82db..4c37f068 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2200&range=22 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2200&range=21-22 Stats: 31 lines in 1 file changed: 15 ins; 11 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/2200.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2200/head:pull/2200 PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Tue Mar 2 21:19:18 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 21:19:18 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v10] 
In-Reply-To: <-PhzrEcgREcbXuZ5GrxAfVa6Uwil9YoOkZULt1154rw=.9689a79e-cf61-4f79-9b36-a3295fecab7b@github.com> References: <-PhzrEcgREcbXuZ5GrxAfVa6Uwil9YoOkZULt1154rw=.9689a79e-cf61-4f79-9b36-a3295fecab7b@github.com> Message-ID: On Fri, 12 Feb 2021 15:21:06 GMT, Vladimir Kempik wrote: >> src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 302: >> >>> 300: const uint64_t *detail_msg_ptr >>> 301: = (uint64_t*)(pc + NativeInstruction::instruction_size); >>> 302: const char *detail_msg = (const char *)*detail_msg_ptr; >> >> Where is `detail_msg` used? > > Came from linux_arm64. was used in os_linux_aarch64.cpp on line 246 in report_and_die > But became unused on bsd_arm64. I agree this needs to be removed It seems we have merged master branch before JDK-8259539 was integrated. It brings back use of detail_msg. I reverted details_msg as well its following use. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From lucy at openjdk.java.net Tue Mar 2 21:29:46 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 2 Mar 2021 21:29:46 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v2] In-Reply-To: References: <8B4V-lAbFWqYJzdcbVyz69U05GX14vFRx6nY-tKorNU=.4dff6316-f85e-4578-ae59-2dab8d804627@github.com> <7wsUXkZzWxQxKZU2obDzcql7ADmt3pFpyaVUe3a0D_o=.d4f62585-cd23-4e58-9ccd-2c6ebe555f43@github.com> <3cDAX-RUCe2SA1O1Mq_Ge152dzwNUSDj3ZQB0ziYJxM=.e594bfe0-4d68-46fc-8d59-16b2dfa3abf1@github.com> Message-ID: On Tue, 2 Mar 2021 21:15:46 GMT, Igor Veresov wrote: >> When you increment (2^31-1), you get 2^31 which is 0x80000000. When interpreted as signed int, it is MIN_INT. I don't want that. I want to treat the value as positive number - what it actually is. There is no negative count! > > I was trying to make a point that these counters are always < MAX_INT. ```InvocationCounter::count()``` shifts the counter right by 1, ensuring that the sign bit is 0. 
```Method::{invocation, backedge, interpreter_invocation}_count()``` can also return ```InvocationCounter::count_limit```, but this one is 2^30, which is also positive. Slowly, but surely, we are coming to a common understanding. Thanks for educating me. I hadn't seen the range limitation for the counters. Now that I know, I recognise there is some knowledge in my almost faded memory. OK, these three counters will never get dangerously close to 2^31-1. I will rework the section, remove some checks and casts. This will only happen tomorrow (Wednesday) which starts in 90 minutes in my time zone (GMT+1). ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From akozlov at openjdk.java.net Tue Mar 2 21:29:52 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 2 Mar 2021 21:29:52 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v10] In-Reply-To: References: Message-ID: On Thu, 4 Feb 2021 22:58:39 GMT, Gerard Ziemski wrote: >> Anton Kozlov has updated the pull request incrementally with six additional commits since the last revision: >> >> - Merge remote-tracking branch 'origin/jdk/jdk-macos' into jdk-macos >> - Add comments to WX transitions >> >> + minor change of placements >> - Use macro conditionals instead of empty functions >> - Add W^X to tests >> - Do not require known W^X state >> - Revert w^x in gtests > > src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 420: > >> 418: size_t os::Posix::_compiler_thread_min_stack_allowed = 72 * K; >> 419: size_t os::Posix::_java_thread_min_stack_allowed = 72 * K; >> 420: size_t os::Posix::_vm_internal_thread_min_stack_allowed = 72 * K; > > Those are slightly larger than their x86_64 counter parts. Are they conservative/aggressive values? How did we arrive at those? These values were copied from linux_aarch64. The motivation is that clang on macos/aarch64 will likely to produce stack frames for C++ functions similar to frames generates by gcc on linux/aarch64. 
And sizes of java stack frames should not change. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From coleenp at openjdk.java.net Tue Mar 2 22:03:42 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 2 Mar 2021 22:03:42 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> Message-ID: On Tue, 2 Mar 2021 00:55:16 GMT, Mikhailo Seledtsov wrote: >> This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. >> >> Here is what I did so far: >> - created a UnitTestThread and a main test runner, based on gtests with similar needs >> - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) >> to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. 
>> - removed invocations from whitebox.cpp >> >> Testing: >> - ran GTestWrapper on usual platforms - All PASS >> - ensured that ReservedSpaceConcurrent is in the logs and passed >> >> After gathering the feedback my plan is: >> Plan: >> - move the remaining internal Memory/VirtualSpace tests into a gTest >> - I am thinking about using separate files for each test >> - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Using regular brackets in initializer list instead of curly brackets test/hotspot/gtest/concurrent_test_runner.inline.hpp line 74: > 72: Semaphore done(0); > 73: > 74: std::vector t; Regarding the STL question, I think using std in a limited fashion in the tests seems ok before we start allowing it in the main sources, despite where Misha pointed out it already exists. It's going to be allowed pretty soon anyway. One question about this: where does it allocate memory for 't' ? Are the elements leaked here? ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From dholmes at openjdk.java.net Tue Mar 2 23:25:08 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 2 Mar 2021 23:25:08 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v23] In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 21:19:18 GMT, Anton Kozlov wrote: >> Please review the implementation of JEP 391: macOS/AArch64 Port. >> >> It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. >> >> Major changes are in: >> * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) >> * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) >> * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. 
It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. >> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) > > Anton Kozlov has updated the pull request incrementally with five additional commits since the last revision: > > - Fix after JDK-8259539, partially revert preconditions > - JDK-8260471: bsd_aarch64 part > - JDK-8259539: bsd_aarch64 part > - JDK-8257828: bsd_aarch64 part > - Cleanup os_bsd_aarch64 signal handling src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 207: > 205: // Enable WXWrite: this function is called by the signal handler at arbitrary > 206: // point of execution. > 207: ThreadWXEnable wx(WXWrite, thread); Note that `thread` can be NULL here if the signal handler is running in a non-attached thread. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From iveresov at openjdk.java.net Tue Mar 2 23:45:52 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Tue, 2 Mar 2021 23:45:52 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v8] In-Reply-To: References: Message-ID: <1rk73X6FOYTSZ353R5Qb1QytcIwwtZg5341d-vTqvhQ=.9e21e448-b4bb-4ded-9544-ef44b56375c5@github.com> On Tue, 2 Mar 2021 21:08:38 GMT, Lutz Schmidt wrote: >> I see that you've fixed the types since the last comment, but it think it's still broken (and has been before). 
>> How about: >> int64_t diff = ((*b)->compiled_invocation_count() - (*a)->compiled_invocation_count()) + ((*b)->invocation_count() - (*a)->invocation_count()); >> if (diff > 0) return 1; >> else if (diff < 0) return -1; >> else return 0; >> It's kind of hacky too, because it assumes that compiled_invocation_count() are positive and didn't overflow. But at least we'd get rid of a possible overflow during summation. What do you think? > > Right. As soon as there is overflow, the original formula doesn't do the trick either. > We can fix it as long as either (unsigned int)invocation_count() does not wrap around from 2^32-1 to 0. The entire expression is calculated as int64_t, protecting us from overflow for the next few years. If we then calculate the return value as you propose, we are good. In your new code here casts to uint32_t are probably unnecessary. ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From mseledtsov at openjdk.java.net Tue Mar 2 23:57:01 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Tue, 2 Mar 2021 23:57:01 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> Message-ID: On Tue, 2 Mar 2021 22:00:59 GMT, Coleen Phillimore wrote: >> Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: >> >> Using regular brackets in initializer list instead of curly brackets > > test/hotspot/gtest/concurrent_test_runner.inline.hpp line 74: > >> 72: Semaphore done(0); >> 73: >> 74: std::vector t; > > Regarding the STL question, I think using std in a limited fashion in the tests seems ok before we start allowing it in the main sources, despite where Misha pointed out it already exists. It's going to be allowed pretty soon anyway. > One question about this: where does it allocate memory for 't' ? 
Are the elements leaked here? On the question: where does it allocate memory for 't'? Are the elements leaked here? The vector will allocate and free the memory for the containers (the memory containing pointers to UnitTestThread). As for the UnitTestThread objects/instances themselves: - JavaTestThread. I checked a couple of other uses of JavaTestThread (test_symbolTable.cpp and oops/test_markWord.cpp), and neither calls delete explicitly on objects of this type - I presume the thread will free its resources once it exits (and we wait for the `done` semaphore to ensure this) - adding the code `delete t[i]` results in a crash, presumably due to freeing already released memory Hence, I believe, no changes are needed to this code to free resources. ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From ysuenaga at openjdk.java.net Wed Mar 3 02:59:15 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 3 Mar 2021 02:59:15 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v4] In-Reply-To: References: Message-ID: <5qzI46aqrPsJuyaI19wCAiFYL3m4jVsGcbvBVgDqZ5g=.cedd48f2-0046-4ef0-887f-b40567d4a910@github.com> > HotSpot generates CPU description when it is started. We can see it in the `jdk.CPUInformation` JFR event as below: > > $ jfr print --events jdk.CPUInformation raspi4.jfr > jdk.CPUInformation { > startTime = 22:57:13.521 > cpu = "AArch64" > description = "AArch64 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > `description` contains "AArch64", which is a fixed value, so we cannot guess what machine (SoC) the process ran on. > > In Linux, we can use the `compatible` property in the device tree to guess the machine. The 'compatible' property contains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with sorted from most compatible to least.
> > After this change, we can get the description as below: > > jdk.CPUInformation { > startTime = 00:32:49.767 > cpu = "AArch64" > description = "raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > In Linux on AMD64, we can see as following, then we can guess the CPU model from it. The same should do for AArch64. > > jdk.CPUInformation { > startTime = 17:28:03.907 > cpu = "AMD (null) (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 SSE4A AMD64" > description = "Brand: AMD Ryzen 3 3300X 4-Core Processor , Vendor: AuthenticAMD > Family: (0x17), Model: (0x71), Stepping: 0x0 > Ext. family: 0x8, Ext. model: 0x7, Type: 0x0, Signature: 0x00870f10 > Features: ebx: 0x01020800, ecx: 0xfed83203, edx: 0x178bfbff > Ext. features: eax: 0x00870f10, ebx: 0x20000000, ecx: 0x004003f3, edx: 0x2fd3fbff > Supports: On-Chip FPU, Virtual Mode Extensions, Debugging Extensions, Page Size Extensions, Time Stamp Counter, Model Specific Registers, Physical Address Extension, Machine Check Exceptions, CMPXCHG8B Instruction, On-Chip APIC, Fast System Call, Memory Type Range Registers, Page Global Enable, Machine Check Architecture, Conditional Mov Instruction, Page Attribute Table, 36-bit Page Size Extension, CLFLUSH Instruction, Intel Architecture MMX Technology, Fast Float Point Save and Restore, Streaming SIMD extensions, Streaming SIMD extensions 2, Hyper Threading, Streaming SIMD Extensions 3, PCLMULQDQ, Supplemental Streaming SIMD Extensions 3, Fused Multiply-Add, CMPXCHG16B, Streaming SIMD extensions 4.1, Streaming SIMD extensions 4.2, MOVBE, Popcount instruction, AESNI, XSAVE, OSXSAVE, AVX, F16C, LAHF/SAHF instruction support, Core multi-processor leagacy mode, Advanced Bit Manipulations: LZCNT, SSE4A: MOVNTSS, MOVNTSD, EXTRQ, INSERTQ, Misaligned SSE mode, SYSCALL/SYSRET, Execute Dis able Bit, RDTSCP, Intel 64 Architecture" > sockets = 1 > cores = 2 > hwThreads = 2 > } Yasumasa Suenaga has updated the pull request incrementally with 
one additional commit since the last revision: Add "AArch64" as prefix in CPU description ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2759/files - new: https://git.openjdk.java.net/jdk/pull/2759/files/f62fa768..2e9ae8fe Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2759&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2759&range=02-03 Stats: 10 lines in 3 files changed: 1 ins; 3 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/2759.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2759/head:pull/2759 PR: https://git.openjdk.java.net/jdk/pull/2759 From ysuenaga at openjdk.java.net Wed Mar 3 02:59:15 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 3 Mar 2021 02:59:15 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v2] In-Reply-To: References: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com> Message-ID: On Tue, 2 Mar 2021 21:11:32 GMT, Anton Kozlov wrote: >> Do you mean `VM_Version::get_compatible_board()` should return `NULL` instead of `AArch64` to add it as a prefix at the caller? >> I don't want to do so because we should return board name as possible like a x86. >> >> In Windows AArch64, it seems to have SoC name in registry. Now I do not have Windows AArch64, so I just set "AArch64" for it, but I hope someone work for it in future. >> https://stackoverflow.com/questions/60588765/how-to-get-cpu-brand-information-in-arm64 > > Yes, I meant `NULL` or just `""`. Got it, you mean "AArch64" to be a placeholder. But description now starts with `AArch64` prefix, probably worth to keep it. Although it's fine in the current implementation. Ok, I added "AArch64" as a prefix in CPU description in new commit. 
We can see it as follows: jdk.CPUInformation { startTime = 11:51:44.147 cpu = "AArch64" description = "AArch64 raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc" sockets = 4 cores = 4 hwThreads = 4 } ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From ngasson at openjdk.java.net Wed Mar 3 03:05:40 2021 From: ngasson at openjdk.java.net (Nick Gasson) Date: Wed, 3 Mar 2021 03:05:40 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v2] In-Reply-To: <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com> References: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com> Message-ID: On Mon, 1 Mar 2021 14:12:53 GMT, Anton Kozlov wrote: >> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: >> >> refactoring > > Looks better, thanks for addressing. Please consider a few notes from someone not in the reviewer role. > > What file should we refer to detect SoC on server-class machine? Can we detect SoC in same way? (e.g. sysfs) > If we cannot implement it in same way, I want to fix it for device tree at first. Maybe you could use the files under `/sys/devices/virtual/dmi/id/` like `board_vendor` and `board_name`? ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From ysuenaga at openjdk.java.net Wed Mar 3 04:36:13 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 3 Mar 2021 04:36:13 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v5] In-Reply-To: References: Message-ID: > HotSpot generates CPU description when it is started.
We can see it `jdk.CPUInformation` JFR event as below: > > $ jfr print --events jdk.CPUInformation raspi4.jfr > jdk.CPUInformation { > startTime = 22:57:13.521 > cpu = "AArch64" > description = "AArch64 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > `description` contains "AArch64", it is fixed value, we cannot guess the process was run on what machine (SoC). > > In Linux, we can use `compatible`property in device tree to guess the machine. The 'compatible' property contains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with sorted from most compatible to least. > > After this change, we can get the description as below: > > jdk.CPUInformation { > startTime = 00:32:49.767 > cpu = "AArch64" > description = "raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > In Linux on AMD64, we can see as following, then we can guess the CPU model from it. The same should do for AArch64. > > jdk.CPUInformation { > startTime = 17:28:03.907 > cpu = "AMD (null) (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 SSE4A AMD64" > description = "Brand: AMD Ryzen 3 3300X 4-Core Processor , Vendor: AuthenticAMD > Family: (0x17), Model: (0x71), Stepping: 0x0 > Ext. family: 0x8, Ext. model: 0x7, Type: 0x0, Signature: 0x00870f10 > Features: ebx: 0x01020800, ecx: 0xfed83203, edx: 0x178bfbff > Ext. 
features: eax: 0x00870f10, ebx: 0x20000000, ecx: 0x004003f3, edx: 0x2fd3fbff > Supports: On-Chip FPU, Virtual Mode Extensions, Debugging Extensions, Page Size Extensions, Time Stamp Counter, Model Specific Registers, Physical Address Extension, Machine Check Exceptions, CMPXCHG8B Instruction, On-Chip APIC, Fast System Call, Memory Type Range Registers, Page Global Enable, Machine Check Architecture, Conditional Mov Instruction, Page Attribute Table, 36-bit Page Size Extension, CLFLUSH Instruction, Intel Architecture MMX Technology, Fast Float Point Save and Restore, Streaming SIMD extensions, Streaming SIMD extensions 2, Hyper Threading, Streaming SIMD Extensions 3, PCLMULQDQ, Supplemental Streaming SIMD Extensions 3, Fused Multiply-Add, CMPXCHG16B, Streaming SIMD extensions 4.1, Streaming SIMD extensions 4.2, MOVBE, Popcount instruction, AESNI, XSAVE, OSXSAVE, AVX, F16C, LAHF/SAHF instruction support, Core multi-processor leagacy mode, Advanced Bit Manipulations: LZCNT, SSE4A: MOVNTSS, MOVNTSD, EXTRQ, INSERTQ, Misaligned SSE mode, SYSCALL/SYSRET, Execute Disable Bit, RDTSCP, Intel 64 Architecture" > sockets = 1 > cores = 2 > hwThreads = 2 > } Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: Fix compile error in Windows AArch64 ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2759/files - new: https://git.openjdk.java.net/jdk/pull/2759/files/2e9ae8fe..55ccd442 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2759&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2759&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/2759.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2759/head:pull/2759 PR: https://git.openjdk.java.net/jdk/pull/2759 From ysuenaga at openjdk.java.net Wed Mar 3 04:36:13 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 3 Mar 2021 04:36:13 GMT Subject:
RFR: 8262491: AArch64: CPU description should contain compatible board list [v2] In-Reply-To: References: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com> Message-ID: On Wed, 3 Mar 2021 03:03:21 GMT, Nick Gasson wrote: > > What file should we refer to detect SoC on server-class machine? Can we detect SoC in same way? (e.g. sysfs) > > If we cannot implement it in same way, I want to fix it for device tree at first. > > Maybe you could use the files under `/sys/devices/virtual/dmi/id/` like `board_vendor` and `board_name`? It does not exist on Raspberry Pi OS. pi at raspberrypi:~/github-forked/jdk $ lsb_release -a No LSB modules are available. Distributor ID: Debian Description: Debian GNU/Linux 10 (buster) Release: 10 Codename: buster pi at raspberrypi:~/github-forked/jdk $ uname -a Linux raspberrypi 5.10.11-v8+ #1399 SMP PREEMPT Thu Jan 28 12:14:03 GMT 2021 aarch64 GNU/Linux pi at raspberrypi:~/github-forked/jdk $ ls /sys/devices/virtual/dmi/id/ ls: cannot access '/sys/devices/virtual/dmi/id/': No such file or directory pi at raspberrypi:~/github-forked/jdk $ ls /sys/devices/virtual/dmi/ ls: cannot access '/sys/devices/virtual/dmi/': No such file or directory pi at raspberrypi:~/github-forked/jdk $ ls /sys/devices/virtual/ bcm2708_vcio block graphics mem raw rpivid-intcmem thermal vc-mem bcm2835-gpiomem devlink input misc rpivid-h264mem rpivid-vp9mem tty vtconsole bdi dma_heap leds net rpivid-hevcmem sound vc workqueue ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From ngasson at openjdk.java.net Wed Mar 3 05:00:42 2021 From: ngasson at openjdk.java.net (Nick Gasson) Date: Wed, 3 Mar 2021 05:00:42 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v2] In-Reply-To: References: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> 
<2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com> Message-ID: On Wed, 3 Mar 2021 04:31:28 GMT, Yasumasa Suenaga wrote: > > > > Maybe you could use the files under `/sys/devices/virtual/dmi/id/` like `board_vendor` and `board_name`? > > It does not exist on Raspberry Pi OS. > That's because the default Raspberry Pi firmware uses Device Tree. You'll only get those files if the system was booted from UEFI (x86-style firmware). ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From stuefe at openjdk.java.net Wed Mar 3 05:07:40 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 3 Mar 2021 05:07:40 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> Message-ID: On Tue, 2 Mar 2021 23:53:53 GMT, Mikhailo Seledtsov wrote: >> test/hotspot/gtest/concurrent_test_runner.inline.hpp line 74: >> >>> 72: Semaphore done(0); >>> 73: >>> 74: std::vector t; >> >> Regarding the STL question, I think using std in a limited fashion in the tests seems ok before we start allowing it in the main sources, despite where Misha pointed out it already exists. It's going to be allowed pretty soon anyway. >> One question about this: where does it allocate memory for 't' ? Are the elements leaked here? > > On a question: where does it allocate memory for 't' ? Are the elements leaked here? > The vector will allocate and free the memory for the containers (the memory containing pointers to UnitTestThread). > > As for the UnitTestThread objects/instances themselves: > - JavaTestThread. 
I checked a couple of other uses of JavaTestThread (test_symbolTable.cpp and oops/test_markWord.cpp), and neither calls delete explicitly on objects of this type > - I presume the thread will free its resources once it exits (and we wait for the `done` semaphore to ensure this) > - adding the code `delete t[i]` results in a crash, presumably due to freeing already released memory > > Hence, I believe, no changes are needed to this code to free resources. > Regarding the STL question, I think using std in a limited fashion in the tests seems ok before we start allowing it in the main sources, despite where Misha pointed out it already exists. It's going to be allowed pretty soon anyway. Really? I recently had a private discussion with Kim Barrett about this and understood that this is a contentious point. I was hoping we would have a public discussion before deciding on this. My fears are increased build times (which seem to get worse and worse) and stability and compiler issues. I am not completely against it, but I'm a bit of a burned child wrt the STL. ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From ysuenaga at openjdk.java.net Wed Mar 3 05:21:46 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 3 Mar 2021 05:21:46 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v2] In-Reply-To: References: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com> Message-ID: On Wed, 3 Mar 2021 04:57:40 GMT, Nick Gasson wrote: > > > Maybe you could use the files under `/sys/devices/virtual/dmi/id/` like `board_vendor` and `board_name`? > > > > > > It does not exist on Raspberry Pi OS. > > That's because the default Raspberry Pi firmware uses Device Tree. You'll only get those files if the system was booted from UEFI (x86-style firmware).
As I said in my previous comment, I want to fix this for device tree first if we cannot obtain the board name in the same way for device tree and ACPI. If we cannot read the device tree, "AArch64" is still used for the CPU description, which is the same behavior as the current implementation. Maybe I can extend this change to read `/sys/devices/virtual/dmi/id/` if I install a UEFI-capable OS (e.g. Fedora) on my Pi 4, but I cannot do that now, so I want to handle it in a separate issue. ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From stuefe at openjdk.java.net Wed Mar 3 05:46:50 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 3 Mar 2021 05:46:50 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> Message-ID: On Tue, 2 Mar 2021 00:55:16 GMT, Mikhailo Seledtsov wrote: >> This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. >> >> Here is what I did so far: >> - created a UnitTestThread and a main test runner, based on gtests with similar needs >> - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) >> to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test.
>> - removed invocations from whitebox.cpp >> >> Testing: >> - ran GTestWrapper on usual platforms - All PASS >> - ensured that ReservedSpaceConcurrent is in the logs and passed >> >> After gathering the feedback my plan is: >> Plan: >> - move the remaining internal Memory/VirtualSpace tests into a gTest >> - I am thinking about using separate files for each test >> - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Using regular brackets in initializer list instead of curly brackets Hi Misha, > Hi Thomas, > I do most of my JDK work in Java, hence may have missed knowing the restriction of using STL. > I wonder if this restriction is only for source code, and not the tests. I did a quick search under "test/hotspot/gtest", and found many uses of std::, including the data structures. For instance, jfr/test_networkUtilization.cpp uses std::map, std::list and std::vector. See my answer to Coleen. I think using the STL would have a number of repercussions which should be discussed before doing this step. About the patch itself: I am not sure moving more and more test control down into gtest is a good thing. gtests have mostly be single threaded until now and could be run within a make. AFAIU gtests offer way less run control than jtreg does, e.g. you cannot disable the test without recompiling the hotspot (there is no ProblemList equivalent for gtest), you have no test groupings (e.g. to separate stress tests which need a whole machine for themselves) etc. Also, we will duplicate more and more thread control in C++. I understand the wish to remove the test coding from the hotspot implementation. Its so ugly and should not live there. But just moving them to separate implementation files, possibly within a clearly marked "tests" folder, would be a first good step. 
I will not block this if you have decided to go this way. Just wanted to understand the direction you guys plan to go with gtests in the future. If they get more complex and powerful we may need more control, eg the possibility to problemlist tests. Cheers, Thomas test/hotspot/gtest/concurrent_test_runner.inline.hpp line 27: > 25: #define GTEST_CONCURRENT_TEST_RUNNER_INLINE_HPP > 26: > 27: #include "threadHelper.inline.hpp" Make sure you include all headers needed for this file. Includes should be self-contained, so pull everything they need (basically, you should be able to include it into an empty cpp file and it should build fine). You use Semaphore and some os::xxx functions, so you'd need at least os.hpp and wherever Semaphore lives. test/hotspot/gtest/concurrent_test_runner.inline.hpp line 69: > 67: testDurationMillis(testDurationMillisArg) {} > 68: > 69: virtual ~ConcurrentTestRunner() {} Do we derive from this class? And delete via base pointers? If not, I'd remove this. ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From stuefe at openjdk.java.net Wed Mar 3 05:46:50 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 3 Mar 2021 05:46:50 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> Message-ID: On Wed, 3 Mar 2021 05:04:46 GMT, Thomas Stuefe wrote: >> On a question: where does it allocate memory for 't' ? Are the elements leaked here? >> The vector will allocate and free the memory for the containers (the memory containing pointers to UnitTestThread). >> >> As for the UnitTestThread objects/instances themselves: >> - JavaTestThread. 
I checked a couple of other uses of JavaTestThread (test_symbolTable.cpp and oops/test_markWord.cpp), and neither calls delete explicitly on objects of this type >> - I presume the thread will free its resources once it exits (and we wait for the `done` semaphore to ensure this) >> - adding the code `delete t[i]` results in a crash, presumably due to freeing already released memory >> >> Hence, I believe, no changes are needed to this code to free resources. > >> Regarding the STL question, I think using std in a limited fashion in the tests seems ok before we start allowing it in the main sources, despite where Misha pointed out it already exists. It's going to be allowed pretty soon anyway. > > Really? I recently had a private discussion with Kim Barrett about this and understood that this is a contentious point. I was hoping we would have a public discussion before deciding on this. My fears are increased build times (which seem to get worse and worse) and stability and compiler issues. I am not completely against it, but I'm a bit of a burned child wrt the STL. > One question about this: where does it allocate memory for 't'? Are the elements leaked here? The backing buffer for the vector lives in the C-heap. It's a dynamically growing array, basically like our GrowableArray. Only it's not under our control and we cannot easily debug it if something goes wrong (take a look at the STL sources, they are not easy to understand). And it sidesteps our own os::malloc, so we cannot account for it, and our guards won't work. Personally I would like to keep the test harness as simple and stupid as possible, to avoid interfering with the actual test. Especially with tests which test memory allocation. I know that is a vague reason, but I also cannot find a big advantage in using std::vector, compared with a simple calloc'ed array. We don't need the dynamically-growing part here; we know right upfront how many threads we start, so it's really a fixed-size array.
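To illustrate the fixed-size alternative argued for above, here is a minimal sketch; it is not the code from the PR. `Worker`, `WorkerTable`, `make_table` and `free_table` are hypothetical names, and plain `calloc`/`free` stand in for whatever allocator the test harness would actually use.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for the test thread type used in the patch.
struct Worker {
  int id;
};

// Fixed-size table of worker pointers. The count is known before any
// thread is started, so no growable container is needed.
struct WorkerTable {
  Worker** slots;
  int count;
};

// Allocate a zero-initialized table with `count` slots.
// On allocation failure the table is empty (slots == nullptr, count == 0).
static WorkerTable make_table(int count) {
  assert(count > 0);
  WorkerTable t;
  t.slots = static_cast<Worker**>(std::calloc(count, sizeof(Worker*)));
  t.count = (t.slots != nullptr) ? count : 0;
  return t;
}

// Release only the pointer array; the workers themselves are assumed to be
// owned elsewhere (in the thread above, by the exiting threads).
static void free_table(WorkerTable& t) {
  std::free(t.slots);
  t.slots = nullptr;
  t.count = 0;
}
```

The table deliberately does not own its elements, which matches the observation in the thread that deleting the thread objects a second time crashes.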
------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From dongbo at openjdk.java.net Wed Mar 3 06:16:57 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Wed, 3 Mar 2021 06:16:57 GMT Subject: RFR: 8261142: AArch64: Incorrect instruction encoding when right-shifting vectors with shift amount equals to the element width [v12] In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 14:53:57 GMT, Andrew Haley wrote: >> Dong Bo has updated the pull request incrementally with one additional commit since the last revision: >> >> make zero shift amount obviously > > Marked as reviewed by aph (Reviewer). @theRealAph Thanks for the review. @nsjian @theRealELiu Hi, are you also ok with newest version? Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/2472 From github.com+10482586+therealeliu at openjdk.java.net Wed Mar 3 06:24:41 2021 From: github.com+10482586+therealeliu at openjdk.java.net (Eric Liu) Date: Wed, 3 Mar 2021 06:24:41 GMT Subject: RFR: 8261142: AArch64: Incorrect instruction encoding when right-shifting vectors with shift amount equals to the element width [v12] In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 14:53:57 GMT, Andrew Haley wrote: >> Dong Bo has updated the pull request incrementally with one additional commit since the last revision: >> >> make zero shift amount obviously > > Marked as reviewed by aph (Reviewer). > @theRealAph Thanks for the review. > > @nsjian @theRealELiu Hi, are you also ok with newest version? Thanks. Thanks for your fix! Looks good to me. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2472 From njian at openjdk.java.net Wed Mar 3 06:35:50 2021 From: njian at openjdk.java.net (Ningsheng Jian) Date: Wed, 3 Mar 2021 06:35:50 GMT Subject: RFR: 8261142: AArch64: Incorrect instruction encoding when right-shifting vectors with shift amount equals to the element width [v12] In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 13:46:05 GMT, Dong Bo wrote: >> In vectorAPI, when right-shifting a vector with a shift equals to the element width, the shift is transformed to zero, >> see `src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorOperators.java`: >> /** Produce {@code a>>>(n&(ESIZE*8-1))}. Integral only. */ >> public static final /*bitwise*/ Binary LSHR = binary("LSHR", ">>>", VectorSupport.VECTOR_OP_URSHIFT, VO_SHIFT); >> >> The aarch64 assembler generates wrong or illegal instructions in this case, e.g. for the JAVA code below on aarch64, >> assembler call `__ ushr(dst, __ T8B, src, 0)`, the instruction generated is not `ushr dst.8B, src.8B, 0`, but `ushr dst.4H, src.4H, 16` instead. >> According to local tests, JVM gives wrong results for byte/short and crashes with SIGILL for integer/long. >> ByteVector vba = ByteVector.fromArray(byte64SPECIES, bytesA, 8 * i); >> vbb.lanewise(VectorOperators.ASHR, 8).intoArray(arrBytes, 8 * i); >> >> The legal right shift amount should be in the range 1 to the element width in bits on aarch64: >> https://developer.arm.com/documentation/dui0801/f/A64-SIMD-Vector-Instructions/USHR--vector-?lang=en >> >> This fix handles zero shift separately. If the shift is zero, it generates `orr` for right shift, `addv` for right shift and accumulate. >> Verified with linux-aarch64-server-fastdebug, tier1. Also created a jtreg to reproduce the issue and for regression tests. > > Dong Bo has updated the pull request incrementally with one additional commit since the last revision: > > make zero shift amount obviously Marked as reviewed by njian (Committer). 
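For reference, the masking rule quoted in the description can be sketched as below. This is only an illustration of the arithmetic, not code from the patch: `masked_shift` and `is_legal_right_shift_imm` are hypothetical helper names, and the legal range used is the one stated in the description (1 to the element width in bits).

```cpp
// Shift count after masking, as documented for VectorOperators.LSHR:
// n & (ESIZE*8 - 1), where ESIZE is the element size in bytes.
static int masked_shift(int n, int esize_bytes) {
  return n & (esize_bytes * 8 - 1);
}

// AArch64 vector right shifts take an immediate in the range
// [1, element width in bits]; zero is not encodable.
static bool is_legal_right_shift_imm(int shift, int esize_bytes) {
  return shift >= 1 && shift <= esize_bytes * 8;
}
```

With byte lanes, a requested shift of 8 masks to 0, which falls outside the encodable range; that is the case the fix handles separately.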
------------- PR: https://git.openjdk.java.net/jdk/pull/2472 From dongbo at openjdk.java.net Wed Mar 3 06:46:54 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Wed, 3 Mar 2021 06:46:54 GMT Subject: Integrated: 8261142: AArch64: Incorrect instruction encoding when right-shifting vectors with shift amount equals to the element width In-Reply-To: References: Message-ID: On Tue, 9 Feb 2021 06:55:50 GMT, Dong Bo wrote: > In vectorAPI, when right-shifting a vector with a shift equals to the element width, the shift is transformed to zero, > see `src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorOperators.java`: > /** Produce {@code a>>>(n&(ESIZE*8-1))}. Integral only. */ > public static final /*bitwise*/ Binary LSHR = binary("LSHR", ">>>", VectorSupport.VECTOR_OP_URSHIFT, VO_SHIFT); > > The aarch64 assembler generates wrong or illegal instructions in this case, e.g. for the JAVA code below on aarch64, > assembler call `__ ushr(dst, __ T8B, src, 0)`, the instruction generated is not `ushr dst.8B, src.8B, 0`, but `ushr dst.4H, src.4H, 16` instead. > According to local tests, JVM gives wrong results for byte/short and crashes with SIGILL for integer/long. > ByteVector vba = ByteVector.fromArray(byte64SPECIES, bytesA, 8 * i); > vbb.lanewise(VectorOperators.ASHR, 8).intoArray(arrBytes, 8 * i); > > The legal right shift amount should be in the range 1 to the element width in bits on aarch64: > https://developer.arm.com/documentation/dui0801/f/A64-SIMD-Vector-Instructions/USHR--vector-?lang=en > > This fix handles zero shift separately. If the shift is zero, it generates `orr` for right shift, `addv` for right shift and accumulate. > Verified with linux-aarch64-server-fastdebug, tier1. Also created a jtreg to reproduce the issue and for regression tests. This pull request has now been integrated. 
Changeset: c15801e9 Author: Dong Bo Committer: Fei Yang URL: https://git.openjdk.java.net/jdk/commit/c15801e9 Stats: 486 lines in 3 files changed: 486 ins; 0 del; 0 mod 8261142: AArch64: Incorrect instruction encoding when right-shifting vectors with shift amount equals to the element width Reviewed-by: njian, aph ------------- PR: https://git.openjdk.java.net/jdk/pull/2472 From dongbo at openjdk.java.net Wed Mar 3 08:11:05 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Wed, 3 Mar 2021 08:11:05 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> Message-ID: On Tue, 2 Mar 2021 16:00:55 GMT, Andrew Haley wrote: > > OKAY, this make sense to us. > > If it is OK to keep the exclusive part of this patch? :-) > > As far as we know, the exclusive instructions are not being revised. > > And we see `ldxr+stxlr+dmb` have been used in linux kernel since 2014 [1], and still used by now [2]. > > I know that, but the Linux definition of a "full barrier" isn't quite as strong as HotSpot's `memory_order_conservative`, so we'd need a much more detailed analysis of what behaviours we can permit. Also, we'd have to find a strong reason to invest time in AArch64 without LSE instructions. > > > BTW, the barrier-ordered-before applies with stlxr according to the architecture specification: > > Sure, but so what? This is about the entire ldxr/stlxr combination and `memory_order_conservative` , in which we try to mimic Intel's "Loads and Stores Are Not Reordered with Locked Instructions" specification. Hi, For us, we still have servers used by our customers that does not support LSE extension. Hm, from our point of view, `ldaxr+stlxr+dmb` and `ldxr+stlxr+dmb` provide the same order semantics. 
The acquire are used to ensure all loads/stores that are after an `ldaxr` (actually loads/stores after the `dmb` of `atomic_*default*_impl` in this case) in program order, while the `dmb` has already guaranteed this for us. Without the acquire, the loads/stores after the atomic operations still can not pass the `dmb`. Remove the acquire does not change the order between preceding loads/stores and `stlxr`. ------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From lkorinth at openjdk.java.net Wed Mar 3 09:49:07 2021 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Wed, 3 Mar 2021 09:49:07 GMT Subject: RFR: 8262000: jdk/jfr/event/gc/detailed/TestPromotionFailedEventWithParallelScavenge.java failed with "OutOfMemoryError: Java heap space" Message-ID: By relaxing the matching of OOM error slightly, the test case can catch OOM errors not starting with "Exception: " ------------- Commit messages: - 8262000: jdk/jfr/event/gc/detailed/TestPromotionFailedEventWithParallelScavenge.java failed with "OutOfMemoryError: Java heap space" Changes: https://git.openjdk.java.net/jdk/pull/2806/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2806&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8262000 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/2806.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2806/head:pull/2806 PR: https://git.openjdk.java.net/jdk/pull/2806 From aph at openjdk.java.net Wed Mar 3 10:53:56 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 3 Mar 2021 10:53:56 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v5] In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 04:36:13 GMT, Yasumasa Suenaga wrote: >> HotSpot generates CPU description when it is started. 
We can see it `jdk.CPUInformation` JFR event as below: >> >> $ jfr print --events jdk.CPUInformation raspi4.jfr >> jdk.CPUInformation { >> startTime = 22:57:13.521 >> cpu = "AArch64" >> description = "AArch64 0x41:0x0:0xd08:3, simd, crc" >> sockets = 4 >> cores = 4 >> hwThreads = 4 >> } >> >> `description` contains "AArch64", it is fixed value, we cannot guess the process was run on what machine (SoC). >> >> In Linux, we can use `compatible`property in device tree to guess the machine. The 'compatible' property contains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with sorted from most compatible to least. >> >> After this change, we can get the description as below: >> >> jdk.CPUInformation { >> startTime = 00:32:49.767 >> cpu = "AArch64" >> description = "raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc" >> sockets = 4 >> cores = 4 >> hwThreads = 4 >> } >> >> In Linux on AMD64, we can see as following, then we can guess the CPU model from it. The same should do for AArch64. >> >> jdk.CPUInformation { >> startTime = 17:28:03.907 >> cpu = "AMD (null) (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 SSE4A AMD64" >> description = "Brand: AMD Ryzen 3 3300X 4-Core Processor , Vendor: AuthenticAMD >> Family: (0x17), Model: (0x71), Stepping: 0x0 >> Ext. family: 0x8, Ext. model: 0x7, Type: 0x0, Signature: 0x00870f10 >> Features: ebx: 0x01020800, ecx: 0xfed83203, edx: 0x178bfbff >> Ext. 
features: eax: 0x00870f10, ebx: 0x20000000, ecx: 0x004003f3, edx: 0x2fd3fbff >> Supports: On-Chip FPU, Virtual Mode Extensions, Debugging Extensions, Page Size Extensions, Time Stamp Counter, Model Specific Registers, Physical Address Extension, Machine Check Exceptions, CMPXCHG8B Instruction, On-Chip APIC, Fast System Call, Memory Type Range Registers, Page Global Enable, Machine Check Architecture, Conditional Mov Instruction, Page Attribute Table, 36-bit Page Size Extension, CLFLUSH Instruction, Intel Architecture MMX Technology, Fast Float Point Save and Restore, Streaming SIMD extensions, Streaming SIMD extensions 2, Hyper Threading, Streaming SIMD Extensions 3, PCLMULQDQ, Supplemental Streaming SIMD Extensions 3, Fused Multiply-Add, CMPXCHG16B, Streaming SIMD extensions 4.1, Streaming SIMD extensions 4.2, MOVBE, Popcount instruction, AESNI, XSAVE, OSXSAVE, AVX, F16C, LAHF/SAHF instruction support, Core multi-processor leagacy mode, Advanced Bit Manipulations: LZCNT, SSE4A: MOVNTSS, MOVNTSD, EXTRQ, INSERTQ, Misaligned SSE mode, SYSCALL/SYSRET, Execute Di sable Bit, RDTSCP, Intel 64 Architecture" >> sockets = 1 >> cores = 2 >> hwThreads = 2 >> } > > Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: > > Fix compile error in Windows AArch64 src/hotspot/os_cpu/linux_aarch64/vm_version_linux_aarch64.cpp line 178: > 176: struct stat statbuf; > 177: fstat(fd, &statbuf); > 178: ssize_t read_sz = read(fd, buf, statbuf.st_size); This looks wrong: the read() call should use buflen. src/hotspot/os_cpu/windows_aarch64/vm_version_windows_aarch64.cpp line 103: > 101: void VM_Version::get_compatible_board(char *buf, int buflen) { > 102: assert(buf != NULL, "invalid argument"); > 103: *buf = '\0'; This is wrong too: it should check buflen. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From tschatzl at openjdk.java.net Wed Mar 3 12:19:50 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 3 Mar 2021 12:19:50 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v9] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Tue, 2 Mar 2021 14:33:14 GMT, Jaroslav Bachorik wrote: >> The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in the worst case 'at last full GC') in the form of a periodically emitted JFR event. >> >> ## Introducing new JFR event >> >> While there is already a 'GC Heap Summary' JFR event, it does not fit the requirements as it is closely tied to the GC cycle, so e.g. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of having no heap summary events present in the JFR recording at all. >> Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information about the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. >> >> ## Implementation >> >> The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is the `size_t live() const` method added to the `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning the 'used' value. >> >> The implementations are based on my (rather shallow) knowledge of the inner workings of the respective GC engines and I am open to suggestions to make them better/correct.
>> >> ### Epsilon GC >> >> Trivial implementation - just return `used()` instead. >> >> ### Serial GC >> >> Here we utilize the fact that the mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps internal info about objects that are 'dead' but excluded from the compaction effort, and we can use these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). >> >> ### Parallel GC >> >> For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). >> >> ### G1 GC >> >> Using the `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in the G1 implementation chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. >> >> ### Shenandoah >> >> In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. >> This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to the `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. >> >> ### ZGC >> >> `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via the `ZCollectedHeap::live()` method.
> > Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: > > Add tests for the heap usage summary event Fwiw, the change still does not capture G1 full gc `live_estimate()`. src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 1070: > 1068: > 1069: uint num_selected_for_rebuild() const { return _num_regions_selected_for_rebuild; } > 1070: size_t live_estimate() const { return _live; } Please sync the member name with the getter name. I.e. `_live` -> `_live_estimate` src/hotspot/share/gc/parallel/psAdaptiveSizePolicy.hpp line 60: > 58: class PSAdaptiveSizePolicy : public AdaptiveSizePolicy { > 59: friend class PSGCAdaptivePolicyCounters; > 60: friend class ParallelScavengeHeap; Delete this apparently unneeded friend declaration (compiled successfully without here) src/hotspot/share/gc/parallel/parallelScavengeHeap.hpp line 87: > 85: > 86: // in order to provide accurate estimate this method must be called only when the heap has just been collected and compacted > 87: inline void capture_live(); Sentences should start with upper case in the comment. Also I'd prefer to name the method `update_live_estimate()` instead. src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 182: > 180: G1BlockOffsetTable* _bot; > 181: > 182: volatile size_t _live; I'm not happy with naming this `_live`, better use `_live_estimate`. The contents are not continuously updated and basically out of date after the first following allocation. This includes the naming in all other instances too. src/hotspot/share/gc/serial/serialHeap.hpp line 44: > 42: MemoryPool* _old_pool; > 43: > 44: size_t _live_size; Please rename to `_live_estimate` like the others. Avoid having different names in different collectors for the same thing. 
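The cached-estimate scheme from the PR description, combined with the naming requested above, could be sketched roughly as follows. This is only an illustrative model, not HotSpot code: `ToyHeap`, `finish_collection()` and `kNotComputed` are hypothetical names, and `std::atomic` stands in for the `volatile` field used in the patch.

```cpp
#include <atomic>
#include <cstddef>

// Sentinel meaning "no collection has produced an estimate yet" (hypothetical).
static const size_t kNotComputed = static_cast<size_t>(-1);

// Toy stand-in for a collected heap; it only tracks byte counts.
class ToyHeap {
  size_t _used = 0;
  std::atomic<size_t> _live_estimate{kNotComputed};

 public:
  void allocate(size_t bytes) { _used += bytes; }
  size_t used() const { return _used; }

  // Called once at the end of a (compacting) collection, when the
  // surviving bytes are known exactly.
  void update_live_estimate(size_t live_bytes) {
    _live_estimate.store(live_bytes, std::memory_order_relaxed);
  }

  // Models the end of a collection cycle: only survivors remain in use.
  void finish_collection(size_t surviving_bytes) {
    _used = surviving_bytes;
    update_live_estimate(surviving_bytes);
  }

  // Cheap to call at any time; falls back to used() before the first cycle.
  size_t live_estimate() const {
    size_t v = _live_estimate.load(std::memory_order_relaxed);
    return v == kNotComputed ? used() : v;
  }
};
```

The value returned by `live_estimate()` is not refreshed by `allocate()`, which is the reason given above for calling it an estimate rather than `_live`.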
src/hotspot/share/gc/shared/space.inline.hpp line 128: > 126: p2i(dead_start), p2i(dead_end), dead_length * HeapWordSize); > 127: > 128: _dead_space += dead_length; I do not think adding this to the counter here instead of the other method for every object makes a difference performance-wise. As mentioned before, `_allowed_deadspace_words` counts *down* from `(space->capacity() * ratio / 100) / HeapWordSize;` to whatever end value. So at the end of collection, `(space->capacity() * ratio / 100) / HeapWordSize - _allowed_deadspace_words` should be equal to what `_dead_space` is now. Please add a getter to `DeadSpacer` that calculates this (factoring out the calculation of the maximum allowed deadspace). src/hotspot/share/gc/shared/space.hpp line 553: > 551: size_t capacity() const { return byte_size(bottom(), end()); } > 552: size_t used() const { return byte_size(bottom(), top()); } > 553: size_t live() const { The code for serial gc, contrary to others, tries to give some resemblance of tracking actual liveness. I.e. calculating this anew every call to `SerialHeap::live()`. However if calling an `update_live_estimate()` in parallel and G1 (and the other collectors) is fine at certain places, this should be as good for serial gc. Doing so would reduce the footprint of this change quite a bit (for serial gc) ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From ysuenaga at openjdk.java.net Wed Mar 3 12:25:09 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 3 Mar 2021 12:25:09 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v6] In-Reply-To: References: Message-ID: > HotSpot generates CPU description when it is started. 
We can see it `jdk.CPUInformation` JFR event as below: > > $ jfr print --events jdk.CPUInformation raspi4.jfr > jdk.CPUInformation { > startTime = 22:57:13.521 > cpu = "AArch64" > description = "AArch64 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > `description` contains "AArch64", it is fixed value, we cannot guess the process was run on what machine (SoC). > > In Linux, we can use `compatible`property in device tree to guess the machine. The 'compatible' property contains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with sorted from most compatible to least. > > After this change, we can get the description as below: > > jdk.CPUInformation { > startTime = 00:32:49.767 > cpu = "AArch64" > description = "raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > In Linux on AMD64, we can see as following, then we can guess the CPU model from it. The same should do for AArch64. > > jdk.CPUInformation { > startTime = 17:28:03.907 > cpu = "AMD (null) (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 SSE4A AMD64" > description = "Brand: AMD Ryzen 3 3300X 4-Core Processor , Vendor: AuthenticAMD > Family: (0x17), Model: (0x71), Stepping: 0x0 > Ext. family: 0x8, Ext. model: 0x7, Type: 0x0, Signature: 0x00870f10 > Features: ebx: 0x01020800, ecx: 0xfed83203, edx: 0x178bfbff > Ext. 
features: eax: 0x00870f10, ebx: 0x20000000, ecx: 0x004003f3, edx: 0x2fd3fbff > Supports: On-Chip FPU, Virtual Mode Extensions, Debugging Extensions, Page Size Extensions, Time Stamp Counter, Model Specific Registers, Physical Address Extension, Machine Check Exceptions, CMPXCHG8B Instruction, On-Chip APIC, Fast System Call, Memory Type Range Registers, Page Global Enable, Machine Check Architecture, Conditional Mov Instruction, Page Attribute Table, 36-bit Page Size Extension, CLFLUSH Instruction, Intel Architecture MMX Technology, Fast Float Point Save and Restore, Streaming SIMD extensions, Streaming SIMD extensions 2, Hyper Threading, Streaming SIMD Extensions 3, PCLMULQDQ, Supplemental Streaming SIMD Extensions 3, Fused Multiply-Add, CMPXCHG16B, Streaming SIMD extensions 4.1, Streaming SIMD extensions 4.2, MOVBE, Popcount instruction, AESNI, XSAVE, OSXSAVE, AVX, F16C, LAHF/SAHF instruction support, Core multi-processor leagacy mode, Advanced Bit Manipulations: LZCNT, SSE4A: MOVNTSS, MOVNTSD, EXTRQ, INSERTQ, Misaligned SSE mode, SYSCALL/SYSRET, Execute Dis able Bit, RDTSCP, Intel 64 Architecture" > sockets = 1 > cores = 2 > hwThreads = 2 > } Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: Fix comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2759/files - new: https://git.openjdk.java.net/jdk/pull/2759/files/55ccd442..28edb130 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2759&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2759&range=04-05 Stats: 9 lines in 2 files changed: 3 ins; 1 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/2759.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2759/head:pull/2759 PR: https://git.openjdk.java.net/jdk/pull/2759 From ysuenaga at openjdk.java.net Wed Mar 3 12:25:10 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 3 Mar 2021 12:25:10 GMT Subject: RFR: 8262491: 
AArch64: CPU description should contain compatible board list [v5] In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 10:49:41 GMT, Andrew Haley wrote: >> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix compile error in Windows AArch64 > > src/hotspot/os_cpu/linux_aarch64/vm_version_linux_aarch64.cpp line 178: > >> 176: struct stat statbuf; >> 177: fstat(fd, &statbuf); >> 178: ssize_t read_sz = read(fd, buf, statbuf.st_size); > > This looks wrong: the read() call should use buflen. Fixed it in new commit. > src/hotspot/os_cpu/windows_aarch64/vm_version_windows_aarch64.cpp line 103: > >> 101: void VM_Version::get_compatible_board(char *buf, int buflen) { >> 102: assert(buf != NULL, "invalid argument"); >> 103: *buf = '\0'; > > This is wrong too: it should check buflen. I added assert for buflen in new commit. ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From thartmann at openjdk.java.net Wed Mar 3 12:43:52 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Wed, 3 Mar 2021 12:43:52 GMT Subject: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2] In-Reply-To: References: <2_Gpraz6NaY17HPfRDW-LD-sQrrPQ4dpIVP8vikpdXM=.d425cd8b-aea5-43be-865e-72229db81e6e@github.com> <2qEkvkaxAPHeFaDoCRmcPaehczQgwZNnZMxO2Z-Vc28=.d4845a88-7d71-4768-b952-5ff9c4ab8311@github.com> Message-ID: On Mon, 22 Feb 2021 20:33:58 GMT, Evgeny Nikitin wrote: >> hm... that can mean that there is a product bug (or my recollections about code heaps aren't as good as I thought). >> >> @TobiHartmann , @iwanowww, could you please take a look? Evgeny's observations suggest that method handle intrinsics use `non-profiled nmethods` and `profiled nmethods` heaps and not `non-nmethods` heap despite the fact that the last one has plenty of free space. 
my understanding is/was that we should have used the `non-nmethods` heap for MH intrinsics first and, if it's exhausted, start to use the other heaps. >> >> Thanks, >> -- Igor > > I inspected a sample built-up cache with the 'Compiler.CodeHeap_Analytics' diagnostic command. The vast majority of the 'non-profiled nmethods' heap consists of zillions of `invokeBasic`, `linkToStatic` and similar, with different signatures. The dump shows something like this: > > nMethod (active) invokeBasic(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; > nMethod (active) invokeBasic(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;DFJD)Ljava/lang/Object; > nMethod (active) invokeBasic(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;DFJDLjava/lang/Object;)Ljava/lang/Object; > nMethod (active) invokeBasic(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;DFJDLjava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; > > ... with their signatures marching to the right screen border and beyond. Given that their arguments are mish-mashed in all possible combinations, there are really many of them (I've been able to build up caches up to 300MB without a pair of signatures repeating). They are nmethods, and should be in the nmethods cache, shouldn't they? Sorry for missing the @TobiHartmann mention (I had GitHub notifications disabled). Unlike VM internal adapters/buffers/runtime stubs, method handle intrinsics are actual Java methods that are compiled by the JIT and therefore need to go into the profiled or non-profiled code cache segment. They cannot go into the non-nmethod segment.
------------- PR: https://git.openjdk.java.net/jdk/pull/2523 From aph at openjdk.java.net Wed Mar 3 12:48:38 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 3 Mar 2021 12:48:38 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> Message-ID: On Wed, 3 Mar 2021 08:07:35 GMT, Dong Bo wrote: > Without the acquire, the loads/stores after the atomic operation still cannot pass the `dmb`. > Removing the acquire does not change the order between preceding loads/stores and `stlxr`. Looks like it. I tried this example, which makes sure that a preceding store does not pass the load in LDXR;STLXR;DMB :

    MOConservative
    { 0:X0=a; 0:X1=b; 1:X0=a; 1:X1=b; a=0; b=0; }
     P0            | P1                 ;
     MOV W3, #1    | MOV W3, #1         ;
     STLR W3, [X0] | STR W3, [X1]       ;
     LDAR W1, [X1] | LDXR W1, [X0]      ;
                   | STLXR W5, W4, [X0] ;
                   | CBZ W5, FOO        ;
                   | MOV W1, #99        ;
                   | FOO:               ;
                   | DMB ISH            ;
    exists (0:X1=0 /\ 1:X1 = 0)

I don't think a preceding load can be reordered with the `ldxr` either, but I haven't written that test. ------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From tschatzl at openjdk.java.net Wed Mar 3 13:01:53 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 3 Mar 2021 13:01:53 GMT Subject: RFR: 8262000: jdk/jfr/event/gc/detailed/TestPromotionFailedEventWithParallelScavenge.java failed with "OutOfMemoryError: Java heap space" In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 09:43:55 GMT, Leo Korinth wrote: > By relaxing the matching of OOM error slightly, the test case can catch OOM errors not starting with "Exception: " Lgtm. It shouldn't matter for this test which Java thread threw the OOME. ------------- Marked as reviewed by tschatzl (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/2806 From lucy at openjdk.java.net Wed Mar 3 13:22:17 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Wed, 3 Mar 2021 13:22:17 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v9] In-Reply-To: References: Message-ID: > Dear community, > may I please request reviews for this fix, improving the usefulness of method invocation counters. > - aggregation counters are retyped as uint64_t, shifting the overflow probability way out (> 500 years in case of a 1 GHz counter update frequency). > - counters for individual methods are interpreted as (unsigned int), in contrast to their declaration as int. This gives us a factor of two before the counters overflow. > - as a special case, "compiled_invocation_counter" is retyped as long, because it has a higher update frequency than other counters. > - before/after sample output is attached to the bug description. > > Thank you! > Lutz Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: Changes requested by veresov ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2511/files - new: https://git.openjdk.java.net/jdk/pull/2511/files/e8af119b..5c27640f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2511&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2511&range=07-08 Stats: 27 lines in 3 files changed: 3 ins; 12 del; 12 mod Patch: https://git.openjdk.java.net/jdk/pull/2511.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2511/head:pull/2511 PR: https://git.openjdk.java.net/jdk/pull/2511 From lucy at openjdk.java.net Wed Mar 3 13:28:21 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Wed, 3 Mar 2021 13:28:21 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v10] In-Reply-To: References: Message-ID: > Dear community, > may I please request reviews for this fix, improving the usefulness of method 
invocation counters. > - aggregation counters are retyped as uint64_t, shifting the overflow probability way out (> 500 years in case of a 1 GHz counter update frequency). > - counters for individual methods are interpreted as (unsigned int), in contrast to their declaration as int. This gives us a factor of two before the counters overflow. > - as a special case, "compiled_invocation_counter" is retyped as long, because it has a higher update frequency than other counters. > - before/after sample output is attached to the bug description. > > Thank you! > Lutz Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: revert copyright change to get rid of unchanged file ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2511/files - new: https://git.openjdk.java.net/jdk/pull/2511/files/5c27640f..0faea5aa Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2511&range=09 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2511&range=08-09 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/2511.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2511/head:pull/2511 PR: https://git.openjdk.java.net/jdk/pull/2511 From lucy at openjdk.java.net Wed Mar 3 13:28:25 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Wed, 3 Mar 2021 13:28:25 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v2] In-Reply-To: <8B4V-lAbFWqYJzdcbVyz69U05GX14vFRx6nY-tKorNU=.4dff6316-f85e-4578-ae59-2dab8d804627@github.com> References: <8B4V-lAbFWqYJzdcbVyz69U05GX14vFRx6nY-tKorNU=.4dff6316-f85e-4578-ae59-2dab8d804627@github.com> Message-ID: On Tue, 2 Mar 2021 20:01:22 GMT, Igor Veresov wrote: >> Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: >> >> 8261447: requested changes by TobiHartmann > > Changes requested by iveresov (Reviewer). 
@veresov @vnkozlov Thank you again for the discussion yesterday. I have pushed the resulting changes for you to review. Awaiting your comments... ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From aph at openjdk.java.net Wed Mar 3 14:00:52 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 3 Mar 2021 14:00:52 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v6] In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 12:25:09 GMT, Yasumasa Suenaga wrote: >> HotSpot generates CPU description when it is started. We can see it `jdk.CPUInformation` JFR event as below: >> >> $ jfr print --events jdk.CPUInformation raspi4.jfr >> jdk.CPUInformation { >> startTime = 22:57:13.521 >> cpu = "AArch64" >> description = "AArch64 0x41:0x0:0xd08:3, simd, crc" >> sockets = 4 >> cores = 4 >> hwThreads = 4 >> } >> >> `description` contains "AArch64", it is fixed value, we cannot guess the process was run on what machine (SoC). >> >> In Linux, we can use `compatible`property in device tree to guess the machine. The 'compatible' property contains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with sorted from most compatible to least. >> >> After this change, we can get the description as below: >> >> jdk.CPUInformation { >> startTime = 00:32:49.767 >> cpu = "AArch64" >> description = "raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc" >> sockets = 4 >> cores = 4 >> hwThreads = 4 >> } >> >> In Linux on AMD64, we can see as following, then we can guess the CPU model from it. The same should do for AArch64. >> >> jdk.CPUInformation { >> startTime = 17:28:03.907 >> cpu = "AMD (null) (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 SSE4A AMD64" >> description = "Brand: AMD Ryzen 3 3300X 4-Core Processor , Vendor: AuthenticAMD >> Family: (0x17), Model: (0x71), Stepping: 0x0 >> Ext. family: 0x8, Ext. 
model: 0x7, Type: 0x0, Signature: 0x00870f10 >> Features: ebx: 0x01020800, ecx: 0xfed83203, edx: 0x178bfbff >> Ext. features: eax: 0x00870f10, ebx: 0x20000000, ecx: 0x004003f3, edx: 0x2fd3fbff >> Supports: On-Chip FPU, Virtual Mode Extensions, Debugging Extensions, Page Size Extensions, Time Stamp Counter, Model Specific Registers, Physical Address Extension, Machine Check Exceptions, CMPXCHG8B Instruction, On-Chip APIC, Fast System Call, Memory Type Range Registers, Page Global Enable, Machine Check Architecture, Conditional Mov Instruction, Page Attribute Table, 36-bit Page Size Extension, CLFLUSH Instruction, Intel Architecture MMX Technology, Fast Float Point Save and Restore, Streaming SIMD extensions, Streaming SIMD extensions 2, Hyper Threading, Streaming SIMD Extensions 3, PCLMULQDQ, Supplemental Streaming SIMD Extensions 3, Fused Multiply-Add, CMPXCHG16B, Streaming SIMD extensions 4.1, Streaming SIMD extensions 4.2, MOVBE, Popcount instruction, AESNI, XSAVE, OSXSAVE, AVX, F16C, LAHF/SAHF instruction support, Core multi-processor leagacy mode, Advanced Bit Manipulations: LZCNT, SSE4A: MOVNTSS, MOVNTSD, EXTRQ, INSERTQ, Misaligned SSE mode, SYSCALL/SYSRET, Execute Di sable Bit, RDTSCP, Intel 64 Architecture" >> sockets = 1 >> cores = 2 >> hwThreads = 2 >> } > > Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: > > Fix comments src/hotspot/os_cpu/linux_aarch64/vm_version_linux_aarch64.cpp line 189: > 187: close(fd); > 188: } > 189: } It's still wrong. If the read() call has filled the whole buffer, the string is not zero-terminated when this function returns. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From ysuenaga at openjdk.java.net Wed Mar 3 14:07:52 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 3 Mar 2021 14:07:52 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v6] In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 13:58:04 GMT, Andrew Haley wrote: >> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix comments > > src/hotspot/os_cpu/linux_aarch64/vm_version_linux_aarch64.cpp line 189: > >> 187: close(fd); >> 188: } >> 189: } > > It's still wrong. If the read() call has filled the whole buffer, the string is not zero-terminated when this function returns. So I set `\0` to the tail of `buf` at L178. ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From coleenp at openjdk.java.net Wed Mar 3 14:42:42 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 3 Mar 2021 14:42:42 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> Message-ID: On Wed, 3 Mar 2021 05:11:16 GMT, Thomas Stuefe wrote: >>> Regarding the STL question, I think using std in a limited fashion in the tests seems ok before we start allowing it in the main sources, despite where Misha pointed out it already exists. It's going to be allowed pretty soon anyway. >> >> Really? I recently had a private discussion with Kim Barret about this and understood that this is a contentious point. I was hoping we would have a public discussion about this before deciding on this. My fears are increased build times (which seem to get worse and worse) and stability- and compiler issues. I am not completely against it, but I'm a bit of a burned child wrt to STL. 
> >> One question about this: where does it allocate memory for 't' ? Are the elements leaked here? > > Backing buffer for vector lives in C-heap. It's a dynamically growing array, basically like our GrowableArray. Only it's not under our control and we cannot easily debug it if something goes wrong (take a look at the STL sources, they are not easy to understand). And it sidesteps our own os::malloc, so we cannot account for it, and our guards won't work. > > Personally I would like to keep the test harness as simple and stupid as possible, to avoid interfering with the actual test. Especially with tests which test memory allocation. I know that is a vague reason, but I also cannot find a big advantage in using std::vector, compared with a simple calloced array. We don't need the dynamically-growing part here, we know right upfront how many threads we start, so it's really a fixed-size array. Okay so you're right that we can't account in NMT and with our malloc guards for the memory that is allocated with std::vector, so I'm convinced that Misha should change this. Hopefully we can figure out how to make std::vector work in the future, because Misha's 2 lines for it look quite nice. Do you think using std:: would increase build times? My suspicion is that the Access barrier templates are to blame, but building hotspot is a fraction of building the whole jdk, which does seem to take a long time.
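As a rough sketch of the alternative discussed here, a fixed-size array obtained through an accounting allocator might look like the following. Everything in it is hypothetical: `tracked_calloc` and `g_tracked_bytes` merely stand in for an accounted allocation path such as `os::malloc` with NMT, and `ThreadSlot` is an invented per-thread record.

```cpp
#include <cstddef>
#include <cstdlib>

// Stand-in for an accounting allocator; the counter models the kind of
// bookkeeping that std::vector's internal allocations would bypass.
static size_t g_tracked_bytes = 0;

static void* tracked_calloc(size_t count, size_t size) {
  void* p = std::calloc(count, size);
  if (p != nullptr) {
    g_tracked_bytes += count * size;
  }
  return p;
}

// Per-thread bookkeeping slot (hypothetical).
struct ThreadSlot {
  int id;
  bool finished;
};

// The thread count is known up front, so a fixed-size, zero-initialized
// array is sufficient; no growth logic is needed.
static ThreadSlot* make_slots(int nthreads) {
  ThreadSlot* slots =
      static_cast<ThreadSlot*>(tracked_calloc(nthreads, sizeof(ThreadSlot)));
  if (slots == nullptr) {
    return nullptr;
  }
  for (int i = 0; i < nthreads; i++) {
    slots[i].id = i;
  }
  return slots;
}
```

The caller owns the array and releases it with `std::free()` once all threads have been joined; every byte allocated for it shows up in the counter.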
------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From coleenp at openjdk.java.net Wed Mar 3 14:48:45 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 3 Mar 2021 14:48:45 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> Message-ID: On Wed, 3 Mar 2021 05:43:32 GMT, Thomas Stuefe wrote: >> Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: >> >> Using regular brackets in initializer list instead of curly brackets > > Hi Misha, > >> Hi Thomas, >> I do most of my JDK work in Java, hence may have missed knowing the restriction of using STL. >> I wonder if this restriction is only for source code, and not the tests. I did a quick search under "test/hotspot/gtest", and found many uses of std::, including the data structures. For instance, jfr/test_networkUtilization.cpp uses std::map, std::list and std::vector. > > See my answer to Coleen. I think using the STL would have a number of repercussions which should be discussed before doing this step. > > About the patch itself: > > I am not sure moving more and more test control down into gtest is a good thing. gtests have mostly be single threaded until now and could be run within a make. AFAIU gtests offer way less run control than jtreg does, e.g. you cannot disable the test without recompiling the hotspot (there is no ProblemList equivalent for gtest), you have no test groupings (e.g. to separate stress tests which need a whole machine for themselves) etc. Also, we will duplicate more and more thread control in C++. > > I understand the wish to remove the test coding from the hotspot implementation. Its so ugly and should not live there. 
But just moving them to separate implementation files, possibly within a clearly marked "tests" folder, would be a first good step. > > I will not block this if you have decided to go this way. Just wanted to understand the direction you guys plan to go with gtests in the future. If they get more complex and powerful we may need more control, eg the possibility to problemlist tests. > > Cheers, Thomas Moving these tests out of our source code is a good change. We already have a couple multithreaded tests in gtest, and adding this one isn't going to make the gtests unreliable. For future multithreaded tests, it makes sense to make them jtreg tests so that we can control their execution. So I'd like to see this PR integrated, with suggested changes. ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From kbarrett at openjdk.java.net Wed Mar 3 14:48:46 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 3 Mar 2021 14:48:46 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: Message-ID: On Fri, 26 Feb 2021 03:45:07 GMT, Mikhailo Seledtsov wrote: >> test/hotspot/gtest/concurrentTestRunner.inline.hpp line 1: >> >>> 1: /* >> >> for c++, we don't use camelCase in filenames, but rather use small_snake_case > > OK. I saw gtestMain.cpp, gtestLauncher.cpp and a few others, and just followed that. I also see a number of test_camelCase.cpp: test_primitiveConversions.cpp, test_logSelectionList.cpp and so on. In fact, it seems the most prevalent pattern for gtests is test_camelCase.cpp. > > Anyway, no problem, I can rename this file to concurrent_test_runner.inline.hpp This statement that we use small_snake_case for filenames is mostly wrong. Most HotSpot files contain some primary class, whose name is CamelCased, and the associated files are camelCased. There are a fairly small number of exceptions to that. 
There are some suffixes and prefixes that get added on, such as cpu and os suffixes (`_linux`, `_windows`, &etc, suffixes, `test_` gtest prefix, component prefixes like `c1_` and `gc_`), but those are mostly add-ons to the basic camelCase convention. ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From egahlin at openjdk.java.net Wed Mar 3 14:50:44 2021 From: egahlin at openjdk.java.net (Erik Gahlin) Date: Wed, 3 Mar 2021 14:50:44 GMT Subject: RFR: 8262000: jdk/jfr/event/gc/detailed/TestPromotionFailedEventWithParallelScavenge.java failed with "OutOfMemoryError: Java heap space" In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 09:43:55 GMT, Leo Korinth wrote: > By relaxing the matching of OOM error slightly, the test case can catch OOM errors not starting with "Exception: " Marked as reviewed by egahlin (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2806 From iveresov at openjdk.java.net Wed Mar 3 15:25:41 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Wed, 3 Mar 2021 15:25:41 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v10] In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 13:28:21 GMT, Lutz Schmidt wrote: >> Dear community, >> may I please request reviews for this fix, improving the usefulness of method invocation counters. >> - aggregation counters are retyped as uint64_t, shifting the overflow probability way out (> 500 years in case of a 1 GHz counter update frequency). >> - counters for individual methods are interpreted as (unsigned int), in contrast to their declaration as int. This gives us a factor of two before the counters overflow. >> - as a special case, "compiled_invocation_counter" is retyped as long, because it has a higher update frequency than other counters. >> - before/after sample output is attached to the bug description. >> >> Thank you! 
>> Lutz > > Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: > > revert copyright change to get rid of unchanged file This looks good to me. ------------- Marked as reviewed by iveresov (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2511 From kbarrett at openjdk.java.net Wed Mar 3 15:37:46 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 3 Mar 2021 15:37:46 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> Message-ID: On Wed, 3 Mar 2021 14:39:33 GMT, Coleen Phillimore wrote: >>> One question about this: where does it allocate memory for 't' ? Are the elements leaked here? >> >> Backing buffer for vector lives in C-heap. Its a dynamically growing array, basically like our GrowableArray. Only its not under our control and we cannot easily debug it if something goes wrong (take a look at the STL sources, they are not easy to understand). And it sidesteps our own os::malloc, so we cannot account it, and our guards won't work. >> >> Personally I would like to keep the test harness as simple stupid as possible, to avoid interfering with the actual test. Especially with tests which test memory allocation. I know that is a vague reason, but I also cannot find a bit advantage in using std::vector, compared with a simple calloced array. We dont need the dynamically-growing part here, we know right upfront how many threads we start, so its really a fixed sized array. > > Okay so you're right that we can't account in NMT and with our malloc guards for the memory that is allocated with std::vector, so I'm convinced that Misha should change this. Hopefully we can figure out how to make std::vector work in the future, because Misha's 2 lines for it look quite nice. > Do you think using std:: would increase build times? 
My suspicion is that the Access barrier templates are to blame, but building hotspot is a fraction of building the whole jdk, which does seem to take a long time. Using the C++ Standard Library has long been, and as yet remains, forbidden in HotSpot code. In particular, using standard containers hasn't played nicely with HotSpot's native memory tracking. A small number of uses of standard library facilities has crept into some of the gtests; I think jfr is the only "offender" at present. That shouldn't have been allowed, and there are some configuration and build kludges that were required to make that work. ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From stuefe at openjdk.java.net Wed Mar 3 15:37:47 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 3 Mar 2021 15:37:47 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> Message-ID: On Wed, 3 Mar 2021 14:51:24 GMT, Kim Barrett wrote: >> Okay so you're right that we can't account in NMT and with our malloc guards for the memory that is allocated with std::vector, so I'm convinced that Misha should change this. Hopefully we can figure out how to make std::vector work in the future, because Misha's 2 lines for it look quite nice. >> Do you think using std:: would increase build times? My suspicion is that the Access barrier templates are to blame, but building hotspot is a fraction of building the whole jdk, which does seem to take a long time. > > Using the C++ Standard Library has long been, and as yet remains, forbidden in HotSpot code. In particular, using standard containers hasn't played nicely with HotSpot's native memory tracking. A small number of uses of standard library facilities has crept into some of the gtests; I think jfr is the only "offender" at present. 
That shouldn't have been allowed, and there are some configuration and build kludges that were required to make that work.

> Okay so you're right that we can't account in NMT and with our malloc guards for the memory that is allocated with std::vector, so I'm convinced that Misha should change this. Hopefully we can figure out how to make std::vector work in the future, because Misha's 2 lines for it look quite nice.
> Do you think using std:: would increase build times? My suspicion is that the Access barrier templates are to blame, but building hotspot is a fraction of building the whole jdk, which does seem to take a long time.

Well, yesterday I saw to my annoyance that upgrading from g++-7 to g++-9 increased jdk build time by about 30-40%, so today I am very aware of build times :-/ But you are right of course, it's just the hotspot, so maybe it's not such a big deal.

I am a bit apprehensive about introducing STL (had to port the STL to outlier platforms in the past and god that was terrible) but mainly I'd like to have a public discussion before starting to use it. And if the majority thinks it's okay then I am okay with it too, of course. I have been wrong before. Kim said STL quality is very high nowadays.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2436

From stuefe at openjdk.java.net Wed Mar 3 15:50:41 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Wed, 3 Mar 2021 15:50:41 GMT
Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6]
In-Reply-To:
References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com>
Message-ID: <7LLqn-GqIlUi0R4mlO4LXvd0TTTRJB07koKe8jP8HWw=.3cb2b6ae-9b62-47c1-abe2-a1a4c92d20bb@github.com>

On Wed, 3 Mar 2021 14:43:26 GMT, Coleen Phillimore wrote:

> Moving these tests out of our source code is a good change.

Yes I agree. The tests were an eyesore.
> We already have a couple multithreaded tests in gtest, and adding this one isn't going to make the gtests unreliable. For future multithreaded tests, it makes sense to make them jtreg tests so that we can control their execution. So I'd like to see this PR integrated, with suggested changes. Okay then. I am fine with this change too. I recently learned that tests can be disabled in the source code by preceding them with "DISABLED_". Not as good as ProblemLists, since it requires recompilation, but better than nothing. --- @mseledts : Would it be possible to rename the gtests to make them start with "os"? We recently added jtreg tests which run the "os" part of the gtests with various large page options (see https://bugs.openjdk.java.net/browse/JDK-8257959), and if you name your tests "os" too they get tested automatically with various large page settings. Which would be a nice benefit from making them gtests. Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From iignatyev at openjdk.java.net Wed Mar 3 15:59:40 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Wed, 3 Mar 2021 15:59:40 GMT Subject: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2] In-Reply-To: References: <2_Gpraz6NaY17HPfRDW-LD-sQrrPQ4dpIVP8vikpdXM=.d425cd8b-aea5-43be-865e-72229db81e6e@github.com> <2qEkvkaxAPHeFaDoCRmcPaehczQgwZNnZMxO2Z-Vc28=.d4845a88-7d71-4768-b952-5ff9c4ab8311@github.com> Message-ID: <0Kpq74YISM_ggEDcc5XKgpLywkxLVFe3qIzJ6nxpSOw=.42b05b7a-717e-4a21-b300-c621a4296d9f@github.com> On Wed, 3 Mar 2021 12:40:30 GMT, Tobias Hartmann wrote: >> I inspected sample built up cache with 'Compiler.CodeHeap_Analytics' diagnostic command. The vast majority of the 'non-profiled nmethods' heap are zillions of `invokeBasic`, `linkToStatic` and similar, with different signatures. 
Dump shows something like this:
>>
>> nMethod (active) invokeBasic(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
>> nMethod (active) invokeBasic(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;DFJD)Ljava/lang/Object;
>> nMethod (active) invokeBasic(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;DFJDLjava/lang/Object;)Ljava/lang/Object;
>> nMethod (active) invokeBasic(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;DFJDLjava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;
>>
>> ... with their signatures marching to the right screen border and beyond. Given that their arguments are mish-mashed in all possible combinations, there are really many of them (I've been able to build up caches up to 300MB without a pair of signatures repeating). They are nmethods, and should be in the nmethods cache, aren't they?
>
> Sorry for missing the @TobiHartmann (I had github notifications disabled).
> Unlike VM internal adapters/buffers/runtime stubs, Method handle intrinsics are actual Java methods that are compiled by the JIT and therefore need to go into the profiled or non-profiled code cache segment. They can not go into the non-nmethod segment.

Hi @TobiHartmann,

method handle intrinsics aren't regular java methods, otherwise the failure to compile them wouldn't result in `VirtualMachineError`. and, IIRC, their lifecycle is also different from that of nmethods, i.e. we don't sweep them. so I'm wondering why they can't go into non-nmethod segment.
-- Igor

-------------

PR: https://git.openjdk.java.net/jdk/pull/2523

From lucy at openjdk.java.net Wed Mar 3 15:59:48 2021
From: lucy at openjdk.java.net (Lutz Schmidt)
Date: Wed, 3 Mar 2021 15:59:48 GMT
Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v10]
In-Reply-To:
References:
Message-ID:

On Wed, 3 Mar 2021 15:22:58 GMT, Igor Veresov wrote:

>> Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision:
>>
>> revert copyright change to get rid of unchanged file
>
> This looks good to me.

Thank you, Igor!

-------------

PR: https://git.openjdk.java.net/jdk/pull/2511

From gziemski at openjdk.java.net Wed Mar 3 16:00:00 2021
From: gziemski at openjdk.java.net (Gerard Ziemski)
Date: Wed, 3 Mar 2021 16:00:00 GMT
Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v23]
In-Reply-To:
References:
Message-ID:

On Tue, 2 Mar 2021 23:21:28 GMT, David Holmes wrote:

> Note that `thread` can be NULL here if the signal handler is running in a non-attached thread. If we then perform:
> `ThreadWXEnable(WXMode new_mode, Thread* thread = NULL) : _thread(thread ? thread : Thread::current()),`
> we call Thread::current() on a non-attached thread and that will assert/crash if we get NULL. Either avoid using WX when the thread is NULL, or else change to use Thread::current_or_null_safe() and ensure all uses have a NULL check.

https://bugs.openjdk.java.net/browse/JDK-8262903 tracks this issue.
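The second option David describes (look the thread up with a call that may return NULL, then NULL-check every use) can be sketched in isolation as follows. This is a standalone model, not HotSpot code: `MockThread`, `MockWXEnable`, `current_slot` and `active` are hypothetical stand-ins invented for the illustration, and `MockThread::current_or_null()` only imitates the behaviour attributed above to `Thread::current_or_null_safe()`.

```cpp
#include <cassert>

// Mock stand-ins, NOT the HotSpot classes: just enough to model an attached
// versus a non-attached thread.
enum WXMode { WXWrite, WXExec };

struct MockThread {
  WXMode wx_state = WXExec;

  // Per-thread slot holding the current thread; nullptr means "not attached".
  static MockThread*& current_slot() {
    static thread_local MockThread* t = nullptr;
    return t;
  }

  // Models the "or null" lookup: never asserts, may return nullptr.
  static MockThread* current_or_null() { return current_slot(); }
};

// RAII guard that switches the W^X mode only when a thread is available.
class MockWXEnable {
  MockThread* _thread;
  WXMode      _old_mode;
 public:
  explicit MockWXEnable(WXMode new_mode, MockThread* thread = nullptr)
      : _thread(thread != nullptr ? thread : MockThread::current_or_null()),
        _old_mode(WXExec) {
    if (_thread != nullptr) {          // NULL check before touching the thread
      _old_mode = _thread->wx_state;
      _thread->wx_state = new_mode;
    }
  }
  ~MockWXEnable() {
    if (_thread != nullptr) {          // same check on the way out
      _thread->wx_state = _old_mode;   // restore the previous mode
    }
  }
  bool active() const { return _thread != nullptr; }
};
```

In this model a guard constructed on a non-attached thread simply becomes a no-op instead of crashing; whether a no-op is the right behaviour for the real signal handler is exactly the question the linked issue has to settle.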
------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From iignatyev at openjdk.java.net Wed Mar 3 16:08:45 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Wed, 3 Mar 2021 16:08:45 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: Message-ID: On Fri, 26 Feb 2021 03:45:07 GMT, Mikhailo Seledtsov wrote: >> test/hotspot/gtest/concurrentTestRunner.inline.hpp line 1: >> >>> 1: /* >> >> for c++, we don't use camelCase in filenames, but rather use small_snake_case > > OK. I saw gtestMain.cpp, gtestLauncher.cpp and a few others, and just followed that. I also see a number of test_camelCase.cpp: test_primitiveConversions.cpp, test_logSelectionList.cpp and so on. In fact, it seems the most prevalent pattern for gtests is test_camelCase.cpp. > > Anyway, no problem, I can rename this file to concurrent_test_runner.inline.hpp apparently, I had a false memory, we indeed have more `camelCase` than `small_snake_case` files, so @mseledts, you will need to revert that. sorry for wasting your time. in my defense, most of the files are just `case` hence can be interpreted either way :) ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From gziemski at openjdk.java.net Wed Mar 3 17:56:15 2021 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Wed, 3 Mar 2021 17:56:15 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 11:05:20 GMT, Anton Kozlov wrote: >> For platform files that were copied from other ports to this port, if the file wasn't >> changed I presume the copyright years are left alone. If the file required changes >> for this port, I expect the year to be updated to 2021. How are you verifying that >> these copyright years are being properly managed on the new files? 
>> >> For the new W^X helpers, e.g., WXWriteFromExecSetter, a short comment >> explaining why one was landed in a particular place would help reviewers. >> Also see my comment about creating a new ThreadToNativeWithWXExecFromVM >> helper. >> >> I'm stopping my review with all the src/hotspot files done for now. > >> For platform files that were copied from other ports to this port, if the file wasn't >> changed I presume the copyright years are left alone. If the file required changes >> for this port, I expect the year to be updated to 2021. How are you verifying that >> these copyright years are being properly managed on the new files? > > There are no exact copies, based on > git -c diff.renameLimit=10000000 diff --find-copies-harder -C75% --name-status upstream/master... > > So every file changed in the branch potentially needs the copyright update. All file diffs are not trivial, IMHO. > > I'll run the copyright update after we fix a few remaining issues with the PR, to avoid updating copyright and changing/reverting the actual content. 
A list of the bugs that our internal testing revealed so far:

https://bugs.openjdk.java.net/browse/JDK-8262952
https://bugs.openjdk.java.net/browse/JDK-8262894
https://bugs.openjdk.java.net/browse/JDK-8262895
https://bugs.openjdk.java.net/browse/JDK-8262896
https://bugs.openjdk.java.net/browse/JDK-8262897
https://bugs.openjdk.java.net/browse/JDK-8262898
https://bugs.openjdk.java.net/browse/JDK-8262899
https://bugs.openjdk.java.net/browse/JDK-8262900
https://bugs.openjdk.java.net/browse/JDK-8262901
https://bugs.openjdk.java.net/browse/JDK-8262903
https://bugs.openjdk.java.net/browse/JDK-8262904

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200

From aph at openjdk.java.net Wed Mar 3 17:56:15 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Wed, 3 Mar 2021 17:56:15 GMT
Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9]
In-Reply-To:
References:
Message-ID: <2vF8YcQ9CgnLxMus9tHTd4HmFITeH5wddBKzH-QNNEY=.dbfb2cd5-5997-4b79-88a5-d8292fc56965@github.com>

On Wed, 3 Mar 2021 17:39:28 GMT, Gerard Ziemski wrote:

> A list of the bugs that our internal testing revealed so far:

Are any of these blockers for integration? Some of them are to do with things like features that aren't yet supported, and we can't fix what we can't see.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200

From mseledtsov at openjdk.java.net Wed Mar 3 18:26:41 2021
From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov)
Date: Wed, 3 Mar 2021 18:26:41 GMT
Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6]
In-Reply-To:
References:
Message-ID:

On Wed, 3 Mar 2021 16:05:45 GMT, Igor Ignatyev wrote:

>> OK. I saw gtestMain.cpp, gtestLauncher.cpp and a few others, and just followed that. I also see a number of test_camelCase.cpp: test_primitiveConversions.cpp, test_logSelectionList.cpp and so on. In fact, it seems the most prevalent pattern for gtests is test_camelCase.cpp.
>> >> Anyway, no problem, I can rename this file to concurrent_test_runner.inline.hpp > > apparently, I had a false memory, we indeed have more `camelCase` than `small_snake_case` files, so @mseledts, you will need to revert that. sorry for wasting your time. in my defense, most of the files are just `case` hence can be interpreted either way :) No worries. I will 'git rename' it back to the original name, I hope this will work. ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From mseledtsov at openjdk.java.net Wed Mar 3 18:35:49 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Wed, 3 Mar 2021 18:35:49 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> Message-ID: On Wed, 3 Mar 2021 15:34:51 GMT, Thomas Stuefe wrote: >> Using the C++ Standard Library has long been, and as yet remains, forbidden in HotSpot code. In particular, using standard containers hasn't played nicely with HotSpot's native memory tracking. A small number of uses of standard library facilities has crept into some of the gtests; I think jfr is the only "offender" at present. That shouldn't have been allowed, and there are some configuration and build kludges that were required to make that work. > >> Okay so you're right that we can't account in NMT and with our malloc guards for the memory that is allocated with std::vector, so I'm convinced that Misha should change this. Hopefully we can figure out how to make std::vector work in the future, because Misha's 2 lines for it look quite nice. >> Do you think using std:: would increase build times? My suspicion is that the Access barrier templates are to blame, but building hotspot is a fraction of building the whole jdk, which does seem to take a long time. 
> > Well, yesterday I saw to my annoyance that upgrading from g++-7 to g++-9 increased jdk build time by about 30-40%, so I today I am very aware of build times :-/ But you are right of course, its just the hotspot, so maybe its not such a big deal. > > I am a bit apprehensive about introducing STL (had to port the STL to outlier platforms in the past and god that was terrible) but mainly I'd like to have a public discussion before starting to use it. And if the majority thinks its okay then I am okay with it too oc. I have been wrong before. Kim said STL quality is very high nowadays. Thanks for your feedback and discussion. I will refrain from using STL, and use malloc to allocate the array. I presume I should use os::malloc() in this case - please correct me if I am wrong. ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From github.com+7991079+danlemmond at openjdk.java.net Wed Mar 3 18:42:57 2021 From: github.com+7991079+danlemmond at openjdk.java.net (Dan) Date: Wed, 3 Mar 2021 18:42:57 GMT Subject: RFR: 8239386: handle ContendedPaddingWidth in vm_version_aarch64 Message-ID: Handle ContendedPaddingWidth the same way other architectures do Passes Hotspot Tier1 ------------- Commit messages: - 8239386: handle ContendedPaddingWidth in vm_version_aarch64 Changes: https://git.openjdk.java.net/jdk/pull/2814/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2814&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8239386 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/2814.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2814/head:pull/2814 PR: https://git.openjdk.java.net/jdk/pull/2814 From iignatyev at openjdk.java.net Wed Mar 3 18:43:41 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Wed, 3 Mar 2021 18:43:41 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: 
<7LLqn-GqIlUi0R4mlO4LXvd0TTTRJB07koKe8jP8HWw=.3cb2b6ae-9b62-47c1-abe2-a1a4c92d20bb@github.com> References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> <7LLqn-GqIlUi0R4mlO4LXvd0TTTRJB07koKe8jP8HWw=.3cb2b6ae-9b62-47c1-abe2-a1a4c92d20bb@github.com> Message-ID: On Wed, 3 Mar 2021 15:48:09 GMT, Thomas Stuefe wrote: >> Moving these tests out of our source code is a good change. We already have a couple multithreaded tests in gtest, and adding this one isn't going to make the gtests unreliable. For future multithreaded tests, it makes sense to make them jtreg tests so that we can control their execution. So I'd like to see this PR integrated, with suggested changes. > >> Moving these tests out of our source code is a good change. > > Yes I agree. The tests were an eyesore. > >> We already have a couple multithreaded tests in gtest, and adding this one isn't going to make the gtests unreliable. For future multithreaded tests, it makes sense to make them jtreg tests so that we can control their execution. So I'd like to see this PR integrated, with suggested changes. > > Okay then. I am fine with this change too. I recently learned that tests can be disabled in the source code by preceding them with "DISABLED_". Not as good as ProblemLists, since it requires recompilation, but better than nothing. > > --- > @mseledts : > > Would it be possible to rename the gtests to make them start with "os"? We recently added jtreg tests which run the "os" part of the gtests with various large page options (see https://bugs.openjdk.java.net/browse/JDK-8257959), and if you name your tests "os" too they get tested automatically with various large page settings. Which would be a nice benefit from making them gtests. > > Cheers, Thomas Hi Thomas, > <... >Just wanted to understand the direction you guys plan to go with gtests in the future. If they get more complex and powerful we may need more control, eg the possibility to problemlist tests. 
<...>

as you have discovered yourself, gtest already has support for the temporary exclusion of tests. I agree that it is inferior to problemlists and I'd like to let you know that there is a plan to make it possible to use a problemlist-like solution to exclude gtests. however, it's not on the top of my to-do list as **currently** there is no big need for that. should it change I'll reprioritize that work.

-- Igor

-------------

PR: https://git.openjdk.java.net/jdk/pull/2436

From mseledtsov at openjdk.java.net Wed Mar 3 18:43:44 2021
From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov)
Date: Wed, 3 Mar 2021 18:43:44 GMT
Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6]
In-Reply-To:
References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com>
Message-ID: <6KsHsZN0E9evdjCQucq8lI09b5aNen9Rny8AbsNuRLM=.d4c7248e-daee-4054-8dc7-ea4f578e4c0b@github.com>

On Wed, 3 Mar 2021 05:13:22 GMT, Thomas Stuefe wrote:

>> Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Using regular brackets in initializer list instead of curly brackets
>
> test/hotspot/gtest/concurrent_test_runner.inline.hpp line 27:
>
>> 25: #define GTEST_CONCURRENT_TEST_RUNNER_INLINE_HPP
>> 26:
>> 27: #include "threadHelper.inline.hpp"
>
> Make sure you include all headers needed for this file. Includes should be self-contained, so pull everything they need (basically, you should be able to include it into an empty cpp file and it should build fine). You use Semaphore and some os::xxx functions, so you'd need at least os.hpp and wherever Semaphore lives.

Thank you for pointing this out. I will make the necessary updates.

> test/hotspot/gtest/concurrent_test_runner.inline.hpp line 69:
>
>> 67: testDurationMillis(testDurationMillisArg) {}
>> 68:
>> 69: virtual ~ConcurrentTestRunner() {}
>
> Do we derive from this class?
And delete via base pointers? If not, I'd remove this. Not deriving from this class, I will remove the virtual destructor. ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From kbarrett at openjdk.java.net Wed Mar 3 19:05:40 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 3 Mar 2021 19:05:40 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 18:23:59 GMT, Mikhailo Seledtsov wrote: >> apparently, I had a false memory, we indeed have more `camelCase` than `small_snake_case` files, so @mseledts, you will need to revert that. sorry for wasting your time. in my defense, most of the files are just `case` hence can be interpreted either way :) > > No worries. I will 'git rename' it back to the original name, I hope this will work. See also "Naming and Grouping" here: https://github.com/openjdk/jdk/blob/master/doc/hotspot-unit-tests.md ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From kbarrett at openjdk.java.net Wed Mar 3 19:10:46 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 3 Mar 2021 19:10:46 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> Message-ID: On Wed, 3 Mar 2021 18:32:52 GMT, Mikhailo Seledtsov wrote: >>> Okay so you're right that we can't account in NMT and with our malloc guards for the memory that is allocated with std::vector, so I'm convinced that Misha should change this. Hopefully we can figure out how to make std::vector work in the future, because Misha's 2 lines for it look quite nice. >>> Do you think using std:: would increase build times? My suspicion is that the Access barrier templates are to blame, but building hotspot is a fraction of building the whole jdk, which does seem to take a long time. 
>> >> Well, yesterday I saw to my annoyance that upgrading from g++-7 to g++-9 increased jdk build time by about 30-40%, so I today I am very aware of build times :-/ But you are right of course, its just the hotspot, so maybe its not such a big deal. >> >> I am a bit apprehensive about introducing STL (had to port the STL to outlier platforms in the past and god that was terrible) but mainly I'd like to have a public discussion before starting to use it. And if the majority thinks its okay then I am okay with it too oc. I have been wrong before. Kim said STL quality is very high nowadays. > > Thanks for your feedback and discussion. I will refrain from using STL, and use malloc to allocate the array. I presume I should use os::malloc() in this case - please correct me if I am wrong. re: "there are some configuration and build kludges that were required to make that work." - Thinking about it more, it might be that those kludges were needed even without the jfr uses of the Standard Library. I think that may have been needed because of uses of the Standard Library by the gtest framework itself. The jfr tests then took advantage of that support. ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From akozlov at openjdk.java.net Wed Mar 3 19:13:59 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Wed, 3 Mar 2021 19:13:59 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: <2vF8YcQ9CgnLxMus9tHTd4HmFITeH5wddBKzH-QNNEY=.dbfb2cd5-5997-4b79-88a5-d8292fc56965@github.com> References: <2vF8YcQ9CgnLxMus9tHTd4HmFITeH5wddBKzH-QNNEY=.dbfb2cd5-5997-4b79-88a5-d8292fc56965@github.com> Message-ID: <_eNysRZporueV4wITFZra8ng_O6dvxAIc0moIWOh95U=.bd3e669d-8efc-41e1-8fb2-e9f2f8e4d1f8@github.com> On Wed, 3 Mar 2021 17:46:41 GMT, Andrew Haley wrote: > A list of the bugs that our internal testing revealed so far ... Thank you! Some of them look like test issues, but a few need more serious consideration. 
I want to resolve https://bugs.openjdk.java.net/browse/JDK-8262903 at least, along with a few remaining comments. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From kvn at openjdk.java.net Wed Mar 3 19:39:51 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 3 Mar 2021 19:39:51 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v10] In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 13:28:21 GMT, Lutz Schmidt wrote: >> Dear community, >> may I please request reviews for this fix, improving the usefulness of method invocation counters. >> - aggregation counters are retyped as uint64_t, shifting the overflow probability way out (> 500 years in case of a 1 GHz counter update frequency). >> - counters for individual methods are interpreted as (unsigned int), in contrast to their declaration as int. This gives us a factor of two before the counters overflow. >> - as a special case, "compiled_invocation_counter" is retyped as long, because it has a higher update frequency than other counters. >> - before/after sample output is attached to the bug description. >> >> Thank you! >> Lutz > > Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: > > revert copyright change to get rid of unchanged file Good. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2511 From lucy at openjdk.java.net Wed Mar 3 20:07:42 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Wed, 3 Mar 2021 20:07:42 GMT Subject: RFR: 8261447: MethodInvocationCounters frequently run into overflow [v10] In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 19:36:41 GMT, Vladimir Kozlov wrote: >> Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: >> >> revert copyright change to get rid of unchanged file > > Good. Thank you, Vladimir! 
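The overflow figures quoted above for 8261447 (more than 500 years for a uint64_t counter at 1 GHz, and a factor of two from reading an int as unsigned) can be checked with a little arithmetic. The helpers below are a sketch written for this note only; `years_until_overflow` and `as_unsigned` are hypothetical names and are not part of the change.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Years until a counter with the given maximum value wraps around when it is
// incremented updates_per_second times per second.
inline double years_until_overflow(double max_value, double updates_per_second) {
  const double seconds_per_year = 365.25 * 24 * 60 * 60;
  return max_value / updates_per_second / seconds_per_year;
}

// Reads a counter declared as a 32-bit int as if it were unsigned. Values
// past INT32_MAX, which look negative as int, become large positive numbers,
// so the usable range doubles.
inline std::uint32_t as_unsigned(std::int32_t raw) {
  return static_cast<std::uint32_t>(raw);
}
```

With a maximum of about 2^64 and 1e9 updates per second, `years_until_overflow` comes out at roughly 584 years, consistent with the "> 500 years" estimate. A 32-bit counter at the same rate wraps after a few seconds, which is why the aggregation counters were the ones widened.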
------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From lucy at openjdk.java.net Wed Mar 3 20:07:42 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Wed, 3 Mar 2021 20:07:42 GMT Subject: Integrated: 8261447: MethodInvocationCounters frequently run into overflow In-Reply-To: References: Message-ID: <14vfJddqmWgxR50iw7eBSCMzNqTdxvHvkbHJm-7Xo4Q=.667af6eb-0a18-4a2a-8bce-b26facf856ca@github.com> On Wed, 10 Feb 2021 16:28:29 GMT, Lutz Schmidt wrote: > Dear community, > may I please request reviews for this fix, improving the usefulness of method invocation counters. > - aggregation counters are retyped as uint64_t, shifting the overflow probability way out (> 500 years in case of a 1 GHz counter update frequency). > - counters for individual methods are interpreted as (unsigned int), in contrast to their declaration as int. This gives us a factor of two before the counters overflow. > - as a special case, "compiled_invocation_counter" is retyped as long, because it has a higher update frequency than other counters. > - before/after sample output is attached to the bug description. > > Thank you! > Lutz This pull request has now been integrated. 
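The overflow figures quoted in the description above can be checked with a little arithmetic. The following is only a rough sketch, not code from the change itself: the helper names are made up for illustration, and the steady rate of 1e9 updates per second and the 365.25-day year are assumptions taken from, or added to, the "1 GHz" remark.

```cpp
#include <cstdint>
#include <limits>

// Assumption: a Julian year of 365.25 days, used only for this estimate.
const double kSecondsPerYear = 365.25 * 24.0 * 60.0 * 60.0;

// Seconds until a counter that can hold max_value wraps around,
// assuming it is incremented updates_per_second times per second.
inline double seconds_to_overflow(double max_value, double updates_per_second) {
  return max_value / updates_per_second;
}

// 64-bit unsigned aggregation counter: result expressed in years.
inline double years_to_overflow_u64(double updates_per_second) {
  const double max_value =
      static_cast<double>(std::numeric_limits<std::uint64_t>::max());
  return seconds_to_overflow(max_value, updates_per_second) / kSecondsPerYear;
}

// 32-bit counter read as a signed int: result in seconds.
inline double seconds_to_overflow_i32(double updates_per_second) {
  const double max_value =
      static_cast<double>(std::numeric_limits<std::int32_t>::max());
  return seconds_to_overflow(max_value, updates_per_second);
}

// The same 32 bits read as unsigned: result in seconds.
inline double seconds_to_overflow_u32(double updates_per_second) {
  const double max_value =
      static_cast<double>(std::numeric_limits<std::uint32_t>::max());
  return seconds_to_overflow(max_value, updates_per_second);
}
```

Under these assumptions a uint64_t counter lasts somewhat over 500 years, while a 32-bit counter wraps within a few seconds; reading it as unsigned roughly doubles that time, which matches the "factor of two" mentioned in the description.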
Changeset: 268d9b79 Author: Lutz Schmidt URL: https://git.openjdk.java.net/jdk/commit/268d9b79 Stats: 164 lines in 11 files changed: 51 ins; 4 del; 109 mod 8261447: MethodInvocationCounters frequently run into overflow Reviewed-by: thartmann, mdoerr, kvn, iveresov ------------- PR: https://git.openjdk.java.net/jdk/pull/2511 From mseledtsov at openjdk.java.net Wed Mar 3 20:12:46 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Wed, 3 Mar 2021 20:12:46 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> <7LLqn-GqIlUi0R4mlO4LXvd0TTTRJB07koKe8jP8HWw=.3cb2b6ae-9b62-47c1-abe2-a1a4c92d20bb@github.com> Message-ID: <7mZianQB7xwsLfwgAp-YDlTEfZP0JCwqL55_Rs9hKss=.5a22cd7b-9792-41d1-98ee-0630ab5ecde6@github.com> On Wed, 3 Mar 2021 18:36:38 GMT, Igor Ignatyev wrote: >>> Moving these tests out of our source code is a good change. >> >> Yes I agree. The tests were an eyesore. >> >>> We already have a couple multithreaded tests in gtest, and adding this one isn't going to make the gtests unreliable. For future multithreaded tests, it makes sense to make them jtreg tests so that we can control their execution. So I'd like to see this PR integrated, with suggested changes. >> >> Okay then. I am fine with this change too. I recently learned that tests can be disabled in the source code by preceding them with "DISABLED_". Not as good as ProblemLists, since it requires recompilation, but better than nothing. >> >> --- >> @mseledts : >> >> Would it be possible to rename the gtests to make them start with "os"? We recently added jtreg tests which run the "os" part of the gtests with various large page options (see https://bugs.openjdk.java.net/browse/JDK-8257959), and if you name your tests "os" too they get tested automatically with various large page settings. 
Which would be a nice benefit from making them gtests. >> >> Cheers, Thomas > > Hi Thomas, > >> <... >Just wanted to understand the direction you guys plan to go with gtests in the future. If they get more complex and powerful we may need more control, eg the possibility to problemlist tests. <...> > > as you have discovered yourself, gtest already has support for the temporary exclusion of tests. I agree that it inferior to problemlists and I'd like to let you know that there is a plan to make it possible to use a problemlist-like solution to exclude gtests. however, it's not on the top of my to-do list as **currently** there is no big need for that. should it change I'll reprioritize that work. > > -- Igor On question: "Would it be possible to rename the gtests to make them start with "os"? " OK. I can rename reserve_space_concurrent to os_reserve_space_concurrent and virtual_space_concurrent to os_virtual_space_concurrent, if I do not hear any objections. ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From mseledtsov at openjdk.java.net Wed Mar 3 20:15:41 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Wed, 3 Mar 2021 20:15:41 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v6] In-Reply-To: <7mZianQB7xwsLfwgAp-YDlTEfZP0JCwqL55_Rs9hKss=.5a22cd7b-9792-41d1-98ee-0630ab5ecde6@github.com> References: <5Qb2C49jolvd9gJ0pX4SmMhB0uBvcnCsPsqv2runCoE=.44ee59a4-9e32-4b70-ad05-efbb55088fe4@github.com> <7LLqn-GqIlUi0R4mlO4LXvd0TTTRJB07koKe8jP8HWw=.3cb2b6ae-9b62-47c1-abe2-a1a4c92d20bb@github.com> <7mZianQB7xwsLfwgAp-YDlTEfZP0JCwqL55_Rs9hKss=.5a22cd7b-9792-41d1-98ee-0630ab5ecde6@github.com> Message-ID: On Wed, 3 Mar 2021 20:10:00 GMT, Mikhailo Seledtsov wrote: >> Hi Thomas, >> >>> <... >Just wanted to understand the direction you guys plan to go with gtests in the future. If they get more complex and powerful we may need more control, eg the possibility to problemlist tests. 
<...> >> >> as you have discovered yourself, gtest already has support for the temporary exclusion of tests. I agree that it inferior to problemlists and I'd like to let you know that there is a plan to make it possible to use a problemlist-like solution to exclude gtests. however, it's not on the top of my to-do list as **currently** there is no big need for that. should it change I'll reprioritize that work. >> >> -- Igor > > On question: "Would it be possible to rename the gtests to make them start with "os"? " > OK. I can rename reserve_space_concurrent to os_reserve_space_concurrent and virtual_space_concurrent to os_virtual_space_concurrent, if I do not hear any objections. Thank you to all the participants for your feedback, discussion and comments. To summarize I will make the following changes: - rename concurrent_test_runner.inline.hpp to concurrentTestRunner.inline.hpp - replace use of std::vector by os:malloced array - include all the necessary includes into concurrentTestRunner.inline.hpp so it is self-sufficient - remove "virtual ~ConcurrentTestRunner() {}" - rename reserve_space_concurrent to os_reserve_space_concurrent and virtual_space_concurrent to os_virtual_space_concurrent ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From github.com+168222+mgkwill at openjdk.java.net Wed Mar 3 22:35:00 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Wed, 3 Mar 2021 22:35:00 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v17] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. 
Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: Cast os::vm_page_size to size_t, fix build Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/f2e44ac7..30e01e17 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=16 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=15-16 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Wed Mar 3 23:23:40 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Wed, 3 Mar 2021 23:23:40 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v16] In-Reply-To: References: <5Hmhp7S8616Kfbdsu5ObzFNy2uUFgJPCp0kvHr-U310=.3cabbe74-fe65-436b-973d-d6f3e64cd743@github.com> Message-ID: On Wed, 24 Feb 2021 15:59:36 GMT, Stefan Johansson wrote: >>> > What do you think? 
I think this would be a bit easier to read and understand, and we have that clear separation between scanning OS info and deciding what we do with it. >>> >>> I think what you propose Thomas looks good. One additional thing to keep in mind and think about here is how we should do the "sanity checking" when allowing multiple large page sizes. I think the best thing would be to sanity check all and if none succeeds disable `UseLargePages`. >> >> Oh, sure. I made this not explicit but implied this under "post processing and deciding". Presumably in the context of setup_large_page_type(). >> >>> >>> > Still a small nit is that we let the user override the OS info with LargePageSizeInBytes. I rather would have a variable containing unmodified OS info, and a separate variable for whatever we make up. But thats just a small issue. >>> >>> I think we need to rethink exactly what `LargePageSizeInBytes` means when allowing multiple large page sizes. I've poked around in this area quite a bit lately and I'm not sure this flag is needed when we scan for available page sizes. But to allow it to go away we would have to change the APIs a bit to start passing down the page size we want to use for a certain mapping rather than using `os::large_page_size()` to get the page size. >> >> If we could do without this flag this would be fine for me too. But how would you let the user specify that the VM is to use a different default page size than is set on system level? > >> > > What do you think? I think this would be a bit easier to read and understand, and we have that clear separation between scanning OS info and deciding what we do with it. >> > >> > >> > I think what you propose Thomas looks good. One additional thing to keep in mind and think about here is how we should do the "sanity checking" when allowing multiple large page sizes. I think the best thing would be to sanity check all and if none succeeds disable `UseLargePages`. >> >> Oh, sure. 
I made this not explicit but implied this under "post processing and deciding". Presumably in the context of setup_large_page_type(). >> > Sure, got that, just wanted to highlight that we need to figure out how to handle the sanity check for multiple sizes. Should a size that fail the sanity check be removed from the `_page_sizes` member. Maybe `_page_sizes` should include all page sizes, and then we have an additional member for "useable large page sizes". As I said, not sure how to best handle this. > >> > > Still a small nit is that we let the user override the OS info with LargePageSizeInBytes. I rather would have a variable containing unmodified OS info, and a separate variable for whatever we make up. But thats just a small issue. >> > >> > >> > I think we need to rethink exactly what `LargePageSizeInBytes` means when allowing multiple large page sizes. I've poked around in this area quite a bit lately and I'm not sure this flag is needed when we scan for available page sizes. But to allow it to go away we would have to change the APIs a bit to start passing down the page size we want to use for a certain mapping rather than using `os::large_page_size()` to get the page size. >> >> If we could do without this flag this would be fine for me too. But how would you let the user specify that the VM is to use a different default page size than is set on system level? > > I agree, it's not obvious how to make this work in a good way. But using the `os::page_size_for_region*` functions in the upper layers to request a page size could be one solution. But we probably need to have a way to change the "default" value for some cases. > > Another thing to think about/discuss is what should be done if a reservation-request within the VM for 4G with 1G pages fail, should we fall straight back to 4k page, should we try 2M page or possible fail hard to show something is probably wrong with the config. Hi @kstefanj and @tstuefe . 
Trying to resolve your comments and working through your suggestions. I will be responding more over the next day or so as I try to implement and understand what you are proposing. Thanks again for your review and suggestions. ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From mseledtsov at openjdk.java.net Thu Mar 4 00:00:09 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Thu, 4 Mar 2021 00:00:09 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v7] In-Reply-To: References: Message-ID: <02htql4KtxmQwHyjQy6zk3D2O0M9qsyANiNJECVyv6c=.717cda30-0863-45c8-bc57-ad8efa57be37@github.com> > This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. > > Here is what I did so far: > - created a UnitTestThread and a main test runner, based on gtests with similar needs > - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) > to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. 
> - removed invocations from whitebox.cpp > > Testing: > - ran GTestWrapper on usual platforms - All PASS > - ensured that ReservedSpaceConcurrent is in the logs and passed > > After gathering the feedback my plan is: > Plan: > - move the remaining internal Memory/VirtualSpace tests into a gTest > - I am thinking about using separate files for each test > - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: Renamed concurrent_test_runner.inline.hpp back to concurrentTestRunner.inline.hpp per review feedback ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2436/files - new: https://git.openjdk.java.net/jdk/pull/2436/files/d9f618c2..bed410bf Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=05-06 Stats: 3 lines in 4 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/2436.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2436/head:pull/2436 PR: https://git.openjdk.java.net/jdk/pull/2436 From mseledtsov at openjdk.java.net Thu Mar 4 00:07:08 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Thu, 4 Mar 2021 00:07:08 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v8] In-Reply-To: References: Message-ID: > This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. > > Here is what I did so far: > - created a UnitTestThread and a main test runner, based on gtests with similar needs > - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) > to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. 
> - removed invocations from whitebox.cpp > > Testing: > - ran GTestWrapper on usual platforms - All PASS > - ensured that ReservedSpaceConcurrent is in the logs and passed > > After gathering the feedback my plan is: > Plan: > - move the remaining internal Memory/VirtualSpace tests into a gTest > - I am thinking about using separate files for each test > - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: Addressing review feedback ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2436/files - new: https://git.openjdk.java.net/jdk/pull/2436/files/bed410bf..4a9d00dd Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=06-07 Stats: 4 lines in 1 file changed: 2 ins; 2 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/2436.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2436/head:pull/2436 PR: https://git.openjdk.java.net/jdk/pull/2436 From mseledtsov at openjdk.java.net Thu Mar 4 02:44:15 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Thu, 4 Mar 2021 02:44:15 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v9] In-Reply-To: References: Message-ID: > This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. > > Here is what I did so far: > - created a UnitTestThread and a main test runner, based on gtests with similar needs > - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) > to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. 
> - removed invocations from whitebox.cpp > > Testing: > - ran GTestWrapper on usual platforms - All PASS > - ensured that ReservedSpaceConcurrent is in the logs and passed > > After gathering the feedback my plan is: > Plan: > - move the remaining internal Memory/VirtualSpace tests into a gTest > - I am thinking about using separate files for each test > - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code Mikhailo Seledtsov has updated the pull request incrementally with two additional commits since the last revision: - Adding os_ prefix to concurrent virtual space tests - Using os:malloc instead of std::vector ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2436/files - new: https://git.openjdk.java.net/jdk/pull/2436/files/4a9d00dd..1963a7e7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=08 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=07-08 Stats: 9 lines in 2 files changed: 5 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/2436.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2436/head:pull/2436 PR: https://git.openjdk.java.net/jdk/pull/2436 From stuefe at openjdk.java.net Thu Mar 4 07:24:41 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 4 Mar 2021 07:24:41 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v16] In-Reply-To: References: <5Hmhp7S8616Kfbdsu5ObzFNy2uUFgJPCp0kvHr-U310=.3cabbe74-fe65-436b-973d-d6f3e64cd743@github.com> Message-ID: On Wed, 24 Feb 2021 15:59:36 GMT, Stefan Johansson wrote: >>> > What do you think? I think this would be a bit easier to read and understand, and we have that clear separation between scanning OS info and deciding what we do with it. >>> >>> I think what you propose Thomas looks good. 
One additional thing to keep in mind and think about here is how we should do the "sanity checking" when allowing multiple large page sizes. I think the best thing would be to sanity check all and if none succeeds disable `UseLargePages`. >> >> Oh, sure. I made this not explicit but implied this under "post processing and deciding". Presumably in the context of setup_large_page_type(). >> >>> >>> > Still a small nit is that we let the user override the OS info with LargePageSizeInBytes. I rather would have a variable containing unmodified OS info, and a separate variable for whatever we make up. But thats just a small issue. >>> >>> I think we need to rethink exactly what `LargePageSizeInBytes` means when allowing multiple large page sizes. I've poked around in this area quite a bit lately and I'm not sure this flag is needed when we scan for available page sizes. But to allow it to go away we would have to change the APIs a bit to start passing down the page size we want to use for a certain mapping rather than using `os::large_page_size()` to get the page size. >> >> If we could do without this flag this would be fine for me too. But how would you let the user specify that the VM is to use a different default page size than is set on system level? > >> > > What do you think? I think this would be a bit easier to read and understand, and we have that clear separation between scanning OS info and deciding what we do with it. >> > >> > >> > I think what you propose Thomas looks good. One additional thing to keep in mind and think about here is how we should do the "sanity checking" when allowing multiple large page sizes. I think the best thing would be to sanity check all and if none succeeds disable `UseLargePages`. >> >> Oh, sure. I made this not explicit but implied this under "post processing and deciding". Presumably in the context of setup_large_page_type(). 
>> > Sure, got that, just wanted to highlight that we need to figure out how to handle the sanity check for multiple sizes. Should a size that fail the sanity check be removed from the `_page_sizes` member. Maybe `_page_sizes` should include all page sizes, and then we have an additional member for "useable large page sizes". As I said, not sure how to best handle this. > >> > > Still a small nit is that we let the user override the OS info with LargePageSizeInBytes. I rather would have a variable containing unmodified OS info, and a separate variable for whatever we make up. But thats just a small issue. >> > >> > >> > I think we need to rethink exactly what `LargePageSizeInBytes` means when allowing multiple large page sizes. I've poked around in this area quite a bit lately and I'm not sure this flag is needed when we scan for available page sizes. But to allow it to go away we would have to change the APIs a bit to start passing down the page size we want to use for a certain mapping rather than using `os::large_page_size()` to get the page size. >> >> If we could do without this flag this would be fine for me too. But how would you let the user specify that the VM is to use a different default page size than is set on system level? > > I agree, it's not obvious how to make this work in a good way. But using the `os::page_size_for_region*` functions in the upper layers to request a page size could be one solution. But we probably need to have a way to change the "default" value for some cases. > > Another thing to think about/discuss is what should be done if a reservation-request within the VM for 4G with 1G pages fail, should we fall straight back to 4k page, should we try 2M page or possible fail hard to show something is probably wrong with the config. > Hi @kstefanj and @tstuefe . Trying to resolve your comments and working through your suggestions. 
I will be responding more over the next day or so as I try to implement and understand what you are proposing. Thanks again for your review and suggestions. Well, thanks for your patience :) ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From stuefe at openjdk.java.net Thu Mar 4 07:50:41 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 4 Mar 2021 07:50:41 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v9] In-Reply-To: References: Message-ID: On Thu, 4 Mar 2021 02:44:15 GMT, Mikhailo Seledtsov wrote: >> This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. >> >> Here is what I did so far: >> - created a UnitTestThread and a main test runner, based on gtests with similar needs >> - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) >> to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. >> - removed invocations from whitebox.cpp >> >> Testing: >> - ran GTestWrapper on usual platforms - All PASS >> - ensured that ReservedSpaceConcurrent is in the logs and passed >> >> After gathering the feedback my plan is: >> Plan: >> - move the remaining internal Memory/VirtualSpace tests into a gTest >> - I am thinking about using separate files for each test >> - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code > > Mikhailo Seledtsov has updated the pull request incrementally with two additional commits since the last revision: > > - Adding os_ prefix to concurrent virtual space tests > - Using os:malloc instead of std::vector Hi Misha, two small remarks. As I wrote yesterday, I think this is a good cleanup despite my doubts. I don't see any GAs running. Can you enable them please? 
Under "Checks" we should see that all platforms build; also, since the gtests are part of tier1, they run too and we see that they work in all configurations (e.g. like this: https://github.com/openjdk/jdk/pull/2751/checks). You may have to enable GitHub Actions if you have never done so and this is your first work in your personal jdk fork (see the "Actions" tab under your repo). Cheers, Thomas test/hotspot/gtest/concurrentTestRunner.inline.hpp line 76: > 74: size_t sz = sizeof(UnitTestThread*) * nrOfThreads; > 75: UnitTestThread** t = (UnitTestThread**) os::malloc(sz, mtInternal); > 76: memset((void*)t, 0, sz); The memset is not needed. And if you wanted, you could shorten this to one line using: `UnitTestThread** t = NEW_C_HEAP_ARRAY(UnitTestThread*, nrOfThreads, mtInternal);` which would take care of the size calculation and the cast for you. Also, it gives you automatic OOM handling should this ever be a problem. If you do that, aesthetically it would make sense to change the `os::free` below to `FREE_C_HEAP_ARRAY(UnitTestThread*, t);`, though it's just a wrapper around os::free. And you'd need to include memory/allocation.hpp. If you keep using os::malloc, which would be fine too, you need runtime/os.hpp. ------------- Changes requested by stuefe (Reviewer).
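For readers without a HotSpot checkout at hand, the shape of the suggestion above can be sketched in plain C++. This is only an illustration under stated assumptions: `SKETCH_NEW_ARRAY`, `SKETCH_FREE_ARRAY` and `sketch_alloc_or_die` are hypothetical stand-ins written for this sketch, not the HotSpot macros, and `UnitTestThread` is reduced to a placeholder struct. The sketch also assumes the memset is unnecessary because every slot is assigned before it is read; the review comment itself does not spell out the reason.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Placeholder for the real test thread class (assumption for this sketch).
struct UnitTestThread {
  int id;
};

// Hypothetical helper: allocate or terminate, so callers need no null check.
inline void* sketch_alloc_or_die(std::size_t bytes) {
  void* p = std::malloc(bytes == 0 ? 1 : bytes);
  if (p == nullptr) {
    std::fprintf(stderr, "out of memory\n");
    std::abort();
  }
  return p;
}

// Hypothetical stand-ins: the macro does the size calculation and the cast.
#define SKETCH_NEW_ARRAY(type, count) \
  static_cast<type*>(sketch_alloc_or_die(sizeof(type) * (count)))
#define SKETCH_FREE_ARRAY(type, ptr) std::free(ptr)

// Creates nrOfThreads placeholder objects, sums their ids, and cleans up.
inline int run_threads(int nrOfThreads) {
  UnitTestThread** t = SKETCH_NEW_ARRAY(UnitTestThread*, nrOfThreads);
  // Every slot is written here before anything reads it, so no memset.
  for (int i = 0; i < nrOfThreads; i++) {
    t[i] = new UnitTestThread{i};
  }
  int sum = 0;
  for (int i = 0; i < nrOfThreads; i++) {
    sum += t[i]->id;
    delete t[i];
  }
  SKETCH_FREE_ARRAY(UnitTestThread*, t);
  return sum;
}
```

The point of the macro pair is that the element type is written once and the byte count is never computed by hand, which removes two places where the malloc version could go wrong.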
PR: https://git.openjdk.java.net/jdk/pull/2436 From stuefe at openjdk.java.net Thu Mar 4 07:50:42 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 4 Mar 2021 07:50:42 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v9] In-Reply-To: References: Message-ID: On Fri, 26 Feb 2021 00:42:46 GMT, Igor Ignatyev wrote: >> Mikhailo Seledtsov has updated the pull request incrementally with two additional commits since the last revision: >> >> - Adding os_ prefix to concurrent virtual space tests >> - Using os:malloc instead of std::vector > > test/hotspot/gtest/concurrentTestRunner.inline.hpp line 100: > >> 98: }; >> 99: >> 100: #endif // include guard > > we tend to use the expression used in the corresponding `#if` / `#ifdef` as a comment in `#endif` "include guard" is self evident. We usually write the name of it, like this: #endif // GTEST_CONCURRENT_TEST_RUNNER_INLINE_HPP ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From thartmann at openjdk.java.net Thu Mar 4 08:57:52 2021 From: thartmann at openjdk.java.net (Tobias Hartmann) Date: Thu, 4 Mar 2021 08:57:52 GMT Subject: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2] In-Reply-To: <0Kpq74YISM_ggEDcc5XKgpLywkxLVFe3qIzJ6nxpSOw=.42b05b7a-717e-4a21-b300-c621a4296d9f@github.com> References: <2_Gpraz6NaY17HPfRDW-LD-sQrrPQ4dpIVP8vikpdXM=.d425cd8b-aea5-43be-865e-72229db81e6e@github.com> <2qEkvkaxAPHeFaDoCRmcPaehczQgwZNnZMxO2Z-Vc28=.d4845a88-7d71-4768-b952-5ff9c4ab8311@github.com> <0Kpq74YISM_ggEDcc5XKgpLywkxLVFe3qIzJ6nxpSOw=.42b05b7a-717e-4a21-b300-c621a4296d9f@github.com> Message-ID: On Wed, 3 Mar 2021 15:56:51 GMT, Igor Ignatyev wrote: >> Sorry for missing the @TobiHartmann (I had github notifications disabled). >> Unlike VM internal adapters/buffers/runtime stubs, Method handle intrinsics are actual Java methods that are compiled by the JIT and therefore need to go into the profiled or non-profiled code cache segment. 
They can not go into the non-nmethod segment. > > Hi @TobiHartmann, > > method handle intrinsics aren't regular java methods, otherwise, the failure to compile them won't result in `VirtualMachineError`. and, IIRC, their lifecycle is also different from that of nmethods, i.e. we don't sweep them. so I'm wondering why they can't go into non-nmethod segment. > > -- Igor Yes, they are native wrappers but, in contrast to c2i/i2c adapters, they are still implemented as `nmethods`. I'm not a JSR292 expert but I think this is because they are potentially containing (meta)data that needs to be discovered when walking the code cache and iterator methods like `CodeCache::metadata_do` will only walk the `nmethod` heaps. They might also use other properties of `nmethods`. So I think the question would be more like "could native wrappers be implemented as `BufferBlobs`, similar to i2c/c2i adapters?" @iwanowww, what do you think? ------------- PR: https://git.openjdk.java.net/jdk/pull/2523 From dongbo at openjdk.java.net Thu Mar 4 09:35:43 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Thu, 4 Mar 2021 09:35:43 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> Message-ID: On Wed, 3 Mar 2021 12:45:35 GMT, Andrew Haley wrote: >>> > OKAY, this make sense to us. >>> > If it is OK to keep the exclusive part of this patch? :-) >>> > As far as we know, the exclusive instructions are not being revised. >>> > And we see `ldxr+stxlr+dmb` have been used in linux kernel since 2014 [1], and still used by now [2]. >>> >>> I know that, but the Linux definition of a "full barrier" isn't quite as strong as HotSpot's `memory_order_conservative`, so we'd need a much more detailed analysis of what behaviours we can permit. 
Also, we'd have to find a strong reason to invest time in AArch64 without LSE instructions. >>> >>> > BTW, the barrier-ordered-before applies with stlxr according to the architecture specification: >>> >>> Sure, but so what? This is about the entire ldxr/stlxr combination and `memory_order_conservative` , in which we try to mimic Intel's "Loads and Stores Are Not Reordered with Locked Instructions" specification. >> >> Hi, >> >> For us, we still have servers used by our customers that does not support LSE extension. >> >> Hm, from our point of view, `ldaxr+stlxr+dmb` and `ldxr+stlxr+dmb` provide the same order semantics. >> The acquire are used to ensure all loads/stores that are after an `ldaxr` (actually loads/stores after the `dmb` of `atomic_*default*_impl` in this case) in program order, while the `dmb` has already guaranteed this for us. >> Without the acquire, the loads/stores after the atomic operations still can not pass the `dmb`. >> Remove the acquire does not change the order between preceding loads/stores and `stlxr`. > >> Without the acquire, the loads/stores after the atomic operations still can not pass the `dmb`. >> Remove the acquire does not change the order between preceding loads/stores and `stlxr`. > > Looks like it. I tried this example, which makes sure that a preceding store does not pass the load in LDXR;STLXR;DMB : > > MOConservative > { 0:X0=a; 0:X1=b; 1:X0=a; 1:X1=b; a=0; b=0; } > P0 | P1; > MOV W3, #1 | MOV W3, #1; > STLR W3, [X0] | STR W3, [X1]; > LDAR W1, [X1] | LDXR W1, [X0]; > | STLXR W5, W4, [X0]; > | CBZ W5, FOO; > | MOV W1, #99; > | FOO: ; > | DMB ISH; > exists > (0:X1=0 /\ 1:X1 = 0) > > I don't think a preceding load can be reordered with the ```ldxr``` either, but I haven't written that test. > > Without the acquire, the loads/stores after the atomic operations still can not pass the `dmb`. > > Remove the acquire does not change the order between preceding loads/stores and `stlxr`. > > Looks like it. 
I tried this example, which makes sure that a preceding store does not pass the load in LDXR;STLXR;DMB : > > ``` > { 0:X0=a; 0:X1=b; 1:X0=a; 1:X1=b; a=0; b=0; } > P0 | P1; > MOV W3, #1 | MOV W3, #1; > STLR W3, [X0] | STR W3, [X1]; > LDAR W1, [X1] | LDXR W1, [X0]; > | STLXR W5, W4, [X0]; > | CBZ W5, FOO; > | MOV W1, #99; > | FOO: ; > | DMB ISH; > exists > (0:X1=0 /\ 1:X1 = 0) > ``` > > I don't think a preceding load can be reordered with the `ldxr` either, but I haven't written that test. Hi, I tried the example below to verify the order of preceding load as mentioned: AArch64 ExclusiveRead { 0:X8=a; 0:X9=b; 1:X8=a; 1:X9=b; a=1; b=1; } P0 | P1 ; MOV W3, #3 | LDR W3, [X9] ; MOV W4, #5 | LDXR W1, [X8] ; STR W3, [X8] | STLXR W5, W4, [X8] ; STLR W4, [X9] | CBZ W5, FOO ; | MOV W1, #99 ; | FOO: ; | DMB ISH ; exists (1:X1=1 /\ 1:X3=5) No `1:X1=1 /\ 1:X3=5` witnessed. ------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From aph at openjdk.java.net Thu Mar 4 10:01:39 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 4 Mar 2021 10:01:39 GMT Subject: RFR: 8239386: handle ContendedPaddingWidth in vm_version_aarch64 In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 18:38:05 GMT, Dan wrote: > Handle ContendedPaddingWidth the same way other architectures do > > Passes Hotspot Tier1 Marked as reviewed by aph (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/2814 From vlivanov at openjdk.java.net Thu Mar 4 13:18:50 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 4 Mar 2021 13:18:50 GMT Subject: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2] In-Reply-To: References: <2_Gpraz6NaY17HPfRDW-LD-sQrrPQ4dpIVP8vikpdXM=.d425cd8b-aea5-43be-865e-72229db81e6e@github.com> <2qEkvkaxAPHeFaDoCRmcPaehczQgwZNnZMxO2Z-Vc28=.d4845a88-7d71-4768-b952-5ff9c4ab8311@github.com> <0Kpq74YISM_ggEDcc5XKgpLywkxLVFe3qIzJ6nxpSOw=.42b05b7a-717e-4a21-b300-c621a4296d9f@github.com> Message-ID: On Thu, 4 Mar 2021 08:54:44 GMT, Tobias Hartmann wrote: >> Hi @TobiHartmann, >> >> method handle intrinsics aren't regular java methods, otherwise, the failure to compile them won't result in `VirtualMachineError`. and, IIRC, their lifecycle is also different from that of nmethods, i.e. we don't sweep them. so I'm wondering why they can't go into non-nmethod segment. >> >> -- Igor > > Yes, they are native wrappers but, in contrast to c2i/i2c adapters, they are still implemented as `nmethods`. I'm not a JSR292 expert but I think this is because they are potentially containing (meta)data that needs to be discovered when walking the code cache and iterator methods like `CodeCache::metadata_do` will only walk the `nmethod` heaps. They might also use other properties of `nmethods`. So I think the question would be more like "could native wrappers be implemented as `BufferBlobs`, similar to i2c/c2i adapters?" > @iwanowww, what do you think? I don't see a compelling reason why method handle linkers have to be nmethods and live in 'profiled'/'non-profiled' code heaps. I think the reason why it works that way now is the linkers are treated as ordinary native wrappers (since linker methods are just signature-polymorphic native static methods declared on `java.lang.invoke.MethodHandle` class). 
But native wrappers are represented as `nmethod`s for a reason: they can be unloaded along with the class. It's not the case with MH linkers which aren't unloaded at all. Please, file an RFE if you find it desirable to put MH linkers into 'non-nmethods' heap. ------------- PR: https://git.openjdk.java.net/jdk/pull/2523 From lkorinth at openjdk.java.net Thu Mar 4 13:23:41 2021 From: lkorinth at openjdk.java.net (Leo Korinth) Date: Thu, 4 Mar 2021 13:23:41 GMT Subject: Integrated: 8262000: jdk/jfr/event/gc/detailed/TestPromotionFailedEventWithParallelScavenge.java failed with "OutOfMemoryError: Java heap space" In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 09:43:55 GMT, Leo Korinth wrote: > By relaxing the matching of OOM error slightly, the test case can catch OOM errors not starting with "Exception: " This pull request has now been integrated. Changeset: d2c4ed08 Author: Leo Korinth URL: https://git.openjdk.java.net/jdk/commit/d2c4ed08 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod 8262000: jdk/jfr/event/gc/detailed/TestPromotionFailedEventWithParallelScavenge.java failed with "OutOfMemoryError: Java heap space" Reviewed-by: tschatzl, egahlin ------------- PR: https://git.openjdk.java.net/jdk/pull/2806 From coleenp at openjdk.java.net Thu Mar 4 13:24:21 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 4 Mar 2021 13:24:21 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error Message-ID: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. 
One of the existing CDS tests verifies the Throwable.cause, so this change also adds the cause and cause message to the resolution_errors() saved exceptions.

I didn't squash the commits so it would be easier to see the different changes, but they all go together.

The test description:

Two Threads T1, T2

Three definitions of class A, defined by user defined class loader
Class A extends B extends A (CCE)
Class A extends B
Class A extends C

Five modes:
Sequential
Concurrent loading with user defined class loader
Concurrent loading parallelCapable class loader
Wait when loading the superclass with parallelCapable class loader
Wait when loading the superclass with user defined class loader

In all cases, A is parsed and calls resolve_super_or_fail to load B, and loading B waits. Classes ClassInLoader, CP1 and CP2 provide constant pool references to A.

In all cases, when B waits, A is replaced with bytes so A extends C.

Two tests x 3 modes (both threads do the same):
(CCE) First test A extends B, which throws CCE.
-- All three modes: first constant pool reference throws CCE, second reference A extends C
(B) Second test A extends B which doesn't throw CCE.
-- All three modes: both references A extends B.

The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is currently loading.

Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug.
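
As an illustration only, and not the HotSpot code: the requirement that every thread sees the same resolution outcome can be sketched with a compare-and-swap, where the first thread to finish publishes its result and any later thread adopts the published one instead of its own. The names `cp_entry`, `TAG_*` and `record_outcome` below are hypothetical and do not model the real JVM_CONSTANT_* tags.

```c
#include <stdatomic.h>

/* Hypothetical tags for one constant pool slot (assumption, not JVM values). */
enum { TAG_UNRESOLVED = 0, TAG_RESOLVED = 1, TAG_ERROR = 2 };

/* Hypothetical stand-in for a constant pool entry: just its tag. */
typedef struct {
    _Atomic int tag;
} cp_entry;

/*
 * Try to publish `outcome` for the entry. Returns the tag that is
 * actually recorded: our own outcome if we were first, otherwise the
 * outcome some other thread published before us.
 */
int record_outcome(cp_entry *e, int outcome)
{
    int expected = TAG_UNRESOLVED;
    if (atomic_compare_exchange_strong(&e->tag, &expected, outcome)) {
        return outcome;   /* we won the race */
    }
    return expected;      /* we lost; `expected` now holds the winner's tag */
}
```

In this sketch a thread that resolved successfully but lost the race still gets TAG_ERROR back, which is the "second successful thread should check the result of the first" behaviour described above.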
Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. ------------- Commit messages: - Vladimir Ivanov's compiler patch. - Save Throwable::cause also to the resolution error table. - Fix deoptimization and compiler to preserve and recognize constant pool class loading errors. - 8262377: Parallel class resolution loses constant pool error - Add test for parallel class loading Changes: https://git.openjdk.java.net/jdk/pull/2718/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2718&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8262377 Stats: 833 lines in 22 files changed: 727 ins; 44 del; 62 mod Patch: https://git.openjdk.java.net/jdk/pull/2718.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2718/head:pull/2718 PR: https://git.openjdk.java.net/jdk/pull/2718 From coleenp at openjdk.java.net Thu Mar 4 13:53:46 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 4 Mar 2021 13:53:46 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error In-Reply-To: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Wed, 24 Feb 2021 23:51:58 GMT, Coleen Phillimore wrote: > This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. > > One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. 
> > I didn't squash the commits so it would be easier to see the different changes, but they all go together. > > The test description: > > Two Threads T1, T2 > > Three definitions of class A, defined by user defined class loader > Class A extends B extends A (CCE) > Class A extends B > Class A extends C > > Five modes: > Sequential > Concurrent loading with user defined class loader > Concurrent loading parallelCapable class loader > Wait when loading the superclass with parallelCapable class loader > Wait when loading the superclass with user defined class loader > > In all cases, after A is parsed and calls resolve_super_or_fail to load B > and loading B waits. Classes ClassInLoader, CP1 and CP2 provide > constant pool references to A. > > In all cases, when B waits, A is replaced with bytes so A extends C. > > Two tests x 3 modes (both threads do the same): > (CCE) First test A extends B, which throws CCE. > -- All three modes: first constant pool reference throws CCE, second reference A extends C > (B) Second test A extends B which doesn't throw CCE. > -- All three modes: both references A extends B. > > The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 > is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is > currently loading. > > Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. > > Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. 
try again ------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From coleenp at openjdk.java.net Thu Mar 4 13:57:38 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 4 Mar 2021 13:57:38 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error In-Reply-To: References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Thu, 4 Mar 2021 13:50:57 GMT, Coleen Phillimore wrote: >> This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. >> >> One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. >> >> I didn't squash the commits so it would be easier to see the different changes, but they all go together. >> >> The test description: >> >> Two Threads T1, T2 >> >> Three definitions of class A, defined by user defined class loader >> Class A extends B extends A (CCE) >> Class A extends B >> Class A extends C >> >> Five modes: >> Sequential >> Concurrent loading with user defined class loader >> Concurrent loading parallelCapable class loader >> Wait when loading the superclass with parallelCapable class loader >> Wait when loading the superclass with user defined class loader >> >> In all cases, after A is parsed and calls resolve_super_or_fail to load B >> and loading B waits. Classes ClassInLoader, CP1 and CP2 provide >> constant pool references to A. >> >> In all cases, when B waits, A is replaced with bytes so A extends C. >> >> Two tests x 3 modes (both threads do the same): >> (CCE) First test A extends B, which throws CCE. 
>> -- All three modes: first constant pool reference throws CCE, second reference A extends C >> (B) Second test A extends B which doesn't throw CCE. >> -- All three modes: both references A extends B. >> >> The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 >> is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is >> currently loading. >> >> Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. >> >> Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. > > try again wonder if I have to add myself now. ------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From aph at openjdk.java.net Thu Mar 4 14:26:46 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Thu, 4 Mar 2021 14:26:46 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> Message-ID: On Thu, 4 Mar 2021 09:33:18 GMT, Dong Bo wrote: >>> Without the acquire, the loads/stores after the atomic operations still can not pass the `dmb`. >>> Remove the acquire does not change the order between preceding loads/stores and `stlxr`. >> >> Looks like it. 
I tried this example, which makes sure that a preceding store does not pass the load in LDXR;STLXR;DMB : >> >> MOConservative >> { 0:X0=a; 0:X1=b; 1:X0=a; 1:X1=b; a=0; b=0; } >> P0 | P1; >> MOV W3, #1 | MOV W3, #1; >> STLR W3, [X0] | STR W3, [X1]; >> LDAR W1, [X1] | LDXR W1, [X0]; >> | STLXR W5, W4, [X0]; >> | CBZ W5, FOO; >> | MOV W1, #99; >> | FOO: ; >> | DMB ISH; >> exists >> (0:X1=0 /\ 1:X1 = 0) >> >> I don't think a preceding load can be reordered with the ```ldxr``` either, but I haven't written that test. > >> > Without the acquire, the loads/stores after the atomic operations still can not pass the `dmb`. >> > Remove the acquire does not change the order between preceding loads/stores and `stlxr`. >> >> Looks like it. I tried this example, which makes sure that a preceding store does not pass the load in LDXR;STLXR;DMB : >> >> ``` >> { 0:X0=a; 0:X1=b; 1:X0=a; 1:X1=b; a=0; b=0; } >> P0 | P1; >> MOV W3, #1 | MOV W3, #1; >> STLR W3, [X0] | STR W3, [X1]; >> LDAR W1, [X1] | LDXR W1, [X0]; >> | STLXR W5, W4, [X0]; >> | CBZ W5, FOO; >> | MOV W1, #99; >> | FOO: ; >> | DMB ISH; >> exists >> (0:X1=0 /\ 1:X1 = 0) >> ``` >> >> I don't think a preceding load can be reordered with the `ldxr` either, but I haven't written that test. > > Hi, I tried the example below to verify the order of preceding load as mentioned: > AArch64 ExclusiveRead > { 0:X8=a; 0:X9=b; 1:X8=a; 1:X9=b; a=1; b=1; } > P0 | P1 ; > MOV W3, #3 | LDR W3, [X9] ; > MOV W4, #5 | LDXR W1, [X8] ; > STR W3, [X8] | STLXR W5, W4, [X8] ; > STLR W4, [X9] | CBZ W5, FOO ; > | MOV W1, #99 ; > | FOO: ; > | DMB ISH ; > exists > (1:X1=1 /\ 1:X3=5) > No `1:X1=1 /\ 1:X3=5` witnessed. 
And this one (I think) makes sure that there is a total order of stores:

AArch64 SeqCst
{ 0:X0=a; 0:X1=b; 1:X0=a; 1:X1=b; a=0; b=0; }
 P0                 | P1                 ;
 MOV W3, #1         | MOV W3, #1         ;
 MOV W8, #99        | MOV W8, #99        ;
 STR W3, [X0]       | STR W3, [X1]       ;
 LDXR W2, [X1]      | LDXR W2, [X0]      ;
 STLXR W5, W8, [X1] | STLXR W5, W8, [X0] ;
 CBZ W5, FOO        | CBZ W5, FOO1       ;
 MOV W2, #1234      | MOV W2, #1234      ;
 FOO:               | FOO1:              ;
 DMB ST             | DMB ST             ;
exists
(0:X2=0 /\ 1:X2 = 0)

Test SeqCst Allowed
States 8
0:X2=0; 1:X2=1;
0:X2=0; 1:X2=1234;
0:X2=1; 1:X2=0;
0:X2=1; 1:X2=1;
0:X2=1; 1:X2=1234;
0:X2=1234; 1:X2=0;
0:X2=1234; 1:X2=1;
0:X2=1234; 1:X2=1234;
No
Witnesses
Positive: 0 Negative: 15
Condition exists (0:X2=0 /\ 1:X2=0)
Observation SeqCst Never 0 15

-------------

PR: https://git.openjdk.java.net/jdk/pull/2788

From hseigel at openjdk.java.net Thu Mar 4 15:23:44 2021
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Thu, 4 Mar 2021 15:23:44 GMT
Subject: RFR: 8262377: Parallel class resolution loses constant pool error
In-Reply-To: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com>
References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com>
Message-ID:

On Wed, 24 Feb 2021 23:51:58 GMT, Coleen Phillimore wrote:

> This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index.
>
> One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions.
>
> I didn't squash the commits so it would be easier to see the different changes, but they all go together.
> > The test description: > > Two Threads T1, T2 > > Three definitions of class A, defined by user defined class loader > Class A extends B extends A (CCE) > Class A extends B > Class A extends C > > Five modes: > Sequential > Concurrent loading with user defined class loader > Concurrent loading parallelCapable class loader > Wait when loading the superclass with parallelCapable class loader > Wait when loading the superclass with user defined class loader > > In all cases, after A is parsed and calls resolve_super_or_fail to load B > and loading B waits. Classes ClassInLoader, CP1 and CP2 provide > constant pool references to A. > > In all cases, when B waits, A is replaced with bytes so A extends C. > > Two tests x 3 modes (both threads do the same): > (CCE) First test A extends B, which throws CCE. > -- All three modes: first constant pool reference throws CCE, second reference A extends C > (B) Second test A extends B which doesn't throw CCE. > -- All three modes: both references A extends B. > > The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 > is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is > currently loading. > > Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. > > Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. 
src/hotspot/share/oops/constantPool.cpp line 565: > 563: (jbyte)JVM_CONSTANT_Class); > 564: > 565: if (old_tag != JVM_CONSTANT_UnresolvedClass) { I think this check at line 565 can be removed. ------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From gziemski at openjdk.java.net Thu Mar 4 15:29:59 2021 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Thu, 4 Mar 2021 15:29:59 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: <2vF8YcQ9CgnLxMus9tHTd4HmFITeH5wddBKzH-QNNEY=.dbfb2cd5-5997-4b79-88a5-d8292fc56965@github.com> References: <2vF8YcQ9CgnLxMus9tHTd4HmFITeH5wddBKzH-QNNEY=.dbfb2cd5-5997-4b79-88a5-d8292fc56965@github.com> Message-ID: On Wed, 3 Mar 2021 17:46:41 GMT, Andrew Haley wrote: > > A list of the bugs that our internal testing revealed so far: > > Are any of these blockers for integration? Some of them are to do with things like features that aren't yet supported, and we can't fix what we can't see. I don't personally think any of these issues are blockers. It's a great effort as it is and very much appreciated. Anything else can be fixed as a followup. There might be some legal requirements (i.e. JCK) that I'm not in position to comment on, however, so someone else might need to chime in here. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From coleenp at openjdk.java.net Thu Mar 4 15:47:40 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 4 Mar 2021 15:47:40 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error In-Reply-To: References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Thu, 4 Mar 2021 15:21:07 GMT, Harold Seigel wrote: >> This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. 
The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. >> >> One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. >> >> I didn't squash the commits so it would be easier to see the different changes, but they all go together. >> >> The test description: >> >> Two Threads T1, T2 >> >> Three definitions of class A, defined by user defined class loader >> Class A extends B extends A (CCE) >> Class A extends B >> Class A extends C >> >> Five modes: >> Sequential >> Concurrent loading with user defined class loader >> Concurrent loading parallelCapable class loader >> Wait when loading the superclass with parallelCapable class loader >> Wait when loading the superclass with user defined class loader >> >> In all cases, after A is parsed and calls resolve_super_or_fail to load B >> and loading B waits. Classes ClassInLoader, CP1 and CP2 provide >> constant pool references to A. >> >> In all cases, when B waits, A is replaced with bytes so A extends C. >> >> Two tests x 3 modes (both threads do the same): >> (CCE) First test A extends B, which throws CCE. >> -- All three modes: first constant pool reference throws CCE, second reference A extends C >> (B) Second test A extends B which doesn't throw CCE. >> -- All three modes: both references A extends B. >> >> The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 >> is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is >> currently loading. >> >> Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. 
Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. >> >> Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. > > src/hotspot/share/oops/constantPool.cpp line 565: > >> 563: (jbyte)JVM_CONSTANT_Class); >> 564: >> 565: if (old_tag != JVM_CONSTANT_UnresolvedClass) { > > I think this check at line 565 can be removed. Yes, it can. I was checking if the race failed but the following if statement does that too. Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From github.com+4146708+a74nh at openjdk.java.net Thu Mar 4 17:39:01 2021 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Thu, 4 Mar 2021 17:39:01 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: References: <2vF8YcQ9CgnLxMus9tHTd4HmFITeH5wddBKzH-QNNEY=.dbfb2cd5-5997-4b79-88a5-d8292fc56965@github.com> Message-ID: On Thu, 4 Mar 2021 15:27:25 GMT, Gerard Ziemski wrote: >>> A list of the bugs that our internal testing revealed so far: >> >> Are any of these blockers for integration? Some of them are to do with things like features that aren't yet supported, and we can't fix what we can't see. > >> > A list of the bugs that our internal testing revealed so far: >> >> Are any of these blockers for integration? Some of them are to do with things like features that aren't yet supported, and we can't fix what we can't see. > > I don't personally think any of these issues are blockers. It's a great effort as it is and very much appreciated. Anything else can be fixed as a followup. > > There might be some legal requirements (i.e. JCK) that I'm not in position to comment on, however, so someone else might need to chime in here. 
I was building this PR on a new machine, and I now get the following error:

> /Users/alahay01/java/gerrit_jdk/src/java.desktop/macosx/native/libjsound/PLATFORM_API_MacOSX_MidiUtils.c:258:31: error: cast to smaller integer type 'MIDIClientRef' (aka 'unsigned int') from 'void *' [-Werror,-Wvoid-pointer-to-int-cast]
> static MIDIClientRef client = (MIDIClientRef) NULL;
>                               ^~~~~~~~~~~~~~~~~~~~
> /Users/alahay01/java/gerrit_jdk/src/java.desktop/macosx/native/libjsound/PLATFORM_API_MacOSX_MidiUtils.c:259:29: error: cast to smaller integer type 'MIDIPortRef' (aka 'unsigned int') from 'void *' [-Werror,-Wvoid-pointer-to-int-cast]
> static MIDIPortRef inPort = (MIDIPortRef) NULL;
>                             ^~~~~~~~~~~~~~~~~~
> /Users/alahay01/java/gerrit_jdk/src/java.desktop/macosx/native/libjsound/PLATFORM_API_MacOSX_MidiUtils.c:260:30: error: cast to smaller integer type 'MIDIPortRef' (aka 'unsigned int') from 'void *' [-Werror,-Wvoid-pointer-to-int-cast]
> static MIDIPortRef outPort = (MIDIPortRef) NULL;
>                              ^~~~~~~~~~~~~~~~~~
> /Users/alahay01/java/gerrit_jdk/src/java.desktop/macosx/native/libjsound/PLATFORM_API_MacOSX_MidiUtils.c:466:32: error: cast to smaller integer type 'MIDIEndpointRef' (aka 'unsigned int') from 'void *' [-Werror,-Wvoid-pointer-to-int-cast]
>     MIDIEndpointRef endpoint = (MIDIEndpointRef) NULL;
>                                ^~~~~~~~~~~~~~~~~~~~~~
> 4 errors generated.

As far as I can tell the only difference between the two systems is the xcode version:

New system (failing)
% xcodebuild -version
Xcode 12.5
Build version 12E5244e

Old system (working)
% xcodebuild -version
Xcode 12.4
Build version 12D4e

Looks like the newer version of Xcode is being a little stricter with casting?
Replacing the NULL with 0 seems to fix the issue.
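
A minimal sketch of that replacement, assuming the handle type is a plain 32-bit unsigned integer as the diagnostics report. `example_client_ref` is a hypothetical stand-in for illustration, not the CoreMIDI typedef, and this is not the actual patch:

```c
#include <stdint.h>

/* Hypothetical stand-in for an integer handle type such as the one the
 * error message reports as an alias for 'unsigned int'. */
typedef uint32_t example_client_ref;

/* Before (what the compiler rejects when NULL expands to a pointer):
 *     static example_client_ref client = (example_client_ref) NULL;
 * After: 0 is already an integer constant, so no pointer-to-integer
 * conversion is involved. */
static example_client_ref example_client = 0;

/* Returns 1 while the handle still holds its "not created" value. */
int example_client_is_unset(void)
{
    return example_client == 0;
}
```

The "unset" value stays zero either way; only the spelling of the initializer changes.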
------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From vkempik at openjdk.java.net Thu Mar 4 18:22:51 2021 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Thu, 4 Mar 2021 18:22:51 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: References: <2vF8YcQ9CgnLxMus9tHTd4HmFITeH5wddBKzH-QNNEY=.dbfb2cd5-5997-4b79-88a5-d8292fc56965@github.com> Message-ID: On Thu, 4 Mar 2021 17:36:22 GMT, Alan Hayward wrote: > I was building this PR on a new machine, and I now get the following error: > > > /Users/alahay01/java/gerrit_jdk/src/java.desktop/macosx/native/libjsound/PLATFORM_API_MacOSX_MidiUtils.c:258:31: error: cast to smaller integer type 'MIDIClientRef' (aka 'unsigned int') from 'void *' [-Werror,-Wvoid-pointer-to-int-cast] > > static MIDIClientRef client = (MIDIClientRef) NULL; > > ^~~~~~~~~~~~~~~~~~~~ > > /Users/alahay01/java/gerrit_jdk/src/java.desktop/macosx/native/libjsound/PLATFORM_API_MacOSX_MidiUtils.c:259:29: error: cast to smaller integer type 'MIDIPortRef' (aka 'unsigned int') from 'void *' [-Werror,-Wvoid-pointer-to-int-cast] > > static MIDIPortRef inPort = (MIDIPortRef) NULL; > > ^~~~~~~~~~~~~~~~~~ > > /Users/alahay01/java/gerrit_jdk/src/java.desktop/macosx/native/libjsound/PLATFORM_API_MacOSX_MidiUtils.c:260:30: error: cast to smaller integer type 'MIDIPortRef' (aka 'unsigned int') from 'void *' [-Werror,-Wvoid-pointer-to-int-cast] > > static MIDIPortRef outPort = (MIDIPortRef) NULL; > > ^~~~~~~~~~~~~~~~~~ > > /Users/alahay01/java/gerrit_jdk/src/java.desktop/macosx/native/libjsound/PLATFORM_API_MacOSX_MidiUtils.c:466:32: error: cast to smaller integer type 'MIDIEndpointRef' (aka 'unsigned int') from 'void *' [-Werror,-Wvoid-pointer-to-int-cast] > > MIDIEndpointRef endpoint = (MIDIEndpointRef) NULL; > > ^~~~~~~~~~~~~~~~~~~~~~ > > 4 errors generated. 
> > As far as I can tell the only difference between the two systems is the xcode version: > > New system (failing) > % xcodebuild -version > Xcode 12.5 > Build version 12E5244e > > Old system (working) > % xcodebuild -version > Xcode 12.4 > Build version 12D4e > > Looks like the newer version of Xcode is being a little stricter with casting? > Replacing the NULL with 0 seems to fix the issue. Hello there is one issue with the info you provided, it's from Xcode12.5 beta. And beta license agreement forbids sharing output of beta version of compiler&co So we can't say we have issue with newer xcode beta until that beta went public & released. Fixing issues you found now will mean someone have violated xcode beta license agreement and made these issues public. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Thu Mar 4 18:41:42 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 4 Mar 2021 18:41:42 GMT Subject: RFR: 8261075: Create stubRoutines.inline.hpp with SafeFetch implementation [v2] In-Reply-To: References: Message-ID: On Tue, 16 Feb 2021 16:11:36 GMT, Stefan Karlsson wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> stubRoutines.inline.hpp -> safefetch.hpp > > Marked as reviewed by stefank (Reviewer). I've messed up with this :( I'm trying to fix @stefank 's note https://github.com/openjdk/jdk/pull/2200#discussion_r572707505. I need to rename a file.hpp included in the safepoint.hpp to file.inline.hpp. This will require renaming safepoint.hpp to safepoint.inline.hpp, polluting the PR #2200 again and defeating the purpose of having this patch extracted. I think the better alternative will be a follow-up patch just renaming safepoint.hpp to safepoint.inline.hpp. I'm going to do the follow-up. Or is there a better alternative? Sorry and thanks. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2542 From phh at openjdk.java.net Thu Mar 4 18:57:40 2021 From: phh at openjdk.java.net (Paul Hohensee) Date: Thu, 4 Mar 2021 18:57:40 GMT Subject: RFR: 8239386: handle ContendedPaddingWidth in vm_version_aarch64 In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 18:38:05 GMT, Dan wrote: > Handle ContendedPaddingWidth the same way other architectures do > > Passes Hotspot Tier1 Marked as reviewed by phh (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2814 From github.com+7991079+danlemmond at openjdk.java.net Thu Mar 4 18:57:41 2021 From: github.com+7991079+danlemmond at openjdk.java.net (Dan) Date: Thu, 4 Mar 2021 18:57:41 GMT Subject: Integrated: 8239386: handle ContendedPaddingWidth in vm_version_aarch64 In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 18:38:05 GMT, Dan wrote: > Handle ContendedPaddingWidth the same way other architectures do > > Passes Hotspot Tier1 This pull request has now been integrated. Changeset: e61a3ba2 Author: EC2 Default User Committer: Paul Hohensee URL: https://git.openjdk.java.net/jdk/commit/e61a3ba2 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod 8239386: handle ContendedPaddingWidth in vm_version_aarch64 Reviewed-by: aph, phh ------------- PR: https://git.openjdk.java.net/jdk/pull/2814 From stefank at openjdk.java.net Thu Mar 4 19:15:39 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 4 Mar 2021 19:15:39 GMT Subject: RFR: 8261075: Create stubRoutines.inline.hpp with SafeFetch implementation [v2] In-Reply-To: References: Message-ID: <4z7djCLhn7GE7upEUtWJ1iTrQXJmF3gpSghQfTqd1WE=.5abf8ca8-591c-4701-8ce2-35da0f141241@github.com> On Thu, 4 Mar 2021 18:39:02 GMT, Anton Kozlov wrote: >> Marked as reviewed by stefank (Reviewer). > > I've messed up with this :( I'm trying to fix @stefank 's note https://github.com/openjdk/jdk/pull/2200#discussion_r572707505. 
I need to rename a file.hpp included in the safepoint.hpp to file.inline.hpp. This will require renaming safepoint.hpp to safepoint.inline.hpp, polluting the PR #2200 again and defeating the purpose of having this patch extracted. I think the better alternative will be a follow-up patch just renaming safepoint.hpp to safepoint.inline.hpp. I'm going to do the follow-up. Or is there a better alternative? Sorry and thanks. Just so that I understand. Did you really mean safepoint.hpp and not safefetch.hpp? I assume this all has to do with the fact that safefetch.hpp is going to include threadWXSetter.hpp, which you are going to change to threadWXSetter.inline.hpp, because it includes thread.inline.hpp. If that's the case then I think the easy fix is to just rename safefetch.hpp to safefetch.inline.hpp. I don't think that will be problematic, or defeat the purpose of this patch. The previous change from stubRoutines.hpp to stubRoutines.inline.hpp has already been successfully removed from your #2200 patch. If you create a new PR with a safefetch.hpp to safefetch.inline.hpp rename, I can review it as a trivial change and you will be able to push it immediately. Or did I misunderstand anything? Maybe you could point me to the problematic files? ------------- PR: https://git.openjdk.java.net/jdk/pull/2542 From mseledtsov at openjdk.java.net Thu Mar 4 20:18:44 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Thu, 4 Mar 2021 20:18:44 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v9] In-Reply-To: References: Message-ID: On Thu, 4 Mar 2021 07:48:11 GMT, Thomas Stuefe wrote: >> Mikhailo Seledtsov has updated the pull request incrementally with two additional commits since the last revision: >> >> - Adding os_ prefix to concurrent virtual space tests >> - Using os:malloc instead of std::vector > > Hi Misha, > > two small remarks. 
As I wrote yesterday, I think this is a good cleanup despite my doubts. > > I don't see any GAs running. Can you enable them please? Under "Checks" we should see that all platforms build; also, since the gtests are part of tier1, they run too and we see that they work in all configurations. (eg like this: https://github.com/openjdk/jdk/pull/2751/checks) > > You may have to enable github actions if you have never done so and this is your first work in your personal jdk fork (see "Actions" tab under your repo). > > Cheers, Thomas Thomas, Thank you again for reviewing this change. - I will figure out and enable GitHub actions - will update allocation/free to use NEW_C_HEAP_ARRAY/FREE_C_HEAP_ARRAY from memory/allocation.hpp - #endif // GTEST_CONCURRENT_TEST_RUNNER_INLINE_HPP - will run internal testing, including available builds, tier1 and gtests on all usual platforms - if I do not receive additional feedback, my plan is to integrate on Monday Thanks, Misha ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From mseledtsov at openjdk.java.net Thu Mar 4 21:52:18 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Thu, 4 Mar 2021 21:52:18 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v10] In-Reply-To: References: Message-ID: > This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. > > Here is what I did so far: > - created a UnitTestThread and a main test runner, based on gtests with similar needs > - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) > to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. 
> - removed invocations from whitebox.cpp > > Testing: > - ran GTestWrapper on usual platforms - All PASS > - ensured that ReservedSpaceConcurrent is in the logs and passed > > After gathering the feedback my plan is: > Plan: > - move the remaining internal Memory/VirtualSpace tests into a gTest > - I am thinking about using separate files for each test > - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: Using C_HEAP_ARRAY macros plus a minor fix ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2436/files - new: https://git.openjdk.java.net/jdk/pull/2436/files/1963a7e7..35bd90f4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=09 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2436&range=08-09 Stats: 6 lines in 1 file changed: 1 ins; 2 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/2436.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2436/head:pull/2436 PR: https://git.openjdk.java.net/jdk/pull/2436 From stuefe at openjdk.java.net Fri Mar 5 06:16:40 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 5 Mar 2021 06:16:40 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v10] In-Reply-To: References: Message-ID: On Thu, 4 Mar 2021 21:52:18 GMT, Mikhailo Seledtsov wrote: >> This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. >> >> Here is what I did so far: >> - created a UnitTestThread and a main test runner, based on gtests with similar needs >> - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) >> to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. 
>> - removed invocations from whitebox.cpp >> >> Testing: >> - ran GTestWrapper on usual platforms - All PASS >> - ensured that ReservedSpaceConcurrent is in the logs and passed >> >> After gathering the feedback my plan is: >> Plan: >> - move the remaining internal Memory/VirtualSpace tests into a gTest >> - I am thinking about using separate files for each test >> - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Using C_HEAP_ARRAY macros plus a minor fix Looks good to me now. I see the GAs did run okay on all platforms, including the gtests (see hotspot tier1 common tests). Thanks! ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2436 From stuefe at openjdk.java.net Fri Mar 5 06:22:10 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 5 Mar 2021 06:22:10 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms Message-ID: `os::fork_and_exec()` can be used from within the hotspot to start a child process. It is only called in fatal situations, in two cases: a) to automatically start a debugger when ShowMessageBoxOnError is specified (uses *fork*()) b) to start a caller provided binary on OOM if -XX:OnOutOfMemoryError is specified (uses *vfork*()) The variants for AIX, Linux, Bsd are almost completely identical. So, this function can be unified under posix. In addition to that, this patch does a number of small changes: 1) Before, whether we would vfork() only on Linux and only for case (b). I changed this to always use vfork unconditionally, on all platforms, because: - even though vfork() can be unsafe, the way we use it - calling vfork()->exec()->_exit() with no intermediate steps - is safe. - Using vfork is good for OOM situations on all platforms, not just Linux, and also for starting the debugger in non-OOM cases. 
Keep in mind that we do this only for cases where the parent VM is about to die, so even if it were unsafe, the damage would be limited. 2) I added a comment to the function to not use it outside of fatal error situations. 3) I added a posix wrapper for getting the environ pointer, to hide MacOS specifics, and used it in two places to unify that coding. 4) consistently used global scope :: for posix APIs. Note that if we wanted to make os::fork_and_exec() a first class function, always safe to use, we should modify it to at least not leak any parent process file descriptors. Possibly safest would be to completely rewrite this function and use posix_spawn(). posix_spawn() we use in Runtime.exec() by default since JDK 13 (1). But as long as this is spawned by only dying VMs I think this function is fine. ---- Tests: GAs, manual tests using -XX:ShowMessageBoxOnError ------------- Commit messages: - start Changes: https://git.openjdk.java.net/jdk/pull/2810/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2810&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8262955 Stats: 293 lines in 8 files changed: 85 ins; 202 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/2810.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2810/head:pull/2810 PR: https://git.openjdk.java.net/jdk/pull/2810 From dholmes at openjdk.java.net Fri Mar 5 06:24:03 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 5 Mar 2021 06:24:03 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error In-Reply-To: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Wed, 24 Feb 2021 23:51:58 GMT, Coleen Phillimore wrote: > This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. 
The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. > > One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. > > I didn't squash the commits so it would be easier to see the different changes, but they all go together. > > The test description: > > Two Threads T1, T2 > > Three definitions of class A, defined by user defined class loader > Class A extends B extends A (CCE) > Class A extends B > Class A extends C > > Five modes: > Sequential > Concurrent loading with user defined class loader > Concurrent loading parallelCapable class loader > Wait when loading the superclass with parallelCapable class loader > Wait when loading the superclass with user defined class loader > > In all cases, after A is parsed and calls resolve_super_or_fail to load B > and loading B waits. Classes ClassInLoader, CP1 and CP2 provide > constant pool references to A. > > In all cases, when B waits, A is replaced with bytes so A extends C. > > Two tests x 3 modes (both threads do the same): > (CCE) First test A extends B, which throws CCE. > -- All three modes: first constant pool reference throws CCE, second reference A extends C > (B) Second test A extends B which doesn't throw CCE. > -- All three modes: both references A extends B. > > The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 > is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is > currently loading. > > Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. 
Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. > > Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. src/hotspot/share/oops/constantPool.cpp line 555: > 553: throw_resolution_error(this_cp, which, CHECK_NULL); > 554: } > 555: I'm unclear how this race is resolved. There must be a serialization point between the two threads otherwise this re-check is just as racy as the original. We need to know that the CP entry is now stable and was resolved either by this thread or the other one. But I can't see where this serialization point arises. ?? ------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From dholmes at openjdk.java.net Fri Mar 5 06:44:45 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 5 Mar 2021 06:44:45 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error In-Reply-To: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Wed, 24 Feb 2021 23:51:58 GMT, Coleen Phillimore wrote: > This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. > > One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. 
> > I didn't squash the commits so it would be easier to see the different changes, but they all go together. > > The test description: > > Two Threads T1, T2 > > Three definitions of class A, defined by user defined class loader > Class A extends B extends A (CCE) > Class A extends B > Class A extends C > > Five modes: > Sequential > Concurrent loading with user defined class loader > Concurrent loading parallelCapable class loader > Wait when loading the superclass with parallelCapable class loader > Wait when loading the superclass with user defined class loader > > In all cases, after A is parsed and calls resolve_super_or_fail to load B > and loading B waits. Classes ClassInLoader, CP1 and CP2 provide > constant pool references to A. > > In all cases, when B waits, A is replaced with bytes so A extends C. > > Two tests x 3 modes (both threads do the same): > (CCE) First test A extends B, which throws CCE. > -- All three modes: first constant pool reference throws CCE, second reference A extends C > (B) Second test A extends B which doesn't throw CCE. > -- All three modes: both references A extends B. > > The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 > is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is > currently loading. > > Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. > > Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. More comments. 
Sorry I'm too confused by the details to actually review this in depth. src/hotspot/share/oops/constantPool.cpp line 545: > 543: // To preserve old behavior, we return the resolved class. > 544: Klass* klass = this_cp->resolved_klasses()->at(resolved_klass_index); > 545: assert(klass != NULL, "must be resolved if exception was cleared"); Existing code but this seems a bizarre thing to do. The only way the two resolving threads can disagree about the resolution is if we have a bad classloader. With a bad classloader it is hard to justify any argument about what should happen, so it would have seemed preferrable to me to always report the error. It is a race so the outcome is arbitrary to begin with. src/hotspot/share/oops/constantPool.cpp line 555: > 553: > 554: Klass** adr = this_cp->resolved_klasses()->adr_at(resolved_klass_index); > 555: Atomic::release_store(adr, k); If we are racing then isn't it the case that we may not have an entry in resolved_klasses()? src/hotspot/share/oops/constantPool.cpp line 569: > 567: if (old_tag == JVM_CONSTANT_UnresolvedClassInError) { > 568: // Remove klass. > 569: Atomic::release_store(adr, (Klass*)NULL); This is all very unclear to me. The CAS above provides a serialization point for the racing threads, but prior to that we have already published the klass into adr and now we are saying "oops I'd better undo that", but surely it is too late as we have let it escape. ?? ------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From david.holmes at oracle.com Fri Mar 5 06:46:26 2021 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Mar 2021 16:46:26 +1000 Subject: RFR: 8262377: Parallel class resolution loses constant pool error In-Reply-To: References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: <954a8d3c-ad2a-cdd1-8e34-a4c36ddcfecc@oracle.com> Please ignore this comment as it was made when only looking at one commit. 
(And the email has put it in the wrong context anyway.) Thanks, David On 5/03/2021 4:24 pm, David Holmes wrote: > On Wed, 24 Feb 2021 23:51:58 GMT, Coleen Phillimore wrote: > >> This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. >> >> One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. >> >> I didn't squash the commits so it would be easier to see the different changes, but they all go together. >> >> The test description: >> >> Two Threads T1, T2 >> >> Three definitions of class A, defined by user defined class loader >> Class A extends B extends A (CCE) >> Class A extends B >> Class A extends C >> >> Five modes: >> Sequential >> Concurrent loading with user defined class loader >> Concurrent loading parallelCapable class loader >> Wait when loading the superclass with parallelCapable class loader >> Wait when loading the superclass with user defined class loader >> >> In all cases, after A is parsed and calls resolve_super_or_fail to load B >> and loading B waits. Classes ClassInLoader, CP1 and CP2 provide >> constant pool references to A. >> >> In all cases, when B waits, A is replaced with bytes so A extends C. >> >> Two tests x 3 modes (both threads do the same): >> (CCE) First test A extends B, which throws CCE. >> -- All three modes: first constant pool reference throws CCE, second reference A extends C >> (B) Second test A extends B which doesn't throw CCE. >> -- All three modes: both references A extends B. 
>> >> The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 >> is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is >> currently loading. >> >> Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. >> >> Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. > > src/hotspot/share/oops/constantPool.cpp line 555: > >> 553: throw_resolution_error(this_cp, which, CHECK_NULL); >> 554: } >> 555: > > I'm unclear how this race is resolved. There must be a serialization point between the two threads otherwise this re-check is just as racy as the original. We need to know that the CP entry is now stable and was resolved either by this thread or the other one. But I can't see where this serialization point arises. ?? > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/2718 > From david.holmes at oracle.com Fri Mar 5 06:55:11 2021 From: david.holmes at oracle.com (David Holmes) Date: Fri, 5 Mar 2021 16:55:11 +1000 Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms In-Reply-To: References: Message-ID: Hi Thomas, On 5/03/2021 4:22 pm, Thomas Stuefe wrote: > `os::fork_and_exec()` can be used from within the hotspot to start a child process. 
It is only called in fatal situations, in two cases: > a) to automatically start a debugger when ShowMessageBoxOnError is specified (uses *fork*()) > b) to start a caller provided binary on OOM if -XX:OnOutOfMemoryError is specified (uses *vfork*()) > > The variants for AIX, Linux, Bsd are almost completely identical. So, this function can be unified under posix. > > In addition to that, this patch does a number of small changes: > > 1) Before, whether we would vfork() only on Linux and only for case (b). I changed this to always use vfork unconditionally, on all platforms, because: > - even though vfork() can be unsafe, the way we use it - calling vfork()->exec()->_exit() with no intermediate steps - is safe. Is it? The reason we use fork() for the error/crash case is because it can get launched from a signal handling context and vfork is not async-signal-safe. There is some commentary in: https://bugs.openjdk.java.net/browse/JDK-8027434 Cheers, David > - Using vfork is good for OOM situations on all platforms, not just Linux, and also for starting the debugger in non-OOM cases. Keep in mind that we do this only for cases where the parent VM is about to die, so even if it were unsafe, the damage would be limited. > 2) I added a comment to the function to not use it outside of fatal error situations. > 3) I added a posix wrapper for getting the environ pointer, to hide MacOS specifics, and used it in two places to unify that coding. > 4) consistently used global scope :: for posix APIs. > > Note that if we wanted to make os::fork_and_exec() a first class function, always safe to use, we should modify it to at least not leak any parent process file descriptors. Possibly safest would be to completely rewrite this function and use posix_spawn(). posix_spawn() we use in Runtime.exec() by default since JDK 13 (1). But as long as this is spawned by only dying VMs I think this function is fine. 
> > ---- > > Tests: GAs, manual tests using -XX:ShowMessageBoxOnError > > ------------- > > Commit messages: > - start > > Changes: https://git.openjdk.java.net/jdk/pull/2810/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2810&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8262955 > Stats: 293 lines in 8 files changed: 85 ins; 202 del; 6 mod > Patch: https://git.openjdk.java.net/jdk/pull/2810.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/2810/head:pull/2810 > > PR: https://git.openjdk.java.net/jdk/pull/2810 > From stuefe at openjdk.java.net Fri Mar 5 07:27:40 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 5 Mar 2021 07:27:40 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms In-Reply-To: References: Message-ID: On Wed, 3 Mar 2021 14:16:19 GMT, Thomas Stuefe wrote: > `os::fork_and_exec()` can be used from within the hotspot to start a child process. It is only called in fatal situations, in two cases: > a) to automatically start a debugger when ShowMessageBoxOnError is specified (uses *fork*()) > b) to start a caller provided binary on OOM if -XX:OnOutOfMemoryError is specified (uses *vfork*()) > > The variants for AIX, Linux, Bsd are almost completely identical. So, this function can be unified under posix. > > In addition to that, this patch does a number of small changes: > > 1) Before, whether we would vfork() only on Linux and only for case (b). I changed this to always use vfork unconditionally, on all platforms, because: > - even though vfork() can be unsafe, the way we use it - calling vfork()->exec()->_exit() with no intermediate steps - is safe. > - Using vfork is good for OOM situations on all platforms, not just Linux, and also for starting the debugger in non-OOM cases. Keep in mind that we do this only for cases where the parent VM is about to die, so even if it were unsafe, the damage would be limited. 
> 2) I added a comment to the function to not use it outside of fatal error situations. > 3) I added a posix wrapper for getting the environ pointer, to hide MacOS specifics, and used it in two places to unify that coding. > 4) consistently used global scope :: for posix APIs. > > Note that if we wanted to make os::fork_and_exec() a first class function, always safe to use, we should modify it to at least not leak any parent process file descriptors. Possibly safest would be to completely rewrite this function and use posix_spawn(). posix_spawn() we use in Runtime.exec() by default since JDK 13 (1). But as long as this is spawned by only dying VMs I think this function is fine. > > ---- > > Tests: GAs, manual tests using -XX:ShowMessageBoxOnError > _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ > > Hi Thomas, > > On 5/03/2021 4:22 pm, Thomas Stuefe wrote: > > > `os::fork_and_exec()` can be used from within the hotspot to start a child process. It is only called in fatal situations, in two cases: > > a) to automatically start a debugger when ShowMessageBoxOnError is specified (uses *fork*()) > > b) to start a caller provided binary on OOM if -XX:OnOutOfMemoryError is specified (uses *vfork*()) > > The variants for AIX, Linux, Bsd are almost completely identical. So, this function can be unified under posix. > > In addition to that, this patch does a number of small changes: > > 1) Before, whether we would vfork() only on Linux and only for case (b). I changed this to always use vfork unconditionally, on all platforms, because: > > - even though vfork() can be unsafe, the way we use it - calling vfork()->exec()->_exit() with no intermediate steps - is safe. > > Is it? The reason we use fork() for the error/crash case is because it > can get launched from a signal handling context and vfork is not > async-signal-safe. 
> > There is some commentary in: > > https://bugs.openjdk.java.net/browse/JDK-8027434 > > Cheers, > David Hi David, My estimate of vfork being safe here comes from experience. At SAP, at one time we replaced the jdk's Runtime.exec() implementation completely with our own; it lived in hotspot and was used by both the jdk and the hotspot, in and out of signal contexts. Our implementation mainly used vfork(), with a lot of safeguards of course (mainly a forkhelper binary, similar to what Runtime.exec() does today). I cannot completely exclude the possibility of problems here, but calling vfork()->exec()->_exit() is as safe as it gets. Seeing that we only use it for starting a debugger in case the original VM crashed, I think the possibility of problems is remote. OTOH when starting a debugger spawned from a fat process, you may run into the same problems as with https://bugs.openjdk.java.net/browse/JDK-8027434. You don't want the machine to go into swap when trying to start the debugger for you. I won't fight you on this though if you insist; mainly what I disliked was the introduction of Posix terminology in Windows code ("fork_and_exec" "use_vfork") and that can be straightened out by a separate layer (e.g. os::start_child_process(bool from_error_handler) -> os::Posix::os_fork_and_exec(can_use_vfork)). 
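
For illustration only, the vfork()->exec()->_exit() sequence discussed above could be sketched roughly as follows. This is a minimal standalone sketch and not the HotSpot code from the PR: the function name run_shell_command is hypothetical, it assumes a POSIX system with /bin/sh, and it makes no attempt to close inherited file descriptors. It also does not settle the async-signal-safety question David raised.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>

extern char** environ;

// Hypothetical helper (not a HotSpot API): runs "sh -c <cmd>" in a child
// created with vfork() and returns the child's exit status, or -1 on error.
int run_shell_command(const char* cmd) {
  const char* argv[] = { "sh", "-c", cmd, nullptr };

  pid_t pid = ::vfork();
  if (pid < 0) {
    return -1;  // vfork failed
  }
  if (pid == 0) {
    // Child: it borrows the parent's address space until exec, so do
    // nothing here except exec and, if exec fails, _exit.
    ::execve("/bin/sh", const_cast<char* const*>(argv), environ);
    ::_exit(127);  // only reached if execve failed
  }

  // Parent: wait for the child, retrying if interrupted by a signal.
  int status = 0;
  while (::waitpid(pid, &status, 0) < 0) {
    if (errno != EINTR) {
      return -1;
    }
  }
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would use it as run_shell_command("exit 3") and get the child's exit status back. The point of the sketch is only that nothing happens between vfork() and execve() in the child, and that a failed exec ends in _exit() rather than exit().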
Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/2810 From dongbo at openjdk.java.net Fri Mar 5 08:58:39 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Fri, 5 Mar 2021 08:58:39 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> Message-ID: On Thu, 4 Mar 2021 14:23:34 GMT, Andrew Haley wrote: > ``` > AArch64 SeqCst > { 0:X0=a; 0:X1=b; 1:X0=a; 1:X1=b; a=0; b=0; } > P0 | P1; > MOV W3, #1 | MOV W3, #1; > MOV W8, #99 | MOV W8, #99 ; > STR W3, [X0] | STR W3, [X1]; > LDXR W2, [X1] | LDXR W2, [X0]; > STLXR W5, W8, [X1] | STLXR W5, W8, [X0] ; > CBZ W5, FOO | CBZ W5, FOO1 ; > MOV W2, #1234 | MOV W2, #1234 ; > FOO: | FOO1: ; > DMB ST | DMB ST ; > exists > (0:X2=0 /\ 1:X2 = 0) > > Test SeqCst Allowed > States 8 > 0:X2=0; 1:X2=1; > 0:X2=0; 1:X2=1234; > 0:X2=1; 1:X2=0; > 0:X2=1; 1:X2=1; > 0:X2=1; 1:X2=1234; > 0:X2=1234; 1:X2=0; > 0:X2=1234; 1:X2=1; > 0:X2=1234; 1:X2=1234; > No > Witnesses > Positive: 0 Negative: 15 > Condition exists (0:X2=0 /\ 1:X2=0) > Observation SeqCst Never 0 15 > ``` Yes, if either preceding store in the two threads can reorder with `ldxr`, the exist condition would be satisfied. 
Seems this test can also be extended to verify the order of preceding loads:

AArch64 SeqCst
{ 0:X0=a; 0:X1=b; 0:X10=c; 0:X11=d; 1:X0=a; 1:X1=b; 1:X10=c; 1:X11=d; a=0; b=0; c=0; d=0 }
P0 | P1 ;
MOV W3, #1 | MOV W3, #1 ;
MOV W8, #99 | MOV W8, #99 ;
MOV W9, #9 | MOV W9, #9 ;
STR W3, [X0] | STR W3, [X1] ;
STLR W9, [X10] | STLR W9, [X11] ;
LDR W7, [X11] | LDR W7, [X10] ;
LDXR W2, [X1] | LDXR W2, [X0] ;
STLXR W5, W8, [X1] | STLXR W5, W8, [X0] ;
CBZ W5, FOO | CBZ W5, FOO1 ;
MOV W2, #1234 | MOV W2, #1234 ;
FOO: | FOO1: ;
DMB ST | DMB ST ;
exists((0:X2=0 /\ 0:X7=9) \/ (1:X2=0 /\ 1:X7=9))

Test SeqCst Allowed
States 24
0:X2=0; 0:X7=0; 1:X2=1; 1:X7=0;
0:X2=0; 0:X7=0; 1:X2=1; 1:X7=9;
0:X2=0; 0:X7=0; 1:X2=1234; 1:X7=0;
0:X2=0; 0:X7=0; 1:X2=1234; 1:X7=9;
0:X2=1; 0:X7=0; 1:X2=0; 1:X7=0;
0:X2=1; 0:X7=0; 1:X2=1; 1:X7=0;
0:X2=1; 0:X7=0; 1:X2=1; 1:X7=9;
0:X2=1; 0:X7=0; 1:X2=1234; 1:X7=0;
0:X2=1; 0:X7=0; 1:X2=1234; 1:X7=9;
0:X2=1; 0:X7=9; 1:X2=0; 1:X7=0;
0:X2=1; 0:X7=9; 1:X2=1; 1:X7=0;
0:X2=1; 0:X7=9; 1:X2=1; 1:X7=9;
0:X2=1; 0:X7=9; 1:X2=1234; 1:X7=0;
0:X2=1; 0:X7=9; 1:X2=1234; 1:X7=9;
0:X2=1234; 0:X7=0; 1:X2=0; 1:X7=0;
0:X2=1234; 0:X7=0; 1:X2=1; 1:X7=0;
0:X2=1234; 0:X7=0; 1:X2=1; 1:X7=9;
0:X2=1234; 0:X7=0; 1:X2=1234; 1:X7=0;
0:X2=1234; 0:X7=0; 1:X2=1234; 1:X7=9;
0:X2=1234; 0:X7=9; 1:X2=0; 1:X7=0;
0:X2=1234; 0:X7=9; 1:X2=1; 1:X7=0;
0:X2=1234; 0:X7=9; 1:X2=1; 1:X7=9;
0:X2=1234; 0:X7=9; 1:X2=1234; 1:X7=0;
0:X2=1234; 0:X7=9; 1:X2=1234; 1:X7=9;
No
Witnesses
Positive: 0 Negative: 48
Condition exists (0:X2=0 /\ 0:X7=9 \/ 1:X2=0 /\ 1:X7=9)
Observation SeqCst Never 0 48

With all the tests we already have, do you think we are good enough to keep the exclusive part of this PR? 
------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From github.com+4146708+a74nh at openjdk.java.net Fri Mar 5 11:14:49 2021 From: github.com+4146708+a74nh at openjdk.java.net (Alan Hayward) Date: Fri, 5 Mar 2021 11:14:49 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: References: <2vF8YcQ9CgnLxMus9tHTd4HmFITeH5wddBKzH-QNNEY=.dbfb2cd5-5997-4b79-88a5-d8292fc56965@github.com> Message-ID: On Thu, 4 Mar 2021 18:19:33 GMT, Vladimir Kempik wrote: > Hello, > there is one issue with the info you provided: it's from the Xcode 12.5 beta, > and the beta license agreement forbids sharing the output of beta versions of the compiler & co. > So we can't say we have an issue with the newer Xcode beta until that beta has been publicly released. > Fixing the issues you found now would mean someone has violated the Xcode beta license agreement and made these issues public. Ok, I wasn't aware of that. I'll downgrade back to the non-beta version. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From adinn at openjdk.java.net Fri Mar 5 11:48:41 2021 From: adinn at openjdk.java.net (Andrew Dinn) Date: Fri, 5 Mar 2021 11:48:41 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> Message-ID: On Wed, 3 Mar 2021 08:07:35 GMT, Dong Bo wrote: > For us, we still have servers used by our customers that do not support the LSE extension. > Hm, from our point of view, ldaxr+stlxr+dmb and ldxr+stlxr+dmb provide the same order semantics. > The acquire is used to ensure that all loads/stores that come after an ldaxr in program order (actually the loads/stores after the dmb of atomic_*default*_impl in this case) are not reordered before it, while the dmb has already guaranteed this for us. > Without the acquire, the loads/stores after the atomic operations still can not pass the dmb. 
Remove the acquire does not change the order between preceding loads/stores and stlxr. I agree that the code will still be correct if you change the ldaxr to ldar. While this may make some difference on machines which do not support LSE I would not expect it to be significant for anything other than a very carefully crafted benchmark or an extremely specialized parallel algorithm. Is this change request motivated by an actual real-world use case? ------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From tobias.hartmann at oracle.com Fri Mar 5 13:27:29 2021 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 5 Mar 2021 14:27:29 +0100 Subject: CFV: New HotSpot Group Member: Christian Hagedorn Message-ID: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> Hi, I hereby nominate Christian Hagedorn to Membership in the HotSpot Group. Christian is a member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. He contributed over 70 changes to the JDK project [1]. Christian has worked on both C1 and C2, acquiring expert knowledge in key areas (for example, loop unswitching and superword optimizations). He investigated and fixed several highly complex and long-standing issues in the code base and improved maintainability of the JITs. All the while, Christian is constantly updating and extending the sparse documentation, making life easier for other engineers. HotSpot Group membership would allow Christian to continue to do so by adding to the OpenJDK wiki pages. Votes are due by Friday, 19 March 2021 at 13:30 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [3]. 
Best regards, Tobias [1] https://github.com/search?q=committer-name%3A%22Christian+Hagedorn%22+repo%3Aopenjdk%2Fjdk&type=commits [2] https://openjdk.java.net/census [3] https://openjdk.java.net/groups/#member-vote From coleen.phillimore at oracle.com Fri Mar 5 13:30:40 2021 From: coleen.phillimore at oracle.com (Coleen Phillimore) Date: Fri, 5 Mar 2021 08:30:40 -0500 Subject: CFV: New HotSpot Group Member: Christian Hagedorn In-Reply-To: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> References: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> Message-ID: <9660ef6c-7fa8-c096-d2c8-07a37b289ae1@oracle.com> Vote: yes On 3/5/21 8:27 AM, Tobias Hartmann wrote: > Hi, > > I hereby nominate Christian Hagedorn to Membership in the HotSpot Group. > > Christian is a member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. He contributed over > 70 changes to the JDK project [1]. Christian has worked on both C1 and C2, acquiring expert > knowledge in key areas (for example, loop unswitching and superword optimizations). He investigated > and fixed several highly complex and long-standing issues in the code base and improved > maintainability of the JITs. All the while, Christian is constantly updating and extending the > sparse documentation, making life easier for other engineers. HotSpot Group membership would allow > Christian to continue to do so by adding to the OpenJDK wiki pages. > > Votes are due by Friday, 19 March 2021 at 13:30 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Christian+Hagedorn%22+repo%3Aopenjdk%2Fjdk&type=commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote From tobias.hartmann at oracle.com Fri Mar 5 13:31:28 2021 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Fri, 5 Mar 2021 14:31:28 +0100 Subject: CFV: New HotSpot Group Member: Christian Hagedorn In-Reply-To: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> References: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> Message-ID: <65af6c68-5e66-b1ed-9b76-943c101e08c0@oracle.com> Vote: yes Best regards, Tobias On 05.03.21 14:27, Tobias Hartmann wrote: > Hi, > > I hereby nominate Christian Hagedorn to Membership in the HotSpot Group. > > Christian is a member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. He contributed over > 70 changes to the JDK project [1]. Christian has worked on both C1 and C2, acquiring expert > knowledge in key areas (for example, loop unswitching and superword optimizations). He investigated > and fixed several highly complex and long-standing issues in the code base and improved > maintainability of the JITs. All the while, Christian is constantly updating and extending the > sparse documentation, making life easier for other engineers. HotSpot Group membership would allow > Christian to continue to do so by adding to the OpenJDK wiki pages. > > Votes are due by Friday, 19 March 2021 at 13:30 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. 
> > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Christian+Hagedorn%22+repo%3Aopenjdk%2Fjdk&type=commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From aph at redhat.com Fri Mar 5 14:17:46 2021 From: aph at redhat.com (Andrew Haley) Date: Fri, 5 Mar 2021 14:17:46 +0000 Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> Message-ID: <89d002db-793c-8ab7-d9d0-1403e4990139@redhat.com> On 3/5/21 11:48 AM, Andrew Dinn wrote: > I agree that the code will still be correct if you change the ldaxr to ldar. No, it's a change from ldaxr to ldxr! But we all know that. From ChrisPhi at LGonQn.Org Fri Mar 5 14:35:08 2021 From: ChrisPhi at LGonQn.Org ("Chris Phillips"@T O) Date: Fri, 5 Mar 2021 09:35:08 -0500 Subject: CFV: New HotSpot Group Member: Christian Hagedorn In-Reply-To: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> References: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> Message-ID: Hi Vote: Yes Cheers! ChrisPhi On 05/03/21 08:27 AM, Tobias Hartmann wrote: > Hi, > > I hereby nominate Christian Hagedorn to Membership in the HotSpot Group. > > Christian is a member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. He contributed over > 70 changes to the JDK project [1]. Christian has worked on both C1 and C2, acquiring expert > knowledge in key areas (for example, loop unswitching and superword optimizations). He investigated > and fixed several highly complex and long-standing issues in the code base and improved > maintainability of the JITs. All the while, Christian is constantly updating and extending the > sparse documentation, making life easier for other engineers. HotSpot Group membership would allow > Christian to continue to do so by adding to the OpenJDK wiki pages. 
> > Votes are due by Friday, 19 March 2021 at 13:30 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. > > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Christian+Hagedorn%22+repo%3Aopenjdk%2Fjdk&type=commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > > From igor.ignatyev at oracle.com Fri Mar 5 14:54:33 2021 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Fri, 5 Mar 2021 14:54:33 +0000 Subject: CFV: New HotSpot Group Member: Christian Hagedorn In-Reply-To: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> References: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> Message-ID: Vote: yes -- Igor > On Mar 5, 2021, at 5:27 AM, Tobias Hartmann wrote: > > I hereby nominate Christian Hagedorn to Membership in the HotSpot Group. From volker.simonis at gmail.com Fri Mar 5 14:56:55 2021 From: volker.simonis at gmail.com (Volker Simonis) Date: Fri, 5 Mar 2021 15:56:55 +0100 Subject: CFV: New HotSpot Group Member: Christian Hagedorn In-Reply-To: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> References: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> Message-ID: Vote: yes Tobias Hartmann wrote on Fri, 5 March 2021, 14:27: > Hi, > > I hereby nominate Christian Hagedorn to Membership in the HotSpot Group. > > Christian is a member of the HotSpot Compiler Team at Oracle and a JDK > Reviewer. He contributed over > 70 changes to the JDK project [1]. Christian has worked on both C1 and C2, > acquiring expert > knowledge in key areas (for example, loop unswitching and superword > optimizations). He investigated > and fixed several highly complex and long-standing issues in the code base > and improved > maintainability of the JITs.
All the while, Christian is constantly > updating and extending the > sparse documentation, making life easier for other engineers. HotSpot > Group membership would allow > Christian to continue to do so by adding to the OpenJDK wiki pages. > > Votes are due by Friday, 19 March 2021 at 13:30 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this > nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. > > Best regards, > Tobias > > [1] > > https://github.com/search?q=committer-name%3A%22Christian+Hagedorn%22+repo%3Aopenjdk%2Fjdk&type=commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote > From vladimir.kozlov at oracle.com Fri Mar 5 14:59:36 2021 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Fri, 5 Mar 2021 14:59:36 +0000 Subject: CFV: New HotSpot Group Member: Christian Hagedorn In-Reply-To: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> References: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> Message-ID: <0EA6B336-5964-4B08-BBC8-79E6D8CDF027@oracle.com> Vote: yes Thanks Vladimir > On Mar 5, 2021, at 5:27 AM, Tobias Hartmann wrote: > > Hi, > > I hereby nominate Christian Hagedorn to Membership in the HotSpot Group. > > Christian is a member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. He contributed over > 70 changes to the JDK project [1]. Christian has worked on both C1 and C2, acquiring expert > knowledge in key areas (for example, loop unswitching and superword optimizations). He investigated > and fixed several highly complex and long-standing issues in the code base and improved > maintainability of the JITs. All the while, Christian is constantly updating and extending the > sparse documentation, making life easier for other engineers. HotSpot Group membership would allow > Christian to continue to do so by adding to the OpenJDK wiki pages.
> > Votes are due by Friday, 19 March 2021 at 13:30 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. > > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Christian+Hagedorn%22+repo%3Aopenjdk%2Fjdk&type=commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote From martin.doerr at sap.com Fri Mar 5 15:24:41 2021 From: martin.doerr at sap.com (Doerr, Martin) Date: Fri, 5 Mar 2021 15:24:41 +0000 Subject: CFV: New HotSpot Group Member: Christian Hagedorn In-Reply-To: <0EA6B336-5964-4B08-BBC8-79E6D8CDF027@oracle.com> References: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> <0EA6B336-5964-4B08-BBC8-79E6D8CDF027@oracle.com> Message-ID: Vote: yes > -----Original Message----- > From: hotspot-dev On Behalf Of > Vladimir Kozlov > Sent: Friday, 5 March 2021 16:00 > Cc: hotspot-dev Source Developers > Subject: Re: CFV: New HotSpot Group Member: Christian Hagedorn > > Vote: yes > > Thanks > Vladimir > > > On Mar 5, 2021, at 5:27 AM, Tobias Hartmann > wrote: > > > > Hi, > > > > I hereby nominate Christian Hagedorn to Membership in the HotSpot > Group. > > > > Christian is a member of the HotSpot Compiler Team at Oracle and a JDK > Reviewer. He contributed over > > 70 changes to the JDK project [1]. Christian has worked on both C1 and C2, > acquiring expert > > knowledge in key areas (for example, loop unswitching and superword > optimizations). He investigated > > and fixed several highly complex and long-standing issues in the code base > and improved > > maintainability of the JITs. All the while, Christian is constantly updating and > extending the > > sparse documentation, making life easier for other engineers. HotSpot > Group membership would allow > > Christian to continue to do so by adding to the OpenJDK wiki pages.
> > > > Votes are due by Friday, 19 March 2021 at 13:30 UTC. > > > > Only current Members of the HotSpot Group [2] are eligible to vote on this > nomination. Votes must > > be cast in the open by replying to this mailing list. > > > > For Lazy Consensus voting instructions, see [3]. > > > > Best regards, > > Tobias > > > > [1] > > https://github.com/search?q=committer- > name%3A%22Christian+Hagedorn%22+repo%3Aopenjdk%2Fjdk&type=com > mits > > [2] https://openjdk.java.net/census > > [3] https://openjdk.java.net/groups/#member-vote From adinn at openjdk.java.net Fri Mar 5 16:02:11 2021 From: adinn at openjdk.java.net (Andrew Dinn) Date: Fri, 5 Mar 2021 16:02:11 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> Message-ID: On Fri, 5 Mar 2021 11:46:07 GMT, Andrew Dinn wrote: >>> > OKAY, this make sense to us. >>> > If it is OK to keep the exclusive part of this patch? :-) >>> > As far as we know, the exclusive instructions are not being revised. >>> > And we see `ldxr+stxlr+dmb` have been used in linux kernel since 2014 [1], and still used by now [2]. >>> >>> I know that, but the Linux definition of a "full barrier" isn't quite as strong as HotSpot's `memory_order_conservative`, so we'd need a much more detailed analysis of what behaviours we can permit. Also, we'd have to find a strong reason to invest time in AArch64 without LSE instructions. >>> >>> > BTW, the barrier-ordered-before applies with stlxr according to the architecture specification: >>> >>> Sure, but so what? This is about the entire ldxr/stlxr combination and `memory_order_conservative` , in which we try to mimic Intel's "Loads and Stores Are Not Reordered with Locked Instructions" specification. >> >> Hi, >> >> For us, we still have servers used by our customers that does not support LSE extension. 
>> >> Hm, from our point of view, `ldaxr+stlxr+dmb` and `ldxr+stlxr+dmb` provide the same order semantics. >> The acquire are used to ensure all loads/stores that are after an `ldaxr` (actually loads/stores after the `dmb` of `atomic_*default*_impl` in this case) in program order, while the `dmb` has already guaranteed this for us. >> Without the acquire, the loads/stores after the atomic operations still can not pass the `dmb`. >> Remove the acquire does not change the order between preceding loads/stores and `stlxr`. > >> For us, we still have servers used by our customers that does not support LSE extension. >> Hm, from our point of view, ldaxr+stlxr+dmb and ldxr+stlxr+dmb provide the same order semantics. >> The acquire are used to ensure all loads/stores that are after an ldaxr (actually loads/stores after the dmb of atomic_*default*_impl in this case) in program order, while the dmb has already guaranteed this for us. >> Without the acquire, the loads/stores after the atomic operations still can not pass the dmb. > Remove the acquire does not change the order between preceding loads/stores and stlxr. > > I agree that the code will still be correct if you change the ldaxr to ldar. While this may make some difference on machines which do not support LSE I would not expect it to be significant for anything other than a very carefully crafted benchmark or an extremely specialized parallel algorithm. Is this change request motivated by an actual real-world use case? Correction: I agree that the code will still be correct if you change the ldaxr to *ldxr*. 
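
For readers following the `ldaxr` versus `ldxr` question, the sequence shape under discussion (a read-modify-write with release semantics only, followed by a full fence) can be sketched in portable C++. This is an illustration only, not the HotSpot implementation: `add_then_fence` is a hypothetical name, and the expectation that a compiler targeting AArch64 without LSE lowers the release `fetch_add` to an `ldxr`/`stlxr` loop and the fence to a `dmb` is an assumption about common toolchains, not something shown in this thread. Whether this shape is strong enough for `memory_order_conservative` is exactly the point being debated above.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper (not HotSpot code): an atomic add whose load half
// carries no acquire semantics, followed by a trailing full fence.
inline int32_t add_then_fence(std::atomic<int32_t>& dest, int32_t add_value) {
  // Release-only read-modify-write: the store half is a release,
  // the load half is a plain (relaxed) exclusive load.
  int32_t old_value = dest.fetch_add(add_value, std::memory_order_release);

  // Full two-way fence after the update. Later loads and stores in
  // program order cannot be reordered before this point.
  std::atomic_thread_fence(std::memory_order_seq_cst);

  return old_value + add_value;  // value after the add
}
```

The argument made in the thread is that the trailing fence already keeps later accesses from moving above the atomic operation, so an acquire on the load half would add nothing for those accesses. The sketch does not settle how earlier accesses are ordered against the load half.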
------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From hseigel at openjdk.java.net Fri Mar 5 16:02:10 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 5 Mar 2021 16:02:10 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms In-Reply-To: References: Message-ID: <45xuwCNKuKGWNBnEvqq2A3frbjgADxRvWsbRIoGl3gM=.79dcfb0d-7e7f-422e-ad08-b799058969ad@github.com> On Fri, 5 Mar 2021 07:24:34 GMT, Thomas Stuefe wrote: >> `os::fork_and_exec()` can be used from within the hotspot to start a child process. It is only called in fatal situations, in two cases: >> a) to automatically start a debugger when ShowMessageBoxOnError is specified (uses *fork*()) >> b) to start a caller provided binary on OOM if -XX:OnOutOfMemoryError is specified (uses *vfork*()) >> >> The variants for AIX, Linux, Bsd are almost completely identical. So, this function can be unified under posix. >> >> In addition to that, this patch does a number of small changes: >> >> 1) Before, whether we would vfork() only on Linux and only for case (b). I changed this to always use vfork unconditionally, on all platforms, because: >> - even though vfork() can be unsafe, the way we use it - calling vfork()->exec()->_exit() with no intermediate steps - is safe. >> - Using vfork is good for OOM situations on all platforms, not just Linux, and also for starting the debugger in non-OOM cases. Keep in mind that we do this only for cases where the parent VM is about to die, so even if it were unsafe, the damage would be limited. >> 2) I added a comment to the function to not use it outside of fatal error situations. >> 3) I added a posix wrapper for getting the environ pointer, to hide MacOS specifics, and used it in two places to unify that coding. >> 4) consistently used global scope :: for posix APIs. 
>> >> Note that if we wanted to make os::fork_and_exec() a first class function, always safe to use, we should modify it to at least not leak any parent process file descriptors. Possibly safest would be to completely rewrite this function and use posix_spawn(). posix_spawn() we use in Runtime.exec() by default since JDK 13 (1). But as long as this is spawned by only dying VMs I think this function is fine. >> >> ---- >> >> Tests: GAs, manual tests using -XX:ShowMessageBoxOnError > >> _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >> >> Hi Thomas, >> >> On 5/03/2021 4:22 pm, Thomas Stuefe wrote: >> >> > `os::fork_and_exec()` can be used from within the hotspot to start a child process. It is only called in fatal situations, in two cases: >> > a) to automatically start a debugger when ShowMessageBoxOnError is specified (uses *fork*()) >> > b) to start a caller provided binary on OOM if -XX:OnOutOfMemoryError is specified (uses *vfork*()) >> > The variants for AIX, Linux, Bsd are almost completely identical. So, this function can be unified under posix. >> > In addition to that, this patch does a number of small changes: >> > 1) Before, whether we would vfork() only on Linux and only for case (b). I changed this to always use vfork unconditionally, on all platforms, because: >> > - even though vfork() can be unsafe, the way we use it - calling vfork()->exec()->_exit() with no intermediate steps - is safe. >> >> Is it? The reason we use fork() for the error/crash case is because it >> can get launched from a signal handling context and vfork is not >> async-signal-safe. >> >> There is some commentary in: >> >> https://bugs.openjdk.java.net/browse/JDK-8027434 >> >> Cheers, >> David > > Hi David, > > My estimate of vfork being safe here comes from experience. 
At SAP, at one time we replaced the jdk's Runtime.exec() implementation completely with our own; it lived in hotspot and was used by both the jdk and the hotspot, in and out of signal contexts. Our implementation mainly used vfork(), with a lot of safe guards of course (mainly a forkhelper binary, similarly to what Runtime.exec() does today). > > I cannot completely exclude the possibility of problems here, but calling vfork()->exec()->_exit() is as safe as it gets. Seeing that we only use it for starting a debugger in case the original VM crashed, I think the possibility of problems is remote. > > OTOH when starting a debugger spawned from a fat process, you may run into the same problems as with https://bugs.openjdk.java.net/browse/JDK-8027434. You don't want the machine to go into swap when trying to start the debugger for you. > > I won't fight you on this though if you insist; mainly what I disliked was the introduction of Posix terminology in Windows code ("fork_and_exec" "use_vfork") and that can be straightened out by a separate layer (e,g. os::start_child_process(bool from_error_handler) -> os::Posix::os_fork_and_exec(can_use_vfork)). > > Cheers, Thomas Hi Thomas, Can the #includes of be removed from the os_aix.cpp, os_bsd.cpp, and os_linux.cpp files? Thanks, Harold ------------- PR: https://git.openjdk.java.net/jdk/pull/2810 From coleenp at openjdk.java.net Fri Mar 5 16:02:42 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 5 Mar 2021 16:02:42 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error In-Reply-To: References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Fri, 5 Mar 2021 06:20:12 GMT, David Holmes wrote: >> This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. 
The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. >> >> One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. >> >> I didn't squash the commits so it would be easier to see the different changes, but they all go together. >> >> The test description: >> >> Two Threads T1, T2 >> >> Three definitions of class A, defined by user defined class loader >> Class A extends B extends A (CCE) >> Class A extends B >> Class A extends C >> >> Five modes: >> Sequential >> Concurrent loading with user defined class loader >> Concurrent loading parallelCapable class loader >> Wait when loading the superclass with parallelCapable class loader >> Wait when loading the superclass with user defined class loader >> >> In all cases, after A is parsed and calls resolve_super_or_fail to load B >> and loading B waits. Classes ClassInLoader, CP1 and CP2 provide >> constant pool references to A. >> >> In all cases, when B waits, A is replaced with bytes so A extends C. >> >> Two tests x 3 modes (both threads do the same): >> (CCE) First test A extends B, which throws CCE. >> -- All three modes: first constant pool reference throws CCE, second reference A extends C >> (B) Second test A extends B which doesn't throw CCE. >> -- All three modes: both references A extends B. >> >> The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 >> is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is >> currently loading. >> >> Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. 
Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. >> >> Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. > > src/hotspot/share/oops/constantPool.cpp line 555: > >> 553: throw_resolution_error(this_cp, which, CHECK_NULL); >> 554: } >> 555: > > I'm unclear how this race is resolved. There must be a serialization point between the two threads otherwise this re-check is just as racy as the original. We need to know that the CP entry is now stable and was resolved either by this thread or the other one. But I can't see where this serialization point arises. ?? Yes this is the wrong commit. Sorry I thought it would help to have the separate commits. The first commit should only add the test. > src/hotspot/share/oops/constantPool.cpp line 555: > >> 553: >> 554: Klass** adr = this_cp->resolved_klasses()->adr_at(resolved_klass_index); >> 555: Atomic::release_store(adr, k); > > If we are racing then isn't it the case that we may not have an entry in resolved_klasses()? The order is: add the klass to resolved_klasses() and then set the tag. We need to check the tag in order to see whether the klass is correct or not. This is the way it worked before resolved_klasses() was added, but there were a couple of shortcuts to just check the klass != NULL. With the race to set UnresolvedClassInError, we need to check the tag first again, because the klass is set to null if the unresolved class has won the race. > src/hotspot/share/oops/constantPool.cpp line 569: > >> 567: if (old_tag == JVM_CONSTANT_UnresolvedClassInError) { >> 568: // Remove klass. >> 569: Atomic::release_store(adr, (Klass*)NULL); > > This is all very unclear to me. 
The CAS above provides a serialization point for the racing threads, but prior to that we have already published the klass into adr and now we are saying "oops I'd better undo that", but surely it is too late as we have let it escape. ?? It hasn't escaped yet for *this* location in the code, which is what my reading of the JVMS requires. The newly loaded class is in the SystemDictionary and can be used for future resolutions (see CP2 in the test). If I'm reading the JVMS incorrectly though, we've got a lot more (invasive) work to do to prevent a class loader from loading a correct class after it's tried to load one with an error. > src/hotspot/share/oops/constantPool.cpp line 545: > >> 543: // To preserve old behavior, we return the resolved class. >> 544: Klass* klass = this_cp->resolved_klasses()->at(resolved_klass_index); >> 545: assert(klass != NULL, "must be resolved if exception was cleared"); > > Existing code but this seems a bizarre thing to do. The only way the two resolving threads can disagree about the resolution is if we have a bad classloader. With a bad classloader it is hard to justify any argument about what should happen, so it would have seemed preferrable to me to always report the error. It is a race so the outcome is arbitrary to begin with. Yes, you need a non well behaved class loader. I think the key point is that the *first* outcome is the one that must be saved as the resolution for this constant pool index. This bug was that if the first was already saved as an error, we were overwriting it. 
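
The "first outcome is the one that must be saved" rule described above can be sketched in portable C++. This is a minimal illustration under stated assumptions, not the code in constantPool.cpp: `Entry`, `Tag`, `record_success`, `record_error` and `lookup` are hypothetical names, and the sketch assumes that any two threads that resolve successfully agree on the class they publish.

```cpp
#include <atomic>

// Hypothetical stand-ins for a constant pool tag and entry.
enum class Tag : int { Unresolved, Resolved, UnresolvedInError };

struct Entry {
  std::atomic<Tag> tag{Tag::Unresolved};
  std::atomic<const void*> klass{nullptr};
};

// A thread that resolved successfully publishes the class first, then
// tries to claim the tag. Returns the tag that is in force afterwards.
inline Tag record_success(Entry& e, const void* k) {
  e.klass.store(k, std::memory_order_release);
  Tag expected = Tag::Unresolved;
  if (e.tag.compare_exchange_strong(expected, Tag::Resolved,
                                    std::memory_order_acq_rel,
                                    std::memory_order_acquire)) {
    return Tag::Resolved;
  }
  if (expected == Tag::UnresolvedInError) {
    // An error was recorded first: withdraw the class we published.
    e.klass.store(nullptr, std::memory_order_release);
  }
  return expected;
}

// A thread that failed to resolve tries to claim the tag as an error.
// It never overwrites an outcome that was already recorded.
inline Tag record_error(Entry& e) {
  Tag expected = Tag::Unresolved;
  if (e.tag.compare_exchange_strong(expected, Tag::UnresolvedInError,
                                    std::memory_order_acq_rel,
                                    std::memory_order_acquire)) {
    return Tag::UnresolvedInError;
  }
  return expected;
}

// Readers check the tag before trusting the class pointer.
inline const void* lookup(const Entry& e) {
  if (e.tag.load(std::memory_order_acquire) != Tag::Resolved) {
    return nullptr;
  }
  return e.klass.load(std::memory_order_acquire);
}
```

In this sketch the compare-and-swap on the tag is the only serialization point, which is why `lookup` reads the tag first: a non-null class pointer on its own may belong to a thread that is about to lose the race and withdraw it.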
------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From akozlov at openjdk.java.net Fri Mar 5 16:05:45 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 5 Mar 2021 16:05:45 GMT Subject: RFR: 8261075: Create stubRoutines.inline.hpp with SafeFetch implementation [v2] In-Reply-To: <4z7djCLhn7GE7upEUtWJ1iTrQXJmF3gpSghQfTqd1WE=.5abf8ca8-591c-4701-8ce2-35da0f141241@github.com> References: <4z7djCLhn7GE7upEUtWJ1iTrQXJmF3gpSghQfTqd1WE=.5abf8ca8-591c-4701-8ce2-35da0f141241@github.com> Message-ID: On Thu, 4 Mar 2021 19:13:05 GMT, Stefan Karlsson wrote: > Just so that I understand. Did you really mean safepoint.hpp and not safefetch.hpp? Definitely safefetch.hpp. Sorry, it seems I was thinking also about another problem at that moment :) > I assume this all has to do with the fact that safefetch.hpp is going to include threadWXSetter.hpp, which you are going to change to threadWXSsetter.inline.hpp, because it includes thread.inline.hpp. If that's the case then I think the easy fix is to just rename safefetch.hpp to safefetch.inline.hpp. This is correct. My main concern was the noise I'm creating in the git repository, but I also think this is the best way in this situation. Thanks for confirmation! The bug is JDK-8263068 and PR is #2844. ------------- PR: https://git.openjdk.java.net/jdk/pull/2542 From akozlov at openjdk.java.net Fri Mar 5 16:09:15 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 5 Mar 2021 16:09:15 GMT Subject: RFR: JDK-8263068: Rename safefetch.hpp to safefetch.inline.hpp Message-ID: <6pYgLK51Ll4jWSuTlGWnEOuXU_3F8uBphuM1c0TAVdI=.726e1003-6b80-4745-b901-1493a0227d4c@github.com> Please review a trivial renaming of safefetch.hpp to safefetch.inline.hpp. It is a preparation to fix for @stefank note https://github.com/openjdk/jdk/pull/2200#discussion_r572707505. I'm going to rename threadWXSetters.hpp to threadWXSetters.inline.hpp and threadWXSetters header is needed for safefetch inline functions implementation. 
------------- Commit messages: - Rename safefetch.hpp -> safefetch.inline.hpp Changes: https://git.openjdk.java.net/jdk/pull/2844/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2844&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263068 Stats: 12 lines in 10 files changed: 0 ins; 0 del; 12 mod Patch: https://git.openjdk.java.net/jdk/pull/2844.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2844/head:pull/2844 PR: https://git.openjdk.java.net/jdk/pull/2844 From stuefe at openjdk.java.net Fri Mar 5 16:17:21 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 5 Mar 2021 16:17:21 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms In-Reply-To: <45xuwCNKuKGWNBnEvqq2A3frbjgADxRvWsbRIoGl3gM=.79dcfb0d-7e7f-422e-ad08-b799058969ad@github.com> References: <45xuwCNKuKGWNBnEvqq2A3frbjgADxRvWsbRIoGl3gM=.79dcfb0d-7e7f-422e-ad08-b799058969ad@github.com> Message-ID: On Fri, 5 Mar 2021 14:38:55 GMT, Harold Seigel wrote: > Hi Thomas, > Can the #includes of be removed from the os_aix.cpp, os_bsd.cpp, and os_linux.cpp files? > Thanks, Harold Yes, good catch, I'll remove them. ------------- PR: https://git.openjdk.java.net/jdk/pull/2810 From stefank at openjdk.java.net Fri Mar 5 16:18:17 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Fri, 5 Mar 2021 16:18:17 GMT Subject: RFR: JDK-8263068: Rename safefetch.hpp to safefetch.inline.hpp In-Reply-To: <6pYgLK51Ll4jWSuTlGWnEOuXU_3F8uBphuM1c0TAVdI=.726e1003-6b80-4745-b901-1493a0227d4c@github.com> References: <6pYgLK51Ll4jWSuTlGWnEOuXU_3F8uBphuM1c0TAVdI=.726e1003-6b80-4745-b901-1493a0227d4c@github.com> Message-ID: On Fri, 5 Mar 2021 13:15:56 GMT, Anton Kozlov wrote: > Please review a trivial renaming of safefetch.hpp to safefetch.inline.hpp. It is a preparation to fix for @stefank note https://github.com/openjdk/jdk/pull/2200#discussion_r572707505. 
I'm going to rename threadWXSetters.hpp to threadWXSetters.inline.hpp and threadWXSetters header is needed for safefetch inline functions implementation. Looks good and can be considered trivial (only requires one reviewer). Thanks. ------------- Marked as reviewed by stefank (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2844 From stuefe at openjdk.java.net Fri Mar 5 16:53:32 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 5 Mar 2021 16:53:32 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms [v2] In-Reply-To: References: Message-ID: > `os::fork_and_exec()` can be used from within the hotspot to start a child process. It is only called in fatal situations, in two cases: > a) to automatically start a debugger when ShowMessageBoxOnError is specified (uses *fork*()) > b) to start a caller provided binary on OOM if -XX:OnOutOfMemoryError is specified (uses *vfork*()) > > The variants for AIX, Linux, Bsd are almost completely identical. So, this function can be unified under posix. > > In addition to that, this patch does a number of small changes: > > 1) Before, whether we would vfork() only on Linux and only for case (b). I changed this to always use vfork unconditionally, on all platforms, because: > - even though vfork() can be unsafe, the way we use it - calling vfork()->exec()->_exit() with no intermediate steps - is safe. > - Using vfork is good for OOM situations on all platforms, not just Linux, and also for starting the debugger in non-OOM cases. Keep in mind that we do this only for cases where the parent VM is about to die, so even if it were unsafe, the damage would be limited. > 2) I added a comment to the function to not use it outside of fatal error situations. > 3) I added a posix wrapper for getting the environ pointer, to hide MacOS specifics, and used it in two places to unify that coding. > 4) consistently used global scope :: for posix APIs. 
> > Note that if we wanted to make os::fork_and_exec() a first class function, always safe to use, we should modify it to at least not leak any parent process file descriptors. Possibly safest would be to completely rewrite this function and use posix_spawn(). posix_spawn() we use in Runtime.exec() by default since JDK 13 (1). But as long as this is spawned by only dying VMs I think this function is fine. > > ---- > > Tests: GAs, manual tests using -XX:ShowMessageBoxOnError Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: - use vfork only outside of signal contexts - remove unnecessary header from os_xxx.cpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2810/files - new: https://git.openjdk.java.net/jdk/pull/2810/files/4d92b7ce..e7bf9f5c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2810&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2810&range=00-01 Stats: 50 lines in 8 files changed: 41 ins; 3 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/2810.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2810/head:pull/2810 PR: https://git.openjdk.java.net/jdk/pull/2810 From stuefe at openjdk.java.net Fri Mar 5 16:56:14 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 5 Mar 2021 16:56:14 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms In-Reply-To: References: <45xuwCNKuKGWNBnEvqq2A3frbjgADxRvWsbRIoGl3gM=.79dcfb0d-7e7f-422e-ad08-b799058969ad@github.com> Message-ID: On Fri, 5 Mar 2021 16:14:03 GMT, Thomas Stuefe wrote: >> Hi Thomas, >> Can the #includes of be removed from the os_aix.cpp, os_bsd.cpp, and os_linux.cpp files? >> Thanks, Harold > >> Hi Thomas, >> Can the #includes of be removed from the os_aix.cpp, os_bsd.cpp, and os_linux.cpp files? >> Thanks, Harold > > Yes, good catch, I'll remove them. 
Hi, I added the following changes: - took care of Harold's request - modified the coding to only use vfork when we are assured outside of signal context. For that, I added a marker into Thread which marks if the Thread is inside hotspot signal handling. If we want, we can use this in the future for better, signal handler aware tests (eg assert that we don't use malloc in tests). I also added POSIX macros to macros.hpp since this is overdue. We can use those to replace constructs where we use #ifndef WINDOWS to guard posix coding. Seems cleaner that way to me. Thanks, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/2810 From stuefe at openjdk.java.net Fri Mar 5 17:01:26 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 5 Mar 2021 17:01:26 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms [v3] In-Reply-To: References: Message-ID: > `os::fork_and_exec()` can be used from within the hotspot to start a child process. It is only called in fatal situations, in two cases: > a) to automatically start a debugger when ShowMessageBoxOnError is specified (uses *fork*()) > b) to start a caller provided binary on OOM if -XX:OnOutOfMemoryError is specified (uses *vfork*()) > > The variants for AIX, Linux, Bsd are almost completely identical. So, this function can be unified under posix. > > In addition to that, this patch does a number of small changes: > > 1) Before, whether we would vfork() only on Linux and only for case (b). I changed this to always use vfork unconditionally, on all platforms, because: > - even though vfork() can be unsafe, the way we use it - calling vfork()->exec()->_exit() with no intermediate steps - is safe. > - Using vfork is good for OOM situations on all platforms, not just Linux, and also for starting the debugger in non-OOM cases. Keep in mind that we do this only for cases where the parent VM is about to die, so even if it were unsafe, the damage would be limited. 
> 2) I added a comment to the function to not use it outside of fatal error situations. > 3) I added a posix wrapper for getting the environ pointer, to hide MacOS specifics, and used it in two places to unify that coding. > 4) consistently used global scope :: for posix APIs. > > Note that if we wanted to make os::fork_and_exec() a first class function, always safe to use, we should modify it to at least not leak any parent process file descriptors. Possibly safest would be to completely rewrite this function and use posix_spawn(). posix_spawn() we use in Runtime.exec() by default since JDK 13 (1). But as long as this is spawned by only dying VMs I think this function is fine. > > ---- > > Tests: GAs, manual tests using -XX:ShowMessageBoxOnError Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: tidy up ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2810/files - new: https://git.openjdk.java.net/jdk/pull/2810/files/e7bf9f5c..aa03cf40 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2810&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2810&range=01-02 Stats: 7 lines in 1 file changed: 3 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/2810.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2810/head:pull/2810 PR: https://git.openjdk.java.net/jdk/pull/2810 From rrich at openjdk.java.net Fri Mar 5 17:28:24 2021 From: rrich at openjdk.java.net (Richard Reingruber) Date: Fri, 5 Mar 2021 17:28:24 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: References: <2vF8YcQ9CgnLxMus9tHTd4HmFITeH5wddBKzH-QNNEY=.dbfb2cd5-5997-4b79-88a5-d8292fc56965@github.com> Message-ID: On Fri, 5 Mar 2021 11:11:44 GMT, Alan Hayward wrote: >>> I was building this PR on a new machine, and I now get the following error: >>> >>> > 
/Users/alahay01/java/gerrit_jdk/src/java.desktop/macosx/native/libjsound/PLATFORM_API_MacOSX_MidiUtils.c:258:31: error: cast to smaller integer type 'MIDIClientRef' (aka 'unsigned int') from 'void *' [-Werror,-Wvoid-pointer-to-int-cast] >>> > static MIDIClientRef client = (MIDIClientRef) NULL; >>> > ^~~~~~~~~~~~~~~~~~~~ >>> > /Users/alahay01/java/gerrit_jdk/src/java.desktop/macosx/native/libjsound/PLATFORM_API_MacOSX_MidiUtils.c:259:29: error: cast to smaller integer type 'MIDIPortRef' (aka 'unsigned int') from 'void *' [-Werror,-Wvoid-pointer-to-int-cast] >>> > static MIDIPortRef inPort = (MIDIPortRef) NULL; >>> > ^~~~~~~~~~~~~~~~~~ >>> > /Users/alahay01/java/gerrit_jdk/src/java.desktop/macosx/native/libjsound/PLATFORM_API_MacOSX_MidiUtils.c:260:30: error: cast to smaller integer type 'MIDIPortRef' (aka 'unsigned int') from 'void *' [-Werror,-Wvoid-pointer-to-int-cast] >>> > static MIDIPortRef outPort = (MIDIPortRef) NULL; >>> > ^~~~~~~~~~~~~~~~~~ >>> > /Users/alahay01/java/gerrit_jdk/src/java.desktop/macosx/native/libjsound/PLATFORM_API_MacOSX_MidiUtils.c:466:32: error: cast to smaller integer type 'MIDIEndpointRef' (aka 'unsigned int') from 'void *' [-Werror,-Wvoid-pointer-to-int-cast] >>> > MIDIEndpointRef endpoint = (MIDIEndpointRef) NULL; >>> > ^~~~~~~~~~~~~~~~~~~~~~ >>> > 4 errors generated. >>> >>> As far as I can tell the only difference between the two systems is the xcode version: >>> >>> New system (failing) >>> % xcodebuild -version >>> Xcode 12.5 >>> Build version 12E5244e >>> >>> Old system (working) >>> % xcodebuild -version >>> Xcode 12.4 >>> Build version 12D4e >>> >>> Looks like the newer version of Xcode is being a little stricter with casting? >>> Replacing the NULL with 0 seems to fix the issue. >> >> Hello >> there is one issue with the info you provided, it's from Xcode12.5 beta. 
>> And beta license agreement forbids sharing output of beta version of compiler&co
>> So we can't say we have issue with newer xcode beta until that beta went public & released.
>> Fixing issues you found now will mean someone have violated xcode beta license agreement and made these issues public.
>
>> Hello
>> there is one issue with the info you provided, it's from Xcode12.5 beta.
>> And beta license agreement forbids sharing output of beta version of compiler&co
>> So we can't say we have issue with newer xcode beta until that beta went public & released.
>> Fixing issues you found now will mean someone have violated xcode beta license agreement and made these issues public.
>
> Ok, I wasn't aware of that. I'll downgrade back to the non-beta version.

Hi,

@VladimirKempik reported a failure of CompressedClassPointers.largeHeapAbove32GTest() with [JDK-8262895](https://bugs.openjdk.java.net/browse/JDK-8262895) on macos_aarch64. He also helped analyze the issue (thanks!).

Apparently the vm has difficulties mapping the heap of 31GB at one of the preferred addresses given in [`get_attach_addresses_for_disjoint_mode()`](https://github.com/openjdk/jdk/blob/8d3de4b1bdb5dc13bb7724596dc2123ba05bbb81/src/hotspot/share/memory/virtualspace.cpp#L477). This causes the test failure indirectly.

IMO it could be worth the effort to find out why the heap cannot be mapped at the preferred addresses. Reviewers should decide on it. I wouldn't be able to do it myself though, as I don't have access to a macos_aarch64 system.

Alternatively I'd suggest excluding macos_aarch64 from the subtest largeHeapAbove32GTest.

Best regards, Richard.
--

#### Details of the analysis

In the following trace we see the vm trying to allocate the heap at addresses given in [`get_attach_addresses_for_disjoint_mode()`](https://github.com/openjdk/jdk/blob/8d3de4b1bdb5dc13bb7724596dc2123ba05bbb81/src/hotspot/share/memory/virtualspace.cpp#L477) without success:

    images/jdk/bin/java -XX:+UnlockDiagnosticVMOptions -XX:+UnlockExperimentalVMOptions -Xmx31g -XX:-UseAOT -Xlog:gc+metaspace=trace,cds=trace,heap+gc+exit=info,gc+heap+coops=trace -Xshare:off -XX:+VerifyBeforeGC -XX:HeapSearchSteps=40 -version
    OpenJDK 64-Bit Server VM warning: Shared spaces are not supported in this VM
    [0.005s][trace][gc,heap,coops] Trying to allocate at address 0x0000001000000000 heap of size 0x7c1000000
    [0.005s][trace][gc,heap,coops] Trying to allocate at address 0x0000001800000000 heap of size 0x7c1000000
    [0.005s][trace][gc,heap,coops] Trying to allocate at address 0x0000002000000000 heap of size 0x7c1000000
    [0.005s][trace][gc,heap,coops] Trying to allocate at address 0x0000004000000000 heap of size 0x7c1000000
    [0.005s][trace][gc,heap,coops] Trying to allocate at address 0x0000005000000000 heap of size 0x7c1000000
    [0.005s][trace][gc,heap,coops] Trying to allocate at address 0x0008000000000000 heap of size 0x7c1000000
    [0.005s][trace][gc,heap,coops] Trying to allocate at address 0x0010000000000000 heap of size 0x7c1000000
    [0.006s][trace][gc,heap,coops] Trying to allocate at address 0x0018000000000000 heap of size 0x7c1000000
    [0.006s][trace][gc,heap,coops] Trying to allocate at address 0x0020000000000000 heap of size 0x7c1000000
    [0.006s][trace][gc,heap,coops] Trying to allocate at address 0x0080000000000000 heap of size 0x7c1000000
    [0.006s][trace][gc,heap,coops] Trying to allocate at address 0x0100000000000000 heap of size 0x7c1000000
    [0.006s][trace][gc,heap,coops] Trying to allocate at address 0x0110000000000000 heap of size 0x7c1000000

Finally it gives up and lets the os choose the address:

    [0.006s][trace][gc,heap,coops] Trying to allocate at address NULL heap of size 0x7c1000000
    [0.006s][debug][gc,heap,coops] Protected page at the reserved heap base: 0x0000000280000000 / 16777216 bytes
    [0.006s][debug][gc,heap,coops] Heap address: 0x0000000281000000, size: 31744 MB, Compressed Oops mode: Non-zero based: 0x0000000280000000, Oop shift amount: 3

The os chooses to map the heap at 0x0000000281000000, that is, at about 10GB. This leaves not much room for a 4GB (*) aligned compressed class space below 32G for a zero based encoding. And indeed we get a compressed class space with an encoding base that is not zero, and largeHeapAbove32GTest then fails:

    [0.007s][info ][gc,metaspace ] Compressed class space mapped at: 0x0000007000000000-0x0000007040000000, reserved size: 1073741824
    [0.007s][info ][gc,metaspace ] Narrow klass base: 0x0000007000000000, Narrow klass shift: 0, Narrow klass range: 0x40000000

On macos 10.15 (x86_64) the vm succeeds on the first try, mapping the heap at address 0x0000001000000000.

(*) On aarch64 the encoding base has to be 4GB aligned. Unfortunately the 4GB alignment is enforced too strictly on the start address of the compressed class space instead of enforcing it on the encoding base. See [JDK-8258756](https://bugs.openjdk.java.net/browse/JDK-8258756)

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200

From coleenp at openjdk.java.net Fri Mar 5 21:03:49 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Fri, 5 Mar 2021 21:03:49 GMT
Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v2]
In-Reply-To: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com>
References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com>
Message-ID:

> This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests.
The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. > > One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. > > I didn't squash the commits so it would be easier to see the different changes, but they all go together. > > The test description: > > Two Threads T1, T2 > > Three definitions of class A, defined by user defined class loader > Class A extends B extends A (CCE) > Class A extends B > Class A extends C > > Five modes: > Sequential > Concurrent loading with user defined class loader > Concurrent loading parallelCapable class loader > Wait when loading the superclass with parallelCapable class loader > Wait when loading the superclass with user defined class loader > > In all cases, after A is parsed and calls resolve_super_or_fail to load B > and loading B waits. Classes ClassInLoader, CP1 and CP2 provide > constant pool references to A. > > In all cases, when B waits, A is replaced with bytes so A extends C. > > Two tests x 3 modes (both threads do the same): > (CCE) First test A extends B, which throws CCE. > -- All three modes: first constant pool reference throws CCE, second reference A extends C > (B) Second test A extends B which doesn't throw CCE. > -- All three modes: both references A extends B. > > The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 > is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is > currently loading. > > Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. 
Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. > > Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Remove unnecessary tag comparison. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2718/files - new: https://git.openjdk.java.net/jdk/pull/2718/files/732bf504..b315aa70 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2718&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2718&range=00-01 Stats: 7 lines in 1 file changed: 0 ins; 2 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/2718.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2718/head:pull/2718 PR: https://git.openjdk.java.net/jdk/pull/2718 From iklam at openjdk.java.net Sun Mar 7 06:41:21 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Sun, 7 Mar 2021 06:41:21 GMT Subject: RFR: 8263002 Remove CDS MiscCode region Message-ID: The CDS MiscCode region is used for: (a) C++ vtables (b) Method trampolines (a) can be moved to the ReadWrite region (b) were introduced in JDK-8145221 so we can delay writing into Methods. This was intended to improve copy-on-write sharing to reduce memory footprint. However, this hasn't been shown to have any significant effect (footprint of metadata usually is much smaller than the Java heap), and introduces a lot of complexity in the HotSpot code. Removing (b) will make it easier to implement JDK-8026297 (Generating AdapterHandlerEntry during CDS dump), which will further improve start-up time. 
============
Other benefits of removing the MiscCode region:

- We no longer have a read/write/executable region. This addresses the concern in JDK-8262922.
- We can enable CDS on macOS/AArch64, which does not allow read/write/executable regions. (JDK-8253795)

-------------

Commit messages:
 - remove MC region
 - 8263002: Remove CDS MiscCode region

Changes: https://git.openjdk.java.net/jdk/pull/2861/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2861&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8263002
  Stats: 650 lines in 38 files changed: 11 ins; 541 del; 98 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2861.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2861/head:pull/2861

PR: https://git.openjdk.java.net/jdk/pull/2861

From iveresov at openjdk.java.net Sun Mar 7 20:55:16 2021
From: iveresov at openjdk.java.net (Igor Veresov)
Date: Sun, 7 Mar 2021 20:55:16 GMT
Subject: RFR: 8219555: compiler/jvmci/compilerToVM/IsMatureTest.java fails with Unexpected isMature state for multiple times invoked method: expected false to equal true
Message-ID:

Clean up the behavior of the compilation policy with -Xcomp.
------------- Commit messages: - Cleanup behavior of compilation policy with -Xcomp Changes: https://git.openjdk.java.net/jdk/pull/2864/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2864&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8219555 Stats: 63 lines in 9 files changed: 16 ins; 27 del; 20 mod Patch: https://git.openjdk.java.net/jdk/pull/2864.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2864/head:pull/2864 PR: https://git.openjdk.java.net/jdk/pull/2864 From dholmes at openjdk.java.net Mon Mar 8 05:41:08 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 8 Mar 2021 05:41:08 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms [v3] In-Reply-To: References: Message-ID: On Fri, 5 Mar 2021 17:01:26 GMT, Thomas Stuefe wrote: >> `os::fork_and_exec()` can be used from within the hotspot to start a child process. It is only called in fatal situations, in two cases: >> a) to automatically start a debugger when ShowMessageBoxOnError is specified (uses *fork*()) >> b) to start a caller provided binary on OOM if -XX:OnOutOfMemoryError is specified (uses *vfork*()) >> >> The variants for AIX, Linux, Bsd are almost completely identical. So, this function can be unified under posix. >> >> In addition to that, this patch does a number of small changes: >> >> 1) Before, whether we would vfork() only on Linux and only for case (b). I changed this to always use vfork unconditionally, on all platforms, because: >> - even though vfork() can be unsafe, the way we use it - calling vfork()->exec()->_exit() with no intermediate steps - is safe. >> - Using vfork is good for OOM situations on all platforms, not just Linux, and also for starting the debugger in non-OOM cases. Keep in mind that we do this only for cases where the parent VM is about to die, so even if it were unsafe, the damage would be limited. >> 2) I added a comment to the function to not use it outside of fatal error situations. 
>> 3) I added a posix wrapper for getting the environ pointer, to hide MacOS specifics, and used it in two places to unify that coding. >> 4) consistently used global scope :: for posix APIs. >> >> Note that if we wanted to make os::fork_and_exec() a first class function, always safe to use, we should modify it to at least not leak any parent process file descriptors. Possibly safest would be to completely rewrite this function and use posix_spawn(). posix_spawn() we use in Runtime.exec() by default since JDK 13 (1). But as long as this is spawned by only dying VMs I think this function is fine. >> >> ---- >> >> Tests: GAs, manual tests using -XX:ShowMessageBoxOnError > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > tidy up Hi Thomas, I appreciate the reworking to avoid vfork for the signal case, but have some concerns with the mechanism and I'm not sure a new mechanism is really needed. See comments below. Thanks, David src/hotspot/os/posix/os_posix.cpp line 1804: > 1802: // of signal handling. > 1803: const Thread* const t = Thread::current_or_null_safe(); > 1804: const bool use_vfork = t != NULL && t->is_in_signal_handler() == false; Nit: !t->is_in_signal_handler() src/hotspot/os/posix/signals_posix.cpp line 573: > 571: InSignalHandlerMark() : > 572: _thread(Thread::current_or_null_safe()) { > 573: if (_thread) { Style Nit: no implicit booleans - use `_thread != NULL` src/hotspot/os/posix/signals_posix.cpp line 600: > 598: assert(info != NULL && ucVoid != NULL, "sanity"); > 599: > 600: InSignalHandlerMark ishm; Is this going to behave as expected if we actually crash during signal processing and so re-enter this routine with a signal mark already active? src/hotspot/share/utilities/macros.hpp line 439: > 437: #endif > 438: #define POSIX_ONLY(code) code > 439: #define NOT_POSIX(code) Not sure this makes sense. At the moment only Windows is not-POSIX. 
If in the future we had another non-POSIX platform then I find it very unlikely that it would use the same code as Windows. So I would not like to see NOT_POSIX used incorrectly where we really only mean WINDOWS. src/hotspot/share/runtime/thread.hpp line 826: > 824: static void SpinRelease(volatile int * Lock); > 825: > 826: #ifdef POSIX I hate seeing this in shared code. It really belongs in platform-dependent thread code - but osThread_posix doesn't exist. :( But do we need this - can't the existing use_vfork_if_available parameter instead be renamed and interpreted as not_in_a_signal_handler? ------------- PR: https://git.openjdk.java.net/jdk/pull/2810 From stuefe at openjdk.java.net Mon Mar 8 06:07:07 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 8 Mar 2021 06:07:07 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms [v3] In-Reply-To: References: Message-ID: On Mon, 8 Mar 2021 05:24:36 GMT, David Holmes wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> tidy up > > src/hotspot/share/utilities/macros.hpp line 439: > >> 437: #endif >> 438: #define POSIX_ONLY(code) code >> 439: #define NOT_POSIX(code) > > Not sure this makes sense. At the moment only Windows is not-POSIX. If in the future we had another non-POSIX platform then I find it very unlikely that it would use the same code as Windows. So I would not like to see NOT_POSIX used incorrectly where we really only mean WINDOWS. My intention was exactly that, using it to protect *Posix* code from compiling on non-Posix platforms. Using !Windows there is just as wrong and not as descriptive. Most !Windows cases in hotspot are not affected by this since they have nothing to do with Posix; but there are a few and there may be more. E.g. I am currently working on a patch to protect the program break on Unices. 
This means dealing with sbrk, and depending on which route I go a POSIX_ONLY macro would make perfect sense. That POSIX is defined as !Windows is only incidental. If you prefer, I can list the posix operating systems here; I did not do that for brevity. ------------- PR: https://git.openjdk.java.net/jdk/pull/2810 From stuefe at openjdk.java.net Mon Mar 8 06:12:12 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 8 Mar 2021 06:12:12 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms [v3] In-Reply-To: References: Message-ID: On Mon, 8 Mar 2021 05:37:37 GMT, David Holmes wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> tidy up > > src/hotspot/share/runtime/thread.hpp line 826: > >> 824: static void SpinRelease(volatile int * Lock); >> 825: >> 826: #ifdef POSIX > > I hate seeing this in shared code. It really belongs in platform-dependent thread code - but osThread_posix doesn't exist. :( > > But do we need this - can't the existing use_vfork_if_available parameter instead be renamed and interpreted as not_in_a_signal_handler? But would having this functionality not make sense? We always argue about what is and is not allowed during signal handling; here, we may get a mechanism to actually assert it and e.g. prevent using malloc() or logging or whatever VM functionality may creep in. But I will revert my last commit completely and re-add the needs-vfork parameter. This was supposed to be mainly a cleanup; maybe it was wrong to mix in behavioral changes. 
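A note on the re-entry question raised in the review above: one way to keep such a mark consistent when a handler is entered again while a mark is already active is to count nesting depth instead of setting a boolean. The following is only a minimal sketch under assumptions, not the code from the patch. The names InSignalHandlerMarkSketch, g_signal_handler_depth, is_in_signal_handler() and demo_nested_marks() are hypothetical, and the depth lives in a thread_local variable purely for illustration (the patch keeps its marker in Thread, and whether a thread_local is safe to touch from a real signal handler depends on the platform).

```cpp
#include <cassert>

// Hypothetical sketch, not HotSpot code.
// Per-thread count of signal handler activations currently on the stack.
static thread_local int g_signal_handler_depth = 0;

// RAII mark: create one at the top of a signal handler. Because it counts
// instead of setting a flag, a nested (re-entered) handler leaving its scope
// does not clear the state of the outer handler.
class InSignalHandlerMarkSketch {
 public:
  InSignalHandlerMarkSketch()  { ++g_signal_handler_depth; }
  ~InSignalHandlerMarkSketch() { --g_signal_handler_depth; }
  InSignalHandlerMarkSketch(const InSignalHandlerMarkSketch&) = delete;
  InSignalHandlerMarkSketch& operator=(const InSignalHandlerMarkSketch&) = delete;
};

// True while at least one mark is active on the current thread.
static bool is_in_signal_handler() {
  return g_signal_handler_depth > 0;
}

// Simulates an outer handler that is re-entered once, and checks the
// reported state at every step. Returns true if all checks hold.
static bool demo_nested_marks() {
  bool ok = !is_in_signal_handler();
  {
    InSignalHandlerMarkSketch outer;
    ok = ok && is_in_signal_handler();
    {
      InSignalHandlerMarkSketch inner;  // simulated re-entry
      ok = ok && is_in_signal_handler();
    }
    ok = ok && is_in_signal_handler();  // outer mark is still active
  }
  return ok && !is_in_signal_handler();
}
```

With something along these lines, a decision such as "use vfork only outside of signal handling" could query is_in_signal_handler() without caring how deeply handlers are nested.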
------------- PR: https://git.openjdk.java.net/jdk/pull/2810 From david.holmes at oracle.com Mon Mar 8 06:26:40 2021 From: david.holmes at oracle.com (David Holmes) Date: Mon, 8 Mar 2021 16:26:40 +1000 Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms [v3] In-Reply-To: References: Message-ID: On 8/03/2021 4:07 pm, Thomas Stuefe wrote: > On Mon, 8 Mar 2021 05:24:36 GMT, David Holmes wrote: > >>> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >>> >>> tidy up >> >> src/hotspot/share/utilities/macros.hpp line 439: >> >>> 437: #endif >>> 438: #define POSIX_ONLY(code) code >>> 439: #define NOT_POSIX(code) >> >> Not sure this makes sense. At the moment only Windows is not-POSIX. If in the future we had another non-POSIX platform then I find it very unlikely that it would use the same code as Windows. So I would not like to see NOT_POSIX used incorrectly where we really only mean WINDOWS. > > My intention was exactly that, using it to protect *Posix* code from compiling on non-Posix platforms. Using !Windows there is just as wrong and not as descriptive. Most !Windows cases in hotspot are not affected by this since they have nothing to do with Posix; but there are a few and there may be more. E.g. I am currently working on a patch to protect the program break on Unices. This means dealing with sbrk, and depending on which route I go a POSIX_ONLY macro would make perfect sense. POSIX_ONLY is fine - as you say that categorises things correctly as opposed to !WINDOWS. My issue is with defining NOT_POSIX - that will likely be misused. Sorry it wasn't clear. David > That POSIX is defined as !Windows is only incidental. If you prefer, I can list the posix operating systems here; I did not do that for brevity. 
>
> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/2810
>

From david.holmes at oracle.com Mon Mar 8 06:29:41 2021
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 8 Mar 2021 16:29:41 +1000
Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms [v3]
In-Reply-To:
References:
Message-ID:

On 8/03/2021 4:12 pm, Thomas Stuefe wrote:
> On Mon, 8 Mar 2021 05:37:37 GMT, David Holmes wrote:
>
>>> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
>>>
>>> tidy up
>>
>> src/hotspot/share/runtime/thread.hpp line 826:
>>
>>> 824: static void SpinRelease(volatile int * Lock);
>>> 825:
>>> 826: #ifdef POSIX
>>
>> I hate seeing this in shared code. It really belongs in platform-dependent thread code - but osThread_posix doesn't exist. :(
>>
>> But do we need this - can't the existing use_vfork_if_available parameter instead be renamed and interpreted as not_in_a_signal_handler?
>
> But would having this functionality not make sense? We always argue about what is and is not allowed during signal handling; here, we may get a mechanism to actually assert it and e.g. prevent using malloc() or logging or whatever VM functionality may creep in.

I agree having IsInSignalHandler may be generally useful; my issue was with how you actually did it - in the shared code - and so a simpler solution for the current case would just be to reuse the existing parameter.

>
> But I will revert my last commit completely and re-add the needs-vfork parameter. This was supposed to be mainly a cleanup; maybe it was wrong to mix in behavioral changes.

True. Do the refactoring then look at enhancements later. Sorry for making this harder.
Cheers, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/2810 > From stuefe at openjdk.java.net Mon Mar 8 06:53:07 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 8 Mar 2021 06:53:07 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms [v3] In-Reply-To: References: Message-ID: On Mon, 8 Mar 2021 05:38:40 GMT, David Holmes wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> tidy up > > Hi Thomas, > > I appreciate the reworking to avoid vfork for the signal case, but have some concerns with the mechanism and I'm not sure a new mechanism is really needed. See comments below. > > Thanks, > David > _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ > > On 8/03/2021 4:12 pm, Thomas Stuefe wrote: > > > On Mon, 8 Mar 2021 05:37:37 GMT, David Holmes wrote: > > > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > > > tidy up > > > > > > > > > src/hotspot/share/runtime/thread.hpp line 826: > > > > 824: static void SpinRelease(volatile int * Lock); > > > > 825: > > > > 826: #ifdef POSIX > > > > > > > > > I hate seeing this in shared code. It really belongs in platform-dependent thread code - but osThread_posix doesn't exist. :( > > > But do we need this - can't the existing use_vfork_if_available parameter instead be renamed and interpreted as not_in_a_signal_handler? > > > > > > But would having this functionality not make sense? We always argue about what is and is not allowed during signal handling; here, we may get a mechanism to actually assert it and e.g. prevent using malloc() or logging or whatever VM functionality may creep in. 
> > I agree having IsInSignalHandler may be generally useful, my issue was > with how you actually did it - in the shared code - and so simpler > solution for the currnet case would just be to reuse the existing para,eter. > > > But I will revert my last commit completely and re-add the needs-vfork parameter. This was supposed to be mainly a cleanup; maybe it was wrong to mix in behavioral changes. > > True. Do the refactoring then look at enhancements later. Sorry for > making this harder. > Oh no problem, that's what reviews are for. There will be occasion enough for behavioral change RFEs later. Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/2810 From dongbo at openjdk.java.net Mon Mar 8 07:26:08 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Mon, 8 Mar 2021 07:26:08 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> Message-ID: <-6qdl0SynMQ7vx-KT68Vgv7hmxq8bcBu89vfPedpYX8=.0ee3210e-efb6-44b8-ac1c-2c8ab5f53b0e@github.com> On Fri, 5 Mar 2021 11:53:07 GMT, Andrew Dinn wrote: >>> For us, we still have servers used by our customers that does not support LSE extension. >>> Hm, from our point of view, ldaxr+stlxr+dmb and ldxr+stlxr+dmb provide the same order semantics. >>> The acquire are used to ensure all loads/stores that are after an ldaxr (actually loads/stores after the dmb of atomic_*default*_impl in this case) in program order, while the dmb has already guaranteed this for us. >>> Without the acquire, the loads/stores after the atomic operations still can not pass the dmb. >> Remove the acquire does not change the order between preceding loads/stores and stlxr. >> >> I agree that the code will still be correct if you change the ldaxr to ldar. 
While this may make some difference on machines which do not support LSE I would not expect it to be significant for anything other than a very carefully crafted benchmark or an extremely specialized parallel algorithm. Is this change request motivated by an actual real-world use case? > > Correction: > > I agree that the code will still be correct if you change the ldaxr to *ldxr*. > > Hm, from our point of view, ldaxr+stlxr+dmb and ldxr+stlxr+dmb provide the same order semantics. > > The acquire are used to ensure all loads/stores that are after an ldaxr (actually loads/stores after the dmb of atomic__default__impl in this case) in program order, while the dmb has already guaranteed this for us. > > Without the acquire, the loads/stores after the atomic operations still can not pass the dmb. > > Remove the acquire does not change the order between preceding loads/stores and stlxr. > > I agree that the code will still be correct if you change the ldaxr to ldar. While this may make some difference on machines which do not support LSE I would not expect it to be significant for anything other than a very carefully crafted benchmark or an extremely specialized parallel algorithm. Is this change request motivated by an actual real-world use case? Thanks for the comments. We only witnessed ~4% improvements with C code below on one of our platform. The load-acquire is implemented as a full-barrier by the core in fact. 
// highly-contended fetch_and_add test, 3.96% improvement with 8 threads
// ITER and THREADS are constants whose definitions were not part of the mail.
#include <pthread.h>
#include <stdlib.h>

unsigned long res = 0;
unsigned long sum = 0;  // result sink only; updated without synchronization
extern unsigned long aarch64_atomic_fetch_add_8_default_impl(void *ptr, unsigned long val);

void *executor(void *arg) {
  for (int i = 0; i < ITER; i++) {
    sum += aarch64_atomic_fetch_add_8_default_impl(&res, 1);
  }
  return NULL;  // the original fell off the end of a non-void function
}

int main(int argc, char **args) {
  int i;
  pthread_t exethreads[THREADS];
  // thread count comes from the command line, capped at the array size
  int threads = (argc > 1) ? atoi(args[1]) : THREADS;
  if (threads > THREADS) threads = THREADS;
  for (i = 0; i < threads; i++)
    pthread_create(&exethreads[i], NULL, executor, NULL);
  for (i = 0; i < threads; i++)
    pthread_join(exethreads[i], NULL);
  return 0;
}

While we didn't have any noticeable improvements with JAVA tests we tried, e.g. `test/micro/org/openjdk/bench/vm/gc/Alloc.java`. Seems the percentages of the atomics are too low, not to mention the actual real-world use case, e.g. Spark, Tomcat. I guess we do not have to change the `ldaxr` to `ldxr` now, due to we haven't seen any significant performance enhancements yet. :-) But I feel a little inconsistent that we have code use the stronger semantics, while a weaker instruction can still provide the correct order semantics. ------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From rehn at openjdk.java.net Mon Mar 8 08:11:25 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 8 Mar 2021 08:11:25 GMT Subject: RFR: 8262443: GenerateOopMap::do_interpretation can spin for a long time. [v2] In-Reply-To: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> References: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> Message-ID: > With Safepoint/Handshake timeout enabled in rare cases this methods spins for a long time, blocking safepoints/handshakes, so timeout (with a long delay) is triggered. > > In some cases we are in native while executing this method and in some in vm. > That's why there is an check for state in vm.
> > Tested with other changes in t-1-7 this specific case of timeout is no longer an issue. > This change-set passes T1 stand alone. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Comment, local JavaThread variable - Merge branch 'master' into 8262443-gen-oop-map - Go to blocked when loop ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2742/files - new: https://git.openjdk.java.net/jdk/pull/2742/files/3e4fc18b..bdb68d24 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2742&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2742&range=00-01 Stats: 9244 lines in 328 files changed: 5992 ins; 1967 del; 1285 mod Patch: https://git.openjdk.java.net/jdk/pull/2742.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2742/head:pull/2742 PR: https://git.openjdk.java.net/jdk/pull/2742 From rehn at openjdk.java.net Mon Mar 8 08:21:10 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 8 Mar 2021 08:21:10 GMT Subject: RFR: 8262443: GenerateOopMap::do_interpretation can spin for a long time. [v2] In-Reply-To: References: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> Message-ID: On Fri, 26 Feb 2021 23:20:54 GMT, Coleen Phillimore wrote: > This seems legit. This can be called by the compiler thread while in native, or during GC or during the rewriter if jsr/ret is found. I assume the last case is what you observed? There are several code paths here, e.g. from has_balanced_monitors() is all those during jsr/ret? I don't know. Thanks for having a look, @coleenp, @dholmes-ora, @dcubed-ojdk. Waiting for Dan to approve. 
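For readers without the HotSpot sources at hand, the shape of the change discussed in this thread can be pictured with a standalone analogue. This is only a sketch in plain C++ and not the actual patch: `ThreadState`, `run_until_stable`, `step` and `poll` are hypothetical names, and a simple callback stands in for the `ThreadBlockInVM` transition. What it tries to show is that the poll is skipped on the first pass and only happens for a thread that is in the VM, since a thread in native does not hold up a safepoint.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins for HotSpot thread states; only the two states
// mentioned in the thread are modelled.
enum class ThreadState { in_native, in_vm };

// Repeats `step` until it reports that no further pass is needed.
// On every pass after the first, a thread that is "in VM" calls `poll`,
// which is where a real VM would honor a pending safepoint/handshake.
// Returns the number of passes executed.
int run_until_stable(ThreadState state,
                     const std::function<bool()>& step,
                     const std::function<void()>& poll) {
  assert(step && poll);
  int passes = 0;
  bool again;
  do {
    if (passes != 0 && state == ThreadState::in_vm) {
      poll();  // analogue of constructing a ThreadBlockInVM
    }
    again = step();
    ++passes;
  } while (again);
  return passes;
}
```

With a `step` that needs three passes, an in-VM caller would poll twice and an in-native caller not at all, so the common single-pass case pays nothing extra.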
------------- PR: https://git.openjdk.java.net/jdk/pull/2742 From rehn at openjdk.java.net Mon Mar 8 08:21:11 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 8 Mar 2021 08:21:11 GMT Subject: RFR: 8262443: GenerateOopMap::do_interpretation can spin for a long time. [v2] In-Reply-To: References: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> Message-ID: On Mon, 1 Mar 2021 21:03:53 GMT, Daniel D. Daugherty wrote: >> src/hotspot/share/oops/generateOopMap.cpp line 918: >> >>> 916: ThreadBlockInVM tbivm(thread->as_Java_thread()); >>> 917: } >>> 918: } >> >> Can you add a comment as to why this is necessary please. > > Perhaps something like this above L916: > > // Since this JavaThread has looped at least once and is _thread_in_vm, > // we honor any pending blocking request. Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/2742 From rehn at openjdk.java.net Mon Mar 8 08:21:13 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 8 Mar 2021 08:21:13 GMT Subject: RFR: 8262443: GenerateOopMap::do_interpretation can spin for a long time. [v2] In-Reply-To: References: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> Message-ID: On Mon, 1 Mar 2021 20:58:40 GMT, Daniel D. Daugherty wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: >> >> - Comment, local JavaThread variable >> - Merge branch 'master' into 8262443-gen-oop-map >> - Go to blocked when loop > > src/hotspot/share/oops/generateOopMap.cpp line 914: > >> 912: int i = 0; >> 913: do { >> 914: if (i != 0 && thread->is_Java_thread()) { > > Perhaps add: > > `JavaThread* jt = thread->as_Java_thread();` > > and use it twice below: Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/2742 From stuefe at openjdk.java.net Mon Mar 8 08:34:23 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 8 Mar 2021 08:34:23 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms [v4] In-Reply-To: References: Message-ID: <669DggxCocAwvx_r9xqKCYlqKl81l4HJBSRnpLfcKus=.da682648-5514-4b5d-a4c7-b9392f35cde0@github.com> > `os::fork_and_exec()` can be used from within the hotspot to start a child process. It is only called in fatal situations, in two cases: > a) to automatically start a debugger when ShowMessageBoxOnError is specified (uses *fork*()) > b) to start a caller provided binary on OOM if -XX:OnOutOfMemoryError is specified (uses *vfork*()) > > The variants for AIX, Linux, Bsd are almost completely identical. So, this function can be unified under posix. > > In addition to that, this patch does a number of small changes: > > 1) Before, whether we would vfork() only on Linux and only for case (b). I changed this to always use vfork unconditionally, on all platforms, because: > - even though vfork() can be unsafe, the way we use it - calling vfork()->exec()->_exit() with no intermediate steps - is safe. > - Using vfork is good for OOM situations on all platforms, not just Linux, and also for starting the debugger in non-OOM cases. Keep in mind that we do this only for cases where the parent VM is about to die, so even if it were unsafe, the damage would be limited. 
> 2) I added a comment to the function to not use it outside of fatal error situations. > 3) I added a posix wrapper for getting the environ pointer, to hide MacOS specifics, and used it in two places to unify that coding. > 4) consistently used global scope :: for posix APIs. > > Note that if we wanted to make os::fork_and_exec() a first class function, always safe to use, we should modify it to at least not leak any parent process file descriptors. Possibly safest would be to completely rewrite this function and use posix_spawn(). posix_spawn() we use in Runtime.exec() by default since JDK 13 (1). But as long as this is spawned by only dying VMs I think this function is fine. > > ---- > > Tests: GAs, manual tests using -XX:ShowMessageBoxOnError Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: - Restore use-vfork parameter - Revert last commit ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2810/files - new: https://git.openjdk.java.net/jdk/pull/2810/files/aa03cf40..8d30ba1e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2810&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2810&range=02-03 Stats: 57 lines in 8 files changed: 3 ins; 46 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/2810.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2810/head:pull/2810 PR: https://git.openjdk.java.net/jdk/pull/2810 From stuefe at openjdk.java.net Mon Mar 8 08:34:23 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 8 Mar 2021 08:34:23 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms [v3] In-Reply-To: References: Message-ID: On Mon, 8 Mar 2021 06:50:18 GMT, Thomas Stuefe wrote: >> Hi Thomas, >> >> I appreciate the reworking to avoid vfork for the signal case, but have some concerns with the mechanism and I'm not sure a new mechanism is really needed. See comments below. 
>> >> Thanks, >> David > >> _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >> >> On 8/03/2021 4:12 pm, Thomas Stuefe wrote: >> >> > On Mon, 8 Mar 2021 05:37:37 GMT, David Holmes wrote: >> > > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> > > > tidy up >> > > >> > > >> > > src/hotspot/share/runtime/thread.hpp line 826: >> > > > 824: static void SpinRelease(volatile int * Lock); >> > > > 825: >> > > > 826: #ifdef POSIX >> > > >> > > >> > > I hate seeing this in shared code. It really belongs in platform-dependent thread code - but osThread_posix doesn't exist. :( >> > > But do we need this - can't the existing use_vfork_if_available parameter instead be renamed and interpreted as not_in_a_signal_handler? >> > >> > >> > But would having this functionality not make sense? We always argue about what is and is not allowed during signal handling; here, we may get a mechanism to actually assert it and e.g. prevent using malloc() or logging or whatever VM functionality may creep in. >> >> I agree having IsInSignalHandler may be generally useful, my issue was >> with how you actually did it - in the shared code - and so a simpler >> solution for the current case would just be to reuse the existing parameter. >> >> > But I will revert my last commit completely and re-add the needs-vfork parameter. This was supposed to be mainly a cleanup; maybe it was wrong to mix in behavioral changes. >> >> True. Do the refactoring then look at enhancements later. Sorry for >> making this harder. >> > > Oh no problem, that's what reviews are for. There will be occasion enough for behavioral change RFEs later.
> > Cheers, Thomas New version; restored the vfork control boolean parameter; the only behavioral change to before is that on AIX we always use vfork since this is fine for AIX, and we have no overcommit there so its more urgent in case of OOMs. ------------- PR: https://git.openjdk.java.net/jdk/pull/2810 From adinn at redhat.com Mon Mar 8 09:49:18 2021 From: adinn at redhat.com (Andrew Dinn) Date: Mon, 8 Mar 2021 09:49:18 +0000 Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: <-6qdl0SynMQ7vx-KT68Vgv7hmxq8bcBu89vfPedpYX8=.0ee3210e-efb6-44b8-ac1c-2c8ab5f53b0e@github.com> References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> <-6qdl0SynMQ7vx-KT68Vgv7hmxq8bcBu89vfPedpYX8=.0ee3210e-efb6-44b8-ac1c-2c8ab5f53b0e@github.com> Message-ID: On 08/03/2021 07:26, Dong Bo wrote: > While we didn't have any noticeable improvements with JAVA tests we > tried, e.g. `test/micro/org/openjdk/bench/vm/gc/Alloc.java`. Seems > the percentages of the atomics are too low, not to mention the actual > real-world use case, e.g. Spark, Tomcat. > > I guess we do not have to change the `ldaxr` to `ldxr` now, due to we > haven't seen any significant performance enhancements yet. :-) But I > feel a little inconsistent that we have code use the stronger > semantics, while a weaker instruction can still provide the correct > order semantics. The outcome reflects my expectation that any program for which this did make a difference would be a very unusual one (it would have to be something like a heavily parallel spending most of its time doing concurrent updates of a shared structure and employing a lock-free model with a lot of potential for update contention). I believe it would be better in future to perform investigations when there is a real world use case indicating there is a problem to be fixed and that the fix will make a difference. 
Looking into this change has taken quite a lot of your time and a lot of reviewer time. That time could probably have been better spent looking into other things. regards, Andrew Dinn ----------- Red Hat Distinguished Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill From coleenp at openjdk.java.net Mon Mar 8 13:38:09 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 8 Mar 2021 13:38:09 GMT Subject: RFR: 8262443: GenerateOopMap::do_interpretation can spin for a long time. [v2] In-Reply-To: References: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> Message-ID: On Mon, 8 Mar 2021 08:11:25 GMT, Robbin Ehn wrote: >> With Safepoint/Handshake timeout enabled in rare cases this methods spins for a long time, blocking safepoints/handshakes, so timeout (with a long delay) is triggered. >> >> In some cases we are in native while executing this method and in some in vm. >> That's why there is an check for state in vm. >> >> Tested with other changes in t-1-7 this specific case of timeout is no longer an issue. >> This change-set passes T1 stand alone. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Comment, local JavaThread variable > - Merge branch 'master' into 8262443-gen-oop-map > - Go to blocked when loop Still looks good. ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2742 From coleenp at openjdk.java.net Mon Mar 8 13:54:08 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 8 Mar 2021 13:54:08 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v10] In-Reply-To: References: Message-ID: On Thu, 4 Mar 2021 21:52:18 GMT, Mikhailo Seledtsov wrote: >> This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. >> >> Here is what I did so far: >> - created a UnitTestThread and a main test runner, based on gtests with similar needs >> - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) >> to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. >> - removed invocations from whitebox.cpp >> >> Testing: >> - ran GTestWrapper on usual platforms - All PASS >> - ensured that ReservedSpaceConcurrent is in the logs and passed >> >> After gathering the feedback my plan is: >> Plan: >> - move the remaining internal Memory/VirtualSpace tests into a gTest >> - I am thinking about using separate files for each test >> - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code > > Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: > > Using C_HEAP_ARRAY macros plus a minor fix Marked as reviewed by coleenp (Reviewer). test/hotspot/gtest/concurrentTestRunner.inline.hpp line 75: > 73: Semaphore done(0); > 74: > 75: UnitTestThread** t = NEW_C_HEAP_ARRAY(UnitTestThread*, nrOfThreads, mtInternal); Looks good! 
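The runner under review starts several threads that each execute the same test body and then waits for all of them to finish. A rough standalone illustration, assuming plain `std::thread` rather than HotSpot's test-thread and `Semaphore` machinery, might look like the following; `run_concurrently` and its parameters are hypothetical names and are not taken from the pull request.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Runs `body` on `nr_of_threads` threads at the same time and returns once
// every thread has finished. Each thread calls `body` `iterations` times and
// passes its own index, so callers can keep per-thread state without locking.
void run_concurrently(int nr_of_threads, int iterations,
                      const std::function<void(int)>& body) {
  assert(nr_of_threads > 0 && iterations > 0);
  std::vector<std::thread> threads;
  threads.reserve(nr_of_threads);
  for (int t = 0; t < nr_of_threads; t++) {
    threads.emplace_back([&body, iterations, t] {
      for (int i = 0; i < iterations; i++) {
        body(t);
      }
    });
  }
  for (std::thread& th : threads) {
    th.join();  // plays the role of waiting on the "done" semaphore
  }
}
```

Joining every thread before returning keeps the lifetime of `body` and of any captured state trivially safe, which is the main reason this sketch avoids detached threads.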
------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From eosterlund at openjdk.java.net Mon Mar 8 14:06:07 2021 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Mon, 8 Mar 2021 14:06:07 GMT Subject: RFR: 8259643: ZGC can return metaspace OOM prematurely [v2] In-Reply-To: References: Message-ID: On Wed, 3 Feb 2021 16:18:53 GMT, Thomas Stuefe wrote: > Hi Erik, > > sorry for the delay, I got swamped. Hi Thomas, Sorry I also got swamped and still am swamped. Hope it is okay if I come back to this a bit later; there are higher priority items for me to deal with at the moment. Thanks -- Erik ------------- PR: https://git.openjdk.java.net/jdk/pull/2289 From stuefe at openjdk.java.net Mon Mar 8 14:43:08 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 8 Mar 2021 14:43:08 GMT Subject: RFR: 8259643: ZGC can return metaspace OOM prematurely [v2] In-Reply-To: References: Message-ID: On Thu, 28 Jan 2021 13:24:10 GMT, Erik Österlund wrote: >> There exists a race condition for ZGC metaspace allocations, where an allocation can throw OOM due to unbounded starvation from other threads. Towards the end of the allocation dance, we conceptually do this: >> >> 1. full_gc() >> 2. final_allocation_attempt() >> >> And if we still fail at 2 after doing a full GC, we conclude that there isn't enough metaspace memory. However, if the thread gets preempted between 1 and 2, then an unbounded number of metaspace allocations from other threads can fill up the entire metaspace, making the final allocation attempt fail and hence throw. This can cause a situation where almost the entire metaspace is unreachable from roots, yet we throw OOM. I managed to reproduce this with the right sleeps. >> >> The way we deal with this particular issue for heap allocations, is to have an allocation request queue, and satisfy those allocations before others, preventing starvation.
My solution to this metaspace OOM problem will be to basically do exactly that - have a queue of "critical" allocations, that get precedence over normal metaspace allocations. >> >> The solution should work for other concurrent GCs (who likely have the same issue), but I only tried this with ZGC, so I am only hooking in ZGC to the new API (for concurrently unloading GCs to manage critical metaspace allocations) at this point. >> >> Passes ZGC tests from tier 1-5, and the particular test that failed (with the JVM sleeps that make it fail deterministically). > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > polish code alignment and rename register/unregister to add/remove Marked as reviewed by stuefe (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2289 From stuefe at openjdk.java.net Mon Mar 8 14:43:09 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 8 Mar 2021 14:43:09 GMT Subject: RFR: 8259643: ZGC can return metaspace OOM prematurely [v2] In-Reply-To: References: Message-ID: On Mon, 8 Mar 2021 14:03:18 GMT, Erik Österlund wrote: >> Hi Erik, >> >> sorry for the delay, I got swamped. >> >> thanks for your patient explanations. I think I understand most of it now, but I still have a number of questions. Also see the code remarks, though most are just more questions. >> >>> > One issue with your patch just came to me: the block-on-allocate may be too early. `Metaspace::allocate()` is a bit hot. I wonder about the performance impact of pulling and releasing a lock on each atomar allocation, even if its uncontended. Ideally I'd like to keep this path as close to a simple pointer bump allocation as possible (which it isn't unfortunately). >>> >>> I have a global flag that denotes there being at least 1 critical allocation in the system. It is set by the first critical allocation, and cleared by the GC if all critical allocations could be satisfied.
The global lock is only taken in Metaspace::allocate() if said flag is on. Normal apps should never see this flag being on. So the only overhead in the common case is to check the flag and see that no locking is required. I think that should be fast enough, right? And when you are in the critical mode, you obviously have more to worry about than lock contention, in terms of how the system is performing. >> >> I overlooked the flag check. I think this is fine then in its current form. >> >>> > >>> > I think I get this now. IIUC we have the problem that memory release happens delayed, and we carry around a baggage of "potentially free" memory which needs a collector run to materialize. So many threads jumping up and down and doing class loading and unloading drive up metaspace use rate and increase the "potentially free" overhead, right? So the thing is to time collector runs right. >>> >>> Exactly. >>> >>> > One concern I have is that if the customer runs with too tight a limit, we may bounce from full GC to full GC. Always scraping the barrel enough to just keep going - maybe collecting some short lived loaders - but not enough to get the VM into clear waters. I think this may be an issue today already. What is unclear to me is when it would be just better to give up and throw an OOM. To motivate the customer to increase the limits. >>> >>> Right - this is a bit of a philosophical one. There is always a balance there between precision, code complexity, and when to put the VM out of its misery when it is performing poorly. We deal with the same trade-off really with heap allocations, which is why I am also solving the starvation problem in pretty much the same way: with a queue satisfied by the GC, and locking out starvation. Then we throw OOM in fairly similar conditions. 
What they have in common is that when they throw, we will have a live portion of metaspace that is "pretty much" full, and there is no point in continuing, while allowing unfortunate timings on a less full (in terms of temporal liveness) metaspace to always succeed. >>> >>> One might argue that the trade-off should be moved in some direction, and that it is okay for it to be more or less exact, but I was hoping that by doing the same dance that heap OOM situations do, we can at least follow a trade-off that is pretty well established and has worked pretty well for heap OOM situations for many years. And I think heap OOM situations in the wild are far more common than metaspace OOM, so I don't think that the metaspace OOM mechanism needs to do better than what the heap OOM mechanism does. If that makes sense. >> >> It makes sense. If its an established proven pattern lets use it here too. Its not that complex. >> >> I think there are differences between heap and metaspace in elasticity. Metaspace is way less "spongy", chance of recovering metaspace is slimmer than with the heap, so a Full GC is more expensive in relation to its effects. I think I have seen series of Full GCs on quite empty heaps when customers set MaxMetaspaceSize too low (people seem to like doing that). I'm worried about small loaders with short lifetimes which just allow the VM to reclaim enough to keep going. Reflection invocation loaders, or projects like jruby. But I think this is not the norm, nor does it have anything to do with your patch. Typically we do just one futile GC and then throw OOM. >> >>> > >>> > Why not just cover the whole synchronous GC collect call? I'd put that barrier up as early as possible, to prevent as many threads as possible from entering the more expensive fail path. At that point we know we are near exhaustion. Any thread allocating could just as well wait inside MetaspaceArena::allocate. 
If the GC succeeds in releasing lots of memory, they will not have been disturbed much. >>> >>> Do you mean >>> >>> 1. why I don't hold the MetaspaceCritical_lock across the collect() call at the mutator side of the code, or >>> 2. why I don't hold the MetaspaceCritical_lock across the entire GC cycle instead of purge? >>> >>> I think you meant 2), so will answer that: >>> a) Keeping the lock across the entire GC cycle is rather problematic when it comes to constraining lock ranks. It would have to be above any lock ever needed in an entire GC cycle, yet we take a bunch of other locks that mutators hold at different point, interacting with class unloading. It would be very hard to find the right rank for this. >>> b) The GC goes in and out of safepoints, and needs to sometimes block out safepoints. Holding locks in and out of safepoints while blocking in and out safepoints, is in my experience rather deadlock prone. >>> c) There is no need to keep the lock across more than the operation that frees metaspace memory, which in a concurrent GC always happens when safepoints are blocked out. If a mutator succeeds during a concurrent GC due to finding memory in a local free list or something, while another allocation failed and needs a critical allocation, then that is absolutely fine, as the successful allocation is never what causes the failing allocation to fail. >> >> Thanks, that makes sense. I was completely offtrack, thinking more in direction of (1), wrt the current coding, not your patch. I thought if the mutator thread locks across the collect call and the subsequent allocation attempt (which would have to be some sort of priority allocation, ignoring the lock) this would be a simple solution. But I think that has a number of holes, never mind the allocation request ordering. >> >>> >>> > > > Why do we even need a queue? 
Why could we not just let the first thread attempting a synchronous gc block metaspace allocation path for all threads, including others running into a limit, until the gc is finished and it had its first-allocation-right served? >>> > > >>> > > >>> > > Each "critical" allocation rides on one particular GC cycle, that denotes the make-or-break point of the allocation. >>> > >>> > >>> > I feel like I should know this, but if multiple threads enter satisfy_failed_metadata_allocation around the same time and call a synchronous collect(), they would wait on the same GC, right? They won't start individual GCs for each thread? >>> >>> The rule we have in ZGC to ensure what we want in terms of OOM situations, is that GCs that are _requested_ before an equivalent GC _starts_, can be satisfied with the same GC cycle. >>> >>> Let's go through what will likely happen in practice with my solution when you have, let's say 1000 concurrent calls to satisfy a failing metaspace allocation. >>> >>> 1. Thread 1 registers its allocation, sees it is the first one, and starts a metaspace GC. >>> 2. GC starts running >>> 3. Threads 2-999 register their allocations, and see that there was already a critical allocation before them. This causes them to wait for the GC to purge, opportunistically, riding on that first GC. >>> 4. The GC satisfies allocations. For the sake of example, let's say that allocations 1-500 could be satisfied, but not the rest. >>> 5. Threads 2-500 who were waiting for purge to finish, wake up, and run off happily with their satisfied allocations. >>> 6. Threads 501-1000 wake up seeing that their allocations didn't get satisfied. They now stop being opportunistic, and request a gc each before finally giving up. >>> 7. The first GC cycle finishes. >>> 8. Thread 1 wakes up after the entire first GC cycle is done and sees its satisfied allocation, happily running off with it. >>> 9. The next GC cycle starts >>> 10. 
The next GC cycle successfully satisfies metadata allocations for threads 501-750, but not the rest. >>> 11. The entire next GC cycle finishes, satisfying the GC requested by threads 2-1000, as they all _requested_ a metaspace GC, _before_ it started running. Therefore, no further GC will run. >>> 12. Threads 501-750 wake up, happily running off with their satisfied allocations. >>> 13. Threads 751-1000 wake up, grumpy about the lack of memory after their GC. They are all gonna throw. >>> >>> So basically, if they can all get satisfied with 1 GC, then 1 GC will be enough. But we won't throw until a thread has had a full GC run _after_ it was requested, but multiple threads can ride on the same GC there. In this example, threads 2-1000 all ride on the same GC. >> >> This is a nice elegant solution. I had some trouble understanding your explanation: When you write "threads 2-1000 all ride on the same GC" this confused me since threads 2-500 were lucky and were satisfied by the purge in GC cycle 1. So I would have expected threads 501-1000 to ride on the second GC, thread 1-500 on the first. Unless with "ride" you mean "guaranteed to be processed"? >> >>> >>> Note though, that we would never allow a GC that is already running to satisfy a GC request that comes in while the GC is already running, as we then wouldn't catch situations when a thread releases a lot of memory, and then expects it to be available just after. >>> >>> > > In order to prevent starvation, we have to satisfy all critical allocations who have their make-or-break GC cycle associated with the current purge() operation, before we release the lock in purge(), letting new allocations in, or we will rely on luck again. However, among the pending critical allocations, they will all possibly have different make-or-break GC cycles associated with them. So in purge() some of them need to be satisfied, and others do not, yet can happily get their allocations satisfied opportunistically if possible. 
So we need to make sure they are ordered somehow, such that the earliest arriving pending critical allocations are satisfied first, before subsequent critical allocations (possibly waiting for a later GC), or we can get premature OOM situations again, where a thread releases a bunch of memory, expecting to be able to allocate, yet fails due to races with various threads. >>> > > The queue basically ensures the ordering of critical allocation satisfaction is sound, so that the pending critical allocations with the associated make-or-break GC being the one running purge(), are satisfied first, before satisfying (opportunistically) other critical allocations, that are really waiting for the next GC to happen. >>> > >>> > >>> > I still don't get why the order of the critical allocations matter. I understand that even with your barrier, multiple threads can fail the initial allocation, enter the "satisfy_failed_metadata_allocation()" path, and now their allocation count as critical since if they fail again they will throw an OOM. But why does the order of the critical allocations among themselves matter? Why not just let the critical allocations trickle out unordered? Is the relation to the GC cycle not arbitrary? >>> >>> If we don't preserve the order, we would miss situations when a thread releases a large chunk of metaspace (e.g. by releasing a class loader reference), and then expects that memory to be available. An allocation from a critical allocation that is associated with a subsequent GC, could starve a thread that is associated with the current GC cycle, hence causing a premature OOM for that thread, while not really needing that allocation until next GC cycle, while doing it in the right order would satisfy both allocations. >>> >>> One might argue it might be okay to throw in such a scenario with an unordered solution. We are pretty close to running out of memory anyway I guess. 
But I'm not really sure what such a solution would look like in more detail, and thought writing this little list was easy enough, and for me easier to reason about, partially because we do the same dance in the GC to deal with heap OOM, which has been a success. >> >> I think I get this now. Its still a brain teaser if you are not used to the pattern, but I think I got most of it. If the pattern is well established and proofed it makes sense. >> >> Thanks for the great explanations! >> >>> >>> Thanks, >>> /Erik >>> >>> ^--- I know this will be treated as a PR bot command, but I can't give up on my slash! >>> >> >> Alias it to integrate :) >> >>> >>> > Just FYI, I have very vague plans to extend usage of the metaspace allocator to other areas. To get back the cost of implementation. Eg one candidate may be replacement of the current resource areas, which are just more primitive arena based allocators. This is very vague and not a high priority, but I am a bit interested in keeping the coding generic if its not too much effort. But I don't see your patch causing any problems there. >>> >>> That sounds very interesting. We had internal discussions about riding on the MetaspaceExpand lock which I believe would also work, but thought this really ought to be a separate thing so that we don't tie ourselves too tight to the internal implementation of the allocator. Given that you might want to use the allocator for other stuff, it sounds like that is indeed the right choice. >> >> Yes, I think this is definitely better. The expand lock (I renamed it to be just the Metaspace_lock since its also used for reclamation and other stuff) is used in a more fine granular fashion. I cannot see it working in the same way as the critical lock, protecting the queue and preventing entry for non-priority allocation. >> >> Thanks, Thomas > >> Hi Erik, >> >> sorry for the delay, I got swamped. > > Hi Thomas, > > Sorry I also got swamped and still am swamped. 
Hope it is okay if I come back to this a bit later; there are higher priority items for me to deal with at the moment. > > Thanks > -- Erik Hi Erik, please feel free to commit this, and answer my question at your leisure, if at all. I am fine with your change as it is (now that I understand it :) Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/2289 From jbachorik at openjdk.java.net Mon Mar 8 15:34:23 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 8 Mar 2021 15:34:23 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v10] In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: > The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' depends on each particular GC implementation, but in the worst case it is 'at last full GC') in the form of a periodically emitted JFR event. > > ## Introducing new JFR event > > While there is already a 'GC Heap Summary' JFR event, it does not fit the requirements as it is closely tied to the GC cycle, so e.g. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of having no heap summary events in the JFR recording at all. > Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain information about the heap capacity and the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. > > ## Implementation > > The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness.
The common part is the `size_t live() const` method added to the `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet, the implementation will default to returning the 'used' value. > > The implementations are based on my (rather shallow) knowledge of the inner workings of the respective GC engines and I am open to suggestions to make them better/correct. > > ### Epsilon GC > > Trivial implementation - just return `used()` instead. > > ### Serial GC > > Here we utilize the fact that the mark-copy phase is naturally compacting, so the number of bytes after copy is 'live', and that the mark-sweep implementation keeps internal info about objects being 'dead' but excluded from the compaction effort, and we can use these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). > > ### Parallel GC > > For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). > > ### G1 GC > > Using the `G1ConcurrentMark::remark()` method, the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in the G1 implementation chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. > > ### Shenandoah > > In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context.
> This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. > > ### ZGC > > `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. Jaroslav Bachorik has updated the pull request incrementally with two additional commits since the last revision: - Adjust the deadspace calculation - Minor cleanup ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2579/files - new: https://git.openjdk.java.net/jdk/pull/2579/files/f6954186..f708023b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=09 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=08-09 Stats: 28 lines in 7 files changed: 11 ins; 7 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/2579.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2579/head:pull/2579 PR: https://git.openjdk.java.net/jdk/pull/2579 From dcubed at openjdk.java.net Mon Mar 8 15:34:11 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 8 Mar 2021 15:34:11 GMT Subject: RFR: 8262443: GenerateOopMap::do_interpretation can spin for a long time. [v2] In-Reply-To: References: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> Message-ID: On Mon, 8 Mar 2021 08:11:25 GMT, Robbin Ehn wrote: >> With Safepoint/Handshake timeout enabled in rare cases this methods spins for a long time, blocking safepoints/handshakes, so timeout (with a long delay) is triggered. >> >> In some cases we are in native while executing this method and in some in vm. >> That's why there is an check for state in vm. >> >> Tested with other changes in t-1-7 this specific case of timeout is no longer an issue. >> This change-set passes T1 stand alone. 
> > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Comment, local JavaThread variable > - Merge branch 'master' into 8262443-gen-oop-map > - Go to blocked when loop Thumbs up! ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2742 From rehn at openjdk.java.net Mon Mar 8 16:01:06 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 8 Mar 2021 16:01:06 GMT Subject: RFR: 8262443: GenerateOopMap::do_interpretation can spin for a long time. [v2] In-Reply-To: References: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> Message-ID: On Mon, 8 Mar 2021 13:34:59 GMT, Coleen Phillimore wrote: > Still looks good. Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/2742 From rehn at openjdk.java.net Mon Mar 8 16:01:06 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 8 Mar 2021 16:01:06 GMT Subject: RFR: 8262443: GenerateOopMap::do_interpretation can spin for a long time. [v2] In-Reply-To: References: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> Message-ID: <4fBeXovASFVPWSFrq06er_hIOk7z2neLqyTlEUaxHhc=.917a229d-75e2-4b59-8693-8219f573014d@github.com> On Mon, 8 Mar 2021 15:31:49 GMT, Daniel D. Daugherty wrote: > Thumbs up! Thanks! 
------------- PR: https://git.openjdk.java.net/jdk/pull/2742 From jbachorik at openjdk.java.net Mon Mar 8 17:08:27 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 8 Mar 2021 17:08:27 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v11] In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: > The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' depends on each particular GC implementation, but in the worst case it is 'at last full GC') in the form of a periodically emitted JFR event. > > ## Introducing new JFR event > > While there is already a 'GC Heap Summary' JFR event, it does not fit the requirements as it is closely tied to the GC cycle, so e.g. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of having no heap summary events in the JFR recording at all. > Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain information about the heap capacity and the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. > > ## Implementation > > The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is the `size_t live() const` method added to the `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet, the implementation will default to returning the 'used' value.
> > The implementations are based on my (rather shallow) knowledge of the inner workings of the respective GC engines and I am open to suggestions to make them better/correct. > > ### Epsilon GC > > Trivial implementation - just return `used()` instead. > > ### Serial GC > > Here we utilize the fact that the mark-copy phase is naturally compacting, so the number of bytes after copy is 'live', and that the mark-sweep implementation keeps internal info about objects being 'dead' but excluded from the compaction effort, and we can use these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). > > ### Parallel GC > > For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). > > ### G1 GC > > Using the `G1ConcurrentMark::remark()` method, the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in the G1 implementation chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. > > ### Shenandoah > > In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. > This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to the `ShenandoahHeap::_live` volatile field, which is later read by the event emitting code. > > ### ZGC > > `ZStatHeap` is already holding the liveness info, so this implementation is just making it accessible via the `ZCollectedHeap::live()` method.
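The common part of the description above (a cached live estimate that falls back to `used()` until a GC cycle has produced a value) can be sketched roughly as follows. This is only an illustration under stated assumptions: `SketchHeap`, `finish_gc()` and `kNotComputed` are hypothetical names, not the classes or fields touched by the PR, and the real collectors compute the estimate in their own ways.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a heap; this is NOT HotSpot's CollectedHeap.
class SketchHeap {
 public:
  SketchHeap() : _used(0), _live_estimate(kNotComputed) {}

  std::size_t used() const { return _used; }

  // Returns the cached estimate from the last GC cycle, or used() if no
  // cycle has computed an estimate yet.
  std::size_t live() const {
    std::size_t cached = _live_estimate.load(std::memory_order_relaxed);
    return cached == kNotComputed ? used() : cached;
  }

  // Simulates a mutator allocation; the cached estimate is left untouched,
  // so it goes stale until the next cycle.
  void allocate(std::size_t bytes) { _used += bytes; }

  // Simulates the end of a compacting GC cycle: whatever survived is both
  // the new 'used' value and the new live estimate.
  void finish_gc(std::size_t surviving_bytes) {
    _used = surviving_bytes;
    _live_estimate.store(surviving_bytes, std::memory_order_relaxed);
  }

 private:
  // Sentinel meaning "no estimate computed yet" (an assumption of this sketch).
  static constexpr std::size_t kNotComputed = static_cast<std::size_t>(-1);

  std::size_t _used;
  std::atomic<std::size_t> _live_estimate;
};
```

Note that after a cycle the estimate stays fixed while `used()` keeps growing, which is why the value is only an estimate of the live set as of the last cycle.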
Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Cache live size estimate for memory spaces ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2579/files - new: https://git.openjdk.java.net/jdk/pull/2579/files/f708023b..343e4809 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=10 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=09-10 Stats: 17 lines in 3 files changed: 13 ins; 1 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/2579.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2579/head:pull/2579 PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Mon Mar 8 17:17:11 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 8 Mar 2021 17:17:11 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v9] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Wed, 3 Mar 2021 12:13:43 GMT, Thomas Schatzl wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: >> >> Add tests for the heap usage summary event > > src/hotspot/share/gc/shared/space.hpp line 553: > >> 551: size_t capacity() const { return byte_size(bottom(), end()); } >> 552: size_t used() const { return byte_size(bottom(), top()); } >> 553: size_t live() const { > > The code for serial gc, contrary to others, tries to give some resemblance of tracking actual liveness. I.e. calculating this anew every call to `SerialHeap::live()`. > However if calling an `update_live_estimate()` in parallel and G1 (and the other collectors) is fine at certain places, this should be as good for serial gc. > Doing so would reduce the footprint of this change quite a bit (for serial gc) Ok. I am caching the live estimate per memory space now. 
Not sure how much it will change the footprint of this change, though, but it is good for consistency anyway. > src/hotspot/share/gc/shared/space.inline.hpp line 128: > >> 126: p2i(dead_start), p2i(dead_end), dead_length * HeapWordSize); >> 127: >> 128: _dead_space += dead_length; > > I do not think adding this to the counter here instead of the other method for every object makes a difference performance-wise. > > As mentioned before, `_allowed_deadspace_words` counts *down* from `(space->capacity() * ratio / 100) / HeapWordSize;` to whatever end value. > > So at the end of collection, `(space->capacity() * ratio / 100) / HeapWordSize - _allowed_deadspace_words` should be equal to what `_dead_space` is now. > > Please add a getter to `DeadSpacer` that calculates this (factoring out the calculation of the maximum allowed deadspace). Fixed in https://github.com/openjdk/jdk/pull/2579/commits/f708023b84e33aee34f92d183a0d03d2747646db ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From kvn at openjdk.java.net Mon Mar 8 17:28:07 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 8 Mar 2021 17:28:07 GMT Subject: RFR: 8219555: compiler/jvmci/compilerToVM/IsMatureTest.java fails with Unexpected isMature state for multiple times invoked method: expected false to equal true In-Reply-To: References: Message-ID: <0PnuLERyEbdKgb1YDnmrIGhY8UPYySrhRzQyI9B0HPk=.77cf9a39-7130-4ca8-8c54-a8e57c42f201@github.com> On Sun, 7 Mar 2021 20:50:27 GMT, Igor Veresov wrote: > Cleanup the behavior of the compilation policy with -Xcomp. src/hotspot/share/compiler/compilationPolicy.cpp line 678: > 676: methodHandle max_method_h(Thread::current(), max_method); > 677: > 678: if (max_task != NULL && max_task->comp_level() == CompLevel_full_profile && TieredStopAtLevel > CompLevel_full_profile && What is value of `TieredStopAtLevel` when Tiered is off in case you have only C2 and when you have only C1? 
------------- PR: https://git.openjdk.java.net/jdk/pull/2864 From jbachorik at openjdk.java.net Mon Mar 8 17:29:27 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 8 Mar 2021 17:29:27 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: > The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' depends on each particular GC implementation, but in the worst case it is 'at last full GC') in the form of a periodically emitted JFR event. > > ## Introducing new JFR event > > While there is already a 'GC Heap Summary' JFR event, it does not fit the requirements as it is closely tied to the GC cycle, so e.g. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of having no heap summary events in the JFR recording at all. > Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain information about the heap capacity and the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. > > ## Implementation > > The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is the `size_t live() const` method added to the `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet, the implementation will default to returning the 'used' value.
> > The implementations are based on my (rather shallow) knowledge of the inner workings of the respective GC engines and I am open to suggestions to make them better/correct. > > ### Epsilon GC > > Trivial implementation - just return `used()` instead. > > ### Serial GC > > Here we utilize the fact that the mark-copy phase is naturally compacting, so the number of bytes after copy is 'live', and that the mark-sweep implementation keeps internal info about objects being 'dead' but excluded from the compaction effort, and we can use these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). > > ### Parallel GC > > For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). > > ### G1 GC > > Using the `G1ConcurrentMark::remark()` method, the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in the G1 implementation chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. > > ### Shenandoah > > In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. > This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to the `ShenandoahHeap::_live` volatile field, which is later read by the event emitting code. > > ### ZGC > > `ZStatHeap` is already holding the liveness info, so this implementation is just making it accessible via the `ZCollectedHeap::live()` method.
Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Remove unused field ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2579/files - new: https://git.openjdk.java.net/jdk/pull/2579/files/343e4809..67d78940 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=11 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=10-11 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/2579.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2579/head:pull/2579 PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Mon Mar 8 17:29:30 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 8 Mar 2021 17:29:30 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v9] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Wed, 3 Mar 2021 12:03:01 GMT, Thomas Schatzl wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: >> >> Add tests for the heap usage summary event > > src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 182: > >> 180: G1BlockOffsetTable* _bot; >> 181: >> 182: volatile size_t _live; > > I'm not happy with naming this `_live`, better use `_live_estimate`. The contents are not continuously updated and basically out of date after the first following allocation. > This includes the naming in all other instances too. I see your point - but that would probably lead to renaming `live()` method to `live_estimate()` (to keep the variable and the accessor method in sync) and that would break the nice symmetry we have now with `free()`, `used()` and `live()`. I have no strong feelings about this and if we can get quorum on this change I will do the renaming pass. 
> src/hotspot/share/gc/parallel/parallelScavengeHeap.hpp line 87: > >> 85: >> 86: // in order to provide accurate estimate this method must be called only when the heap has just been collected and compacted >> 87: inline void capture_live(); > > Sentences should start with upper case in the comment. Also I'd prefer to name the method `update_live_estimate()` instead. Done > src/hotspot/share/gc/parallel/psAdaptiveSizePolicy.hpp line 60: > >> 58: class PSAdaptiveSizePolicy : public AdaptiveSizePolicy { >> 59: friend class PSGCAdaptivePolicyCounters; >> 60: friend class ParallelScavengeHeap; > > Delete this apparently unneeded friend declaration (compiled successfully without here) Done > src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 1070: > >> 1068: >> 1069: uint num_selected_for_rebuild() const { return _num_regions_selected_for_rebuild; } >> 1070: size_t live_estimate() const { return _live; } > > Please sync the member name with the getter name. I.e. `_live` -> `_live_estimate` Done > src/hotspot/share/gc/serial/serialHeap.hpp line 44: > >> 42: MemoryPool* _old_pool; >> 43: >> 44: size_t _live_size; > > Please rename to `_live_estimate` like the others. Avoid having different names in different collectors for the same thing. This field is unused. Removed. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Mon Mar 8 17:34:12 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 8 Mar 2021 17:34:12 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v9] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Tue, 2 Mar 2021 17:34:32 GMT, Aleksey Shipilev wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: >> >> Add tests for the heap usage summary event > > src/hotspot/share/gc/shared/space.inline.hpp line 190: > >> 188: oop obj = oop(cur_obj); >> 189: size_t obj_size = obj->size(); >> 190: compact_top = cp->space->forward(obj, obj_size, cp, compact_top); > > This change seems superfluous now. Inline `obj_size` back? Done > src/hotspot/share/gc/shared/space.hpp line 555: > >> 553: size_t live() const { >> 554: return used() - _dead_space; >> 555: } > > Move it a few lines down, so `capacity`, `used`, `live` line up? Ok. I did my best to make the code look nice there :) > src/hotspot/share/gc/shared/collectedHeap.hpp line 218: > >> 216: virtual size_t capacity() const = 0; >> 217: virtual size_t used() const = 0; >> 218: // Returns the estimate of live set size. Because live set changes over time, > > I believe a blank line is in order here, look at other comments in the same header. Done > src/hotspot/share/gc/shared/space.inline.hpp line 90: > >> 88: >> 89: public: >> 90: size_t _dead_space; > > Should this really be "public"? Maybe `friend`-ing with the only user is better? Yep. Befriended. 
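The dead-space bookkeeping discussed in this review thread (an allowance that counts down from `(capacity * ratio / 100) / HeapWordSize`, so that the dead space actually left in place is the maximum minus what remains, and `live` is `used` minus that dead space) might be modelled like this. It is a simplified sketch, not the HotSpot `DeadSpacer`: the class name `DeadSpaceBudget`, its methods and the explicit `word_size` parameter are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified model of a dead-space allowance that counts down.
class DeadSpaceBudget {
 public:
  DeadSpaceBudget(std::size_t capacity_bytes, std::size_t ratio_percent,
                  std::size_t word_size)
      : _max_words(max_allowed_words(capacity_bytes, ratio_percent, word_size)),
        _allowed_words(_max_words) {}

  // Factored-out calculation of the maximum allowed dead space, in words.
  static std::size_t max_allowed_words(std::size_t capacity_bytes,
                                       std::size_t ratio_percent,
                                       std::size_t word_size) {
    return (capacity_bytes * ratio_percent / 100) / word_size;
  }

  // Leaves a dead range in place only while enough allowance remains.
  bool insert_deadspace(std::size_t dead_words) {
    if (dead_words > _allowed_words) {
      return false;
    }
    _allowed_words -= dead_words;
    return true;
  }

  // Dead words left in place so far: maximum allowance minus the remainder.
  std::size_t dead_words() const { return _max_words - _allowed_words; }

 private:
  const std::size_t _max_words;
  std::size_t _allowed_words;
};

// Live estimate for a space, in bytes: used bytes minus the dead bytes that
// were deliberately not compacted away.
inline std::size_t live_bytes(std::size_t used_bytes,
                              const DeadSpaceBudget& budget,
                              std::size_t word_size) {
  return used_bytes - budget.dead_words() * word_size;
}
```

Deriving the dead-space total from the countdown avoids keeping a second counter that has to be updated for every dead range.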
------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From iveresov at openjdk.java.net Mon Mar 8 17:43:29 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Mon, 8 Mar 2021 17:43:29 GMT Subject: RFR: 8219555: compiler/jvmci/compilerToVM/IsMatureTest.java fails with Unexpected isMature state for multiple times invoked method: expected false to equal true [v2] In-Reply-To: <0PnuLERyEbdKgb1YDnmrIGhY8UPYySrhRzQyI9B0HPk=.77cf9a39-7130-4ca8-8c54-a8e57c42f201@github.com> References: <0PnuLERyEbdKgb1YDnmrIGhY8UPYySrhRzQyI9B0HPk=.77cf9a39-7130-4ca8-8c54-a8e57c42f201@github.com> Message-ID: <1XkL1ly1cB2MIZ7nmf-1ln6zPmESqA8FJOeWuQoYNBA=.33ecf139-2b90-4f07-8c61-52dbd869a089@github.com> On Mon, 8 Mar 2021 17:25:23 GMT, Vladimir Kozlov wrote: >> Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix debugging leftover > > src/hotspot/share/compiler/compilationPolicy.cpp line 678: > >> 676: methodHandle max_method_h(Thread::current(), max_method); >> 677: >> 678: if (max_task != NULL && max_task->comp_level() == CompLevel_full_profile && TieredStopAtLevel > CompLevel_full_profile && > > What is value of `TieredStopAtLevel` when Tiered is off in case you have only C2 and when you have only C1? The default (i.e. CompLevel_full_optimization == 4). It doesn't matter though on this particular line, because this in-queue change optimization is for CompLevel_full_profile compilation requests only, and that level is illegal with both C1-only and C2-only configuration (including when tiered is off). So, this if condition is never going to be true in these modes. 
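The argument above can be illustrated with a small sketch that models only the part of the condition quoted in the review. The enum values follow the numbering referred to in the reply (`CompLevel_full_optimization == 4`); the helper name `quoted_condition_holds` is hypothetical, and the remaining clauses of the real `if` statement, which are not shown in the quote, are deliberately left out.

```cpp
#include <cassert>

// Compilation levels, numbered as in the discussion above.
enum CompLevel {
  CompLevel_none = 0,
  CompLevel_simple = 1,
  CompLevel_limited_profile = 2,
  CompLevel_full_profile = 3,
  CompLevel_full_optimization = 4
};

// Hypothetical helper covering only the two visible comparisons from the
// quoted line; it is not the complete condition in compilationPolicy.cpp.
inline bool quoted_condition_holds(CompLevel task_level,
                                   int tiered_stop_at_level) {
  return task_level == CompLevel_full_profile &&
         tiered_stop_at_level > CompLevel_full_profile;
}
```

With the default `TieredStopAtLevel` of 4, the check can only pass for a queued task at `CompLevel_full_profile`. If, as stated above, that level is never requested in C1-only or C2-only configurations, the first comparison is always false there regardless of the flag value.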
------------- PR: https://git.openjdk.java.net/jdk/pull/2864 From iveresov at openjdk.java.net Mon Mar 8 17:43:28 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Mon, 8 Mar 2021 17:43:28 GMT Subject: RFR: 8219555: compiler/jvmci/compilerToVM/IsMatureTest.java fails with Unexpected isMature state for multiple times invoked method: expected false to equal true [v2] In-Reply-To: References: Message-ID: > Cleanup the behavior of the compilation policy with -Xcomp. Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: Fix debugging leftover ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2864/files - new: https://git.openjdk.java.net/jdk/pull/2864/files/1ea5e534..65600b67 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2864&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2864&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/2864.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2864/head:pull/2864 PR: https://git.openjdk.java.net/jdk/pull/2864 From kvn at openjdk.java.net Mon Mar 8 17:52:07 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Mon, 8 Mar 2021 17:52:07 GMT Subject: RFR: 8219555: compiler/jvmci/compilerToVM/IsMatureTest.java fails with Unexpected isMature state for multiple times invoked method: expected false to equal true [v2] In-Reply-To: References: Message-ID: On Mon, 8 Mar 2021 17:43:28 GMT, Igor Veresov wrote: >> Cleanup the behavior of the compilation policy with -Xcomp. > > Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: > > Fix debugging leftover Okay. ------------- Marked as reviewed by kvn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2864 From iveresov at openjdk.java.net Mon Mar 8 18:04:09 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Mon, 8 Mar 2021 18:04:09 GMT Subject: Integrated: 8219555: compiler/jvmci/compilerToVM/IsMatureTest.java fails with Unexpected isMature state for multiple times invoked method: expected false to equal true In-Reply-To: References: Message-ID: <-Ik2K_QiH1WuTyRxyN6csRdbt1tbbi6cVnW4quywtDM=.8931cb17-9d5a-448c-8bdc-2e30f587ecfe@github.com> On Sun, 7 Mar 2021 20:50:27 GMT, Igor Veresov wrote: > Cleanup the behavior of the compilation policy with -Xcomp. This pull request has now been integrated. Changeset: 1f9ed905 Author: Igor Veresov URL: https://git.openjdk.java.net/jdk/commit/1f9ed905 Stats: 61 lines in 9 files changed: 15 ins; 27 del; 19 mod 8219555: compiler/jvmci/compilerToVM/IsMatureTest.java fails with Unexpected isMature state for multiple times invoked method: expected false to equal true Reviewed-by: kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/2864 From iveresov at openjdk.java.net Mon Mar 8 18:04:08 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Mon, 8 Mar 2021 18:04:08 GMT Subject: RFR: 8219555: compiler/jvmci/compilerToVM/IsMatureTest.java fails with Unexpected isMature state for multiple times invoked method: expected false to equal true [v2] In-Reply-To: References: Message-ID: On Mon, 8 Mar 2021 17:49:29 GMT, Vladimir Kozlov wrote: >> Igor Veresov has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix debugging leftover > > Okay. @vnkozlov Thanks! 
------------- PR: https://git.openjdk.java.net/jdk/pull/2864 From hseigel at openjdk.java.net Mon Mar 8 18:49:21 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 8 Mar 2021 18:49:21 GMT Subject: RFR: 8252173: Use handles instead of jobjects in modules.cpp Message-ID: <08_WNFzODhRXH2cxxjzOha1juG0v6qxiN9tN-r0yt2I=.951d01cb-6f8d-4493-a5c0-143174ba5673@github.com> Hi, Please review this change for JDK-8252173 to use handles instead of jobjects in modules.cpp to make modules.cpp more debuggable. The change was tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows, and tiers 3-5 on Linux x64. Thanks, Harold ------------- Commit messages: - 8252173: Use handles instead of jobjects in modules.cpp Changes: https://git.openjdk.java.net/jdk/pull/2878/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2878&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8252173 Stats: 74 lines in 4 files changed: 17 ins; 5 del; 52 mod Patch: https://git.openjdk.java.net/jdk/pull/2878.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2878/head:pull/2878 PR: https://git.openjdk.java.net/jdk/pull/2878 From lfoltan at openjdk.java.net Mon Mar 8 19:15:05 2021 From: lfoltan at openjdk.java.net (Lois Foltan) Date: Mon, 8 Mar 2021 19:15:05 GMT Subject: RFR: 8252173: Use handles instead of jobjects in modules.cpp In-Reply-To: <08_WNFzODhRXH2cxxjzOha1juG0v6qxiN9tN-r0yt2I=.951d01cb-6f8d-4493-a5c0-143174ba5673@github.com> References: <08_WNFzODhRXH2cxxjzOha1juG0v6qxiN9tN-r0yt2I=.951d01cb-6f8d-4493-a5c0-143174ba5673@github.com> Message-ID: On Mon, 8 Mar 2021 18:45:04 GMT, Harold Seigel wrote: > Hi, > Please review this change for JDK-8252173 to use handles instead of jobjects in modules.cpp to make modules.cpp more debuggable. The change was tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows, and tiers 3-5 on Linux x64. > > Thanks, Harold Looks good Harold! Lois ------------- Marked as reviewed by lfoltan (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2878 From mseledtsov at openjdk.java.net Mon Mar 8 20:13:13 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Mon, 8 Mar 2021 20:13:13 GMT Subject: Integrated: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest In-Reply-To: References: Message-ID: On Fri, 5 Feb 2021 20:35:23 GMT, Mikhailo Seledtsov wrote: > This is a preliminary review. I would like to get the initial feedback before I proceed with conversion of the remaining tests. > > Here is what I did so far: > - created a UnitTestThread and a main test runner, based on gtests with similar needs > - moved the original code from HotSpot internals (so called hotspot internal tests: src/hotspot/share/memory/virtualspace.cpp) > to the newly created gtest while wrapping it into a TestReservedSpace class. I did not change the code of the test. > - removed invocations from whitebox.cpp > > Testing: > - ran GTestWrapper on usual platforms - All PASS > - ensured that ReservedSpaceConcurrent is in the logs and passed > > After gathering the feedback my plan is: > Plan: > - move the remaining internal Memory/VirtualSpace tests into a gTest > - I am thinking about using separate files for each test > - create a common file for UnitTestThread and MultiThreadTestRunner to reuse the code This pull request has now been integrated. 
Changeset: 9221540e Author: Mikhailo Seledtsov URL: https://git.openjdk.java.net/jdk/commit/9221540e Stats: 1302 lines in 13 files changed: 634 ins; 664 del; 4 mod 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest Reviewed-by: iignatyev, coleenp, stuefe ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From mseledtsov at openjdk.java.net Mon Mar 8 20:13:10 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Mon, 8 Mar 2021 20:13:10 GMT Subject: RFR: 8213269: convert test/hotspot/jtreg/runtime/memory/RunUnitTestsConcurrently to gtest [v10] In-Reply-To: References: Message-ID: On Mon, 8 Mar 2021 13:51:04 GMT, Coleen Phillimore wrote: >> Mikhailo Seledtsov has updated the pull request incrementally with one additional commit since the last revision: >> >> Using C_HEAP_ARRAY macros plus a minor fix > > Marked as reviewed by coleenp (Reviewer). Thank you again Igor, Coleen, Thomas and Kim for review and discussion. ------------- PR: https://git.openjdk.java.net/jdk/pull/2436 From dholmes at openjdk.java.net Mon Mar 8 20:36:10 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 8 Mar 2021 20:36:10 GMT Subject: RFR: 8262443: GenerateOopMap::do_interpretation can spin for a long time. [v2] In-Reply-To: References: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> Message-ID: <1uF96utjrXt_XB7Rn94ihM_s4Ok2p91h2hliK-wXSi0=.102680db-b285-4980-899c-24c6cae9154e@github.com> On Mon, 8 Mar 2021 08:11:25 GMT, Robbin Ehn wrote: >> With Safepoint/Handshake timeout enabled in rare cases this methods spins for a long time, blocking safepoints/handshakes, so timeout (with a long delay) is triggered. >> >> In some cases we are in native while executing this method and in some in vm. >> That's why there is an check for state in vm. >> >> Tested with other changes in t-1-7 this specific case of timeout is no longer an issue. >> This change-set passes T1 stand alone. 
> > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Comment, local JavaThread variable > - Merge branch 'master' into 8262443-gen-oop-map > - Go to blocked when loop Still good. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2742 From dholmes at openjdk.java.net Mon Mar 8 20:54:10 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 8 Mar 2021 20:54:10 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms [v4] In-Reply-To: <669DggxCocAwvx_r9xqKCYlqKl81l4HJBSRnpLfcKus=.da682648-5514-4b5d-a4c7-b9392f35cde0@github.com> References: <669DggxCocAwvx_r9xqKCYlqKl81l4HJBSRnpLfcKus=.da682648-5514-4b5d-a4c7-b9392f35cde0@github.com> Message-ID: On Mon, 8 Mar 2021 08:34:23 GMT, Thomas Stuefe wrote: >> `os::fork_and_exec()` can be used from within the hotspot to start a child process. It is only called in fatal situations, in two cases: >> a) to automatically start a debugger when ShowMessageBoxOnError is specified (uses *fork*()) >> b) to start a caller provided binary on OOM if -XX:OnOutOfMemoryError is specified (uses *vfork*()) >> >> The variants for AIX, Linux, Bsd are almost completely identical. So, this function can be unified under posix. >> >> In addition to that, this patch does a number of small changes: >> >> 1) Before, whether we would vfork() only on Linux and only for case (b). I changed this to always use vfork unconditionally, on all platforms, because: >> - even though vfork() can be unsafe, the way we use it - calling vfork()->exec()->_exit() with no intermediate steps - is safe. >> - Using vfork is good for OOM situations on all platforms, not just Linux, and also for starting the debugger in non-OOM cases. 
Keep in mind that we do this only for cases where the parent VM is about to die, so even if it were unsafe, the damage would be limited. >> 2) I added a comment to the function to not use it outside of fatal error situations. >> 3) I added a posix wrapper for getting the environ pointer, to hide MacOS specifics, and used it in two places to unify that coding. >> 4) consistently used global scope :: for posix APIs. >> >> Note that if we wanted to make os::fork_and_exec() a first class function, always safe to use, we should modify it to at least not leak any parent process file descriptors. Possibly safest would be to completely rewrite this function and use posix_spawn(). posix_spawn() we use in Runtime.exec() by default since JDK 13 (1). But as long as this is spawned by only dying VMs I think this function is fine. >> >> ---- >> >> Tests: GAs, manual tests using -XX:ShowMessageBoxOnError > > Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: > > - Restore use-vfork parameter > - Revert last commit Hi Thomas, I am fine with this version of the refactoring. Thanks, David src/hotspot/os/windows/os_windows.cpp line 5517: > 5515: // Run the specified command in a separate process. Return its exit value, > 5516: // or -1 on failure (e.g. can't create a new process). > 5517: int os::fork_and_exec(const char* cmd, bool dummy /* ignored */) { You could just have `bool _ignored` ------------- Marked as reviewed by dholmes (Reviewer). 
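For readers following the thread, here is a minimal sketch of the vfork()->exec()->_exit() shape that the PR description calls safe: the child does nothing between vfork() and exec, and falls through to _exit() only if exec fails. This is not the HotSpot code. The name `run_shell_command` is hypothetical, running the command through `/bin/sh -c` is an assumption, and the plain `environ` declaration assumes a Linux-like platform (the PR mentions a wrapper to hide the macOS specifics).

```cpp
#include <cassert>
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char** environ;  // assumption: directly visible, as on Linux

// Hypothetical helper (not os::fork_and_exec): run `cmd` via /bin/sh and
// return its exit status, or -1 if the child could not be created or waited for.
int run_shell_command(const char* cmd) {
  assert(cmd != nullptr);

  // Build argv before vfork() so the child has nothing to set up.
  const char* argv[] = { "sh", "-c", cmd, nullptr };

  pid_t pid = ::vfork();
  if (pid < 0) {
    return -1;  // could not create a child process
  }
  if (pid == 0) {
    // Child: only exec, then _exit if exec failed. No other work here,
    // because the child borrows the parent's address space until exec.
    ::execve("/bin/sh", const_cast<char* const*>(argv), environ);
    ::_exit(127);
  }

  // Parent: wait for the child, retrying if interrupted by a signal.
  int status = 0;
  while (::waitpid(pid, &status, 0) < 0) {
    if (errno != EINTR) {
      return -1;
    }
  }
  if (WIFEXITED(status)) {
    return WEXITSTATUS(status);
  }
  if (WIFSIGNALED(status)) {
    return 0x80 + WTERMSIG(status);  // shell-style convention, an assumption
  }
  return -1;
}
```

As the PR text notes, a general-purpose version would also have to avoid leaking parent file descriptors into the child; this sketch makes no attempt at that.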
PR: https://git.openjdk.java.net/jdk/pull/2810 From hseigel at openjdk.java.net Mon Mar 8 21:20:08 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 8 Mar 2021 21:20:08 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms [v4] In-Reply-To: <669DggxCocAwvx_r9xqKCYlqKl81l4HJBSRnpLfcKus=.da682648-5514-4b5d-a4c7-b9392f35cde0@github.com> References: <669DggxCocAwvx_r9xqKCYlqKl81l4HJBSRnpLfcKus=.da682648-5514-4b5d-a4c7-b9392f35cde0@github.com> Message-ID: On Mon, 8 Mar 2021 08:34:23 GMT, Thomas Stuefe wrote: >> `os::fork_and_exec()` can be used from within the hotspot to start a child process. It is only called in fatal situations, in two cases: >> a) to automatically start a debugger when ShowMessageBoxOnError is specified (uses *fork*()) >> b) to start a caller provided binary on OOM if -XX:OnOutOfMemoryError is specified (uses *vfork*()) >> >> The variants for AIX, Linux, Bsd are almost completely identical. So, this function can be unified under posix. >> >> In addition to that, this patch does a number of small changes: >> >> 1) Before, whether we would vfork() only on Linux and only for case (b). I changed this to always use vfork unconditionally, on all platforms, because: >> - even though vfork() can be unsafe, the way we use it - calling vfork()->exec()->_exit() with no intermediate steps - is safe. >> - Using vfork is good for OOM situations on all platforms, not just Linux, and also for starting the debugger in non-OOM cases. Keep in mind that we do this only for cases where the parent VM is about to die, so even if it were unsafe, the damage would be limited. >> 2) I added a comment to the function to not use it outside of fatal error situations. >> 3) I added a posix wrapper for getting the environ pointer, to hide MacOS specifics, and used it in two places to unify that coding. >> 4) consistently used global scope :: for posix APIs. 
>> >> Note that if we wanted to make os::fork_and_exec() a first class function, always safe to use, we should modify it to at least not leak any parent process file descriptors. Possibly safest would be to completely rewrite this function and use posix_spawn(). posix_spawn() we use in Runtime.exec() by default since JDK 13 (1). But as long as this is spawned by only dying VMs I think this function is fine. >> >> ---- >> >> Tests: GAs, manual tests using -XX:ShowMessageBoxOnError > > Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: > > - Restore use-vfork parameter > - Revert last commit Changes look good. Thanks for doing this. Harold ------------- Marked as reviewed by hseigel (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2810 From hohensee at amazon.com Mon Mar 8 23:27:37 2021 From: hohensee at amazon.com (Hohensee, Paul) Date: Mon, 8 Mar 2021 23:27:37 +0000 Subject: CFV: New HotSpot Group Member: Christian Hagedorn Message-ID: Vote: yes -----Original Message----- From: hotspot-dev on behalf of Tobias Hartmann Date: Friday, March 5, 2021 at 5:28 AM To: hotspot-dev Source Developers Subject: CFV: New HotSpot Group Member: Christian Hagedorn Hi, I hereby nominate Christian Hagedorn to Membership in the HotSpot Group. Christian is a member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. He contributed over 70 changes to the JDK project [1]. Christian has worked on both C1 and C2, acquiring expert knowledge in key areas (for example, loop unswitching and superword optimizations). He investigated and fixed several highly complex and long-standing issues in the code base and improved maintainability of the JITs. All the while, Christian is constantly updating and extending the sparse documentation, making life easier for other engineers. HotSpot Group membership would allow Christian to continue to do so by adding to the OpenJDK wiki pages.
Votes are due by Friday, 19 March 2021 at 13:30 UTC. Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [3]. Best regards, Tobias [1] https://github.com/search?q=committer-name%3A%22Christian+Hagedorn%22+repo%3Aopenjdk%2Fjdk&type=commits [2] https://openjdk.java.net/census [3] https://openjdk.java.net/groups/#member-vote From coleenp at openjdk.java.net Tue Mar 9 00:06:12 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 9 Mar 2021 00:06:12 GMT Subject: RFR: 8263002: Remove CDS MiscCode region In-Reply-To: References: Message-ID: <4EsEk0VVQDAbkJzXmrMh2a_B4sFq7diglgCo3UqZpBQ=.5f4541c2-76aa-46da-82b5-dc8e537a85f3@github.com> On Sun, 7 Mar 2021 06:26:00 GMT, Ioi Lam wrote: > The CDS MiscCode region is used for: > (a) C++ vtables > (b) Method trampolines > > (a) can be moved to the ReadWrite region > (b) were introduced in JDK-8145221 so we can delay writing into Methods. This was intended to improve copy-on-write sharing to reduce memory footprint. However, this hasn't been shown to have any significant effect (footprint of metadata usually is much smaller than the Java heap), and introduces a lot of complexity in the HotSpot code. > > Removing (b) will make it easier to implement JDK-8026297 (Generating AdapterHandlerEntry during CDS dump), which will further improve start-up time. > > ============ > Other benefits of removing the MiscCode region: > > - We no longer have a read/write/executable region. This address the concern in JDK-8262922. > - We can enable CDS on macOS/AArch64, which does not allow read/write/executable regions. (JDK-8253795) wow. Looks good to me. src/hotspot/share/memory/archiveBuilder.cpp line 831: > 829: if (STATIC_DUMP) { > 830: if (!_builder->is_in_buffer_space(*p)) { > 831: tty->print_cr("ohashii %p %p", p, *p); What is ohashii ? Should this be an assert? 
Or log_error instead of tty->print src/hotspot/share/memory/cppVtables.cpp line 218: > 216: assert(DumpSharedSpaces, "must"); > 217: size_t vtptrs_bytes = _num_cloned_vtable_kinds * sizeof(CppVtableInfo*); > 218: _index = (CppVtableInfo**)ArchiveBuilder::current()->rw_region()->allocate(vtptrs_bytes); I thought these would be read-only? src/hotspot/share/memory/dumpAllocStats.cpp line 106: > 104: > 105: assert(all_ro_bytes == ro_all, "everything should have been counted"); > 106: //assert(all_rw_bytes == rw_all, "everything should have been counted"); FIXME! ? Does this need to be fixed? ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2861 From coleenp at openjdk.java.net Tue Mar 9 00:50:07 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 9 Mar 2021 00:50:07 GMT Subject: RFR: 8252173: Use handles instead of jobjects in modules.cpp In-Reply-To: <08_WNFzODhRXH2cxxjzOha1juG0v6qxiN9tN-r0yt2I=.951d01cb-6f8d-4493-a5c0-143174ba5673@github.com> References: <08_WNFzODhRXH2cxxjzOha1juG0v6qxiN9tN-r0yt2I=.951d01cb-6f8d-4493-a5c0-143174ba5673@github.com> Message-ID: On Mon, 8 Mar 2021 18:45:04 GMT, Harold Seigel wrote: > Hi, > Please review this change for JDK-8252173 to use handles instead of jobjects in modules.cpp to make modules.cpp more debuggable. The change was tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows, and tiers 3-5 on Linux x64. > > Thanks, Harold Looks good! ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2878

From dongbo at openjdk.java.net Tue Mar 9 02:30:11 2021
From: dongbo at openjdk.java.net (Dong Bo)
Date: Tue, 9 Mar 2021 02:30:11 GMT
Subject: Withdrawn: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code
In-Reply-To:
References:
Message-ID:

On Tue, 2 Mar 2021 08:29:14 GMT, Dong Bo wrote:

> The aarch64 LSE atomic operations are introduced to C++ hotspot code in JDK-8261027 and optimized in JDK-8261649.
> For memory_order_conservative, the acquire semantics in atomic instructions, i.e. ldaddal, swpal, casal, ensure that no subsequent accesses can pass the atomic operations.
> We also have a trailing dmb to ensure barrier-ordered-after relationship, it can ensure what the acquire does. So the acquire semantics is no longer needed, {ldaddl, swpl, casl} would be enough.
>
> Checked by using the herd7 consistency model simulator with the test in comments before `gen_cas_entry`:
> AArch64 LseCasAfter
> { 0:X1=x; 0:X2=y; 1:X1=x; 1:X2=y; }
>  P0            | P1                 ;
>  LDR W4, [X2]  | MOV W3, #0         ;
>  DMB LD        | MOV W4, #1         ;
>  LDR W3, [X1]  | CASL W3, W4, [X1]  ;
>                | DMB ISH            ;
>                | STR W4, [X2]       ;
> exists
> (0:X3=0 /\ 0:X4=1)
> No `X3 == 0 && X4 == 1` witnessed.
>
> Remove the acquire semantics does not allow prior accesses to pass the atomic operations, because the release semantics are still there.
> Just in case, checked by herd7 with the testcase below:
> AArch64 LseCasPrior
> { 0:X1=x; 0:X2=y; 1:X1=x; 1:X2=y; }
>  P0            | P1                 ;
>  LDR W3, [X1]  | MOV W3, #0         ;
>  DMB LD        | MOV W4, #1         ;
>  LDR W4, [X2]  | STR W4, [X2]       ;
>                | CASL W3, W4, [X1]  ;
>                | DMB ISH            ;
> exists
> (0:X3=1 /\ 0:X4=0)
> No `X3 == 1 && X4 == 0` witnessed.
>
> Similarly, the default implementations of `atomic_fetch_add` and `atomic_xchg` via `ldaxr+stlxr+dmb` can be replaced by `ldxr+stlxr+dmb`.

This pull request has been closed without being integrated.
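The shape under discussion (an atomic update with release semantics followed by a trailing full barrier, rather than an acquire-release update plus the barrier) can be sketched in portable C++. This is only an analogy under stated assumptions: `fetch_add_conservative` and `run_contended_adds` are hypothetical names, the C++ memory model is not the same thing as HotSpot's memory_order_conservative, and which AArch64 instructions a compiler emits for these operations depends on the compiler and its target flags.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for a "conservative" fetch-and-add: a release
// read-modify-write followed by a trailing full fence. The thread debates
// whether the update itself additionally needs acquire semantics when such a
// trailing barrier is already present.
inline std::uint64_t fetch_add_conservative(std::atomic<std::uint64_t>& counter,
                                            std::uint64_t value) {
  std::uint64_t old = counter.fetch_add(value, std::memory_order_release);
  std::atomic_thread_fence(std::memory_order_seq_cst);  // trailing barrier
  return old;
}

// Highly contended increment loop, similar in spirit to the C micro-benchmark
// quoted later in the thread. Returns the final counter value.
std::uint64_t run_contended_adds(unsigned threads, unsigned iterations) {
  assert(threads > 0);
  std::atomic<std::uint64_t> counter{0};
  std::vector<std::thread> workers;
  workers.reserve(threads);
  for (unsigned t = 0; t < threads; ++t) {
    workers.emplace_back([&counter, iterations] {
      for (unsigned i = 0; i < iterations; ++i) {
        fetch_add_conservative(counter, 1);
      }
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }
  return counter.load(std::memory_order_relaxed);
}
```

The sketch only demonstrates that the counter stays exact under contention. It says nothing about the ordering corner cases that the herd7 litmus tests above are meant to probe.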
------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From dongbo at openjdk.java.net Tue Mar 9 02:30:10 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Tue, 9 Mar 2021 02:30:10 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: <-6qdl0SynMQ7vx-KT68Vgv7hmxq8bcBu89vfPedpYX8=.0ee3210e-efb6-44b8-ac1c-2c8ab5f53b0e@github.com> References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> <-6qdl0SynMQ7vx-KT68Vgv7hmxq8bcBu89vfPedpYX8=.0ee3210e-efb6-44b8-ac1c-2c8ab5f53b0e@github.com> Message-ID: <5TBdj1TZXwexuQG-1cSbm7Ro3TL_CDi3gvfj7ONWNss=.3b4f2c15-77cf-4d31-b798-e72232d8d4af@github.com> On Mon, 8 Mar 2021 07:23:04 GMT, Dong Bo wrote: >> Correction: >> >> I agree that the code will still be correct if you change the ldaxr to *ldxr*. > >> > Hm, from our point of view, ldaxr+stlxr+dmb and ldxr+stlxr+dmb provide the same order semantics. >> > The acquire are used to ensure all loads/stores that are after an ldaxr (actually loads/stores after the dmb of atomic__default__impl in this case) in program order, while the dmb has already guaranteed this for us. >> > Without the acquire, the loads/stores after the atomic operations still can not pass the dmb. >> > Remove the acquire does not change the order between preceding loads/stores and stlxr. >> >> I agree that the code will still be correct if you change the ldaxr to ldar. While this may make some difference on machines which do not support LSE I would not expect it to be significant for anything other than a very carefully crafted benchmark or an extremely specialized parallel algorithm. Is this change request motivated by an actual real-world use case? > > Thanks for the comments. > We only witnessed ~4% improvements with C code below on one of our platform. > The load-acquire is implemented as a full-barrier by the core in fact. 
> // highly-contended fetch_and_add test, %3.96 improvements with 8 threads
> unsigned long res = 0;
> unsigned long sum = 0;
> extern unsigned long aarch64_atomic_fetch_add_8_default_impl(void *ptr, unsigned long val);
>
> void *executor (void *arg)
> {
>     for (int i = 0; i < ITER; i++) {
>         sum += aarch64_atomic_fetch_add_8_default_impl(&res, 1);
>     }
> }
>
> int main(int argc, char **args)
> {
>     int i;
>     pthread_t exethreads[THREADS];
>     int threads = atoi(args[1]);
>
>     for (i = 0; i < threads; i++)
>         pthread_create(&exethreads[i], NULL, executor, NULL);
>     for (i = 0; i < threads; i++)
>         pthread_join(exethreads[i], NULL);
>
>     return 0;
> }
> While we didn't have any noticeable improvements with JAVA tests we tried, e.g. `test/micro/org/openjdk/bench/vm/gc/Alloc.java`.
> Seems the percentages of the atomics are too low, not to mention the actual real-world use case, e.g. Spark, Tomcat.
>
> I guess we do not have to change the `ldaxr` to `ldxr` now, due to we haven't seen any significant performance enhancements yet. :-)
> But I feel a little inconsistent that we have code use the stronger semantics, while a weaker instruction can still provide the correct order semantics.

OK, withdrawing... Thank you all for the time.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2788

From iklam at openjdk.java.net Tue Mar 9 04:53:27 2021
From: iklam at openjdk.java.net (Ioi Lam)
Date: Tue, 9 Mar 2021 04:53:27 GMT
Subject: RFR: 8263002: Remove CDS MiscCode region [v2]
In-Reply-To:
References:
Message-ID:

> The CDS MiscCode region is used for:
> (a) C++ vtables
> (b) Method trampolines
>
> (a) can be moved to the ReadWrite region
> (b) were introduced in JDK-8145221 so we can delay writing into Methods. This was intended to improve copy-on-write sharing to reduce memory footprint. However, this hasn't been shown to have any significant effect (footprint of metadata usually is much smaller than the Java heap), and introduces a lot of complexity in the HotSpot code.
> > Removing (b) will make it easier to implement JDK-8026297 (Generating AdapterHandlerEntry during CDS dump), which will further improve start-up time. > > ============ > Other benefits of removing the MiscCode region: > > - We no longer have a read/write/executable region. This address the concern in JDK-8262922. > - We can enable CDS on macOS/AArch64, which does not allow read/write/executable regions. (JDK-8253795) Ioi Lam has updated the pull request incrementally with two additional commits since the last revision: - bumped CURRENT_CDS_ARCHIVE_VERSION by one since format has changed - @coleenp review: remove temp debug code; fixed accounting of vtable sizes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2861/files - new: https://git.openjdk.java.net/jdk/pull/2861/files/1c6ac95e..a3e4f25e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2861&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2861&range=00-01 Stats: 25 lines in 7 files changed: 10 ins; 5 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/2861.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2861/head:pull/2861 PR: https://git.openjdk.java.net/jdk/pull/2861 From iklam at openjdk.java.net Tue Mar 9 04:53:29 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 9 Mar 2021 04:53:29 GMT Subject: RFR: 8263002: Remove CDS MiscCode region [v2] In-Reply-To: <4EsEk0VVQDAbkJzXmrMh2a_B4sFq7diglgCo3UqZpBQ=.5f4541c2-76aa-46da-82b5-dc8e537a85f3@github.com> References: <4EsEk0VVQDAbkJzXmrMh2a_B4sFq7diglgCo3UqZpBQ=.5f4541c2-76aa-46da-82b5-dc8e537a85f3@github.com> Message-ID: On Mon, 8 Mar 2021 23:50:30 GMT, Coleen Phillimore wrote: >> Ioi Lam has updated the pull request incrementally with two additional commits since the last revision: >> >> - bumped CURRENT_CDS_ARCHIVE_VERSION by one since format has changed >> - @coleenp review: remove temp debug code; fixed accounting of vtable sizes > > src/hotspot/share/memory/archiveBuilder.cpp line 
831: > >> 829: if (STATIC_DUMP) { >> 830: if (!_builder->is_in_buffer_space(*p)) { >> 831: tty->print_cr("ohashii %p %p", p, *p); > > What is ohashii ? Should this be an assert? Or log_error instead of tty->print This was debug code that I left behind by mistake. I've removed it. > src/hotspot/share/memory/cppVtables.cpp line 218: > >> 216: assert(DumpSharedSpaces, "must"); >> 217: size_t vtptrs_bytes = _num_cloned_vtable_kinds * sizeof(CppVtableInfo*); >> 218: _index = (CppVtableInfo**)ArchiveBuilder::current()->rw_region()->allocate(vtptrs_bytes); > > I thought these would be read-only? The vtables contain function pointers and need to be updated at runtime to match the latest location of libjvm.so, so they need to be R/W. > src/hotspot/share/memory/dumpAllocStats.cpp line 106: > >> 104: >> 105: assert(all_ro_bytes == ro_all, "everything should have been counted"); >> 106: //assert(all_rw_bytes == rw_all, "everything should have been counted"); FIXME! > > ? Does this need to be fixed? This is fixed in the latest version. I added accounting for the vtables. ------------- PR: https://git.openjdk.java.net/jdk/pull/2861 From iklam at openjdk.java.net Tue Mar 9 05:34:09 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 9 Mar 2021 05:34:09 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v2] In-Reply-To: References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Fri, 5 Mar 2021 21:03:49 GMT, Coleen Phillimore wrote: >> This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. 
>> >> One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. >> >> I didn't squash the commits so it would be easier to see the different changes, but they all go together. >> >> The test description: >> >> Two Threads T1, T2 >> >> Three definitions of class A, defined by user defined class loader >> Class A extends B extends A (CCE) >> Class A extends B >> Class A extends C >> >> Five modes: >> Sequential >> Concurrent loading with user defined class loader >> Concurrent loading parallelCapable class loader >> Wait when loading the superclass with parallelCapable class loader >> Wait when loading the superclass with user defined class loader >> >> In all cases, after A is parsed and calls resolve_super_or_fail to load B >> and loading B waits. Classes ClassInLoader, CP1 and CP2 provide >> constant pool references to A. >> >> In all cases, when B waits, A is replaced with bytes so A extends C. >> >> Two tests x 3 modes (both threads do the same): >> (CCE) First test A extends B, which throws CCE. >> -- All three modes: first constant pool reference throws CCE, second reference A extends C >> (B) Second test A extends B which doesn't throw CCE. >> -- All three modes: both references A extends B. >> >> The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 >> is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is >> currently loading. >> >> Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. 
So this exposed this bug. >> >> Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Remove unnecessary tag comparison. Just some initial comments. So far the runtime changes look reasonable to me. I'll continue tomorrow. src/hotspot/share/classfile/vmSymbols.hpp line 538: > 536: template(string_void_signature, "(Ljava/lang/String;)V") \ > 537: template(string_int_signature, "(Ljava/lang/String;)I") \ > 538: template(throwable_signature, "Ljava/lang/Throwable;") \ nit: need to align the backslash src/hotspot/share/oops/constantPool.cpp line 784: > 782: } > 783: > 784: Symbol* exception_message(const constantPoolHandle& this_cp, int which, constantTag tag, oop pending_exception) { Does this function need to be static? ------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From stuefe at openjdk.java.net Tue Mar 9 06:01:07 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 9 Mar 2021 06:01:07 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms [v4] In-Reply-To: References: <669DggxCocAwvx_r9xqKCYlqKl81l4HJBSRnpLfcKus=.da682648-5514-4b5d-a4c7-b9392f35cde0@github.com> Message-ID: On Mon, 8 Mar 2021 20:51:13 GMT, David Holmes wrote: > Hi Thomas, > > I am fine with this version of the refactoring. > > Thanks, > David Thank you David! 
------------- PR: https://git.openjdk.java.net/jdk/pull/2810 From stuefe at openjdk.java.net Tue Mar 9 06:01:08 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 9 Mar 2021 06:01:08 GMT Subject: RFR: JDK-8262955: Unify os::fork_and_exec() across Posix platforms [v4] In-Reply-To: References: <669DggxCocAwvx_r9xqKCYlqKl81l4HJBSRnpLfcKus=.da682648-5514-4b5d-a4c7-b9392f35cde0@github.com> Message-ID: <0AVfQjlzbDnEgnmg3j6pmFbvSLMQ0KzKgesbNcEUqyM=.2164493d-842c-470a-b662-400aa7bebcbb@github.com> On Mon, 8 Mar 2021 21:17:45 GMT, Harold Seigel wrote: > Changes look good. Thanks for doing this. > Harold Thanks Harold! ------------- PR: https://git.openjdk.java.net/jdk/pull/2810 From stuefe at openjdk.java.net Tue Mar 9 06:05:06 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 9 Mar 2021 06:05:06 GMT Subject: Integrated: JDK-8262955: Unify os::fork_and_exec() across Posix platforms In-Reply-To: References: Message-ID: <8iJk3i9WhfFYPWtr0lITol9UUrteJ7kI_pdVOTIewf0=.31259347-1f73-4e63-bedf-d231b6f5aa3f@github.com> On Wed, 3 Mar 2021 14:16:19 GMT, Thomas Stuefe wrote: > `os::fork_and_exec()` can be used from within the hotspot to start a child process. It is only called in fatal situations, in two cases: > a) to automatically start a debugger when ShowMessageBoxOnError is specified (uses *fork*()) > b) to start a caller provided binary on OOM if -XX:OnOutOfMemoryError is specified (uses *vfork*()) > > The variants for AIX, Linux, Bsd are almost completely identical. So, this function can be unified under posix. > > In addition to that, this patch does a number of small changes: > > 1) Before, whether we would vfork() only on Linux and only for case (b). I changed this to always use vfork unconditionally, on all platforms, because: > - even though vfork() can be unsafe, the way we use it - calling vfork()->exec()->_exit() with no intermediate steps - is safe. 
> - Using vfork is good for OOM situations on all platforms, not just Linux, and also for starting the debugger in non-OOM cases. Keep in mind that we do this only for cases where the parent VM is about to die, so even if it were unsafe, the damage would be limited. > 2) I added a comment to the function to not use it outside of fatal error situations. > 3) I added a posix wrapper for getting the environ pointer, to hide MacOS specifics, and used it in two places to unify that coding. > 4) consistently used global scope :: for posix APIs. > > Note that if we wanted to make os::fork_and_exec() a first class function, always safe to use, we should modify it to at least not leak any parent process file descriptors. Possibly safest would be to completely rewrite this function and use posix_spawn(). posix_spawn() we use in Runtime.exec() by default since JDK 13 (1). But as long as this is spawned by only dying VMs I think this function is fine. > > ---- > > Tests: GAs, manual tests using -XX:ShowMessageBoxOnError This pull request has now been integrated. Changeset: 5b9b170d Author: Thomas Stuefe URL: https://git.openjdk.java.net/jdk/commit/5b9b170d Stats: 296 lines in 7 files changed: 86 ins; 205 del; 5 mod 8262955: Unify os::fork_and_exec() across Posix platforms Reviewed-by: dholmes, hseigel ------------- PR: https://git.openjdk.java.net/jdk/pull/2810 From akozlov at openjdk.java.net Tue Mar 9 08:23:10 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 9 Mar 2021 08:23:10 GMT Subject: Integrated: JDK-8263068: Rename safefetch.hpp to safefetch.inline.hpp In-Reply-To: <6pYgLK51Ll4jWSuTlGWnEOuXU_3F8uBphuM1c0TAVdI=.726e1003-6b80-4745-b901-1493a0227d4c@github.com> References: <6pYgLK51Ll4jWSuTlGWnEOuXU_3F8uBphuM1c0TAVdI=.726e1003-6b80-4745-b901-1493a0227d4c@github.com> Message-ID: On Fri, 5 Mar 2021 13:15:56 GMT, Anton Kozlov wrote: > Please review a trivial renaming of safefetch.hpp to safefetch.inline.hpp. 
It is a preparation to fix for @stefank note https://github.com/openjdk/jdk/pull/2200#discussion_r572707505. I'm going to rename threadWXSetters.hpp to threadWXSetters.inline.hpp and threadWXSetters header is needed for safefetch inline functions implementation. This pull request has now been integrated. Changeset: 0bc45625 Author: Anton Kozlov Committer: Vladimir Kempik URL: https://git.openjdk.java.net/jdk/commit/0bc45625 Stats: 12 lines in 10 files changed: 0 ins; 0 del; 12 mod 8263068: Rename safefetch.hpp to safefetch.inline.hpp Reviewed-by: stefank ------------- PR: https://git.openjdk.java.net/jdk/pull/2844 From aph at openjdk.java.net Tue Mar 9 09:06:07 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 9 Mar 2021 09:06:07 GMT Subject: RFR: 8262519: AArch64: Unnecessary acquire semantics of memory-order-conservative atomics in C++ Hotspot code In-Reply-To: <5TBdj1TZXwexuQG-1cSbm7Ro3TL_CDi3gvfj7ONWNss=.3b4f2c15-77cf-4d31-b798-e72232d8d4af@github.com> References: <2hEzda7I-KpcFouDUAsdRiyEe-LDSlSCnwbdHaBJiu4=.e5cc9cb6-9e49-4dda-a395-72cea414f7ec@github.com> <-6qdl0SynMQ7vx-KT68Vgv7hmxq8bcBu89vfPedpYX8=.0ee3210e-efb6-44b8-ac1c-2c8ab5f53b0e@github.com> <5TBdj1TZXwexuQG-1cSbm7Ro3TL_CDi3gvfj7ONWNss=.3b4f2c15-77cf-4d31-b798-e72232d8d4af@github.com> Message-ID: <6q4fR5TP9aw1Ym19s7v-4TiOyKLJKbi97obnirfe3GU=.7cf37da2-b959-4817-b28a-f6f5c36830a8@github.com> On Tue, 9 Mar 2021 02:27:07 GMT, Dong Bo wrote: >>> > Hm, from our point of view, ldaxr+stlxr+dmb and ldxr+stlxr+dmb provide the same order semantics. >>> > The acquire are used to ensure all loads/stores that are after an ldaxr (actually loads/stores after the dmb of atomic__default__impl in this case) in program order, while the dmb has already guaranteed this for us. >>> > Without the acquire, the loads/stores after the atomic operations still can not pass the dmb. >>> > Remove the acquire does not change the order between preceding loads/stores and stlxr.
>>> >>> I agree that the code will still be correct if you change the ldaxr to ldar. While this may make some difference on machines which do not support LSE I would not expect it to be significant for anything other than a very carefully crafted benchmark or an extremely specialized parallel algorithm. Is this change request motivated by an actual real-world use case? >> >> Thanks for the comments. >> We only witnessed ~4% improvements with C code below on one of our platform. >> The load-acquire is implemented as a full-barrier by the core in fact. >> // highly-contended fetch_and_add test, %3.96 improvements with 8 threads >> unsigned long res = 0; >> unsigned long sum = 0; >> extern unsigned long aarch64_atomic_fetch_add_8_default_impl(void *ptr, unsigned long val); >> >> void *executor (void *arg) >> { >> for (int i = 0; i < ITER; i++) { >> sum += aarch64_atomic_fetch_add_8_default_impl(&res, 1); >> } >> } >> >> int main(int argc, char **args) >> { >> int i; >> pthread_t exethreads[THREADS]; >> int threads = atoi(args[1]); >> >> for (i = 0; i < threads; i++) >> pthread_create(&exethreads[i], NULL, executor, NULL); >> for (i = 0; i < threads; i++) >> pthread_join(exethreads[i], NULL); >> >> return 0; >> } >> While we didn't have any noticeable improvements with JAVA tests we tried, e.g. `test/micro/org/openjdk/bench/vm/gc/Alloc.java`. >> Seems the percentages of the atomics are too low, not to mention the actual real-world use case, e.g. Spark, Tomcat. >> >> I guess we do not have to change the `ldaxr` to `ldxr` now, due to we haven't seen any significant performance enhancements yet. :-) >> But I feel a little inconsistent that we have code use the stronger semantics, while a weaker instruction can still provide the correct order semantics. > > OK, withdrawing... Thank you all for the time. 
> _Mailing list message from [Andrew Dinn](mailto:adinn at redhat.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ > > I believe it would be better in future to perform investigations when > there is a real world use case indicating there is a problem to be fixed > and that the fix will make a difference. Looking into this change has > taken quite a lot of your time and a lot of reviewer time. That time > could probably have been better spent looking into other things. I don't completely agree. It seems to make sense to concentrate only on measurable things. However, efficient systems are composed from thousands of tiny optimizations, each one of which may be too small to measure on its own. The AArch64 HotSpot we have today would be less efficient if we'd insisted on strict performance justification for each little optimization. My misgivings about this change are due to not being entirely convinced we've found all the corner cases, and a worry (born of experience!) that some AArch64 implementations may be pushed into marginal behaviour. Even though such misbehaviour is unlikely today, I'm not sure it's worth the risk. ------------- PR: https://git.openjdk.java.net/jdk/pull/2788 From dholmes at openjdk.java.net Tue Mar 9 10:07:09 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 9 Mar 2021 10:07:09 GMT Subject: RFR: 8263002: Remove CDS MiscCode region [v2] In-Reply-To: References: Message-ID: On Tue, 9 Mar 2021 04:53:27 GMT, Ioi Lam wrote: >> The CDS MiscCode region is used for: >> (a) C++ vtables >> (b) Method trampolines >> >> (a) can be moved to the ReadWrite region >> (b) were introduced in JDK-8145221 so we can delay writing into Methods. This was intended to improve copy-on-write sharing to reduce memory footprint. However, this hasn't been shown to have any significant effect (footprint of metadata usually is much smaller than the Java heap), and introduces a lot of complexity in the HotSpot code. 
>> >> Removing (b) will make it easier to implement JDK-8026297 (Generating AdapterHandlerEntry during CDS dump), which will further improve start-up time. >> >> ============ >> Other benefits of removing the MiscCode region: >> >> - We no longer have a read/write/executable region. This address the concern in JDK-8262922. >> - We can enable CDS on macOS/AArch64, which does not allow read/write/executable regions. (JDK-8253795) > > Ioi Lam has updated the pull request incrementally with two additional commits since the last revision: > > - bumped CURRENT_CDS_ARCHIVE_VERSION by one since format has changed > - @coleenp review: remove temp debug code; fixed accounting of vtable sizes Hi Ioi, This looks really good from a code simplification point of view! I can't claim to fully understand all the code involved though. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2861 From mdoerr at openjdk.java.net Tue Mar 9 11:19:23 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Tue, 9 Mar 2021 11:19:23 GMT Subject: RFR: 8261957: [PPC64] Support for Concurrent Thread-Stack Processing Message-ID: I'd like to support Concurrent Thread-Stack Processing on PPC64. This will be needed by ShenandoahGC and zGC when implemented. Maybe for other purposes in the future, too. I'm using conditional trap instructions by default, so we don't need the extra stubs unless -XX:-UseSIGTRAP is used. Original change: https://github.com/openjdk/jdk/commit/b9873e18 ------------- Commit messages: - Update for JDK-8255233 - Support large offsets in C1/C2 stubs. Add comments. 
- 8261957: [PPC64] Support for Concurrent Thread-Stack Processing Changes: https://git.openjdk.java.net/jdk/pull/2841/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2841&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8261957 Stats: 227 lines in 16 files changed: 177 ins; 10 del; 40 mod Patch: https://git.openjdk.java.net/jdk/pull/2841.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2841/head:pull/2841 PR: https://git.openjdk.java.net/jdk/pull/2841 From eosterlund at openjdk.java.net Tue Mar 9 11:37:09 2021 From: eosterlund at openjdk.java.net (Erik Österlund) Date: Tue, 9 Mar 2021 11:37:09 GMT Subject: RFR: 8259643: ZGC can return metaspace OOM prematurely [v2] In-Reply-To: References: Message-ID: On Mon, 8 Mar 2021 14:40:29 GMT, Thomas Stuefe wrote: >> Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: >> >> polish code alignment and rename register/unregister to add/remove > > Marked as reviewed by stuefe (Reviewer). > Hi Eric, > > please feel free to commit this, and answer my question at your leisure, if at all. I am fine with your change as it is (now that I understand it :) > > Cheers, Thomas Okay - thanks Thomas!
:-) -- Erik ------------- PR: https://git.openjdk.java.net/jdk/pull/2289 From jbachorik at openjdk.java.net Tue Mar 9 12:16:17 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 9 Mar 2021 12:16:17 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v9] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: <1SitAwn_HPwglwUiTJdjoACmIhD4U9My1RVGzGJz5Ho=.f3f6556f-3fb1-44c7-b522-5c15de734b26@github.com> On Wed, 3 Mar 2021 12:15:21 GMT, Thomas Schatzl wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: >> >> Add tests for the heap usage summary event > > Fwiw, the change still does not capture G1 full gc `live_estimate()`. @tschatzl I think I have finished the changes you requested. Please, take a look once you have time. Thanks! ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From jaroslav.bachorik at datadoghq.com Tue Mar 9 12:46:11 2021 From: jaroslav.bachorik at datadoghq.com (Jaroslav Bachorík) Date: Tue, 9 Mar 2021 13:46:11 +0100 Subject: RFR: 8247471: Enhance CPU load events with the actual elapsed CPU time In-Reply-To: References: Message-ID: Gentle ping (again)? -JB- > On Tue, Jan 26, 2021 at 2:43 PM Jaroslav Bachorik > wrote: > > > > A continuation of an RFR thread started last year - https://mail.openjdk.java.net/pipermail/hotspot-jfr-dev/2020-June/001533.html > > > > This change adds the raw CPU time value to CPU load events (per-thread and per-process as well). > > The CPU time value is already known and used to calculate the load, so adding it to the events does not incur any extra overhead while making it much easier for end users to, e.g., aggregate and compare the active execution time per time period without detailed knowledge of how JFR computes and normalizes the CPU load.
> > > > ------------- > > > > Commit messages: > > - Fix wording and remove unnecessary debug output > > - Fix jcheck > > - Merge branch 'master' into 8247471_cpuload_with_time > > - 8247471: Enhance CPU load events with the actual elapsed CPU time > > > > Changes: https://git.openjdk.java.net/jdk/pull/2186/files > > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2186&range=00 > > Issue: https://bugs.openjdk.java.net/browse/JDK-8247471 > > Stats: 543 lines in 11 files changed: 238 ins; 205 del; 100 mod > > Patch: https://git.openjdk.java.net/jdk/pull/2186.diff > > Fetch: git fetch https://git.openjdk.java.net/jdk pull/2186/head:pull/2186 > > > > PR: https://git.openjdk.java.net/jdk/pull/2186 From coleenp at openjdk.java.net Tue Mar 9 13:09:11 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 9 Mar 2021 13:09:11 GMT Subject: RFR: 8259643: ZGC can return metaspace OOM prematurely [v2] In-Reply-To: References: Message-ID: On Tue, 9 Mar 2021 11:34:36 GMT, Erik Österlund wrote: >> Marked as reviewed by stuefe (Reviewer). > >> Hi Eric, >> >> please feel free to commit this, and answer my question at your leisure, if at all. I am fine with your change as it is (now that I understand it :) >> >> Cheers, Thomas > > Okay - thanks Thomas! :-) > > -- Erik This change is very interesting and looks fine to me for the record, but you didn't add the test. I'm a bit greedy for class loading and unloading tests. Even if it has timing dependencies to show the bug, I think it would be worth adding to jtreg. Could you add it with a follow-on RFE? Thanks!
------------- PR: https://git.openjdk.java.net/jdk/pull/2289 From hseigel at openjdk.java.net Tue Mar 9 13:20:05 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 9 Mar 2021 13:20:05 GMT Subject: RFR: 8252173: Use handles instead of jobjects in modules.cpp In-Reply-To: References: <08_WNFzODhRXH2cxxjzOha1juG0v6qxiN9tN-r0yt2I=.951d01cb-6f8d-4493-a5c0-143174ba5673@github.com> Message-ID: On Tue, 9 Mar 2021 00:47:37 GMT, Coleen Phillimore wrote: >> Hi, >> Please review this change for JDK-8252173 to use handles instead of jobjects in modules.cpp to make modules.cpp more debuggable. The change was tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows, and tiers 3-5 on Linux x64. >> >> Thanks, Harold > > Looks good! Thanks Lois and Coleen for reviewing this! ------------- PR: https://git.openjdk.java.net/jdk/pull/2878 From hseigel at openjdk.java.net Tue Mar 9 13:20:07 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 9 Mar 2021 13:20:07 GMT Subject: Integrated: 8252173: Use handles instead of jobjects in modules.cpp In-Reply-To: <08_WNFzODhRXH2cxxjzOha1juG0v6qxiN9tN-r0yt2I=.951d01cb-6f8d-4493-a5c0-143174ba5673@github.com> References: <08_WNFzODhRXH2cxxjzOha1juG0v6qxiN9tN-r0yt2I=.951d01cb-6f8d-4493-a5c0-143174ba5673@github.com> Message-ID: On Mon, 8 Mar 2021 18:45:04 GMT, Harold Seigel wrote: > Hi, > Please review this change for JDK-8252173 to use handles instead of jobjects in modules.cpp to make modules.cpp more debuggable. The change was tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows, and tiers 3-5 on Linux x64. > > Thanks, Harold This pull request has now been integrated. 
Changeset: b7f0b3fc Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/b7f0b3fc Stats: 74 lines in 4 files changed: 17 ins; 5 del; 52 mod 8252173: Use handles instead of jobjects in modules.cpp Reviewed-by: lfoltan, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/2878 From coleenp at openjdk.java.net Tue Mar 9 13:53:22 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 9 Mar 2021 13:53:22 GMT Subject: RFR: 8262913: KlassFactory::create_from_stream should never return NULL Message-ID: <26nmze5lGBY7kw5lRwIWAtUxMQY8Upse98FbSHwp9Y8=.92590a38-2b2b-4d9e-b3a0-3620d965d1af@github.com> ClassFileParser.create_instance_klass cannot return NULL without a pending exception, it's called by klassFactory::create_from_stream who also cannot return NULL without a pending exception, it's called by SystemDictionary::parse_stream and SystemDictionary::resolve_from_stream who also cannot return NULL without a pending exception. I removed the NULL checks on returns from these 4 functions and either replaced them with an assert, or in cases that already had an unconditional indirection from the return value, just removed the null checks. I wrote a test case to cover the case of testing st->buffer() == NULL which returned NULL from SystemDictionary::resolve_from_stream to show that this code path is not used. ClassFileStream constructor will set buffer() to the u1* input argument if it doesn't point to anything and will get a ClassFormatError. Tested with tier1-3. 
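Editorial aside on the description above: the contract it relies on ("never returns NULL unless an error has been reported") can be sketched in standalone C++. This is only a minimal illustration, not HotSpot code. The names `Klass`, `ParseError`, `create_from_stream` and `resolve_from_stream` here are hypothetical stand-ins, and a C++ exception is assumed in place of HotSpot's pending-exception mechanism.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a loaded class; not HotSpot's Klass.
struct Klass {
  std::string name;
};

// Hypothetical stand-in for a pending ClassFormatError.
struct ParseError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Contract: either returns a non-null Klass or reports an error by throwing.
// It never returns nullptr silently.
std::unique_ptr<Klass> create_from_stream(const unsigned char* buffer,
                                          std::size_t length) {
  if (buffer == nullptr || length == 0) {
    throw ParseError("Truncated class file");
  }
  return std::make_unique<Klass>(Klass{"Example"});
}

// Caller: because of the contract above, a null check with its own error
// path would be dead code. An assert documents the invariant instead.
std::unique_ptr<Klass> resolve_from_stream(const unsigned char* buffer,
                                           std::size_t length) {
  std::unique_ptr<Klass> k = create_from_stream(buffer, length);
  assert(k != nullptr && "no error was reported, so a class must exist");
  return k;
}
```

With this shape, a caller that dereferences the result unconditionally needs no check at all, and other callers keep only the assert, which mirrors the two cases the description mentions.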
------------- Commit messages: - 8262913: KlassFactory::create_from_stream should never return NULL Changes: https://git.openjdk.java.net/jdk/pull/2892/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2892&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8262913 Stats: 244 lines in 9 files changed: 199 ins; 16 del; 29 mod Patch: https://git.openjdk.java.net/jdk/pull/2892.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2892/head:pull/2892 PR: https://git.openjdk.java.net/jdk/pull/2892 From hseigel at openjdk.java.net Tue Mar 9 13:55:16 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 9 Mar 2021 13:55:16 GMT Subject: RFR: 8247869: Change NONCOPYABLE to delete the operations Message-ID: Please review this fix for JDK-8247869. The fix was regression tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows and tiers 3-5 on Linux x64. Thanks, Harold ------------- Commit messages: - 8247869: Change NONCOPYABLE to delete the operations Changes: https://git.openjdk.java.net/jdk/pull/2891/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2891&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8247869 Stats: 13 lines in 1 file changed: 0 ins; 6 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/2891.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2891/head:pull/2891 PR: https://git.openjdk.java.net/jdk/pull/2891 From hseigel at openjdk.java.net Tue Mar 9 14:18:09 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 9 Mar 2021 14:18:09 GMT Subject: RFR: 8262913: KlassFactory::create_from_stream should never return NULL In-Reply-To: <26nmze5lGBY7kw5lRwIWAtUxMQY8Upse98FbSHwp9Y8=.92590a38-2b2b-4d9e-b3a0-3620d965d1af@github.com> References: <26nmze5lGBY7kw5lRwIWAtUxMQY8Upse98FbSHwp9Y8=.92590a38-2b2b-4d9e-b3a0-3620d965d1af@github.com> Message-ID: On Tue, 9 Mar 2021 13:45:42 GMT, Coleen Phillimore wrote: > ClassFileParser.create_instance_klass cannot return NULL without a pending 
exception, it's called by > klassFactory::create_from_stream who also cannot return NULL without a pending exception, it's called by > SystemDictionary::parse_stream and SystemDictionary::resolve_from_stream who also cannot return NULL without a pending exception. > > I removed the NULL checks on returns from these 4 functions and either replaced them with an assert, or in cases that already had an unconditional indirection from the return value, just removed the null checks. > > I wrote a test case to cover the case of testing st->buffer() == NULL which returned NULL from SystemDictionary::resolve_from_stream to show that this code path is not used. ClassFileStream constructor will set buffer() to the u1* input argument if it doesn't point to anything and will get a ClassFormatError. > > Tested with tier1-3. These changes look good! One suggestion is to consider changing "getMessage().equals(" calls in the test to "getMessage().contains(". This may make the test more likely to continue passing if the exception message gets enhanced. Thanks, Harold ------------- Marked as reviewed by hseigel (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2892 From coleenp at openjdk.java.net Tue Mar 9 14:20:10 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 9 Mar 2021 14:20:10 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v2] In-Reply-To: References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Tue, 9 Mar 2021 05:03:38 GMT, Ioi Lam wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove unnecessary tag comparison. 
> > src/hotspot/share/classfile/vmSymbols.hpp line 538: > >> 536: template(string_void_signature, "(Ljava/lang/String;)V") \ >> 537: template(string_int_signature, "(Ljava/lang/String;)I") \ >> 538: template(throwable_signature, "Ljava/lang/Throwable;") \ > > nit: need to align the backslash It was aligned with one below, I moved a line down so more backslashes would match. > src/hotspot/share/oops/constantPool.cpp line 784: > >> 782: } >> 783: >> 784: Symbol* exception_message(const constantPoolHandle& this_cp, int which, constantTag tag, oop pending_exception) { > > Does this function need to be static? yes. It should be declared static. ------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From coleenp at openjdk.java.net Tue Mar 9 14:28:42 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 9 Mar 2021 14:28:42 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v3] In-Reply-To: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: <0Rwg_Gpl8cOWB2YmRjgtDlwYv7e9fYKJbV-bzqmWoaw=.3dc6741a-6df5-48c3-9950-d5717234b4de@github.com> > This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. > > One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. > > I didn't squash the commits so it would be easier to see the different changes, but they all go together. 
> > The test description: > > Two Threads T1, T2 > > Three definitions of class A, defined by user defined class loader > Class A extends B extends A (CCE) > Class A extends B > Class A extends C > > Five modes: > Sequential > Concurrent loading with user defined class loader > Concurrent loading parallelCapable class loader > Wait when loading the superclass with parallelCapable class loader > Wait when loading the superclass with user defined class loader > > In all cases, after A is parsed and calls resolve_super_or_fail to load B > and loading B waits. Classes ClassInLoader, CP1 and CP2 provide > constant pool references to A. > > In all cases, when B waits, A is replaced with bytes so A extends C. > > Two tests x 3 modes (both threads do the same): > (CCE) First test A extends B, which throws CCE. > -- All three modes: first constant pool reference throws CCE, second reference A extends C > (B) Second test A extends B which doesn't throw CCE. > -- All three modes: both references A extends B. > > The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 > is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is > currently loading. > > Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. > > Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. 
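Editorial aside on the description above: the rule that a given constant pool reference must keep failing with the same error, even when several threads resolve it concurrently, can be sketched as a small table that records the first outcome per index. This is a simplified sketch under stated assumptions, not the HotSpot implementation. `ResolutionTable` and `Outcome` are hypothetical names, and holding one lock across the whole attempt is a simplification chosen for brevity.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <string>

// Hypothetical result of resolving one constant pool entry.
struct Outcome {
  bool ok;            // true if resolution succeeded
  std::string value;  // class name on success, error message on failure
};

// Records the first outcome for each constant pool index and replays it for
// every later attempt, so a failed resolution keeps failing the same way.
class ResolutionTable {
 public:
  Outcome resolve(int cp_index, const std::function<Outcome()>& attempt) {
    std::lock_guard<std::mutex> guard(_lock);
    auto it = _recorded.find(cp_index);
    if (it != _recorded.end()) {
      return it->second;  // replay the saved success or error
    }
    Outcome result = attempt();  // first attempt decides the outcome
    _recorded.emplace(cp_index, result);
    return result;
  }

 private:
  std::mutex _lock;
  std::map<int, Outcome> _recorded;
};
```

In this sketch a second thread that would have succeeded with different class bytes still observes the error saved by the first, which is the behaviour the description says the new tests check for.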
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Some code review changes from iklam ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2718/files - new: https://git.openjdk.java.net/jdk/pull/2718/files/b315aa70..133cbad4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2718&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2718&range=01-02 Stats: 4 lines in 2 files changed: 1 ins; 1 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/2718.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2718/head:pull/2718 PR: https://git.openjdk.java.net/jdk/pull/2718 From coleenp at openjdk.java.net Tue Mar 9 14:32:36 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 9 Mar 2021 14:32:36 GMT Subject: RFR: 8262913: KlassFactory::create_from_stream should never return NULL [v2] In-Reply-To: <26nmze5lGBY7kw5lRwIWAtUxMQY8Upse98FbSHwp9Y8=.92590a38-2b2b-4d9e-b3a0-3620d965d1af@github.com> References: <26nmze5lGBY7kw5lRwIWAtUxMQY8Upse98FbSHwp9Y8=.92590a38-2b2b-4d9e-b3a0-3620d965d1af@github.com> Message-ID: > ClassFileParser.create_instance_klass cannot return NULL without a pending exception, it's called by > klassFactory::create_from_stream who also cannot return NULL without a pending exception, it's called by > SystemDictionary::parse_stream and SystemDictionary::resolve_from_stream who also cannot return NULL without a pending exception. > > I removed the NULL checks on returns from these 4 functions and either replaced them with an assert, or in cases that already had an unconditional indirection from the return value, just removed the null checks. > > I wrote a test case to cover the case of testing st->buffer() == NULL which returned NULL from SystemDictionary::resolve_from_stream to show that this code path is not used. 
ClassFileStream constructor will set buffer() to the u1* input argument if it doesn't point to anything and will get a ClassFormatError. > > Tested with tier1-3. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Replace equals with contains, suggested by hseigel. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2892/files - new: https://git.openjdk.java.net/jdk/pull/2892/files/27e9994d..88166bea Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2892&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2892&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/2892.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2892/head:pull/2892 PR: https://git.openjdk.java.net/jdk/pull/2892 From coleenp at openjdk.java.net Tue Mar 9 14:32:37 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 9 Mar 2021 14:32:37 GMT Subject: RFR: 8262913: KlassFactory::create_from_stream should never return NULL [v2] In-Reply-To: References: <26nmze5lGBY7kw5lRwIWAtUxMQY8Upse98FbSHwp9Y8=.92590a38-2b2b-4d9e-b3a0-3620d965d1af@github.com> Message-ID: On Tue, 9 Mar 2021 14:15:11 GMT, Harold Seigel wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Replace equals with contains, suggested by hseigel. > > These changes look good! One suggestion is to consider changing "getMessage().equals(" calls in the test to "getMessage().contains(". This may make the test more likely to continue passing if the exception message gets enhanced. > Thanks, Harold Thanks Harold, and thank you for the suggested change. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2892 From kbarrett at openjdk.java.net Tue Mar 9 15:03:06 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 9 Mar 2021 15:03:06 GMT Subject: RFR: 8247869: Change NONCOPYABLE to delete the operations In-Reply-To: References: Message-ID: On Tue, 9 Mar 2021 13:43:22 GMT, Harold Seigel wrote: > Please review this fix for JDK-8247869. The fix was regression tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows and tiers 3-5 on Linux x64. > > Thanks, Harold Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2891 From kim.barrett at oracle.com Tue Mar 9 15:19:20 2021 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 9 Mar 2021 15:19:20 +0000 Subject: CFV: New HotSpot Group Member: Christian Hagedorn In-Reply-To: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> References: <3c708766-898c-50cd-1a2b-b12a2f97df71@oracle.com> Message-ID: <78CDB58F-9A31-4F6E-BDCF-00A332FC6FC0@oracle.com> vote: yes > On Mar 5, 2021, at 8:27 AM, Tobias Hartmann wrote: > > Hi, > > I hereby nominate Christian Hagedorn to Membership in the HotSpot Group. > > Christian is a member of the HotSpot Compiler Team at Oracle and a JDK Reviewer. He contributed over > 70 changes to the JDK project [1]. Christian has worked on both C1 and C2, acquiring expert > knowledge in key areas (for example, loop unswitching and superword optimizations). He investigated > and fixed several highly complex and long-standing issues in the code base and improved > maintainability of the JITs. All the while, Christian is constantly updating and extending the > sparse documentation, making life easier for other engineers. HotSpot Group membership would allow > Christian to continue to do so by adding to the OpenJDK wiki pages. > > Votes are due by Friday, 19 March 2021 at 13:30 UTC. > > Only current Members of the HotSpot Group [2] are eligible to vote on this nomination. 
Votes must > be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [3]. > > Best regards, > Tobias > > [1] > https://github.com/search?q=committer-name%3A%22Christian+Hagedorn%22+repo%3Aopenjdk%2Fjdk&type=commits > [2] https://openjdk.java.net/census > [3] https://openjdk.java.net/groups/#member-vote From hseigel at openjdk.java.net Tue Mar 9 16:09:18 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 9 Mar 2021 16:09:18 GMT Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class Message-ID: Please review this small change for JDK-8213177. The change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS, and Mach5 tiers 3-5 on Linux x64. Thanks, Harold ------------- Commit messages: - 8213177: GlobalCounter::CSContext could be an enum class Changes: https://git.openjdk.java.net/jdk/pull/2895/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2895&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8213177 Stats: 12 lines in 1 file changed: 10 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/2895.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2895/head:pull/2895 PR: https://git.openjdk.java.net/jdk/pull/2895 From akozlov at openjdk.java.net Tue Mar 9 16:12:36 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 9 Mar 2021 16:12:36 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v24] In-Reply-To: References: Message-ID: > Please review the implementation of JEP 391: macOS/AArch64 Port. > > It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. 
> > Major changes are in: > * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) > * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) > * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. > * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 105 commits: - Merge commit 'refs/pull/11/head' of https://github.com/AntonKozlov/jdk into jdk-macos - workaround JDK-8262895 by disabling subtest - Fix typo - Rename threadWXSetters.hpp -> threadWXSetters.inline.hpp - JDK-8259937: bsd_aarch64 part - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos - Fix after JDK-8259539, partially revert preconditions - JDK-8260471: bsd_aarch64 part - JDK-8259539: bsd_aarch64 part - JDK-8257828: bsd_aarch64 part - ... 
and 95 more: https://git.openjdk.java.net/jdk/compare/a6e34b3d...a72f6834 ------------- Changes: https://git.openjdk.java.net/jdk/pull/2200/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2200&range=23 Stats: 2873 lines in 73 files changed: 2787 ins; 27 del; 59 mod Patch: https://git.openjdk.java.net/jdk/pull/2200.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2200/head:pull/2200 PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Tue Mar 9 16:12:37 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 9 Mar 2021 16:12:37 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v12] In-Reply-To: <8MnBLkES1lapB4b01NDzU9nhOk8_9_V--NSCM5H_bg8=.7bdb576b-4acd-4e5b-be14-b363a2ef47bf@github.com> References: <8MnBLkES1lapB4b01NDzU9nhOk8_9_V--NSCM5H_bg8=.7bdb576b-4acd-4e5b-be14-b363a2ef47bf@github.com> Message-ID: On Tue, 9 Feb 2021 09:06:26 GMT, Stefan Karlsson wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Update signal handler part for debugger > > src/hotspot/share/runtime/threadWXSetters.hpp line 28: > >> 26: #define SHARE_RUNTIME_THREADWXSETTERS_HPP >> 27: >> 28: #include "runtime/thread.inline.hpp" > > This goes against our convention that forbids inline.hpp files from being included from .hpp files.
You need to rework this by either moving the implementation to a .cpp file, or convert this file into an .inline.hpp > > See the Source Files section in: > https://htmlpreview.github.io/?https://github.com/openjdk/jdk/blob/master/doc/hotspot-style.html Thanks, I renamed the file to threadWXSetters.inline.hpp ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Tue Mar 9 16:58:19 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 9 Mar 2021 16:58:19 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v12] In-Reply-To: <8MnBLkES1lapB4b01NDzU9nhOk8_9_V--NSCM5H_bg8=.7bdb576b-4acd-4e5b-be14-b363a2ef47bf@github.com> References: <8MnBLkES1lapB4b01NDzU9nhOk8_9_V--NSCM5H_bg8=.7bdb576b-4acd-4e5b-be14-b363a2ef47bf@github.com> Message-ID: On Tue, 9 Feb 2021 09:12:13 GMT, Stefan Karlsson wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Update signal handler part for debugger > > src/hotspot/share/runtime/thread.hpp line 848: > >> 846: void init_wx(); >> 847: WXMode enable_wx(WXMode new_state); >> 848: #endif // __APPLE__ && AARCH64 > > Now that this is only compiled into macOS/AArch64, could this be moved over to thread_bsd_aarch64.hpp? The same goes for the associated functions. The thread_bsd_aarch64.hpp describes a part of JavaThread, while this block belongs to Thread for now. Since W^X is an attribute of any operating system thread, I assumed Thread to be the right place for W^X bookkeeping. In most cases, we manage W^X state of JavaThread. But sometimes a GC thread needs the WXWrite state, or safefetch is called from non-JavaThread. Probably this can be dealt with (e.g. GCThread to always have the WXWrite state). But such change would be much more than a simple refactoring and it would require a significant amount of testing. 
Ideally, I would like to investigate this as a follow-up change, or at least after other fixes to this PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Tue Mar 9 17:58:21 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 9 Mar 2021 17:58:21 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v12] In-Reply-To: <8MnBLkES1lapB4b01NDzU9nhOk8_9_V--NSCM5H_bg8=.7bdb576b-4acd-4e5b-be14-b363a2ef47bf@github.com> References: <8MnBLkES1lapB4b01NDzU9nhOk8_9_V--NSCM5H_bg8=.7bdb576b-4acd-4e5b-be14-b363a2ef47bf@github.com> Message-ID: On Tue, 9 Feb 2021 09:23:50 GMT, Stefan Karlsson wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Update signal handler part for debugger > > src/hotspot/share/runtime/thread.cpp line 2515: > >> 2513: void JavaThread::check_special_condition_for_native_trans(JavaThread *thread) { >> 2514: // Enable WXWrite: called directly from interpreter native wrapper. >> 2515: MACOS_AARCH64_ONLY(ThreadWXEnable wx(WXWrite, thread)); > > FWIW, I personally think that adding these MACOS_AARCH64_ONLY usages at the call sites increase the line-noise in the affected functions. I think I would have preferred a version: > ThreadWXEnable(WXMode new_mode, Thread* thread = NULL) { > MACOS_AARCH64_ONLY(initialize(new_mode, thread);) {} > void initialize(...); // Implementation in thread_bsd_aarch64.cpp (alt. inline.hpp) > With that said, I'm fine with taking this discussion as a follow-up. The former version used no such macros. I like that now it's clear the W^X management is relevant to macos/aarch64 only. I see the point to move the pre-processor condition into the class implementation. But I think it will bring a bit of inconsistency, as the rest of W^X implementation is explicitly guarded by preprocessor conditionals. 
I've also tried to push macro conditionals as far down into the implementation as possible, providing a kind of generalized W^X interface. That required a few artificial decisions, e.g. what would we call the mode used on the remaining platforms, where both write and execute are allowed: WXWriteExec? I abandoned that attempt. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Tue Mar 9 18:04:19 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 9 Mar 2021 18:04:19 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v21] In-Reply-To: References: Message-ID: On Mon, 1 Mar 2021 10:31:19 GMT, Andrew Haley wrote: >> Anton Kozlov has updated the pull request incrementally with two additional commits since the last revision: >> >> - Merge remote-tracking branch 'origin/jdk/jdk-macos' into jdk-macos >> - Minor fixes > > src/hotspot/cpu/aarch64/globalDefinitions_aarch64.hpp line 62: > >> 60: >> 61: #if defined(__APPLE__) || defined(_WIN64) >> 62: #define R18_RESERVED > > #define R18_RESERVED true We always check for `R18_RESERVED` with `#if(n)def`, is there any reason to define the value for the macro? ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From iignatyev at openjdk.java.net Tue Mar 9 18:52:14 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Tue, 9 Mar 2021 18:52:14 GMT Subject: RFR: 8246494: introduce vm.flagless at-requires property Message-ID: resurrecting old [RFR](https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-June/041981.html): > Hi all, > > could you please review the patch which introduces a new @requires property to filter out the tests which ignore externally provided JVM flags? 
> > the idea behind this patch is to have a way to clearly mark tests which ignore flags, so > a) it's obvious that they don't execute a flag-guarded code/feature, and extra care should be taken to use them to verify any flag-guarded changes; > b) they can be easily excluded from runs w/ flags. > > @requires and VMProps allow us to achieve both, so it's been decided to add a new property `vm.flagless`. `vm.flagless` is set to false if there are any XX flags other than `-XX:MaxRAMPercentage` and `-XX:CreateCoredumpOnCrash` (which are known to be set almost always) or any X flags other than `-Xmixed`; in other words any tests w/ `@requires vm.flagless` will be excluded from runs w/ any other X / XX flags passed via `-vmoption` / `-javaoption`. in rare cases, when one still wants to run the tests marked by `vm.flagless` w/ external flags, `vm.flagless` can be forcefully set to true by setting the `TEST_VM_FLAGLESS` env. variable to any value. > > this patch adds necessary common changes and marks common tests, namely Scimark, GTestWrapper and TestNativeProcessBuilder. Component-specific tests will be marked separately by the corresponding subtasks of 8151707[1]. > > please note, the patch depends on CODETOOLS-7902336[2], which will be included in the next jtreg version, so this patch is to be integrated only after jtreg5.1 is promoted and we switch to use it by 8246387[3]. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8246494 > webrev: http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00 > testing: marked tests w/ different XX and X flags w/ and w/o TEST_VM_FLAGLESS env. 
var, and w/o any flags > > [1] https://bugs.openjdk.java.net/browse/JDK-8151707 > [2] https://bugs.openjdk.java.net/browse/CODETOOLS-7902336 > [3] https://bugs.openjdk.java.net/browse/JDK-8246387 > after offline discussion with @pliden, it has been decided to reduce the scope of [8246499](https://bugs.openjdk.java.net/browse/JDK-8246499) and not mark the tests that use `UseXGC` flags for selection, e.g. `test/hotspot/jtreg/gc/z/TestSmallHeap.java`. Thanks, -- Igor ------------- Commit messages: - update copyright year - 8246494: introduce vm.flagless at-requires property Changes: https://git.openjdk.java.net/jdk/pull/2800/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2800&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8246494 Stats: 81 lines in 6 files changed: 75 ins; 0 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/2800.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2800/head:pull/2800 PR: https://git.openjdk.java.net/jdk/pull/2800 From minqi at openjdk.java.net Tue Mar 9 19:12:28 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Tue, 9 Mar 2021 19:12:28 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v2] In-Reply-To: References: Message-ID: > Hi, Please review > > Added jcmd option for dumping CDS archive during application runtime. Before this change, user has to dump shared archive in two steps: first run application with > `java -XX:DumpLoadedClassList= .... ` > to collect shareable class names and saved in file `` , then > `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` > With this change, user can use jcmd to dump CDS without going through above steps. Also user can choose a moment during the app runtime to dump an archive. > The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. 
> New added jcmd option: > `jcmd VM.cds static_dump ` > or > `jcmd VM.cds dynamic_dump ` > To dump dynamic archive, requires start app with newly added flag `-XX:+RecordDynamicDumpInfo`, with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changed some object memory locations so for dumping dynamic archive, can only done once for a running app. For static dump, user can dump multiple times against same process. > The file name is optional, if the file name is not supplied, the file name will take format of `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. The `` is the application process ID. > > Tests: tier1,tier2,tier3,tier4 > > Thanks > Yumin Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Add function CDS.dumpSharedArchive in java to dump shared archive ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2737/files - new: https://git.openjdk.java.net/jdk/pull/2737/files/e371456c..d486c06e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2737&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2737&range=00-01 Stats: 450 lines in 13 files changed: 258 ins; 156 del; 36 mod Patch: https://git.openjdk.java.net/jdk/pull/2737.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2737/head:pull/2737 PR: https://git.openjdk.java.net/jdk/pull/2737 From minqi at openjdk.java.net Tue Mar 9 21:50:25 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Tue, 9 Mar 2021 21:50:25 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v3] In-Reply-To: References: Message-ID: > Hi, Please review > > Added jcmd option for dumping CDS archive during application runtime. Before this change, user has to dump shared archive in two steps: first run application with > `java -XX:DumpLoadedClassList= .... 
` > to collect shareable class names and saved in file `` , then > `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` > With this change, user can use jcmd to dump CDS without going through above steps. Also user can choose a moment during the app runtime to dump an archive. > The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. > New added jcmd option: > `jcmd VM.cds static_dump ` > or > `jcmd VM.cds dynamic_dump ` > To dump dynamic archive, requires start app with newly added flag `-XX:+RecordDynamicDumpInfo`, with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changed some object memory locations so for dumping dynamic archive, can only done once for a running app. For static dump, user can dump multiple times against same process. > The file name is optional, if the file name is not supplied, the file name will take format of `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. The `` is the application process ID. 
> > Tests: tier1,tier2,tier3,tier4 > > Thanks > Yumin Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Fix white space in CDS.java ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2737/files - new: https://git.openjdk.java.net/jdk/pull/2737/files/d486c06e..bfa71577 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2737&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2737&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/2737.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2737/head:pull/2737 PR: https://git.openjdk.java.net/jdk/pull/2737 From igor.ignatyev at oracle.com Tue Mar 9 22:41:24 2021 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 9 Mar 2021 22:41:24 +0000 Subject: RFR(S) : 8246494 : introduce vm.flagless at-requires property In-Reply-To: References: <5b1cac8c-7e9b-195a-edfa-6ab972e32bf0@oracle.com> Message-ID: RFR got migrated to github as https://github.com/openjdk/jdk/pull/2800 -- Igor On Jun 5, 2020, at 9:10 AM, Igor Ignatyev > wrote: Hi Per, you are reading this correctly, make TEST=test/hotspot/jtreg/gc/z/TestSmallHeap.java JTREG="VM_OPTIONS=-XX:+UseZGC" won't execute gc/z/TestSmallHeap.java; and I don't see it to be incorrect. Let me try to explain why using gc/z/TestSmallHeap.java as a running example. A hotspot test is expected not to be just runnable in an out-of-box configuration, but also to serve its purpose as much as possible (which is not always 100% given some tests require special build flavor, environment setup, etc); in other words, a test is to at least have all necessary VM flags within it and not to hope that someone will provide them. gc/z/TestSmallHeap.java does that, it explicitly selects zGC, so there is no need for -XX:+UseZGC to achieve that. 
Given this test can be run only when zGC can be selected, it @requires vm.gc.Z, which is set to true if zGC is already explicitly selected or if zGC is available and no other GC is specified, and the latter holds for an out-of-box configuration (assuming that zGC is available in the JVM under test); thus, again, you don't have to specify -XX:+UseZGC to run this test. So there are no "technical" reasons to run gc/z/TestSmallHeap.java (or any other gc/z/ tests) with -XX:+UseZGC. The proposed patches don't change that fact in any way. The patches exclude the tests that ignore external VM flags from execution if any significant VM flags are specified. gc/z/TestSmallHeap.java ignores all externally provided VM flags, including -XX:+UseZGC. And although in the case of -XX:+UseZGC, it's harmless, in almost all other cases it's not. Just to give you a few examples: Let's say you are fixing a bug in zGC which could be reproduced by gc/z/TestSmallHeap.java. You came up with two alternative solutions, one of which is guarded by `if (UseNewCode)`. To test these solutions, you ran gc/z tests twice: with -XX:+UseZGC -XX:+UseNewCode, and all tests passed; with XX:+UseZGC, and many tests (but not gc/z/TestSmallHeap.java) failed. So based on these results, you decided that the guarded solution is perfect, cleaned up the code, sent it out for review, got it pushed, and minutes later found out that gc/z/TestSmallHeap.java and some other tests which ignore VM flags failed. It would take you some time, to realize that you hadn't tested your UseNewCode solution by these tests. Yet were these tests excluded from your testing, it would be much easier for you to spot that and react accordingly. Here is another scenario, you decided to change the default value of ZUncommit, so you ran different tests with `XX:+UseZGC -XX:-ZUncommit`, all green, you pushed a trivial change s/true/false in z_globals.hpp, next thing you knew a bunch of zGC specific tests failed in CI. 
And again, these were the tests that silently ignored `XX:+UseZGC -XX:-ZUncommit`. Or a slight variation, zGC-supported was added to a future JIT, gc/z tests were run with the flag combination which enabled the future JIT, all passed, the victory was declared; N releases later; default JIT got changed to the future JIT; the next CI build is a disaster, with lots of tests failing from the bugs which had not been found N/2 years ago. Although I understand that it might take some getting used to from you and others who used to run gc/x tests with -XX:+Use${X}GC, I am certain that this will improve the overall quality of hotspot, save not only machine time (from running these tests with other flags) but engineers time from analyzing surprising failures, and increase confidence and trust in the hotspot test suite. In a word, I can see how this can be a bit surprising, yet still less surprising than the current behavior, but I don't see it as incorrect, it just surfaces limitations of certain tests. From my (slightly biased) point of view, it's the right thing to do. Thanks. -- Igor On Jun 5, 2020, at 1:20 AM, Per Liden > wrote: Hi Igor, When looking at the follow-up sub-tasks for this, I see for example this: http://cr.openjdk.java.net/~iignatyev/8246499/webrev.00/test/hotspot/jtreg/gc/z/TestSmallHeap.java.udiff.html Maybe I'm misunderstanding how this is supposed to work, but it looks like this test would now _not_ be executed if I do: make TEST=test/hotspot/jtreg/gc/z/TestSmallHeap.java JTREG="VM_OPTIONS=-XX:+UseZGC" Is that so? In that case, that seems incorrect. cheers, Per On 6/3/20 11:30 PM, Igor Ignatyev wrote: http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00 70 lines changed: 66 ins; 0 del; 4 mod Hi all, could you please review the patch which introduces a new @requires property to filter out the tests which ignore externally provided JVM flags? 
the idea behind this patch is to have a way to clearly mark tests which ignore flags, so a) it's obvious that they don't execute a flag-guarded code/feature, and extra care should be taken to use them to verify any flag-guarded changes; b) they can be easily excluded from runs w/ flags. @requires and VMProps allow us to achieve both, so it's been decided to add a new property `vm.flagless`. `vm.flagless` is set to false if there are any XX flags other than `-XX:MaxRAMPercentage` and `-XX:CreateCoredumpOnCrash` (which are known to be set almost always) or any X flags other than `-Xmixed`; in other words any tests w/ `@requires vm.flagless` will be excluded from runs w/ any other X / XX flags passed via `-vmoption` / `-javaoption`. in rare cases, when one still wants to run the tests marked by `vm.flagless` w/ external flags, `vm.flagless` can be forcefully set to true by setting the `TEST_VM_FLAGLESS` env. variable to any value. this patch adds necessary common changes and marks common tests, namely Scimark, GTestWrapper and TestNativeProcessBuilder. Component-specific tests will be marked separately by the corresponding subtasks of 8151707[1]. please note, the patch depends on CODETOOLS-7902336[2], which will be included in the next jtreg version, so this patch is to be integrated only after jtreg5.1 is promoted and we switch to use it by 8246387[3]. JBS: https://bugs.openjdk.java.net/browse/JDK-8246494 webrev: http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00 testing: marked tests w/ different XX and X flags w/ and w/o TEST_VM_FLAGLESS env. 
var, and w/o any flags [1] https://bugs.openjdk.java.net/browse/JDK-8151707 [2] https://bugs.openjdk.java.net/browse/CODETOOLS-7902336 [3] https://bugs.openjdk.java.net/browse/JDK-8246387 Thanks, -- Igor From dholmes at openjdk.java.net Tue Mar 9 23:43:07 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 9 Mar 2021 23:43:07 GMT Subject: RFR: 8247869: Change NONCOPYABLE to delete the operations In-Reply-To: References: Message-ID: On Tue, 9 Mar 2021 13:43:22 GMT, Harold Seigel wrote: > Please review this fix for JDK-8247869. The fix was regression tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows and tiers 3-5 on Linux x64. > > Thanks, Harold LGTM! Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2891 From iklam at openjdk.java.net Wed Mar 10 01:27:27 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 10 Mar 2021 01:27:27 GMT Subject: RFR: 8263002: Remove CDS MiscCode region [v3] In-Reply-To: References: Message-ID: <86dJ3twAzGlbhJzcmLcYSGHt3WOl4Y-R3hM2hYMuawU=.6ac28771-40e3-4805-9aa4-1caa762e3f19@github.com> > The CDS MiscCode region is used for: > (a) C++ vtables > (b) Method trampolines > > (a) can be moved to the ReadWrite region > (b) were introduced in JDK-8145221 so we can delay writing into Methods. This was intended to improve copy-on-write sharing to reduce memory footprint. However, this hasn't been shown to have any significant effect (footprint of metadata usually is much smaller than the Java heap), and introduces a lot of complexity in the HotSpot code. > > Removing (b) will make it easier to implement JDK-8026297 (Generating AdapterHandlerEntry during CDS dump), which will further improve start-up time. > > ============ > Other benefits of removing the MiscCode region: > > - We no longer have a read/write/executable region. This address the concern in JDK-8262922. > - We can enable CDS on macOS/AArch64, which does not allow read/write/executable regions. 
(JDK-8253795) Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: - Merge branch 'master' into 8263002-remove-cds-mc-region - Merge branch 'master' into 8263002-remove-cds-mc-region - bumped CURRENT_CDS_ARCHIVE_VERSION by one since format has changed - @coleenp review: remove temp debug code; fixed accounting of vtable sizes - remove MC region - 8263002: Remove CDS MiscCode region ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2861/files - new: https://git.openjdk.java.net/jdk/pull/2861/files/a3e4f25e..a74edc28 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2861&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2861&range=01-02 Stats: 4238 lines in 135 files changed: 1818 ins; 1750 del; 670 mod Patch: https://git.openjdk.java.net/jdk/pull/2861.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2861/head:pull/2861 PR: https://git.openjdk.java.net/jdk/pull/2861 From iklam at openjdk.java.net Wed Mar 10 01:38:09 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 10 Mar 2021 01:38:09 GMT Subject: RFR: 8262913: KlassFactory::create_from_stream should never return NULL [v2] In-Reply-To: References: <26nmze5lGBY7kw5lRwIWAtUxMQY8Upse98FbSHwp9Y8=.92590a38-2b2b-4d9e-b3a0-3620d965d1af@github.com> Message-ID: On Tue, 9 Mar 2021 14:32:36 GMT, Coleen Phillimore wrote: >> ClassFileParser.create_instance_klass cannot return NULL without a pending exception, it's called by >> klassFactory::create_from_stream who also cannot return NULL without a pending exception, it's called by >> SystemDictionary::parse_stream and SystemDictionary::resolve_from_stream who also cannot return NULL without a pending exception. 
>> >> I removed the NULL checks on returns from these 4 functions and either replaced them with an assert, or in cases that already had an unconditional indirection from the return value, just removed the null checks. >> >> I wrote a test case to cover the case of testing st->buffer() == NULL which returned NULL from SystemDictionary::resolve_from_stream to show that this code path is not used. ClassFileStream constructor will set buffer() to the u1* input argument if it doesn't point to anything and will get a ClassFormatError. >> >> Tested with tier1-3. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Replace equals with contains, suggested by hseigel. One small nit about the test code. The rest looks good to me. test/hotspot/jtreg/runtime/DefineClass/NullClassBytesTest.java line 73: > 71: } > 72: > 73: byte[] getClassData(String name) { This function can be simplified to something like return SimpleLoader.class.getClassLoader().getResourceAsStream(name + ".class").readAllBytes(); That way it will be more resilient if jtreg wants to put the classfile elsewhere. ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2892 From coleenp at openjdk.java.net Wed Mar 10 02:03:32 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 10 Mar 2021 02:03:32 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v4] In-Reply-To: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: > This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. 
The compiler was also doing re-resolution rather than using the error saved at that constant pool index. > > One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. > > I didn't squash the commits so it would be easier to see the different changes, but they all go together. > > The test description: > > Two Threads T1, T2 > > Three definitions of class A, defined by user defined class loader > Class A extends B extends A (CCE) > Class A extends B > Class A extends C > > Five modes: > Sequential > Concurrent loading with user defined class loader > Concurrent loading parallelCapable class loader > Wait when loading the superclass with parallelCapable class loader > Wait when loading the superclass with user defined class loader > > In all cases, after A is parsed and calls resolve_super_or_fail to load B > and loading B waits. Classes ClassInLoader, CP1 and CP2 provide > constant pool references to A. > > In all cases, when B waits, A is replaced with bytes so A extends C. > > Two tests x 3 modes (both threads do the same): > (CCE) First test A extends B, which throws CCE. > -- All three modes: first constant pool reference throws CCE, second reference A extends C > (B) Second test A extends B which doesn't throw CCE. > -- All three modes: both references A extends B. > > The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 > is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is > currently loading. > > Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. 
Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. > > Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Remove complicated test in favor of Ioi's test, fix cause. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2718/files - new: https://git.openjdk.java.net/jdk/pull/2718/files/133cbad4..7461dec8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2718&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2718&range=02-03 Stats: 785 lines in 11 files changed: 176 ins; 607 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/2718.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2718/head:pull/2718 PR: https://git.openjdk.java.net/jdk/pull/2718 From coleenp at openjdk.java.net Wed Mar 10 02:03:33 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 10 Mar 2021 02:03:33 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v2] In-Reply-To: References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Tue, 9 Mar 2021 05:30:56 GMT, Ioi Lam wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove unnecessary tag comparison. > > Just some initial comments. So far the runtime changes look reasonable to me. I'll continue tomorrow. I removed the overly complicated test, that was meant to exercise the function SystemDictionary::handle_parallel_super_load, and thanks to Ioi for a new test. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From iklam at openjdk.java.net Wed Mar 10 04:05:10 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 10 Mar 2021 04:05:10 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v4] In-Reply-To: References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Wed, 10 Mar 2021 02:03:32 GMT, Coleen Phillimore wrote: >> This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. >> >> One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. >> >> I didn't squash the commits so it would be easier to see the different changes, but they all go together. >> >> The test description: >> >> Two Threads T1, T2 >> >> Three definitions of class A, defined by user defined class loader >> Class A extends B extends A (CCE) >> Class A extends B >> Class A extends C >> >> Five modes: >> Sequential >> Concurrent loading with user defined class loader >> Concurrent loading parallelCapable class loader >> Wait when loading the superclass with parallelCapable class loader >> Wait when loading the superclass with user defined class loader >> >> In all cases, after A is parsed and calls resolve_super_or_fail to load B >> and loading B waits. Classes ClassInLoader, CP1 and CP2 provide >> constant pool references to A. >> >> In all cases, when B waits, A is replaced with bytes so A extends C. >> >> Two tests x 3 modes (both threads do the same): >> (CCE) First test A extends B, which throws CCE. 
>> -- All three modes: first constant pool reference throws CCE, second reference A extends C >> (B) Second test A extends B which doesn't throw CCE. >> -- All three modes: both references A extends B. >> >> The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 >> is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is >> currently loading. >> >> Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. >> >> Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Remove complicated test in favor of Ioi's test, fix cause. LGTM ------------- Marked as reviewed by iklam (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2718 From coleenp at openjdk.java.net Wed Mar 10 04:33:34 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 10 Mar 2021 04:33:34 GMT Subject: RFR: 8262913: KlassFactory::create_from_stream should never return NULL [v3] In-Reply-To: <26nmze5lGBY7kw5lRwIWAtUxMQY8Upse98FbSHwp9Y8=.92590a38-2b2b-4d9e-b3a0-3620d965d1af@github.com> References: <26nmze5lGBY7kw5lRwIWAtUxMQY8Upse98FbSHwp9Y8=.92590a38-2b2b-4d9e-b3a0-3620d965d1af@github.com> Message-ID: > ClassFileParser.create_instance_klass cannot return NULL without a pending exception, it's called by > klassFactory::create_from_stream who also cannot return NULL without a pending exception, it's called by > SystemDictionary::parse_stream and SystemDictionary::resolve_from_stream who also cannot return NULL without a pending exception. > > I removed the NULL checks on returns from these 4 functions and either replaced them with an assert, or in cases that already had an unconditional indirection from the return value, just removed the null checks. > > I wrote a test case to cover the case of testing st->buffer() == NULL which returned NULL from SystemDictionary::resolve_from_stream to show that this code path is not used. ClassFileStream constructor will set buffer() to the u1* input argument if it doesn't point to anything and will get a ClassFormatError. > > Tested with tier1-3. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Improve test, suggested by Ioi. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2892/files - new: https://git.openjdk.java.net/jdk/pull/2892/files/88166bea..a9459502 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2892&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2892&range=01-02 Stats: 15 lines in 1 file changed: 0 ins; 8 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/2892.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2892/head:pull/2892 PR: https://git.openjdk.java.net/jdk/pull/2892 From iklam at openjdk.java.net Wed Mar 10 04:42:09 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 10 Mar 2021 04:42:09 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v3] In-Reply-To: References: Message-ID: On Tue, 9 Mar 2021 21:50:25 GMT, Yumin Qi wrote: >> Hi, Please review >> >> Added jcmd option for dumping CDS archive during application runtime. Before this change, user has to dump shared archive in two steps: first run application with >> `java -XX:DumpLoadedClassList= .... ` >> to collect shareable class names and saved in file `` , then >> `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` >> With this change, user can use jcmd to dump CDS without going through above steps. Also user can choose a moment during the app runtime to dump an archive. >> The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. >> New added jcmd option: >> `jcmd VM.cds static_dump ` >> or >> `jcmd VM.cds dynamic_dump ` >> To dump dynamic archive, requires start app with newly added flag `-XX:+RecordDynamicDumpInfo`, with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changed some object memory locations so for dumping dynamic archive, can only done once for a running app. For static dump, user can dump multiple times against same process. 
>> The file name is optional, if the file name is not supplied, the file name will take format of `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. The `` is the application process ID. >> >> Tests: tier1,tier2,tier3,tier4 >> >> Thanks >> Yumin > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Fix white space in CDS.java Changes requested by iklam (Reviewer). src/hotspot/share/services/diagnosticCommand.cpp line 1124: > 1122: } > 1123: Symbol* cds_name = vmSymbols::jdk_internal_misc_CDS(); > 1124: Klass* cds_klass = SystemDictionary::resolve_or_null(cds_name, THREAD); Should be `cds_klass = SystemDictionary::resolve_or_fail(cds_name, CHECK);` src/java.base/share/classes/jdk/internal/misc/CDS.java line 278: > 276: dumpDynamicArchive(archiveFile); > 277: } > 278: } I think we should have some error checks and clean up: - Remove the classlist file - Check if if the process exit status is 0 - Remove the JSA file first, then try to dump it, and check if the file exists afterwards. If not, report the error. (For both dynamic and static dumps) src/java.base/share/classes/jdk/internal/misc/CDS.java line 256: > 254: > 255: // Do not take parent env which will cause dumping fail. > 256: Process proc = Runtime.getRuntime().exec(cmds.toArray(new String[0]), Could you explain why the parent's env variables will cause dumping to fail? src/java.base/share/classes/jdk/internal/misc/CDS.java line 213: > 211: testStr.contains("-XX:+DynamicDumpSharedSpaces") || > 212: testStr.contains("-XX:+RecordDynamicDumpInfo"); > 213: } The following flags should also be excluded: - -XX:-DumpSharedSpaces - -Xshare: - -XX:SharedClassListFile= - -XX:SharedArchiveFile= - -XX:ArchiveClassesAtExit= - -XX:+UseSharedSpaces - -XX:+RequireSharedSpaces We also need to have a few test cases when the LingeredApp is started with these flags. 
src/java.base/share/classes/jdk/internal/misc/CDS.java line 262: > 260: String line; > 261: InputStreamReader isr = new InputStreamReader(proc.getInputStream()); > 262: BufferedReader rdr = new BufferedReader(isr); Also, I think the output should always be logged. Otherwise if an error happens, it's very difficult for the user to diagnose (and they won't know about the "CDS.Debug" property). ------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From iklam at openjdk.java.net Wed Mar 10 06:11:10 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 10 Mar 2021 06:11:10 GMT Subject: Integrated: 8263002: Remove CDS MiscCode region In-Reply-To: References: Message-ID: On Sun, 7 Mar 2021 06:26:00 GMT, Ioi Lam wrote: > The CDS MiscCode region is used for: > (a) C++ vtables > (b) Method trampolines > > (a) can be moved to the ReadWrite region > (b) were introduced in JDK-8145221 so we can delay writing into Methods. This was intended to improve copy-on-write sharing to reduce memory footprint. However, this hasn't been shown to have any significant effect (footprint of metadata usually is much smaller than the Java heap), and introduces a lot of complexity in the HotSpot code. > > Removing (b) will make it easier to implement JDK-8026297 (Generating AdapterHandlerEntry during CDS dump), which will further improve start-up time. > > ============ > Other benefits of removing the MiscCode region: > > - We no longer have a read/write/executable region. This address the concern in JDK-8262922. > - We can enable CDS on macOS/AArch64, which does not allow read/write/executable regions. (JDK-8253795) This pull request has now been integrated. 
Changeset: d8a9c3ca Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/d8a9c3ca Stats: 658 lines in 39 files changed: 17 ins; 542 del; 99 mod 8263002: Remove CDS MiscCode region Reviewed-by: coleenp, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/2861 From iklam at openjdk.java.net Wed Mar 10 06:11:09 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 10 Mar 2021 06:11:09 GMT Subject: RFR: 8263002: Remove CDS MiscCode region [v3] In-Reply-To: <4EsEk0VVQDAbkJzXmrMh2a_B4sFq7diglgCo3UqZpBQ=.5f4541c2-76aa-46da-82b5-dc8e537a85f3@github.com> References: <4EsEk0VVQDAbkJzXmrMh2a_B4sFq7diglgCo3UqZpBQ=.5f4541c2-76aa-46da-82b5-dc8e537a85f3@github.com> Message-ID: On Tue, 9 Mar 2021 00:03:02 GMT, Coleen Phillimore wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into 8263002-remove-cds-mc-region >> - Merge branch 'master' into 8263002-remove-cds-mc-region >> - bumped CURRENT_CDS_ARCHIVE_VERSION by one since format has changed >> - @coleenp review: remove temp debug code; fixed accounting of vtable sizes >> - remove MC region >> - 8263002: Remove CDS MiscCode region > > wow. Looks good to me. Thanks @coleenp and @dholmes-ora for the review! 
------------- PR: https://git.openjdk.java.net/jdk/pull/2861 From dholmes at openjdk.java.net Wed Mar 10 06:26:08 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 10 Mar 2021 06:26:08 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v4] In-Reply-To: References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: <8cnuA1d34aX7tNS8QodY0mPR_kiTQeAwV4qLr3CVlDY=.a5984cf3-2eb8-4b22-a858-f2151f29f5f1@github.com> On Wed, 10 Mar 2021 02:03:32 GMT, Coleen Phillimore wrote: >> This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. >> >> One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. >> >> I didn't squash the commits so it would be easier to see the different changes, but they all go together. >> >> The test description: >> >> Two Threads T1, T2 >> >> Three definitions of class A, defined by user defined class loader >> Class A extends B extends A (CCE) >> Class A extends B >> Class A extends C >> >> Five modes: >> Sequential >> Concurrent loading with user defined class loader >> Concurrent loading parallelCapable class loader >> Wait when loading the superclass with parallelCapable class loader >> Wait when loading the superclass with user defined class loader >> >> In all cases, after A is parsed and calls resolve_super_or_fail to load B >> and loading B waits. Classes ClassInLoader, CP1 and CP2 provide >> constant pool references to A. >> >> In all cases, when B waits, A is replaced with bytes so A extends C. 
>> >> Two tests x 3 modes (both threads do the same): >> (CCE) First test A extends B, which throws CCE. >> -- All three modes: first constant pool reference throws CCE, second reference A extends C >> (B) Second test A extends B which doesn't throw CCE. >> -- All three modes: both references A extends B. >> >> The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 >> is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is >> currently loading. >> >> Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. >> >> Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Remove complicated test in favor of Ioi's test, fix cause. Hi Coleen, That took some going through - sorry it took me a while to get back to it. The addition of the "cause" processing would have been better separated into its own RFE to keep things simple. I'll defer to Vladimir on the actual compiler changes. The rest I think I understand okay. I have a couple of additional comments below. Thanks, David src/hotspot/share/oops/constantPool.cpp line 561: > 559: // We also need to CAS to not overwrite an error from a racing thread. 
> 560: > 561: jbyte old_tag = Atomic::cmpxchg((jbyte*)this_cp->tag_addr_at(which), So the theory of operation is: if all racing threads resolve the klass successfully then they will have obtained the same klass instance and so it doesn't matter which thread actually called Atomic::release_store(adr, k); above most recently. One thread will manage to update the tag and all threads will return 'k'. If any thread encounters a resolution error there is a another CAS of the tag to ensure only one of them is first. If the error thread is first the others will see the UnresolvedClassInError and clear the klass from resolved_klasses() again. src/hotspot/share/oops/constantPool.cpp line 568: > 566: if (old_tag == JVM_CONSTANT_UnresolvedClassInError) { > 567: // Remove klass. > 568: Atomic::release_store(adr, (Klass*)NULL); You don't need a release_store in this case as you just did a CAS which has full memory synchronization. But in addition as you are setting it to NULL there are no prior stores that you need to ensure are visible. src/hotspot/share/oops/constantPool.cpp line 587: > 585: > 586: if (this_cp->tag_at(which).is_klass()) { > 587: Klass* k = this_cp->resolved_klasses()->at(resolved_klass_index); If this can't be encapsulated in a helper function please add a comment about checking the tag first. test/hotspot/jtreg/runtime/ParallelLoad/SaveResolutionErrorTest.java line 158: > 156: public Class loadClass(String name) throws ClassNotFoundException { > 157: if (name.equals("SaveResolutionErrorTest$Loadee")) { > 158: if (hack()) { You could just use a boolean field 'first'. src/hotspot/share/oops/constantPool.cpp line 877: > 875: // Needs clarification to section 5.4.3 of the VM spec (see 6308271) > 876: } else if (this_cp->tag_at(which).value() != error_tag) { > 877: add_resolution_error(this_cp, which, tag, PENDING_EXCEPTION); If the CAS below fails, which is this addition of the resolution error not undone? ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2718 From dholmes at openjdk.java.net Wed Mar 10 06:26:09 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 10 Mar 2021 06:26:09 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v4] In-Reply-To: References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Fri, 5 Mar 2021 12:52:13 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/constantPool.cpp line 555: >> >>> 553: >>> 554: Klass** adr = this_cp->resolved_klasses()->adr_at(resolved_klass_index); >>> 555: Atomic::release_store(adr, k); >> >> If we are racing then isn't it the case that we may not have an entry in resolved_klasses()? > > The order is: add the klass to resolved_klasses() and then set the tag. We need to check the tag in order to see whether the klass is correct or not. This is the way it worked before resolved_klasses() was added, but there were a couple of shortcuts to just check the klass != NULL. With the race to set UnresolvedClassInError, we need to check the tag first again, because the klass is set to null if the unresolved class has won the race. Oh sorry - this is the code that sets the klass in the resolved_klasses(). I see now that we must always check the tag before using the value from resolved_klasses(). 
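The "always check the tag before using the value from resolved_klasses()" rule can be sketched outside HotSpot roughly as follows. This is only an illustration under stated assumptions: `CpEntry`, `klass_if_resolved`, and the `k...` tag constants are hypothetical names, and `std::atomic` stands in for HotSpot's own `Atomic` and tag accessors, which are not reproduced here.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the constant pool tag values discussed above.
constexpr int8_t kUnresolvedClass        = 0;
constexpr int8_t kUnresolvedClassInError = 1;
constexpr int8_t kResolvedClass          = 2;

struct Klass { const char* name; };

// Hypothetical model of one constant pool slot: a tag plus a resolved-klass cell.
struct CpEntry {
  std::atomic<int8_t> tag{kUnresolvedClass};
  std::atomic<Klass*> klass{nullptr};
};

// Reader side: the klass cell is trusted only when the tag says "resolved".
// A non-null cell alone proves nothing, because a racing thread may have
// stored a klass and then lost the race against an error being recorded.
Klass* klass_if_resolved(const CpEntry& e) {
  if (e.tag.load(std::memory_order_acquire) != kResolvedClass) {
    return nullptr;
  }
  return e.klass.load(std::memory_order_acquire);
}
```

In this model a shortcut such as "klass != NULL means resolved" would return a stale klass in exactly the error race described in the thread, which is why the tag is consulted first.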
------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From david.holmes at oracle.com Wed Mar 10 06:48:45 2021 From: david.holmes at oracle.com (David Holmes) Date: Wed, 10 Mar 2021 16:48:45 +1000 Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v4] In-Reply-To: <8cnuA1d34aX7tNS8QodY0mPR_kiTQeAwV4qLr3CVlDY=.a5984cf3-2eb8-4b22-a858-f2151f29f5f1@github.com> References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> <8cnuA1d34aX7tNS8QodY0mPR_kiTQeAwV4qLr3CVlDY=.a5984cf3-2eb8-4b22-a858-f2151f29f5f1@github.com> Message-ID: Fix typo in comment: On 10/03/2021 4:26 pm, David Holmes wrote: > src/hotspot/share/oops/constantPool.cpp line 877: > >> 875: // Needs clarification to section 5.4.3 of the VM spec (see 6308271) >> 876: } else if (this_cp->tag_at(which).value() != error_tag) { >> 877: add_resolution_error(this_cp, which, tag, PENDING_EXCEPTION); > > If the CAS below fails, which is this addition of the resolution error not undone? s/which is/why is/ I think this was pre-existing but I'm not at all clear why we leave the entry in the table. Thanks, David > ------------- > > Marked as reviewed by dholmes (Reviewer). > > PR: https://git.openjdk.java.net/jdk/pull/2718 > From github.com+10482586+therealeliu at openjdk.java.net Wed Mar 10 10:50:20 2021 From: github.com+10482586+therealeliu at openjdk.java.net (Eric Liu) Date: Wed, 10 Mar 2021 10:50:20 GMT Subject: RFR: 8263058: Optimize vector shift with zero shift count Message-ID: Like scalar shift, vector shift could do nothing when shift count is zero. 
This patch implements the 'Identity' method for all kinds of vector shift nodes to optimize out 'ShiftVCntNode 0', which is typically a redundant 'mov' in final generated code like below: add x17, x12, x14 ldr q16, [x17, #16] mov v16.16b, v16.16b add x14, x13, x14 str q16, [x14, #16] With this patch, the code above could be optimized as below: add x17, x12, x14 ldr q16, [x17, #16] add x14, x13, x14 str q16, [x14, #16] [TESTS] compiler/vectorapi/TestVectorShiftImm.java, jdk/incubator/vector, hotspot::tier1 passed without new failure. ------------- Commit messages: - 8263058: Optimize vector shift with zero shift count Changes: https://git.openjdk.java.net/jdk/pull/2911/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2911&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263058 Stats: 73 lines in 4 files changed: 21 ins; 26 del; 26 mod Patch: https://git.openjdk.java.net/jdk/pull/2911.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2911/head:pull/2911 PR: https://git.openjdk.java.net/jdk/pull/2911 From ysuenaga at openjdk.java.net Wed Mar 10 11:20:05 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 10 Mar 2021 11:20:05 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v2] In-Reply-To: References: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com> Message-ID: On Wed, 3 Mar 2021 05:19:00 GMT, Yasumasa Suenaga wrote: >>> > >>> > Maybe you could use the files under `/sys/devices/virtual/dmi/id/` like `board_vendor` and `board_name`? >>> >>> It does not exist on Raspberry Pi OS. >>> >> >> That's because the default Raspberry Pi firmware uses Device Tree. You'll only get those files if the system was booted from UEFI (x86-style firmware). 
> > As I said in my earlier comment, I want to fix this for Device Tree first if the board name cannot be obtained in the same way for Device Tree and ACPI. If the device tree cannot be read, "AArch64" is still used for the CPU description, which is the same behavior as the current implementation. > > Maybe I can extend this change to read `/sys/devices/virtual/dmi/id/` if I re-install a UEFI-capable OS (e.g. Fedora) on my Pi 4, but I cannot do that now, so I want to handle it in a separate issue. Ping! Could you review this PR? ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From aph at openjdk.java.net Wed Mar 10 11:48:10 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 10 Mar 2021 11:48:10 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v2] In-Reply-To: References: <2STI00DOT4Vgc0J4Y1Psz88Ku-9D0P5IVKisf-cHtUc=.b472f516-b24d-474e-9a10-5f3b87ae65dc@github.com> <2e5-2VKW1xK8pUzG3SXpzTvUo_T_UQXastnZVl-Yx8w=.ed316054-fe7b-440b-9bae-7287e5861d61@github.com> Message-ID: On Wed, 10 Mar 2021 11:16:50 GMT, Yasumasa Suenaga wrote: > Ping! Could you review this PR? Fix the out-of-bounds memory access and I'll approve it.
------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From aph at openjdk.java.net Wed Mar 10 11:48:11 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 10 Mar 2021 11:48:11 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v6] In-Reply-To: References: Message-ID: <5qPXesyUGV7LYBT-QR4bSUsDZ8RSm-yWMORQorp5zR8=.f1a62fa2-b038-4073-ad06-e87b7100e2e4@github.com> On Wed, 3 Mar 2021 14:05:02 GMT, Yasumasa Suenaga wrote: > So I set `\0` to the tail of `buf` at L178. Which you then overwrite on Line 183. You really do need to move Line 174 to the end. I ran a test with the read filling the whole buffer. ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From ysuenaga at openjdk.java.net Wed Mar 10 12:55:24 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 10 Mar 2021 12:55:24 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v7] In-Reply-To: References: Message-ID: <_c5LV36LafwVPQsqKMYQ99bbhrsbt5Q4E7RgOftLHtw=.1739ace2-3d00-4ce6-a046-e35c176d8a9e@github.com> > HotSpot generates a CPU description when it starts. We can see it in the `jdk.CPUInformation` JFR event as below: > > $ jfr print --events jdk.CPUInformation raspi4.jfr > jdk.CPUInformation { > startTime = 22:57:13.521 > cpu = "AArch64" > description = "AArch64 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > `description` contains "AArch64", which is a fixed value, so we cannot tell which machine (SoC) the process ran on. > > On Linux, we can use the `compatible` property in the device tree to identify the machine. The 'compatible' property contains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with, sorted from most compatible to least.
> > After this change, we can get the description as below: > > jdk.CPUInformation { > startTime = 00:32:49.767 > cpu = "AArch64" > description = "raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > In Linux on AMD64, we can see as following, then we can guess the CPU model from it. The same should do for AArch64. > > jdk.CPUInformation { > startTime = 17:28:03.907 > cpu = "AMD (null) (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 SSE4A AMD64" > description = "Brand: AMD Ryzen 3 3300X 4-Core Processor , Vendor: AuthenticAMD > Family: (0x17), Model: (0x71), Stepping: 0x0 > Ext. family: 0x8, Ext. model: 0x7, Type: 0x0, Signature: 0x00870f10 > Features: ebx: 0x01020800, ecx: 0xfed83203, edx: 0x178bfbff > Ext. features: eax: 0x00870f10, ebx: 0x20000000, ecx: 0x004003f3, edx: 0x2fd3fbff > Supports: On-Chip FPU, Virtual Mode Extensions, Debugging Extensions, Page Size Extensions, Time Stamp Counter, Model Specific Registers, Physical Address Extension, Machine Check Exceptions, CMPXCHG8B Instruction, On-Chip APIC, Fast System Call, Memory Type Range Registers, Page Global Enable, Machine Check Architecture, Conditional Mov Instruction, Page Attribute Table, 36-bit Page Size Extension, CLFLUSH Instruction, Intel Architecture MMX Technology, Fast Float Point Save and Restore, Streaming SIMD extensions, Streaming SIMD extensions 2, Hyper Threading, Streaming SIMD Extensions 3, PCLMULQDQ, Supplemental Streaming SIMD Extensions 3, Fused Multiply-Add, CMPXCHG16B, Streaming SIMD extensions 4.1, Streaming SIMD extensions 4.2, MOVBE, Popcount instruction, AESNI, XSAVE, OSXSAVE, AVX, F16C, LAHF/SAHF instruction support, Core multi-processor leagacy mode, Advanced Bit Manipulations: LZCNT, SSE4A: MOVNTSS, MOVNTSD, EXTRQ, INSERTQ, Misaligned SSE mode, SYSCALL/SYSRET, Execute Dis able Bit, RDTSCP, Intel 64 Architecture" > sockets = 1 > cores = 2 > hwThreads = 2 > } Yasumasa Suenaga has updated the pull request incrementally with 
one additional commit since the last revision: Fix buffer overflow ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2759/files - new: https://git.openjdk.java.net/jdk/pull/2759/files/28edb130..124e2cb2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2759&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2759&range=05-06 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/2759.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2759/head:pull/2759 PR: https://git.openjdk.java.net/jdk/pull/2759 From ysuenaga at openjdk.java.net Wed Mar 10 13:00:08 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 10 Mar 2021 13:00:08 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v6] In-Reply-To: <5qPXesyUGV7LYBT-QR4bSUsDZ8RSm-yWMORQorp5zR8=.f1a62fa2-b038-4073-ad06-e87b7100e2e4@github.com> References: <5qPXesyUGV7LYBT-QR4bSUsDZ8RSm-yWMORQorp5zR8=.f1a62fa2-b038-4073-ad06-e87b7100e2e4@github.com> Message-ID: On Wed, 3 Mar 2021 16:49:39 GMT, Andrew Haley wrote: >> So I set `\0` to the tail of `buf` at L178. > >> So I set `\0` to the tail of `buf` at L178. > > Which you then overwrite on Line 183. You really do need to move Line 174 to the end. > I ran a test with the read filling the whole buffer. Thanks for your comment! I pushed a new commit. I now call `read()` with `buflen - 1` bytes and set `\0` at the tail of `buf` after `read()` returns. The conversion of `\0` to a blank also happens only when `read()` returns 1 or greater. How about it?
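For readers following the buffer discussion, the approach described above (read at most `buflen - 1` bytes, terminate only after the read, then turn the NUL separators into blanks) might look roughly like the sketch below. It is not the code from the PR: `read_compatible` is a hypothetical helper name, and dropping the trailing NUL bytes instead of turning them into trailing blanks is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unistd.h>   // POSIX read(), pipe(), write(), close()

// Hypothetical helper: reads a NUL-separated string list (as found in a
// device tree "compatible" property) from fd into buf and makes it one
// space-separated C string. Returns the text length, or -1 on failure.
ssize_t read_compatible(int fd, char* buf, size_t buflen) {
  if (buf == nullptr || buflen == 0) {
    return -1;
  }
  // Leave room for the terminator, so a read that fills its whole request
  // can never push the terminating NUL past the end of buf.
  ssize_t n = read(fd, buf, buflen - 1);
  if (n <= 0) {
    buf[0] = '\0';
    return n < 0 ? -1 : 0;
  }
  // Terminate after the read so nothing later overwrites the terminator.
  buf[n] = '\0';
  // Assumption of this sketch: trailing NULs are trimmed, not blanked.
  while (n > 0 && buf[n - 1] == '\0') {
    n--;
  }
  // Embedded NULs separate the entries; replace them with blanks.
  for (ssize_t i = 0; i < n; i++) {
    if (buf[i] == '\0') {
      buf[i] = ' ';
    }
  }
  return n;
}
```

With an input of `raspberrypi,4-model-b\0brcm,bcm2711\0` this yields the string shown in the new `description` field, and with a buffer smaller than the input the result is simply truncated while staying NUL-terminated.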
------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From hseigel at openjdk.java.net Wed Mar 10 13:17:10 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 10 Mar 2021 13:17:10 GMT Subject: RFR: 8247869: Change NONCOPYABLE to delete the operations In-Reply-To: References: Message-ID: <7xT9H0v8LkqKlMrAjidbsxvhdjxZkcVMrSjYbACGqdM=.c8e102e1-6f00-48b2-a720-6260c6162789@github.com> On Tue, 9 Mar 2021 23:40:39 GMT, David Holmes wrote: >> Please review this fix for JDK-8247869. The fix was regression tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows and tiers 3-5 on Linux x64. >> >> Thanks, Harold > > LGTM! > > Thanks, > David Thanks Kim and David for reviewing this change! ------------- PR: https://git.openjdk.java.net/jdk/pull/2891 From hseigel at openjdk.java.net Wed Mar 10 13:17:10 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 10 Mar 2021 13:17:10 GMT Subject: Integrated: 8247869: Change NONCOPYABLE to delete the operations In-Reply-To: References: Message-ID: On Tue, 9 Mar 2021 13:43:22 GMT, Harold Seigel wrote: > Please review this fix for JDK-8247869. The fix was regression tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows and tiers 3-5 on Linux x64. > > Thanks, Harold This pull request has now been integrated. 
Changeset: fab56766 Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/fab56766 Stats: 13 lines in 1 file changed: 0 ins; 6 del; 7 mod 8247869: Change NONCOPYABLE to delete the operations Reviewed-by: kbarrett, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/2891 From coleenp at openjdk.java.net Wed Mar 10 13:54:08 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 10 Mar 2021 13:54:08 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v4] In-Reply-To: <8cnuA1d34aX7tNS8QodY0mPR_kiTQeAwV4qLr3CVlDY=.a5984cf3-2eb8-4b22-a858-f2151f29f5f1@github.com> References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> <8cnuA1d34aX7tNS8QodY0mPR_kiTQeAwV4qLr3CVlDY=.a5984cf3-2eb8-4b22-a858-f2151f29f5f1@github.com> Message-ID: On Wed, 10 Mar 2021 05:43:43 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove complicated test in favor of Ioi's test, fix cause. > > src/hotspot/share/oops/constantPool.cpp line 561: > >> 559: // We also need to CAS to not overwrite an error from a racing thread. >> 560: >> 561: jbyte old_tag = Atomic::cmpxchg((jbyte*)this_cp->tag_addr_at(which), > > So the theory of operation is: if all racing threads resolve the klass successfully then they will have obtained the same klass instance and so it doesn't matter which thread actually called Atomic::release_store(adr, k); > above most recently. One thread will manage to update the tag and all threads will return 'k'. > If any thread encounters a resolution error there is a another CAS of the tag to ensure only one of them is first. If the error thread is first the others will see the UnresolvedClassInError and clear the klass from resolved_klasses() again. Yes. 
> src/hotspot/share/oops/constantPool.cpp line 587: > >> 585: >> 586: if (this_cp->tag_at(which).is_klass()) { >> 587: Klass* k = this_cp->resolved_klasses()->at(resolved_klass_index); > > If this can't be encapsulated in a helper function please add a comment about checking the tag first. Ok. I'll add the comment and file an RFE to encapsulate this. ------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From coleenp at openjdk.java.net Wed Mar 10 13:58:09 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 10 Mar 2021 13:58:09 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v4] In-Reply-To: <8cnuA1d34aX7tNS8QodY0mPR_kiTQeAwV4qLr3CVlDY=.a5984cf3-2eb8-4b22-a858-f2151f29f5f1@github.com> References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> <8cnuA1d34aX7tNS8QodY0mPR_kiTQeAwV4qLr3CVlDY=.a5984cf3-2eb8-4b22-a858-f2151f29f5f1@github.com> Message-ID: On Wed, 10 Mar 2021 05:50:35 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove complicated test in favor of Ioi's test, fix cause. > > test/hotspot/jtreg/runtime/ParallelLoad/SaveResolutionErrorTest.java line 158: > >> 156: public Class loadClass(String name) throws ClassNotFoundException { >> 157: if (name.equals("SaveResolutionErrorTest$Loadee")) { >> 158: if (hack()) { > > You could just use a boolean field 'first'. Ioi might have wanted to defend against optimization here, so I prefer to leave it. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From coleenp at openjdk.java.net Wed Mar 10 14:04:10 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 10 Mar 2021 14:04:10 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v4] In-Reply-To: <8cnuA1d34aX7tNS8QodY0mPR_kiTQeAwV4qLr3CVlDY=.a5984cf3-2eb8-4b22-a858-f2151f29f5f1@github.com> References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> <8cnuA1d34aX7tNS8QodY0mPR_kiTQeAwV4qLr3CVlDY=.a5984cf3-2eb8-4b22-a858-f2151f29f5f1@github.com> Message-ID: <7eg9c-z1rX7RX6KRLXbk_g55tGwG_AhcPujTcTJwsv8=.2b983ad7-ce80-402b-9542-799d5e0252bd@github.com> On Wed, 10 Mar 2021 05:52:26 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove complicated test in favor of Ioi's test, fix cause. > > src/hotspot/share/oops/constantPool.cpp line 877: > >> 875: // Needs clarification to section 5.4.3 of the VM spec (see 6308271) >> 876: } else if (this_cp->tag_at(which).value() != error_tag) { >> 877: add_resolution_error(this_cp, which, tag, PENDING_EXCEPTION); > > If the CAS below fails, which is this addition of the resolution error not undone? It probably should be but we don't have a function to delete an individual resolution error, only all of them for the constant pool. So this needs to be done in a future RFE. > src/hotspot/share/oops/constantPool.cpp line 568: > >> 566: if (old_tag == JVM_CONSTANT_UnresolvedClassInError) { >> 567: // Remove klass. >> 568: Atomic::release_store(adr, (Klass*)NULL); > > You don't need a release_store in this case as you just did a CAS which has full memory synchronization. But in addition as you are setting it to NULL there are no prior stores that you need to ensure are visible. Ok, I'll change this to an assignment. 
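The writer side of the protocol reviewed above (store the klass, CAS the tag, and clear the klass again if an error won the race) can be modelled as a small sketch. All names here (`CpEntry`, `publish_resolved`, `record_error`, the `k...` constants) are hypothetical, `std::atomic` with its default sequentially consistent CAS stands in for HotSpot's `Atomic::cmpxchg`, and the relaxed store models the point that no release ordering is needed when writing NULL right after the CAS.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the constant pool tag values.
constexpr int8_t kUnresolvedClass        = 0;
constexpr int8_t kUnresolvedClassInError = 1;
constexpr int8_t kResolvedClass          = 2;

struct Klass { const char* name; };

struct CpEntry {
  std::atomic<int8_t> tag{kUnresolvedClass};
  std::atomic<Klass*> klass{nullptr};
};

// Success path: publish the klass first, then try to flip the tag.
// Returns the klass, or nullptr if a racing thread recorded an error first
// (the real code would then report the saved resolution error).
Klass* publish_resolved(CpEntry& e, Klass* k) {
  // Release store: the klass contents must be visible before the pointer is.
  e.klass.store(k, std::memory_order_release);
  int8_t expected = kUnresolvedClass;
  if (e.tag.compare_exchange_strong(expected, kResolvedClass)) {
    return k;  // this thread updated the tag
  }
  if (expected == kUnresolvedClassInError) {
    // An error won the race: remove the klass. The CAS above was fully
    // ordered and nullptr publishes no prior stores, so relaxed suffices.
    e.klass.store(nullptr, std::memory_order_relaxed);
    return nullptr;
  }
  return k;  // another thread already marked the same klass as resolved
}

// Error path: only the first thread to record the error wins the CAS.
bool record_error(CpEntry& e) {
  int8_t expected = kUnresolvedClass;
  return e.tag.compare_exchange_strong(expected, kUnresolvedClassInError);
}
```

The sketch deliberately leaves out the resolution error table, so the open question from the review (whether a losing thread's added error entry should be undone) is not modelled.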
------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From enikitin at openjdk.java.net Wed Mar 10 14:15:08 2021 From: enikitin at openjdk.java.net (Evgeny Nikitin) Date: Wed, 10 Mar 2021 14:15:08 GMT Subject: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2] In-Reply-To: References: <2_Gpraz6NaY17HPfRDW-LD-sQrrPQ4dpIVP8vikpdXM=.d425cd8b-aea5-43be-865e-72229db81e6e@github.com> <2qEkvkaxAPHeFaDoCRmcPaehczQgwZNnZMxO2Z-Vc28=.d4845a88-7d71-4768-b952-5ff9c4ab8311@github.com> <0Kpq74YISM_ggEDcc5XKgpLywkxLVFe3qIzJ6nxpSOw=.42b05b7a-717e-4a21-b300-c621a4296d9f@github.com> Message-ID: On Thu, 4 Mar 2021 13:15:59 GMT, Vladimir Ivanov wrote: >> Yes, they are native wrappers but, in contrast to c2i/i2c adapters, they are still implemented as `nmethods`. I'm not a JSR292 expert but I think this is because they are potentially containing (meta)data that needs to be discovered when walking the code cache and iterator methods like `CodeCache::metadata_do` will only walk the `nmethod` heaps. They might also use other properties of `nmethods`. So I think the question would be more like "could native wrappers be implemented as `BufferBlobs`, similar to i2c/c2i adapters?" >> @iwanowww, what do you think? > > I don't see a compelling reason why method handle linkers have to be nmethods and live in 'profiled'/'non-profiled' code heaps. I think the reason why it works that way now is the linkers are treated as ordinary native wrappers (since linker methods are just signature-polymorphic native static methods declared on `java.lang.invoke.MethodHandle` class). But native wrappers are represented as `nmethod`s for a reason: they can be unloaded along with the class. > It's not the case with MH linkers which aren't unloaded at all. > > Please, file an RFE if you find it desirable to put MH linkers into 'non-nmethods' heap. 
Raised the RFE: https://bugs.openjdk.java.net/browse/JDK-8263377 ------------- PR: https://git.openjdk.java.net/jdk/pull/2523 From coleenp at openjdk.java.net Wed Mar 10 14:15:30 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 10 Mar 2021 14:15:30 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v5] In-Reply-To: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: > This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. > > One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. > > I didn't squash the commits so it would be easier to see the different changes, but they all go together. > > The test description: > > Two Threads T1, T2 > > Three definitions of class A, defined by user defined class loader > Class A extends B extends A (CCE) > Class A extends B > Class A extends C > > Five modes: > Sequential > Concurrent loading with user defined class loader > Concurrent loading parallelCapable class loader > Wait when loading the superclass with parallelCapable class loader > Wait when loading the superclass with user defined class loader > > In all cases, after A is parsed and calls resolve_super_or_fail to load B > and loading B waits. Classes ClassInLoader, CP1 and CP2 provide > constant pool references to A. > > In all cases, when B waits, A is replaced with bytes so A extends C. 
> > Two tests x 3 modes (both threads do the same): > (CCE) First test A extends B, which throws CCE. > -- All three modes: first constant pool reference throws CCE, second reference A extends C > (B) Second test A extends B which doesn't throw CCE. > -- All three modes: both references A extends B. > > The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 > is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is > currently loading. > > Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. > > Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add comment and removed atomic operation. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2718/files - new: https://git.openjdk.java.net/jdk/pull/2718/files/7461dec8..6ad7ab6f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2718&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2718&range=03-04 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/2718.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2718/head:pull/2718 PR: https://git.openjdk.java.net/jdk/pull/2718 From coleenp at openjdk.java.net Wed Mar 10 14:15:31 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 10 Mar 2021 14:15:31 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v4] In-Reply-To: <8cnuA1d34aX7tNS8QodY0mPR_kiTQeAwV4qLr3CVlDY=.a5984cf3-2eb8-4b22-a858-f2151f29f5f1@github.com> References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> <8cnuA1d34aX7tNS8QodY0mPR_kiTQeAwV4qLr3CVlDY=.a5984cf3-2eb8-4b22-a858-f2151f29f5f1@github.com> Message-ID: On Wed, 10 Mar 2021 06:23:32 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove complicated test in favor of Ioi's test, fix cause. > > Hi Coleen, > > That took some going through - sorry it took me a while to get back to it. The addition of the "cause" processing would have been better separated into its own RFE to keep things simple. > > I'll defer to Vladimir on the actual compiler changes. The rest I think I understand okay. I have a couple of additional comments below. > > Thanks, > David Thanks David for the review. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From coleenp at openjdk.java.net Wed Mar 10 14:21:11 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 10 Mar 2021 14:21:11 GMT Subject: RFR: 8262913: KlassFactory::create_from_stream should never return NULL [v2] In-Reply-To: References: <26nmze5lGBY7kw5lRwIWAtUxMQY8Upse98FbSHwp9Y8=.92590a38-2b2b-4d9e-b3a0-3620d965d1af@github.com> Message-ID: On Wed, 10 Mar 2021 01:35:00 GMT, Ioi Lam wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Replace equals with contains, suggested by hseigel. > > One small nit about the test code. The rest looks good to me. Thanks Harold and Ioi for the code reviews. > test/hotspot/jtreg/runtime/DefineClass/NullClassBytesTest.java line 73: > >> 71: } >> 72: >> 73: byte[] getClassData(String name) { > > This function can be simplified to something like > > return SimpleLoader.class.getClassLoader().getResourceAsStream(name + ".class").readAllBytes(); > > That way it will be more resilient if jtreg wants to put the classfile elsewhere. I like this suggestion! We should file an RFE to fix all the tests that have this getClassData function. 
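A minimal sketch of the simplification suggested above, assuming the class file is visible as a resource to the loader of some anchor class. The names ClassBytes and getClassData(Class, String) are hypothetical and are not the code in NullClassBytesTest.java. Unlike the one-liner in the review, this version also maps a package-qualified name to a resource path and reports a missing resource instead of throwing a NullPointerException.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ClassBytes {
    // Hypothetical helper: reads the class file for `name` through the class
    // loader that defined `anchor`, so it does not depend on where the test
    // harness placed the compiled classes on disk.
    static byte[] getClassData(Class<?> anchor, String name) {
        String resource = name.replace('.', '/') + ".class";
        ClassLoader loader = anchor.getClassLoader();
        // A null loader means the bootstrap loader; fall back to the system lookup.
        try (InputStream in = (loader != null)
                ? loader.getResourceAsStream(resource)
                : ClassLoader.getSystemResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("class file not found: " + resource);
            }
            return in.readAllBytes(); // available since Java 9
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = getClassData(ClassBytes.class, "java.lang.Object");
        System.out.println("read " + data.length + " bytes");
    }
}
```

Going through the loader keeps the helper independent of the directory layout, which is the resilience argument made in the review comment.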
------------- PR: https://git.openjdk.java.net/jdk/pull/2892 From coleenp at openjdk.java.net Wed Mar 10 14:21:11 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 10 Mar 2021 14:21:11 GMT Subject: Integrated: 8262913: KlassFactory::create_from_stream should never return NULL In-Reply-To: <26nmze5lGBY7kw5lRwIWAtUxMQY8Upse98FbSHwp9Y8=.92590a38-2b2b-4d9e-b3a0-3620d965d1af@github.com> References: <26nmze5lGBY7kw5lRwIWAtUxMQY8Upse98FbSHwp9Y8=.92590a38-2b2b-4d9e-b3a0-3620d965d1af@github.com> Message-ID: On Tue, 9 Mar 2021 13:45:42 GMT, Coleen Phillimore wrote: > ClassFileParser.create_instance_klass cannot return NULL without a pending exception, it's called by > klassFactory::create_from_stream who also cannot return NULL without a pending exception, it's called by > SystemDictionary::parse_stream and SystemDictionary::resolve_from_stream who also cannot return NULL without a pending exception. > > I removed the NULL checks on returns from these 4 functions and either replaced them with an assert, or in cases that already had an unconditional indirection from the return value, just removed the null checks. > > I wrote a test case to cover the case of testing st->buffer() == NULL which returned NULL from SystemDictionary::resolve_from_stream to show that this code path is not used. ClassFileStream constructor will set buffer() to the u1* input argument if it doesn't point to anything and will get a ClassFormatError. > > Tested with tier1-3. This pull request has now been integrated. 
Changeset: 4d21a455 Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/4d21a455 Stats: 236 lines in 9 files changed: 191 ins; 16 del; 29 mod 8262913: KlassFactory::create_from_stream should never return NULL Reviewed-by: hseigel, iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/2892 From coleenp at openjdk.java.net Wed Mar 10 14:25:08 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 10 Mar 2021 14:25:08 GMT Subject: RFR: 8262913: KlassFactory::create_from_stream should never return NULL [v2] In-Reply-To: References: <26nmze5lGBY7kw5lRwIWAtUxMQY8Upse98FbSHwp9Y8=.92590a38-2b2b-4d9e-b3a0-3620d965d1af@github.com> Message-ID: On Wed, 10 Mar 2021 14:14:26 GMT, Coleen Phillimore wrote: >> test/hotspot/jtreg/runtime/DefineClass/NullClassBytesTest.java line 73: >> >>> 71: } >>> 72: >>> 73: byte[] getClassData(String name) { >> >> This function can be simplified to something like >> >> return SimpleLoader.class.getClassLoader().getResourceAsStream(name + ".class").readAllBytes(); >> >> That way it will be more resilient if jtreg wants to put the classfile elsewhere. > > I like this suggestion! We should file an RFE to fix all the tests that have this getClassData function. 
I think this was supposed to be SimpleLoader.getResourceAsStream(name + ".class").readAllBytes(); ------------- PR: https://git.openjdk.java.net/jdk/pull/2892 From enikitin at openjdk.java.net Wed Mar 10 14:26:07 2021 From: enikitin at openjdk.java.net (Evgeny Nikitin) Date: Wed, 10 Mar 2021 14:26:07 GMT Subject: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2] In-Reply-To: References: <2_Gpraz6NaY17HPfRDW-LD-sQrrPQ4dpIVP8vikpdXM=.d425cd8b-aea5-43be-865e-72229db81e6e@github.com> <2qEkvkaxAPHeFaDoCRmcPaehczQgwZNnZMxO2Z-Vc28=.d4845a88-7d71-4768-b952-5ff9c4ab8311@github.com> <0Kpq74YISM_ggEDcc5XKgpLywkxLVFe3qIzJ6nxpSOw=.42b05b7a-717e-4a21-b300-c621a4296d9f@github.com> Message-ID: On Wed, 10 Mar 2021 14:12:41 GMT, Evgeny Nikitin wrote: >> I don't see a compelling reason why method handle linkers have to be nmethods and live in 'profiled'/'non-profiled' code heaps. I think the reason why it works that way now is the linkers are treated as ordinary native wrappers (since linker methods are just signature-polymorphic native static methods declared on `java.lang.invoke.MethodHandle` class). But native wrappers are represented as `nmethod`s for a reason: they can be unloaded along with the class. >> It's not the case with MH linkers which aren't unloaded at all. >> >> Please, file an RFE if you find it desirable to put MH linkers into 'non-nmethods' heap. > > Raised the RFE: https://bugs.openjdk.java.net/browse/JDK-8263377 Here's the rationale behind the non-nmethods heap monitoring: the test i2c_c2i that fills up that heap. The test was started with 8MB per each heap, and consumed all the non-nmethods one. The consumption speed diminishes quickly after ~10 MB, but with the limits of 8 MB (stated in the case) we can hit the Error: Heaps statistics: [non-nm: 1.8 MiB : 22%][profiled: 781.3 KiB : 9%][non-profiled: 189.3 KiB : 2%] ... 
Heaps statistics: [non-nm: 8.0 MiB : 99%][profiled: 3.0 MiB : 37%][non-profiled: 8.0 MiB : 99%] ------------- PR: https://git.openjdk.java.net/jdk/pull/2523 From hseigel at openjdk.java.net Wed Mar 10 15:00:10 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 10 Mar 2021 15:00:10 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v5] In-Reply-To: References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Wed, 10 Mar 2021 14:15:30 GMT, Coleen Phillimore wrote: >> This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. >> >> One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. >> >> I didn't squash the commits so it would be easier to see the different changes, but they all go together. >> >> The test description: >> >> Two Threads T1, T2 >> >> Three definitions of class A, defined by user defined class loader >> Class A extends B extends A (CCE) >> Class A extends B >> Class A extends C >> >> Five modes: >> Sequential >> Concurrent loading with user defined class loader >> Concurrent loading parallelCapable class loader >> Wait when loading the superclass with parallelCapable class loader >> Wait when loading the superclass with user defined class loader >> >> In all cases, after A is parsed and calls resolve_super_or_fail to load B >> and loading B waits. Classes ClassInLoader, CP1 and CP2 provide >> constant pool references to A. >> >> In all cases, when B waits, A is replaced with bytes so A extends C. 
>> >> Two tests x 3 modes (both threads do the same): >> (CCE) First test A extends B, which throws CCE. >> -- All three modes: first constant pool reference throws CCE, second reference A extends C >> (B) Second test A extends B which doesn't throw CCE. >> -- All three modes: both references A extends B. >> >> The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 >> is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is >> currently loading. >> >> Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. >> >> Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add comment and removed atomic operation. The changes look good! Harold ------------- Marked as reviewed by hseigel (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2718 From coleenp at openjdk.java.net Wed Mar 10 16:15:38 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 10 Mar 2021 16:15:38 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v6] In-Reply-To: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: > This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. > > One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. > > I didn't squash the commits so it would be easier to see the different changes, but they all go together. > > The test description: > > Two Threads T1, T2 > > Three definitions of class A, defined by user defined class loader > Class A extends B extends A (CCE) > Class A extends B > Class A extends C > > Five modes: > Sequential > Concurrent loading with user defined class loader > Concurrent loading parallelCapable class loader > Wait when loading the superclass with parallelCapable class loader > Wait when loading the superclass with user defined class loader > > In all cases, after A is parsed and calls resolve_super_or_fail to load B > and loading B waits. Classes ClassInLoader, CP1 and CP2 provide > constant pool references to A. > > In all cases, when B waits, A is replaced with bytes so A extends C. > > Two tests x 3 modes (both threads do the same): > (CCE) First test A extends B, which throws CCE. 
> -- All three modes: first constant pool reference throws CCE, second reference A extends C > (B) Second test A extends B which doesn't throw CCE. > -- All three modes: both references A extends B. > > The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 > is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is > currently loading. > > Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. > > Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: - Merge branch 'master' into parallel-super-test - Add comment and removed atomic operation. - Remove complicated test in favor of Ioi's test, fix cause. - Some code review changes from iklam - Remove unnecessary tag comparison. - Vladimir Ivanov's compiler patch. - Save Throwable::cause also to the resolution error table. - Fix deoptimization and compiler to preserve and recognize constant pool class loading errors. 
- 8262377: Parallel class resolution loses constant pool error - Add test for parallel class loading ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2718/files - new: https://git.openjdk.java.net/jdk/pull/2718/files/6ad7ab6f..d6f182f8 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2718&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2718&range=04-05 Stats: 7616 lines in 299 files changed: 3897 ins; 2527 del; 1192 mod Patch: https://git.openjdk.java.net/jdk/pull/2718.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2718/head:pull/2718 PR: https://git.openjdk.java.net/jdk/pull/2718 From aph at openjdk.java.net Wed Mar 10 18:04:09 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 10 Mar 2021 18:04:09 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v6] In-Reply-To: References: <5qPXesyUGV7LYBT-QR4bSUsDZ8RSm-yWMORQorp5zR8=.f1a62fa2-b038-4073-ad06-e87b7100e2e4@github.com> Message-ID: On Wed, 10 Mar 2021 12:57:42 GMT, Yasumasa Suenaga wrote: >>> So I set `\0` to the tail of `buf` at L178. >> >> Which you then overwrite on Line 183. You really do need to move Line 174 to the end. >> I ran a test with the read fulling the whole buffer. > > Thanks for your comment! I pushed new commit. > > I issue `read()` with `buflen - 1` bytes, then I set `\0' to tail of `buf` after `read()`. And also `\0` will be converted to blank will happen when `read()` return 1 or greater. How about it? 
I think you need this to cope with errors, empty files, and so on:

  void VM_Version::get_compatible_board(char *buf, int buflen) {
    assert(buf != NULL, "invalid argument");
    assert(buflen >= 1, "invalid argument");
    *buf = '\0';
    int fd = open("/proc/device-tree/compatible", O_RDONLY);
    ssize_t read_sz = read(fd, buf, buflen - 1);
    if (read_sz >= 0) {
      buf[read_sz] = '\0';
      // Replace '\0' with ' '
      for (char *ch = buf; ch < buf + read_sz; ch++) {
        if (*ch == '\0') {
          *ch = ' ';
        }
      }
    } else {
      // read() returned an error
      buf[0] = '\0';
    }
    close(fd);
  }

------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From lfoltan at openjdk.java.net Wed Mar 10 18:06:09 2021 From: lfoltan at openjdk.java.net (Lois Foltan) Date: Wed, 10 Mar 2021 18:06:09 GMT Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class In-Reply-To: References: Message-ID: On Tue, 9 Mar 2021 16:04:19 GMT, Harold Seigel wrote: > Please review this small change for JDK-8213177. The change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold Looks good Harold! I just have a couple of minor comments. Lois src/hotspot/share/utilities/globalCounter.hpp line 68: > 66: public: > 67: > 68: extra additional line src/hotspot/share/utilities/globalCounter.hpp line 72: > 70: // critical_section_begin() to critical_section_end(). > 71: enum class CSContext : uintx {}; // [COUNTER_ACTIVE, COUNTER_INCREMENT) > 72: Please change the [ in the comment to a ( src/hotspot/share/utilities/globalCounter.hpp line 74: > 72: > 73: // Give these access to the private COUNTER_* constants. > 74: friend struct EnumeratorRange; By "these" do you mean the GlobalCounter class? ------------- Marked as reviewed by lfoltan (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/2895 From iignatyev at openjdk.java.net Wed Mar 10 18:07:09 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Wed, 10 Mar 2021 18:07:09 GMT Subject: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v4] In-Reply-To: <2a-RCfm845Xe9B92_9mx-qGY9uDbMBBA95WSkaS6X4g=.4493f0d9-f318-4200-8532-27182367cead@github.com> References: <2a-RCfm845Xe9B92_9mx-qGY9uDbMBBA95WSkaS6X4g=.4493f0d9-f318-4200-8532-27182367cead@github.com> Message-ID: <2BSuC2-KIMim4io0E8ZbhYRYr8-1OHxhQhSuBDYpGjs=.d0426116-ef82-405f-92f3-d6e642fac0f3@github.com> On Thu, 18 Feb 2021 10:04:04 GMT, Evgeny Nikitin wrote: >> Another approach to the JDK-8058176 and #2440 - never allowing the tests hit CodeCache limits. The most significant consumer is the MH graph builder (the MHTransformationGen), whose consumption is now controlled. List of changes: >> >> * Code cache size getters are added to WhiteBox; >> * MH sequences are now built with remaining Code cache size in mind (always let 2M clearance); >> * Dependencies on WhiteBox added for all affected tests; >> * The test cases in question un-problemlisted. >> >> Testing: the whole vmTestbase/vm/mlvm/ in win-lin-mac x86. > > Evgeny Nikitin has updated the pull request incrementally with one additional commit since the last revision: > > Add non-nmethods pool to the monitoring Changes requested by iignatyev (Reviewer). test/hotspot/jtreg/vmTestbase/vm/mlvm/meth/share/MHTransformationGen.java line 104: > 102: NON_NMETHODS_POOL.ifPresent(pool -> check.accept(pool, 1_000_000)); > 103: PROFILED_NMETHODS_POOL.ifPresent(pool -> check.accept(pool, 1_000_000)); > 104: NON_PROFILED_NMETHODS_POOL.ifPresent(pool -> check.accept(pool, 1_000_000)); could you please introduce a (or two) static final field for these constants and use them here? 
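One way the request above could look, as a rough sketch only: the class name, field names, and the isEffectivelyFull signature below are hypothetical, and only the 1_000_000 value comes from the quoted lines of MHTransformationGen.java. Two fields are shown in case the non-nmethods heap later needs a different clearance from the nmethod heaps.

```java
final class CodeCacheClearance {
    // Hypothetical constants replacing the repeated 1_000_000 literal.
    static final long NON_NMETHODS_CLEARANCE_BYTES = 1_000_000L;
    static final long NMETHODS_CLEARANCE_BYTES = 1_000_000L;

    // Returns true when a pool has less free space left than the clearance.
    static boolean isEffectivelyFull(long maxBytes, long usedBytes, long clearanceBytes) {
        return maxBytes - usedBytes < clearanceBytes;
    }

    public static void main(String[] args) {
        long max = 8L * 1024 * 1024; // the 8 MB per-heap limit mentioned in the thread
        long used = max - 500_000;   // illustrative usage figure, not measured data
        System.out.println(isEffectivelyFull(max, used, NON_NMETHODS_CLEARANCE_BYTES));
    }
}
```

Named constants keep the three pool checks in step and give a single place to adjust if the thresholds ever diverge.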
------------- PR: https://git.openjdk.java.net/jdk/pull/2523 From iignatyev at openjdk.java.net Wed Mar 10 18:37:14 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Wed, 10 Mar 2021 18:37:14 GMT Subject: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2] In-Reply-To: References: <2_Gpraz6NaY17HPfRDW-LD-sQrrPQ4dpIVP8vikpdXM=.d425cd8b-aea5-43be-865e-72229db81e6e@github.com> <2qEkvkaxAPHeFaDoCRmcPaehczQgwZNnZMxO2Z-Vc28=.d4845a88-7d71-4768-b952-5ff9c4ab8311@github.com> <0Kpq74YISM_ggEDcc5XKgpLywkxLVFe3qIzJ6nxpSOw=.42b05b7a-717e-4a21-b300-c621a4296d9f@github.com> Message-ID: On Wed, 10 Mar 2021 14:23:03 GMT, Evgeny Nikitin wrote: >> Raised the RFE: https://bugs.openjdk.java.net/browse/JDK-8263377 > > Here's the rationale behind the non-nmethods heap monitoring: the test i2c_c2i that fills up that heap. The test was started with 8MB per each heap, and consumed all the non-nmethods one. The consumption speed diminishes quickly after ~10 MB, but with the limits of 8 MB (stated in the case) we can hit the Error: > > Heaps statistics: [non-nm: 1.8 MiB : 22%][profiled: 781.3 KiB : 9%][non-profiled: 189.3 KiB : 2%] > ... > Heaps statistics: [non-nm: 8.0 MiB : 99%][profiled: 3.0 MiB : 37%][non-profiled: 8.0 MiB : 99%] > Raised the RFE: https://bugs.openjdk.java.net/browse/JDK-8263377 do you plan to change `MHTransformationGen:: isCodeCacheEffectivelyFull` after/if 8263377 got integrated? ------------- PR: https://git.openjdk.java.net/jdk/pull/2523 From hseigel at openjdk.java.net Wed Mar 10 19:02:30 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 10 Mar 2021 19:02:30 GMT Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class [v2] In-Reply-To: References: Message-ID: > Please review this small change for JDK-8213177. The change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS, and Mach5 tiers 3-5 on Linux x64. 
> > Thanks, Harold Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: Fix comments ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2895/files - new: https://git.openjdk.java.net/jdk/pull/2895/files/a7820525..32acbaa1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2895&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2895&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/2895.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2895/head:pull/2895 PR: https://git.openjdk.java.net/jdk/pull/2895 From hseigel at openjdk.java.net Wed Mar 10 19:02:31 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 10 Mar 2021 19:02:31 GMT Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class [v2] In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 17:50:08 GMT, Lois Foltan wrote: >> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix comments > > src/hotspot/share/utilities/globalCounter.hpp line 72: > >> 70: // critical_section_begin() to critical_section_end(). >> 71: enum class CSContext : uintx {}; // [COUNTER_ACTIVE, COUNTER_INCREMENT) >> 72: > > Please change the [ in the comment to a ( Fixed > src/hotspot/share/utilities/globalCounter.hpp line 74: > >> 72: >> 73: // Give these access to the private COUNTER_* constants. >> 74: friend struct EnumeratorRange; > > By "these" do you mean the GlobalCounter class? Changed from 'these' to 'this'. This gives access needed during expansion of the ENUMERATOR_VALUE_RANGE macro below. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2895 From kbarrett at openjdk.java.net Wed Mar 10 19:08:15 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 10 Mar 2021 19:08:15 GMT Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class [v2] In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 19:02:30 GMT, Harold Seigel wrote: >> Please review this small change for JDK-8213177. The change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS, and Mach5 tiers 3-5 on Linux x64. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > Fix comments Marked as reviewed by kbarrett (Reviewer). src/hotspot/share/utilities/globalCounter.hpp line 71: > 69: // The type of the critical section context passed from > 70: // critical_section_begin() to critical_section_end(). > 71: enum class CSContext : uintx {}; // [COUNTER_ACTIVE, COUNTER_INCREMENT) This enum declaration instead of the typedef is all that's needed. The range comment is incorrect. src/hotspot/share/utilities/globalCounter.hpp line 95: > 93: }; > 94: > 95: ENUMERATOR_VALUE_RANGE(GlobalCounter::CSContext, There is not, so far as I know, any use-case for iteration over CSContext values. And those two don't define the range of such. They are not declared as CSContext values, nor should they be. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2895 From hseigel at openjdk.java.net Wed Mar 10 19:08:17 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 10 Mar 2021 19:08:17 GMT Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class [v2] In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 17:48:26 GMT, Lois Foltan wrote: >> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix comments > > src/hotspot/share/utilities/globalCounter.hpp line 68: > >> 66: public: >> 67: >> 68: > > extra additional line deleted ------------- PR: https://git.openjdk.java.net/jdk/pull/2895 From kbarrett at openjdk.java.net Wed Mar 10 19:20:08 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 10 Mar 2021 19:20:08 GMT Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class [v2] In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 19:02:30 GMT, Harold Seigel wrote: >> Please review this small change for JDK-8213177. The change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS, and Mach5 tiers 3-5 on Linux x64. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > Fix comments Changes requested by kbarrett (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2895 From kbarrett at openjdk.java.net Wed Mar 10 19:20:10 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 10 Mar 2021 19:20:10 GMT Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class [v2] In-Reply-To: References: Message-ID: <0XsqkPVYgkyqXIbOhg7mykYg_p_OMGvPWfpzctz-hLY=.54ad892e-3277-47f3-9439-b2d881abd672@github.com> On Wed, 10 Mar 2021 18:57:12 GMT, Harold Seigel wrote: >> src/hotspot/share/utilities/globalCounter.hpp line 72: >> >>> 70: // critical_section_begin() to critical_section_end(). 
>>> 71: enum class CSContext : uintx {}; // [COUNTER_ACTIVE, COUNTER_INCREMENT) >>> 72: >> >> Please change the [ in the comment to a ( > > Fixed Those values are not a range! One is a tag bit value, the other is the increment exclusive of the tag bit. ------------- PR: https://git.openjdk.java.net/jdk/pull/2895 From hseigel at openjdk.java.net Wed Mar 10 19:20:11 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 10 Mar 2021 19:20:11 GMT Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class [v2] In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 19:02:12 GMT, Kim Barrett wrote: >> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix comments > > src/hotspot/share/utilities/globalCounter.hpp line 71: > >> 69: // The type of the critical section context passed from >> 70: // critical_section_begin() to critical_section_end(). >> 71: enum class CSContext : uintx {}; // [COUNTER_ACTIVE, COUNTER_INCREMENT) > > This enum declaration instead of the typedef is all that's needed. The range comment is incorrect. So, all this change needs is: "enum class CSContext: uintx {};" ? ------------- PR: https://git.openjdk.java.net/jdk/pull/2895 From kvn at openjdk.java.net Wed Mar 10 20:04:19 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 10 Mar 2021 20:04:19 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. Message-ID: Currently during deoptimization Vector's `payload` field values are restored during Vector reallocation: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/vectorSupport.cpp#L155 But for scalar-replaced values this is not correct because payload box object could be re-allocated after allocation of this vector. Scalar-replaced `payload` should be restored during regular fields reassignment (`Deoptimization::reassign_fields()` change). 
I renamed incorrect `eliminate_*` names for methods which restore/reallocate objects and locks. I added checks for EliminateAutoBox and EnableVectorAggressiveReboxing optimizations which can replace allocations with scalar objects independent from Escape Analysis. I added prints for unexpected StackValue values (stackValue.cpp) and for Vector debug info location type (location.cpp). ------------- Commit messages: - 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. Changes: https://git.openjdk.java.net/jdk/pull/2924/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2924&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263125 Stats: 38 lines in 3 files changed: 20 ins; 4 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/2924.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2924/head:pull/2924 PR: https://git.openjdk.java.net/jdk/pull/2924 From kvn at openjdk.java.net Wed Mar 10 20:08:24 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 10 Mar 2021 20:08:24 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v2] In-Reply-To: References: Message-ID: > Currently during deoptimization Vector's `payload` field values are restored during Vector reallocation: > https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/vectorSupport.cpp#L155 > > But for scalar-replaced values this is not correct because payload box object could be re-allocated after allocation of this vector. Scalar-replaced `payload` should be restored during regular fields reassignment (`Deoptimization::reassign_fields()` change). > > I renamed incorrect `eliminate_*` names for methods which restore/reallocate objects and locks. > > I added checks for EliminateAutoBox and EnableVectorAggressiveReboxing optimizations which can replace allocations with scalar objects independent from Escape Analysis. 
> > I added prints for unexpected StackValue values (stackValue.cpp) and for Vector debug info location type (location.cpp). Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Update Copyright year ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2924/files - new: https://git.openjdk.java.net/jdk/pull/2924/files/3fcd2738..2eb0b82d Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2924&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2924&range=00-01 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/2924.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2924/head:pull/2924 PR: https://git.openjdk.java.net/jdk/pull/2924 From enikitin at openjdk.java.net Wed Mar 10 20:43:06 2021 From: enikitin at openjdk.java.net (Evgeny Nikitin) Date: Wed, 10 Mar 2021 20:43:06 GMT Subject: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v2] In-Reply-To: References: <2_Gpraz6NaY17HPfRDW-LD-sQrrPQ4dpIVP8vikpdXM=.d425cd8b-aea5-43be-865e-72229db81e6e@github.com> <2qEkvkaxAPHeFaDoCRmcPaehczQgwZNnZMxO2Z-Vc28=.d4845a88-7d71-4768-b952-5ff9c4ab8311@github.com> <0Kpq74YISM_ggEDcc5XKgpLywkxLVFe3qIzJ6nxpSOw=.42b05b7a-717e-4a21-b300-c621a4296d9f@github.com> Message-ID: On Wed, 10 Mar 2021 18:34:03 GMT, Igor Ignatyev wrote: > > Raised the RFE: https://bugs.openjdk.java.net/browse/JDK-8263377 > > do you plan to change `MHTransformationGen:: isCodeCacheEffectivelyFull` after/if 8263377 got integrated? Raised the https://bugs.openjdk.java.net/browse/JDK-8263398 to not forget about it. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2523 From coleenp at openjdk.java.net Wed Mar 10 21:01:20 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 10 Mar 2021 21:01:20 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v6] In-Reply-To: References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Wed, 10 Mar 2021 20:55:00 GMT, Vladimir Kozlov wrote: >> Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: >> >> - Merge branch 'master' into parallel-super-test >> - Add comment and removed atomic operation. >> - Remove complicated test in favor of Ioi's test, fix cause. >> - Some code review changes from iklam >> - Remove unnecessary tag comparison. >> - Vladimir Ivanov's compiler patch. >> - Save Throwable::cause also to the resolution error table. >> - Fix deoptimization and compiler to preserve and recognize constant pool class loading errors. >> - 8262377: Parallel class resolution loses constant pool error >> - Add test for parallel class loading > > Compiler's CI changes seems fine. Thanks VladimirK and Harold! ------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From kvn at openjdk.java.net Wed Mar 10 21:01:16 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 10 Mar 2021 21:01:16 GMT Subject: RFR: 8262377: Parallel class resolution loses constant pool error [v6] In-Reply-To: References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Wed, 10 Mar 2021 16:15:38 GMT, Coleen Phillimore wrote: >> This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. 
The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. >> >> One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. >> >> I didn't squash the commits so it would be easier to see the different changes, but they all go together. >> >> The test description: >> >> Two Threads T1, T2 >> >> Three definitions of class A, defined by user defined class loader >> Class A extends B extends A (CCE) >> Class A extends B >> Class A extends C >> >> Five modes: >> Sequential >> Concurrent loading with user defined class loader >> Concurrent loading parallelCapable class loader >> Wait when loading the superclass with parallelCapable class loader >> Wait when loading the superclass with user defined class loader >> >> In all cases, after A is parsed and calls resolve_super_or_fail to load B >> and loading B waits. Classes ClassInLoader, CP1 and CP2 provide >> constant pool references to A. >> >> In all cases, when B waits, A is replaced with bytes so A extends C. >> >> Two tests x 3 modes (both threads do the same): >> (CCE) First test A extends B, which throws CCE. >> -- All three modes: first constant pool reference throws CCE, second reference A extends C >> (B) Second test A extends B which doesn't throw CCE. >> -- All three modes: both references A extends B. >> >> The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 >> is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is >> currently loading. >> >> Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. 
Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. >> >> Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. > > Coleen Phillimore has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 10 additional commits since the last revision: > > - Merge branch 'master' into parallel-super-test > - Add comment and removed atomic operation. > - Remove complicated test in favor of Ioi's test, fix cause. > - Some code review changes from iklam > - Remove unnecessary tag comparison. > - Vladimir Ivanov's compiler patch. > - Save Throwable::cause also to the resolution error table. > - Fix deoptimization and compiler to preserve and recognize constant pool class loading errors. > - 8262377: Parallel class resolution loses constant pool error > - Add test for parallel class loading Compiler's CI changes seems fine. ------------- Marked as reviewed by kvn (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2718 From coleenp at openjdk.java.net Wed Mar 10 21:01:23 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 10 Mar 2021 21:01:23 GMT Subject: Integrated: 8262377: Parallel class resolution loses constant pool error In-Reply-To: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> References: <82usnwhaTIFLLd1spezPPYn8Q4_GEqZ-q131wNWU7Lk=.23c97f36-6db5-4434-99a7-38a325bedbde@github.com> Message-ID: On Wed, 24 Feb 2021 23:51:58 GMT, Coleen Phillimore wrote: > This PR was to originally add some tests for parallel class loading situations that aren't covered in our internal parallel class loading tests. The tests found that class loading resolution errors weren't saving the error in the constant pool to implement JVMS 5.4.3. The compiler was also doing re-resolution rather than using the error saved at that constant pool index. > > One of the existing CDS tests verified that the Throwable.cause so this change also adds the cause and cause message to the resolution_errors() saved exceptions. > > I didn't squash the commits so it would be easier to see the different changes, but they all go together. > > The test description: > > Two Threads T1, T2 > > Three definitions of class A, defined by user defined class loader > Class A extends B extends A (CCE) > Class A extends B > Class A extends C > > Five modes: > Sequential > Concurrent loading with user defined class loader > Concurrent loading parallelCapable class loader > Wait when loading the superclass with parallelCapable class loader > Wait when loading the superclass with user defined class loader > > In all cases, after A is parsed and calls resolve_super_or_fail to load B > and loading B waits. Classes ClassInLoader, CP1 and CP2 provide > constant pool references to A. > > In all cases, when B waits, A is replaced with bytes so A extends C. 
> > Two tests x 3 modes (both threads do the same): > (CCE) First test A extends B, which throws CCE. > -- All three modes: first constant pool reference throws CCE, second reference A extends C > (B) Second test A extends B which doesn't throw CCE. > -- All three modes: both references A extends B. > > The code in SystemDictionary::handle_parallel_super_load treats the parallel case for thread T2 as if T1 > is not stalled and wins the race to load the class, by attempting to load the same superclass as T1 is > currently loading. > > Resolution for a constant pool reference should always fail with the same error even if there are concurrent threads doing that resolution. Forcing the second thread to resolve the super class of the first, even if the thread has a different set of bytes for the class A, is a way to do that, but this actually exposed that the second successful thread should check the result of the constant pool resolution for the first. So this exposed this bug. > > Tested with tier1, on all Oracle supported platforms and tier2-8 on linux-x64-debug and windows-x64-debug. This pull request has now been integrated. Changeset: 57f16f9f Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/57f16f9f Stats: 404 lines in 14 files changed: 297 ins; 45 del; 62 mod 8262377: Parallel class resolution loses constant pool error Co-authored-by: Vladimir Ivanov Co-authored-by: Coleen Phillimore Co-authored-by: Ioi Lam Reviewed-by: dholmes, iklam, hseigel, kvn ------------- PR: https://git.openjdk.java.net/jdk/pull/2718 From vlivanov at openjdk.java.net Wed Mar 10 22:10:08 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Wed, 10 Mar 2021 22:10:08 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. 
[v2]
In-Reply-To:
References:
Message-ID:

On Wed, 10 Mar 2021 20:08:24 GMT, Vladimir Kozlov wrote:

>> Currently during deoptimization Vector's `payload` field values are restored during Vector reallocation:
>> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/vectorSupport.cpp#L155
>>
>> But for scalar-replaced values this is not correct because payload box object could be re-allocated after allocation of this vector. Scalar-replaced `payload` should be restored during regular fields reassignment (`Deoptimization::reassign_fields()` change).
>>
>> I renamed incorrect `eliminate_*` names for methods which restore/reallocate objects and locks.
>>
>> I added checks for EliminateAutoBox and EnableVectorAggressiveReboxing optimizations which can replace allocations with scalar objects independent from Escape Analysis.
>>
>> I added prints for unexpected StackValue values (stackValue.cpp) and for Vector debug info location type (location.cpp).
>
> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>
>   Update Copyright year

I vividly remember I fixed the very same problem before in Panama.

More specifically, why doesn't `VectorSupport::allocate_vector_payload` handle the case?

    Handle VectorSupport::allocate_vector_payload(InstanceKlass* ik, frame* fr, RegisterMap* reg_map, ScopeValue* payload, TRAPS) {
      if (payload->is_location() &&
          payload->as_LocationValue()->location().type() == Location::vector) {
        // Vector value in an aligned adjacent tuple (1, 2, 4, 8, or 16 slots).
        Location location = payload->as_LocationValue()->location();
        return allocate_vector_payload_helper(ik, fr, reg_map, location, THREAD); // safepoint
      } else {
        // Scalar-replaced boxed vector representation.
        StackValue* value = StackValue::create_stack_value(fr, reg_map, payload);
        return value->get_obj();
      }
    }

-------------

PR: https://git.openjdk.java.net/jdk/pull/2924

From kbarrett at openjdk.java.net Wed Mar 10 22:29:12 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Wed, 10 Mar 2021 22:29:12 GMT
Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class [v2]
In-Reply-To:
References:
Message-ID: <5UX2dVh773EzFV7YMQjRvtmHEUPi1F9a1MbE3PQkLZA=.4205621b-8669-4a75-afae-58e5dd65c5eb@github.com>

On Wed, 10 Mar 2021 19:17:22 GMT, Harold Seigel wrote:

>> src/hotspot/share/utilities/globalCounter.hpp line 71:
>>
>>> 69: // The type of the critical section context passed from
>>> 70: // critical_section_begin() to critical_section_end().
>>> 71: enum class CSContext : uintx {}; // [COUNTER_ACTIVE, COUNTER_INCREMENT)
>>
>> This enum declaration instead of the typedef is all that's needed. The range comment is incorrect.
>
> So, all this change needs is: "enum class CSContext: uintx {};" ?

Yes. The RFE was filed as a followup to JDK-8212827, where I did everything needed to make that work, but couldn't take that final step because we weren't supporting C++11/14 yet.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2895

From kvn at openjdk.java.net Wed Mar 10 23:35:07 2021
From: kvn at openjdk.java.net (Vladimir Kozlov)
Date: Wed, 10 Mar 2021 23:35:07 GMT
Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 10 Mar 2021 22:07:40 GMT, Vladimir Ivanov wrote:

>> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Update Copyright year
>
> I vividly remember I fixed the very same problem before in Panama.
>
> More specifically, why doesn't `VectorSupport::allocate_vector_payload` handle the case?
>
> Handle VectorSupport::allocate_vector_payload(InstanceKlass* ik, frame* fr, RegisterMap* reg_map, ScopeValue* payload, TRAPS) {
>   if (payload->is_location() &&
>       payload->as_LocationValue()->location().type() == Location::vector) {
>     // Vector value in an aligned adjacent tuple (1, 2, 4, 8, or 16 slots).
>     Location location = payload->as_LocationValue()->location();
>     return allocate_vector_payload_helper(ik, fr, reg_map, location, THREAD); // safepoint
>   } else {
>     // Scalar-replaced boxed vector representation.
>     StackValue* value = StackValue::create_stack_value(fr, reg_map, payload);
>     return value->get_obj();
>   }
> }

Here is the order of scalarized objects in this case:

    # jdk.incubator.vector.ShortVector::rearrangeTemplate @ bci:14 (line 2117) L[0]=#ScObj0 L[1]=_ L[2]=#ScObj1 L[3]=#ScObj2 L[4]=_ L[5]=_ STK[0]=#Ptr0x00007f83a4e99850 STK[1]=#Ptr0x00007f83bd684760 STK[2]=#Ptr0x00007f83bca050f0 STK[3]=#4 STK[4]=#ScObj0 STK[5]=#ScObj1
    # ScObj0 jdk/incubator/vector/Short64Vector={ [payload :0]=rsp + #40 }
    # ScObj1 jdk/incubator/vector/Short64Vector$Short64Shuffle={ [payload :0]=#ScObj3 }
    # ScObj2 jdk/incubator/vector/Short64Vector$Short64Mask={ [payload :0]=rsp + #32 }
    # ScObj3 byte[4]={ [0]=rsp + #48 , [1]=rsp + #52 , [2]=rsp + #56 , [3]=RBP }

The `ScObj1 Short64Shuffle` vector will be processed before `ScObj3 byte[4]`, which it references. `VectorSupport::allocate_vector_payload()` is called only during `ScObj1 Short64Shuffle` reallocation, when `ScObj3 byte[4]` is not reallocated yet. As a result `value->get_obj()` returns NULL. And that is the bug.

I forgot to say that the bug is triggered only with the iterative EA I am working on. Without it the `byte[]` array is not scalarized and everything works.

I also thought about calling a modified `VectorSupport::allocate_vector_payload()` from `Deoptimization::reassign_fields()` to keep the checks in one place, but it would require more code changes than the suggested fix (which is mostly asserts and debug prints).
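As an illustration only, the ordering issue can be modelled outside HotSpot as a two-pass scheme: allocate every described object first, then wire up fields that refer to other described objects. This is plain standalone C++, not the code in the PR; `ObjDesc`, `Obj` and `rematerialize` are hypothetical names invented for this sketch.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical description of a scalar-replaced object: `payload_ref` is the
// index of another described object, or -1 if the payload is not an object.
struct ObjDesc { int payload_ref; };

// Hypothetical rematerialized object with a single reference field.
struct Obj { Obj* payload = nullptr; };

std::vector<std::unique_ptr<Obj>> rematerialize(const std::vector<ObjDesc>& descs) {
  std::vector<std::unique_ptr<Obj>> objs;
  objs.reserve(descs.size());
  // Pass 1: allocate all objects. No references are resolved here, so the
  // order of the descriptions does not matter.
  for (std::size_t i = 0; i < descs.size(); i++) {
    objs.push_back(std::make_unique<Obj>());
  }
  // Pass 2: reassign fields. Every target already exists, even when it is
  // described after the object that refers to it.
  for (std::size_t i = 0; i < descs.size(); i++) {
    int ref = descs[i].payload_ref;
    if (ref >= 0) {
      objs[i]->payload = objs[static_cast<std::size_t>(ref)].get();
    }
  }
  return objs;
}
```

With descriptions shaped like the dump above (entry 1 referring to entry 3), resolving the reference during pass 1 would find nothing at index 3 yet, while doing it in pass 2 finds the allocated object.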
------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From ysuenaga at openjdk.java.net Wed Mar 10 23:47:27 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 10 Mar 2021 23:47:27 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v8] In-Reply-To: References: Message-ID: <4WPv82jIATB_TZZdx6DNKo9aF0KfxzvKGSZlWV4N8u0=.7f5b74c4-09a5-4b30-ad07-78d914e245d4@github.com> > HotSpot generates CPU description when it is started. We can see it `jdk.CPUInformation` JFR event as below: > > $ jfr print --events jdk.CPUInformation raspi4.jfr > jdk.CPUInformation { > startTime = 22:57:13.521 > cpu = "AArch64" > description = "AArch64 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > `description` contains "AArch64", it is fixed value, we cannot guess the process was run on what machine (SoC). > > In Linux, we can use `compatible`property in device tree to guess the machine. The 'compatible' property contains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with sorted from most compatible to least. > > After this change, we can get the description as below: > > jdk.CPUInformation { > startTime = 00:32:49.767 > cpu = "AArch64" > description = "raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > In Linux on AMD64, we can see as following, then we can guess the CPU model from it. The same should do for AArch64. > > jdk.CPUInformation { > startTime = 17:28:03.907 > cpu = "AMD (null) (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 SSE4A AMD64" > description = "Brand: AMD Ryzen 3 3300X 4-Core Processor , Vendor: AuthenticAMD > Family: (0x17), Model: (0x71), Stepping: 0x0 > Ext. family: 0x8, Ext. model: 0x7, Type: 0x0, Signature: 0x00870f10 > Features: ebx: 0x01020800, ecx: 0xfed83203, edx: 0x178bfbff > Ext. 
features: eax: 0x00870f10, ebx: 0x20000000, ecx: 0x004003f3, edx: 0x2fd3fbff > Supports: On-Chip FPU, Virtual Mode Extensions, Debugging Extensions, Page Size Extensions, Time Stamp Counter, Model Specific Registers, Physical Address Extension, Machine Check Exceptions, CMPXCHG8B Instruction, On-Chip APIC, Fast System Call, Memory Type Range Registers, Page Global Enable, Machine Check Architecture, Conditional Mov Instruction, Page Attribute Table, 36-bit Page Size Extension, CLFLUSH Instruction, Intel Architecture MMX Technology, Fast Float Point Save and Restore, Streaming SIMD extensions, Streaming SIMD extensions 2, Hyper Threading, Streaming SIMD Extensions 3, PCLMULQDQ, Supplemental Streaming SIMD Extensions 3, Fused Multiply-Add, CMPXCHG16B, Streaming SIMD extensions 4.1, Streaming SIMD extensions 4.2, MOVBE, Popcount instruction, AESNI, XSAVE, OSXSAVE, AVX, F16C, LAHF/SAHF instruction support, Core multi-processor leagacy mode, Advanced Bit Manipulations: LZCNT, SSE4A: MOVNTSS, MOVNTSD, EXTRQ, INSERTQ, Misaligned SSE mode, SYSCALL/SYSRET, Execute Dis able Bit, RDTSCP, Intel 64 Architecture" > sockets = 1 > cores = 2 > hwThreads = 2 > } Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: refactoring ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2759/files - new: https://git.openjdk.java.net/jdk/pull/2759/files/124e2cb2..45710fdc Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2759&range=07 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2759&range=06-07 Stats: 4 lines in 1 file changed: 3 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/2759.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2759/head:pull/2759 PR: https://git.openjdk.java.net/jdk/pull/2759 From ysuenaga at openjdk.java.net Wed Mar 10 23:52:06 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 10 Mar 2021 23:52:06 GMT Subject: RFR: 8262491: 
AArch64: CPU description should contain compatible board list [v6]
In-Reply-To:
References: <5qPXesyUGV7LYBT-QR4bSUsDZ8RSm-yWMORQorp5zR8=.f1a62fa2-b038-4073-ad06-e87b7100e2e4@github.com>
Message-ID:

On Wed, 10 Mar 2021 18:00:10 GMT, Andrew Haley wrote:

>> Thanks for your comment! I pushed new commit.
>>
>> I issue `read()` with `buflen - 1` bytes, then I set `\0` to tail of `buf` after `read()`. And also `\0` will be converted to blank will happen when `read()` return 1 or greater. How about it?
>
> I think you need this to cope with errors, empty files, and so on:
>
> void VM_Version::get_compatible_board(char *buf, int buflen) {
>   assert(buf != NULL, "invalid argument");
>   assert(buflen >= 1, "invalid argument");
>   *buf = '\0';
>   int fd = open("/proc/device-tree/compatible", O_RDONLY);
>   ssize_t read_sz = read(fd, buf, buflen - 1);
>   if (read_sz >= 0) {
>     buf[read_sz] = '\0';
>     // Replace '\0' to ' '
>     for (char *ch = buf; ch < buf + read_sz; ch++) {
>       if (*ch == '\0') {
>         *ch = ' ';
>       }
>     }
>   } else { // read() returned an error
>     buf[0] = '\0';
>   }
> =>close(fd);
> }

I updated the PR, but it differs a bit from your suggestion. This version returns an empty string when `open()` or `read()` returns an error, and `close()` is called when `open()` has succeeded.

Did you mean that we can call `read()` and `close()` even if `open()` failed? According to the man page that seems to work (they just return EBADF), but it feels a bit strange to me. If it is OK, I will do so.
------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From github.com+168222+mgkwill at openjdk.java.net Thu Mar 11 03:40:18 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Thu, 11 Mar 2021 03:40:18 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v18] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. Marcus G K Williams has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 28 commits: - Update LargePage Setup per review comments Signed-off-by: Marcus G K Williams - Merge remote-tracking branch 'upstream/master' into update_hlp - Cast os::vm_page_size to size_t, fix build Signed-off-by: Marcus G K Williams - Merge branch 'master' into pull/1153 - kstefanj update Signed-off-by: Marcus G K Williams - Merge branch 'master' into update_hlp - Merge branch 'master' into update_hlp - Remove extraneous ' from warning Signed-off-by: Marcus G K Williams - Merge branch 'master' into update_hlp - Merge branch 'master' into update_hlp - ... 
and 18 more: https://git.openjdk.java.net/jdk/compare/57f16f9f...e0c54616 ------------- Changes: https://git.openjdk.java.net/jdk/pull/1153/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=17 Stats: 130 lines in 2 files changed: 60 ins; 46 del; 24 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From minqi at openjdk.java.net Thu Mar 11 04:03:29 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Thu, 11 Mar 2021 04:03:29 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v4] In-Reply-To: References: Message-ID: <9EU_DwWh3XcyBxJkxgPH1qzvbaa2hvWQYuccdRXWKj0=.c6816df0-6e73-45bc-9e52-caa70b0611fd@github.com> > Hi, Please review > > Added jcmd option for dumping CDS archive during application runtime. Before this change, user has to dump shared archive in two steps: first run application with > `java -XX:DumpLoadedClassList= .... ` > to collect shareable class names and saved in file `` , then > `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` > With this change, user can use jcmd to dump CDS without going through above steps. Also user can choose a moment during the app runtime to dump an archive. > The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. > New added jcmd option: > `jcmd VM.cds static_dump ` > or > `jcmd VM.cds dynamic_dump ` > To dump dynamic archive, requires start app with newly added flag `-XX:+RecordDynamicDumpInfo`, with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changed some object memory locations so for dumping dynamic archive, can only done once for a running app. For static dump, user can dump multiple times against same process. 
> The file name is optional, if the file name is not supplied, the file name will take format of `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. The `` is the application process ID. > > Tests: tier1,tier2,tier3,tier4 > > Thanks > Yumin Yumin Qi has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Fix filter more flags to exclude in static dump, add more test cases - Merge branch 'master' into jdk-8259070 - Fix white space in CDS.java - Add function CDS.dumpSharedArchive in java to dump shared archive - 8259070: Add jcmd option to dump CDS ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2737/files - new: https://git.openjdk.java.net/jdk/pull/2737/files/bfa71577..a9010f8f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2737&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2737&range=02-03 Stats: 13690 lines in 458 files changed: 7913 ins; 3760 del; 2017 mod Patch: https://git.openjdk.java.net/jdk/pull/2737.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2737/head:pull/2737 PR: https://git.openjdk.java.net/jdk/pull/2737 From minqi at openjdk.java.net Thu Mar 11 04:16:11 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Thu, 11 Mar 2021 04:16:11 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v3] In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 04:18:29 GMT, Ioi Lam wrote: >> Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix white space in CDS.java > > src/hotspot/share/services/diagnosticCommand.cpp line 1124: > >> 1122: } >> 1123: Symbol* cds_name = vmSymbols::jdk_internal_misc_CDS(); >> 1124: Klass* cds_klass = SystemDictionary::resolve_or_null(cds_name, THREAD); > > Should be `cds_klass = 
SystemDictionary::resolve_or_fail(cds_name, CHECK);` Changed to use resolve_or_fail. > src/java.base/share/classes/jdk/internal/misc/CDS.java line 213: > >> 211: testStr.contains("-XX:+DynamicDumpSharedSpaces") || >> 212: testStr.contains("-XX:+RecordDynamicDumpInfo"); >> 213: } > > The following flags should also be excluded: > > - -XX:-DumpSharedSpaces > - -Xshare: > - -XX:SharedClassListFile= > - -XX:SharedArchiveFile= > - -XX:ArchiveClassesAtExit= > - -XX:+UseSharedSpaces > - -XX:+RequireSharedSpaces > > We also need to have a few test cases when the LingeredApp is started with these flags. Added String[] for those flags to check. > src/java.base/share/classes/jdk/internal/misc/CDS.java line 262: > >> 260: String line; >> 261: InputStreamReader isr = new InputStreamReader(proc.getInputStream()); >> 262: BufferedReader rdr = new BufferedReader(isr); > > Also, I think the output should always be logged. Otherwise if an error happens, it's very difficult for the user to diagnose (and they won't know about the "CDS.Debug" property). Yes, done with separate thread. ------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From rehn at openjdk.java.net Thu Mar 11 07:36:09 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 11 Mar 2021 07:36:09 GMT Subject: RFR: 8262443: GenerateOopMap::do_interpretation can spin for a long time. [v2] In-Reply-To: <1uF96utjrXt_XB7Rn94ihM_s4Ok2p91h2hliK-wXSi0=.102680db-b285-4980-899c-24c6cae9154e@github.com> References: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> <1uF96utjrXt_XB7Rn94ihM_s4Ok2p91h2hliK-wXSi0=.102680db-b285-4980-899c-24c6cae9154e@github.com> Message-ID: On Mon, 8 Mar 2021 20:33:43 GMT, David Holmes wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: >> >> - Comment, local JavaThread variable >> - Merge branch 'master' into 8262443-gen-oop-map >> - Go to blocked when loop > > Still good. Thanks @dholmes-ora, @dcubed-ojdk and @coleenp! ------------- PR: https://git.openjdk.java.net/jdk/pull/2742 From rehn at openjdk.java.net Thu Mar 11 07:36:10 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Thu, 11 Mar 2021 07:36:10 GMT Subject: Integrated: 8262443: GenerateOopMap::do_interpretation can spin for a long time. In-Reply-To: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> References: <28Qx7h9l5ubaDYe_QeS8uRIv_XTctt7Kog8BLx-_0Y8=.37a9d5f0-f1ae-4c7d-b92e-64a62fd12ed6@github.com> Message-ID: On Fri, 26 Feb 2021 08:50:38 GMT, Robbin Ehn wrote: > With Safepoint/Handshake timeout enabled in rare cases this methods spins for a long time, blocking safepoints/handshakes, so timeout (with a long delay) is triggered. > > In some cases we are in native while executing this method and in some in vm. > That's why there is an check for state in vm. > > Tested with other changes in t-1-7 this specific case of timeout is no longer an issue. > This change-set passes T1 stand alone. This pull request has now been integrated. Changeset: 7988c1d9 Author: Robbin Ehn URL: https://git.openjdk.java.net/jdk/commit/7988c1d9 Stats: 14 lines in 2 files changed: 9 ins; 2 del; 3 mod 8262443: GenerateOopMap::do_interpretation can spin for a long time. 
Reviewed-by: coleenp, dholmes, dcubed

-------------

PR: https://git.openjdk.java.net/jdk/pull/2742

From shade at openjdk.java.net Thu Mar 11 09:49:19 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Thu, 11 Mar 2021 09:49:19 GMT
Subject: RFR: 8263430: Uninitialized Method* variables after JDK-8233913
Message-ID: <3sqyor231ixQhZEk6BuYEc60v6JGRwnV-lw2kWKg9i4=.339299e3-ac9c-486e-a1da-0352f4aa38f1@github.com>

The SonarCloud instance reports problems like:

    The left operand of '==' is a garbage value

    C2V_VMENTRY_NULL(jobject, getResolvedJavaMethod, (JNIEnv* env, jobject, jobject base, jlong offset))
      Method* method;
      ...
      if (method == NULL) { // <--- here
        JVMCI_THROW_MSG_NULL(IllegalArgumentException, err_msg("Unexpected type: %s", JVMCIENV->klass_name(base_object)));
      }

I believe this is caused by the refactoring in [JDK-8233913](https://bugs.openjdk.java.net/browse/JDK-8233913) that [replaced](https://hg.openjdk.java.net/jdk/jdk/rev/15936b142f86#l39.38) `methodHandle` with a naked `Method*`. A `methodHandle` is implicitly initialized to null, while a naked pointer variable is not. After reading the original changeset, I found two other places where the same thing happens.
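The difference can be shown with a small standalone sketch. Everything here is hypothetical and greatly simplified: `Method` and `MethodHandleSketch` only stand in for the HotSpot types, and `lookup` is an invented function, not one of the affected methods.

```cpp
// Hypothetical stand-in for HotSpot's Method; not the real class.
struct Method { int id; };

// Hypothetical, much-simplified stand-in for methodHandle: the default
// constructor always sets the wrapped pointer to null.
class MethodHandleSketch {
  Method* _m;
 public:
  MethodHandleSketch() : _m(nullptr) {}
  explicit MethodHandleSketch(Method* m) : _m(m) {}
  Method* get() const { return _m; }
};

// With a raw pointer the initializer has to be written out. Without
// "= nullptr" the miss path would return (and callers would compare)
// an indeterminate value.
Method* lookup(Method* table, int count, int wanted) {
  Method* method = nullptr;
  for (int i = 0; i < count; i++) {
    if (table[i].id == wanted) {
      method = &table[i];
      break;
    }
  }
  return method;
}
```

A default-constructed `MethodHandleSketch` can be tested against null safely, whereas a raw `Method*` local needs the explicit initializer, which is presumably the kind of one-line change the stats for this PR suggest.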
-------------

Commit messages:
 - 8263430: Uninitialized Method* variables after JDK-8233913

Changes: https://git.openjdk.java.net/jdk/pull/2936/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2936&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8263430
Stats: 3 lines in 3 files changed: 0 ins; 0 del; 3 mod
Patch: https://git.openjdk.java.net/jdk/pull/2936.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/2936/head:pull/2936

PR: https://git.openjdk.java.net/jdk/pull/2936

From aph at openjdk.java.net Thu Mar 11 10:13:08 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Thu, 11 Mar 2021 10:13:08 GMT
Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v6]
In-Reply-To:
References: <5qPXesyUGV7LYBT-QR4bSUsDZ8RSm-yWMORQorp5zR8=.f1a62fa2-b038-4073-ad06-e87b7100e2e4@github.com>
Message-ID:

On Wed, 10 Mar 2021 23:49:43 GMT, Yasumasa Suenaga wrote:

>> I think you need this to cope with errors, empty files, and so on:
>>
>> void VM_Version::get_compatible_board(char *buf, int buflen) {
>>   assert(buf != NULL, "invalid argument");
>>   assert(buflen >= 1, "invalid argument");
>>   *buf = '\0';
>>   int fd = open("/proc/device-tree/compatible", O_RDONLY);
>>   ssize_t read_sz = read(fd, buf, buflen - 1);
>>   if (read_sz >= 0) {
>>     buf[read_sz] = '\0';
>>     // Replace '\0' to ' '
>>     for (char *ch = buf; ch < buf + read_sz; ch++) {
>>       if (*ch == '\0') {
>>         *ch = ' ';
>>       }
>>     }
>>   } else { // read() returned an error
>>     buf[0] = '\0';
>>   }
>> =>close(fd);
>> }
>
> I updated the PR, but it differs a bit from your suggestion. This version returns an empty string when `open()` or `read()` returns an error, and `close()` is called when `open()` has succeeded.
>
> Did you mean that we can call `read()` and `close()` even if `open()` failed? According to the man page that seems to work (they just return EBADF), but it feels a bit strange to me. If it is OK, I will do so.

No, if `open()` fails we should return straight away, with an empty string. That needs an addition. We must, however, terminate the string with 0 at the correct point, at the end of the bytes read. Otherwise `strlen()` reads uninitialized memory. If the `read()` fails, we must return an empty string.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2759

From shade at openjdk.java.net Thu Mar 11 11:12:26 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Thu, 11 Mar 2021 11:12:26 GMT
Subject: RFR: 8263434: Dangling references after MethodComparator::methods_EMCP
Message-ID:

SonarCloud reports the following problem in MethodComparator::methods_EMCP:

    "Address of stack memory associated with local variable 's_new' is still referred to by the global variable '_s_new' upon returning to the caller. This will be a dangling reference"

Code inspection reveals that the assignment to static variables is only needed to pass them to helper methods. So, while this is not a detectable bug (yet), it is still cleaner not to expose stack variables in globals.

Additional testing:
 - [x] Linux x86_64 fastdebug `tier1`
 - [x] Linux x86_64 fastdebug, `vmTestbase_nsk_jvmti`

-------------

Commit messages:
 - 8263434: Dangling references after MethodComparator::methods_EMCP

Changes: https://git.openjdk.java.net/jdk/pull/2937/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2937&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8263434
Stats: 89 lines in 2 files changed: 2 ins; 8 del; 79 mod
Patch: https://git.openjdk.java.net/jdk/pull/2937.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/2937/head:pull/2937

PR: https://git.openjdk.java.net/jdk/pull/2937

From enikitin at openjdk.java.net Thu Mar 11 11:30:07 2021
From: enikitin at openjdk.java.net (Evgeny Nikitin)
Date: Thu, 11 Mar 2021 11:30:07 GMT
Subject: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v4]
In-Reply-To: <2BSuC2-KIMim4io0E8ZbhYRYr8-1OHxhQhSuBDYpGjs=.d0426116-ef82-405f-92f3-d6e642fac0f3@github.com>
References:
<2a-RCfm845Xe9B92_9mx-qGY9uDbMBBA95WSkaS6X4g=.4493f0d9-f318-4200-8532-27182367cead@github.com> <2BSuC2-KIMim4io0E8ZbhYRYr8-1OHxhQhSuBDYpGjs=.d0426116-ef82-405f-92f3-d6e642fac0f3@github.com> Message-ID: On Wed, 10 Mar 2021 18:03:44 GMT, Igor Ignatyev wrote: > could you please introduce a (or two) static final field for these constants and use them here? Fixed in the new version. ------------- PR: https://git.openjdk.java.net/jdk/pull/2523 From vlivanov at openjdk.java.net Thu Mar 11 11:45:05 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 11 Mar 2021 11:45:05 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v2] In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 23:32:14 GMT, Vladimir Kozlov wrote: >> I vividly remember I fixed the very same problem before in Panama. >> >> More specifically, why doesn't `VectorSupport::allocate_vector_payload` handle the case? >> >> Handle VectorSupport::allocate_vector_payload(InstanceKlass* ik, frame* fr, RegisterMap* reg_map, ScopeValue* payload, TRAPS) { >> if (payload->is_location() && >> payload->as_LocationValue()->location().type() == Location::vector) { >> // Vector value in an aligned adjacent tuple (1, 2, 4, 8, or 16 slots). >> Location location = payload->as_LocationValue()->location(); >> return allocate_vector_payload_helper(ik, fr, reg_map, location, THREAD); // safepoint >> } else { >> // Scalar-replaced boxed vector representation. 
>> StackValue* value = StackValue::create_stack_value(fr, reg_map, payload); >> return value->get_obj(); >> } >> } > > Here is the order of scalarized objects in this case: > # jdk.incubator.vector.ShortVector::rearrangeTemplate @ bci:14 (line 2117) L[0]=#ScObj0 L[1]=_ L[2]=#ScObj1 L[3]=#ScObj2 L[4]=_ L[5]=_ STK[0]=#Ptr0x00007f83a4e99850 STK[1]=#Ptr0x00007f83bd684760 STK[2]=#Ptr0x00007f83bca050f0 STK[3]=#4 STK[4]=#ScObj0 STK[5]=#ScObj1 > # ScObj0 jdk/incubator/vector/Short64Vector={ [payload :0]=rsp + #40 } > # ScObj1 jdk/incubator/vector/Short64Vector$Short64Shuffle={ [payload :0]=#ScObj3 } > # ScObj2 jdk/incubator/vector/Short64Vector$Short64Mask={ [payload :0]=rsp + #32 } > # ScObj3 byte[4]={ [0]=rsp + #48 , [1]=rsp + #52 , [2]=rsp + #56 , [3]=RBP } > `ScObj1 Short64Shuffle` vector will be processed before `ScObj3 byte[4]` which it references. > VectorSupport::allocate_vector_payload() is called only during `ScObj1 Short64Shuffle` reallocation, when `ScObj3 byte[4]` is not reallocated yet. As a result `value->get_obj()` returns NULL. And that is the bug. > > I forgot to say that the bug is triggered only with the iterative EA I am working on. Without it the `byte[]` array is not scalarized and everything works. > > I also thought about calling a modified `VectorSupport::allocate_vector_payload()` from `Deoptimization::reassign_fields()` to keep the checks in one place, but it would require more code changes than the suggested fix (which is mostly asserts and debug prints). Thanks for the clarifications, Vladimir. I agree that `VectorSupport::allocate_vector_payload` is not the right place to handle the problematic case. Some cleanup suggestions: now you can remove the `StackValue::create_stack_value()`-related code from `VectorSupport::allocate_vector_payload()`, replace the `ScopeValue* payload` argument with `Location location`, and turn the `location.type() == Location::vector` check into an assert.
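The ordering problem discussed in this exchange can be illustrated with a toy model. It is a sketch under stated assumptions, not HotSpot code: `ObjDesc`, `Obj`, `Heap`, `realloc_single_pass` and `realloc_two_phase` are hypothetical names, and each described object has at most one payload reference to another described object. It shows only why resolving a reference while objects are still being allocated can yield null, and why assigning fields after all allocations avoids that.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical toy model. Each entry describes one scalar-replaced object;
// payload_ref is the index of another described object that its payload
// field refers to, or -1 if there is none.
struct ObjDesc {
  int payload_ref;
};

struct Obj {
  const Obj* payload = nullptr;
};

using Heap = std::vector<std::unique_ptr<Obj>>;

// Single pass: allocate each object and resolve its payload immediately.
// A reference to an object that appears later in the list is still
// unallocated at that point, so it resolves to null.
inline Heap realloc_single_pass(const std::vector<ObjDesc>& descs) {
  Heap heap(descs.size());
  for (std::size_t i = 0; i < descs.size(); i++) {
    heap[i].reset(new Obj());
    int ref = descs[i].payload_ref;
    if (ref >= 0) {
      heap[i]->payload = heap[ref].get();  // null if ref is not allocated yet
    }
  }
  return heap;
}

// Two phases: allocate everything first, then assign the payload fields.
inline Heap realloc_two_phase(const std::vector<ObjDesc>& descs) {
  Heap heap(descs.size());
  for (std::size_t i = 0; i < descs.size(); i++) {
    heap[i].reset(new Obj());
  }
  for (std::size_t i = 0; i < descs.size(); i++) {
    int ref = descs[i].payload_ref;
    if (ref >= 0) {
      heap[i]->payload = heap[ref].get();
    }
  }
  return heap;
}
```

Feeding both functions a description list in which entry 1 refers to entry 3, as `ScObj1` refers to `ScObj3` in the listing above, leaves the payload of entry 1 null in the single-pass variant and set in the two-phase variant.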
------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From vlivanov at openjdk.java.net Thu Mar 11 11:52:06 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 11 Mar 2021 11:52:06 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 19:58:22 GMT, Vladimir Kozlov wrote: > I renamed incorrect eliminate_* names for methods which restore/reallocate objects and locks Fully agree that `eliminate_allocations`/`eliminate_locks` are misleading, but `restore_*` still look a bit confusing to me. What do you think about `rematerialize_objects`/`rematerialize_scalarized_objects`/`relock_objects`/`restore_eliminated_locks`? ------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From ysuenaga at openjdk.java.net Thu Mar 11 12:25:09 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Thu, 11 Mar 2021 12:25:09 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v6] In-Reply-To: References: <5qPXesyUGV7LYBT-QR4bSUsDZ8RSm-yWMORQorp5zR8=.f1a62fa2-b038-4073-ad06-e87b7100e2e4@github.com> Message-ID: On Thu, 11 Mar 2021 10:08:33 GMT, Andrew Haley wrote: >> I updated PR, but it is different from your suggestion a bit. This version would return empty string when `open()` or `read()` returns error, and also `close()` would be called when `open()` succeeded. >> >> Did you say we can call `read()` and `close()` even if `open()` failed? According to manpage, it seems to work (just return EBADF), but I feel strange a bit. If it is ok, I will do so. > > No, if `open()` fails we should return straight away, with an empty string. That needs an addition. > We must, however, terminate the string with 0 at the correct point, at the end of the bytes read. Otherwise ` strlen() `reads uninitialized memory. If the `read()` fails, we must return an empty string. 
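The requirements quoted above (return an empty string if `open()` or `read()` fails, and terminate the string at the end of the bytes read) can be sketched as follows. This is an illustration only, not the code under review: `read_nul_separated` and its `path` parameter are hypothetical, introduced so the function can be exercised with any file. It uses only POSIX `open()`, `read()` and `close()`, and it does not retry on `EINTR` or loop over short reads.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: reads up to buflen - 1 bytes from path into buf
// and always leaves buf NUL-terminated. On any open() or read() failure,
// or when nothing is read, the result is an empty string.
inline void read_nul_separated(const char* path, char* buf, int buflen) {
  assert(buf != nullptr && buflen >= 1);
  buf[0] = '\0';
  int fd = open(path, O_RDONLY);
  if (fd == -1) {
    return;  // open() failed, so there is nothing to close
  }
  ssize_t read_sz = read(fd, buf, buflen - 1);
  close(fd);  // single cleanup point once open() has succeeded
  if (read_sz <= 0) {
    buf[0] = '\0';
    return;
  }
  buf[read_sz] = '\0';  // terminate at the end of the bytes actually read
  // The file holds NUL-separated entries; turn the separators into spaces.
  for (char* ch = buf; ch < buf + read_sz; ch++) {
    if (*ch == '\0') {
      *ch = ' ';
    }
  }
}
```

Closing the descriptor immediately after the single `read()` keeps the early-return paths from having to remember the cleanup.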
I think my latest commit includes your suggestion: * returns empty string ( `\0` ) when `open()` failed. * returns empty string when `read()` failed or read nothing (returns `0` ) * add `\0` to `buf[read_sz]` just after `read()` call, and skip it at the loop - it can be assumed `\0` is set to tail of `buf` Or should I change as following for readability? int fd = open("/proc/device-tree/compatible", O_RDONLY); if (fd == -1) { *buf = '\0'; return; } ssize_t read_sz = read(fd, buf, buflen - 1); if (read_sz <= 0) { *buf = '\0'; return; } // Add '\0' to the tail buf[read_sz] = '\0'; // Replace '\0' to ' ' for (char *ch = buf; ch < buf + read_sz; ch++) { if (*ch == '\0') { *ch = ' '; } } close(fd); ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From vlivanov at openjdk.java.net Thu Mar 11 13:14:17 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Thu, 11 Mar 2021 13:14:17 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v2] In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 20:08:24 GMT, Vladimir Kozlov wrote: >> Currently during deoptimization Vector's `payload` field values are restored during Vector reallocation: >> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/vectorSupport.cpp#L155 >> >> But for scalar-replaced values this is not correct because payload box object could be re-allocated after allocation of this vector. Scalar-replaced `payload` should be restored during regular fields reassignment (`Deoptimization::reassign_fields()` change). >> >> I renamed incorrect `eliminate_*` names for methods which restore/reallocate objects and locks. >> >> I added checks for EliminateAutoBox and EnableVectorAggressiveReboxing optimizations which can replace allocations with scalar objects independent from Escape Analysis. 
>> >> I added prints for unexpected StackValue values (stackValue.cpp) and for Vector debug info location type (location.cpp). > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Update Copyright year Marked as reviewed by vlivanov (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From coleenp at openjdk.java.net Thu Mar 11 13:38:07 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 11 Mar 2021 13:38:07 GMT Subject: RFR: 8263434: Dangling references after MethodComparator::methods_EMCP In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 11:02:43 GMT, Aleksey Shipilev wrote: > SonarCloud reports the following problem in MethodComparator::methods_EMCP: > "Address of stack memory associated with local variable 's_new' is still referred to by the global variable '_s_new' upon returning to the caller. This will be a dangling reference" > > Code inspection reveals the assignment to static variables is only needed to pass them to helper methods. So, while this is not a detectable bug (yet), it is still cleaner not to expose stack variables in globals. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` > - [x] Linux x86_64 fastdebug, `vmTestbase_nsk_jvmti` This is much cleaner! Thank you, and thank you SonarCloud. src/hotspot/share/prims/methodComparator.cpp line 262: > 260: } > 261: > 262: bool MethodComparator::pool_constants_same(int cpi_old, int cpi_new, ConstantPool* old_cp, ConstantPool* new_cp) { Can these be const? src/hotspot/share/prims/methodComparator.cpp line 67: > 65: bool MethodComparator::args_same(Bytecodes::Code c_old, Bytecodes::Code c_new, > 66: BytecodeStream* s_old, BytecodeStream* s_new, > 67: ConstantPool* old_cp, ConstantPool* new_cp) { Can these be const pointers too? ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2937 From hseigel at openjdk.java.net Thu Mar 11 13:49:25 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Thu, 11 Mar 2021 13:49:25 GMT Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class [v3] In-Reply-To: References: Message-ID: <4wskw3O1myB1FaFdB5DLTSixbSMdvwLfsu3eCNuEqqA=.fed3e12a-4954-4fbb-81cc-506dc89adeab@github.com> > Please review this small change for JDK-8213177. The change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: remove incorrect changes ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2895/files - new: https://git.openjdk.java.net/jdk/pull/2895/files/32acbaa1..71a9c057 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2895&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2895&range=01-02 Stats: 8 lines in 1 file changed: 0 ins; 7 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/2895.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2895/head:pull/2895 PR: https://git.openjdk.java.net/jdk/pull/2895 From shade at openjdk.java.net Thu Mar 11 13:50:24 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 11 Mar 2021 13:50:24 GMT Subject: RFR: 8263434: Dangling references after MethodComparator::methods_EMCP [v2] In-Reply-To: References: Message-ID: > SonarCloud reports the following problem in MethodComparator::methods_EMCP: > "Address of stack memory associated with local variable 's_new' is still referred to by the global variable '_s_new' upon returning to the caller. This will be a dangling reference" > > Code inspection reveals the assignment to static variables is only needed to pass them to helper methods. 
So, while this is not a detectable bug (yet), it is still cleaner not to expose stack variables in globals. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` > - [x] Linux x86_64 fastdebug, `vmTestbase_nsk_jvmti` Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Sprinkling consts ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2937/files - new: https://git.openjdk.java.net/jdk/pull/2937/files/9a43dca1..5acc4807 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2937&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2937&range=00-01 Stats: 13 lines in 2 files changed: 3 ins; 0 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/2937.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2937/head:pull/2937 PR: https://git.openjdk.java.net/jdk/pull/2937 From shade at openjdk.java.net Thu Mar 11 13:50:25 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 11 Mar 2021 13:50:25 GMT Subject: RFR: 8263434: Dangling references after MethodComparator::methods_EMCP [v2] In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 13:33:39 GMT, Coleen Phillimore wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Sprinkling consts > > src/hotspot/share/prims/methodComparator.cpp line 67: > >> 65: bool MethodComparator::args_same(Bytecodes::Code c_old, Bytecodes::Code c_new, >> 66: BytecodeStream* s_old, BytecodeStream* s_new, >> 67: ConstantPool* old_cp, ConstantPool* new_cp) { > > Can these be const pointers too? Yes, they can. Sprinkled. > src/hotspot/share/prims/methodComparator.cpp line 262: > >> 260: } >> 261: >> 262: bool MethodComparator::pool_constants_same(int cpi_old, int cpi_new, ConstantPool* old_cp, ConstantPool* new_cp) { > > Can these be const? Yes, they can. Sprinkled. 
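The cleanup discussed in this thread, passing state to helpers as arguments instead of parking the addresses of locals in static variables, can be sketched in isolation. The names below (`Stream`, `values_same`, `compare_streams`) are hypothetical and the logic is a placeholder; the sketch shows only the shape of the change and the use of pointers to const for helpers that merely read.

```cpp
#include <cassert>

// Hypothetical stand-in for a bytecode stream; not the HotSpot class.
struct Stream {
  int next_value;
};

// Pattern to avoid (shown only as a comment, because the static would
// still point at a dead stack slot after compare() returns):
//   static Stream* _s_new;
//   bool compare() { Stream s_new{1}; _s_new = &s_new; return helper(); }

// Preferred shape: the helper receives what it needs as arguments and
// takes pointers to const because it only reads through them.
inline bool values_same(const Stream* s_old, const Stream* s_new) {
  return s_old->next_value == s_new->next_value;
}

inline bool compare_streams(int old_value, int new_value) {
  Stream s_old{old_value};  // locals stay local; no global refers to them
  Stream s_new{new_value};
  return values_same(&s_old, &s_new);
}
```

Because nothing outlives the call, a static analyzer has no escaping stack address to report, and the helper's signature documents exactly which state it depends on.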
------------- PR: https://git.openjdk.java.net/jdk/pull/2937 From hseigel at openjdk.java.net Thu Mar 11 13:57:19 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Thu, 11 Mar 2021 13:57:19 GMT Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class [v4] In-Reply-To: References: Message-ID: > Please review this small change for JDK-8213177. The change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: remove #include ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2895/files - new: https://git.openjdk.java.net/jdk/pull/2895/files/71a9c057..4e395bde Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2895&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2895&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/2895.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2895/head:pull/2895 PR: https://git.openjdk.java.net/jdk/pull/2895 From akozlov at openjdk.java.net Thu Mar 11 14:07:44 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 11 Mar 2021 14:07:44 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v23] In-Reply-To: References: Message-ID: <5oHOwsaj5Jpg7ukRTFk7MG--w0_bq2qSwH6FN0WOZNY=.e519c482-ba19-4e24-955d-07743ab92359@github.com> On Wed, 3 Mar 2021 15:57:13 GMT, Gerard Ziemski wrote: >> src/hotspot/os_cpu/bsd_aarch64/os_bsd_aarch64.cpp line 207: >> >>> 205: // Enable WXWrite: this function is called by the signal handler at arbitrary >>> 206: // point of execution. >>> 207: ThreadWXEnable wx(WXWrite, thread); >> >> Note that `thread` can be NULL here if the signal handler is running in a non-attached thread. 
If we then perform: >> `ThreadWXEnable(WXMode new_mode, Thread* thread = NULL) : >> _thread(thread ? thread : Thread::current()),` >> we call Thread::current() on a non-attached thread and that will assert/crash if we get NULL. Either avoid using WX when the thread is NULL, or else change to use Thread::current_or_null_safe() and ensure all uses have a NULL check. > >> Note that `thread` can be NULL here if the signal handler is running in a non-attached thread. If we then perform: >> `ThreadWXEnable(WXMode new_mode, Thread* thread = NULL) : _thread(thread ? thread : Thread::current()),` >> we call Thread::current() on a non-attached thread and that will assert/crash if we get NULL. Either avoid using WX when the thread is NULL, or else change to use Thread::current_or_null_safe() and ensure all uses have a NULL check. > > https://bugs.openjdk.java.net/browse/JDK-8262903 tracks this issue. Thanks for report and analysis! I fixed this in https://github.com/openjdk/jdk/pull/2200/commits/f6fb01b24f525e578692a1c6f2ff0a55b8233576 ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Thu Mar 11 14:07:43 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 11 Mar 2021 14:07:43 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v25] In-Reply-To: References: Message-ID: > Please review the implementation of JEP 391: macOS/AArch64 Port. > > It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. > > Major changes are in: > * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) > * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) > * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. 
The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. > * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: 8262903: [macos_aarch64] Thread::current() called on detached thread ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2200/files - new: https://git.openjdk.java.net/jdk/pull/2200/files/a72f6834..f6fb01b2 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2200&range=24 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2200&range=23-24 Stats: 13 lines in 5 files changed: 3 ins; 0 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/2200.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2200/head:pull/2200 PR: https://git.openjdk.java.net/jdk/pull/2200 From coleenp at openjdk.java.net Thu Mar 11 14:24:17 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 11 Mar 2021 14:24:17 GMT Subject: RFR: 8263430: Uninitialized Method* variables after JDK-8233913 In-Reply-To: <3sqyor231ixQhZEk6BuYEc60v6JGRwnV-lw2kWKg9i4=.339299e3-ac9c-486e-a1da-0352f4aa38f1@github.com> References: <3sqyor231ixQhZEk6BuYEc60v6JGRwnV-lw2kWKg9i4=.339299e3-ac9c-486e-a1da-0352f4aa38f1@github.com> Message-ID: On Thu, 11 Mar 2021 09:43:52 GMT, Aleksey Shipilev wrote: > SonarCloud instance reports problems like: > The left operand of '==' is a garbage value > > C2V_VMENTRY_NULL(jobject, getResolvedJavaMethod, (JNIEnv* env, jobject, jobject base, jlong offset)) > Method* method; > ... 
> if (method == NULL) { // <--- here > JVMCI_THROW_MSG_NULL(IllegalArgumentException, err_msg("Unexpected type: %s", JVMCIENV->klass_name(base_object))); > } > > I believe this is caused by refactoring in [JDK-8233913](https://bugs.openjdk.java.net/browse/JDK-8233913) that [replaced](https://hg.openjdk.java.net/jdk/jdk/rev/15936b142f86#l39.38) `methodHandle` with naked `Method*`. `methodHandle` is implicitly initialized to null, while naked variable is not. After reading the original changeset, I found two other places where the same thing happens. Looks good. src/hotspot/share/interpreter/linkResolver.cpp line 1153: > 1151: // superinterface.method, which explicitly does not check shadowing > 1152: Klass* resolved_klass = link_info.resolved_klass(); > 1153: Method* resolved_method = NULL; Why would it complain about this one? resolved_method is going to get some value in the next 4 lines. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2936 From shade at openjdk.java.net Thu Mar 11 14:24:18 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 11 Mar 2021 14:24:18 GMT Subject: RFR: 8263430: Uninitialized Method* variables after JDK-8233913 In-Reply-To: References: <3sqyor231ixQhZEk6BuYEc60v6JGRwnV-lw2kWKg9i4=.339299e3-ac9c-486e-a1da-0352f4aa38f1@github.com> Message-ID: On Thu, 11 Mar 2021 14:16:46 GMT, Coleen Phillimore wrote: >> SonarCloud instance reports problems like: >> The left operand of '==' is a garbage value >> >> C2V_VMENTRY_NULL(jobject, getResolvedJavaMethod, (JNIEnv* env, jobject, jobject base, jlong offset)) >> Method* method; >> ... 
>> if (method == NULL) { // <--- here >> JVMCI_THROW_MSG_NULL(IllegalArgumentException, err_msg("Unexpected type: %s", JVMCIENV->klass_name(base_object))); >> } >> >> I believe this is caused by refactoring in [JDK-8233913](https://bugs.openjdk.java.net/browse/JDK-8233913) that [replaced](https://hg.openjdk.java.net/jdk/jdk/rev/15936b142f86#l39.38) `methodHandle` with naked `Method*`. `methodHandle` is implicitly initialized to null, while naked variable is not. After reading the original changeset, I found two other places where the same thing happens. > > src/hotspot/share/interpreter/linkResolver.cpp line 1153: > >> 1151: // superinterface.method, which explicitly does not check shadowing >> 1152: Klass* resolved_klass = link_info.resolved_klass(); >> 1153: Method* resolved_method = NULL; > > Why would it complain about this one? resolved_method is going to get some value in the next 4 lines. Ah yes, it does not complain about this one. I just initialized all locals that were introduced in your original change. I can revert this hunk, if you want. ------------- PR: https://git.openjdk.java.net/jdk/pull/2936 From tschatzl at openjdk.java.net Thu Mar 11 14:53:13 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 11 Mar 2021 14:53:13 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Mon, 8 Mar 2021 17:29:27 GMT, Jaroslav Bachorik wrote: >> The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in worst case 'at last full GC') in form of a periodically emitted JFR event. >> >> ## Introducing new JFR event >> >> While there is already 'GC Heap Summary' JFR event it does not fit the requirements as it is closely tied to GC cycle so eg. 
for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events being present in the JFR recording at all. >> Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information about the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. >> >> ## Implementation >> >> The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value. >> >> The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct. >> >> ### Epsilon GC >> >> Trivial implementation - just return `used()` instead. >> >> ### Serial GC >> >> Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can use these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). >> >> ### Parallel GC >> >> For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). >> >> ### G1 GC >> >> Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects.
Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. >> >> ### Shenandoah >> >> In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. >> This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. >> >> ### ZGC >> >> `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. > > Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: > > Remove unused field I am leaving this as "request changes" for now as the question I had earlier about that after G1 Full gc the value of `_live_estimate` still seems unanswered and there does not seem to be code in this change for this. Is this intentional? (Not even setting the live bytes to `used()` which at that point would be a good estimate) There is another PR (#2760) that implements something like that although I haven't looked at it in detail. Otherwise looks okay. src/hotspot/share/gc/shared/space.inline.hpp line 140: > 138: size_t get_dead_space() { > 139: return (_max_deadspace_words - _allowed_deadspace_words) * HeapWordSize; > 140: } Hotspot does not use a "get_" prefix for getters. Also not sure why this needs to be private (and the friend class), I would prefer this instead of the friending. Retrieving the actual amount of dead space from a class that calculates it does not seem something that needs hiding. 
------------- Changes requested by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2579 From dcubed at openjdk.java.net Thu Mar 11 15:24:08 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Thu, 11 Mar 2021 15:24:08 GMT Subject: RFR: 8263434: Dangling references after MethodComparator::methods_EMCP [v2] In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 13:35:20 GMT, Coleen Phillimore wrote: >> Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: >> >> Sprinkling consts > > This is much cleaner! Thank you, and thank you SonarCloud. Please make sure that you get a review from someone on the Serviceability team. ------------- PR: https://git.openjdk.java.net/jdk/pull/2937 From tschatzl at openjdk.java.net Thu Mar 11 15:46:11 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 11 Mar 2021 15:46:11 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Thu, 11 Mar 2021 14:50:10 GMT, Thomas Schatzl wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove unused field > > I am leaving this as "request changes" for now as the question I had earlier about that after G1 Full gc the value of `_live_estimate` still seems unanswered and there does not seem to be code in this change for this. Is this intentional? (Not even setting the live bytes to `used()` which at that point would be a good estimate) > > There is another PR (#2760) that implements something like that although I haven't looked at it in detail. > > Otherwise looks okay. Started reviewing PR #2760, and it implements liveness calculation for G1 full gc. 
I also suggested [there](https://github.com/openjdk/jdk/pull/2760#discussion_r592449837) to extract this functionality out into an extra CR. Maybe you can work together. ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From coleenp at openjdk.java.net Thu Mar 11 16:27:07 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 11 Mar 2021 16:27:07 GMT Subject: RFR: 8263430: Uninitialized Method* variables after JDK-8233913 In-Reply-To: References: <3sqyor231ixQhZEk6BuYEc60v6JGRwnV-lw2kWKg9i4=.339299e3-ac9c-486e-a1da-0352f4aa38f1@github.com> Message-ID: On Thu, 11 Mar 2021 14:20:52 GMT, Aleksey Shipilev wrote: >> src/hotspot/share/interpreter/linkResolver.cpp line 1153: >> >>> 1151: // superinterface.method, which explicitly does not check shadowing >>> 1152: Klass* resolved_klass = link_info.resolved_klass(); >>> 1153: Method* resolved_method = NULL; >> >> Why would it complain about this one? resolved_method is going to get some value in the next 4 lines. > > Ah yes, it does not complain about this one. I just initialized all locals that were introduced in your original change. I can revert this hunk, if you want. I don't think it matters that much. Go ahead and leave it. I'm glad it didn't complain about that. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2936 From hseigel at openjdk.java.net Thu Mar 11 16:33:06 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Thu, 11 Mar 2021 16:33:06 GMT Subject: RFR: 8263430: Uninitialized Method* variables after JDK-8233913 In-Reply-To: <3sqyor231ixQhZEk6BuYEc60v6JGRwnV-lw2kWKg9i4=.339299e3-ac9c-486e-a1da-0352f4aa38f1@github.com> References: <3sqyor231ixQhZEk6BuYEc60v6JGRwnV-lw2kWKg9i4=.339299e3-ac9c-486e-a1da-0352f4aa38f1@github.com> Message-ID: <2yVh8F9T-xg0IbNHr0nwuW5IncGC9yjX2oWokeYAM2Y=.f14d1987-074b-4876-8dcc-f1961d25beb4@github.com> On Thu, 11 Mar 2021 09:43:52 GMT, Aleksey Shipilev wrote: > SonarCloud instance reports problems like: > The left operand of '==' is a garbage value > > C2V_VMENTRY_NULL(jobject, getResolvedJavaMethod, (JNIEnv* env, jobject, jobject base, jlong offset)) > Method* method; > ... > if (method == NULL) { // <--- here > JVMCI_THROW_MSG_NULL(IllegalArgumentException, err_msg("Unexpected type: %s", JVMCIENV->klass_name(base_object))); > } > > I believe this is caused by refactoring in [JDK-8233913](https://bugs.openjdk.java.net/browse/JDK-8233913) that [replaced](https://hg.openjdk.java.net/jdk/jdk/rev/15936b142f86#l39.38) `methodHandle` with naked `Method*`. `methodHandle` is implicitly initialized to null, while naked variable is not. After reading the original changeset, I found two other places where the same thing happens. Changes look good. Thanks, Harold ------------- Marked as reviewed by hseigel (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2936 From hseigel at openjdk.java.net Thu Mar 11 18:28:19 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Thu, 11 Mar 2021 18:28:19 GMT Subject: RFR: 8178348: left_n_bits(0) invokes undefined behavior Message-ID: <_i65eh8btlIcBdEthz_7eFnbAvOtvFQk9PX8PHw0DzE=.8af5cb8a-27dd-4f83-b9f3-8b3b3b3979e2@github.com> Please review this small change to remove the unused left_n_bits(n) macro. 
This change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS. Thanks, Harold ------------- Commit messages: - 8178348: left_n_bits(0) invokes undefined behavior Changes: https://git.openjdk.java.net/jdk/pull/2944/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2944&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8178348 Stats: 5 lines in 2 files changed: 0 ins; 4 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/2944.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2944/head:pull/2944 PR: https://git.openjdk.java.net/jdk/pull/2944 From rkennke at openjdk.java.net Thu Mar 11 18:43:24 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 11 Mar 2021 18:43:24 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable Message-ID: We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs. I believe this might be the root cause for JDK-8262852. 
Testing: - [x] New testcase failed without change, passes now - [ ] hotspot_gc_shenandoah - [ ] tier1 (+Shenandoah) ------------- Commit messages: - 8263427: Shenandoah: Trigger weak-LRB even when heap is stable Changes: https://git.openjdk.java.net/jdk/pull/2945/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2945&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263427 Stats: 188 lines in 12 files changed: 161 ins; 12 del; 15 mod Patch: https://git.openjdk.java.net/jdk/pull/2945.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2945/head:pull/2945 PR: https://git.openjdk.java.net/jdk/pull/2945 From rkennke at openjdk.java.net Thu Mar 11 18:56:10 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Thu, 11 Mar 2021 18:56:10 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 18:50:48 GMT, Aleksey Shipilev wrote: >> We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs. >> >> I believe this might be the root cause for JDK-8262852. >> >> Testing: >> - [x] New testcase failed without change, passes now >> - [ ] hotspot_gc_shenandoah >> - [ ] tier1 (+Shenandoah) > > src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 607: > >> 605: VerifyThreadGCState(const char* label, char expected) : _label(label), _expected(expected) {} >> 606: void do_thread(Thread* t) { >> 607: char actual = ShenandoahThreadLocalData::gc_state(t) & ~((char)ShenandoahHeap::WEAK_ROOTS); > > Why these adjustments are needed? I think it shows we don't play well with GC-state lifecycle here... 
The verifier would complain about the extra bit being set or maybe not set (e.g. concurrent cycles would set it, but degen cycles would not). We haven't verified conc-weakroots-in-progress before, so I figured it would be easiest to keep it that way and mask it out. ------------- PR: https://git.openjdk.java.net/jdk/pull/2945 From shade at openjdk.java.net Thu Mar 11 18:56:07 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 11 Mar 2021 18:56:07 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 18:38:26 GMT, Roman Kennke wrote: > We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs. > > I believe this might be the root cause for JDK-8262852. > > Testing: > - [x] New testcase failed without change, passes now > - [ ] hotspot_gc_shenandoah > - [ ] tier1 (+Shenandoah) I would need some time to understand this. src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 607: > 605: VerifyThreadGCState(const char* label, char expected) : _label(label), _expected(expected) {} > 606: void do_thread(Thread* t) { > 607: char actual = ShenandoahThreadLocalData::gc_state(t) & ~((char)ShenandoahHeap::WEAK_ROOTS); Why these adjustments are needed? I think it shows we don't play well with GC-state lifecycle here... 
------------- PR: https://git.openjdk.java.net/jdk/pull/2945 From zgu at openjdk.java.net Thu Mar 11 19:27:06 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 11 Mar 2021 19:27:06 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable In-Reply-To: References: Message-ID: <9J6biNSo_KaeRY-EzdOAb7XCeVU6xpw_o9Tk7Tu8peM=.70097cc8-6a8e-4408-83cc-7a8012807447@github.com> On Thu, 11 Mar 2021 18:52:54 GMT, Roman Kennke wrote: >> src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp line 607: >> >>> 605: VerifyThreadGCState(const char* label, char expected) : _label(label), _expected(expected) {} >>> 606: void do_thread(Thread* t) { >>> 607: char actual = ShenandoahThreadLocalData::gc_state(t) & ~((char)ShenandoahHeap::WEAK_ROOTS); >> >> Why these adjustments are needed? I think it shows we don't play well with GC-state lifecycle here... > > The verifier would complain about the extra bit being set or maybe not set (e.g. concurrent cycles would set it, but degen cycles would not). We haven't verified conc-weakroots-in-progress before, so I figured it would be easiest to keep it that way and mask it out. ShenandoahGCStateResetter used to disable the flag, so it makes sense. ------------- PR: https://git.openjdk.java.net/jdk/pull/2945 From zgu at openjdk.java.net Thu Mar 11 19:32:08 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 11 Mar 2021 19:32:08 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 18:38:26 GMT, Roman Kennke wrote: > We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. 
This can happen when we take the shortcut cycle and skip evac&update-refs.
> 
> I believe this might be the root cause for JDK-8262852.
> 
> The way out of it is to change conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress.
> 
> There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. But conc-weakroots-in-progress is turned-off concurrently. I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change.
> 
> This stuff makes the verifier unhappy, because it doesn't know about the new bit. And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?)
> 
> Testing:
> - [x] New testcase failed without change, passes now
> - [x] hotspot_gc_shenandoah
> - [ ] tier1 (+Shenandoah)

Changes requested by zgu (Reviewer).

test/hotspot/jtreg/gc/shenandoah/TestReferenceShortcutCycle.java line 86:

> 84: private static void testConcurrentCollection() throws Exception {
> 85: setup();
> 86: WB.concurrentGCAcquireControl();

Need to ensure a concurrent GC between setup and test, otherwise, referent can be live. I think calling WB.concurrentGCRunToIdle() before WB.concurrentGCRunTo(WB.AFTER_CONCURRENT_REFERENCE_PROCESSING_STARTED) will do the trick.
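
The gc-state change described in the quoted PR text can be sketched roughly as below. This is an illustration under assumptions, not the Shenandoah code: the bit values, the `UNSTABLE` name, and the three helper functions are invented here; only the idea of a `WEAK_ROOTS` bit in gc-state and of masking it out for the verifier comes from the thread.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical gc-state bits. The real layout lives in ShenandoahHeap;
// these values are assumptions for the sketch.
enum GCStateBits : std::uint8_t {
  UNSTABLE   = 1 << 0,  // stands in for "evacuation or update-refs in progress"
  WEAK_ROOTS = 1 << 1   // stands in for conc-weakroots-in-progress
};

// Strong LRB: only entered while the heap is unstable.
bool strong_lrb_needed(std::uint8_t gc_state) {
  return (gc_state & UNSTABLE) != 0;
}

// Weak LRB: also entered when the heap is stable but concurrent
// weak-root processing is still in progress.
bool weak_lrb_needed(std::uint8_t gc_state) {
  return (gc_state & (UNSTABLE | WEAK_ROOTS)) != 0;
}

// Verifier view: ignore the weak-roots bit, because it may or may not
// be set depending on the kind of cycle.
std::uint8_t state_for_verifier(std::uint8_t gc_state) {
  return static_cast<std::uint8_t>(gc_state & ~WEAK_ROOTS);
}
```

With a stable heap and weak-root processing under way (`gc_state == WEAK_ROOTS`), the strong check is skipped while the weak check is still taken, which is the case the shortcut cycle previously missed.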
-------------

PR: https://git.openjdk.java.net/jdk/pull/2945

From rkennke at openjdk.java.net Thu Mar 11 20:17:28 2021
From: rkennke at openjdk.java.net (Roman Kennke)
Date: Thu, 11 Mar 2021 20:17:28 GMT
Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v2]
In-Reply-To:
References:
Message-ID:

> We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs.
> 
> I believe this might be the root cause for JDK-8262852.
> 
> The way out of it is to change conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress.
> 
> There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. But conc-weakroots-in-progress is turned-off concurrently. I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change.
> 
> This stuff makes the verifier unhappy, because it doesn't know about the new bit. And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?)
> > Testing: > - [x] New testcase failed without change, passes now > - [x] hotspot_gc_shenandoah > - [ ] tier1 (+Shenandoah) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Ensure test does a complete GC cycle before verification ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2945/files - new: https://git.openjdk.java.net/jdk/pull/2945/files/eec5e186..94575b41 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2945&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2945&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/2945.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2945/head:pull/2945 PR: https://git.openjdk.java.net/jdk/pull/2945 From lfoltan at openjdk.java.net Thu Mar 11 20:20:08 2021 From: lfoltan at openjdk.java.net (Lois Foltan) Date: Thu, 11 Mar 2021 20:20:08 GMT Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class [v4] In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 13:57:19 GMT, Harold Seigel wrote: >> Please review this small change for JDK-8213177. The change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS, and Mach5 tiers 3-5 on Linux x64. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > remove #include Ok, this change looks good Harold! Sorry I missed the issues in my first review. Lois ------------- Marked as reviewed by lfoltan (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2895 From stefank at openjdk.java.net Thu Mar 11 20:31:16 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 11 Mar 2021 20:31:16 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v25] In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 14:07:43 GMT, Anton Kozlov wrote: >> Please review the implementation of JEP 391: macOS/AArch64 Port. >> >> It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. >> >> Major changes are in: >> * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) >> * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) >> * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. >> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > 8262903: [macos_aarch64] Thread::current() called on detached thread Marked as reviewed by stefank (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From stefank at openjdk.java.net Thu Mar 11 20:31:17 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 11 Mar 2021 20:31:17 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v12] In-Reply-To: References: <8MnBLkES1lapB4b01NDzU9nhOk8_9_V--NSCM5H_bg8=.7bdb576b-4acd-4e5b-be14-b363a2ef47bf@github.com> Message-ID: <3NYUmXmjyZFhGJwrHfEjSRX1VRaPjt5cCp9HRBxODbM=.4880b6d1-f6dd-45db-95f4-9064e9204d87@github.com> On Tue, 9 Mar 2021 17:55:12 GMT, Anton Kozlov wrote: >> src/hotspot/share/runtime/thread.cpp line 2515: >> >>> 2513: void JavaThread::check_special_condition_for_native_trans(JavaThread *thread) { >>> 2514: // Enable WXWrite: called directly from interpreter native wrapper. >>> 2515: MACOS_AARCH64_ONLY(ThreadWXEnable wx(WXWrite, thread)); >> >> FWIW, I personally think that adding these MACOS_AARCH64_ONLY usages at the call sites increase the line-noise in the affected functions. I think I would have preferred a version: >> ThreadWXEnable(WXMode new_mode, Thread* thread = NULL) { >> MACOS_AARCH64_ONLY(initialize(new_mode, thread);) {} >> void initialize(...); // Implementation in thread_bsd_aarch64.cpp (alt. inline.hpp) >> With that said, I'm fine with taking this discussion as a follow-up. > > The former version used no such macros. I like that now it's clear the W^X management is relevant to macos/aarch64 only. I see the point to move the pre-processor condition into the class implementation. But I think it will bring a bit of inconsistency, as the rest of W^X implementation is explicitly guarded by preprocessor conditionals. I've also tried to push macro conditionals as far as possible down to implementation, providing a kind of generalized W^X interface. That required a few artificial decisions, e.g. how would we call the mode we execute on the rest of platforms with write and execute allowed, WXWriteExec?.. I abandoned that attempt. 
I think we would use the same names, but I haven't given it more thought. I might take a look at this after this has been integrated. >> src/hotspot/share/runtime/thread.hpp line 848: >> >>> 846: void init_wx(); >>> 847: WXMode enable_wx(WXMode new_state); >>> 848: #endif // __APPLE__ && AARCH64 >> >> Now that this is only compiled into macOS/AArch64, could this be moved over to thread_bsd_aarch64.hpp? The same goes for the associated functions. > > The thread_bsd_aarch64.hpp describes a part of JavaThread, while this block belongs to Thread for now. Since W^X is an attribute of any operating system thread, I assumed Thread to be the right place for W^X bookkeeping. > > In most cases, we manage W^X state of JavaThread. But sometimes a GC thread needs the WXWrite state, or safefetch is called from non-JavaThread. Probably this can be dealt with (e.g. GCThread to always have the WXWrite state). But such change would be much more than a simple refactoring and it would require a significant amount of testing. Ideally, I would like to investigate this as a follow-up change, or at least after other fixes to this PR. Good point about Thread vs JavaThread. Yes, this can be looked into as follow-up cleanups. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From zgu at openjdk.java.net Thu Mar 11 21:19:08 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Thu, 11 Mar 2021 21:19:08 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v2] In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 20:17:28 GMT, Roman Kennke wrote: >> We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. 
This can happen when we take the shortcut cycle and skip evac&update-refs.
>> 
>> I believe this might be the root cause for JDK-8262852.
>> 
>> The way out of it is to change conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress.
>> 
>> There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. But conc-weakroots-in-progress is turned-off concurrently. I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change.
>> 
>> This stuff makes the verifier unhappy, because it doesn't know about the new bit. And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?)
>> 
>> Testing:
>> - [x] New testcase failed without change, passes now
>> - [x] hotspot_gc_shenandoah
>> - [ ] tier1 (+Shenandoah)
> 
> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision:
> 
> Ensure test does a complete GC cycle before verification

Looks good

-------------

Marked as reviewed by zgu (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2945

From enikitin at openjdk.java.net Thu Mar 11 21:22:25 2021
From: enikitin at openjdk.java.net (Evgeny Nikitin)
Date: Thu, 11 Mar 2021 21:22:25 GMT
Subject: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v5]
In-Reply-To:
References:
Message-ID:

> Another approach to the JDK-8058176 and #2440 - never allowing the tests hit CodeCache limits. The most significant consumer is the MH graph builder (the MHTransformationGen), whose consumption is now controlled.
List of changes: > > * Code cache size getters are added to WhiteBox; > * MH sequences are now built with remaining Code cache size in mind (always let 2M clearance); > * Dependencies on WhiteBox added for all affected tests; > * The test cases in question un-problemlisted. > > Testing: the whole vmTestbase/vm/mlvm/ in win-lin-mac x86. Evgeny Nikitin has updated the pull request incrementally with one additional commit since the last revision: Extract allowances into constants ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2523/files - new: https://git.openjdk.java.net/jdk/pull/2523/files/6a3c4785..76b02724 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2523&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2523&range=03-04 Stats: 6 lines in 1 file changed: 2 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/2523.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2523/head:pull/2523 PR: https://git.openjdk.java.net/jdk/pull/2523 From kbarrett at openjdk.java.net Thu Mar 11 21:55:08 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 11 Mar 2021 21:55:08 GMT Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class [v4] In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 13:57:19 GMT, Harold Seigel wrote: >> Please review this small change for JDK-8213177. The change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS, and Mach5 tiers 3-5 on Linux x64. >> >> Thanks, Harold > > Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: > > remove #include Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2895

From iklam at openjdk.java.net Thu Mar 11 22:06:18 2021
From: iklam at openjdk.java.net (Ioi Lam)
Date: Thu, 11 Mar 2021 22:06:18 GMT
Subject: RFR: 8263392: Allow current thread to be specified in ExceptionMark
Message-ID:

`ExceptionMark`, usually used via the `EXCEPTION_MARK` macro, is used to guarantee that an exception is not thrown within a block of code.

I made two changes to improve efficiency:

- Avoid calling `Thread::current()` if a thread object is already available.
- Avoid passing a reference to the `ExceptionMark` constructor. This helps C++ generate slightly better code.

This new variant of `ExceptionMark` is mainly intended for future clean up of `TRAPS/CHECK/THREAD` code, where an exception context is temporarily needed but we will guarantee that all exceptions will be handled.

I modified `SharedRuntime::monitor_exit_helper()` to use this pattern:

Old style:

    void a_func_that_never_throws() {
      EXCEPTION_MARK;
      a_func_that_could_throw(THREAD);
      if (HAS_PENDING_EXCEPTION) {
        // handle it
        CLEAR_PENDING_EXCEPTION;
      }
    }

New style:

    void a_func_that_never_throws(Thread* current) {
      // pass thread to avoid calling Thread::current()
      ExceptionMark em(current);
      Thread* THREAD = current; // For exception macros.
      a_func_that_could_throw(THREAD);
      if (HAS_PENDING_EXCEPTION) {
        // handle it
        CLEAR_PENDING_EXCEPTION;
      }
    }

-------------

Commit messages:
 - fixed build
 - 8263392: Allow current thread to be specified in ExceptionMark

Changes: https://git.openjdk.java.net/jdk/pull/2950/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2950&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8263392
  Stats: 35 lines in 8 files changed: 17 ins; 3 del; 15 mod
  Patch: https://git.openjdk.java.net/jdk/pull/2950.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/2950/head:pull/2950

PR: https://git.openjdk.java.net/jdk/pull/2950

From github.com+168222+mgkwill at openjdk.java.net Thu Mar 11 22:25:28 2021
From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams)
Date: Thu, 11 Mar 2021 22:25:28 GMT
Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v19]
In-Reply-To:
References:
Message-ID:

> When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using
> Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k).
> 
> This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size().
> 
> In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved.
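
The selection rule in the quoted description can be sketched as follows. This is a minimal illustration, not the os_linux.cpp implementation: `select_page_size` and its parameters are hypothetical names, a plain `std::vector` stands in for `_page_sizes`, and the sketch assumes a page size equal to the request also qualifies (the description says "smaller than").

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the largest entry of 'large_page_sizes'
// that does not exceed 'bytes'. If no large page size fits, the
// 'small_page_size' fallback (for example 4k) is returned instead.
std::size_t select_page_size(const std::vector<std::size_t>& large_page_sizes,
                             std::size_t bytes,
                             std::size_t small_page_size) {
  std::size_t best = small_page_size;
  for (std::size_t size : large_page_sizes) {
    // Keep the biggest candidate that still fits into the request.
    if (size <= bytes && size > best) {
      best = size;
    }
  }
  return best;
}
```

Under these assumptions, a 6M code heap with 2M and 1G pages available would get 2M pages rather than falling back to 4k, which is the use case the change targets.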
Marcus G K Williams has updated the pull request incrementally with two additional commits since the last revision: - Fix whitespace error Signed-off-by: Marcus G K Williams - Fix first set of TestTracePageSizes.java issues Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/e0c54616..90befbe1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=18 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=17-18 Stats: 66 lines in 2 files changed: 40 ins; 8 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Thu Mar 11 22:35:08 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Thu, 11 Mar 2021 22:35:08 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v16] In-Reply-To: References: <5Hmhp7S8616Kfbdsu5ObzFNy2uUFgJPCp0kvHr-U310=.3cabbe74-fe65-436b-973d-d6f3e64cd743@github.com> Message-ID: On Thu, 4 Mar 2021 07:21:37 GMT, Thomas Stuefe wrote: >>> > > What do you think? I think this would be a bit easier to read and understand, and we have that clear separation between scanning OS info and deciding what we do with it. >>> > >>> > >>> > I think what you propose Thomas looks good. One additional thing to keep in mind and think about here is how we should do the "sanity checking" when allowing multiple large page sizes. I think the best thing would be to sanity check all and if none succeeds disable `UseLargePages`. >>> >>> Oh, sure. I made this not explicit but implied this under "post processing and deciding". Presumably in the context of setup_large_page_type(). 
>>> >> Sure, got that, just wanted to highlight that we need to figure out how to handle the sanity check for multiple sizes. Should a size that fail the sanity check be removed from the `_page_sizes` member. Maybe `_page_sizes` should include all page sizes, and then we have an additional member for "useable large page sizes". As I said, not sure how to best handle this. >> >>> > > Still a small nit is that we let the user override the OS info with LargePageSizeInBytes. I rather would have a variable containing unmodified OS info, and a separate variable for whatever we make up. But thats just a small issue. >>> > >>> > >>> > I think we need to rethink exactly what `LargePageSizeInBytes` means when allowing multiple large page sizes. I've poked around in this area quite a bit lately and I'm not sure this flag is needed when we scan for available page sizes. But to allow it to go away we would have to change the APIs a bit to start passing down the page size we want to use for a certain mapping rather than using `os::large_page_size()` to get the page size. >>> >>> If we could do without this flag this would be fine for me too. But how would you let the user specify that the VM is to use a different default page size than is set on system level? >> >> I agree, it's not obvious how to make this work in a good way. But using the `os::page_size_for_region*` functions in the upper layers to request a page size could be one solution. But we probably need to have a way to change the "default" value for some cases. >> >> Another thing to think about/discuss is what should be done if a reservation-request within the VM for 4G with 1G pages fail, should we fall straight back to 4k page, should we try 2M page or possible fail hard to show something is probably wrong with the config. > >> Hi @kstefanj and @tstuefe . Trying to resolve your comments and working through your suggestions. 
I will be responding more over the next day or so as I try to implement and understand what you are proposing. Thanks again for your review and suggestions.
> 
> Well, thanks for your patience :)

Hello @tstuefe & @kstefanj. I've updated the change to implement your suggestions, all of them hopefully. I'd appreciate any further review and suggestions.

There is one issue with the current code set for which I wanted to get suggestions.

Invariably `os::page_size_for_region_aligned(bytes, 1)` in `os::Linux::reserve_memory_special_huge_tlbfs_mixed` returns with 4096, which of course breaks the assert `assert(large_page_size > (size_t)os::vm_page_size(), "large page size: %ld not larger than small page size: %ld", large_page_size, (size_t)os::vm_page_size());` - Line 4059 in os_linux.cpp.

Here is the ps.log:

> # A fatal error has been detected by the Java Runtime Environment:
> #
> # Internal Error (/home/mgkwill/src/git/jdk/src/hotspot/os/linux/os_linux.cpp:4059), pid=2789288, tid=2789289
> # assert(large_page_size > (size_t)os::vm_page_size()) failed: large page size: 4096 not larger than small page size: 4096
> #
> # JRE version: (17.0) (slowdebug build )
> # Java VM: OpenJDK 64-Bit Server VM (slowdebug 17-internal+0-adhoc.mgkwill.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, serial gc, linux-amd64)
> # Problematic frame:
> # V [libjvm.so+0xfb660a] os::Linux::reserve_memory_special_huge_tlbfs_mixed(unsigned long, unsigned long, char*, bool)+0x146
> #
> # Core dump will be written. Default location: Core dumps may be processed with "/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h" (or dumping to /home/mgkwill/src/git/jdk/build/linux-x86_64-server-slowdebug/test-support/jtreg_test_hotspot_jtreg_runtime_os_TestTracePageSizes_java/scratch/1/core.2789288)
> #
> # An error report file with more information is saved as:
> # /home/mgkwill/src/git/jdk/build/linux-x86_64-server-slowdebug/test-support/jtreg_test_hotspot_jtreg_runtime_os_TestTracePageSizes_java/scratch/1/hs_err_pid2789288.log
> #
> #
> Aborted (core dumped)
> ps-2789288.log:
> [0.002s][info][pagesize] Available page sizes: 4k, 2M, 1G
> [0.003s][info][pagesize] Available large page sizes: 2M, 1G
> [0.005s][info][pagesize] Large page size returned from os::page_size_for_region_aligned: 2097152, for bytes: 251658240
> [0.005s][info][pagesize] Page size returned from (size_t)os::vm_page_size(): 4096
> [0.005s][info][pagesize] Memory: 4k page, physical 131844416k(13735672k free), swap 0k(0k free)
> [0.005s][info][pagesize] 2048k default large page
> [0.005s][info][pagesize] Page Sizes: 4k, 2M, 1G
> [0.005s][info][pagesize] CodeHeap 'non-nmethods': min=4M max=6M base=0x00007f8edc600000 page_size=2M size=6M
> [0.006s][info][pagesize] CodeHeap 'profiled nmethods': min=4M max=116M base=0x00007f8edcc00000 page_size=2M size=116M
> [0.007s][info][pagesize] CodeHeap 'non-profiled nmethods': min=4M max=118M base=0x00007f8ee4000000 page_size=2M size=118M
> [0.023s][info][pagesize] Large page size returned from os::page_size_for_region_aligned: 2097152, for bytes: 16202596352
> [0.023s][info][pagesize] Page size returned from (size_t)os::vm_page_size(): 4096
> [0.023s][info][pagesize] Memory: 4k page, physical 131844416k(13735672k free), swap 0k(0k free)
> [0.023s][info][pagesize] 2048k default large page
> [0.023s][info][pagesize] Page Sizes: 4k, 2M, 1G
> [0.023s][info][pagesize] Heap: min=8M max=15452M base=0x000000043a400000 page_size=2M size=15452M
> [0.023s][info][pagesize] Card Table: min=31645697B max=31645697B base=0x00007f8ef8919000 page_size=4K size=30908K
> [0.825s][info][pagesize] Large page size returned from os::page_size_for_region_aligned: 4096, for bytes: 21098496
> [0.825s][info][pagesize] Page size returned from (size_t)os::vm_page_size(): 4096
> [0.825s][info][pagesize] Memory: 4k page, physical 131844416k(13732388k free), swap 0k(0k free)
> [0.825s][info][pagesize] 2048k default large page
> [0.825s][info][pagesize] Page Sizes: 4k, 2M, 1G

I'm not sure I understand why `Large page size returned from os::page_size_for_region_aligned: 4096, for bytes: 21098496`.

Any suggestions as to the issue and solution? Once I solve this I will remove the excessive logging.

Thanks,
Marcus

-------------

PR: https://git.openjdk.java.net/jdk/pull/1153

From dholmes at openjdk.java.net Fri Mar 12 04:51:07 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Fri, 12 Mar 2021 04:51:07 GMT
Subject: RFR: 8178348: left_n_bits(0) invokes undefined behavior
In-Reply-To: <_i65eh8btlIcBdEthz_7eFnbAvOtvFQk9PX8PHw0DzE=.8af5cb8a-27dd-4f83-b9f3-8b3b3b3979e2@github.com>
References: <_i65eh8btlIcBdEthz_7eFnbAvOtvFQk9PX8PHw0DzE=.8af5cb8a-27dd-4f83-b9f3-8b3b3b3979e2@github.com>
Message-ID: <27C6S9SwAIdpPhQp85EpQDXOZWVhtqUrUyiaHbmpXZo=.1ade55f1-effd-424c-9f89-de95012295e2@github.com>

On Thu, 11 Mar 2021 18:22:33 GMT, Harold Seigel wrote:

> Please review this small change to remove the unused left_n_bits(n) macro. This change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS.
> 
> Thanks, Harold

If it is unused and broken then removing seems fine.

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/2944 From ysuenaga at openjdk.java.net Fri Mar 12 05:19:06 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 12 Mar 2021 05:19:06 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v6] In-Reply-To: References: <5qPXesyUGV7LYBT-QR4bSUsDZ8RSm-yWMORQorp5zR8=.f1a62fa2-b038-4073-ad06-e87b7100e2e4@github.com> Message-ID: On Thu, 11 Mar 2021 12:22:10 GMT, Yasumasa Suenaga wrote: >> No, if `open()` fails we should return straight away, with an empty string. That needs an addition. >> We must, however, terminate the string with 0 at the correct point, at the end of the bytes read. Otherwise ` strlen() `reads uninitialized memory. If the `read()` fails, we must return an empty string. > > I think my latest commit includes your suggestion: > > * returns empty string ( `\0` ) when `open()` failed. > * returns empty string when `read()` failed or read nothing (returns `0` ) > * add `\0` to `buf[read_sz]` just after `read()` call, and skip it at the loop - it can be assumed `\0` is set to tail of `buf` > > Or should I change as following for readability? > > int fd = open("/proc/device-tree/compatible", O_RDONLY); > if (fd == -1) { > *buf = '\0'; > return; > } > > ssize_t read_sz = read(fd, buf, buflen - 1); > close(fd); > if (read_sz <= 0) { > *buf = '\0'; > return; > } > > // Add '\0' to the tail > buf[read_sz] = '\0'; > // Replace '\0' to ' ' > for (char *ch = buf; ch < buf + read_sz; ch++) { > if (*ch == '\0') { > *ch = ' '; > } > } I tested current PR manually with this Gist code, it works fine without buffer overrun. 
https://gist.github.com/YaSuenag/bdc73c3540335a096fa289ead36ca8b5 ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From mli at openjdk.java.net Fri Mar 12 05:25:15 2021 From: mli at openjdk.java.net (Hamlin Li) Date: Fri, 12 Mar 2021 05:25:15 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Thu, 11 Mar 2021 15:42:51 GMT, Thomas Schatzl wrote: > Started reviewing PR #2760, and it implements liveness calculation for G1 full gc. I also suggested [there](https://github.com/openjdk/jdk/pull/2760#discussion_r592449837) to extract this functionality out into an extra CR. Maybe you can work together. Hi Thomas, Jaroslav, How about we track jfr liveness event in g1 full gc in a separate bug, so this PR #2579 can go ahead without blocking? If you don't mind I can work on new bug later after liveness collection in g1 full gc is finished in another separate bug. ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From dholmes at openjdk.java.net Fri Mar 12 05:27:17 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 12 Mar 2021 05:27:17 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v25] In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 14:07:43 GMT, Anton Kozlov wrote: >> Please review the implementation of JEP 391: macOS/AArch64 Port. >> >> It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. >> >> Major changes are in: >> * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) >> * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) >> * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. 
It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. >> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > 8262903: [macos_aarch64] Thread::current() called on detached thread src/hotspot/share/runtime/safefetch.inline.hpp line 35: > 33: inline int SafeFetch32(int* adr, int errValue) { > 34: assert(StubRoutines::SafeFetch32_stub(), "stub not yet generated"); > 35: MACOS_AARCH64_ONLY(ThreadWXEnable wx(WXExec, Thread::current())); I think you may have to use `Thread::current_or_null_safe()` here in case this gets called from a signal handling context - see vmError.cpp testing for TestSafeFetchInErrorHandler. Same possibly for SafeFetchN. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From dholmes at openjdk.java.net Fri Mar 12 06:05:06 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 12 Mar 2021 06:05:06 GMT Subject: RFR: 8263392: Allow current thread to be specified in ExceptionMark In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 20:36:41 GMT, Ioi Lam wrote: > `ExceptionMark`, usually used via the `EXCEPTION_MARK` macro, is used to guarantee that an exception is not thrown within a block of code. I made two changes to improve efficiency: > > - Avoid calling `Thread::current()` if a thread object is already available. > - Avoid passing a reference to the `ExceptionMark` constructor. This helps C++ generate slightly better code.
> > This new variant of `ExceptionMark` is mainly intended for future clean up of `TRAPS/CHECK/THREAD` code, where an exception context is temporarily needed but we will guarantee that all exceptions will be handled. I modified `SharedRuntime::monitor_exit_helper()` to use this pattern: > > Old style: > > void a_func_that_never_throws() { > EXCEPTION_MARK; > a_func_that_could_throw(THREAD); > if (HAS_PENDING_EXCEPTION) { > // handle it > CLEAR_PENDING_EXCEPTION; > } > } > > New style: > > void a_func_that_never_throws(Thread* current) { // pass thread to avoid calling Thread::current() > ExceptionMark em(current); > Thread* THREAD = current; // For exception macros. > a_func_that_could_throw(THREAD); > if (HAS_PENDING_EXCEPTION) { > // handle it > CLEAR_PENDING_EXCEPTION; > } > } Looks good! Thanks for doing this cleanup. David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2950 From akozlov at openjdk.java.net Fri Mar 12 07:12:17 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 12 Mar 2021 07:12:17 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v25] In-Reply-To: References: Message-ID: On Fri, 12 Mar 2021 05:24:10 GMT, David Holmes wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8262903: [macos_aarch64] Thread::current() called on detached thread > > src/hotspot/share/runtime/safefetch.inline.hpp line 35: > >> 33: inline int SafeFetch32(int* adr, int errValue) { >> 34: assert(StubRoutines::SafeFetch32_stub(), "stub not yet generated"); >> 35: MACOS_AARCH64_ONLY(ThreadWXEnable wx(WXExec, Thread::current())); > > I think you may have to use `Thread::current_or_null_safe()` here in case this gets called from a signal handling context - see vmError.cpp testing for TestSafeFetchInErrorHandler. Same possibly for SafeFetchN. I'm not sure about expected behavior then. 
We may crash trying to execute the generated code, since we may have no WXExec. If we switch to WXExec, we would need to go back to a previous W^X state, but we don't know which one without the thread. BTW, the test passes, probably that's why it didn't get attention. All non-trivial actions in the current implementation of `pd_hotspot_signal_handler` (https://github.com/openjdk/jdk/pull/2200/files#diff-9dcc5bcf908e2f8eb00b2c2837d667687a7540936d8f538ee1ea14e31ad50b40R215-R324) assume non-NULL thread. So AFAICS, we should always have a thread when the SafeFetch is called. Probably a fix to the https://bugs.openjdk.java.net/browse/JDK-8262903 could just move ThreadWXEnable under the `if`. But now, after https://github.com/openjdk/jdk/pull/2200/commits/f6fb01b24f525e578692a1c6f2ff0a55b8233576, ThreadWXEnable allows optional W^X state change, like `MutexLocker` allows optional locking. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From shade at openjdk.java.net Fri Mar 12 07:44:07 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 12 Mar 2021 07:44:07 GMT Subject: Integrated: 8263430: Uninitialized Method* variables after JDK-8233913 In-Reply-To: <3sqyor231ixQhZEk6BuYEc60v6JGRwnV-lw2kWKg9i4=.339299e3-ac9c-486e-a1da-0352f4aa38f1@github.com> References: <3sqyor231ixQhZEk6BuYEc60v6JGRwnV-lw2kWKg9i4=.339299e3-ac9c-486e-a1da-0352f4aa38f1@github.com> Message-ID: On Thu, 11 Mar 2021 09:43:52 GMT, Aleksey Shipilev wrote: > SonarCloud instance reports problems like: > The left operand of '==' is a garbage value > > C2V_VMENTRY_NULL(jobject, getResolvedJavaMethod, (JNIEnv* env, jobject, jobject base, jlong offset)) > Method* method; > ...
> if (method == NULL) { // <--- here > JVMCI_THROW_MSG_NULL(IllegalArgumentException, err_msg("Unexpected type: %s", JVMCIENV->klass_name(base_object))); > } > > I believe this is caused by refactoring in [JDK-8233913](https://bugs.openjdk.java.net/browse/JDK-8233913) that [replaced](https://hg.openjdk.java.net/jdk/jdk/rev/15936b142f86#l39.38) `methodHandle` with naked `Method*`. `methodHandle` is implicitly initialized to null, while naked variable is not. After reading the original changeset, I found two other places where the same thing happens. This pull request has now been integrated. Changeset: e25ad730 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/e25ad730 Stats: 3 lines in 3 files changed: 0 ins; 0 del; 3 mod 8263430: Uninitialized Method* variables after JDK-8233913 Reviewed-by: coleenp, hseigel ------------- PR: https://git.openjdk.java.net/jdk/pull/2936 From david.holmes at oracle.com Fri Mar 12 07:58:56 2021 From: david.holmes at oracle.com (David Holmes) Date: Fri, 12 Mar 2021 17:58:56 +1000 Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v25] In-Reply-To: References: Message-ID: <59c46a13-91d9-bd60-70dd-ff8fde81c0c7@oracle.com> On 12/03/2021 5:12 pm, Anton Kozlov wrote: > On Fri, 12 Mar 2021 05:24:10 GMT, David Holmes wrote: > >>> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >>> >>> 8262903: [macos_aarch64] Thread::current() called on detached thread >> >> src/hotspot/share/runtime/safefetch.inline.hpp line 35: >> >>> 33: inline int SafeFetch32(int* adr, int errValue) { >>> 34: assert(StubRoutines::SafeFetch32_stub(), "stub not yet generated"); >>> 35: MACOS_AARCH64_ONLY(ThreadWXEnable wx(WXExec, Thread::current())); >> >> I think you may have to use `Thread::current_or_null_safe()` here in case this gets called from a signal handling context - see vmError.cpp testing for TestSafeFetchInErrorHandler. Same possibly for SafeFetchN. 
> > I'm not sure about expected behavior then. We may crash trying to execute the generated code, since we may have no WXExec. If we switch to WXExec, we would need to go back to a previous W^X state, but we don't know which one without the thread. The NULL check is only part of it. In a signal handling context Thread::current() is not safe to call, you have to use Thread::current_or_null_safe(). > BTW, the test passes, probably that's why it didn't get attention. All non-trivial actions in the current implementation of `pd_hotspot_signal_handler` (https://github.com/openjdk/jdk/pull/2200/files#diff-9dcc5bcf908e2f8eb00b2c2837d667687a7540936d8f538ee1ea14e31ad50b40R215-R324) assume non-NULL thread. So AFAICS, we should always have a thread when the SafeFetch is called. Okay but you still need to use Thread::current_or_null_safe(). Cheers, David > Probably a fix to the https://bugs.openjdk.java.net/browse/JDK-8262903 could just move ThreadWXEnable under the `if`. But now, after https://github.com/openjdk/jdk/pull/2200/commits/f6fb01b24f525e578692a1c6f2ff0a55b8233576, ThreadWXEnable allows optional W^X state change, like `MutexLocker` allows optional locking. > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/2200 > From kvn at openjdk.java.net Fri Mar 12 08:57:34 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 12 Mar 2021 08:57:34 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v3] In-Reply-To: References: Message-ID: > Currently during deoptimization Vector's `payload` field values are restored during Vector reallocation: > https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/vectorSupport.cpp#L155 > > But for scalar-replaced values this is not correct because payload box object could be re-allocated after allocation of this vector.
Scalar-replaced `payload` should be restored during regular fields reassignment (`Deoptimization::reassign_fields()` change). > > I renamed incorrect `eliminate_*` names for methods which restore/reallocate objects and locks. > > I added checks for EliminateAutoBox and EnableVectorAggressiveReboxing optimizations which can replace allocations with scalar objects independent from Escape Analysis. > > I added prints for unexpected StackValue values (stackValue.cpp) and for Vector debug info location type (location.cpp). Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: Update methods names and refactor VectorSupport::allocate_vector_payload(). ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2924/files - new: https://git.openjdk.java.net/jdk/pull/2924/files/2eb0b82d..c24793fd Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2924&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2924&range=01-02 Stats: 32 lines in 3 files changed: 13 ins; 2 del; 17 mod Patch: https://git.openjdk.java.net/jdk/pull/2924.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2924/head:pull/2924 PR: https://git.openjdk.java.net/jdk/pull/2924 From rrich at openjdk.java.net Fri Mar 12 08:57:34 2021 From: rrich at openjdk.java.net (Richard Reingruber) Date: Fri, 12 Mar 2021 08:57:34 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v2] In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 20:08:24 GMT, Vladimir Kozlov wrote: >> Currently during deoptimization Vector's `payload` field values are restored during Vector reallocation: >> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/vectorSupport.cpp#L155 >> >> But for scalar-replaced values this is not correct because payload box object could be re-allocated after allocation of this vector. 
Scalar-replaced `payload` should be restored during regular fields reassignment (`Deoptimization::reassign_fields()` change). >> >> I renamed incorrect `eliminate_*` names for methods which restore/reallocate objects and locks. >> >> I added checks for EliminateAutoBox and EnableVectorAggressiveReboxing optimizations which can replace allocations with scalar objects independent from Escape Analysis. >> >> I added prints for unexpected StackValue values (stackValue.cpp) and for Vector debug info location type (location.cpp). > > Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Update Copyright year Hello Vladimir, your change looks good to me. You might want to add the refactoring Vladimir suggested. May I ask why there is a special case to reallocate a vectors payload at all. In other words: why is the method VectorSupport::allocate_vector_payload_helper() needed? Is it for support of VectorMask? Thanks, Richard. ------------- Marked as reviewed by rrich (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2924 From kvn at openjdk.java.net Fri Mar 12 08:57:35 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 12 Mar 2021 08:57:35 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v2] In-Reply-To: References: Message-ID: On Fri, 12 Mar 2021 08:48:50 GMT, Richard Reingruber wrote: >> Vladimir Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> Update Copyright year > > Hello Vladimir, > > your change looks good to me. You might want to add the refactoring Vladimir > suggested. > > May I ask why there is a special case to reallocate a vectors payload at all. In > other words: why is the method VectorSupport::allocate_vector_payload_helper() > needed? Is it for support of VectorMask? > > Thanks, Richard. 
> > I renamed incorrect eliminate_* names for methods which restore/reallocate objects and locks > > Fully agree that `eliminate_allocations`/`eliminate_locks` are misleading, but `restore_*` still look a bit confusing to me. > What do you think about `rematerialize_objects`/`rematerialize_scalarized_objects`/`relock_objects`/`restore_eliminated_locks`? I selected `rematerialize_objects` and `restore_eliminated_locks`. ------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From kvn at openjdk.java.net Fri Mar 12 08:57:35 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 12 Mar 2021 08:57:35 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v2] In-Reply-To: References: Message-ID: On Fri, 12 Mar 2021 08:52:19 GMT, Vladimir Kozlov wrote: >> Hello Vladimir, >> >> your change looks good to me. You might want to add the refactoring Vladimir >> suggested. >> >> May I ask why there is a special case to reallocate a vectors payload at all. In >> other words: why is the method VectorSupport::allocate_vector_payload_helper() >> needed? Is it for support of VectorMask? >> >> Thanks, Richard. > >> > I renamed incorrect eliminate_* names for methods which restore/reallocate objects and locks >> >> Fully agree that `eliminate_allocations`/`eliminate_locks` are misleading, but `restore_*` still look a bit confusing to me. >> What do you think about `rematerialize_objects`/`rematerialize_scalarized_objects`/`relock_objects`/`restore_eliminated_locks`? > > I selected `rematerialize_objects` and `restore_eliminated_locks`. > Thanks for the clarifications, Vladimir. > > I agree that `VectorSupport::allocate_vector_payload` is not the right place to handle the problematic case. 
> > Some cleanup suggestions: now you can remove `StackValue::create_stack_value()`-related code from`VectorSupport::allocate_vector_payload()`, replace `ScopeValue* payload` argument with `Location location`, and turn > `location.type() == Location::vector` check into an assert. I removed `create_stack_value()` code from `VectorSupport::allocate_vector_payload()` and added asserts for all possible cases I found. ------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From vlivanov at openjdk.java.net Fri Mar 12 09:12:06 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 12 Mar 2021 09:12:06 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v2] In-Reply-To: References: Message-ID: On Fri, 12 Mar 2021 08:48:50 GMT, Richard Reingruber wrote: > May I ask why there is a special case to reallocate a vectors payload at all. In other words: why is the method VectorSupport::allocate_vector_payload_helper() needed? Is it for support of VectorMask? (I assume you are asking about `VectorSupport::allocate_vector_payload()`.) Special handling is needed because Vector/VectorMask is represented as a single value in scalarized form, but in the boxed form it's a pair of instances: Vector/VectorMask + primitive array (holding the payload). `reassign_fields` doesn't handle such case and that's why it is special-cased. ------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From kvn at openjdk.java.net Fri Mar 12 09:26:09 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 12 Mar 2021 09:26:09 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v2] In-Reply-To: References: Message-ID: On Fri, 12 Mar 2021 08:48:50 GMT, Richard Reingruber wrote: > Hello Vladimir, > > your change looks good to me. You might want to add the refactoring Vladimir > suggested. 
> > May I ask why there is a special case to reallocate a vectors payload at all. In > other words: why is the method VectorSupport::allocate_vector_payload_helper() > needed? Is it for support of VectorMask? > > Thanks, Richard. @iwanowww may know correct answer. As I understand it is because vectors need to re-allocate additional `payload` array to store vectors element: https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/vectorSupport.cpp#L129 `create_stack_value()` does not handle new allocations and arrays. It would require a lot changes. I think it could be done if during debug info generation we describe element's storage as Scalarized array. But it would not help VectorMask as you said. ------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From aph at openjdk.java.net Fri Mar 12 09:29:09 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 12 Mar 2021 09:29:09 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v8] In-Reply-To: <4WPv82jIATB_TZZdx6DNKo9aF0KfxzvKGSZlWV4N8u0=.7f5b74c4-09a5-4b30-ad07-78d914e245d4@github.com> References: <4WPv82jIATB_TZZdx6DNKo9aF0KfxzvKGSZlWV4N8u0=.7f5b74c4-09a5-4b30-ad07-78d914e245d4@github.com> Message-ID: On Wed, 10 Mar 2021 23:47:27 GMT, Yasumasa Suenaga wrote: >> HotSpot generates CPU description when it is started. We can see it `jdk.CPUInformation` JFR event as below: >> >> $ jfr print --events jdk.CPUInformation raspi4.jfr >> jdk.CPUInformation { >> startTime = 22:57:13.521 >> cpu = "AArch64" >> description = "AArch64 0x41:0x0:0xd08:3, simd, crc" >> sockets = 4 >> cores = 4 >> hwThreads = 4 >> } >> >> `description` contains "AArch64", it is fixed value, we cannot guess the process was run on what machine (SoC). >> >> In Linux, we can use `compatible`property in device tree to guess the machine. 
The 'compatible' property contains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with sorted from most compatible to least. >> >> After this change, we can get the description as below: >> >> jdk.CPUInformation { >> startTime = 00:32:49.767 >> cpu = "AArch64" >> description = "raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc" >> sockets = 4 >> cores = 4 >> hwThreads = 4 >> } >> >> In Linux on AMD64, we can see as following, then we can guess the CPU model from it. The same should do for AArch64. >> >> jdk.CPUInformation { >> startTime = 17:28:03.907 >> cpu = "AMD (null) (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 SSE4A AMD64" >> description = "Brand: AMD Ryzen 3 3300X 4-Core Processor , Vendor: AuthenticAMD >> Family: (0x17), Model: (0x71), Stepping: 0x0 >> Ext. family: 0x8, Ext. model: 0x7, Type: 0x0, Signature: 0x00870f10 >> Features: ebx: 0x01020800, ecx: 0xfed83203, edx: 0x178bfbff >> Ext. features: eax: 0x00870f10, ebx: 0x20000000, ecx: 0x004003f3, edx: 0x2fd3fbff >> Supports: On-Chip FPU, Virtual Mode Extensions, Debugging Extensions, Page Size Extensions, Time Stamp Counter, Model Specific Registers, Physical Address Extension, Machine Check Exceptions, CMPXCHG8B Instruction, On-Chip APIC, Fast System Call, Memory Type Range Registers, Page Global Enable, Machine Check Architecture, Conditional Mov Instruction, Page Attribute Table, 36-bit Page Size Extension, CLFLUSH Instruction, Intel Architecture MMX Technology, Fast Float Point Save and Restore, Streaming SIMD extensions, Streaming SIMD extensions 2, Hyper Threading, Streaming SIMD Extensions 3, PCLMULQDQ, Supplemental Streaming SIMD Extensions 3, Fused Multiply-Add, CMPXCHG16B, Streaming SIMD extensions 4.1, Streaming SIMD extensions 4.2, MOVBE, Popcount instruction, AESNI, XSAVE, OSXSAVE, AVX, F16C, LAHF/SAHF instruction support, Core multi-processor leagacy mode, Advanced Bit Manipulations: LZCNT, SSE4A: MOVNTSS, 
MOVNTSD, EXTRQ, INSERTQ, Misaligned SSE mode, SYSCALL/SYSRET, Execute Di sable Bit, RDTSCP, Intel 64 Architecture" >> sockets = 1 >> cores = 2 >> hwThreads = 2 >> } > > Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: > > refactoring Marked as reviewed by aph (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From aph at openjdk.java.net Fri Mar 12 09:29:11 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 12 Mar 2021 09:29:11 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v6] In-Reply-To: References: <5qPXesyUGV7LYBT-QR4bSUsDZ8RSm-yWMORQorp5zR8=.f1a62fa2-b038-4073-ad06-e87b7100e2e4@github.com> Message-ID: On Fri, 12 Mar 2021 05:16:36 GMT, Yasumasa Suenaga wrote: >> I think my latest commit includes your suggestion: >> >> * returns empty string ( `\0` ) when `open()` failed. >> * returns empty string when `read()` failed or read nothing (returns `0` ) >> * add `\0` to `buf[read_sz]` just after `read()` call, and skip it at the loop - it can be assumed `\0` is set to tail of `buf` >> >> Or should I change as following for readability? >> >> int fd = open("/proc/device-tree/compatible", O_RDONLY); >> if (fd == -1) { >> *buf = '\0'; >> return; >> } >> >> ssize_t read_sz = read(fd, buf, buflen - 1); >> close(fd); >> if (read_sz <= 0) { >> *buf = '\0'; >> return; >> } >> >> // Add '\0' to the tail >> buf[read_sz] = '\0'; >> // Replace '\0' to ' ' >> for (char *ch = buf; ch < buf + read_sz; ch++) { >> if (*ch == '\0') { >> *ch = ' '; >> } >> } > > I tested current PR manually with this Gist code, it works fine without buffer overrun. > https://gist.github.com/YaSuenag/bdc73c3540335a096fa289ead36ca8b5 That looks right. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From ysuenaga at openjdk.java.net Fri Mar 12 10:52:13 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 12 Mar 2021 10:52:13 GMT Subject: RFR: 8262491: AArch64: CPU description should contain compatible board list [v8] In-Reply-To: References: <4WPv82jIATB_TZZdx6DNKo9aF0KfxzvKGSZlWV4N8u0=.7f5b74c4-09a5-4b30-ad07-78d914e245d4@github.com> Message-ID: On Fri, 12 Mar 2021 09:26:32 GMT, Andrew Haley wrote: >> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: >> >> refactoring > > Marked as reviewed by aph (Reviewer). @theRealAph Thank you for the review! ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From shade at openjdk.java.net Fri Mar 12 10:58:07 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Fri, 12 Mar 2021 10:58:07 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v2] In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 20:17:28 GMT, Roman Kennke wrote: >> We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs. >> >> I believe this might be the root cause for JDK-8262852. >> >> The way out of it is change conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress. >> >> There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. But conc-weakroots-in-progress is turned-off concurrently. 
I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change. >> >> This stuff makes the verifier unhappy, because it doesn't know about the new bit. And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?) >> >> Testing: >> - [x] New testcase failed without change, passes now >> - [x] hotspot_gc_shenandoah >> - [ ] tier1 (+Shenandoah) > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Ensure test does a complete GC cycle before verification Still don't like the Verifier mess. Here is my attempt to un-confuse Verifier about this (applies on top of this PR): https://cr.openjdk.java.net/~shade/8263427/verifier-1.patch I think it catches a few threads having "weak roots enabled" state while starting the mark in `gc/stress/systemgc/TestSystemGCWithShenandoah.java`. Please see if this points to lifecycle bugs in current flag handling? ------------- Changes requested by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2945 From rrich at openjdk.java.net Fri Mar 12 10:59:07 2021 From: rrich at openjdk.java.net (Richard Reingruber) Date: Fri, 12 Mar 2021 10:59:07 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v2] In-Reply-To: References: Message-ID: <20oARtep00xABeE4RK_INvSURHwwiCmkZc4sjgXBuT8=.2bac7aab-8cd8-481c-b873-8f9521c9d7c4@github.com> On Fri, 12 Mar 2021 09:09:06 GMT, Vladimir Ivanov wrote: > > > > May I ask why there is a special case to reallocate a vectors payload at all. In > > other words: why is the method VectorSupport::allocate_vector_payload_helper() > > needed? Is it for support of VectorMask?
> > (I assume you are asking about `VectorSupport::allocate_vector_payload()`.) > > Special handling is needed because Vector/VectorMask is represented as a single value in scalarized form And can't the payload for deoptimization be represented as a scalarized array in that case too? Maybe not because of the special handling a VectorMask requires. I'm probably not familiar enough with the vector implementation, so sorry for bothering. IIUC @vnkozlov says this could be done but not for VectorMask. > I think it could be done if during debug info generation we describe element's storage as Scalarized array. But it would not help VectorMask as you said. ------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From rrich at openjdk.java.net Fri Mar 12 10:59:07 2021 From: rrich at openjdk.java.net (Richard Reingruber) Date: Fri, 12 Mar 2021 10:59:07 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v2] In-Reply-To: <20oARtep00xABeE4RK_INvSURHwwiCmkZc4sjgXBuT8=.2bac7aab-8cd8-481c-b873-8f9521c9d7c4@github.com> References: <20oARtep00xABeE4RK_INvSURHwwiCmkZc4sjgXBuT8=.2bac7aab-8cd8-481c-b873-8f9521c9d7c4@github.com> Message-ID: On Fri, 12 Mar 2021 10:54:44 GMT, Richard Reingruber wrote: > I think it could be done if during debug info generation we describe element's storage as Scalarized array. But it would not help VectorMask as you said. I see. Thanks for answering. Change still looks good to me. ------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From vlivanov at openjdk.java.net Fri Mar 12 11:33:06 2021 From: vlivanov at openjdk.java.net (Vladimir Ivanov) Date: Fri, 12 Mar 2021 11:33:06 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. 
[v2] In-Reply-To: <20oARtep00xABeE4RK_INvSURHwwiCmkZc4sjgXBuT8=.2bac7aab-8cd8-481c-b873-8f9521c9d7c4@github.com> References: <20oARtep00xABeE4RK_INvSURHwwiCmkZc4sjgXBuT8=.2bac7aab-8cd8-481c-b873-8f9521c9d7c4@github.com> Message-ID: On Fri, 12 Mar 2021 10:54:44 GMT, Richard Reingruber wrote: > And can't the payload for deoptimization be represented as a scalarized array in that case too? Maybe not because of the special handling a VectorMask requires. Vector is special in a very similar way: debug info contains only vector value location. Custom logic is needed to turn it into scalarized array representation. And then there's still a step required to allocate the corresponding typed vector box which wraps the payload. I'm not saying it is not possible to represent current on-heap shape solely in debug info, but it would require significant refactorings/enhancements of existing machinery. And would complicate possible changes of the on-heap vector/mask shape. It is not something "cut in stone" and can evolve over time. ------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From redestad at openjdk.java.net Fri Mar 12 12:03:07 2021 From: redestad at openjdk.java.net (Claes Redestad) Date: Fri, 12 Mar 2021 12:03:07 GMT Subject: RFR: 8261031: Move some ClassLoader name checking to native/VM [v3] In-Reply-To: References: <3fZUkpucpgdhZyyWDQ7Hp1oKthgl1ckXBq942wMNwxI=.7a3db0ca-03c0-44f9-ade9-3b4443cc6666@github.com> Message-ID: On Fri, 12 Feb 2021 22:48:51 GMT, Mandy Chung wrote: >> This more limited cleanup looks good. > > This patch changes `JVM_FindLoadedClass` interface to only accept a binary name. It used to accept both a binary name and internal form. Most, if not all, JVM entry points take the name of internal name. So this change makes this JVM entry point inconsistent with others. 
> > Looking closer each API that involves `fixClassName` or `verifyXXXClassName`, the JVM entry points called expects the internal form except `JVM_FindLoadedClass` (see details below). I think a better change is to change the native `JVM_FindLoadedClass` to accept the internal form only and have `findLoadedClass0` method to detect if the name contains '/' or '['. > > ClassLoader API does not allow loading of an array type whereas `Class::forName` allows to find an array type. Perhaps `verifyFixClassName` should be renamed like `binaryNameToInternalForm`. I think we don't need `fixClassname`? > > ClassLoader::defineClass > - `preDefineClass` checks the name and throws if it contains '/' or '[' > - no name check in `JVM_DefineClassWithSource` and `JVM_LookupDefineClass` > which expects the name is of internal form > > native Class::forName0 > - converts the binary name to internal form (i.e. replace '.' with '/') > - throw if the name contains '/' > - no explicit name check in `JVM_FindClassFromCaller` > > ClassLoader::loadClass > - calls native `findLoadedClass0` that calls `JVM_FindLoadedClass` which > accepts binary form and converts '.' to '/' but the current implementation > accepts both binary name and internal form > - calls `native findBootstrapClass` which converts '.' to '/' and pass the internal > form to `JVM_FindBootstrapClass`. > > It'd be helpful to document the internal APIs and JVM entry points clearly what it expects for example binary name vs internal form and where it does the validation e.g. Class::forName0 allows array type and native library methods do the name validation. Abandoning this. Sorry for wasting everyone's time. 
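For readers following the binary name versus internal form distinction in the review comment quoted above, here is a minimal sketch of the conversion it describes (replace '.' with '/', reject names that already contain '/' or '['). The class `ClassNameForms` and its method are hypothetical illustrations, not JDK or HotSpot APIs, and the rejection rule is an assumption modelled on the `preDefineClass` behaviour mentioned in the comment.

```java
// Hypothetical illustration only; these names do not exist in the JDK.
final class ClassNameForms {

    /**
     * Converts a binary name such as "java.lang.String" to the internal
     * form "java/lang/String". As an assumption for this sketch, names that
     * already contain '/' or '[' are rejected, mirroring the loader-side
     * check described in the review comment.
     */
    public static String binaryNameToInternalForm(String binaryName) {
        if (binaryName.indexOf('/') >= 0 || binaryName.indexOf('[') >= 0) {
            throw new IllegalArgumentException("not a binary class name: " + binaryName);
        }
        return binaryName.replace('.', '/');
    }

    public static void main(String[] args) {
        // A plain binary name is converted.
        System.out.println(binaryNameToInternalForm("java.lang.String"));

        // A name that is already in internal form is rejected.
        try {
            binaryNameToInternalForm("java/lang/String");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Doing the check and the conversion in one place is one way to make it explicit which form each entry point expects, which is what the comment asks to have documented.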
------------- PR: https://git.openjdk.java.net/jdk/pull/2378 From redestad at openjdk.java.net Fri Mar 12 12:03:09 2021 From: redestad at openjdk.java.net (Claes Redestad) Date: Fri, 12 Mar 2021 12:03:09 GMT Subject: Withdrawn: 8261031: Move some ClassLoader name checking to native/VM In-Reply-To: <3fZUkpucpgdhZyyWDQ7Hp1oKthgl1ckXBq942wMNwxI=.7a3db0ca-03c0-44f9-ade9-3b4443cc6666@github.com> References: <3fZUkpucpgdhZyyWDQ7Hp1oKthgl1ckXBq942wMNwxI=.7a3db0ca-03c0-44f9-ade9-3b4443cc6666@github.com> Message-ID: On Wed, 3 Feb 2021 12:21:30 GMT, Claes Redestad wrote: > This patch moves some sanity checking done in ClassLoader.java to the corresponding endpoints in native or VM code. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/2378 From akozlov at openjdk.java.net Fri Mar 12 12:17:39 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 12 Mar 2021 12:17:39 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v26] In-Reply-To: References: Message-ID: > Please review the implementation of JEP 391: macOS/AArch64 Port. > > It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. > > Major changes are in: > * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) > * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) > * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. 
The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. > * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) Anton Kozlov has updated the pull request incrementally with three additional commits since the last revision: - Add Azul copyright - Update Oracle copyright years - Use Thread::current_or_null_safe in SafeFetch ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2200/files - new: https://git.openjdk.java.net/jdk/pull/2200/files/f6fb01b2..29991c92 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2200&range=25 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2200&range=24-25 Stats: 83 lines in 53 files changed: 41 ins; 0 del; 42 mod Patch: https://git.openjdk.java.net/jdk/pull/2200.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2200/head:pull/2200 PR: https://git.openjdk.java.net/jdk/pull/2200 From lucy at openjdk.java.net Fri Mar 12 13:05:13 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Fri, 12 Mar 2021 13:05:13 GMT Subject: RFR: 8263260: [s390] Support latest hardware (z14 and z15) Message-ID: 8263260: [s390] Support latest hardware (z14 and z15) ------------- Commit messages: - 8263260: [s390] Support latest hardware (z14 and z15) Changes: https://git.openjdk.java.net/jdk/pull/2918/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2918&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263260 Stats: 133 lines in 2 files changed: 67 ins; 39 del; 27 mod Patch: https://git.openjdk.java.net/jdk/pull/2918.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2918/head:pull/2918 PR: https://git.openjdk.java.net/jdk/pull/2918 From lucy at openjdk.java.net Fri Mar 12 13:05:13 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Fri, 12 Mar 2021 13:05:13 GMT Subject: RFR: 8263260: [s390] Support latest hardware (z14 and z15) In-Reply-To: 
References: Message-ID: On Wed, 10 Mar 2021 17:35:20 GMT, Lutz Schmidt wrote: > 8263260: [s390] Support latest hardware (z14 and z15) Dear Community, I would appreciate reviews for this enhancement, adding basic detection and support for recent s390 hardware generations. Thank you! Lutz ------------- PR: https://git.openjdk.java.net/jdk/pull/2918 From hseigel at openjdk.java.net Fri Mar 12 13:17:07 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 12 Mar 2021 13:17:07 GMT Subject: RFR: 8213177: GlobalCounter::CSContext could be an enum class [v4] In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 21:52:06 GMT, Kim Barrett wrote: >> Harold Seigel has updated the pull request incrementally with one additional commit since the last revision: >> >> remove #include > > Looks good. Thanks Lois and Kim for reviewing this and pointing out what needed to be fixed. ------------- PR: https://git.openjdk.java.net/jdk/pull/2895 From hseigel at openjdk.java.net Fri Mar 12 13:17:09 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 12 Mar 2021 13:17:09 GMT Subject: Integrated: 8213177: GlobalCounter::CSContext could be an enum class In-Reply-To: References: Message-ID: On Tue, 9 Mar 2021 16:04:19 GMT, Harold Seigel wrote: > Please review this small change for JDK-8213177. The change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold This pull request has now been integrated. 
Changeset: 65421fae Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/65421fae Stats: 3 lines in 1 file changed: 1 ins; 0 del; 2 mod 8213177: GlobalCounter::CSContext could be an enum class Reviewed-by: lfoltan, kbarrett ------------- PR: https://git.openjdk.java.net/jdk/pull/2895 From vkempik at openjdk.java.net Fri Mar 12 13:38:17 2021 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Fri, 12 Mar 2021 13:38:17 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v9] In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 08:12:10 GMT, Anton Kozlov wrote: >> I wasn't able to replicate JDK-8020753 and JDK-8186286. So will remove these workaround >> @gerard-ziemski, 8020753 was originally your fix, do you know if it still needed on intel-mac ? > > The x86_bsd still carries the workaround https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/bsd_x86/os_bsd_x86.cpp#L745. It's worth having macos ports to be aligned by features. I would propose to have this workaround for now, and decide on it later for macos/x86 and macos/aarch64 at once. Sorry for chiming in so late. Hello Anton. Please keep JDK-8020753 away from this port, as JDK-8020753 was very intel-specific workaround and macos11.0 on aarch64 doesn't show any signs of original bug. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Fri Mar 12 14:10:18 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 12 Mar 2021 14:10:18 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v25] In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 20:28:46 GMT, Stefan Karlsson wrote: >> Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: >> >> 8262903: [macos_aarch64] Thread::current() called on detached thread > > Marked as reviewed by stefank (Reviewer). > you still need to use Thread::current_or_null_safe() [for SafeFetch]. 
OK :) I fixed this in https://github.com/openjdk/jdk/pull/2200/commits/fd4812e585e0528010a8863df50956a3b64a6744 @dcubed-ojdk, I also updated copyrights; this concludes fixes for the review https://github.com/openjdk/jdk/pull/2200#pullrequestreview-581784107. @theRealAph, could you elaborate on what needs to be done for https://github.com/openjdk/jdk/pull/2200#pullrequestreview-600597066. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From rkennke at openjdk.java.net Fri Mar 12 14:06:27 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 12 Mar 2021 14:06:27 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v3] In-Reply-To: References: Message-ID: > We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs. > > I believe this might be the root cause for JDK-8262852. > > The way out of it is to change conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress. > > There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. But conc-weakroots-in-progress is turned-off concurrently. I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change. > > This stuff makes the verifier unhappy, because it doesn't know about the new bit. 
And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?) > > Testing: > - [x] New testcase failed without change, passes now > - [x] hotspot_gc_shenandoah > - [ ] tier1 (+Shenandoah) Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Correct order of rendezvous, global- and local-flag updates; cleanup rendezvous - Verify correct weakroots-in-progress state (by Aleksey) ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2945/files - new: https://git.openjdk.java.net/jdk/pull/2945/files/94575b41..739c9b62 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2945&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2945&range=01-02 Stats: 55 lines in 5 files changed: 32 ins; 17 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/2945.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2945/head:pull/2945 PR: https://git.openjdk.java.net/jdk/pull/2945 From kvn at openjdk.java.net Fri Mar 12 17:03:08 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 12 Mar 2021 17:03:08 GMT Subject: Integrated: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 19:58:22 GMT, Vladimir Kozlov wrote: > Currently during deoptimization Vector's `payload` field values are restored during Vector reallocation: > https://github.com/openjdk/jdk/blob/master/src/hotspot/share/prims/vectorSupport.cpp#L155 > > But for scalar-replaced values this is not correct because payload box object could be re-allocated after allocation of this vector. Scalar-replaced `payload` should be restored during regular fields reassignment (`Deoptimization::reassign_fields()` change). 
> > I renamed incorrect `eliminate_*` names for methods which restore/reallocate objects and locks. > > I added checks for EliminateAutoBox and EnableVectorAggressiveReboxing optimizations which can replace allocations with scalar objects independent from Escape Analysis. > > I added prints for unexpected StackValue values (stackValue.cpp) and for Vector debug info location type (location.cpp). This pull request has now been integrated. Changeset: a6e056fd Author: Vladimir Kozlov URL: https://git.openjdk.java.net/jdk/commit/a6e056fd Stats: 61 lines in 5 files changed: 33 ins; 6 del; 22 mod 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. Reviewed-by: vlivanov, rrich ------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From aph at openjdk.java.net Fri Mar 12 16:35:23 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 12 Mar 2021 16:35:23 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v24] In-Reply-To: References: Message-ID: On Tue, 9 Mar 2021 16:12:36 GMT, Anton Kozlov wrote: >> Please review the implementation of JEP 391: macOS/AArch64 Port. >> >> It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. >> >> Major changes are in: >> * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) >> * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) >> * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. 
The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. >> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) > > Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 105 commits: > > - Merge commit 'refs/pull/11/head' of https://github.com/AntonKozlov/jdk into jdk-macos > - workaround JDK-8262895 by disabling subtest > - Fix typo > - Rename threadWXSetters.hpp -> threadWXSetters.inline.hpp > - JDK-8259937: bsd_aarch64 part > - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos > - Fix after JDK-8259539, partially revert preconditions > - JDK-8260471: bsd_aarch64 part > - JDK-8259539: bsd_aarch64 part > - JDK-8257828: bsd_aarch64 part > - ... and 95 more: https://git.openjdk.java.net/jdk/compare/a6e34b3d...a72f6834 > @theRealAph, could you elaborate on what needs to be done for [#2200 (review)](https://github.com/openjdk/jdk/pull/2200#pullrequestreview-600597066). I think that what you've got now is fine. ------------- Marked as reviewed by aph (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2200 From kvn at openjdk.java.net Fri Mar 12 17:03:07 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 12 Mar 2021 17:03:07 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v2] In-Reply-To: References: <20oARtep00xABeE4RK_INvSURHwwiCmkZc4sjgXBuT8=.2bac7aab-8cd8-481c-b873-8f9521c9d7c4@github.com> Message-ID: On Fri, 12 Mar 2021 11:30:18 GMT, Vladimir Ivanov wrote: > > And can't the payload for deoptimization be represented as a scalarized array in that case too? Maybe not because of the special handling a VectorMask requires. > > Vector is special in a very similar way: debug info contains only vector value location. 
Custom logic is needed to turn it into scalarized array representation. And then there's still a step required to allocate the corresponding typed vector box which wraps the payload. > > I'm not saying it is not possible to represent current on-heap shape solely in debug info, but it would require significant refactorings/enhancements of existing machinery. And would complicate possible changes of the on-heap vector/mask shape. It is not something "cut in stone" and can evolve over time. Yes, I said it too. It can be done but code would be much larger and more complicated. And, as Vladimir I. correctly pointed, VectorAPI is new code which can be evolved. To have specialized code for it makes it easy to do experiments. Thank you @iwanowww and @reinrich for reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From coleenp at openjdk.java.net Fri Mar 12 17:44:06 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 12 Mar 2021 17:44:06 GMT Subject: RFR: 8178348: left_n_bits(0) invokes undefined behavior In-Reply-To: <_i65eh8btlIcBdEthz_7eFnbAvOtvFQk9PX8PHw0DzE=.8af5cb8a-27dd-4f83-b9f3-8b3b3b3979e2@github.com> References: <_i65eh8btlIcBdEthz_7eFnbAvOtvFQk9PX8PHw0DzE=.8af5cb8a-27dd-4f83-b9f3-8b3b3b3979e2@github.com> Message-ID: On Thu, 11 Mar 2021 18:22:33 GMT, Harold Seigel wrote: > Please review this small change to remove the unused left_n_bits(n) macro. This change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS. > > Thanks, Harold LGTM! ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2944 From aph at openjdk.java.net Fri Mar 12 16:35:24 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 12 Mar 2021 16:35:24 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v21] In-Reply-To: References: Message-ID: On Tue, 9 Mar 2021 18:01:11 GMT, Anton Kozlov wrote: >> src/hotspot/cpu/aarch64/globalDefinitions_aarch64.hpp line 62: >> >>> 60: >>> 61: #if defined(__APPLE__) || defined(_WIN64) >>> 62: #define R18_RESERVED >> >> #define R18_RESERVED true``` > > We always check for `R18_RESERVED` with `#if(n)def`, is there any reason to define the value for the macro? Robustness, clarity, maintainability, convention. Why not? ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Fri Mar 12 16:44:38 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Fri, 12 Mar 2021 16:44:38 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v27] In-Reply-To: References: Message-ID: > Please review the implementation of JEP 391: macOS/AArch64 Port. > > It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. > > Major changes are in: > * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) > * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) > * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. 
This approach of managing W^X mode turned out to be simple and efficient enough. > * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: Fix most of issues in java/foreign/ tests Failures related to va_args are tracked in JDK-8263512. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2200/files - new: https://git.openjdk.java.net/jdk/pull/2200/files/29991c92..5bfe0f08 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2200&range=26 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2200&range=25-26 Stats: 5 lines in 2 files changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/2200.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2200/head:pull/2200 PR: https://git.openjdk.java.net/jdk/pull/2200 From hseigel at openjdk.java.net Fri Mar 12 19:04:06 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 12 Mar 2021 19:04:06 GMT Subject: RFR: 8178348: left_n_bits(0) invokes undefined behavior In-Reply-To: References: <_i65eh8btlIcBdEthz_7eFnbAvOtvFQk9PX8PHw0DzE=.8af5cb8a-27dd-4f83-b9f3-8b3b3b3979e2@github.com> Message-ID: On Fri, 12 Mar 2021 17:41:08 GMT, Coleen Phillimore wrote: >> Please review this small change to remove the unused left_n_bits(n) macro. This change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS. >> >> Thanks, Harold > > LGTM! Thanks David and Coleen for the reviews! ------------- PR: https://git.openjdk.java.net/jdk/pull/2944 From richard.reingruber at sap.com Fri Mar 12 18:01:19 2021 From: richard.reingruber at sap.com (Reingruber, Richard) Date: Fri, 12 Mar 2021 18:01:19 +0000 Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. 
[v2] In-Reply-To: References: <20oARtep00xABeE4RK_INvSURHwwiCmkZc4sjgXBuT8=.2bac7aab-8cd8-481c-b873-8f9521c9d7c4@github.com> Message-ID: I see. Thanks again for answering my questions. Richard. -----Original Message----- From: hotspot-dev On Behalf Of Vladimir Ivanov Sent: Freitag, 12. März 2021 12:33 To: hotspot-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Subject: Re: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v2] On Fri, 12 Mar 2021 10:54:44 GMT, Richard Reingruber wrote: > And can't the payload for deoptimization be represented as a scalarized array in that case too? Maybe not because of the special handling a VectorMask requires. Vector is special in a very similar way: debug info contains only vector value location. Custom logic is needed to turn it into scalarized array representation. And then there's still a step required to allocate the corresponding typed vector box which wraps the payload. I'm not saying it is not possible to represent current on-heap shape solely in debug info, but it would require significant refactorings/enhancements of existing machinery. And would complicate possible changes of the on-heap vector/mask shape. It is not something "cut in stone" and can evolve over time. ------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From hseigel at openjdk.java.net Fri Mar 12 19:04:08 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Fri, 12 Mar 2021 19:04:08 GMT Subject: Integrated: 8178348: left_n_bits(0) invokes undefined behavior In-Reply-To: <_i65eh8btlIcBdEthz_7eFnbAvOtvFQk9PX8PHw0DzE=.8af5cb8a-27dd-4f83-b9f3-8b3b3b3979e2@github.com> References: <_i65eh8btlIcBdEthz_7eFnbAvOtvFQk9PX8PHw0DzE=.8af5cb8a-27dd-4f83-b9f3-8b3b3b3979e2@github.com> Message-ID: On Thu, 11 Mar 2021 18:22:33 GMT, Harold Seigel wrote: > Please review this small change to remove the unused left_n_bits(n) macro. 
This change was regression tested with Mach5 tiers 1 and 2 on Linux, Windows, and Mac OS. > > Thanks, Harold This pull request has now been integrated. Changeset: 4b5c664b Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/4b5c664b Stats: 5 lines in 2 files changed: 0 ins; 4 del; 1 mod 8178348: left_n_bits(0) invokes undefined behavior Reviewed-by: dholmes, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/2944 From rrich at openjdk.java.net Fri Mar 12 17:53:07 2021 From: rrich at openjdk.java.net (Richard Reingruber) Date: Fri, 12 Mar 2021 17:53:07 GMT Subject: RFR: 8263125: During deoptimization vectors should reassign scalarized payload after all objects are reallocated. [v2] In-Reply-To: References: <20oARtep00xABeE4RK_INvSURHwwiCmkZc4sjgXBuT8=.2bac7aab-8cd8-481c-b873-8f9521c9d7c4@github.com> Message-ID: On Fri, 12 Mar 2021 16:58:18 GMT, Vladimir Kozlov wrote: >>> And can't the payload for deoptimization be represented as a scalarized array in that case too? Maybe not because of the special handling a VectorMask requires. >> >> Vector is special in a very similar way: debug info contains only vector value location. Custom logic is needed to turn it into scalarized array representation. And then there's still a step required to allocate the corresponding typed vector box which wraps the payload. >> >> I'm not saying it is not possible to represent current on-heap shape solely in debug info, but it would require significant refactorings/enhancements of existing machinery. And would complicate possible changes of the on-heap vector/mask shape. It is not something "cut in stone" and can evolve over time. > >> > And can't the payload for deoptimization be represented as a scalarized array in that case too? Maybe not because of the special handling a VectorMask requires. >> >> Vector is special in a very similar way: debug info contains only vector value location. Custom logic is needed to turn it into scalarized array representation. 
And then there's still a step required to allocate the corresponding typed vector box which wraps the payload. >> >> I'm not saying it is not possible to represent current on-heap shape solely in debug info, but it would require significant refactorings/enhancements of existing machinery. And would complicate possible changes of the on-heap vector/mask shape. It is not something "cut in stone" and can evolve over time. > > Yes, I said it too. It can be done but code would be much larger and more complicated. > And, as Vladimir I. correctly pointed, VectorAPI is new code which can be evolved. To have specialized code for it makes it easy to do experiments. > > Thank you @iwanowww and @reinrich for reviews. I see. Thanks for elaborating on my question! ------------- PR: https://git.openjdk.java.net/jdk/pull/2924 From coleenp at openjdk.java.net Fri Mar 12 21:29:14 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 12 Mar 2021 21:29:14 GMT Subject: RFR: 8263544: Unused argument in ConstantPoolCacheEntry::set_field() In-Reply-To: References: Message-ID: On Fri, 12 Mar 2021 21:23:35 GMT, Frederic Parain wrote: > Please review this trivial fix removing an unused argument from ConstantPoolCacheEntry::set_field(). > > Thank you, > > Fred Looks good and trivial! Good find. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2978 From github.com+71302734+amitdpawar at openjdk.java.net Fri Mar 12 20:01:17 2021 From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar) Date: Fri, 12 Mar 2021 20:01:17 GMT Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion Message-ID: In case of ParallelGC, oldgen expansion can happen during promotion. Expanding thread will touch the pages and can't request for task execution as this GC thread is already executing a task. The expanding thread holds the lock on "ExpandHeap_lock" to resize the oldgen and other threads may wait for their turn. 
This is a blocking call. This patch changes this behavior by adding another constructor in "MutexLocker" class to enable non-blocking or try_lock operation. This way one thread will acquire the lock and other threads can join pretouch work. Threads that fail to acquire the lock will join pretouch only when the task is marked ready by the expanding thread. The following minimum expansion sizes are seen during expansion. 1. 512KB without largepages and without UseNUMA. 2. 64MB without largepages and with UseNUMA. 3. 2MB (on x86) with large pages and without UseNUMA. 4. 64MB without large pages and with UseNUMA. When Oldgen is expanding repeatedly with smaller sizes then this change won't help. For such cases, the resize size should adapt to application demand to make use of this change. For example, if the application's nature triggers 100 expansions with smaller sizes in the same GC then it is better to increase the expansion size during each resize to reduce the number of resizes. If this patch is accepted then I plan to fix this case in another patch. All jtreg tests passed. Please review this change. ------------- Commit messages: - ParallelGC: Cooperative pretouch for oldgen expansion Changes: https://git.openjdk.java.net/jdk/pull/2976/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2976&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8260332 Stats: 185 lines in 9 files changed: 158 ins; 0 del; 27 mod Patch: https://git.openjdk.java.net/jdk/pull/2976.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2976/head:pull/2976 PR: https://git.openjdk.java.net/jdk/pull/2976 From fparain at openjdk.java.net Fri Mar 12 21:29:14 2021 From: fparain at openjdk.java.net (Frederic Parain) Date: Fri, 12 Mar 2021 21:29:14 GMT Subject: RFR: 8263544: Unused argument in ConstantPoolCacheEntry::set_field() Message-ID: Please review this trivial fix removing an unused argument from ConstantPoolCacheEntry::set_field(). 
Thank you, Fred ------------- Commit messages: - Remove unused argument in ConstantPoolCacheEntry::set_field() Changes: https://git.openjdk.java.net/jdk/pull/2978/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2978&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263544 Stats: 6 lines in 3 files changed: 0 ins; 3 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/2978.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2978/head:pull/2978 PR: https://git.openjdk.java.net/jdk/pull/2978 From iignatyev at openjdk.java.net Fri Mar 12 23:25:14 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Fri, 12 Mar 2021 23:25:14 GMT Subject: RFR: 8246494: introduce vm.flagless at-requires property In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 23:27:21 GMT, Igor Ignatyev wrote: > resurrecting old [RFR](https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-June/041981.html): > >> Hi all, >> >> could you please review the patch which introduces a new @requires property to filter out the tests which ignore externally provided JVM flags? >> >> the idea behind this patch is to have a way to clearly mark tests which ignore flags, so >> a) it's obvious that they don't execute a flag-guarded code/feature, and extra care should be taken to use them to verify any flag-guarded changed; >> b) they can be easily excluded from runs w/ flags. >> >> @requires and VMProps allows us to achieve both, so it's been decided to add a new property `vm.flagless`. `vm.flagless` is set to false if there are any XX flags other than `-XX:MaxRAMPercentage` and `-XX:CreateCoredumpOnCrash` (which are known to be set almost always) or any X flags other `-Xmixed`; in other words any tests w/ `@requires vm.flagless` will be excluded from runs w/ any other X / XX flags passed via `-vmoption` / `-javaoption`. 
in rare cases, when one still wants to run the tests marked by `vm.flagless` w/ external flags, `vm.flagless` can be forcefully set to true by setting any value to `TEST_VM_FLAGLESS` env. variable.
>>
>> this patch adds necessary common changes and marks common tests, namely Scimark, GTestWrapper and TestNativeProcessBuilder. Component-specific tests will be marked separately by the corresponding subtasks of 8151707[1].
>>
>> please note, the patch depends on CODETOOLS-7902336[2], which will be included in the next jtreg version, so this patch is to be integrated only after jtreg5.1 is promoted and we switch to use it by 8246387[3].
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8246494
>> webrev: http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00
>> testing: marked tests w/ different XX and X flags w/ and w/o TEST_VM_FLAGLESS env. var, and w/o any flags
>>
>> [1] https://bugs.openjdk.java.net/browse/JDK-8151707
>> [2] https://bugs.openjdk.java.net/browse/CODETOOLS-7902336
>> [3] https://bugs.openjdk.java.net/browse/JDK-8246387
>>
>
> after offline discussion with @pliden, it has been decided to reduce the scope of [8246499](https://bugs.openjdk.java.net/browse/JDK-8246499) and not mark the tests that use `UseXGC` flags for selection, e.g. `test/hotspot/jtreg/gc/z/TestSmallHeap.java`.
>
> Thanks,
> -- Igor

ping?
------------- PR: https://git.openjdk.java.net/jdk/pull/2800 From iignatyev at openjdk.java.net Sat Mar 13 04:38:48 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Sat, 13 Mar 2021 04:38:48 GMT Subject: RFR: 8263549: 8263412 can cause jtreg testlibrary split Message-ID: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> Hi all, could you please review this dull patch that replaces `ClassFileInstaller` w/ `jdk.test.lib.helpers.ClassFileInstaller` in all jtreg test descriptions to ensure we won't get split testlibrary, and removes `jdk/test/lib/ClassFileInstaller.java` (so it won't be accidentally used). from JBS: > after JDK-8263412, we might (again) encounter NCDFE b/c parts of testlibraries aren't on the classpath. this happens when jtreg builds `jdk.test.lib.helpers.ClassFileInstaller` as a part of test-specific code, but `ClassFileInstaller` as part of shared testibrary directory in one test, when in the following test, jtreg sees `ClassFileInstaller` in the shared directory, hence javac won't recompile it/its dependencies, but in runtime `jdk.test.lib.helpers.ClassFileInstaller` is nowhere to be found, hence we get NCDFE. 
testing: - [x] `grep ' ClassFileInstaller[^.]` - [ ] tier1-3 Thanks, -- Igor ------------- Commit messages: - fixup - update copyright year - rm test/lib/ClassFileInstaller.java - 's/ ClassFileInstaller / jdk.test.lib.helpers.ClassFileInstaller /g' Changes: https://git.openjdk.java.net/jdk/pull/2985/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2985&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263549 Stats: 1736 lines in 867 files changed: 0 ins; 67 del; 1669 mod Patch: https://git.openjdk.java.net/jdk/pull/2985.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2985/head:pull/2985 PR: https://git.openjdk.java.net/jdk/pull/2985 From iignatyev at openjdk.java.net Sat Mar 13 04:42:08 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Sat, 13 Mar 2021 04:42:08 GMT Subject: RFR: 8263549: 8263412 can cause jtreg testlibrary split In-Reply-To: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> References: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> Message-ID: On Sat, 13 Mar 2021 04:31:31 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this dull patch that replaces `ClassFileInstaller` w/ `jdk.test.lib.helpers.ClassFileInstaller` in all jtreg test descriptions to ensure we won't get split testlibrary, and removes `jdk/test/lib/ClassFileInstaller.java` (so it won't be accidentally used). > > from JBS: >> after JDK-8263412, we might (again) encounter NCDFE b/c parts of testlibraries aren't on the classpath. this happens when jtreg builds `jdk.test.lib.helpers.ClassFileInstaller` as a part of test-specific code, but `ClassFileInstaller` as part of shared testibrary directory in one test, when in the following test, jtreg sees `ClassFileInstaller` in the shared directory, hence javac won't recompile it/its dependencies, but in runtime `jdk.test.lib.helpers.ClassFileInstaller` is nowhere to be found, hence we get NCDFE. 
> > testing: > - [x] `grep ' ClassFileInstaller[^.]` > - [ ] tier1-3 > > Thanks, > -- Igor note for reviewers: the big part of the patch is just `sed -i 's/ ClassFileInstaller / jdk.test.lib.helpers.ClassFileInstaller /g'` ------------- PR: https://git.openjdk.java.net/jdk/pull/2985 From dholmes at openjdk.java.net Sat Mar 13 05:52:22 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Sat, 13 Mar 2021 05:52:22 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v27] In-Reply-To: References: Message-ID: On Fri, 12 Mar 2021 16:44:38 GMT, Anton Kozlov wrote: >> Please review the implementation of JEP 391: macOS/AArch64 Port. >> >> It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. >> >> Major changes are in: >> * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) >> * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) >> * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. >> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) > > Anton Kozlov has updated the pull request incrementally with one additional commit since the last revision: > > Fix most of issues in java/foreign/ tests > > Failures related to va_args are tracked in JDK-8263512. 
src/hotspot/share/runtime/safefetch.inline.hpp line 35:

> 33: inline int SafeFetch32(int* adr, int errValue) {
> 34:   assert(StubRoutines::SafeFetch32_stub(), "stub not yet generated");
> 35:   Thread* thread = Thread::current_or_null_safe();

Sorry but this should be MACOS_AARCH64 only. All three lines need to be ifdef'd if you are going to include the assertion.

Thanks,
David

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200

From dholmes at openjdk.java.net Sat Mar 13 06:06:06 2021
From: dholmes at openjdk.java.net (David Holmes)
Date: Sat, 13 Mar 2021 06:06:06 GMT
Subject: RFR: 8263544: Unused argument in ConstantPoolCacheEntry::set_field()
In-Reply-To:
References:
Message-ID:

On Fri, 12 Mar 2021 21:23:35 GMT, Frederic Parain wrote:

> Please review this trivial fix removing an unused argument from ConstantPoolCacheEntry::set_field().
>
> Thank you,
>
> Fred

Looks good and trivial.

Thanks,
David

-------------

Marked as reviewed by dholmes (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/2978

From david.holmes at oracle.com Sat Mar 13 06:10:43 2021
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 13 Mar 2021 16:10:43 +1000
Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion
In-Reply-To:
References:
Message-ID: <7cf911f6-b50b-d4ea-b19f-8486facd3739@oracle.com>

Hi Amit,

On 13/03/2021 6:01 am, Amit Pawar wrote:
> In case of ParallelGC, oldgen expansion can happen during promotion. Expanding thread will touch the pages and can't request for task execution as this GC thread is already executing a task. The expanding thread holds the lock on "ExpandHeap_lock" to resize the oldgen and other threads may wait for their turn. This is a blocking call.
>
> This patch changes this behavior by adding another constructor in "MutexLocker" class to enable non blocking or try_lock operation. This way one thread will acquire the lock and other threads can join pretouch work.
Threads failed to acquire the lock will join pretouch only when task is marked ready by expanding thread.

Sorry but I do not like the change to MutexLocker - a tryLock is only maybe locking the mutex and the fact you have to return a value to indicate whether it managed to lock or not, strikes me as a bad use of a RAII style of object. I think explicit lock/try_lock/unlock would be much clearer here.

Thanks,
David

>
> Following minimum expansion size are seen during expansion.
> 1. 512KB without largepages and without UseNUMA.
> 2. 64MB without largepages and with UseNUMA,
> 3. 2MB (on x86) with large pages and without UseNUMA,
> 4. 64MB without large pages and with UseNUMA.
>
> When Oldgen is expanding repeatedly with smaller size then this change wont help. For such cases, resize size should adapt to application demand to make use of this change. For example if application nature triggers 100 expansion with smaller sizes in same GC then it is better to increase the expansion size during each resize to reduce the number of resizes. If this patch is accepted then will plan to fix this case in another patch.
>
> Jtreg all test passed.
>
> Please review this change.
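
To illustrate the RAII versus try-lock point raised in the reply above, here is a minimal sketch in standard C++. It assumes `std::mutex` as a stand-in for HotSpot's `Mutex`; `expand_lock`, `expansions`, `helpers` and both functions are hypothetical names that do not appear in the patch. The first function uses explicit `try_lock`/`unlock`, the second a scoped guard that may or may not own the lock, so the caller still has to query it.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-ins, not taken from the patch under review.
std::mutex expand_lock;           // plays the role of ExpandHeap_lock
std::atomic<int> expansions{0};   // times a thread won the lock and "expanded"
std::atomic<int> helpers{0};      // times a thread lost and would help pretouch

// Explicit try_lock/unlock: the outcome of the attempt is visible at the call site.
bool expand_or_help_explicit() {
  if (expand_lock.try_lock()) {
    ++expansions;                 // this thread performs the resize
    expand_lock.unlock();
    return true;
  }
  ++helpers;                      // lock held elsewhere: join the pretouch work
  return false;
}

// Scoped variant: the guard only "maybe" holds the lock, so it must be asked.
bool expand_or_help_scoped() {
  std::unique_lock<std::mutex> guard(expand_lock, std::try_to_lock);
  if (guard.owns_lock()) {
    ++expansions;
    return true;                  // guard releases the lock on scope exit
  }
  ++helpers;
  return false;
}
```

Both variants do the same work; the scoped one still needs the `owns_lock()` check, which is the "return a value to indicate whether it managed to lock" concern in a different form.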
> > ------------- > > Commit messages: > - ParallelGC: Cooperative pretouch for oldgen expansion > > Changes: https://git.openjdk.java.net/jdk/pull/2976/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2976&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8260332 > Stats: 185 lines in 9 files changed: 158 ins; 0 del; 27 mod > Patch: https://git.openjdk.java.net/jdk/pull/2976.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/2976/head:pull/2976 > > PR: https://git.openjdk.java.net/jdk/pull/2976 > From iklam at openjdk.java.net Sat Mar 13 06:19:06 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 13 Mar 2021 06:19:06 GMT Subject: RFR: 8263549: 8263412 can cause jtreg testlibrary split In-Reply-To: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> References: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> Message-ID: <2ydx9TUT868fiCQNxF6IaEsq9toXBiDJjJK3GqWRREE=.77fc9c8d-efed-47e8-8f53-02255da6e97e@github.com> On Sat, 13 Mar 2021 04:31:31 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this dull patch that replaces `ClassFileInstaller` w/ `jdk.test.lib.helpers.ClassFileInstaller` in all jtreg test descriptions to ensure we won't get split testlibrary, and removes `jdk/test/lib/ClassFileInstaller.java` (so it won't be accidentally used). > > from JBS: >> after JDK-8263412, we might (again) encounter NCDFE b/c parts of testlibraries aren't on the classpath. this happens when jtreg builds `jdk.test.lib.helpers.ClassFileInstaller` as a part of test-specific code, but `ClassFileInstaller` as part of shared testibrary directory in one test, when in the following test, jtreg sees `ClassFileInstaller` in the shared directory, hence javac won't recompile it/its dependencies, but in runtime `jdk.test.lib.helpers.ClassFileInstaller` is nowhere to be found, hence we get NCDFE. 
> > testing: > - [x] `grep ' ClassFileInstaller[^.]` > - [ ] tier1-3 > > Thanks, > -- Igor I did this and scanned the differences (with the diff file from the webrev) and it looks reasonable to me. grep '^[+-]' diff.txt | grep -v Copyright | grep -v '^.[+-]' | less It looks like most of the changes are mechanical. There were only a few cases where manual changes were made. I trusted that you have tested those cases individually. But I don't understand why this error can happen. It seems like jtreg would allow two test cases to interfere with each other. ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2985 From iignatyev at openjdk.java.net Sat Mar 13 06:30:00 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Sat, 13 Mar 2021 06:30:00 GMT Subject: RFR: 8263549: 8263412 can cause jtreg testlibrary split [v2] In-Reply-To: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> References: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> Message-ID: > Hi all, > > could you please review this dull patch that replaces `ClassFileInstaller` w/ `jdk.test.lib.helpers.ClassFileInstaller` in all jtreg test descriptions to ensure we won't get split testlibrary, and removes `jdk/test/lib/ClassFileInstaller.java` (so it won't be accidentally used). > > from JBS: >> after JDK-8263412, we might (again) encounter NCDFE b/c parts of testlibraries aren't on the classpath. this happens when jtreg builds `jdk.test.lib.helpers.ClassFileInstaller` as a part of test-specific code, but `ClassFileInstaller` as part of shared testibrary directory in one test, when in the following test, jtreg sees `ClassFileInstaller` in the shared directory, hence javac won't recompile it/its dependencies, but in runtime `jdk.test.lib.helpers.ClassFileInstaller` is nowhere to be found, hence we get NCDFE. 
> > testing: > - [x] `grep ' ClassFileInstaller[^.]` > - [ ] tier1-3 > > Thanks, > -- Igor Igor Ignatyev has updated the pull request incrementally with one additional commit since the last revision: fix compilation error in IncorrectAOTLibraryTest test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2985/files - new: https://git.openjdk.java.net/jdk/pull/2985/files/6e53ad97..ff6d4f91 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2985&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2985&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/2985.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2985/head:pull/2985 PR: https://git.openjdk.java.net/jdk/pull/2985 From iignatyev at openjdk.java.net Sat Mar 13 06:44:12 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Sat, 13 Mar 2021 06:44:12 GMT Subject: RFR: 8263549: 8263412 can cause jtreg testlibrary split [v3] In-Reply-To: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> References: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> Message-ID: > Hi all, > > could you please review this dull patch that replaces `ClassFileInstaller` w/ `jdk.test.lib.helpers.ClassFileInstaller` in all jtreg test descriptions to ensure we won't get split testlibrary, and removes `jdk/test/lib/ClassFileInstaller.java` (so it won't be accidentally used). > > from JBS: >> after JDK-8263412, we might (again) encounter NCDFE b/c parts of testlibraries aren't on the classpath. 
this happens when jtreg builds `jdk.test.lib.helpers.ClassFileInstaller` as a part of test-specific code, but `ClassFileInstaller` as part of shared testibrary directory in one test, when in the following test, jtreg sees `ClassFileInstaller` in the shared directory, hence javac won't recompile it/its dependencies, but in runtime `jdk.test.lib.helpers.ClassFileInstaller` is nowhere to be found, hence we get NCDFE. > > testing: > - [x] `grep ' ClassFileInstaller[^.]` > - [ ] tier1-3 > > Thanks, > -- Igor Igor Ignatyev has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: fix compilation error in IncorrectAOTLibraryTest test ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2985/files - new: https://git.openjdk.java.net/jdk/pull/2985/files/ff6d4f91..3a3b7a84 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2985&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2985&range=01-02 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/2985.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2985/head:pull/2985 PR: https://git.openjdk.java.net/jdk/pull/2985 From iignatyev at openjdk.java.net Sat Mar 13 06:44:13 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Sat, 13 Mar 2021 06:44:13 GMT Subject: RFR: 8263549: 8263412 can cause jtreg testlibrary split [v3] In-Reply-To: <2ydx9TUT868fiCQNxF6IaEsq9toXBiDJjJK3GqWRREE=.77fc9c8d-efed-47e8-8f53-02255da6e97e@github.com> References: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> <2ydx9TUT868fiCQNxF6IaEsq9toXBiDJjJK3GqWRREE=.77fc9c8d-efed-47e8-8f53-02255da6e97e@github.com> Message-ID: On Sat, 13 Mar 2021 06:16:37 GMT, Ioi Lam wrote: >> Igor Ignatyev has refreshed the contents of this pull 
request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> fix compilation error in IncorrectAOTLibraryTest test > > I did this and scanned the differences (with the diff file from the webrev) and it looks reasonable to me. > > grep '^[+-]' diff.txt | grep -v Copyright | grep -v '^.[+-]' | less > > It looks like most of the changes are mechanical. There were only a few cases where manual changes were made. I trusted that you have tested those cases individually. > > But I don't understand why this error can happen. It seems like jtreg would allow two test cases to interfere with each other. Hi Ioi, thanks for review this, I ran the whole tier1-3 jobs which should provide enough coverage. as oracle builds don't have AOT feature enabled, I missed a compilation error in `IncorrectAOTLibraryTest` test. the test failed in GitHub action and should be fixed by [3a3b7a8](https://github.com/openjdk/jdk/pull/2985/commits/3a3b7a846289181b466b3c1eb478a0a571d9468b). -- Igor ------------- PR: https://git.openjdk.java.net/jdk/pull/2985 From stuefe at openjdk.java.net Sat Mar 13 06:56:10 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 13 Mar 2021 06:56:10 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v16] In-Reply-To: References: <5Hmhp7S8616Kfbdsu5ObzFNy2uUFgJPCp0kvHr-U310=.3cabbe74-fe65-436b-973d-d6f3e64cd743@github.com> Message-ID: <9BB5Ykj1Pmbw0BqOISSTpmg-umLX_jnb87Jl-qZZVHI=.08b679db-c76d-49de-93e1-52d260932b4a@github.com> On Thu, 11 Mar 2021 22:32:35 GMT, Marcus G K Williams wrote: >>> Hi @kstefanj and @tstuefe . Trying to resolve your comments and working through your suggestions. I will be responding more over the next day or so as I try to implement and understand what you are proposing. 
Thanks again for your review and suggestions. >> >> Well, thanks for your patience :) > > Hello @tstuefe & @kstefanj. > > I've updated the change to implement your suggestions, all of them hopefully. I'd appreciate any further review and suggestions. > > There is one issue with the current code set for which I wanted to get suggestions. Invariably > `os::page_size_for_region_aligned(bytes, 1)` in `os::Linux::reserve_memory_special_huge_tlbfs_mixed` returns with 4096, which of course breaks the assert `assert(large_page_size > (size_t)os::vm_page_size(), "large page size: %ld not larger than small page size: %ld", large_page_size, (size_t)os::vm_page_size());` - Line 4059 in os_linux.cpp. > > Here is the ps.log: > >> # A fatal error has been detected by the Java Runtime Environment: >> # >> # Internal Error (/home/mgkwill/src/git/jdk/src/hotspot/os/linux/os_linux.cpp:4059), pid=2789288, tid=2789289 >> # assert(large_page_size > (size_t)os::vm_page_size()) failed: large page size: 4096 not larger than small page size: 4096 >> # >> # JRE version: (17.0) (slowdebug build ) >> # Java VM: OpenJDK 64-Bit Server VM (slowdebug 17-internal+0-adhoc.mgkwill.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, serial gc, linux-amd64) >> # Problematic frame: >> # V [libjvm.so+0xfb660a] os::Linux::reserve_memory_special_huge_tlbfs_mixed(unsigned long, unsigned long, char*, bool)+0x146 >> # >> # Core dump will be written. 
Default location: Core dumps may be processed with "/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h" (or dumping to /home/mgkwill/src/git/jdk/build/linux-x86_64-server-slowdebug/test-support/jtreg_test_hotspot_jtreg_runtime_os_TestTracePageSizes_java/scratch/1/core.2789288) >> # >> # An error report file with more information is saved as: >> # /home/mgkwill/src/git/jdk/build/linux-x86_64-server-slowdebug/test-support/jtreg_test_hotspot_jtreg_runtime_os_TestTracePageSizes_java/scratch/1/hs_err_pid2789288.log >> # >> # >> Aborted (core dumped) >> ps-2789288.log: >> [0.002s][info][pagesize] Available page sizes: 4k, 2M, 1G >> [0.003s][info][pagesize] Available large page sizes: 2M, 1G >> [0.005s][info][pagesize] Large page size returned from os::page_size_for_region_aligned: 2097152, for bytes: 251658240 >> [0.005s][info][pagesize] Page size returned from (size_t)os::vm_page_size(): 4096 >> [0.005s][info][pagesize] Memory: 4k page, physical 131844416k(13735672k free), swap 0k(0k free) >> [0.005s][info][pagesize] 2048k default large page >> [0.005s][info][pagesize] Page Sizes: 4k, 2M, 1G >> [0.005s][info][pagesize] CodeHeap 'non-nmethods': min=4M max=6M base=0x00007f8edc600000 page_size=2M size=6M >> [0.006s][info][pagesize] CodeHeap 'profiled nmethods': min=4M max=116M base=0x00007f8edcc00000 page_size=2M size=116M >> [0.007s][info][pagesize] CodeHeap 'non-profiled nmethods': min=4M max=118M base=0x00007f8ee4000000 page_size=2M size=118M >> [0.023s][info][pagesize] Large page size returned from os::page_size_for_region_aligned: 2097152, for bytes: 16202596352 >> [0.023s][info][pagesize] Page size returned from (size_t)os::vm_page_size(): 4096 >> [0.023s][info][pagesize] Memory: 4k page, physical 131844416k(13735672k free), swap 0k(0k free) >> [0.023s][info][pagesize] 2048k default large page >> [0.023s][info][pagesize] Page Sizes: 4k, 2M, 1G >> [0.023s][info][pagesize] Heap: min=8M max=15452M base=0x000000043a400000 page_size=2M size=15452M >> 
[0.023s][info][pagesize] Card Table: min=31645697B max=31645697B base=0x00007f8ef8919000 page_size=4K size=30908K >> [0.825s][info][pagesize] Large page size returned from os::page_size_for_region_aligned: 4096, for bytes: 21098496 >> [0.825s][info][pagesize] Page size returned from (size_t)os::vm_page_size(): 4096 >> [0.825s][info][pagesize] Memory: 4k page, physical 131844416k(13732388k free), swap 0k(0k free) >> [0.825s][info][pagesize] 2048k default large page >> [0.825s][info][pagesize] Page Sizes: 4k, 2M, 1G > > I'm not sure I understand why `Large page size returned from os::page_size_for_region_aligned: 4096, for bytes: 21098496`. > > Any suggestions as to the issue and solution? Once I solve this I will remove the excessive logging. > > FYI: The above issue is what is failing linux x86 and x64 testing: > Test results: passed: 4; failed: 2 Report written to /home/mgkwill/src/git/jdk/build/linux-x86_64-server-slowdebug/test-results/jtreg_test_hotspot_jtreg_runtime_os_TestTracePageSizes_java/html/report.html Error: Some tests failed or other problems occurred. > Results written to /home/mgkwill/src/git/jdk/build/linux-x86_64-server-slowdebug/test-support/jtreg_test_hotspot_jtreg_runtime_os_TestTracePageSizes_java > Finished running test 'jtreg:test/hotspot/jtreg/runtime/os/TestTracePageSizes.java' > Test report is stored in build/linux-x86_64-server-slowdebug/test-results/jtreg_test_hotspot_jtreg_runtime_os_TestTracePageSizes_java > > ============================== > Test summary > ============================== > TEST TOTAL PASS FAIL ERROR > jtreg:test/hotspot/jtreg/runtime/os/TestTracePageSizes.java >>> 6 4 2 0 > > Thanks, > Marcus Hi Markus, could you please edit your last comment and use three backticks: long log output Thanks! 
------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From iklam at openjdk.java.net Sat Mar 13 07:14:09 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Sat, 13 Mar 2021 07:14:09 GMT Subject: RFR: 8263549: 8263412 can cause jtreg testlibrary split [v3] In-Reply-To: <2ydx9TUT868fiCQNxF6IaEsq9toXBiDJjJK3GqWRREE=.77fc9c8d-efed-47e8-8f53-02255da6e97e@github.com> References: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> <2ydx9TUT868fiCQNxF6IaEsq9toXBiDJjJK3GqWRREE=.77fc9c8d-efed-47e8-8f53-02255da6e97e@github.com> Message-ID: On Sat, 13 Mar 2021 06:16:37 GMT, Ioi Lam wrote: > But I don't understand why this error can happen. It seems like jtreg would allow two test cases to interfere with each other. The root cause seems to be https://bugs.openjdk.java.net/browse/CODETOOLS-7902847 ------------- PR: https://git.openjdk.java.net/jdk/pull/2985 From stuefe at openjdk.java.net Sat Mar 13 07:38:11 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 13 Mar 2021 07:38:11 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v15] In-Reply-To: References: Message-ID: On Fri, 15 Jan 2021 11:09:28 GMT, Stefan Johansson wrote: >> Marcus G K Williams has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 23 commits: >> >> - Merge branch 'master' into update_hlp >> - Merge branch 'master' into update_hlp >> - Remove extraneous ' from warning >> >> Signed-off-by: Marcus G K Williams >> - Merge branch 'master' into update_hlp >> - Merge branch 'master' into update_hlp >> - Merge branch 'master' into update_hlp >> - Fix os::large_page_size() in last update >> >> Signed-off-by: Marcus G K Williams >> - Ivan W. 
Requested Changes
>>
>> Removed os::Linux::select_large_page_size and
>> use os::page_size_for_region instead
>>
>> Removed Linux::find_large_page_size and use
>> register_large_page_sizes. Streamlined
>> Linux::setup_large_page_size
>>
>> Signed-off-by: Marcus G K Williams
>> - Fix space format, use Linux:: for local func.
>>
>> Signed-off-by: Marcus G K Williams
>> - Merge branch 'update_hlp' of github.com:mgkwill/jdk into update_hlp
>> - ... and 13 more: https://git.openjdk.java.net/jdk/compare/da2415fe...d73e7a4c
>
> src/hotspot/os/linux/os_linux.cpp line 4013:
>
>> 4011: assert(UseLargePages && UseHugeTLBFS, "only for Huge TLBFS large pages");
>> 4012: assert(is_aligned(bytes, large_page_size), "Unaligned size");
>> 4013: assert(is_aligned(req_addr, large_page_size), "Unaligned address");
>
> Adding an assert here that `large_page_size` is larger than os::vm_page_size (small page size) to ensure we actually get a large page size from `page_size_for_region_aligned()`. Otherwise the passed-in size wasn't correctly aligned.

@kstefanj Hmm. `os::page_size_for_region_xxx` can return any page size, including the base page size. Caller may reasonably pass in any reserve size; we may run on a system where the only large page available is > caller size, or we specified LargePageSizeInBytes=1G. I actually would prefer this function to graciously handle the case of too small input size and just allocate whatever fits the caller size best; if it's only 4K pages, so be it. But this also could be done in a future RFE.

> src/hotspot/os/linux/os_linux.cpp line 4046:
>
>> 4044: // Select large_page_size from _page_sizes
>> 4045: // that is smaller than size_t bytes
>> 4046: size_t large_page_size = os::page_size_for_region_aligned(bytes, 1);
>
> This one also needs to use `os::page_size_for_region_unaligned(...)` since we know we have a size that needs both small and large pages.
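
The aligned/unaligned distinction discussed here can be checked against the sizes in the log quoted earlier in this thread. The following is a simplified, standalone model, not the HotSpot implementation: `page_size_for_region` and its parameters are assumptions made for illustration, and the page sizes (1G, 2M, 4K) are the ones the log reports as available.

```cpp
#include <cstddef>
#include <vector>

// Simplified model of page-size selection (hypothetical, for illustration only).
// page_sizes must be non-empty and sorted from largest to smallest; the last
// entry is the base (small) page size. min_pages must be greater than zero.
std::size_t page_size_for_region(const std::vector<std::size_t>& page_sizes,
                                 std::size_t region_size,
                                 std::size_t min_pages,
                                 bool must_be_aligned) {
  const std::size_t max_page_size = region_size / min_pages;
  for (std::size_t page_size : page_sizes) {
    // Take the largest page that fits and, if requested, divides the region evenly.
    if (page_size <= max_page_size &&
        (!must_be_aligned || region_size % page_size == 0)) {
      return page_size;
    }
  }
  return page_sizes.back();  // nothing larger qualified: use the base page size
}
```

Under this model, 21098496 bytes is ten 2M pages plus 126976 bytes, so the aligned query can only be satisfied by 4096, while the unaligned query returns 2097152. A region of 251658240 bytes (exactly 120 pages of 2M) yields 2097152 either way, which agrees with the values in the log.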
+1 to what Stefan wrote

-------------

PR: https://git.openjdk.java.net/jdk/pull/1153

From stuefe at openjdk.java.net Sat Mar 13 07:38:07 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Sat, 13 Mar 2021 07:38:07 GMT
Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v19]
In-Reply-To:
References:
Message-ID:

On Thu, 11 Mar 2021 22:25:28 GMT, Marcus G K Williams wrote:

>> When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using
>> Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k).
>>
>> This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size().
>>
>> In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved.
>
> Marcus G K Williams has updated the pull request incrementally with two additional commits since the last revision:
>
> - Fix whitespace error
>
> Signed-off-by: Marcus G K Williams
> - Fix first set of TestTracePageSizes.java issues
>
> Signed-off-by: Marcus G K Williams

Hi Marcus,

first off, starting to look better and better.
About the assert, I'm quite sure this is the result of using `os::page_size_for_region_aligned()` - as Stefan mentioned above you should use the unaligned version of this function since the input size 21098496 is not aligned to 2M, which causes the function to return 4K even though we have space enough to fit 9-10 2M pages here:

[0.825s][info][pagesize] Large page size returned from os::page_size_for_region_aligned: 4096, for bytes: 21098496

Sorry, I don't have time to dive deeper. I know @kstefanj had some more ideas and this ties in with his work, but he may be occupied with other things right now and may not be quick to reply.

All in all this starts to look real good. I try to resist the urge to refactor everything on the back of this PR. There will be work left for future RFEs.

Cheers, Thomas

-------------

PR: https://git.openjdk.java.net/jdk/pull/1153

From ysuenaga at openjdk.java.net Sat Mar 13 09:44:07 2021
From: ysuenaga at openjdk.java.net (Yasumasa Suenaga)
Date: Sat, 13 Mar 2021 09:44:07 GMT
Subject: Integrated: 8262491: AArch64: CPU description should contain compatible board list
In-Reply-To:
References:
Message-ID:

On Sat, 27 Feb 2021 02:11:48 GMT, Yasumasa Suenaga wrote:

> HotSpot generates CPU description when it is started. We can see it `jdk.CPUInformation` JFR event as below:
>
> $ jfr print --events jdk.CPUInformation raspi4.jfr
> jdk.CPUInformation {
> startTime = 22:57:13.521
> cpu = "AArch64"
> description = "AArch64 0x41:0x0:0xd08:3, simd, crc"
> sockets = 4
> cores = 4
> hwThreads = 4
> }
>
> `description` contains "AArch64", it is fixed value, we cannot guess the process was run on what machine (SoC).
>
> In Linux, we can use `compatible` property in device tree to guess the machine. The 'compatible' property contains a sorted list of strings starting with the exact name of the machine, followed by an optional list of boards it is compatible with sorted from most compatible to least.
> > After this change, we can get the description as below: > > jdk.CPUInformation { > startTime = 00:32:49.767 > cpu = "AArch64" > description = "raspberrypi,4-model-b brcm,bcm2711 0x41:0x0:0xd08:3, simd, crc" > sockets = 4 > cores = 4 > hwThreads = 4 > } > > In Linux on AMD64, we can see as following, then we can guess the CPU model from it. The same should do for AArch64. > > jdk.CPUInformation { > startTime = 17:28:03.907 > cpu = "AMD (null) (HT) SSE SSE2 SSE3 SSSE3 SSE4.1 SSE4.2 SSE4A AMD64" > description = "Brand: AMD Ryzen 3 3300X 4-Core Processor , Vendor: AuthenticAMD > Family: (0x17), Model: (0x71), Stepping: 0x0 > Ext. family: 0x8, Ext. model: 0x7, Type: 0x0, Signature: 0x00870f10 > Features: ebx: 0x01020800, ecx: 0xfed83203, edx: 0x178bfbff > Ext. features: eax: 0x00870f10, ebx: 0x20000000, ecx: 0x004003f3, edx: 0x2fd3fbff > Supports: On-Chip FPU, Virtual Mode Extensions, Debugging Extensions, Page Size Extensions, Time Stamp Counter, Model Specific Registers, Physical Address Extension, Machine Check Exceptions, CMPXCHG8B Instruction, On-Chip APIC, Fast System Call, Memory Type Range Registers, Page Global Enable, Machine Check Architecture, Conditional Mov Instruction, Page Attribute Table, 36-bit Page Size Extension, CLFLUSH Instruction, Intel Architecture MMX Technology, Fast Float Point Save and Restore, Streaming SIMD extensions, Streaming SIMD extensions 2, Hyper Threading, Streaming SIMD Extensions 3, PCLMULQDQ, Supplemental Streaming SIMD Extensions 3, Fused Multiply-Add, CMPXCHG16B, Streaming SIMD extensions 4.1, Streaming SIMD extensions 4.2, MOVBE, Popcount instruction, AESNI, XSAVE, OSXSAVE, AVX, F16C, LAHF/SAHF instruction support, Core multi-processor leagacy mode, Advanced Bit Manipulations: LZCNT, SSE4A: MOVNTSS, MOVNTSD, EXTRQ, INSERTQ, Misaligned SSE mode, SYSCALL/SYSRET, Execute Dis able Bit, RDTSCP, Intel 64 Architecture" > sockets = 1 > cores = 2 > hwThreads = 2 > } This pull request has now been integrated. 
Changeset: a5287710 Author: Yasumasa Suenaga URL: https://git.openjdk.java.net/jdk/commit/a5287710 Stats: 40 lines in 4 files changed: 35 ins; 0 del; 5 mod 8262491: AArch64: CPU description should contain compatible board list Reviewed-by: akozlov, aph ------------- PR: https://git.openjdk.java.net/jdk/pull/2759 From manc at openjdk.java.net Sat Mar 13 10:48:21 2021 From: manc at openjdk.java.net (Man Cao) Date: Sat, 13 Mar 2021 10:48:21 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation Message-ID: Hi all, Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. -Man ------------- Commit messages: - 8263551: Provide shared lock-free FIFO queue implementation Changes: https://git.openjdk.java.net/jdk/pull/2986/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2986&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263551 Stats: 365 lines in 4 files changed: 232 ins; 131 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/2986.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2986/head:pull/2986 PR: https://git.openjdk.java.net/jdk/pull/2986 From manc at openjdk.java.net Sat Mar 13 11:02:08 2021 From: manc at openjdk.java.net (Man Cao) Date: Sat, 13 Mar 2021 11:02:08 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation In-Reply-To: References: Message-ID: <9IfAIpqnmfePZeYWjSTS1q_tbYLLZmWFAYW5yHh5YNE=.17034449-5017-42af-95f3-48886b98fa6e@github.com> On Sat, 13 Mar 2021 10:41:44 GMT, Man Cao wrote: > Hi all, > > Could anyone review this change that is mainly code motion? 
It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. > > The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. > > -Man src/hotspot/share/utilities/lockFreeQueue.hpp line 99: > 97: } else { > 98: assert(get_next(*old_tail) == NULL, "invariant"); > 99: Atomic::store(next_ptr(*old_tail), &first); I changed this store from a normal store to an Atomic store. Otherwise there is a data race between this store and the load of _head->_next in pop(). ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From david.holmes at oracle.com Sat Mar 13 12:12:21 2021 From: david.holmes at oracle.com (David Holmes) Date: Sat, 13 Mar 2021 22:12:21 +1000 Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation In-Reply-To: <9IfAIpqnmfePZeYWjSTS1q_tbYLLZmWFAYW5yHh5YNE=.17034449-5017-42af-95f3-48886b98fa6e@github.com> References: <9IfAIpqnmfePZeYWjSTS1q_tbYLLZmWFAYW5yHh5YNE=.17034449-5017-42af-95f3-48886b98fa6e@github.com> Message-ID: <1a8e13d3-3702-d6bd-596f-bd1089795960@oracle.com> On 13/03/2021 9:02 pm, Man Cao wrote: > On Sat, 13 Mar 2021 10:41:44 GMT, Man Cao wrote: > >> Hi all, >> >> Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. >> >> The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. 
>> >> -Man > > src/hotspot/share/utilities/lockFreeQueue.hpp line 99: > >> 97: } else { >> 98: assert(get_next(*old_tail) == NULL, "invariant"); >> 99: Atomic::store(next_ptr(*old_tail), &first); > > I changed this store from a normal store to an Atomic store. Otherwise there is a data race between this store and the load of _head->_next in pop(). Atomic store just ensures no word-tearing, it has no impact on ordering or data races. David ----- > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/2986 > From dcubed at openjdk.java.net Sat Mar 13 14:23:08 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Sat, 13 Mar 2021 14:23:08 GMT Subject: RFR: 8263549: 8263412 can cause jtreg testlibrary split [v3] In-Reply-To: References: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> Message-ID: On Sat, 13 Mar 2021 06:44:12 GMT, Igor Ignatyev wrote: >> Hi all, >> >> could you please review this dull patch that replaces `ClassFileInstaller` w/ `jdk.test.lib.helpers.ClassFileInstaller` in all jtreg test descriptions to ensure we won't get split testlibrary, and removes `jdk/test/lib/ClassFileInstaller.java` (so it won't be accidentally used). >> >> from JBS: >>> after JDK-8263412, we might (again) encounter NCDFE b/c parts of testlibraries aren't on the classpath. this happens when jtreg builds `jdk.test.lib.helpers.ClassFileInstaller` as a part of test-specific code, but `ClassFileInstaller` as part of shared testibrary directory in one test, when in the following test, jtreg sees `ClassFileInstaller` in the shared directory, hence javac won't recompile it/its dependencies, but in runtime `jdk.test.lib.helpers.ClassFileInstaller` is nowhere to be found, hence we get NCDFE. >> >> testing: >> - [x] `grep ' ClassFileInstaller[^.]` >> - [ ] tier1-3 >> >> Thanks, >> -- Igor > > Igor Ignatyev has refreshed the contents of this pull request, and previous commits have been removed. 
The incremental views will show differences compared to the previous content of the PR. I downloaded the patch and used Ioi's cmd pattern to scroll through the changes. I can't honestly say that I looked at every line since 867 changed files would overwhelm anyone's brain... I did notice a couple of `@run main` instead of `@run driver` calls to the ClassFileInstaller, but those are pre-existing. Thumbs up. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2985 From iignatyev at openjdk.java.net Sat Mar 13 14:56:06 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Sat, 13 Mar 2021 14:56:06 GMT Subject: RFR: 8263549: 8263412 can cause jtreg testlibrary split [v3] In-Reply-To: References: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> Message-ID: On Sat, 13 Mar 2021 14:20:20 GMT, Daniel D. Daugherty wrote: >> Igor Ignatyev has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. > > I downloaded the patch and used Ioi's cmd pattern to scroll through > the changes. I can't honestly say that I looked at every line since 867 > changed files would overwhelm anyone's brain... > > I did notice a couple of `@run main` instead of `@run driver` calls > to the ClassFileInstaller, but those are pre-existing. > > Thumbs up. Hi Dan, Thanks for your review! > I did notice a couple of @run main instead of @run driver calls to the ClassFileInstaller, but those are pre-existing. I noticed this too, planning to fix that with a separate RFE. 
-- Igor ------------- PR: https://git.openjdk.java.net/jdk/pull/2985 From iignatyev at openjdk.java.net Sat Mar 13 14:56:07 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Sat, 13 Mar 2021 14:56:07 GMT Subject: Integrated: 8263549: 8263412 can cause jtreg testlibrary split In-Reply-To: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> References: <68VznhnTGY9ALWqvXzAulGxWtvI5-z2ljGj8zy07SKc=.1b9ef93b-f288-4e96-8ea7-b7080c93fa4f@github.com> Message-ID: <_mpUEh9QVZbxsshQP1hNQlstVSmOwqImgOwhZV4lvIQ=.f49472ae-9d8e-4058-beb8-77e0f7b4fd3a@github.com> On Sat, 13 Mar 2021 04:31:31 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this dull patch that replaces `ClassFileInstaller` w/ `jdk.test.lib.helpers.ClassFileInstaller` in all jtreg test descriptions to ensure we won't get split testlibrary, and removes `jdk/test/lib/ClassFileInstaller.java` (so it won't be accidentally used). > > from JBS: >> after JDK-8263412, we might (again) encounter NCDFE b/c parts of testlibraries aren't on the classpath. this happens when jtreg builds `jdk.test.lib.helpers.ClassFileInstaller` as a part of test-specific code, but `ClassFileInstaller` as part of shared testibrary directory in one test, when in the following test, jtreg sees `ClassFileInstaller` in the shared directory, hence javac won't recompile it/its dependencies, but in runtime `jdk.test.lib.helpers.ClassFileInstaller` is nowhere to be found, hence we get NCDFE. > > testing: > - [x] `grep ' ClassFileInstaller[^.]` > - [ ] tier1-3 > > Thanks, > -- Igor This pull request has now been integrated. 
Changeset: a7aba2b6 Author: Igor Ignatyev URL: https://git.openjdk.java.net/jdk/commit/a7aba2b6 Stats: 1738 lines in 867 files changed: 2 ins; 67 del; 1669 mod 8263549: 8263412 can cause jtreg testlibrary split Reviewed-by: iklam, dcubed ------------- PR: https://git.openjdk.java.net/jdk/pull/2985 From kbarrett at openjdk.java.net Sun Mar 14 02:42:08 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sun, 14 Mar 2021 02:42:08 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 10:41:44 GMT, Man Cao wrote: > Hi all, > > Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. > > The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. > > -Man Changes requested by kbarrett (Reviewer). src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 70: > 68: assert(Atomic::load(&_tail) != result, "invariant"); > 69: assert(get_next(*result) != NULL, "invariant"); > 70: *next_ptr(*result) = NULL; This should be set_next / Atomic::store. src/hotspot/share/utilities/lockFreeQueue.hpp line 113: > 111: // Return the entry following value in the list used by the > 112: // specialized LockFreeQueue class. > 113: static T* get_next(const T& value) { I think this function should not be public; it's needed internal to the implementation of this class, but if a client needs access to the next list entry it should be getting it via a member on T, assuming T provides such. And if it doesn't, well, you probably aren't supposed to be doing that. I see that LockFreeStack has public next and set_next; by that argument they should be private too. 
(I think the only reason they can't currently be private is because of unit tests, which could be fixed.) src/hotspot/share/utilities/lockFreeQueue.hpp line 46: > 44: // > 45: // \tparam rcu_pop true if use GlobalCounter critical section in pop(). > 46: template I think this is the wrong place for the rcu parameterization. Among other things, it violates the SCARY principle for template design, making the entire class dependent on this parameter that is only relevant to the one operation. I think it would be better if the parameterization was on the pop operation. src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 58: > 56: // CS could lead to excessive allocation of objects, because the CS > 57: // may block return of released objects to a free list for reuse. > 58: LockFreeQueueCriticalSection cs(current_thread); The comment about excessive allocation is closely tied to the use in G1DirtyCardQueueSet. The purpose of a critical section here needs further description and generalization. I'm wondering whether it's actually important (maybe it is, just not sure and haven't thought about it for a while), but I'm also thinking LockFreeQueue/Stack ought to be consistent about this. That would suggest a common utility for optional critical sections. src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 52: > 50: template > 51: T* LockFreeQueue::pop() { > 52: Thread* current_thread = Thread::current(); The only use of current_thread is as the argument to the critical section object, where it might not be used, depending on the value of rcu_pop. It would be better to make this also conditional. src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 30: > 28: #include "runtime/atomic.hpp" > 29: #include "runtime/thread.inline.hpp" > 30: #include "utilities/globalCounter.inline.hpp" Presumably this include is the reason for putting the `pop` support in an inline.hpp file.
But it seems clumsy to have most of the implementation in the ordinary header and this one function here, esp. since I suspect most clients will end up needing to include this file. So I'm suggesting move all of the implementation here. src/hotspot/share/utilities/lockFreeQueue.hpp line 50: > 48: NONCOPYABLE(LockFreeQueue); > 49: > 50: protected: Protected members (to be accessible from a derived class) are inconsistent with a public non-virtual destructor (that may allow destructor slicing). I dislike classes that try to be both concrete implementation classes and base classes; they are hard to design well (and this class wasn't intended to be such). This was done to allow G1DirtyCardQueueSet to extend it with the `take_all` function; that seems like a useful operation in the generic form, even if it can't be made thread-safe. (The G1 function asserts_at_safepoint(), but that's not really appropriate for a generic form.) src/hotspot/share/gc/g1/g1DirtyCardQueue.hpp line 34: > 32: #include "gc/shared/ptrQueue.hpp" > 33: #include "memory/allocation.hpp" > 34: #include "memory/padded.hpp" There are other uses of padded in this file besides those in the Queue that are being moved to lockFreeQueue.hpp. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From kbarrett at openjdk.java.net Sun Mar 14 02:42:09 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sun, 14 Mar 2021 02:42:09 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation In-Reply-To: <9IfAIpqnmfePZeYWjSTS1q_tbYLLZmWFAYW5yHh5YNE=.17034449-5017-42af-95f3-48886b98fa6e@github.com> References: <9IfAIpqnmfePZeYWjSTS1q_tbYLLZmWFAYW5yHh5YNE=.17034449-5017-42af-95f3-48886b98fa6e@github.com> Message-ID: On Sat, 13 Mar 2021 10:59:16 GMT, Man Cao wrote: >> Hi all, >> >> Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. 
>> >> The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. >> >> -Man > > src/hotspot/share/utilities/lockFreeQueue.hpp line 99: > >> 97: } else { >> 98: assert(get_next(*old_tail) == NULL, "invariant"); >> 99: Atomic::store(next_ptr(*old_tail), &first); > > I changed this store from a normal store to an Atomic store. Otherwise there is a data race between this store and the load of _head->_next in pop(). As David said, Atomic::store doesn't indicate any ordering; it's a relaxed atomic store. The old code was `old_tail->set_next(&first)`, which was hard-wired to the element type being BufferNode (which was okay in its place). But the BufferNode code predates consistent use of Atomic when accessing atomic/volatile data, so don't currently use Atomic::load/store. In this generic hoist we no longer want to assume next and set_next functions, just the next_ptr function that returns a pointer to atomic/volatile. I see that you added get_next, which uses Atomic::load; I think there should be an associated set_next, as there are other places that need it. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From kbarrett at openjdk.java.net Sun Mar 14 02:42:09 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sun, 14 Mar 2021 02:42:09 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation In-Reply-To: References: Message-ID: On Sun, 14 Mar 2021 00:55:02 GMT, Kim Barrett wrote: >> Hi all, >> >> Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. >> >> The shared LockFreeQueue is similar to the existing LockFreeStack. 
The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. >> >> -Man > > src/hotspot/share/utilities/lockFreeQueue.hpp line 113: > >> 111: // Return the entry following value in the list used by the >> 112: // specialized LockFreeQueue class. >> 113: static T* get_next(const T& value) { > > I think this function should not be public; it's needed internal to the implementation of this class, but if a client needs access to the next list entry it should be getting it via a member on T, assuming T provides such. And if it doesn't, well, you probably aren't supposed to be doing that. I see that LockFreeStack has public next and set_next; by that argument they should be private too. (I think the only reason they can't currently be private is because of unit tests, which could be fixed.) As mentioned elsewhere, I think there should be an associated set_next. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From kbarrett at openjdk.java.net Sun Mar 14 02:42:10 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sun, 14 Mar 2021 02:42:10 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation In-Reply-To: References: Message-ID: On Sun, 14 Mar 2021 00:55:33 GMT, Kim Barrett wrote: >> src/hotspot/share/utilities/lockFreeQueue.hpp line 113: >> >>> 111: // Return the entry following value in the list used by the >>> 112: // specialized LockFreeQueue class. >>> 113: static T* get_next(const T& value) { >> >> I think this function should not be public; it's needed internal to the implementation of this class, but if a client needs access to the next list entry it should be getting it via a member on T, assuming T provides such. And if it doesn't, well, you probably aren't supposed to be doing that. I see that LockFreeStack has public next and set_next; by that argument they should be private too. 
(I think the only reason they can't currently be private is because of unit tests, which could be fixed.) > > As mentioned elsewhere, I think there should be an associated set_next. Normal naming convention is `next` without `get_` prefix. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From kbarrett at openjdk.java.net Sun Mar 14 02:53:05 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Sun, 14 Mar 2021 02:53:05 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 10:41:44 GMT, Man Cao wrote: > Hi all, > > Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. > > The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. > > -Man Tests? There should be some gtests to go with this. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From iklam at openjdk.java.net Sun Mar 14 05:02:09 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Sun, 14 Mar 2021 05:02:09 GMT Subject: RFR: 8263555: use driver-mode to run ClassFileInstaller In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 20:12:32 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this small and trivial clean-up patch that replaces `@run main j.t.l.h.ClassFileInstaller` w/ `@run driver j.t.l.h.ClassFileInstaller`? > > from JBS: >> there is no point in running ClassFileInstaller class w/ external flags, so it should be run w/ `@run driver`. > > Thanks, > -- Igor Looks good and trivial. ------------- Marked as reviewed by iklam (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2989 From iignatyev at openjdk.java.net Sun Mar 14 05:25:08 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Sun, 14 Mar 2021 05:25:08 GMT Subject: RFR: 8263555: use driver-mode to run ClassFileInstaller In-Reply-To: References: Message-ID: On Sun, 14 Mar 2021 04:58:56 GMT, Ioi Lam wrote: >> Hi all, >> >> could you please review this small and trivial clean-up patch that replaces `@run main j.t.l.h.ClassFileInstaller` w/ `@run driver j.t.l.h.ClassFileInstaller`? >> >> from JBS: >>> there is no point in running ClassFileInstaller class w/ external flags, so it should be run w/ `@run driver`. >> >> Thanks, >> -- Igor > > Looks good and trivial. thanks, Ioi! ------------- PR: https://git.openjdk.java.net/jdk/pull/2989 From iignatyev at openjdk.java.net Sun Mar 14 05:25:09 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Sun, 14 Mar 2021 05:25:09 GMT Subject: Integrated: 8263555: use driver-mode to run ClassFileInstaller In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 20:12:32 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this small and trivial clean-up patch that replaces `@run main j.t.l.h.ClassFileInstaller` w/ `@run driver j.t.l.h.ClassFileInstaller`? > > from JBS: >> there is no point in running ClassFileInstaller class w/ external flags, so it should be run w/ `@run driver`. > > Thanks, > -- Igor This pull request has now been integrated. 
Changeset: 9c84899d Author: Igor Ignatyev URL: https://git.openjdk.java.net/jdk/commit/9c84899d Stats: 6 lines in 5 files changed: 0 ins; 0 del; 6 mod 8263555: use driver-mode to run ClassFileInstaller Reviewed-by: iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/2989 From github.com+71302734+amitdpawar at openjdk.java.net Sun Mar 14 13:27:26 2021 From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar) Date: Sun, 14 Mar 2021 13:27:26 GMT Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2] In-Reply-To: References: Message-ID: > In the case of ParallelGC, oldgen expansion can happen during promotion. The expanding thread will touch the pages and can't request task execution, as this GC thread is already executing a task. The expanding thread holds the lock on "ExpandHeap_lock" to resize the oldgen, and other threads may wait for their turn. This is a blocking call. > > This patch changes this behavior by adding another constructor to the "MutexLocker" class to enable a non-blocking, or try_lock, operation. This way one thread will acquire the lock and the other threads can join the pretouch work. Threads that fail to acquire the lock will join the pretouch only when the task is marked ready by the expanding thread. > > The following minimum expansion sizes are seen during expansion. > 1. 512KB without largepages and without UseNUMA. > 2. 64MB without largepages and with UseNUMA, > 3. 2MB (on x86) with large pages and without UseNUMA, > 4. 64MB without large pages and with UseNUMA. > > When the oldgen is expanding repeatedly with a smaller size, this change won't help. For such cases, the resize size should adapt to application demand to make use of this change. For example, if the nature of the application triggers 100 expansions with smaller sizes in the same GC, then it is better to increase the expansion size during each resize to reduce the number of resizes. If this patch is accepted, I plan to fix this case in another patch. > > All jtreg tests passed.
> > Please review this change. Amit Pawar has updated the pull request incrementally with one additional commit since the last revision: Fixed build issues for some targets and updated with suggested changes. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2976/files - new: https://git.openjdk.java.net/jdk/pull/2976/files/831eab9b..01e5077e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2976&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2976&range=00-01 Stats: 22 lines in 5 files changed: 2 ins; 17 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/2976.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2976/head:pull/2976 PR: https://git.openjdk.java.net/jdk/pull/2976 From github.com+71302734+amitdpawar at openjdk.java.net Sun Mar 14 13:31:07 2021 From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar) Date: Sun, 14 Mar 2021 13:31:07 GMT Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion In-Reply-To: References: Message-ID: On Fri, 12 Mar 2021 19:56:23 GMT, Amit Pawar wrote: > In case of ParallelGC, oldgen expansion can happen during promotion. Expanding thread will touch the pages and can't request for task execution as this GC thread is already executing a task. The expanding thread holds the lock on "ExpandHeap_lock" to resize the oldgen and other threads may wait for their turn. This is a blocking call. > > This patch changes this behavior by adding another constructor in "MutexLocker" class to enable non blocking or try_lock operation. This way one thread will acquire the lock and other threads can join pretouch work. Threads failed to acquire the lock will join pretouch only when task is marked ready by expanding thread. > > Following minimum expansion size are seen during expansion. > 1. 512KB without largepages and without UseNUMA. > 2. 64MB without largepages and with UseNUMA, > 3. 2MB (on x86) with large pages and without UseNUMA, > 4. 
64MB without large pages and with UseNUMA. > > When Oldgen is expanding repeatedly with smaller size then this change wont help. For such cases, resize size should adapt to application demand to make use of this change. For example if application nature triggers 100 expansion with smaller sizes in same GC then it is better to increase the expansion size during each resize to reduce the number of resizes. If this patch is accepted then will plan to fix this case in another patch. > > Jtreg all test passed. > > Please review this change. Thank you for your feedback. I have updated as per your suggestion and fixed build issues on some targets. Please check. ------------- PR: https://git.openjdk.java.net/jdk/pull/2976 From dholmes at openjdk.java.net Sun Mar 14 21:36:08 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Sun, 14 Mar 2021 21:36:08 GMT Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2] In-Reply-To: References: Message-ID: On Sun, 14 Mar 2021 13:27:26 GMT, Amit Pawar wrote: >> In case of ParallelGC, oldgen expansion can happen during promotion. Expanding thread will touch the pages and can't request for task execution as this GC thread is already executing a task. The expanding thread holds the lock on "ExpandHeap_lock" to resize the oldgen and other threads may wait for their turn. This is a blocking call. >> >> This patch changes this behavior by adding another constructor in "MutexLocker" class to enable non blocking or try_lock operation. This way one thread will acquire the lock and other threads can join pretouch work. Threads failed to acquire the lock will join pretouch only when task is marked ready by expanding thread. >> >> Following minimum expansion size are seen during expansion. >> 1. 512KB without largepages and without UseNUMA. >> 2. 64MB without largepages and with UseNUMA, >> 3. 2MB (on x86) with large pages and without UseNUMA, >> 4. 64MB without large pages and with UseNUMA. 
>> >> When Oldgen is expanding repeatedly with smaller size then this change wont help. For such cases, resize size should adapt to application demand to make use of this change. For example if application nature triggers 100 expansion with smaller sizes in same GC then it is better to increase the expansion size during each resize to reduce the number of resizes. If this patch is accepted then will plan to fix this case in another patch. >> >> Jtreg all test passed. >> >> Please review this change. > > Amit Pawar has updated the pull request incrementally with one additional commit since the last revision: > > Fixed build issues for some targets and updated with suggested changes. src/hotspot/share/gc/shared/pretouchTask.cpp line 66: > 64: > 65: // Following atomic loads are required to make other processor store > 66: // visible to all threads from this points. That is not what an Atomic::load does; they only protect against word-tearing (which is only a theoretical possibility on some platforms). If you want to guarantee those loads see the most recent store then the fence needs to come first. src/hotspot/share/gc/shared/pretouchTask.cpp line 94: > 92: > 93: if (thread_num == 0) { > 94: Atomic::release_store(&_cur_addr, _end_addr); The release is unnecessary given the Atomic::sub. src/hotspot/share/gc/shared/pretouchTask.hpp line 47: > 45: > 46: void set_task_status(TaskStatus status) { Atomic::release_store(&_task_status, (size_t)status); } > 47: void set_task_done() { Atomic::release_store(&_task_status, (size_t)Done); } There is a convention that when a setter embodies a release_store that release is included in the name e.g. release_set_task_done(). 
src/hotspot/share/gc/shared/pretouchTask.hpp line 51: > 49: void set_task_notready() { set_task_status(NotReady); } > 50: bool is_task_ready() { return Atomic::load(&_task_status) == Ready; } > 51: bool is_task_done() { return Atomic::load(&_task_status) == Done; } Given these fields are set with a release_store I would expect to see them read with a load_acquire to match with it. And named eg. is_task_done_acquire(). ------------- PR: https://git.openjdk.java.net/jdk/pull/2976 From dholmes at openjdk.java.net Sun Mar 14 21:42:08 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Sun, 14 Mar 2021 21:42:08 GMT Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2] In-Reply-To: References: Message-ID: On Sun, 14 Mar 2021 13:27:26 GMT, Amit Pawar wrote: >> In case of ParallelGC, oldgen expansion can happen during promotion. Expanding thread will touch the pages and can't request for task execution as this GC thread is already executing a task. The expanding thread holds the lock on "ExpandHeap_lock" to resize the oldgen and other threads may wait for their turn. This is a blocking call. >> >> This patch changes this behavior by adding another constructor in "MutexLocker" class to enable non blocking or try_lock operation. This way one thread will acquire the lock and other threads can join pretouch work. Threads failed to acquire the lock will join pretouch only when task is marked ready by expanding thread. >> >> Following minimum expansion size are seen during expansion. >> 1. 512KB without largepages and without UseNUMA. >> 2. 64MB without largepages and with UseNUMA, >> 3. 2MB (on x86) with large pages and without UseNUMA, >> 4. 64MB without large pages and with UseNUMA. >> >> When Oldgen is expanding repeatedly with smaller size then this change wont help. For such cases, resize size should adapt to application demand to make use of this change. 
For example if application nature triggers 100 expansion with smaller sizes in same GC then it is better to increase the expansion size during each resize to reduce the number of resizes. If this patch is accepted then will plan to fix this case in another patch. >> >> Jtreg all test passed. >> >> Please review this change. > > Amit Pawar has updated the pull request incrementally with one additional commit since the last revision: > > Fixed build issues for some targets and updated with suggested changes. src/hotspot/share/gc/shared/pretouchTask.cpp line 96: > 94: Atomic::release_store(&_cur_addr, _end_addr); > 95: OrderAccess::storestore(); > 96: set_task_done(); The storestore barrier is not needed as the set_task_done() is a release_store which already has a storestore barrier. src/hotspot/share/gc/shared/pretouchTask.cpp line 56: > 54: void PretouchTask::reinitialize(char* start_addr, char* end_addr) { > 55: Atomic::release_store(&_cur_addr, start_addr); > 56: Atomic::release_store(&_end_addr, end_addr); Back-to-back release-stores have redundant loadstore semantics. You could just use a storestore() barrier after the first release_store, then use plain Atomic::store for the second field. src/hotspot/share/gc/parallel/mutableSpace.cpp line 149: > 147: // waiting to expand old-gen will join from PSOldGen::expand_for_allocate > 148: // function for pretouch work. > 149: pretouch_task->set_task_ready(); The storestore is not needed as it is included in the release_store of set_task_ready(). 
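For readers following the barrier comments above, here is a minimal sketch of the release/acquire pairing being discussed. It assumes standard C++ `std::atomic` in place of HotSpot's `Atomic` class, and `PretouchState`, `run_demo` and the member names are hypothetical, not taken from the patch. The release store publishes the earlier plain write, so no extra storestore barrier is needed before it, and the matching acquire load is what lets the reader see that write.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>

// Hypothetical status values, loosely modelled on the ones named in the review.
enum TaskStatus : std::size_t { NotReady, Ready, Done };

// Hypothetical stand-in for the task state; not the class from the patch.
class PretouchState {
  char* _cur_addr = nullptr;                      // plain data guarded by the flag
  std::atomic<std::size_t> _task_status{NotReady};

public:
  // The "release_" prefix in the name signals the release store inside.
  void release_set_task_ready(char* start) {
    _cur_addr = start;                            // ordinary write
    // Release store: the write to _cur_addr cannot be reordered after it.
    _task_status.store(Ready, std::memory_order_release);
  }

  // The "_acquire" suffix signals the acquire load that pairs with the store.
  bool is_task_ready_acquire() const {
    return _task_status.load(std::memory_order_acquire) == Ready;
  }

  // Only safe to call after is_task_ready_acquire() has returned true.
  char* cur_addr() const { return _cur_addr; }
};

// One helper thread waits for the flag, then reads the published address.
char* run_demo(char* start) {
  PretouchState state;
  char* seen = nullptr;
  std::thread helper([&] {
    while (!state.is_task_ready_acquire()) {
      std::this_thread::yield();
    }
    seen = state.cur_addr();
  });
  state.release_set_task_ready(start);
  helper.join();                                  // join makes 'seen' visible here
  return seen;
}
```

With a relaxed load in `is_task_ready_acquire()` the helper would have no guarantee of seeing the write to `_cur_addr`, which is why the review asks for the load side to match the release store.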
------------- PR: https://git.openjdk.java.net/jdk/pull/2976 From zgu at openjdk.java.net Mon Mar 15 00:29:08 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 15 Mar 2021 00:29:08 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v3] In-Reply-To: References: Message-ID: <4qDJ9Cdv-DWx84RIVLL-CYL7peOUye65AY8HUXFeTbo=.a6113f50-5362-475a-92bf-fd754b3ec066@github.com> On Fri, 12 Mar 2021 14:06:27 GMT, Roman Kennke wrote: >> We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs. >> >> I believe this might be the root cause for JDK-8262852. >> >> The way out of it is to change the conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress. >> >> There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. But conc-weakroots-in-progress is turned off concurrently. I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change. >> >> This stuff makes the verifier unhappy, because it doesn't know about the new bit. And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?)
>> >> Testing: >> - [x] New testcase failed without change, passes now >> - [x] hotspot_gc_shenandoah >> - [ ] tier1 (+Shenandoah) > > Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: > > - Correct order of rendezvous, global- and local-flag updates; cleanup rendezvous > - Verify correct weakroots-in-progress state (by Aleksey) src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1737: > 1735: // We need a rendezvous here to avoid the following race: > 1736: // 1. Java thread reads referent, sees non-null but unreachable oop > 1737: // 2. GC thread clears the referent How is this possible? GC threads (workers) are not running when rendezvous roots. What do I miss? ------------- PR: https://git.openjdk.java.net/jdk/pull/2945 From kim.barrett at oracle.com Mon Mar 15 00:35:20 2021 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 15 Mar 2021 00:35:20 +0000 Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion In-Reply-To: References: Message-ID: > On Mar 12, 2021, at 3:01 PM, Amit Pawar wrote: > > In case of ParallelGC, oldgen expansion can happen during promotion. Expanding thread will touch the pages and can't request for task execution as this GC thread is already executing a task. The expanding thread holds the lock on "ExpandHeap_lock" to resize the oldgen and other threads may wait for their turn. This is a blocking call. > > This patch changes this behavior by adding another constructor in "MutexLocker" class to enable non blocking or try_lock operation. This way one thread will acquire the lock and other threads can join pretouch work. Threads failed to acquire the lock will join pretouch only when task is marked ready by expanding thread. > > Following minimum expansion size are seen during expansion. > 1. 512KB without largepages and without UseNUMA. > 2. 64MB without largepages and with UseNUMA, > 3. 2MB (on x86) with large pages and without UseNUMA, > 4. 
64MB without large pages and with UseNUMA. > > When Oldgen is expanding repeatedly with smaller size then this change wont help. For such cases, resize size should adapt to application demand to make use of this change. For example if application nature triggers 100 expansion with smaller sizes in same GC then it is better to increase the expansion size during each resize to reduce the number of resizes. If this patch is accepted then will plan to fix this case in another patch. Sorry, but a change like this needs better motivation. What you say above suggests this change doesn't actually help. It's intentional that oldgen expansions aren't generally large, as the oldgen shouldn't be grown unnecessarily. There are already parameters such as MinHeapDeltaBytes to control and manipulate this. It is also preferable to complete an expansion request quickly to make the additional space available to other threads in the main allocation path, rather than making them go to the expand path. Making expansions larger could force more threads to take the slower expand path, which doesn't seem like a win even if they then help with the pretouch part of another thread's expansion. (And that also assumes UsePreTouch is even enabled.) So the followup change that you say is needed to make this one profitable seems questionable. The proposed change is also surprisingly large and intrusive for something that seems like it should be very localized. > Jtreg all test passed. A change like this needs a lot more testing than that, both functionally and performance. From kbarrett at openjdk.java.net Mon Mar 15 02:16:09 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 15 Mar 2021 02:16:09 GMT Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2] In-Reply-To: References: Message-ID: On Sun, 14 Mar 2021 13:27:26 GMT, Amit Pawar wrote: >> In case of ParallelGC, oldgen expansion can happen during promotion. 
Expanding thread will touch the pages and can't request for task execution as this GC thread is already executing a task. The expanding thread holds the lock on "ExpandHeap_lock" to resize the oldgen and other threads may wait for their turn. This is a blocking call. >> >> This patch changes this behavior by adding another constructor in "MutexLocker" class to enable non blocking or try_lock operation. This way one thread will acquire the lock and other threads can join pretouch work. Threads failed to acquire the lock will join pretouch only when task is marked ready by expanding thread. >> >> Following minimum expansion size are seen during expansion. >> 1. 512KB without largepages and without UseNUMA. >> 2. 64MB without largepages and with UseNUMA, >> 3. 2MB (on x86) with large pages and without UseNUMA, >> 4. 64MB without large pages and with UseNUMA. >> >> When Oldgen is expanding repeatedly with smaller size then this change wont help. For such cases, resize size should adapt to application demand to make use of this change. For example if application nature triggers 100 expansion with smaller sizes in same GC then it is better to increase the expansion size during each resize to reduce the number of resizes. If this patch is accepted then will plan to fix this case in another patch. >> >> Jtreg all test passed. >> >> Please review this change. > > Amit Pawar has updated the pull request incrementally with one additional commit since the last revision: > > Fixed build issues for some targets and updated with suggested changes. Changes requested by kbarrett (Reviewer). src/hotspot/share/gc/parallel/psOldGen.cpp line 226: > 224: while (!pretouch()->is_task_done()) { > 225: if (pretouch()->is_task_ready()) { > 226: pretouch()->work(Thread::current()->osthread()->thread_id()); worker thread_id and os thread_id are entirely different things. This is another indication that reusing (abusing) PretouchTask in this way is a mistake. 
src/hotspot/share/gc/parallel/psOldGen.hpp line 49: > 47: PSGenerationCounters* _gen_counters; > 48: SpaceCounters* _space_counters; > 49: PretouchTask* _pretouch; // Used when old gen resized during scavenging. I think abusing PretouchTask in this way, completely outside the workgang protocol, is confusing and shouldn't be done. There might be some code that could be shared between this use and PretouchTask, but if so then it should be factored out for such sharing, rather than mangling PretouchTask in the way being proposed. src/hotspot/share/gc/shared/pretouchTask.cpp line 68: > 66: // visible to all threads from this points. > 67: char *cur_addr = Atomic::load(&_cur_addr); > 68: char *end_addr = Atomic::load(&_end_addr); end needs to be read before cur. Otherwise, cur could be from one pretouch instance and end could be from a later, unrelated, pretouch instance, leading to scribbling. ------------- PR: https://git.openjdk.java.net/jdk/pull/2976 From kbarrett at openjdk.java.net Mon Mar 15 02:16:10 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 15 Mar 2021 02:16:10 GMT Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2] In-Reply-To: References: Message-ID: On Sun, 14 Mar 2021 21:29:59 GMT, David Holmes wrote: >> Amit Pawar has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed build issues for some targets and updated with suggested changes. > > src/hotspot/share/gc/shared/pretouchTask.cpp line 94: > >> 92: >> 93: if (thread_num == 0) { >> 94: Atomic::release_store(&_cur_addr, _end_addr); > > The release is unnecessary given the Atomic::sub. Besides what David said, I think this is also confusing. I think it's to account for the claim (using fetch_and_add) unconditionally adding the chunk size to cur_addr, which could lead to overshoot. I think it would be clearer to prevent the overshoot from occurring. 
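As a rough illustration of what "prevent the overshoot from occurring" could look like, the sketch below claims chunks with a compare-and-swap loop that clamps the final chunk to the end of the range, instead of adding the chunk size unconditionally. It is written with standard C++ atomics and plain integer offsets; claim_chunk and its parameters are hypothetical names chosen for this example and are not taken from the patch or from PretouchTask.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical helper: tries to claim up to 'chunk' units from [cur, end).
// Returns the start of the claimed range and stores its size in *claimed.
// When no work is left it returns 'end' and sets *claimed to 0.
// 'cur' never moves past 'end', so no fix-up store is needed afterwards.
inline std::size_t claim_chunk(std::atomic<std::size_t>& cur,
                               std::size_t end,
                               std::size_t chunk,
                               std::size_t* claimed) {
  std::size_t old_cur = cur.load(std::memory_order_relaxed);
  while (old_cur < end) {
    // Clamp the last chunk so the cursor stops exactly at 'end'.
    std::size_t new_cur = old_cur + std::min(chunk, end - old_cur);
    // On failure compare_exchange_weak reloads old_cur and we retry.
    if (cur.compare_exchange_weak(old_cur, new_cur,
                                  std::memory_order_relaxed)) {
      *claimed = new_cur - old_cur;
      return old_cur;
    }
  }
  *claimed = 0;
  return end;
}
```

Relaxed ordering is used here only because the sketch assumes the range itself was published earlier by some other synchronization; whether that holds for the real code is exactly what the review comments above are probing.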
------------- PR: https://git.openjdk.java.net/jdk/pull/2976 From iignatyev at openjdk.java.net Mon Mar 15 05:03:20 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Mon, 15 Mar 2021 05:03:20 GMT Subject: RFR: 8263556: remove `@modules java.base` from tests Message-ID: Hi all, could you please review this trivial cleanup? from JBS: > jtreg `@modules X` directive does two things: > - exclude a test from execution if JDK under test doesn't have module X > - if JDK under test has module X, make sure it's resolved > > both these things make no sense for the `java.base` module as it's always available and is always resolved. Thanks, -- Igor ------------- Commit messages: - update copyright year - 8263556: remove `@modules java.base` from tests Changes: https://git.openjdk.java.net/jdk/pull/2990/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2990&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263556 Stats: 21 lines in 13 files changed: 0 ins; 13 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/2990.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2990/head:pull/2990 PR: https://git.openjdk.java.net/jdk/pull/2990 From dholmes at openjdk.java.net Mon Mar 15 06:41:22 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 15 Mar 2021 06:41:22 GMT Subject: RFR: 8261916: gtest/GTestWrapper.java vmErrorTest.unimplemented1_vm_assert failed Message-ID: <4ZvPchEPt95dq3xyNPVFmfRRyPF3jVfbwY0HyfOJIFI=.0cbe5444-0b5e-438a-86ec-c4f3006b3676@github.com> Please see the bug report for gory details. For the specific issue here of the vm_assert gtests I propose to make two changes to the VM: 1. When core dumps are disabled, os::abort should call ::_exit not ::exit, as the former more closely models the abrupt termination of ::abort() but without the core dump. 2. The race condition when SuppressFatalErrorMessage is true is fixed by placing the check after the atomic set/check of the thread-id.
That way only a single thread can trigger the fatal error processing. I was debating whether to make a slight change so that even when SuppressFatalErrorMessage is true, secondary failures will report that such an error occurred but not with any details. But I've left the existing silence for now. It is possible someone uses the flag to hide a message I would like to expose. I suppose adding additional output in debug builds only may be an option - options welcomed. Testing: - fully manual I manually set up the conditions where a background thread could crash due to the atexit actions executing. I added special debug code to show what was happening in such cases, and that secondary errors were occurring. I then applied the fix for #2 and saw the second thread getting caught; then I applied fix #1 and the secondary crashes were gone. Also did tier 1-3 testing and local gtest testing just to sanity check things. Thanks, David ------------- Commit messages: - 8261916: gtest/GTestWrapper.java vmErrorTest.unimplemented1_vm_assert failed Changes: https://git.openjdk.java.net/jdk/pull/3002/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3002&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8261916 Stats: 31 lines in 2 files changed: 20 ins; 3 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/3002.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3002/head:pull/3002 PR: https://git.openjdk.java.net/jdk/pull/3002 From jbachorik at openjdk.java.net Mon Mar 15 09:25:12 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 15 Mar 2021 09:25:12 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Mon, 15 Mar 2021 09:18:52 GMT, Jaroslav Bachorik wrote: >> src/hotspot/share/gc/shared/space.inline.hpp line 140: >> >>> 138: size_t get_dead_space() { >>> 139: return 
(_max_deadspace_words - _allowed_deadspace_words) * HeapWordSize; >>> 140: } >> >> Hotspot does not use a "get_" prefix for getters. Also not sure why this needs to be private (and the friend class), I would prefer this instead of the friending. Retrieving the actual amount of dead space from a class that calculates it does not seem something that needs hiding. > > The visibility for `_dead_space` was changed based on this comment: https://github.com/openjdk/jdk/pull/2579/files/f69541864e093bc5b250bf625ec75983764ba2bf#r585771280 by @shipilev Also, I see a bunch of methods named `get_*` in GC code alone. I have no problem renaming it to e.g. `dead_space()`, but it seems that this naming pattern is in fact used in hotspot. ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Mon Mar 15 09:25:12 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 15 Mar 2021 09:25:12 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Thu, 11 Mar 2021 14:44:10 GMT, Thomas Schatzl wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove unused field > > src/hotspot/share/gc/shared/space.inline.hpp line 140: > >> 138: size_t get_dead_space() { >> 139: return (_max_deadspace_words - _allowed_deadspace_words) * HeapWordSize; >> 140: } > > Hotspot does not use a "get_" prefix for getters. Also not sure why this needs to be private (and the friend class), I would prefer this instead of the friending. Retrieving the actual amount of dead space from a class that calculates it does not seem something that needs hiding.
The visibility for `_dead_space` was changed based on this comment: https://github.com/openjdk/jdk/pull/2579/files/f69541864e093bc5b250bf625ec75983764ba2bf#r585771280 by @shipilev ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From rkennke at openjdk.java.net Mon Mar 15 09:30:12 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 15 Mar 2021 09:30:12 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v3] In-Reply-To: <4qDJ9Cdv-DWx84RIVLL-CYL7peOUye65AY8HUXFeTbo=.a6113f50-5362-475a-92bf-fd754b3ec066@github.com> References: <4qDJ9Cdv-DWx84RIVLL-CYL7peOUye65AY8HUXFeTbo=.a6113f50-5362-475a-92bf-fd754b3ec066@github.com> Message-ID: On Mon, 15 Mar 2021 00:25:54 GMT, Zhengyu Gu wrote: >> Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: >> >> - Correct order of rendezvous, global- and local-flag updates; cleanup rendezvous >> - Verify correct weakroots-in-progress state (by Aleksey) > > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1737: > >> 1735: // We need a rendezvous here to avoid the following race: >> 1736: // 1. Java thread reads referent, sees non-null but unreachable oop >> 1737: // 2. GC thread clears the referent > > How is this possible? GC threads (workers) are not running when rendezvous roots. What do I miss? But they are running *before* rendezvous roots. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2945 From jbachorik at openjdk.java.net Mon Mar 15 09:31:10 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 15 Mar 2021 09:31:10 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Thu, 11 Mar 2021 15:42:51 GMT, Thomas Schatzl wrote: >> I am leaving this as "request changes" for now as the question I had earlier about that after G1 Full gc the value of `_live_estimate` still seems unanswered and there does not seem to be code in this change for this. Is this intentional? (Not even setting the live bytes to `used()` which at that point would be a good estimate) >> >> There is another PR (#2760) that implements something like that although I haven't looked at it in detail. >> >> Otherwise looks okay. > > Started reviewing PR #2760, and it implements liveness calculation for G1 full gc. I also suggested [there](https://github.com/openjdk/jdk/pull/2760#discussion_r592449837) to extract this functionality out into an extra CR. Maybe you can work together. @tschatzl @Hamlin-Li Would it be ok to set the live estimate to the `used()` value at the end of `G1FullCollector::phase4_do_compaction()` method to have something suboptimal but working and refine in https://github.com/openjdk/jdk/pull/2760 (or a subsequent ticket/PR once both parts are ready)? ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From rehn at openjdk.java.net Mon Mar 15 11:53:20 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Mon, 15 Mar 2021 11:53:20 GMT Subject: RFR: 8261262: Kitchensink24HStress.java crashed with EXCEPTION_ACCESS_VIOLATION Message-ID: When returning from the last Java frame back to vm and hitting a safepoint poll on that last return we sometimes have a last java frame but no vframe. 
This seems to be a bug in itself, handled in: 8263576 Other places that use the vframe NULL-check it first, so let's do that in GetCurrentLocationClosure also. Testing: nsk jdi/jvmti, jdk jdi, jck vm and t1-3. ------------- Commit messages: - Check vframe non-null Changes: https://git.openjdk.java.net/jdk/pull/3010/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3010&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8261262 Stats: 9 lines in 1 file changed: 0 ins; 3 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/3010.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3010/head:pull/3010 PR: https://git.openjdk.java.net/jdk/pull/3010 From sjohanss at openjdk.java.net Mon Mar 15 12:27:17 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Mon, 15 Mar 2021 12:27:17 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Mon, 15 Mar 2021 09:26:32 GMT, Jaroslav Bachorik wrote: >> Started reviewing PR #2760, and it implements liveness calculation for G1 full gc. I also suggested [there](https://github.com/openjdk/jdk/pull/2760#discussion_r592449837) to extract this functionality out into an extra CR. Maybe you can work together. > > @tschatzl @Hamlin-Li > Would it be ok to set the live estimate to the `used()` value at the end of `G1FullCollector::phase4_do_compaction()` method to have something suboptimal but working and refine in https://github.com/openjdk/jdk/pull/2760 (or a subsequent ticket/PR once both parts are ready)? Sorry for being a bit late to the party. Looking at the suggested implementation for G1 I see a problem with only updating this after concurrent mark (and the Full GC). Say for example you have a concurrent mark cycle before the heap has expanded a lot and you get a low value stored in `G1CollectedHeap::_live`.
Then the heap expands and your application gets to a steady state that doesn't require any more marking cycles. In this case the same value will be reported for the entire run. For this to work the _live value would have to be updated at every GC, but this is a bit costly. Maybe the first version could just use `used()` for G1. Have you done any tests to see how off this would be compared to the other GCs? ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From stefank at openjdk.java.net Mon Mar 15 12:43:26 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 15 Mar 2021 12:43:26 GMT Subject: RFR: 8263589: Introduce JavaValue::get_oop/set_oop Message-ID: JavaValue is a small wrapper class that wraps values used to pass arguments and results between native and Java. When JavaCalls::call returns an object, the value stored in the JavaValue is not a handleized jobject. Instead it's a raw oop. So most of the code handling the `result` fetches the result as a jobject, and then immediately casts it to an oop. For example: oop res = (oop)result.get_jobject(); I'd like to change this code to be: oop res = result.get_oop(); The motivations for this patch are: 1) Minimize the places where we pass around oops in jobject variables. Maybe at some point we'll have converted the JVM to only use the jobject type when passing around JNI handles. We need to be stricter with the types as we continue to develop our GCs and their barriers. 2) Limit the number of places in the code where we perform raw oop casts. We have a helper cast function for that, cast_to_oop, but not all code uses it. I have future patches where the compiler will completely forbid raw casts to oops (in fastdebug builds). With that in place, I can then add stricter oop verification code when oops are created. This helps catch bugs earlier. --- When reviewing this patch, take an extra look at the change to oopsHierarchy.hpp.
This was done to support jvmciEnv.cpp code: JVMCIObject wrap(oop obj)... JVMCIObjectArray wrap(objArrayOop obj)... JVMCIPrimitiveArray wrap(typeArrayOop obj) ... Previously, `wrap((oop)result.get_jobject())` called the first function. When the code was changed to `wrap(result.get_oop())`, where `get_oop()` returns an `oopDesc*`, the compiler didn't know which conversion in oopsHierarchy.hpp to use. Therefore, I replaced the overly permissive `void*` constructor with a constructor that only takes the corresponding `type##OopDesc*`. An alternative would be to let get_oop() return an oop, but that would add an unwanted dependency between globalDefinitions.hpp and oopsHierarchy.hpp. An earlier version of this patch did return an oop instead of oopDesc*, but it also moved the entire JavaValue class out of globalDefinitions.hpp into a new javaValue.hpp file, and had a corresponding javaValue.inline.hpp file. Even if we end up using the proposed `oopDesc* get_oop()` version, maybe moving the class to javaValues.hpp would still make sense? ------------- Commit messages: - Commit unstaged changes - 8263589: Introduce JavaValue::get_oop/set_oop Changes: https://git.openjdk.java.net/jdk/pull/3013/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3013&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263589 Stats: 72 lines in 26 files changed: 5 ins; 0 del; 67 mod Patch: https://git.openjdk.java.net/jdk/pull/3013.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3013/head:pull/3013 PR: https://git.openjdk.java.net/jdk/pull/3013 From fparain at openjdk.java.net Mon Mar 15 13:02:10 2021 From: fparain at openjdk.java.net (Frederic Parain) Date: Mon, 15 Mar 2021 13:02:10 GMT Subject: Integrated: 8263544: Unused argument in ConstantPoolCacheEntry::set_field() In-Reply-To: References: Message-ID: On Fri, 12 Mar 2021 21:23:35 GMT, Frederic Parain wrote: > Please review this trivial fix removing an unused argument from ConstantPoolCacheEntry::set_field().
> > Thank you, > > Fred This pull request has now been integrated. Changeset: 80cdf788 Author: Frederic Parain URL: https://git.openjdk.java.net/jdk/commit/80cdf788 Stats: 6 lines in 3 files changed: 0 ins; 3 del; 3 mod 8263544: Unused argument in ConstantPoolCacheEntry::set_field() Reviewed-by: coleenp, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/2978 From fparain at openjdk.java.net Mon Mar 15 13:02:08 2021 From: fparain at openjdk.java.net (Frederic Parain) Date: Mon, 15 Mar 2021 13:02:08 GMT Subject: RFR: 8263544: Unused argument in ConstantPoolCacheEntry::set_field() In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 06:03:45 GMT, David Holmes wrote: >> Please review this trivial fix removing an unused argument from ConstantPoolCacheEntry::set_field(). >> >> Thank you, >> >> Fred > > Looks good and trivial. > > Thanks, > David Coleen, David, Thank you for the reviews. Fred ------------- PR: https://git.openjdk.java.net/jdk/pull/2978 From dcubed at openjdk.java.net Mon Mar 15 13:33:09 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 15 Mar 2021 13:33:09 GMT Subject: RFR: 8263556: remove `@modules java.base` from tests In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 20:26:42 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this trivial cleanup? > from JBS: > >> jtreg `@modules X` directive does two things: >> - exclude a test from execution if JDK under test doesn't have module X >> - if JDK under test has module X, make sure it's resolved >> >> both these things have no sense for `java.base` module as it's always available and is always resolved. > > > Thanks, > -- Igor Thumbs up. I agree that this is a trivial change. ------------- Marked as reviewed by dcubed (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2990 From zgu at openjdk.java.net Mon Mar 15 14:08:09 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 15 Mar 2021 14:08:09 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v3] In-Reply-To: References: <4qDJ9Cdv-DWx84RIVLL-CYL7peOUye65AY8HUXFeTbo=.a6113f50-5362-475a-92bf-fd754b3ec066@github.com> Message-ID: <91kRqt6lWB9itbiGgBo0ZoHIbVjeumUFHoFOY_Pc2wY=.f4d5f185-f689-4cd8-a41c-85482aec676a@github.com> On Mon, 15 Mar 2021 09:27:01 GMT, Roman Kennke wrote: >> src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1737: >> >>> 1735: // We need a rendezvous here to avoid the following race: >>> 1736: // 1. Java thread reads referent, sees non-null but unreachable oop >>> 1737: // 2. GC thread clears the referent >> >> How is this possible? GC threads (workers) are not running when rendezvous roots. What do I miss? > > But they are running *before* rendezvous roots. Okay, looks good. ------------- PR: https://git.openjdk.java.net/jdk/pull/2945 From stefank at openjdk.java.net Mon Mar 15 14:40:23 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 15 Mar 2021 14:40:23 GMT Subject: RFR: 8263595: Remove oop type punning in JavaCallArguments Message-ID: JavaCallArguments has this code and comment: // Helper for push_oop and the like. The value argument is a // "handle" that refers to an oop. We record the address of the // handle rather than the designated oop. The handle is later // resolved to the oop by parameters(). This delays the exposure of // naked oops until it is GC-safe. template inline int push_oop_impl(T handle, int size) { // JNITypes::put_obj expects an oop value, so we play fast and // loose with the type system. The cast from handle type to oop // *must* use a C-style cast. In a product build it performs a // reinterpret_cast. 
In a debug build (more accurately, in a // CHECK_UNHANDLED_OOPS build) it performs a static_cast, invoking // the debug-only oop class's conversion from void* constructor. JNITypes::put_obj((oop)handle, _value, size); // Updates size. return size; // Return the updated size. } The type T is either an oop* or jobject (JNI handle). This puts something that isn't an oop inside an oop. I propose that we don't do this. Instead we could pass the handle (address containing the oop), and then in put_obj convert that address to an intptr_t, which matches well with the `to` argument of those functions. I've been running this (and some other changes) with ZGC on Linux x64 through tier1-tier7. ------------- Commit messages: - 8263595: Remove oop type punning in JavaCallArguments Changes: https://git.openjdk.java.net/jdk/pull/3014/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3014&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263595 Stats: 46 lines in 8 files changed: 2 ins; 26 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/3014.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3014/head:pull/3014 PR: https://git.openjdk.java.net/jdk/pull/3014 From jbachorik at openjdk.java.net Mon Mar 15 15:24:12 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 15 Mar 2021 15:24:12 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: <_1nv16MS2e-Ed1xHZyA8hewbCCCSCCkq0OafP4rRJq8=.75ee1e9d-17b3-46c3-87d1-c03ef2765cf3@github.com> On Mon, 15 Mar 2021 12:24:02 GMT, Stefan Johansson wrote: >> @tschatzl @Hamlin-Li >> Would it be ok to set the live estimate to the `used()` value at the end of `G1FullCollector::phase4_do_compaction()` method to have something suboptimal but working and refine in https://github.com/openjdk/jdk/pull/2760 (or a subsequent ticket/PR once both parts 
are ready)? > > Sorry for being a bit late to the party. Looking at the suggested implementation for G1 I see a problem with only updating this after concurrent mark (and the Full GC). Say for example you have a concurrent mark cycle before the heap has expanded a lot and you get a low value stored in `G1CollectedHeap::_live`. Then the heap expands and your application get to a steady state that doesn't require any more marking cycles. In this case the same value will be reported for the entire run. > > For this to work the _live value would have to be updated at every GC, but this is a bit costly. Maybe the first version could just use `used()` for G1. Have you done any tests to see how off this would be compared to the other GCs? @kstefanj > Then the heap expands and your application get to a steady state that doesn't require any more marking cycles. Is there a way to get the liveness info when the heap expands? If not that would mean we had no way to figure out the new live set size and would assume, conservatively, the last known value. As I mentioned in the PR description the live size value will be a 'best effort estimate' depending on what can each particular GC provide. ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From tschatzl at openjdk.java.net Mon Mar 15 15:41:11 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Mon, 15 Mar 2021 15:41:11 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Mon, 15 Mar 2021 09:21:50 GMT, Jaroslav Bachorik wrote: >> The visibility for `_dead_space` was changed based on this comment: https://github.com/openjdk/jdk/pull/2579/files/f69541864e093bc5b250bf625ec75983764ba2bf#r585771280 by @shipilev > > Also, I see a bunch of methods named `get_*` in GC code alone. I have no problem renaming it to eg. 
`dead_space()` but it does not seem that this naming pattern is not used in hotspot. https://wiki.openjdk.java.net/display/HotSpot/StyleGuide > Nearly all of the guidelines mentioned below have many counter-examples in the Hotspot code base. Finding a counterexample is not sufficient justification for new code to follow the counterexample as a precedent, since readers of your code will rightfully expect your code to follow the greater bulk of precedents documented here. For more on counterexamples, see the section at the bottom of this page. > > When changing pre-existing code, it is reasonable to adjust it to match these conventions. Exception: If the pre-existing code clearly conforms locally to its own peculiar conventions, it is not worth reformatting the whole thing. and > Getter accessor names are noun phrases, with no "get_" noise word. Boolean getters can also begin with "is_" or "has_". Unless there is a good reason, please keep to few rules the official style guide for new code. ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From github.com+168222+mgkwill at openjdk.java.net Mon Mar 15 15:45:14 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Mon, 15 Mar 2021 15:45:14 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v19] In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 07:34:26 GMT, Thomas Stuefe wrote: > Hi Markus, > > first off, starting to look better and better. 
> > About the assert, I'm quite sure this is the result of using `os::page_size_for_region_aligned()` - as Stefan mentioned above you should use the unaligned version of this function since the input size 21098496 is not aligned to 2M, which causes the function to return 4K even though we have space enough to fit 9-10 2M pages here: > > ``` > [0.825s][info][pagesize] Large page size returned from os::page_size_for_region_aligned: 4096, for bytes: 21098496 > ``` > > Sorry, I don't have time to dive deeper. I know @kstefanj had some more ideas and this ties in with his work, but he may be occupied with other things right now and may not be quick to reply. > > All in all this starts to look real good. I try to resist the urge to refactor everything on the back of this PR. There will be work left for future RFEs. > > Cheers, Thomas Thanks @tstuefe Thomas. Changed to os::page_size_for_region_unaligned at os::Linux::reserve_memory_special_huge_tlbfs_mixed. This helped. Working on new patch now. I'd really like to resist too much refactoring here also. I believe @kstefanj is looking at more in depth refactor. Thanks, Marcus ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From jbachorik at openjdk.java.net Mon Mar 15 16:03:12 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 15 Mar 2021 16:03:12 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Mon, 15 Mar 2021 15:38:48 GMT, Thomas Schatzl wrote: >> Also, I see a bunch of methods named `get_*` in GC code alone. I have no problem renaming it to eg. `dead_space()` but it does not seem that this naming pattern is not used in hotspot. > > https://wiki.openjdk.java.net/display/HotSpot/StyleGuide > >> Nearly all of the guidelines mentioned below have many counter-examples in the Hotspot code base. 
Finding a counterexample is not sufficient justification for new code to follow the counterexample as a precedent, since readers of your code will rightfully expect your code to follow the greater bulk of precedents documented here. For more on counterexamples, see the section at the bottom of this page. >> >> When changing pre-existing code, it is reasonable to adjust it to match these conventions. Exception: If the pre-existing code clearly conforms locally to its own peculiar conventions, it is not worth reformatting the whole thing. > > and > >> Getter accessor names are noun phrases, with no "get_" noise word. Boolean getters can also begin with "is_" or "has_". > > Unless there is a good reason, please keep to few rules the official style guide for new code. ?? ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Mon Mar 15 16:17:23 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 15 Mar 2021 16:17:23 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v13] In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: > The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in worst case 'at last full GC') in form of a periodically emitted JFR event. > > ## Introducing new JFR event > > While there is already 'GC Heap Summary' JFR event it does not fit the requirements as it is closely tied to GC cycle so eg. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events being present in the JFR recording at all. 
> Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information abut the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. > > ## Implementation > > The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value. > > The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct. > > ### Epsilon GC > > Trivial implementation - just return `used()` instead. > > ### Serial GC > > Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). > > ### Parallel GC > > For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). > > ### G1 GC > > Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. 
However, the numbers I am getting for G1 are comparable to other GCs for the same application. > > ### Shenandoah > > In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. > This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. > > ### ZGC > > `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Change get_dead_space() to dead_space() ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2579/files - new: https://git.openjdk.java.net/jdk/pull/2579/files/67d78940..056f5fd7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=12 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=11-12 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/2579.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2579/head:pull/2579 PR: https://git.openjdk.java.net/jdk/pull/2579 From naoto at openjdk.java.net Mon Mar 15 16:21:08 2021 From: naoto at openjdk.java.net (Naoto Sato) Date: Mon, 15 Mar 2021 16:21:08 GMT Subject: RFR: 8263556: remove `@modules java.base` from tests In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 20:26:42 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this trivial cleanup? 
> from JBS: > >> jtreg `@modules X` directive does two things: >> - exclude a test from execution if JDK under test doesn't have module X >> - if JDK under test has module X, make sure it's resolved >> >> both these things have no sense for `java.base` module as it's always available and is always resolved. > > > Thanks, > -- Igor Marked as reviewed by naoto (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2990 From iris at openjdk.java.net Mon Mar 15 16:28:08 2021 From: iris at openjdk.java.net (Iris Clark) Date: Mon, 15 Mar 2021 16:28:08 GMT Subject: RFR: 8263556: remove `@modules java.base` from tests In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 20:26:42 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this trivial cleanup? > from JBS: > >> jtreg `@modules X` directive does two things: >> - exclude a test from execution if JDK under test doesn't have module X >> - if JDK under test has module X, make sure it's resolved >> >> both these things have no sense for `java.base` module as it's always available and is always resolved. > > > Thanks, > -- Igor Marked as reviewed by iris (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2990 From jbachorik at openjdk.java.net Mon Mar 15 16:32:31 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Mon, 15 Mar 2021 16:32:31 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v14] In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: > The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in worst case 'at last full GC') in form of a periodically emitted JFR event. 
> > ## Introducing new JFR event > > While there is already 'GC Heap Summary' JFR event it does not fit the requirements as it is closely tied to GC cycle so eg. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events being present in the JFR recording at all. > Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information abut the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. > > ## Implementation > > The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value. > > The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct. > > ### Epsilon GC > > Trivial implementation - just return `used()` instead. > > ### Serial GC > > Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). > > ### Parallel GC > > For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). 
> > ### G1 GC > > Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. > > ### Shenandoah > > In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context. > This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. > > ### ZGC > > `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. 
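The common part of the description above (a cached liveness value that falls back to `used()` until a GC cycle has produced one) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: `SketchHeap`, `record_live()`, `set_used()` and `kNoLiveEstimate` are hypothetical names, and the sentinel-based "not yet calculated" check is one possible way to express the fallback.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical sentinel meaning "no GC cycle has produced an estimate yet".
constexpr std::size_t kNoLiveEstimate = static_cast<std::size_t>(-1);

// Hypothetical stand-in for a heap class; this is not HotSpot's CollectedHeap.
class SketchHeap {
 public:
  explicit SketchHeap(std::size_t used_bytes)
      : _used(used_bytes), _live(kNoLiveEstimate) {}

  std::size_t used() const { return _used; }
  void set_used(std::size_t used_bytes) { _used = used_bytes; }

  // Would be called by the collector at the end of a cycle (assumption).
  void record_live(std::size_t live_bytes) {
    assert(live_bytes != kNoLiveEstimate);
    _live.store(live_bytes, std::memory_order_relaxed);
  }

  // Returns the cached estimate, or used() if none has been recorded yet.
  std::size_t live() const {
    std::size_t cached = _live.load(std::memory_order_relaxed);
    return cached == kNoLiveEstimate ? used() : cached;
  }

 private:
  std::size_t _used;
  std::atomic<std::size_t> _live;
};
```

Note that in this sketch the cached value stays unchanged until the next `record_live()` call, even if `used()` grows in the meantime, so the reported estimate is only as fresh as the last cycle that updated it.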
Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Capture live estimate for G1 full cycle ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2579/files - new: https://git.openjdk.java.net/jdk/pull/2579/files/056f5fd7..81250d1c Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=13 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=12-13 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/2579.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2579/head:pull/2579 PR: https://git.openjdk.java.net/jdk/pull/2579 From coleenp at openjdk.java.net Mon Mar 15 16:47:09 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 15 Mar 2021 16:47:09 GMT Subject: RFR: 8263595: Remove oop type punning in JavaCallArguments In-Reply-To: References: Message-ID: <94-niMB2FUmh-8Q26BJs3blnztaKgXP9TtdOwjPwSM0=.75a5ad09-4fce-4185-b4c2-d691626ca1ad@github.com> On Mon, 15 Mar 2021 14:34:48 GMT, Stefan Karlsson wrote: > JavaCallArguments has this code and comment: > > // Helper for push_oop and the like. The value argument is a > // "handle" that refers to an oop. We record the address of the > // handle rather than the designated oop. The handle is later > // resolved to the oop by parameters(). This delays the exposure of > // naked oops until it is GC-safe. > template > inline int push_oop_impl(T handle, int size) { > // JNITypes::put_obj expects an oop value, so we play fast and > // loose with the type system. The cast from handle type to oop > // *must* use a C-style cast. In a product build it performs a > // reinterpret_cast. In a debug build (more accurately, in a > // CHECK_UNHANDLED_OOPS build) it performs a static_cast, invoking > // the debug-only oop class's conversion from void* constructor. > JNITypes::put_obj((oop)handle, _value, size); // Updates size. 
> return size; // Return the updated size. > } > The type T is either an oop* or jobject (JNI handle). This puts something that isn't an oop inside an oop. > > I propose that we don't do this. Instead we could pass the handle (address containing the oop), and then in put_obj convert that address to an intptr_t, which matches well with the `to` argument of those functions. > > I've been running this (and some other changes) with ZGC on Linux x64 through tier1-tier7. src/hotspot/share/runtime/javaCalls.hpp line 110: > 108: // handle rather than the designated oop. The handle is later > 109: // resolved to the oop by parameters(). This delays the exposure of > 110: // naked oops until it is GC-safe. I thought this was the reason we had to have this ugliness. Once you push an oop to the argument stack, you could have a GC and this was a naked oop. By having the Handle pointer or jobject pointer, you'd get a pointer to the oop for GC to process. Maybe this change does the same thing though, only a lot nicer. Yes, it looks like it does. Now to have Kim point out why it's wrong :( I hope not because I like this change. ------------- PR: https://git.openjdk.java.net/jdk/pull/3014 From coleenp at openjdk.java.net Mon Mar 15 16:52:08 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 15 Mar 2021 16:52:08 GMT Subject: RFR: 8263595: Remove oop type punning in JavaCallArguments In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 14:34:48 GMT, Stefan Karlsson wrote: > JavaCallArguments has this code and comment: > > // Helper for push_oop and the like. The value argument is a > // "handle" that refers to an oop. We record the address of the > // handle rather than the designated oop. The handle is later > // resolved to the oop by parameters(). This delays the exposure of > // naked oops until it is GC-safe. 
> template > inline int push_oop_impl(T handle, int size) { > // JNITypes::put_obj expects an oop value, so we play fast and > // loose with the type system. The cast from handle type to oop > // *must* use a C-style cast. In a product build it performs a > // reinterpret_cast. In a debug build (more accurately, in a > // CHECK_UNHANDLED_OOPS build) it performs a static_cast, invoking > // the debug-only oop class's conversion from void* constructor. > JNITypes::put_obj((oop)handle, _value, size); // Updates size. > return size; // Return the updated size. > } > The type T is either an oop* or jobject (JNI handle). This puts something that isn't an oop inside an oop. > > I propose that we don't do this. Instead we could pass the handle (address containing the oop), and then in put_obj convert that address to an intptr_t, which matches well with the `to` argument of those functions. > > I've been running this (and some other changes) with ZGC on Linux x64 through tier1-tier7. I don't know what "punning" is. Does it mean "casting"? ------------- PR: https://git.openjdk.java.net/jdk/pull/3014 From iklam at openjdk.java.net Mon Mar 15 17:00:12 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 15 Mar 2021 17:00:12 GMT Subject: RFR: 8263595: Remove oop type punning in JavaCallArguments In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 14:34:48 GMT, Stefan Karlsson wrote: > JavaCallArguments has this code and comment: > > // Helper for push_oop and the like. The value argument is a > // "handle" that refers to an oop. We record the address of the > // handle rather than the designated oop. The handle is later > // resolved to the oop by parameters(). This delays the exposure of > // naked oops until it is GC-safe. > template > inline int push_oop_impl(T handle, int size) { > // JNITypes::put_obj expects an oop value, so we play fast and > // loose with the type system. The cast from handle type to oop > // *must* use a C-style cast. 
In a product build it performs a > // reinterpret_cast. In a debug build (more accurately, in a > // CHECK_UNHANDLED_OOPS build) it performs a static_cast, invoking > // the debug-only oop class's conversion from void* constructor. > JNITypes::put_obj((oop)handle, _value, size); // Updates size. > return size; // Return the updated size. > } > The type T is either an oop* or jobject (JNI handle). This puts something that isn't an oop inside an oop. > > I propose that we don't do this. Instead we could pass the handle (address containing the oop), and then in put_obj convert that address to an intptr_t, which matches well with the `to` argument of those functions. > > I've been running this (and some other changes) with ZGC on Linux x64 through tier1-tier7. Can we consolidated the duplicated versions of put_obj() into a shared header file? ------------- PR: https://git.openjdk.java.net/jdk/pull/3014 From iignatyev at openjdk.java.net Mon Mar 15 17:08:09 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Mon, 15 Mar 2021 17:08:09 GMT Subject: RFR: 8263556: remove `@modules java.base` from tests In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 16:25:48 GMT, Iris Clark wrote: >> Hi all, >> >> could you please review this trivial cleanup? >> from JBS: >> >>> jtreg `@modules X` directive does two things: >>> - exclude a test from execution if JDK under test doesn't have module X >>> - if JDK under test has module X, make sure it's resolved >>> >>> both these things have no sense for `java.base` module as it's always available and is always resolved. >> >> >> Thanks, >> -- Igor > > Marked as reviewed by iris (Reviewer). Iris, Naoto, Dan, thank you for your reviews! 
-- Igor ------------- PR: https://git.openjdk.java.net/jdk/pull/2990 From iignatyev at openjdk.java.net Mon Mar 15 17:08:10 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Mon, 15 Mar 2021 17:08:10 GMT Subject: Integrated: 8263556: remove `@modules java.base` from tests In-Reply-To: References: Message-ID: <47c19e0cAuCEtsxE3PtvwXJdmbdvqPSj7oZglqg5c_E=.6d397d21-b62d-4c77-a6fb-7d682b983870@github.com> On Sat, 13 Mar 2021 20:26:42 GMT, Igor Ignatyev wrote: > Hi all, > > could you please review this trivial cleanup? > from JBS: > >> jtreg `@modules X` directive does two things: >> - exclude a test from execution if JDK under test doesn't have module X >> - if JDK under test has module X, make sure it's resolved >> >> neither of these makes sense for the `java.base` module as it's always available and is always resolved. > > > Thanks, > -- Igor This pull request has now been integrated. Changeset: d825198e Author: Igor Ignatyev URL: https://git.openjdk.java.net/jdk/commit/d825198e Stats: 21 lines in 13 files changed: 0 ins; 13 del; 8 mod 8263556: remove `@modules java.base` from tests Reviewed-by: dcubed, naoto, iris ------------- PR: https://git.openjdk.java.net/jdk/pull/2990 From coleenp at openjdk.java.net Mon Mar 15 17:13:12 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 15 Mar 2021 17:13:12 GMT Subject: RFR: 8263595: Remove oop type punning in JavaCallArguments In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 16:57:19 GMT, Ioi Lam wrote: > Can we consolidate the duplicated versions of put_obj() into a shared header file? I had a brief look at this and we should try to do this as a follow-up RFE.
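For readers following the 8263595 thread, the idea under review (record the address of the handle as an integer in the argument array, and only resolve it to the object later) might look roughly like the sketch below. It is not the patch itself: `FakeObject`, `FakeOop`, `put_handle()` and `resolve()` are hypothetical stand-ins, and the real `JNITypes::put_obj` signatures differ per platform.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; HotSpot's oop and handle types are more involved.
struct FakeObject { int payload; };
using FakeOop = FakeObject*;

// Store the address of the handle slot as a plain integer, without ever
// pretending that the handle itself is an object pointer.
inline int put_handle(FakeOop* handle, std::intptr_t* to, int pos) {
  assert(handle != nullptr);
  to[pos] = reinterpret_cast<std::intptr_t>(handle);
  return pos + 1;  // next free slot
}

// Resolve the stored handle address to the object it currently refers to.
inline FakeOop resolve(const std::intptr_t* from, int pos) {
  return *reinterpret_cast<FakeOop*>(from[pos]);
}
```

Because only the slot address is stored, anything that updates the slot between `put_handle()` and `resolve()` (in the real VM, a GC relocating the object) is picked up at resolution time.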
------------- PR: https://git.openjdk.java.net/jdk/pull/3014 From github.com+168222+mgkwill at openjdk.java.net Mon Mar 15 17:37:30 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Mon, 15 Mar 2021 17:37:30 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v20] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. 
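The selection rule described above (pick the largest available page size that still fits into the requested number of bytes, otherwise fall back to the base page size) can be sketched as follows. This is a simplified illustration, not the code in the PR: `select_page_size()` and its parameters are hypothetical, it treats "fits" as less than or equal, and it ignores any alignment requirement on the region.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the largest entry of page_sizes that is <= bytes, or
// base_page_size if no larger entry fits. The names are illustrative only.
inline std::size_t select_page_size(const std::vector<std::size_t>& page_sizes,
                                    std::size_t bytes,
                                    std::size_t base_page_size) {
  std::size_t best = base_page_size;
  for (std::size_t ps : page_sizes) {
    if (ps <= bytes && ps > best) {
      best = ps;
    }
  }
  return best;
}
```

With page sizes of 4K, 2M and 1G, a reservation of 21098496 bytes (the size from the log output quoted earlier in this thread) would select 2M under this rule, while a request smaller than 2M would fall back to the base page size.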
Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: Fix reserve_memory_special_huge_tlbfs_mixed, remove logging Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/90befbe1..f3b4b81e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=19 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=18-19 Stats: 17 lines in 1 file changed: 0 ins; 16 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Mon Mar 15 17:48:22 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Mon, 15 Mar 2021 17:48:22 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v21] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. 
Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: Fix logging issues Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/f3b4b81e..3cfeb95f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=20 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=19-20 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From stefank at openjdk.java.net Mon Mar 15 17:58:23 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 15 Mar 2021 17:58:23 GMT Subject: RFR: 8263595: Remove oop type punning in JavaCallArguments In-Reply-To: <94-niMB2FUmh-8Q26BJs3blnztaKgXP9TtdOwjPwSM0=.75a5ad09-4fce-4185-b4c2-d691626ca1ad@github.com> References: <94-niMB2FUmh-8Q26BJs3blnztaKgXP9TtdOwjPwSM0=.75a5ad09-4fce-4185-b4c2-d691626ca1ad@github.com> Message-ID: On Mon, 15 Mar 2021 16:44:05 GMT, Coleen Phillimore wrote: >> JavaCallArguments has this code and comment: >> >> // Helper for push_oop and the like. The value argument is a >> // "handle" that refers to an oop. We record the address of the >> // handle rather than the designated oop. The handle is later >> // resolved to the oop by parameters(). This delays the exposure of >> // naked oops until it is GC-safe. >> template >> inline int push_oop_impl(T handle, int size) { >> // JNITypes::put_obj expects an oop value, so we play fast and >> // loose with the type system. The cast from handle type to oop >> // *must* use a C-style cast. In a product build it performs a >> // reinterpret_cast. 
In a debug build (more accurately, in a >> // CHECK_UNHANDLED_OOPS build) it performs a static_cast, invoking >> // the debug-only oop class's conversion from void* constructor. >> JNITypes::put_obj((oop)handle, _value, size); // Updates size. >> return size; // Return the updated size. >> } >> The type T is either an oop* or jobject (JNI handle). This puts something that isn't an oop inside an oop. >> >> I propose that we don't do this. Instead we could pass the handle (address containing the oop), and then in put_obj convert that address to an intptr_t, which matches well with the `to` argument of those functions. >> >> I've been running this (and some other changes) with ZGC on Linux x64 through tier1-tier7. > > src/hotspot/share/runtime/javaCalls.hpp line 110: > >> 108: // handle rather than the designated oop. The handle is later >> 109: // resolved to the oop by parameters(). This delays the exposure of >> 110: // naked oops until it is GC-safe. > > I thought this was the reason we had to have this ugliness. Once you push an oop to the argument stack, you could have a GC and this was a naked oop. By having the Handle pointer or jobject pointer, you'd get a pointer to the oop for GC to process. Maybe this change does the same thing though, only a lot nicer. Yes, it looks like it does. Now to have Kim point out why it's wrong :( I hope not because I like this change. The patch does the same thing as previously, but it skips casting oop*/jobject to oop. ------------- PR: https://git.openjdk.java.net/jdk/pull/3014 From stefank at openjdk.java.net Mon Mar 15 18:01:11 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 15 Mar 2021 18:01:11 GMT Subject: RFR: 8263595: Remove oop type punning in JavaCallArguments In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 17:10:22 GMT, Coleen Phillimore wrote: > > Can we consolidated the duplicated versions of put_obj() into a shared header file? 
> > I had a brief look at this and we should try to do this as a follow-up RFE. I had the same thought as @iklam, but I skipped doing it for now. ------------- PR: https://git.openjdk.java.net/jdk/pull/3014 From github.com+168222+mgkwill at openjdk.java.net Mon Mar 15 18:17:27 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Mon, 15 Mar 2021 18:17:27 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v22] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than the number of bytes being reserved.
Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: Fix logging issues 2 Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/3cfeb95f..6a8f57f3 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=21 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=20-21 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From akozlov at openjdk.java.net Mon Mar 15 18:23:35 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 15 Mar 2021 18:23:35 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v28] In-Reply-To: References: Message-ID: > Please review the implementation of JEP 391: macOS/AArch64 Port. > > It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. > > Major changes are in: > * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) > * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) > * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. 
> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 114 commits: - JDK-8262491: bsd_aarch64 part - JDK-8263002: bsd_aarch64 part - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos - Wider #ifdef block - Fix most of issues in java/foreign/ tests Failures related to va_args are tracked in JDK-8263512. - Add Azul copyright - Update Oracle copyright years - Use Thread::current_or_null_safe in SafeFetch - 8262903: [macos_aarch64] Thread::current() called on detached thread - Merge commit 'refs/pull/11/head' of https://github.com/AntonKozlov/jdk into jdk-macos - ... and 104 more: https://git.openjdk.java.net/jdk/compare/d825198e...806fc618 ------------- Changes: https://git.openjdk.java.net/jdk/pull/2200/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2200&range=27 Stats: 2949 lines in 75 files changed: 2839 ins; 27 del; 83 mod Patch: https://git.openjdk.java.net/jdk/pull/2200.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2200/head:pull/2200 PR: https://git.openjdk.java.net/jdk/pull/2200 From iklam at openjdk.java.net Mon Mar 15 18:35:10 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 15 Mar 2021 18:35:10 GMT Subject: RFR: 8263595: Remove oop type punning in JavaCallArguments In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 14:34:48 GMT, Stefan Karlsson wrote: > JavaCallArguments has this code and comment: > > // Helper for push_oop and the like. The value argument is a > // "handle" that refers to an oop. We record the address of the > // handle rather than the designated oop. The handle is later > // resolved to the oop by parameters(). This delays the exposure of > // naked oops until it is GC-safe. > template > inline int push_oop_impl(T handle, int size) { > // JNITypes::put_obj expects an oop value, so we play fast and > // loose with the type system. 
The cast from handle type to oop > // *must* use a C-style cast. In a product build it performs a > // reinterpret_cast. In a debug build (more accurately, in a > // CHECK_UNHANDLED_OOPS build) it performs a static_cast, invoking > // the debug-only oop class's conversion from void* constructor. > JNITypes::put_obj((oop)handle, _value, size); // Updates size. > return size; // Return the updated size. > } > The type T is either an oop* or jobject (JNI handle). This puts something that isn't an oop inside an oop. > > I propose that we don't do this. Instead we could pass the handle (address containing the oop), and then in put_obj convert that address to an intptr_t, which matches well with the `to` argument of those functions. > > I've been running this (and some other changes) with ZGC on Linux x64 through tier1-tier7. LGTM ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3014 From github.com+168222+mgkwill at openjdk.java.net Mon Mar 15 18:40:23 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Mon, 15 Mar 2021 18:40:23 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v23] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). 
> > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: Use SIZE_FORMAT in logging Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/6a8f57f3..22b27b92 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=22 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=21-22 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From iklam at openjdk.java.net Mon Mar 15 18:41:34 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 15 Mar 2021 18:41:34 GMT Subject: RFR: 8263392: Allow current thread to be specified in ExceptionMark [v2] In-Reply-To: References: Message-ID: > `ExceptionMark`, usually used via the `EXCEPTION_MARK` macro, is used to guarantee that an exception is not thrown within a block of code. I made two changes to improve efficiency: > > - Avoid calling `Thread::current()` if a thread object is already available. > - Avoid passing a reference to the `ExceptionMark` constructor. This helps C++ generate slightly better code. > > This new variant of `ExceptionMark` is mainly intended for future clean up of `TRAPS/CHECK/THREAD` code, where an exception context is temporarily needed but we will guarantee that all exceptions will be handled.
I modified `SharedRuntime::monitor_exit_helper()` to use this pattern: > > Old style: > > void a_func_that_never_throws() { > EXCEPTION_MARK; > a_func_that_could_throw(THREAD); > if (HAS_PENDING_EXCEPTION) { > // handle it > CLEAR_PENDING_EXCEPTION; > } > } > > New style: > > void a_func_that_never_throws(Thread* current) { // pass thread to avoid calling Thread::current() > ExceptionMark em(current); > Thread* THREAD = current; // For exception macros. > a_func_that_could_throw(THREAD); > if (HAS_PENDING_EXCEPTION) { > // handle it > CLEAR_PENDING_EXCEPTION; > } > } Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - Merge branch 'master' into 8263392-ExceptionMark-with-thread - removed THREAD declaration because ObjectSynchronizer::exit no longer needs it since JDK-8262910 - fixed build - 8263392: Allow current thread to be specified in ExceptionMark ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2950/files - new: https://git.openjdk.java.net/jdk/pull/2950/files/91eefd71..53a016d7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2950&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2950&range=00-01 Stats: 25563 lines in 1185 files changed: 21219 ins; 2025 del; 2319 mod Patch: https://git.openjdk.java.net/jdk/pull/2950.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2950/head:pull/2950 PR: https://git.openjdk.java.net/jdk/pull/2950 From ccheung at openjdk.java.net Mon Mar 15 18:46:09 2021 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Mon, 15 Mar 2021 18:46:09 GMT Subject: RFR: 8263562: Checking if proxy_klass_head is still lambda_proxy_is_available In-Reply-To: References: Message-ID: <7nEEqXgCs7MXosNyszEhutamFo4I0JVA8DvfwOxr3NA=.bab47ab5-8bcd-4bc4-83ad-3780cb15821f@github.com> On Mon, 15 Mar 2021 
02:17:36 GMT, Yi Yang wrote: > The `Shared Lambda Dictionary` section in the result of SharedLambdaDictionaryPrinter will mix normal klasses with lambda proxy klasses. Using the following commands can reproduce it: > > Proc1: `./jshell` > Proc2: `jcmd VM.systemdictionary -verbose` > > When all archived lambda proxy classes are used, proxy_klass_head (in RunTimeLambdaProxyClassInfo) is still referring to an instance klass that is no longer lambda_proxy_is_available, and its next_link will be set by classloader to link another normal class. Simply checking if proxy_klass_head is lambda_proxy_is_available can solve this problem. > > Best regards, > Yang Looks good and thanks for fixing it. ------------- Marked as reviewed by ccheung (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3001 From ccheung at openjdk.java.net Mon Mar 15 18:46:12 2021 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Mon, 15 Mar 2021 18:46:12 GMT Subject: RFR: 8263562: Checking if proxy_klass_head is still lambda_proxy_is_available In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 02:22:19 GMT, Yi Yang wrote: > Furthermore, I think we should clear proxy_klass_head when it becomes no longer lambda_proxy_is_available. Any further accesses will be illegal. Just IMHO, I will hold this only if you think it's useful. I'd suggest opening another bug for the above since it fixes a different issue. ------------- PR: https://git.openjdk.java.net/jdk/pull/3001 From rkennke at openjdk.java.net Mon Mar 15 18:46:36 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 15 Mar 2021 18:46:36 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v4] In-Reply-To: References: Message-ID: > We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress).
However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs. > > I believe this might be the root cause for JDK-8262852. > > The way out of it is to change conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress. > > There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. But conc-weakroots-in-progress is turned-off concurrently. I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change. > > This stuff makes the verifier unhappy, because it doesn't know about the new bit. And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?)
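The gc-state check described above can be sketched roughly as follows. The bit names, their values, and the two helper functions are hypothetical and chosen for illustration only; the real Shenandoah gc-state layout may differ. The sketch only shows the idea that a weak LRB tests one more bit than a strong LRB.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical gc-state bits for illustration; not the actual Shenandoah layout.
enum GCStateBits : std::uint8_t {
  EVACUATION  = 1 << 0,  // heap unstable: evacuation in progress
  UPDATE_REFS = 1 << 1,  // heap unstable: update-refs in progress
  WEAK_ROOTS  = 1 << 2   // concurrent weak roots processing in progress
};

// A strong LRB is entered only while the heap is unstable.
inline bool needs_strong_lrb(std::uint8_t gc_state) {
  return (gc_state & (EVACUATION | UPDATE_REFS)) != 0;
}

// A weak LRB is additionally entered while weak roots are being processed,
// even if the heap is otherwise stable.
inline bool needs_weak_lrb(std::uint8_t gc_state) {
  return (gc_state & (EVACUATION | UPDATE_REFS | WEAK_ROOTS)) != 0;
}
```

Under these assumptions, a shortcut cycle that skips evacuation and update-refs would leave only the weak-roots bit set, so the strong check fails while the weak check still succeeds.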
> > Testing: > - [x] New testcase failed without change, passes now > - [x] hotspot_gc_shenandoah > - [ ] tier1 (+Shenandoah) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Disable weakroots together with evacuation at safepoint, and in a separate vmop in shortcut cycles ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2945/files - new: https://git.openjdk.java.net/jdk/pull/2945/files/739c9b62..8c103236 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2945&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2945&range=02-03 Stats: 70 lines in 7 files changed: 52 ins; 18 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/2945.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2945/head:pull/2945 PR: https://git.openjdk.java.net/jdk/pull/2945 From ccheung at openjdk.java.net Mon Mar 15 18:51:17 2021 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Mon, 15 Mar 2021 18:51:17 GMT Subject: RFR: 8263392: Allow current thread to be specified in ExceptionMark [v2] In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 18:41:34 GMT, Ioi Lam wrote: >> `ExceptionMark`, usually used via the `EXCEPTION_MARK` macro, is used to guarantee that an exception is not thrown within a block of code. I made two changes to improve efficiency: >> >> - Avoid calling `Thread::current()` if a thread object is already available. >> - Avoid passing a reference to the `ExceptionMark` constructor. This helps C++ generate slightly better code. >> >> This new variant of `ExceptionMark` is mainly intended for future clean up of `TRAPS/CHECK/THREAD` code, where an exception context is temporarily needed but we will guarantee that all exceptions will be handled.
I modified `SharedRuntime::monitor_exit_helper()` to use this pattern: >> >> Old style: >> >> void a_func_that_never_throws() { >> EXCEPTION_MARK; >> a_func_that_could_throw(THREAD); >> if (HAS_PENDING_EXCEPTION) { >> // handle it >> CLEAR_PENDING_EXCEPTION; >> } >> } >> >> New style: >> >> void a_func_that_never_throws(Thread* current) { // pass thread to avoid calling Thread::current() >> ExceptionMark em(current); >> Thread* THREAD = current; // For exception macros. >> a_func_that_could_throw(THREAD); >> if (HAS_PENDING_EXCEPTION) { >> // handle it >> CLEAR_PENDING_EXCEPTION; >> } >> } > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Merge branch 'master' into 8263392-ExceptionMark-with-thread > - removed THREAD declaration because ObjectSynchronizer::exit no longer needs it since JDK-8262910 > - fixed build > - 8263392: Allow current thread to be specified in ExceptionMark Looks good. ------------- Marked as reviewed by ccheung (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2950 From shade at openjdk.java.net Mon Mar 15 18:51:15 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Mon, 15 Mar 2021 18:51:15 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v4] In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 18:46:36 GMT, Roman Kennke wrote: >> We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs. 
>> >> I believe this might be the root cause for JDK-8262852. >> >> The way out of it is to change conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress. >> >> There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. But conc-weakroots-in-progress is turned-off concurrently. I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change. >> >> This stuff makes the verifier unhappy, because it doesn't know about the new bit. And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?) >> >> Testing: >> - [x] New testcase failed without change, passes now >> - [x] hotspot_gc_shenandoah >> - [ ] tier1 (+Shenandoah) > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Disable weakroots together with evacuation at safepoint, and in a separate vmop in shortcut cycles src/hotspot/share/gc/shenandoah/shenandoahPhaseTimings.hpp line 102: > 100: \ > 101: f(disable_weakroots_gross, "Disable Weak Roots (G)") \ > 102: f(disable_weakroots, "Disable Weak Roots (N)") \ Should start with "Pause". And should probably be something generic, like "Pause Final Roots"? This probably percolates to the names of everything else... src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 226: > 224: } > 225: > 226: Double new line.
------------- PR: https://git.openjdk.java.net/jdk/pull/2945 From akozlov at openjdk.java.net Mon Mar 15 18:56:18 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 15 Mar 2021 18:56:18 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v12] In-Reply-To: <3NYUmXmjyZFhGJwrHfEjSRX1VRaPjt5cCp9HRBxODbM=.4880b6d1-f6dd-45db-95f4-9064e9204d87@github.com> References: <8MnBLkES1lapB4b01NDzU9nhOk8_9_V--NSCM5H_bg8=.7bdb576b-4acd-4e5b-be14-b363a2ef47bf@github.com> <3NYUmXmjyZFhGJwrHfEjSRX1VRaPjt5cCp9HRBxODbM=.4880b6d1-f6dd-45db-95f4-9064e9204d87@github.com> Message-ID: On Thu, 11 Mar 2021 20:27:51 GMT, Stefan Karlsson wrote: >> The thread_bsd_aarch64.hpp describes a part of JavaThread, while this block belongs to Thread for now. Since W^X is an attribute of any operating system thread, I assumed Thread to be the right place for W^X bookkeeping. >> >> In most cases, we manage W^X state of JavaThread. But sometimes a GC thread needs the WXWrite state, or safefetch is called from non-JavaThread. Probably this can be dealt with (e.g. GCThread to always have the WXWrite state). But such change would be much more than a simple refactoring and it would require a significant amount of testing. Ideally, I would like to investigate this as a follow-up change, or at least after other fixes to this PR. > > Good point about Thread vs JavaThread. Yes, this can be looked into as follow-up cleanups. 
The enhancement is tracked in https://bugs.openjdk.java.net/browse/JDK-8263492 ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Mon Mar 15 18:56:17 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 15 Mar 2021 18:56:17 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v21] In-Reply-To: References: Message-ID: <4EryfweVy9Q97cbm7rAcsSIXuK2XImbIfuPeloJuXEA=.90a1f396-299c-43d2-922b-9af36ca43467@github.com> On Wed, 10 Mar 2021 11:21:44 GMT, Andrew Haley wrote: >> We always check for `R18_RESERVED` with `#if(n)def`, is there any reason to define the value for the macro? > > Robustness, clarity, maintainability, convention. Why not? I've tried to implement the suggestion, but it pulled more unnecessary changes. It makes the intended way to check the condition less clear (`#ifdef` and not `#if`). The rest of the defines in this file follows the pattern: a define without a value to be checked with `#ifdef` and define with a value to be checked with `#if`. To be consistent, I would need to add `#define R18_RESERVED false` to the `#else` clause and change every `#ifdef R18_RESERVED`/`#ifndef R18_RESERVED` to `#if R18_RESERVED`/`#if !R18_RESERVED`. I think we'll win in clarity in the long term if I will not implement the suggestion without related changes. (And related changes would introduce additional noise, which we are fighting with). ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Mon Mar 15 18:56:23 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 15 Mar 2021 18:56:23 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v28] In-Reply-To: References: Message-ID: <0dFnMCcGeahDTezdMqNQ2ZtipjeEaCpNezow6Kqy5xE=.ec72c0d8-8252-4abf-ab7a-8ce4b89ca527@github.com> On Sat, 13 Mar 2021 05:49:53 GMT, David Holmes wrote: >> Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 114 commits: >> >> - JDK-8262491: bsd_aarch64 part >> - JDK-8263002: bsd_aarch64 part >> - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos >> - Wider #ifdef block >> - Fix most of issues in java/foreign/ tests >> >> Failures related to va_args are tracked in JDK-8263512. >> - Add Azul copyright >> - Update Oracle copyright years >> - Use Thread::current_or_null_safe in SafeFetch >> - 8262903: [macos_aarch64] Thread::current() called on detached thread >> - Merge commit 'refs/pull/11/head' of https://github.com/AntonKozlov/jdk into jdk-macos >> - ... and 104 more: https://git.openjdk.java.net/jdk/compare/d825198e...806fc618 > > src/hotspot/share/runtime/safefetch.inline.hpp line 35: > >> 33: inline int SafeFetch32(int* adr, int errValue) { >> 34: assert(StubRoutines::SafeFetch32_stub(), "stub not yet generated"); >> 35: Thread* thread = Thread::current_or_null_safe(); > > Sorry but this should be MACOS_AARCH64 only. All three lines need to be ifdef'd if you are going to include the assertion. > > Thanks, > David Right, thanks! Fixed in https://github.com/openjdk/jdk/pull/2200/commits/3d0f4d2342adc867eaf762fa83a9c3035d6439bd ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From coleenp at openjdk.java.net Mon Mar 15 19:01:13 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 15 Mar 2021 19:01:13 GMT Subject: RFR: 8263392: Allow current thread to be specified in ExceptionMark [v2] In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 18:41:34 GMT, Ioi Lam wrote: >> `ExceptionMark`, usually used via the `EXCEPTION_MARK` macro, is used to guarantee that an exception is not thrown within a block of code. I made two changes to improve efficiency: >> >> - Avoid calling `Thread::current()` if a thread object is already available. >> - Avoid passing a reference to the `ExceptionMark` constructor. This helps C++ generate slightly better code.
>> >> This new variant of `ExceptionMark` is mainly intended for future clean up of `TRAPS/CHECK/THREAD` code, where an exception context is temporarily needed but we will guarantee that all exceptions will be handled. I modified `SharedRuntime::monitor_exit_helper()` to use this pattern: >> >> Old style: >> >> void a_func_that_never_throws() { >> EXCEPTION_MARK; >> a_func_that_could_throw(THREAD); >> if (HAS_PENDING_EXCEPTION) { >> // handle it >> CLEAR_PENDING_EXCEPTION; >> } >> } >> >> New style: >> >> void a_func_that_never_throws(Thread* current) { // pass thread to avoid calling Thread::current() >> ExceptionMark em(current); >> Thread* THREAD = current; // For exception macros. >> a_func_that_could_throw(THREAD); >> if (HAS_PENDING_EXCEPTION) { >> // handle it >> CLEAR_PENDING_EXCEPTION; >> } >> } > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Merge branch 'master' into 8263392-ExceptionMark-with-thread > - removed THREAD declaration because ObjectSynchronizer::exit no longer needs it since JDK-8262910 > - fixed build > - 8263392: Allow current thread to be specified in ExceptionMark This looks good. It would be nice to replace more EXCEPTION_MARK as we find them. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2950 From coleenp at openjdk.java.net Mon Mar 15 19:02:14 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 15 Mar 2021 19:02:14 GMT Subject: RFR: 8263595: Remove oop type punning in JavaCallArguments In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 14:34:48 GMT, Stefan Karlsson wrote: > JavaCallArguments has this code and comment: > > // Helper for push_oop and the like. The value argument is a > // "handle" that refers to an oop. 
We record the address of the > // handle rather than the designated oop. The handle is later > // resolved to the oop by parameters(). This delays the exposure of > // naked oops until it is GC-safe. > template > inline int push_oop_impl(T handle, int size) { > // JNITypes::put_obj expects an oop value, so we play fast and > // loose with the type system. The cast from handle type to oop > // *must* use a C-style cast. In a product build it performs a > // reinterpret_cast. In a debug build (more accurately, in a > // CHECK_UNHANDLED_OOPS build) it performs a static_cast, invoking > // the debug-only oop class's conversion from void* constructor. > JNITypes::put_obj((oop)handle, _value, size); // Updates size. > return size; // Return the updated size. > } > The type T is either an oop* or jobject (JNI handle). This puts something that isn't an oop inside an oop. > > I propose that we don't do this. Instead we could pass the handle (address containing the oop), and then in put_obj convert that address to an intptr_t, which matches well with the `to` argument of those functions. > > I've been running this (and some other changes) with ZGC on Linux x64 through tier1-tier7. Marked as reviewed by coleenp (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3014 From coleenp at openjdk.java.net Mon Mar 15 19:02:16 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 15 Mar 2021 19:02:16 GMT Subject: RFR: 8263595: Remove oop type punning in JavaCallArguments In-Reply-To: References: <94-niMB2FUmh-8Q26BJs3blnztaKgXP9TtdOwjPwSM0=.75a5ad09-4fce-4185-b4c2-d691626ca1ad@github.com> Message-ID: On Mon, 15 Mar 2021 17:55:30 GMT, Stefan Karlsson wrote: >> src/hotspot/share/runtime/javaCalls.hpp line 110: >> >>> 108: // handle rather than the designated oop. The handle is later >>> 109: // resolved to the oop by parameters(). This delays the exposure of >>> 110: // naked oops until it is GC-safe. 
>> >> I thought this was the reason we had to have this ugliness. Once you push an oop to the argument stack, you could have a GC and this was a naked oop. By having the Handle pointer or jobject pointer, you'd get a pointer to the oop for GC to process. Maybe this change does the same thing though, only a lot nicer. Yes, it looks like it does. Now to have Kim point out why it's wrong :( I hope not because I like this change. > > The patch does the same thing as previously, but it skips casting oop*/jobject to oop. Ok! ------------- PR: https://git.openjdk.java.net/jdk/pull/3014 From zgu at openjdk.java.net Mon Mar 15 19:08:12 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Mon, 15 Mar 2021 19:08:12 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v4] In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 18:46:36 GMT, Roman Kennke wrote: >> We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs. >> >> I believe this might be the root cause for JDK-8262852. >> >> The way out of it is to change conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress. >> >> There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. But conc-weakroots-in-progress is turned-off concurrently. I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change.
>> >> This stuff makes the verifier unhappy, because it doesn't know about the new bit. And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?) >> >> Testing: >> - [x] New testcase failed without change, passes now >> - [x] hotspot_gc_shenandoah >> - [ ] tier1 (+Shenandoah) > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Disable weakroots together with evacuation at safepoint, and in a separate vmop in shortcut cycles Changes requested by zgu (Reviewer). src/hotspot/share/gc/shenandoah/shenandoahConcurrentGC.cpp line 909: > 907: } > 908: > 909: void ShenandoahConcurrentGC::op_rendezvous_roots() { This method is no longer needed, and neither is disable_concurrent_weak_root_in_progress_concurrently() ------------- PR: https://git.openjdk.java.net/jdk/pull/2945 From minqi at openjdk.java.net Mon Mar 15 19:11:11 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Mon, 15 Mar 2021 19:11:11 GMT Subject: RFR: 8263392: Allow current thread to be specified in ExceptionMark [v2] In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 18:41:34 GMT, Ioi Lam wrote: >> `ExceptionMark`, usually used via the `EXCEPTION_MARK` macro, is used to guarantee that an exception is not thrown within a block of code. I made two changes to improve efficiency: >> >> - Avoid calling `Thread::current()` if a thread object is already available. >> - Avoid passing a reference to the `ExceptionMark` constructor. This helps C++ generate slightly better code. >> >> This new variant of `ExceptionMark` is mainly intended for future clean up of `TRAPS/CHECK/THREAD` code, where an exception context is temporarily needed but we will guarantee that all exceptions will be handled.
I modified `SharedRuntime::monitor_exit_helper()` to use this pattern: >> >> Old style: >> >> void a_func_that_never_throws() { >> EXCEPTION_MARK; >> a_func_that_could_throw(THREAD); >> if (HAS_PENDING_EXCEPTION) { >> // handle it >> CLEAR_PENDING_EXCEPTION; >> } >> } >> >> New style: >> >> void a_func_that_never_throws(Thread* current) { // pass thread to avoid calling Thread::current() >> ExceptionMark em(current); >> Thread* THREAD = current; // For exception macros. >> a_func_that_could_throw(THREAD); >> if (HAS_PENDING_EXCEPTION) { >> // handle it >> CLEAR_PENDING_EXCEPTION; >> } >> } > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Merge branch 'master' into 8263392-ExceptionMark-with-thread > - removed THREAD declaration because ObjectSynchronizer::exit no longer needs it since JDK-8262910 > - fixed build > - 8263392: Allow current thread to be specified in ExceptionMark This change is good. ------------- Marked as reviewed by minqi (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2950 From rkennke at openjdk.java.net Mon Mar 15 19:26:34 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Mon, 15 Mar 2021 19:26:34 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v5] In-Reply-To: References: Message-ID: > We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs. > > I believe this might be the root cause for JDK-8262852. 
> > The way out of it is to change conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress. > > There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. But conc-weakroots-in-progress is turned-off concurrently. I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change. > > This stuff makes the verifier unhappy, because it doesn't know about the new bit. And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?) > > Testing: > - [x] New testcase failed without change, passes now > - [x] hotspot_gc_shenandoah > - [ ] tier1 (+Shenandoah) Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Remove unused code. Move ShenandoahRendezvousClosure close to where it's used.
- Rename 'disable weak roots' -> 'final roots' everywhere ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2945/files - new: https://git.openjdk.java.net/jdk/pull/2945/files/8c103236..5abeece7 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2945&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2945&range=03-04 Stats: 87 lines in 13 files changed: 7 ins; 57 del; 23 mod Patch: https://git.openjdk.java.net/jdk/pull/2945.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2945/head:pull/2945 PR: https://git.openjdk.java.net/jdk/pull/2945 From manc at openjdk.java.net Mon Mar 15 19:53:09 2021 From: manc at openjdk.java.net (Man Cao) Date: Mon, 15 Mar 2021 19:53:09 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation In-Reply-To: References: <9IfAIpqnmfePZeYWjSTS1q_tbYLLZmWFAYW5yHh5YNE=.17034449-5017-42af-95f3-48886b98fa6e@github.com> Message-ID: <1BTERLsAv4AI0JhJLez77jWAskN5V1ghMK4ut_84C1w=.9efdac5f-24e6-4a14-a467-55b0b964612a@github.com> On Sun, 14 Mar 2021 00:53:03 GMT, Kim Barrett wrote: >> src/hotspot/share/utilities/lockFreeQueue.hpp line 99: >> >>> 97: } else { >>> 98: assert(get_next(*old_tail) == NULL, "invariant"); >>> 99: Atomic::store(next_ptr(*old_tail), &first); >> >> I changed this store from a normal store to an Atomic store. Otherwise there is a data race between this store and the load of _head->_next in pop(). > > As David said, Atomic::store doesn't indicate any ordering; it's a relaxed atomic store. The old code was `old_tail->set_next(&first)`, which was hard-wired to the element type being BufferNode (which was okay in its place). But the BufferNode code predates consistent use of Atomic when accessing atomic/volatile data, so don't currently use Atomic::load/store. In this generic hoist we no longer want to assume next and set_next functions, just the next_ptr function that returns a pointer to atomic/volatile. 
I see that you added get_next, which uses Atomic::load; I think there should be an associated set_next, as there are other places that need it. Thanks for the reviews and feedback. I will address the other comments soon. For this issue, agreed that I misused the term "data race". It is an inconsistent use of Atomic APIs but not really a data race. I will use set_next() here. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From manc at openjdk.java.net Mon Mar 15 20:06:09 2021 From: manc at openjdk.java.net (Man Cao) Date: Mon, 15 Mar 2021 20:06:09 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 10:41:44 GMT, Man Cao wrote: > Hi all, > > Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. > > The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid the ABA problem. > > -Man src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 60: > 58: LockFreeQueueCriticalSection cs(current_thread); > 59: > 60: T* result = Atomic::load_acquire(&_head); A related question about memory ordering. Are these two load_acquire() really necessary? They are not paired with any release_store(). I think they can be normal Atomic::load(), as append() and pop() already have Atomic::xchg() and Atomic::cmpxchg() to enforce ordering. 
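For readers following the ordering question above, here is a minimal sketch of what "pairing" an acquire load with a release store means. It uses standard C++ `std::atomic` rather than HotSpot's `Atomic` API, and the names `Node`, `Slot`, `publish` and `consume_or` are hypothetical, chosen only for illustration. It does not settle whether the `load_acquire()` calls in `LockFreeQueue` are needed; it only shows the pattern the question refers to.

```cpp
#include <atomic>

// Hypothetical payload type, for illustration only.
struct Node {
  int value;
};

// A single publication slot shared between a producer and a consumer.
struct Slot {
  std::atomic<Node*> head{nullptr};

  // Producer side: write the payload first, then publish the pointer with
  // release ordering. Any thread whose acquire load observes this pointer
  // is guaranteed to also observe the earlier write to n->value.
  void publish(Node* n, int v) {
    n->value = v;
    head.store(n, std::memory_order_release);
  }

  // Consumer side: the acquire load pairs with the release store above.
  // Returns the fallback when nothing has been published yet.
  int consume_or(int fallback) const {
    Node* n = head.load(std::memory_order_acquire);
    return n != nullptr ? n->value : fallback;
  }
};
```

If the pointer were instead published by a read-modify-write operation that already carries sufficient ordering, a relaxed load on the consumer side might be enough; whether that holds for a given queue depends on the ordering the surrounding operations actually provide.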
------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From manc at openjdk.java.net Mon Mar 15 20:06:10 2021 From: manc at openjdk.java.net (Man Cao) Date: Mon, 15 Mar 2021 20:06:10 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation In-Reply-To: References: Message-ID: On Sun, 14 Mar 2021 02:25:34 GMT, Kim Barrett wrote: >> Hi all, >> >> Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. >> >> The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid the ABA problem. >> >> -Man > > src/hotspot/share/utilities/lockFreeQueue.hpp line 50: > >> 48: NONCOPYABLE(LockFreeQueue); >> 49: >> 50: protected: > > Protected members (to be accessible from a derived class) are inconsistent with a public non-virtual destructor (that may allow destructor slicing). I dislike classes that try to be both concrete implementation classes and base classes; they are hard to design well (and this class wasn't intended to be such). This was done to allow G1DirtyCardQueueSet to extend it with the `take_all` function; that seems like a useful operation in the generic form, even if it can't be made thread-safe. (The G1 function asserts_at_safepoint(), but that's not really appropriate for a generic form.) To clarify, do we want to move take_all() to LockFreeQueue, remove assert_at_safepoint() and put a big warning that it is not thread safe? It perhaps also involves adding the "struct HeadTail" to this class. Another option is to provide two getter methods that return T** for &_head and &_tail. They are not thread-safe either, but maybe more generic than take_all(). 
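To make the first option above concrete, here is a rough sketch of a `take_all()` that hands back the whole list as a head/tail pair. This is not the HotSpot `LockFreeQueue`: `SimpleQueue`, `Node` and this `HeadTail` are hypothetical stand-ins, and the queue deliberately uses plain pointers with no atomics, so it only illustrates the shape of the interface and the kind of warning such a function would need.

```cpp
// Hypothetical intrusive list node, for illustration only.
struct Node {
  int value;
  Node* next;
  explicit Node(int v) : value(v), next(nullptr) {}
};

// Hypothetical stand-in for the "struct HeadTail" mentioned above.
struct HeadTail {
  Node* head;
  Node* tail;
};

// Deliberately simplified: plain pointers, no atomics, single-threaded use.
class SimpleQueue {
  Node* _head = nullptr;
  Node* _tail = nullptr;

public:
  // Append a node at the tail of the queue.
  void append(Node& n) {
    n.next = nullptr;
    if (_tail == nullptr) {
      _head = &n;
    } else {
      _tail->next = &n;
    }
    _tail = &n;
  }

  // WARNING: not thread-safe. The caller must guarantee that no other
  // thread is appending or popping while this runs.
  // Detaches the entire list and leaves the queue empty.
  HeadTail take_all() {
    HeadTail result{_head, _tail};
    _head = nullptr;
    _tail = nullptr;
    return result;
  }

  bool empty() const { return _head == nullptr; }
};
```

Returning both ends in one call keeps the detach step in a single place, whereas separate getters for `&_head` and `&_tail` would leave it to each caller to reset the two fields consistently.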
------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From sjohanss at openjdk.java.net Mon Mar 15 21:25:10 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Mon, 15 Mar 2021 21:25:10 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Mon, 15 Mar 2021 12:24:02 GMT, Stefan Johansson wrote: >> @tschatzl @Hamlin-Li >> Would it be ok to set the live estimate to the `used()` value at the end of `G1FullCollector::phase4_do_compaction()` method to have something suboptimal but working and refine in https://github.com/openjdk/jdk/pull/2760 (or a subsequent ticket/PR once both parts are ready)? > > Sorry for being a bit late to the party. Looking at the suggested implementation for G1 I see a problem with only updating this after concurrent mark (and the Full GC). Say for example you have a concurrent mark cycle before the heap has expanded a lot and you get a low value stored in `G1CollectedHeap::_live`. Then the heap expands and your application get to a steady state that doesn't require any more marking cycles. In this case the same value will be reported for the entire run. > > For this to work the _live value would have to be updated at every GC, but this is a bit costly. Maybe the first version could just use `used()` for G1. Have you done any tests to see how off this would be compared to the other GCs? > @kstefanj > > > Then the heap expands and your application get to a steady state that doesn't require any more marking cycles. > > Is there a way to get the liveness info when the heap expands? If not that would mean we had no way to figure out the new live set size and would assume, conservatively, the last known value. > > As I mentioned in the PR description the live size value will be a 'best effort estimate' depending on what can each particular GC provide. 
Sure, and this is fair, my concern is just that this 'best effort estimate' for G1 will often be worse than just using `used()`. This is not only a problem for when the heap expands, that was just an example, the live value will become more and more stale the longer an application run without triggering a new concurrent cycle. For the liveness value to be useful it would have to be updated at each GC, and we need to investigate further to see how we can do that in a "cheap" way. I would prefer if G1 did just return `used()` in `live()` as a start and we can create a follow-up task to investigate how to best add a better estimate. Do you see any problem with this? ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From coleenp at openjdk.java.net Mon Mar 15 21:31:08 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 15 Mar 2021 21:31:08 GMT Subject: RFR: 8263589: Introduce JavaValue::get_oop/set_oop In-Reply-To: References: Message-ID: <73KvQ-wIvi6mX-538ngdN0mHKFs30QI7Y57TFHsfRLM=.89d76b9a-1b3d-401c-a942-86776321c1f9@github.com> On Mon, 15 Mar 2021 12:35:47 GMT, Stefan Karlsson wrote: > JavaValue is a small wrapper class that wraps values used to pass arguments and results between native and Java. > > When JavaCalls::call returns an object, the value stored in the JavaValue is not a handliezed jobject. Instead it's a raw oop. So, most of the code handling the `result`, fetches the result as a jobject, and then immediately casts it to an oop. For example: > oop res = (oop)result.get_jobject(); > > I'd like to change this code to be: > oop res = result.get_oop(); > > The motivations for this patch is: > > 1) Minimize the places where we pass around oops in jobject variables. Maybe at some point we'll have converted the JVM to only use the jobject type when passing around JNI handle. We need to be stricter with the types when we continue develop our GCs and their barriers. 
> > 2) Limit the number of places in the code where we perform raw oop casts. We have a helper cast function for that, cast_to_oop, but not all code use it. I have future patches where the compiler will completely forbid raw cast to oops (in fastdebug builds). With that in place, I can then add more stricter oop verification code when oops are created. This helps catching bugs earlier. > > --- > > When reviewing this patch, take an extra look at the change to oopsHierarchy.hpp. This was done to support jvmciEnv.cpp code: > JVMCIObject wrap(oop obj)... > JVMCIObjectArray wrap(objArrayOop obj)... > JVMCIPrimitiveArray wrap(typeArrayOop obj) ... > Previously, `wrap((oop)result.get_jobject())` called the first function. When the code was changed to `wrap(result.get_oop())`, where `get_oop()` returns a `oopDesc*`, the compiler didn't know what conversion in oopsHierarchy.hpp to use. Therefore, I replaced the overly permissive `void*` constructor with a constructor that only takes the corresponding `type##OopDesc*`. > > An alternative would be to let get_oop() return an oop, but then that would add an unwanted a dependency between globalDefinitions.hpp and oopsHierarchy.hpp. An earlier version of this patch did return an oop instead of oopDesc*, but it also moved entire JavaValue class out of globalDefinitions.hpp into a new javaValue.hpp file, and had a corresponding javaValue.inline.hpp file. > > Even if we end up using the proposed `oopDesc* get_oop()` version, maybe moving the class to javaValues.hpp would still makes sense? This change looks really good to me. I have no objection to oopDesc* in JavaCallValue. We use oopDesc* in all places where the class oop would interfere with values passed between Java and the vm. src/hotspot/share/utilities/globalDefinitions.hpp line 809: > 807: jint i; > 808: jlong l; > 809: jobject h; Do we still need jobject after this change? ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3013 From dcubed at openjdk.java.net Mon Mar 15 21:36:10 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 15 Mar 2021 21:36:10 GMT Subject: RFR: 8261916: gtest/GTestWrapper.java vmErrorTest.unimplemented1_vm_assert failed In-Reply-To: <4ZvPchEPt95dq3xyNPVFmfRRyPF3jVfbwY0HyfOJIFI=.0cbe5444-0b5e-438a-86ec-c4f3006b3676@github.com> References: <4ZvPchEPt95dq3xyNPVFmfRRyPF3jVfbwY0HyfOJIFI=.0cbe5444-0b5e-438a-86ec-c4f3006b3676@github.com> Message-ID: <0alNJ7v_vgvdW4lcTlJqUGOiStti0oIPRS6yN615x5Q=.56504516-1626-4267-956e-f132ead3c4bc@github.com> On Mon, 15 Mar 2021 06:26:23 GMT, David Holmes wrote: > Please see bug report for gory details. > > For the specific issue here of the vm_assert gtests I propose to make two changes to the VM: > > 1. When core dumps are disabled, os::abort should call ::_exit not ::exit, as the former more closely models the abrupt termination of ::abort() but without the core dump. > > 2. The race condition when SupressFatalErrorMessages is true is fixed by placing the check after the atomic set/check of the thread-id. That way only a single thread can trigger the fatal error processing. > > I was debating whether to make a slight change so that even when SuppressFatalErrorMessage is true, secondary failures will report that such an error occurred but not with any details. But I've left the existing silence for now. It is possible someone uses the flag to hide a message I would like to expose. I suppose adding additional output in debug builds only may be an option - options welcomed. > > Testing: > - fully manual > > I manually set up the conditions where a background thread could crash due to the atexit actions executing. I added special debug code to show what was happening in such cases, and that secondary errors were occurring. I then applied the fix for #2 and saw the second thread getting caught; then I applied fix #1 and the secondary crashes were gone. 
> > Also did tier 1-3 testing and local gtest testing just to sanity check things. > > Thanks, > David Thumbs up. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3002 From coleenp at openjdk.java.net Mon Mar 15 21:36:10 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 15 Mar 2021 21:36:10 GMT Subject: RFR: 8263589: Introduce JavaValue::get_oop/set_oop In-Reply-To: <73KvQ-wIvi6mX-538ngdN0mHKFs30QI7Y57TFHsfRLM=.89d76b9a-1b3d-401c-a942-86776321c1f9@github.com> References: <73KvQ-wIvi6mX-538ngdN0mHKFs30QI7Y57TFHsfRLM=.89d76b9a-1b3d-401c-a942-86776321c1f9@github.com> Message-ID: On Mon, 15 Mar 2021 21:27:54 GMT, Coleen Phillimore wrote: >> JavaValue is a small wrapper class that wraps values used to pass arguments and results between native and Java. >> >> When JavaCalls::call returns an object, the value stored in the JavaValue is not a handliezed jobject. Instead it's a raw oop. So, most of the code handling the `result`, fetches the result as a jobject, and then immediately casts it to an oop. For example: >> oop res = (oop)result.get_jobject(); >> >> I'd like to change this code to be: >> oop res = result.get_oop(); >> >> The motivations for this patch is: >> >> 1) Minimize the places where we pass around oops in jobject variables. Maybe at some point we'll have converted the JVM to only use the jobject type when passing around JNI handle. We need to be stricter with the types when we continue develop our GCs and their barriers. >> >> 2) Limit the number of places in the code where we perform raw oop casts. We have a helper cast function for that, cast_to_oop, but not all code use it. I have future patches where the compiler will completely forbid raw cast to oops (in fastdebug builds). With that in place, I can then add more stricter oop verification code when oops are created. This helps catching bugs earlier. 
>> >> --- >> >> When reviewing this patch, take an extra look at the change to oopsHierarchy.hpp. This was done to support jvmciEnv.cpp code: >> JVMCIObject wrap(oop obj)... >> JVMCIObjectArray wrap(objArrayOop obj)... >> JVMCIPrimitiveArray wrap(typeArrayOop obj) ... >> Previously, `wrap((oop)result.get_jobject())` called the first function. When the code was changed to `wrap(result.get_oop())`, where `get_oop()` returns a `oopDesc*`, the compiler didn't know what conversion in oopsHierarchy.hpp to use. Therefore, I replaced the overly permissive `void*` constructor with a constructor that only takes the corresponding `type##OopDesc*`. >> >> An alternative would be to let get_oop() return an oop, but then that would add an unwanted a dependency between globalDefinitions.hpp and oopsHierarchy.hpp. An earlier version of this patch did return an oop instead of oopDesc*, but it also moved entire JavaValue class out of globalDefinitions.hpp into a new javaValue.hpp file, and had a corresponding javaValue.inline.hpp file. >> >> Even if we end up using the proposed `oopDesc* get_oop()` version, maybe moving the class to javaValues.hpp would still makes sense? > > This change looks really good to me. I have no objection to oopDesc* in JavaCallValue. We use oopDesc* in all places where the class oop would interfere with values passed between Java and the vm. Replacing the overly permissive void* oop conversion operator seems like a good effect of this change. I don't see the necessity of adding a new javaValues.hpp file for this change. 
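The overload issue described in the quoted text can be sketched in isolation. The types below are simplified, hypothetical stand-ins and not the real definitions from oopsHierarchy.hpp; the point is only that a converting constructor taking the specific `...Desc*` type leaves exactly one viable `wrap` overload for an `oopDesc*` argument, whereas a `void*` constructor on every wrapper would make each overload viable and the call ambiguous.

```cpp
#include <string>

// Simplified stand-ins for the descriptor hierarchy (hypothetical).
struct oopDesc {};
struct objArrayOopDesc : oopDesc {};

// Wrapper with a typed converting constructor instead of one taking void*.
class oop {
  oopDesc* _o;
public:
  oop(oopDesc* o) : _o(o) {}
  oopDesc* obj() const { return _o; }
};

class objArrayOop : public oop {
public:
  // Accepts only objArrayOopDesc*, so a plain oopDesc* cannot convert here.
  objArrayOop(objArrayOopDesc* o) : oop(o) {}
};

// Overloads in the style of the wrap() functions quoted above.
std::string wrap(oop) { return "oop"; }
std::string wrap(objArrayOop) { return "objArrayOop"; }

// With typed constructors only wrap(oop) is viable for an oopDesc* argument.
// If both wrappers took void*, both overloads would be viable through
// different user-defined conversions and the call would not compile.
std::string wrap_raw(oopDesc* raw) {
  return wrap(raw);
}
```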
------------- PR: https://git.openjdk.java.net/jdk/pull/3013 From dcubed at openjdk.java.net Mon Mar 15 22:08:08 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Mon, 15 Mar 2021 22:08:08 GMT Subject: RFR: 8261262: Kitchensink24HStress.java crashed with EXCEPTION_ACCESS_VIOLATION In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 11:48:38 GMT, Robbin Ehn wrote: > When returning from the last Java frame back to vm and hitting a safepoint poll on that last return we sometimes have a last java frame but no vframe. > This seem to be a bug in itself, handled in: 8263576 > > Other places which uses vframe NULL checks it before, so let's do that in GetCurrentLocationClosure also. > > Testing: nsk jdi/jvmti, jdk jdi, jck vm and t1-3. Thumbs up! I agree that the code should have checked for "if (vf != NULL) {" instead of asserting that "(vf != NULL)". src/hotspot/share/prims/jvmtiEnvThreadState.cpp line 263: > 261: // There can be a race condition between a handshake > 262: // and the target thread exiting from Java execution. > 263: // We must recheck the last Java frame still exists. Typo: s/recheck the last/recheck that the last/ (not your typo, but since you're in there...) src/hotspot/share/prims/jvmtiEnvThreadState.cpp line 266: > 264: if (!jt->is_exiting() && jt->has_last_Java_frame()) { > 265: javaVFrame* vf = jt->last_java_vframe(&rm); > 266: assert(vf != NULL, "must have last java frame"); The code before we converted to handshakes also had this assert. The pre-handshake code did the work in the doit() function for the VM_GetCurrentLocation VM-op. This makes me wonder if we always had frames here when this was previously done via VM-op? And that makes me wonder whether handshakes is doing something different so we don't always have a frame here? ------------- Marked as reviewed by dcubed (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3010 From iklam at openjdk.java.net Mon Mar 15 22:17:10 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 15 Mar 2021 22:17:10 GMT Subject: RFR: 8263562: Checking if proxy_klass_head is still lambda_proxy_is_available In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 02:17:36 GMT, Yi Yang wrote: > The `Shared Lambda Dictionary` section in the result of SharedLambdaDictionaryPrinter will mix normal klasses with lambda proxy klasses. Using the following commands can reproduce it: > > Proc1: `./jshell` > Proc2: `jcmd VM.systemdictionary -verbose` > > When all archived lambda proxy classes are used, proxy_klass_head(in RunTimeLambdaProxyClassInfo) is still referring to an instance klass that is no longer lambda_proxy_is_available, and its next_link will be set by classloader to link another normal class. Simply checking if proxy_klass_head is lambda_proxy_is_available can solve this problem. > > Best regards, > Yang Marked as reviewed by iklam (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3001 From github.com+168222+mgkwill at openjdk.java.net Mon Mar 15 22:21:13 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Mon, 15 Mar 2021 22:21:13 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v19] In-Reply-To: References: Message-ID: <-pcw5i2kUzZy4bk7rgc4vzdrH3bPof0KfUacc1wVKGk=.60eafc5e-7c72-464c-bf90-3512cab2f49d@github.com> On Mon, 15 Mar 2021 15:42:15 GMT, Marcus G K Williams wrote: >> Hi Markus, >> >> first off, starting to look better and better. 
>> >> About the assert, I'm quite sure this is the result of using `os::page_size_for_region_aligned()` - as Stefan mentioned above you should use the unaligned version of this function since the input size 21098496 is not aligned to 2M, which causes the function to return 4K even though we have space enough to fit 9-10 2M pages here: >> >> [0.825s][info][pagesize] Large page size returned from os::page_size_for_region_aligned: 4096, for bytes: 21098496 >> >> Sorry, I don't have time to dive deeper. I know @kstefanj had some more ideas and this ties in with his work, but he may be occupied with other things right now and may not be quick to reply. >> >> All in all this starts to look real good. I try to resist the urge to refactor everything on the back of this PR. There will be work left for future RFEs. >> >> Cheers, Thomas > >> Hi Markus, >> >> first off, starting to look better and better. >> >> About the assert, I'm quite sure this is the result of using `os::page_size_for_region_aligned()` - as Stefan mentioned above you should use the unaligned version of this function since the input size 21098496 is not aligned to 2M, which causes the function to return 4K even though we have space enough to fit 9-10 2M pages here: >> >> ``` >> [0.825s][info][pagesize] Large page size returned from os::page_size_for_region_aligned: 4096, for bytes: 21098496 >> ``` >> >> Sorry, I don't have time to dive deeper. I know @kstefanj had some more ideas and this ties in with his work, but he may be occupied with other things right now and may not be quick to reply. >> >> All in all this starts to look real good. I try to resist the urge to refactor everything on the back of this PR. There will be work left for future RFEs. >> >> Cheers, Thomas > > Thanks @tstuefe Thomas. > > Changed to os::page_size_for_region_unaligned at os::Linux::reserve_memory_special_huge_tlbfs_mixed. This helped. Working on new patch now. > > I'd really like to resist too much refactoring here also. 
I believe @kstefanj is looking at a more in-depth refactor. > > Thanks, > Marcus 2 Issues: - Page sizes reported in higher level functions can differ from the page_size actually present. In my current setup 1G large pages are present and os::page_size_for_region_unaligned picks them and uses them even though 2M pages are set in large_page_init. Should we force large_pages to the largest page configured instead of using largest available? Or should we update how we calculate the large page used? Even though `Large page size returned from os::page_size_for_region_unaligned: 1073741824, for bytes: 16206790656` and 1G large pages padded with 4k pages are used, trace_page_sizes reports `Heap: min=8M max=15456M base=0x000000043a000000 page_size=2M size=15456M` [0.022s][info][pagesize] os::Linux::reserve_memory_special_huge_tlbfs_mixed [0.022s][info][pagesize] Large page size returned from os::page_size_for_region_unaligned: 1073741824, for bytes: 16206790656 [0.022s][info][pagesize] alignment: 8388608 [0.022s][info][pagesize] Page size returned from (size_t)os::vm_page_size(): 4096 [0.022s][info][pagesize] req_addr: 0x000000043a000000 [0.022s][info][pagesize] Memory: 4k page, physical 131844416k(50816280k free), swap 0k(0k free) [0.022s][info][pagesize] 2048k default large page [0.022s][info][pagesize] Page Sizes: 4k, 2M, 1G [0.022s][info][pagesize] start addr: 0x000000043a000000 [0.022s][info][pagesize] LP start addr: 0x0000000440000000 [0.022s][info][pagesize] LP end addr: 0x0000000800000000 [0.022s][info][pagesize] end addr: 0x0000000800000000 [0.022s][info][pagesize] Heap: min=8M max=15456M base=0x000000043a000000 page_size=2M size=15456M src/hotspot/share/memory/virtualspace.cpp at 286 size_t ReservedSpace::actual_reserved_page_size(const ReservedSpace& rs) { size_t page_size = os::vm_page_size(); if (UseLargePages) { // There are two ways to manage large page memory. // 1. OS supports committing large page memory. // 2. 
OS doesn't support committing large page memory so ReservedSpace manages it. // And ReservedSpace calls it 'special'. If we failed to set 'special', // we reserved memory without large page. if (os::can_commit_large_page_memory() || rs.special()) { // An alignment at ReservedSpace comes from preferred page size or // heap alignment, and if the alignment came from heap alignment, it could be // larger than large pages size. So need to cap with the large page size. page_size = MIN2(rs.alignment(), os::large_page_size()); } } - I don't think that test/hotspot/jtreg/runtime/os/TestTracePageSizes.java takes into account mixed large page reservation using os::Linux::reserve_memory_special_huge_tlbfs_mixed where large pages are padded at the start and end with small pages. It seems to me that the test needs to be updated. Test Log Output with Debug Added range: [43a400000, 440000000) pageSize=4KB isTHP=false isHUGETLB=false Added range: [440000000, 800000000) pageSize=1048576KB isTHP=false isHUGETLB=true ... 
[0.019s][info][pagesize] Heap: min=8M max=15452M base=0x000000043a400000 page_size=2M size=15452M >From smaps: [43a400000, 440000000) pageSize=4KB isTHP=false isHUGETLB=false Failure: 4 != 2048 Extra Debug Logging: [0.022s][info][pagesize] os::Linux::reserve_memory_special_huge_tlbfs_mixed [0.022s][info][pagesize] Large page size returned from os::page_size_for_region_unaligned: 1073741824, for bytes: 16206790656 [0.022s][info][pagesize] start addr: 0x000000043a000000 [0.022s][info][pagesize] LP start addr: 0x0000000440000000 [0.022s][info][pagesize] LP end addr: 0x0000000800000000 [0.022s][info][pagesize] end addr: 0x0000000800000000 [0.022s][info][pagesize] Heap: min=8M max=15456M base=0x000000043a000000 page_size=2M size=15456M Function where the compare of address and page size allocation: test/hotspot/jtreg/runtime/os/TestTracePageSizes.java at 195 if (pageSizeFromSmaps != pageSizeFromTrace) { if (pageSizeFromTrace > pageSizeFromSmaps && range.isTransparentHuge()) { // Page sizes mismatch because we can't know what underlying page size will // be used when THP is enabled. So this is not a failure. debug("Success: " + pageSizeFromTrace + " > " + pageSizeFromSmaps + " and THP enabled"); } else { debug("Failure: " + pageSizeFromSmaps + " != " + pageSizeFromTrace); throw new AssertionError("Page sizes mismatch: " + pageSizeFromSmaps + " != " + pageSizeFromTrace); } } else { debug("Success: " + pageSizeFromSmaps + " == " + pageSizeFromTrace); } ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From iklam at openjdk.java.net Mon Mar 15 22:23:10 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 15 Mar 2021 22:23:10 GMT Subject: RFR: 8263392: Allow current thread to be specified in ExceptionMark [v2] In-Reply-To: References: Message-ID: On Fri, 12 Mar 2021 06:02:49 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: >> >> - Merge branch 'master' into 8263392-ExceptionMark-with-thread >> - removed THREAD declaration because ObjectSynchronizer::exit no longer needs it since JDK-8262910 >> - fixed build >> - 8263392: Allow current thread to be specified in ExceptionMark > > Looks good! Thanks for doing this cleanup. > > David Thanks @dholmes-ora @calvinccheung @coleenp @yminqi for the review. ------------- PR: https://git.openjdk.java.net/jdk/pull/2950 From iklam at openjdk.java.net Mon Mar 15 22:23:11 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 15 Mar 2021 22:23:11 GMT Subject: Integrated: 8263392: Allow current thread to be specified in ExceptionMark In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 20:36:41 GMT, Ioi Lam wrote: > `ExceptionMark`, usually used via the `EXCEPTION_MARK` marco, is used to guarantee that an exception is not thrown within a block of code. I made two changes to improve efficiency: > > - Avoid calling `Thread::current()` if a thread object is already available. > - Avoid passing a reference to the `ExceptionMark` constructor. This helps C++ generate slightly better code. > > This new variant of `ExceptionMark` is mainly intended for future clean up of `TRAPS/CHECK/THREAD` code, where an exception context is temporarily needed but we will guarantee that all exceptions will be handled. I modified `SharedRuntime::monitor_exit_helper()` to use this pattern: > > Old style: > > void a_func_that_never_throws() { > EXCEPTION_MARK; > a_func_that_could_throw(THREAD); > if (HAS_PENDING_EXCEPTION) { > // handle it > CLEAR_PENDING_EXCEPTION; > } > } > > New style: > > void a_func_that_never_throws(Thread* current) { // pass thread to avoid calling Thread::current() > ExceptionMark em(current); > Thread* THREAD = current; // For exception macros. 
> a_func_that_could_throw(THREAD); > if (HAS_PENDING_EXCEPTION) { > // handle it > CLEAR_PENDING_EXCEPTION; > } > } This pull request has now been integrated. Changeset: 1e570870 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/1e570870 Stats: 34 lines in 8 files changed: 16 ins; 3 del; 15 mod 8263392: Allow current thread to be specified in ExceptionMark Reviewed-by: dholmes, ccheung, coleenp, minqi ------------- PR: https://git.openjdk.java.net/jdk/pull/2950 From sspitsyn at openjdk.java.net Mon Mar 15 22:23:08 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Mon, 15 Mar 2021 22:23:08 GMT Subject: RFR: 8261262: Kitchensink24HStress.java crashed with EXCEPTION_ACCESS_VIOLATION In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 11:48:38 GMT, Robbin Ehn wrote: > When returning from the last Java frame back to vm and hitting a safepoint poll on that last return we sometimes have a last java frame but no vframe. > This seem to be a bug in itself, handled in: 8263576 > > Other places which uses vframe NULL checks it before, so let's do that in GetCurrentLocationClosure also. > > Testing: nsk jdi/jvmti, jdk jdi, jck vm and t1-3. Robbin, The fix looks good to me. Thank you for taking care about this issue! Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3010 From stefank at openjdk.java.net Mon Mar 15 22:24:14 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 15 Mar 2021 22:24:14 GMT Subject: RFR: 8263589: Introduce JavaValue::get_oop/set_oop In-Reply-To: <73KvQ-wIvi6mX-538ngdN0mHKFs30QI7Y57TFHsfRLM=.89d76b9a-1b3d-401c-a942-86776321c1f9@github.com> References: <73KvQ-wIvi6mX-538ngdN0mHKFs30QI7Y57TFHsfRLM=.89d76b9a-1b3d-401c-a942-86776321c1f9@github.com> Message-ID: On Mon, 15 Mar 2021 21:25:30 GMT, Coleen Phillimore wrote: >> JavaValue is a small wrapper class that wraps values used to pass arguments and results between native and Java. 
>> >> When JavaCalls::call returns an object, the value stored in the JavaValue is not a handliezed jobject. Instead it's a raw oop. So, most of the code handling the `result`, fetches the result as a jobject, and then immediately casts it to an oop. For example: >> oop res = (oop)result.get_jobject(); >> >> I'd like to change this code to be: >> oop res = result.get_oop(); >> >> The motivations for this patch is: >> >> 1) Minimize the places where we pass around oops in jobject variables. Maybe at some point we'll have converted the JVM to only use the jobject type when passing around JNI handle. We need to be stricter with the types when we continue develop our GCs and their barriers. >> >> 2) Limit the number of places in the code where we perform raw oop casts. We have a helper cast function for that, cast_to_oop, but not all code use it. I have future patches where the compiler will completely forbid raw cast to oops (in fastdebug builds). With that in place, I can then add more stricter oop verification code when oops are created. This helps catching bugs earlier. >> >> --- >> >> When reviewing this patch, take an extra look at the change to oopsHierarchy.hpp. This was done to support jvmciEnv.cpp code: >> JVMCIObject wrap(oop obj)... >> JVMCIObjectArray wrap(objArrayOop obj)... >> JVMCIPrimitiveArray wrap(typeArrayOop obj) ... >> Previously, `wrap((oop)result.get_jobject())` called the first function. When the code was changed to `wrap(result.get_oop())`, where `get_oop()` returns a `oopDesc*`, the compiler didn't know what conversion in oopsHierarchy.hpp to use. Therefore, I replaced the overly permissive `void*` constructor with a constructor that only takes the corresponding `type##OopDesc*`. >> >> An alternative would be to let get_oop() return an oop, but then that would add an unwanted a dependency between globalDefinitions.hpp and oopsHierarchy.hpp. 
An earlier version of this patch did return an oop instead of oopDesc*, but it also moved entire JavaValue class out of globalDefinitions.hpp into a new javaValue.hpp file, and had a corresponding javaValue.inline.hpp file. >> >> Even if we end up using the proposed `oopDesc* get_oop()` version, maybe moving the class to javaValues.hpp would still makes sense? > > src/hotspot/share/utilities/globalDefinitions.hpp line 809: > >> 807: jint i; >> 808: jlong l; >> 809: jobject h; > > Do we still need jobject after this change? We still use jobject for the arguments. We also converted the result to a jobject when it is passed back to Java. We have a few places in JFR where the code seems to take a detour and fetch the oop from the result JavaValue, create a JNI handle, puts it back with set_jobject, and then reads it out with get_jobject. I have a patch where the code simply returns the created JNI handle without installing it into the result JavaValue. I've left that as a separate patch for the JFR team to review. There are some usages in C1, but I haven't tried to figure that out. ------------- PR: https://git.openjdk.java.net/jdk/pull/3013 From coleenp at openjdk.java.net Mon Mar 15 22:49:09 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 15 Mar 2021 22:49:09 GMT Subject: RFR: 8261916: gtest/GTestWrapper.java vmErrorTest.unimplemented1_vm_assert failed In-Reply-To: <4ZvPchEPt95dq3xyNPVFmfRRyPF3jVfbwY0HyfOJIFI=.0cbe5444-0b5e-438a-86ec-c4f3006b3676@github.com> References: <4ZvPchEPt95dq3xyNPVFmfRRyPF3jVfbwY0HyfOJIFI=.0cbe5444-0b5e-438a-86ec-c4f3006b3676@github.com> Message-ID: <51RelsuBlMEd0M5ZPhffJjUDg13UxfW6ScU-KLEpuWg=.3d28ae21-eff9-4306-98f7-402ba998978b@github.com> On Mon, 15 Mar 2021 06:26:23 GMT, David Holmes wrote: > Please see bug report for gory details. > > For the specific issue here of the vm_assert gtests I propose to make two changes to the VM: > > 1. 
When core dumps are disabled, os::abort should call ::_exit not ::exit, as the former more closely models the abrupt termination of ::abort() but without the core dump. > > 2. The race condition when SupressFatalErrorMessages is true is fixed by placing the check after the atomic set/check of the thread-id. That way only a single thread can trigger the fatal error processing. > > I was debating whether to make a slight change so that even when SuppressFatalErrorMessage is true, secondary failures will report that such an error occurred but not with any details. But I've left the existing silence for now. It is possible someone uses the flag to hide a message I would like to expose. I suppose adding additional output in debug builds only may be an option - options welcomed. > > Testing: > - fully manual > > I manually set up the conditions where a background thread could crash due to the atexit actions executing. I added special debug code to show what was happening in such cases, and that secondary errors were occurring. I then applied the fix for #2 and saw the second thread getting caught; then I applied fix #1 and the secondary crashes were gone. > > Also did tier 1-3 testing and local gtest testing just to sanity check things. > > Thanks, > David This looks good. src/hotspot/share/utilities/vmError.cpp line 1455: > 1453: // If we already hit a secondary error during abort, then calling > 1454: // it again is likely to hit another one. But eventually, if we > 1455: // don't deadlock somewhere, we will call os::die() above. Should os::die call _exit or will ::abort kill the process? ie. are the comments in os::die corect? ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3002 From sspitsyn at openjdk.java.net Mon Mar 15 22:41:08 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Mon, 15 Mar 2021 22:41:08 GMT Subject: RFR: 8263589: Introduce JavaValue::get_oop/set_oop In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 12:35:47 GMT, Stefan Karlsson wrote: > JavaValue is a small wrapper class that wraps values used to pass arguments and results between native and Java. > > When JavaCalls::call returns an object, the value stored in the JavaValue is not a handliezed jobject. Instead it's a raw oop. So, most of the code handling the `result`, fetches the result as a jobject, and then immediately casts it to an oop. For example: > oop res = (oop)result.get_jobject(); > > I'd like to change this code to be: > oop res = result.get_oop(); > > The motivations for this patch is: > > 1) Minimize the places where we pass around oops in jobject variables. Maybe at some point we'll have converted the JVM to only use the jobject type when passing around JNI handle. We need to be stricter with the types when we continue develop our GCs and their barriers. > > 2) Limit the number of places in the code where we perform raw oop casts. We have a helper cast function for that, cast_to_oop, but not all code use it. I have future patches where the compiler will completely forbid raw cast to oops (in fastdebug builds). With that in place, I can then add more stricter oop verification code when oops are created. This helps catching bugs earlier. > > --- > > When reviewing this patch, take an extra look at the change to oopsHierarchy.hpp. This was done to support jvmciEnv.cpp code: > JVMCIObject wrap(oop obj)... > JVMCIObjectArray wrap(objArrayOop obj)... > JVMCIPrimitiveArray wrap(typeArrayOop obj) ... > Previously, `wrap((oop)result.get_jobject())` called the first function. 
When the code was changed to `wrap(result.get_oop())`, where `get_oop()` returns an `oopDesc*`, the compiler didn't know what conversion in oopsHierarchy.hpp to use. Therefore, I replaced the overly permissive `void*` constructor with a constructor that only takes the corresponding `type##OopDesc*`. > > An alternative would be to let get_oop() return an oop, but then that would add an unwanted dependency between globalDefinitions.hpp and oopsHierarchy.hpp. An earlier version of this patch did return an oop instead of oopDesc*, but it also moved the entire JavaValue class out of globalDefinitions.hpp into a new javaValue.hpp file, and had a corresponding javaValue.inline.hpp file. > > Even if we end up using the proposed `oopDesc* get_oop()` version, maybe moving the class to javaValues.hpp would still make sense? Stefan, It is a good move in general and it looks good to me. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3013 From dholmes at openjdk.java.net Mon Mar 15 23:14:08 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 15 Mar 2021 23:14:08 GMT Subject: RFR: 8261916: gtest/GTestWrapper.java vmErrorTest.unimplemented1_vm_assert failed In-Reply-To: <51RelsuBlMEd0M5ZPhffJjUDg13UxfW6ScU-KLEpuWg=.3d28ae21-eff9-4306-98f7-402ba998978b@github.com> References: <4ZvPchEPt95dq3xyNPVFmfRRyPF3jVfbwY0HyfOJIFI=.0cbe5444-0b5e-438a-86ec-c4f3006b3676@github.com> <51RelsuBlMEd0M5ZPhffJjUDg13UxfW6ScU-KLEpuWg=.3d28ae21-eff9-4306-98f7-402ba998978b@github.com> Message-ID: On Mon, 15 Mar 2021 22:45:29 GMT, Coleen Phillimore wrote: >> Please see bug report for gory details. >> >> For the specific issue here of the vm_assert gtests I propose to make two changes to the VM: >> >> 1. When core dumps are disabled, os::abort should call ::_exit not ::exit, as the former more closely models the abrupt termination of ::abort() but without the core dump. >> >> 2.
The race condition when SupressFatalErrorMessages is true is fixed by placing the check after the atomic set/check of the thread-id. That way only a single thread can trigger the fatal error processing. >> >> I was debating whether to make a slight change so that even when SuppressFatalErrorMessage is true, secondary failures will report that such an error occurred but not with any details. But I've left the existing silence for now. It is possible someone uses the flag to hide a message I would like to expose. I suppose adding additional output in debug builds only may be an option - options welcomed. >> >> Testing: >> - fully manual >> >> I manually set up the conditions where a background thread could crash due to the atexit actions executing. I added special debug code to show what was happening in such cases, and that secondary errors were occurring. I then applied the fix for #2 and saw the second thread getting caught; then I applied fix #1 and the secondary crashes were gone. >> >> Also did tier 1-3 testing and local gtest testing just to sanity check things. >> >> Thanks, >> David > > src/hotspot/share/utilities/vmError.cpp line 1455: > >> 1453: // If we already hit a secondary error during abort, then calling >> 1454: // it again is likely to hit another one. But eventually, if we >> 1455: // don't deadlock somewhere, we will call os::die() above. > > Should os::die call _exit or will ::abort kill the process? ie. are the comments in os::die corect? The comments in os::die() are correct. The abort() will kill the process (and generate a core dump). 
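The difference between `::exit` and `::_exit` that motivates change 1 in this thread can be seen outside the VM. The following is a minimal POSIX sketch, not HotSpot code: the helper names `report_handler_ran` and `atexit_handler_runs` are hypothetical, and the pipe is only an assumed way to observe the child from the parent. It forks a child that registers an `atexit` handler and then leaves via either `std::exit` or `_exit`.

```cpp
#include <cassert>
#include <cstdlib>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Write end of the pipe the child uses to report that its atexit
// handler ran (hypothetical helper state, not part of HotSpot).
static int g_report_fd = -1;

static void report_handler_ran() {
  const char mark = 'x';
  ssize_t ignored = write(g_report_fd, &mark, 1);
  (void)ignored;
}

// Forks a child that registers an atexit handler and terminates with
// either std::exit (runs atexit handlers) or _exit (skips them).
// Returns true if the parent observed the handler's byte on the pipe.
// For brevity, a failing pipe() or fork() also yields false.
bool atexit_handler_runs(bool use_plain_exit) {
  int fds[2];
  if (pipe(fds) != 0) return false;
  pid_t pid = fork();
  if (pid < 0) {
    close(fds[0]);
    close(fds[1]);
    return false;
  }
  if (pid == 0) {                 // child
    close(fds[0]);
    g_report_fd = fds[1];
    std::atexit(report_handler_ran);
    if (use_plain_exit) {
      std::exit(0);               // runs report_handler_ran first
    }
    _exit(0);                     // terminates without running handlers
  }
  close(fds[1]);                  // parent keeps only the read end
  char buf = 0;
  ssize_t n = read(fds[0], &buf, 1);  // 1 byte if handler ran, 0 at EOF
  close(fds[0]);
  int status = 0;
  waitpid(pid, &status, 0);
  return n == 1;
}
```

Calling `atexit_handler_runs(true)` should report that the handler ran, while `atexit_handler_runs(false)` should not. That is the property the proposal relies on: with `_exit`, no atexit actions execute while other threads are still running.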
------------- PR: https://git.openjdk.java.net/jdk/pull/3002 From dholmes at openjdk.java.net Mon Mar 15 23:14:06 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 15 Mar 2021 23:14:06 GMT Subject: RFR: 8261916: gtest/GTestWrapper.java vmErrorTest.unimplemented1_vm_assert failed In-Reply-To: <0alNJ7v_vgvdW4lcTlJqUGOiStti0oIPRS6yN615x5Q=.56504516-1626-4267-956e-f132ead3c4bc@github.com> References: <4ZvPchEPt95dq3xyNPVFmfRRyPF3jVfbwY0HyfOJIFI=.0cbe5444-0b5e-438a-86ec-c4f3006b3676@github.com> <0alNJ7v_vgvdW4lcTlJqUGOiStti0oIPRS6yN615x5Q=.56504516-1626-4267-956e-f132ead3c4bc@github.com> Message-ID: On Mon, 15 Mar 2021 21:32:56 GMT, Daniel D. Daugherty wrote: >> Please see bug report for gory details. >> >> For the specific issue here of the vm_assert gtests I propose to make two changes to the VM: >> >> 1. When core dumps are disabled, os::abort should call ::_exit not ::exit, as the former more closely models the abrupt termination of ::abort() but without the core dump. >> >> 2. The race condition when SupressFatalErrorMessages is true is fixed by placing the check after the atomic set/check of the thread-id. That way only a single thread can trigger the fatal error processing. >> >> I was debating whether to make a slight change so that even when SuppressFatalErrorMessage is true, secondary failures will report that such an error occurred but not with any details. But I've left the existing silence for now. It is possible someone uses the flag to hide a message I would like to expose. I suppose adding additional output in debug builds only may be an option - options welcomed. >> >> Testing: >> - fully manual >> >> I manually set up the conditions where a background thread could crash due to the atexit actions executing. I added special debug code to show what was happening in such cases, and that secondary errors were occurring. 
I then applied the fix for #2 and saw the second thread getting caught; then I applied fix #1 and the secondary crashes were gone. >> >> Also did tier 1-3 testing and local gtest testing just to sanity check things. >> >> Thanks, >> David > > Thumbs up. Thanks for the quick reviews @dcubed-ojdk and @coleenp ! ------------- PR: https://git.openjdk.java.net/jdk/pull/3002 From dholmes at openjdk.java.net Mon Mar 15 23:14:09 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 15 Mar 2021 23:14:09 GMT Subject: Integrated: 8261916: gtest/GTestWrapper.java vmErrorTest.unimplemented1_vm_assert failed In-Reply-To: <4ZvPchEPt95dq3xyNPVFmfRRyPF3jVfbwY0HyfOJIFI=.0cbe5444-0b5e-438a-86ec-c4f3006b3676@github.com> References: <4ZvPchEPt95dq3xyNPVFmfRRyPF3jVfbwY0HyfOJIFI=.0cbe5444-0b5e-438a-86ec-c4f3006b3676@github.com> Message-ID: <1kPAfwKSVUZAAGwtmb2ViniseXw-nlQEv-2o1Nn18oE=.d26eff98-e1e6-4cc6-af59-2543d0f9da75@github.com> On Mon, 15 Mar 2021 06:26:23 GMT, David Holmes wrote: > Please see bug report for gory details. > > For the specific issue here of the vm_assert gtests I propose to make two changes to the VM: > > 1. When core dumps are disabled, os::abort should call ::_exit not ::exit, as the former more closely models the abrupt termination of ::abort() but without the core dump. > > 2. The race condition when SupressFatalErrorMessages is true is fixed by placing the check after the atomic set/check of the thread-id. That way only a single thread can trigger the fatal error processing. > > I was debating whether to make a slight change so that even when SuppressFatalErrorMessage is true, secondary failures will report that such an error occurred but not with any details. But I've left the existing silence for now. It is possible someone uses the flag to hide a message I would like to expose. I suppose adding additional output in debug builds only may be an option - options welcomed. 
> > Testing: > - fully manual > > I manually set up the conditions where a background thread could crash due to the atexit actions executing. I added special debug code to show what was happening in such cases, and that secondary errors were occurring. I then applied the fix for #2 and saw the second thread getting caught; then I applied fix #1 and the secondary crashes were gone. > > Also did tier 1-3 testing and local gtest testing just to sanity check things. > > Thanks, > David This pull request has now been integrated. Changeset: 8c1112a6 Author: David Holmes URL: https://git.openjdk.java.net/jdk/commit/8c1112a6 Stats: 31 lines in 2 files changed: 20 ins; 3 del; 8 mod 8261916: gtest/GTestWrapper.java vmErrorTest.unimplemented1_vm_assert failed Reviewed-by: dcubed, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/3002 From kbarrett at openjdk.java.net Mon Mar 15 23:28:12 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 15 Mar 2021 23:28:12 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 20:01:42 GMT, Man Cao wrote: >> src/hotspot/share/utilities/lockFreeQueue.hpp line 50: >> >>> 48: NONCOPYABLE(LockFreeQueue); >>> 49: >>> 50: protected: >> >> Protected members (to be accessible from a derived class) are inconsistent with a public non-virtual destructor (that may allow destructor slicing). I dislike classes that try to be both concrete implementation classes and base classes; they are hard to design well (and this class wasn't intended to be such). This was done to allow G1DirtyCardQueueSet to extend it with the `take_all` function; that seems like a useful operation in the generic form, even if it can't be made thread-safe. (The G1 function asserts_at_safepoint(), but that's not really appropriate for a generic form.) 
> > To clarify, do we want to move take_all() to LockFreeQueue, remove assert_at_safepoint() and put a big warning that it is not thread safe? It perhaps also involves adding the "struct HeadTail" to this class. > > Another option is to provide two getter methods that returns T** for &_head and &_tail. They are not thread-safe either, but maybe more generic than take_all(). Either of those options seems okay, though the two getters exposes more of the implementation details. The existing take_all is already described as not thread-safe. HeadTail => `Pair` would work. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From kbarrett at openjdk.java.net Mon Mar 15 23:55:08 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 15 Mar 2021 23:55:08 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 19:54:35 GMT, Man Cao wrote: >> Hi all, >> >> Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. >> >> The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. >> >> -Man > > src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 60: > >> 58: LockFreeQueueCriticalSection cs(current_thread); >> 59: >> 60: T* result = Atomic::load_acquire(&_head); > > A related question about memory ordering. Are these two load_acquire() really necessary? They are not paired with any release_store(). > I think they can be normal Atomic::load(), as append() and pop() already have Atomic::xchg() and Atomic::cmpxchg() to enforce ordering. The ordering of xchg in append only affects the writing thread. It does nothing for the reader side. 
The second load_acquire pairs with the set_next (a release_store) in append. However, it's always (one way or another) followed by a conservative cmpxchg, so does seem possible to weaken. The first load_acquire is a "consume", but we don't have that and upgrade to acquire. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From kbarrett at openjdk.java.net Tue Mar 16 00:09:09 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 16 Mar 2021 00:09:09 GMT Subject: RFR: 8263595: Remove oop type punning in JavaCallArguments In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 14:34:48 GMT, Stefan Karlsson wrote:
> JavaCallArguments has this code and comment:
>
>   // Helper for push_oop and the like. The value argument is a
>   // "handle" that refers to an oop. We record the address of the
>   // handle rather than the designated oop. The handle is later
>   // resolved to the oop by parameters(). This delays the exposure of
>   // naked oops until it is GC-safe.
>   template<typename T>
>   inline int push_oop_impl(T handle, int size) {
>     // JNITypes::put_obj expects an oop value, so we play fast and
>     // loose with the type system. The cast from handle type to oop
>     // *must* use a C-style cast. In a product build it performs a
>     // reinterpret_cast. In a debug build (more accurately, in a
>     // CHECK_UNHANDLED_OOPS build) it performs a static_cast, invoking
>     // the debug-only oop class's conversion from void* constructor.
>     JNITypes::put_obj((oop)handle, _value, size); // Updates size.
>     return size; // Return the updated size.
>   }
>
> The type T is either an oop* or jobject (JNI handle). This puts something that isn't an oop inside an oop.
>
> I propose that we don't do this. Instead we could pass the handle (address containing the oop), and then in put_obj convert that address to an intptr_t, which matches well with the `to` argument of those functions.
>
> I've been running this (and some other changes) with ZGC on Linux x64 through tier1-tier7.

Looks good.
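The idea proposed for JavaCallArguments above (record the address of the handle as an `intptr_t` and resolve it to the object only later) can be sketched in isolation. This is an illustrative stand-in under stated assumptions, not the HotSpot change itself: `FakeObject`, `FakeOop`, and `ArgumentSlots` are hypothetical names, and a plain pointer plays the role of an oop.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins: FakeObject plays the role of a heap object,
// FakeOop the role of an oop, and FakeOop* the role of a handle.
struct FakeObject { int payload; };
using FakeOop = FakeObject*;

class ArgumentSlots {
 public:
  // Record the address of the handle, not the object it currently names.
  // The slot holds an integer, so nothing that is not an oop is ever
  // stored in an oop-typed variable.
  int push_handle(FakeOop* handle) {
    assert(_size < kCapacity);
    _slots[_size] = reinterpret_cast<std::intptr_t>(handle);
    return ++_size;  // Return the updated size.
  }

  // Resolve the slot to the object only when it is actually needed.
  FakeOop resolve(int index) const {
    assert(index >= 0 && index < _size);
    return *reinterpret_cast<FakeOop*>(_slots[index]);
  }

 private:
  static constexpr int kCapacity = 8;
  std::intptr_t _slots[kCapacity] = {};
  int _size = 0;
};
```

Because the slot stores the handle's address, an update to the handle between the push and the resolve (as a moving collector might perform) is picked up automatically: `resolve` always returns whatever the handle names at that moment.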
------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3014 From yyang at openjdk.java.net Tue Mar 16 02:11:09 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Tue, 16 Mar 2021 02:11:09 GMT Subject: RFR: 8263562: Checking if proxy_klass_head is still lambda_proxy_is_available In-Reply-To: <7nEEqXgCs7MXosNyszEhutamFo4I0JVA8DvfwOxr3NA=.bab47ab5-8bcd-4bc4-83ad-3780cb15821f@github.com> References: <7nEEqXgCs7MXosNyszEhutamFo4I0JVA8DvfwOxr3NA=.bab47ab5-8bcd-4bc4-83ad-3780cb15821f@github.com> Message-ID: On Mon, 15 Mar 2021 18:43:03 GMT, Calvin Cheung wrote: >> The `Shared Lambda Dictionary` section in the result of SharedLambdaDictionaryPrinter will mix normal klasses with lambda proxy klasses. Using the following commands can reproduce it: >> >> Proc1: `./jshell` >> Proc2: `jcmd VM.systemdictionary -verbose` >> >> When all archived lambda proxy classes are used, proxy_klass_head(in RunTimeLambdaProxyClassInfo) is still referring to an instance klass that is no longer lambda_proxy_is_available, and its next_link will be set by classloader to link another normal class. Simply checking if proxy_klass_head is lambda_proxy_is_available can solve this problem. >> >> Best regards, >> Yang > > Looks good and thanks for fixing it. Thanks @calvinccheung and @iklam for reviews. ------------- PR: https://git.openjdk.java.net/jdk/pull/3001 From yyang at openjdk.java.net Tue Mar 16 02:11:08 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Tue, 16 Mar 2021 02:11:08 GMT Subject: RFR: 8263562: Checking if proxy_klass_head is still lambda_proxy_is_available In-Reply-To: References: Message-ID: <0yMdM282GQx40X0DypJXmSwjxTocn8UBksltvnRShYg=.9b72a45d-c190-4134-b8c9-8c7942f92dcc@github.com> On Mon, 15 Mar 2021 22:14:09 GMT, Ioi Lam wrote: >> The `Shared Lambda Dictionary` section in the result of SharedLambdaDictionaryPrinter will mix normal klasses with lambda proxy klasses. 
Using the following commands can reproduce it: >> >> Proc1: `./jshell` >> Proc2: `jcmd VM.systemdictionary -verbose` >> >> When all archived lambda proxy classes are used, proxy_klass_head(in RunTimeLambdaProxyClassInfo) is still referring to an instance klass that is no longer lambda_proxy_is_available, and its next_link will be set by classloader to link another normal class. Simply checking if proxy_klass_head is lambda_proxy_is_available can solve this problem. >> >> Best regards, >> Yang > > Marked as reviewed by iklam (Reviewer). > Looks good and thanks for fixing it. > > Furthermore, I think we should clear proxy_klass_head when it becomes no longer lambda_proxy_is_available. Any further accesses will be illegal. Just IMHO, I will hold this only if you think it's useful. > > I'd suggesting open another bug for the above since it fixes a different issue. Hi @calvinccheung, Thanks for your advice, I will create another PR to do this stuff. Thanks! Yang ------------- PR: https://git.openjdk.java.net/jdk/pull/3001 From kbarrett at openjdk.java.net Tue Mar 16 03:51:19 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 16 Mar 2021 03:51:19 GMT Subject: RFR: 8254050: HotSpot Style Guide should permit using the "override" virtual specifier Message-ID: Please review and vote on this change to the HotSpot Style Guide to permit the use of `override` virtual specifiers. The virtual specifiers `override` and `final` were added in C++11, and use of `final` is already permitted in HotSpot code. Using the `override` specifier provides error checking that the function is indeed overriding a virtual function declared in a base class. This can prevent some often surprisingly difficult to spot bugs. This is a modification of the Style Guide, so rough consensus among the HotSpot Group members is required to make this change. Only Group members should vote for approval (via the github PR), though reasoned objections or comments from anyone will be considered. 
A decision on this proposal will not be made before Tuesday 30-Mar-2021 at 12h00 UTC. Since we're piggybacking on github PRs here, please use the PR review process to approve (click on Review Changes > Approve), rather than sending a "vote: yes" email reply that would be normal for a CFV. Other responses can still use email of course. ------------- Commit messages: - permit override specifier Changes: https://git.openjdk.java.net/jdk/pull/3021/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3021&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8254050 Stats: 8 lines in 1 file changed: 5 ins; 3 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3021.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3021/head:pull/3021 PR: https://git.openjdk.java.net/jdk/pull/3021 From dholmes at openjdk.java.net Tue Mar 16 04:10:08 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 16 Mar 2021 04:10:08 GMT Subject: RFR: 8254050: HotSpot Style Guide should permit using the "override" virtual specifier In-Reply-To: References: Message-ID: On Tue, 16 Mar 2021 03:44:56 GMT, Kim Barrett wrote: > Please review and vote on this change to the HotSpot Style Guide to permit > the use of `override` virtual specifiers. The virtual specifiers `override` > and `final` were added in C++11, and use of `final` is already permitted in > HotSpot code. > > Using the `override` specifier provides error checking that the function is > indeed overriding a virtual function declared in a base class. This can > prevent some often surprisingly difficult to spot bugs. > > This is a modification of the Style Guide, so rough consensus among > the HotSpot Group members is required to make this change. Only Group > members should vote for approval (via the github PR), though reasoned > objections or comments from anyone will be considered. A decision on > this proposal will not be made before Tuesday 30-Mar-2021 at 12h00 UTC. 
> > Since we're piggybacking on github PRs here, please use the PR review > process to approve (click on Review Changes > Approve), rather than > sending a "vote: yes" email reply that would be normal for a CFV. > Other responses can still use email of course. Fine by me. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3021 From manc at openjdk.java.net Tue Mar 16 05:52:14 2021 From: manc at openjdk.java.net (Man Cao) Date: Tue, 16 Mar 2021 05:52:14 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 23:52:39 GMT, Kim Barrett wrote: >> src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 60: >> >>> 58: LockFreeQueueCriticalSection cs(current_thread); >>> 59: >>> 60: T* result = Atomic::load_acquire(&_head); >> >> A related question about memory ordering. Are these two load_acquire() really necessary? They are not paired with any release_store(). >> I think they can be normal Atomic::load(), as append() and pop() already have Atomic::xchg() and Atomic::cmpxchg() to enforce ordering. > > The ordering of xchg in append only affects the writing thread. It does nothing for the reader side. The second load_acquire pairs with the set_next (a release_store) in append. However, it's always (one way or another) followed by a conservative cmpxchg, so does seem possible to weaken. The first load_acquire is a "consume", but we don't have that and upgrade to acquire. Agreed that the second load_acquire can be weakened, and thanks for noting the first one is a "consume". However, I'm a bit confused about the explanation. > The second load_acquire pairs with the set_next (a release_store) in append. The set_next() is Atomic::store() as in LockFreeStack, right? Then it is a relaxed store, but not a release_store. In this case the full fence provided by xchg is necessary to make set_next() a release_store. 
IIUC, the following is sufficient to establish a release-acquire ordering:

    Writer thread:              Reader thread:
    StoreStoreFence();
    relaxed_store(p);
                                relaxed_load(p);
                                LoadLoadFence();

In this case:

    append() (Writer thread):           pop() (Reader thread):
    // Provides full fence
    Atomic::xchg(&_tail, ...);
    Atomic::store(&p._next, ...);
                                        // Suppose we don't use load_acquire
                                        Atomic::load(&p._next);
                                        // Provides full fence
                                        Atomic::cmpxchg(&_head /* or &_tail */, ...);

I think it is the two full fences that enable us to use relaxed store and relaxed load. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From kbarrett at openjdk.java.net Tue Mar 16 06:17:09 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 16 Mar 2021 06:17:09 GMT Subject: RFR: 8254050: HotSpot Style Guide should permit using the "override" virtual specifier In-Reply-To: References: Message-ID: On Tue, 16 Mar 2021 04:06:55 GMT, David Holmes wrote: >> Please review and vote on this change to the HotSpot Style Guide to permit >> the use of `override` virtual specifiers. The virtual specifiers `override` >> and `final` were added in C++11, and use of `final` is already permitted in >> HotSpot code. >> >> Using the `override` specifier provides error checking that the function is >> indeed overriding a virtual function declared in a base class. This can >> prevent some often surprisingly difficult to spot bugs. >> >> This is a modification of the Style Guide, so rough consensus among >> the HotSpot Group members is required to make this change. Only Group >> members should vote for approval (via the github PR), though reasoned >> objections or comments from anyone will be considered. A decision on >> this proposal will not be made before Tuesday 30-Mar-2021 at 12h00 UTC. >> >> Since we're piggybacking on github PRs here, please use the PR review >> process to approve (click on Review Changes > Approve), rather than >> sending a "vote: yes" email reply that would be normal for a CFV.
>> Other responses can still use email of course. > > Fine by me. > > Thanks, > David I forgot to mention that I didn't bother including the generated html in the PR, because it doesn't really add anything for the review, but will include it before integrating. ------------- PR: https://git.openjdk.java.net/jdk/pull/3021 From github.com+71302734+amitdpawar at openjdk.java.net Tue Mar 16 06:19:09 2021 From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar) Date: Tue, 16 Mar 2021 06:19:09 GMT Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2] In-Reply-To: References: Message-ID: <28o3upN8Q_EaXwVx-EknI_izHqMYQMCTA1YYdzU7Bi8=.e8b6a484-e233-40b8-b64e-6b5c49436c3b@github.com> On Sun, 14 Mar 2021 21:27:50 GMT, David Holmes wrote: >> Amit Pawar has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed build issues for some targets and updated with suggested changes. > > src/hotspot/share/gc/shared/pretouchTask.cpp line 66: > >> 64: >> 65: // Following atomic loads are required to make other processor store >> 66: // visible to all threads from this points. > > That is not what an Atomic::load does; they only protect against word-tearing (which is only a theoretical possibility on some platforms). If you want to guarantee those loads see the most recent store then the fence needs to come first. Thank you David for your reply and will do as suggested. ------------- PR: https://git.openjdk.java.net/jdk/pull/2976 From jrose at openjdk.java.net Tue Mar 16 06:28:06 2021 From: jrose at openjdk.java.net (John R Rose) Date: Tue, 16 Mar 2021 06:28:06 GMT Subject: RFR: 8254050: HotSpot Style Guide should permit using the "override" virtual specifier In-Reply-To: References: Message-ID: On Tue, 16 Mar 2021 03:44:56 GMT, Kim Barrett wrote: > Please review and vote on this change to the HotSpot Style Guide to permit > the use of `override` virtual specifiers. 
The virtual specifiers `override` > and `final` were added in C++11, and use of `final` is already permitted in > HotSpot code. > > Using the `override` specifier provides error checking that the function is > indeed overriding a virtual function declared in a base class. This can > prevent some often surprisingly difficult to spot bugs. > > This is a modification of the Style Guide, so rough consensus among > the HotSpot Group members is required to make this change. Only Group > members should vote for approval (via the github PR), though reasoned > objections or comments from anyone will be considered. A decision on > this proposal will not be made before Tuesday 30-Mar-2021 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review > process to approve (click on Review Changes > Approve), rather than > sending a "vote: yes" email reply that would be normal for a CFV. > Other responses can still use email of course. Marked as reviewed by jrose (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3021 From github.com+71302734+amitdpawar at openjdk.java.net Tue Mar 16 06:39:10 2021 From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar) Date: Tue, 16 Mar 2021 06:39:10 GMT Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2] In-Reply-To: References: Message-ID: On Sun, 14 Mar 2021 21:31:32 GMT, David Holmes wrote: >> Amit Pawar has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed build issues for some targets and updated with suggested changes. > > src/hotspot/share/gc/shared/pretouchTask.hpp line 47: > >> 45: >> 46: void set_task_status(TaskStatus status) { Atomic::release_store(&_task_status, (size_t)status); } >> 47: void set_task_done() { Atomic::release_store(&_task_status, (size_t)Done); } > > There is a convention that when a setter embodies a release_store that release is included in the name e.g. 
release_set_task_done(). OK, will change. > src/hotspot/share/gc/shared/pretouchTask.hpp line 51: > >> 49: void set_task_notready() { set_task_status(NotReady); } >> 50: bool is_task_ready() { return Atomic::load(&_task_status) == Ready; } >> 51: bool is_task_done() { return Atomic::load(&_task_status) == Done; } > > Given these fields are set with a release_store I would expect to see them read with a load_acquire to match with it. And named eg. is_task_done_acquire(). OK, will change. > src/hotspot/share/gc/shared/pretouchTask.cpp line 96: > >> 94: Atomic::release_store(&_cur_addr, _end_addr); >> 95: OrderAccess::storestore(); >> 96: set_task_done(); > > The storestore barrier is not needed as the set_task_done() is a release_store which already has a storestore barrier. OK. > src/hotspot/share/gc/shared/pretouchTask.cpp line 56: > >> 54: void PretouchTask::reinitialize(char* start_addr, char* end_addr) { >> 55: Atomic::release_store(&_cur_addr, start_addr); >> 56: Atomic::release_store(&_end_addr, end_addr); > > Back-to-back release-stores have redundant loadstore semantics. You could just use a storestore() barrier after the first release_store, then use plain Atomic::store for the second field. OK, will change as per your suggestion. > src/hotspot/share/gc/parallel/mutableSpace.cpp line 149: > >> 147: // waiting to expand old-gen will join from PSOldGen::expand_for_allocate >> 148: // function for pretouch work. >> 149: pretouch_task->set_task_ready(); > > The storestore is not needed as it is included in the release_store of set_task_ready(). OK. 
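For readers following the review comments above, the release/acquire pairing can be sketched in portable C++. This is only an illustration under stated assumptions: it uses `std::atomic` in place of HotSpot's `Atomic::release_store` and `Atomic::load_acquire`, and `PretouchState` and `payload` are made-up names, not code from the PR. The accessor names follow the naming convention David describes.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-in for the task status values seen in the review.
enum TaskStatus : size_t { NotReady, Ready, Done };

struct PretouchState {
  std::atomic<size_t> status{NotReady};
  int payload = 0;  // plain (non-atomic) data published via the release store

  // Writer side: all writes made before this store become visible to any
  // thread that later observes Done through the acquire load below.
  void release_set_task_done() {
    status.store(Done, std::memory_order_release);
  }

  // Reader side: the acquire load pairs with the release store above.
  bool is_task_done_acquire() const {
    return status.load(std::memory_order_acquire) == Done;
  }
};
```

In real use the writer and the reader would be different threads: the writer fills in `payload` and then calls `release_set_task_done()`, and a reader that sees `is_task_done_acquire()` return true can safely read `payload`. A relaxed load on the reader side would not give that guarantee.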
-------------

PR: https://git.openjdk.java.net/jdk/pull/2976

From github.com+71302734+amitdpawar at openjdk.java.net Tue Mar 16 06:39:07 2021
From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar)
Date: Tue, 16 Mar 2021 06:39:07 GMT
Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 15 Mar 2021 02:07:27 GMT, Kim Barrett wrote:

>> src/hotspot/share/gc/shared/pretouchTask.cpp line 94:
>>
>>> 92:
>>> 93: if (thread_num == 0) {
>>> 94: Atomic::release_store(&_cur_addr, _end_addr);
>>
>> The release is unnecessary given the Atomic::sub.
>
> Besides what David said, I think this is also confusing. I think it's to account for the claim (using fetch_and_add) unconditionally adding the chunk size to cur_addr, which could lead to overshoot. I think it would be clearer to prevent the overshoot from occurring.

David, I will update as suggested.

Kim, thanks for your feedback. Updating **_cur_addr** was not needed here; I only did it to keep the markers in a clean state. Since **_cur_addr** and **_end_addr** are updated before the start of the next pretouch resize anyway, I will remove this. If something breaks, I will fix it as you suggested. I hope this is OK.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2976

From github.com+71302734+amitdpawar at openjdk.java.net Tue Mar 16 06:45:12 2021
From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar)
Date: Tue, 16 Mar 2021 06:45:12 GMT
Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 15 Mar 2021 01:43:38 GMT, Kim Barrett wrote:

>> Amit Pawar has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Fixed build issues for some targets and updated with suggested changes.
> > src/hotspot/share/gc/parallel/psOldGen.cpp line 226: > >> 224: while (!pretouch()->is_task_done()) { >> 225: if (pretouch()->is_task_ready()) { >> 226: pretouch()->work(Thread::current()->osthread()->thread_id()); > > worker thread_id and os thread_id are entirely different things. This is another indication that reusing (abusing) PretouchTask in this way is a mistake. Is following change OK ? pretouch()->work(static_cast(Thread::current())->id()); otherwise please suggest. > src/hotspot/share/gc/shared/pretouchTask.cpp line 68: > >> 66: // visible to all threads from this points. >> 67: char *cur_addr = Atomic::load(&_cur_addr); >> 68: char *end_addr = Atomic::load(&_end_addr); > > end needs to be read before cur. Otherwise, cur could be from one pretouch instance and end could be from a later, unrelated, pretouch instance, leading to scribbling. didn't realize this. will change per your suggestion. ------------- PR: https://git.openjdk.java.net/jdk/pull/2976 From stuefe at openjdk.java.net Tue Mar 16 06:56:07 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 16 Mar 2021 06:56:07 GMT Subject: RFR: 8254050: HotSpot Style Guide should permit using the "override" virtual specifier In-Reply-To: References: Message-ID: On Tue, 16 Mar 2021 03:44:56 GMT, Kim Barrett wrote: > Please review and vote on this change to the HotSpot Style Guide to permit > the use of `override` virtual specifiers. The virtual specifiers `override` > and `final` were added in C++11, and use of `final` is already permitted in > HotSpot code. > > Using the `override` specifier provides error checking that the function is > indeed overriding a virtual function declared in a base class. This can > prevent some often surprisingly difficult to spot bugs. > > This is a modification of the Style Guide, so rough consensus among > the HotSpot Group members is required to make this change. 
Only Group > members should vote for approval (via the github PR), though reasoned > objections or comments from anyone will be considered. A decision on > this proposal will not be made before Tuesday 30-Mar-2021 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review > process to approve (click on Review Changes > Approve), rather than > sending a "vote: yes" email reply that would be normal for a CFV. > Other responses can still use email of course. Looks good and makes sense. ------------- Marked as reviewed by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3021 From github.com+71302734+amitdpawar at openjdk.java.net Tue Mar 16 07:22:08 2021 From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar) Date: Tue, 16 Mar 2021 07:22:08 GMT Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2] In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 01:46:08 GMT, Kim Barrett wrote: >> Amit Pawar has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed build issues for some targets and updated with suggested changes. > > src/hotspot/share/gc/parallel/psOldGen.hpp line 49: > >> 47: PSGenerationCounters* _gen_counters; >> 48: SpaceCounters* _space_counters; >> 49: PretouchTask* _pretouch; // Used when old gen resized during scavenging. > > I think abusing PretouchTask in this way, completely outside the workgang protocol, is confusing and shouldn't be done. There might be some code that could be shared between this use and PretouchTask, but if so then it should be factored out for such sharing, rather than mangling PretouchTask in the way being proposed. Thanks for your suggestion. It will be better if you can give some direction as G1GC also need similar fix during the expansion. I think G1 has array of regions to be touched during the expansion and that task needs to be shared across the threads. 
pretouch class can be put inside another class and that can be shared across the threads to synchronize the pretouch task and not sure whether that will be OK again. I explored this approach after your suggestion and thought to update after the feedback during the review process. Please suggest. ------------- PR: https://git.openjdk.java.net/jdk/pull/2976 From stefank at openjdk.java.net Tue Mar 16 07:47:11 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 16 Mar 2021 07:47:11 GMT Subject: RFR: 8263595: Remove oop type punning in JavaCallArguments In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 18:58:59 GMT, Coleen Phillimore wrote: >> JavaCallArguments has this code and comment: >> >> // Helper for push_oop and the like. The value argument is a >> // "handle" that refers to an oop. We record the address of the >> // handle rather than the designated oop. The handle is later >> // resolved to the oop by parameters(). This delays the exposure of >> // naked oops until it is GC-safe. >> template >> inline int push_oop_impl(T handle, int size) { >> // JNITypes::put_obj expects an oop value, so we play fast and >> // loose with the type system. The cast from handle type to oop >> // *must* use a C-style cast. In a product build it performs a >> // reinterpret_cast. In a debug build (more accurately, in a >> // CHECK_UNHANDLED_OOPS build) it performs a static_cast, invoking >> // the debug-only oop class's conversion from void* constructor. >> JNITypes::put_obj((oop)handle, _value, size); // Updates size. >> return size; // Return the updated size. >> } >> The type T is either an oop* or jobject (JNI handle). This puts something that isn't an oop inside an oop. >> >> I propose that we don't do this. Instead we could pass the handle (address containing the oop), and then in put_obj convert that address to an intptr_t, which matches well with the `to` argument of those functions. 
>> >> I've been running this (and some other changes) with ZGC on Linux x64 through tier1-tier7. > > Marked as reviewed by coleenp (Reviewer). Thanks @coleenp @iklam @kimbarrett for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/3014 From stefank at openjdk.java.net Tue Mar 16 07:47:08 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 16 Mar 2021 07:47:08 GMT Subject: RFR: 8263589: Introduce JavaValue::get_oop/set_oop In-Reply-To: <73KvQ-wIvi6mX-538ngdN0mHKFs30QI7Y57TFHsfRLM=.89d76b9a-1b3d-401c-a942-86776321c1f9@github.com> References: <73KvQ-wIvi6mX-538ngdN0mHKFs30QI7Y57TFHsfRLM=.89d76b9a-1b3d-401c-a942-86776321c1f9@github.com> Message-ID: On Mon, 15 Mar 2021 21:27:54 GMT, Coleen Phillimore wrote: >> JavaValue is a small wrapper class that wraps values used to pass arguments and results between native and Java. >> >> When JavaCalls::call returns an object, the value stored in the JavaValue is not a handliezed jobject. Instead it's a raw oop. So, most of the code handling the `result`, fetches the result as a jobject, and then immediately casts it to an oop. For example: >> oop res = (oop)result.get_jobject(); >> >> I'd like to change this code to be: >> oop res = result.get_oop(); >> >> The motivations for this patch is: >> >> 1) Minimize the places where we pass around oops in jobject variables. Maybe at some point we'll have converted the JVM to only use the jobject type when passing around JNI handle. We need to be stricter with the types when we continue develop our GCs and their barriers. >> >> 2) Limit the number of places in the code where we perform raw oop casts. We have a helper cast function for that, cast_to_oop, but not all code use it. I have future patches where the compiler will completely forbid raw cast to oops (in fastdebug builds). With that in place, I can then add more stricter oop verification code when oops are created. This helps catching bugs earlier. 
>> >> --- >> >> When reviewing this patch, take an extra look at the change to oopsHierarchy.hpp. This was done to support jvmciEnv.cpp code: >> JVMCIObject wrap(oop obj)... >> JVMCIObjectArray wrap(objArrayOop obj)... >> JVMCIPrimitiveArray wrap(typeArrayOop obj) ... >> Previously, `wrap((oop)result.get_jobject())` called the first function. When the code was changed to `wrap(result.get_oop())`, where `get_oop()` returns a `oopDesc*`, the compiler didn't know what conversion in oopsHierarchy.hpp to use. Therefore, I replaced the overly permissive `void*` constructor with a constructor that only takes the corresponding `type##OopDesc*`. >> >> An alternative would be to let get_oop() return an oop, but then that would add an unwanted a dependency between globalDefinitions.hpp and oopsHierarchy.hpp. An earlier version of this patch did return an oop instead of oopDesc*, but it also moved entire JavaValue class out of globalDefinitions.hpp into a new javaValue.hpp file, and had a corresponding javaValue.inline.hpp file. >> >> Even if we end up using the proposed `oopDesc* get_oop()` version, maybe moving the class to javaValues.hpp would still makes sense? > > This change looks really good to me. I have no objection to oopDesc* in JavaCallValue. We use oopDesc* in all places where the class oop would interfere with values passed between Java and the vm. Thanks @coleenp @sspitsyn for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/3013 From rehn at openjdk.java.net Tue Mar 16 07:57:08 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 16 Mar 2021 07:57:08 GMT Subject: RFR: 8261262: Kitchensink24HStress.java crashed with EXCEPTION_ACCESS_VIOLATION In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 21:45:34 GMT, Daniel D. Daugherty wrote: >> When returning from the last Java frame back to vm and hitting a safepoint poll on that last return we sometimes have a last java frame but no vframe. 
>> This seem to be a bug in itself, handled in: 8263576
>>
>> Other places which uses vframe NULL checks it before, so let's do that in GetCurrentLocationClosure also.
>>
>> Testing: nsk jdi/jvmti, jdk jdi, jck vm and t1-3.
>
> src/hotspot/share/prims/jvmtiEnvThreadState.cpp line 263:
>
>> 261: // There can be a race condition between a handshake
>> 262: // and the target thread exiting from Java execution.
>> 263: // We must recheck the last Java frame still exists.
>
> Typo: s/recheck the last/recheck that the last/
> (not your typo, but since you're in there...)

Fixed

-------------

PR: https://git.openjdk.java.net/jdk/pull/3010

From rehn at openjdk.java.net Tue Mar 16 08:07:40 2021
From: rehn at openjdk.java.net (Robbin Ehn)
Date: Tue, 16 Mar 2021 08:07:40 GMT
Subject: RFR: 8261262: Kitchensink24HStress.java crashed with EXCEPTION_ACCESS_VIOLATION [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 15 Mar 2021 22:05:32 GMT, Daniel D. Daugherty wrote:

> Thumbs up!
>
> I agree that the code should have checked for "if (vf != NULL) {"
> instead of asserting that "(vf != NULL)".

Thanks Dan!

> src/hotspot/share/prims/jvmtiEnvThreadState.cpp line 266:
>
>> 264: if (!jt->is_exiting() && jt->has_last_Java_frame()) {
>> 265: javaVFrame* vf = jt->last_java_vframe(&rm);
>> 266: assert(vf != NULL, "must have last java frame");
>
> The code before we converted to handshakes also had this assert.
>
> The pre-handshake code did the work in the doit() function for the
> VM_GetCurrentLocation VM-op. This makes me wonder if we always
> had frames here when this was previously done via VM-op? And that
> makes me wonder whether handshakes is doing something different
> so we don't always have a frame here?

The difference is 8253180 (JDK 16), which turns return polls into branches to the SafepointBlob instead of going via the signal handler. When setting up the last Java frame in the SafepointBlob we get a different result than before. To look into that potential bug I opened 8263576.
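The "check rather than assert" change discussed above can be illustrated with a small stand-alone sketch. The types below (`FakeFrame`, `FakeThread`) and the function `current_location` are hypothetical stand-ins invented for this example; they are not the JVMTI code from the PR, only the same control-flow shape.

```cpp
// Hypothetical stand-in for a Java vframe; only carries a bytecode index.
struct FakeFrame {
  int bci;
};

// Hypothetical stand-in for the target thread's state.
struct FakeThread {
  bool exiting = false;
  bool has_last_frame = false;
  FakeFrame* vframe = nullptr;  // may be null even if has_last_frame is true
};

// Writes the current location to *bci_out and returns true only when every
// precondition holds. A missing vframe is treated as "no location" instead
// of tripping an assert.
bool current_location(const FakeThread& t, int* bci_out) {
  if (!t.exiting && t.has_last_frame) {
    FakeFrame* vf = t.vframe;
    if (vf != nullptr) {  // checked, not asserted
      *bci_out = vf->bci;
      return true;
    }
  }
  return false;
}
```

The design point is that a condition which can legitimately occur at runtime, such as the race described in the quoted comment, is handled as a normal branch, while asserts stay reserved for states that indicate a programming error.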
------------- PR: https://git.openjdk.java.net/jdk/pull/3010 From rehn at openjdk.java.net Tue Mar 16 08:07:39 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 16 Mar 2021 08:07:39 GMT Subject: RFR: 8261262: Kitchensink24HStress.java crashed with EXCEPTION_ACCESS_VIOLATION [v2] In-Reply-To: References: Message-ID: > When returning from the last Java frame back to vm and hitting a safepoint poll on that last return we sometimes have a last java frame but no vframe. > This seem to be a bug in itself, handled in: 8263576 > > Other places which uses vframe NULL checks it before, so let's do that in GetCurrentLocationClosure also. > > Testing: nsk jdi/jvmti, jdk jdi, jck vm and t1-3. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Typo - Merge branch 'master' into 8261262 - Check vframe non-null ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3010/files - new: https://git.openjdk.java.net/jdk/pull/3010/files/50348e80..068364b6 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3010&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3010&range=00-01 Stats: 937 lines in 72 files changed: 385 ins; 370 del; 182 mod Patch: https://git.openjdk.java.net/jdk/pull/3010.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3010/head:pull/3010 PR: https://git.openjdk.java.net/jdk/pull/3010 From rehn at openjdk.java.net Tue Mar 16 08:07:41 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 16 Mar 2021 08:07:41 GMT Subject: RFR: 8261262: Kitchensink24HStress.java crashed with EXCEPTION_ACCESS_VIOLATION [v2] In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 22:20:38 GMT, Serguei Spitsyn wrote: > Robbin, > The fix looks good to me. Thank you for taking care about this issue! 
> Thanks, > Serguei Thanks Serguei! ------------- PR: https://git.openjdk.java.net/jdk/pull/3010 From tschatzl at openjdk.java.net Tue Mar 16 08:11:08 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 16 Mar 2021 08:11:08 GMT Subject: RFR: 8254050: HotSpot Style Guide should permit using the "override" virtual specifier In-Reply-To: References: Message-ID: <7Hm2ZNTcE2p-Na3HQvdWLOc8FvHIcy3sMmC-hDbihOE=.857043c6-63a1-4a36-b35c-41efd643db95@github.com> On Tue, 16 Mar 2021 03:44:56 GMT, Kim Barrett wrote: > Please review and vote on this change to the HotSpot Style Guide to permit > the use of `override` virtual specifiers. The virtual specifiers `override` > and `final` were added in C++11, and use of `final` is already permitted in > HotSpot code. > > Using the `override` specifier provides error checking that the function is > indeed overriding a virtual function declared in a base class. This can > prevent some often surprisingly difficult to spot bugs. > > This is a modification of the Style Guide, so rough consensus among > the HotSpot Group members is required to make this change. Only Group > members should vote for approval (via the github PR), though reasoned > objections or comments from anyone will be considered. A decision on > this proposal will not be made before Tuesday 30-Mar-2021 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review > process to approve (click on Review Changes > Approve), rather than > sending a "vote: yes" email reply that would be normal for a CFV. > Other responses can still use email of course. Marked as reviewed by tschatzl (Reviewer). 
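As a small illustration of the kind of bug the proposal quoted above refers to, the following sketch shows a signature mismatch that compiles silently without `override` and would be rejected with it. The class names (`Closure`, `Good`, `Silent`) are invented for this example and do not come from HotSpot.

```cpp
struct Closure {
  virtual ~Closure() = default;
  virtual int do_work(int /*worker_id*/) const { return 0; }
};

// Correct override: same signature, checked by the compiler.
struct Good : Closure {
  int do_work(int worker_id) const override { return worker_id + 1; }
};

// Missing `const`: this declares a new function that hides the base one
// instead of overriding it. It compiles without any error.
struct Silent : Closure {
  int do_work(int worker_id) { return worker_id + 1; }
};

// Adding `override` to the same mistake turns it into a compile error:
//   struct Rejected : Closure {
//     int do_work(int worker_id) override { return worker_id + 1; }
//   };

// Dispatches through the base class, as framework code typically would.
int call_through_base(const Closure& c) { return c.do_work(1); }
```

Calling `call_through_base` with a `Good` object reaches the derived function, while with a `Silent` object the base implementation runs, which is the hard-to-spot behaviour the specifier is meant to catch at compile time.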
------------- PR: https://git.openjdk.java.net/jdk/pull/3021 From stefank at openjdk.java.net Tue Mar 16 08:32:06 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 16 Mar 2021 08:32:06 GMT Subject: Integrated: 8263589: Introduce JavaValue::get_oop/set_oop In-Reply-To: References: Message-ID: <_-Q848_viOn6nedZYPXtEncvHRurcKq2AGOLL4jdddI=.c87c7dff-98fa-41e7-a910-a016e7c5deda@github.com> On Mon, 15 Mar 2021 12:35:47 GMT, Stefan Karlsson wrote: > JavaValue is a small wrapper class that wraps values used to pass arguments and results between native and Java. > > When JavaCalls::call returns an object, the value stored in the JavaValue is not a handliezed jobject. Instead it's a raw oop. So, most of the code handling the `result`, fetches the result as a jobject, and then immediately casts it to an oop. For example: > oop res = (oop)result.get_jobject(); > > I'd like to change this code to be: > oop res = result.get_oop(); > > The motivations for this patch is: > > 1) Minimize the places where we pass around oops in jobject variables. Maybe at some point we'll have converted the JVM to only use the jobject type when passing around JNI handle. We need to be stricter with the types when we continue develop our GCs and their barriers. > > 2) Limit the number of places in the code where we perform raw oop casts. We have a helper cast function for that, cast_to_oop, but not all code use it. I have future patches where the compiler will completely forbid raw cast to oops (in fastdebug builds). With that in place, I can then add more stricter oop verification code when oops are created. This helps catching bugs earlier. > > --- > > When reviewing this patch, take an extra look at the change to oopsHierarchy.hpp. This was done to support jvmciEnv.cpp code: > JVMCIObject wrap(oop obj)... > JVMCIObjectArray wrap(objArrayOop obj)... > JVMCIPrimitiveArray wrap(typeArrayOop obj) ... > Previously, `wrap((oop)result.get_jobject())` called the first function. 
When the code was changed to `wrap(result.get_oop())`, where `get_oop()` returns a `oopDesc*`, the compiler didn't know what conversion in oopsHierarchy.hpp to use. Therefore, I replaced the overly permissive `void*` constructor with a constructor that only takes the corresponding `type##OopDesc*`. > > An alternative would be to let get_oop() return an oop, but then that would add an unwanted a dependency between globalDefinitions.hpp and oopsHierarchy.hpp. An earlier version of this patch did return an oop instead of oopDesc*, but it also moved entire JavaValue class out of globalDefinitions.hpp into a new javaValue.hpp file, and had a corresponding javaValue.inline.hpp file. > > Even if we end up using the proposed `oopDesc* get_oop()` version, maybe moving the class to javaValues.hpp would still makes sense? This pull request has now been integrated. Changeset: a1f6591f Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/a1f6591f Stats: 72 lines in 26 files changed: 5 ins; 0 del; 67 mod 8263589: Introduce JavaValue::get_oop/set_oop Reviewed-by: coleenp, sspitsyn ------------- PR: https://git.openjdk.java.net/jdk/pull/3013 From stefank at openjdk.java.net Tue Mar 16 08:32:23 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 16 Mar 2021 08:32:23 GMT Subject: Integrated: 8263595: Remove oop type punning in JavaCallArguments In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 14:34:48 GMT, Stefan Karlsson wrote: > JavaCallArguments has this code and comment: > > // Helper for push_oop and the like. The value argument is a > // "handle" that refers to an oop. We record the address of the > // handle rather than the designated oop. The handle is later > // resolved to the oop by parameters(). This delays the exposure of > // naked oops until it is GC-safe. > template > inline int push_oop_impl(T handle, int size) { > // JNITypes::put_obj expects an oop value, so we play fast and > // loose with the type system. 
The cast from handle type to oop > // *must* use a C-style cast. In a product build it performs a > // reinterpret_cast. In a debug build (more accurately, in a > // CHECK_UNHANDLED_OOPS build) it performs a static_cast, invoking > // the debug-only oop class's conversion from void* constructor. > JNITypes::put_obj((oop)handle, _value, size); // Updates size. > return size; // Return the updated size. > } > The type T is either an oop* or jobject (JNI handle). This puts something that isn't an oop inside an oop. > > I propose that we don't do this. Instead we could pass the handle (address containing the oop), and then in put_obj convert that address to an intptr_t, which matches well with the `to` argument of those functions. > > I've been running this (and some other changes) with ZGC on Linux x64 through tier1-tier7. This pull request has now been integrated. Changeset: a31a23d5 Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/a31a23d5 Stats: 46 lines in 8 files changed: 2 ins; 26 del; 18 mod 8263595: Remove oop type punning in JavaCallArguments Reviewed-by: iklam, coleenp, kbarrett ------------- PR: https://git.openjdk.java.net/jdk/pull/3014 From github.com+71302734+amitdpawar at openjdk.java.net Tue Mar 16 08:35:08 2021 From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar) Date: Tue, 16 Mar 2021 08:35:08 GMT Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2] In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 02:13:46 GMT, Kim Barrett wrote: >> Amit Pawar has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed build issues for some targets and updated with suggested changes. > > Changes requested by kbarrett (Reviewer). 
> > > _Mailing list message from [Kim Barrett](mailto:kim.barrett at oracle.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ > > > On Mar 12, 2021, at 3:01 PM, Amit Pawar wrote: > > In case of ParallelGC, oldgen expansion can happen during promotion. Expanding thread will touch the pages and can't request for task execution as this GC thread is already executing a task. The expanding thread holds the lock on "ExpandHeap_lock" to resize the oldgen and other threads may wait for their turn. This is a blocking call. > > This patch changes this behavior by adding another constructor in "MutexLocker" class to enable non blocking or try_lock operation. This way one thread will acquire the lock and other threads can join pretouch work. Threads failed to acquire the lock will join pretouch only when task is marked ready by expanding thread. > > Following minimum expansion size are seen during expansion. > > 1. 512KB without largepages and without UseNUMA. > > 2. 64MB without largepages and with UseNUMA, > > 3. 2MB (on x86) with large pages and without UseNUMA, > > 4. 64MB without large pages and with UseNUMA. > > When Oldgen is expanding repeatedly with smaller size then this change wont help. For such cases, resize size should adapt to application demand to make use of this change. For example if application nature triggers 100 expansion with smaller sizes in same GC then it is better to increase the expansion size during each resize to reduce the number of resizes. If this patch is accepted then will plan to fix this case in another patch. > > Sorry, but a change like this needs better motivation. What you say > above suggests this change doesn't actually help. > > It's intentional that oldgen expansions aren't generally large, as the > oldgen shouldn't be grown unnecessarily. There are already parameters > such as MinHeapDeltaBytes to control and manipulate this. 
>
> It is also preferable to complete an expansion request quickly to make
> the additional space available to other threads in the main allocation
> path, rather than making them go to the expand path. Making expansions
> larger could force more threads to take the slower expand path, which
> doesn't seem like a win even if they then help with the pretouch part
> of another thread's expansion. (And that also assumes UsePreTouch is
> even enabled.)
>
> So the followup change that you say is needed to make this one
> profitable seems questionable.
>
> The proposed change is also surprisingly large and intrusive for
> something that seems like it should be very localized.
>
> > Jtreg all test passed.
>
> A change like this needs a lot more testing than that, both functionally
> and performance.

[JDK-8254699](https://bugs.openjdk.java.net/browse/JDK-8254699) contains test results in an Excel file showing that PreTouchParallelChunkSize was recently changed from 1GB to 4MB on Linux after testing various sizes. I downloaded that file and extended it with the oldgen resize case; it gives a rough idea of the improvement from this fix and from the follow-up fix. Please check the "PretouchOldgenDuringResize" sheet, columns "Co-operative Fix" and "Adaptive Resize Fix".

[PreTouchParallelChunkSize_TestResults_UpdatedForOldGenCase.xlsx](https://github.com/openjdk/jdk/files/6147180/PreTouchParallelChunkSize_TestResults_UpdatedForOldGenCase.xlsx)

Running SPECjbb composite shows a 30-40% reduction in GC pause time when old-gen expands up to ~1GB in the UseNUMA case (I tested with the minimum oldgen size to trigger the resize). The non-UseNUMA case shows an improvement only when a resize expands by more than the minimum "PreTouchParallelChunkSize", so that other threads can participate in the pretouch work. Cases 1 and 3 above (without UseNUMA, with and without huge pages) did not show any improvement because of the "PreTouchParallelChunkSize" limit, which is why I suggested another fix.
The "Adaptive Resize Fix" column in the sheet is for the next suggested fix, which may improve things further. For a server JVM, expansion sizes of 512KB, 2MB (huge pages) and 64MB look good for the first resize, but later resizes need some attention, I think. The JVM flag "MinHeapDeltaBytes" has to be known to the user and set up front. I think it could be used for the first resize in every GC, with later resizes dynamically going for a larger size, for example double the previous one, to adapt to the application's behaviour. This may help to reduce the GC pause time during expansion.

I wanted to share my observations; my understanding could be wrong, so please check and suggest. Again, thanks for your feedback.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2976

From manc at openjdk.java.net Tue Mar 16 09:06:38 2021
From: manc at openjdk.java.net (Man Cao)
Date: Tue, 16 Mar 2021 09:06:38 GMT
Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v2]
In-Reply-To:
References:
Message-ID:

> Hi all,
>
> Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future.
>
> The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem.
>
> -Man

Man Cao has updated the pull request incrementally with one additional commit since the last revision:

  Address comment and add a gtest.
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2986/files - new: https://git.openjdk.java.net/jdk/pull/2986/files/d1d05f8a..91f22bbd Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2986&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2986&range=00-01 Stats: 513 lines in 7 files changed: 418 ins; 46 del; 49 mod Patch: https://git.openjdk.java.net/jdk/pull/2986.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2986/head:pull/2986 PR: https://git.openjdk.java.net/jdk/pull/2986 From manc at openjdk.java.net Tue Mar 16 09:06:39 2021 From: manc at openjdk.java.net (Man Cao) Date: Tue, 16 Mar 2021 09:06:39 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v2] In-Reply-To: References: Message-ID: On Sun, 14 Mar 2021 01:12:16 GMT, Kim Barrett wrote: >> Man Cao has updated the pull request incrementally with one additional commit since the last revision: >> >> Address comment and add a gtest. > > src/hotspot/share/utilities/lockFreeQueue.hpp line 46: > >> 44: // >> 45: // \tparam rcu_pop true if use GlobalCounter critical section in pop(). >> 46: template > > I think this is the wrong place for the rcu parameterization. Among other things, it violates the SCARY principle for template design, making the entire class dependent on this parameter that is only relevant to the one operation. I think it would be better if the parameterization was on the pop operation. Thanks. My experience on template design is limited. Good to learn about the SCARY principle! > src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 58: > >> 56: // CS could lead to excessive allocation of objects, because the CS >> 57: // may block return of released objects to a free list for reuse. >> 58: LockFreeQueueCriticalSection cs(current_thread); > > The comment about excessive allocation is closely tied to the use in G1DirtyCardQueueSet. 
The purpose of a critical section here needs further description and generalization. I'm wondering whether it's actually important (maybe it is, just not sure and haven't though about it for a while), but I'm also thinking LockFreeQueue/Stack ought to be consistent about this. That would suggest a common utility for optional critical sections. I added a ConditionalCriticalSection class to globalCounter.hpp. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From jbachorik at openjdk.java.net Tue Mar 16 09:40:11 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 16 Mar 2021 09:40:11 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Mon, 15 Mar 2021 21:22:44 GMT, Stefan Johansson wrote: > For the liveness value to be useful it would have to be updated at each GC, and we need to investigate further to see how we can do that in a "cheap" way. I would prefer if G1 did just return used() in live() as a start and we can create a follow-up task to investigate how to best add a better estimate. Do you see any problem with this? Actually yes - this event was meant to be a cheap way to see whether the *known* live size is growing (has an upwards trend) which would using the `used()` value make more difficult and unreliable. I would prefer keeping the implementation to return the lower bound of the live size as is the case for other GCs as well. May add explanatory comments to the code and the event definition to make this clear. Also, the `used()` value is already captured in the event so we would have it duplicated. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From shade at openjdk.java.net Tue Mar 16 09:53:14 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 16 Mar 2021 09:53:14 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v5] In-Reply-To: References: Message-ID: <-EcS4FJW792kabI0eFg2odUsAs23suq4qfIAWFdmaYQ=.c0574380-200a-4065-8bd2-3e2ffa2f13eb@github.com> On Mon, 15 Mar 2021 19:26:34 GMT, Roman Kennke wrote: >> We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs. >> >> I believe this might be the root cause for JDK-8262852. >> >> The way out of it is change conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress. >> >> There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. But conc-weakroots-in-progress is turned-off concurrently. I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change. >> >> This stuff makes the verifier unhappy, because it doesn't know about the new bit. And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?)
>> >> Testing: >> - [x] New testcase failed without change, passes now >> - [x] hotspot_gc_shenandoah >> - [ ] tier1 (+Shenandoah) > > Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: > > - Remove unused code. Move ShenandoahRendezvousClosure close to where it's used. > - Rename 'disable weak roots' -> 'final roots' everywhere This looks good to me, with a minor nit. src/hotspot/share/gc/shenandoah/shenandoahNMethod.inline.hpp line 31: > 29: #include "gc/shared/barrierSetNMethod.hpp" > 30: #include "gc/shenandoah/shenandoahNMethod.hpp" > 31: #include "gc/shenandoah/shenandoahClosures.inline.hpp" Is this include necessary? I cannot see why. ------------- Marked as reviewed by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2945 From shade at openjdk.java.net Tue Mar 16 09:53:14 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 16 Mar 2021 09:53:14 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v5] In-Reply-To: <-EcS4FJW792kabI0eFg2odUsAs23suq4qfIAWFdmaYQ=.c0574380-200a-4065-8bd2-3e2ffa2f13eb@github.com> References: <-EcS4FJW792kabI0eFg2odUsAs23suq4qfIAWFdmaYQ=.c0574380-200a-4065-8bd2-3e2ffa2f13eb@github.com> Message-ID: <8XxRz-fmzcV1d4HKJYKMBMsw2Cgh0foiGY3E05hfZNA=.eeea1051-976c-4ba4-afbd-f1067b8cd777@github.com> On Tue, 16 Mar 2021 09:50:32 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove unused code. Move ShenandoahRendezvousClosure close to where it's used. >> - Rename 'disable weak roots' -> 'final roots' everywhere > > This looks good to me, with a minor nit. Also, pull from master to get Windows builds fixed. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2945 From sjohanss at openjdk.java.net Tue Mar 16 10:02:13 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Tue, 16 Mar 2021 10:02:13 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v23] In-Reply-To: References: Message-ID: <_uIvdHdm5ptjZJ8gEw8AzBJ8PC-16GoEWUuKW5zAXrg=.36c2da1c-685b-4d4d-a4e5-73b3fc48b812@github.com> On Mon, 15 Mar 2021 18:40:23 GMT, Marcus G K Williams wrote: >> When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using >> Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). >> >> This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). >> >> In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. > > Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: > > Use SIZE_FORMAT in logging > > Signed-off-by: Marcus G K Williams I agree that this change should not include too much refactoring. We should do that as separate changes. Some possibly have to be done before this one. Regarding your problems: 1. This is a somewhat known problem and we have [JDK-8261527](https://bugs.openjdk.java.net/browse/JDK-8261527) for tracking this. I hope to get this in for JDK 17. Up until your change we have always known that only one large page size can be used, so if a ReservedSpace is "special" that page size was used.
With your change this has to change somehow and I think it will be hard to achieve without some other refactoring first. 2. Correct, this test does not yet handle mixed-mappings since none of the mappings traced has used mixed mappings before. The problem you run into is that your new code doesn't honor the request in Parallel to use 2M pages: https://github.com/openjdk/jdk/blob/a31a23d5e72c4618b5b67e854ef4909110a1b5b4/src/hotspot/share/gc/parallel/parallelArguments.cpp#L120 When deciding on a new page size we should honor the passed in alignment if it is larger than the allocation granularity. Because in such a case the upper layer has made a decision that the lower layer should honor. One other way forward (not waiting for refactoring) would be to see this change as just enabling use of multiple page sizes and then actually using them will be added later when needed refactoring has been done. src/hotspot/os/linux/os_linux.cpp line 3760: > 3758: void os::large_page_init() { > 3759: size_t default_large_page_size = scan_default_large_page_size(); > 3760: os::Linux::_default_large_page_size = default_large_page_size; I would move this below the checking if large pages are enabled. I'm also not sure if this refactoring made the setup easier to follow. I would have preferred the old way, but if you and Thomas prefer this I'm ok with it. src/hotspot/os/linux/os_linux.cpp line 3809: > 3807: ls.print("\n"); > 3808: ls.print("Available large page sizes: "); > 3809: all_large_pages.print_on(&ls); Use `ls.cr()` instead of `print("\n")` and I'm also not sure there is any value to print the large page sizes separately as they are part of the first set. ------------- Changes requested by sjohanss (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/1153 From sjohanss at openjdk.java.net Tue Mar 16 10:02:14 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Tue, 16 Mar 2021 10:02:14 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v15] In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 07:18:05 GMT, Thomas Stuefe wrote: >> src/hotspot/os/linux/os_linux.cpp line 4013: >> >>> 4011: assert(UseLargePages && UseHugeTLBFS, "only for Huge TLBFS large pages"); >>> 4012: assert(is_aligned(bytes, large_page_size), "Unaligned size"); >>> 4013: assert(is_aligned(req_addr, large_page_size), "Unaligned address"); >> >> Adding an assert here that `large_page_size` is larger than os::vm_page_size (small page size) to ensure we actually get a large page size from `page_size_for_region_aligned()`. Otherwise the passed in a size wasn't correctly aligned. > > @kstefanj > Hmm. > > `os::page_size_for_region_xxx` can return any page size, including the base page size. Caller may reasonably pass in any reserve size; we may run on a system where the only large page available is > caller size, or we specified LargePageSizeInBytes=1G. > > I actually would prefer this function to graciously handle the case of too small input size and just allocate whatever fits the caller size best; if its only 4K pages so be it. But this also could be done in a future RFE. As you both know I've been playing around in this area quite a bit lately and I think the final version will differ from what this PR will produce. The problem I see is that it will be quite hard to achieve what this PR wants without doing quite significant refactoring. The code above should work if all other layers setup the mapping correctly, we should only end up in this function if the size is a multiple of a large page size. 
I think this part of the code would be significantly easier if I first pushed a change that I have lined up to fix the strange condition in `reserve_memory_special_huge_tlbfs()`. The fix merges the `*_only` and `*_mixed` helpers into the main function to simplify the logic: https://github.com/openjdk/jdk/compare/master...kstefanj:8262291-one-special-alloc If we had this function we could use the alignment parameter to figure out the largest possible page size to use. ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From rkennke at openjdk.java.net Tue Mar 16 10:57:39 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 16 Mar 2021 10:57:39 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v6] In-Reply-To: References: Message-ID: <6SJB82pxF38mpJu9xMMgZNE00rlpkRE0NeflRjb7aic=.90f927f9-e6b2-40b1-8472-13995c765eb9@github.com> > We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs. > > I believe this might be the root cause for JDK-8262852. > > The way out of it is change conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress. > > There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. But conc-weakroots-in-progress is turned-off concurrently. I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change.
> > This stuff makes the verifier unhappy, because it doesn't know about the new bit. And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?) > > Testing: > - [x] New testcase failed without change, passes now > - [x] hotspot_gc_shenandoah > - [ ] tier1 (+Shenandoah) Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: - Merge branch 'master' into JDK-8263427 - Remove unused code. Move ShenandoahRendezvousClosure close to where it's used. - Rename 'disable weak roots' -> 'final roots' everywhere - Disable weakroots together with evacuation at safepoint, and in a separate vmop in shortcut cycles - Correct order of rendezvous, global- and local-flag updates; cleanup rendezvous - Verify correct weakroots-in-progress state (by Aleksey) - Ensure test does a complete GC cycle before verification - 8263427: Shenandoah: Trigger weak-LRB even when heap is stable ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2945/files - new: https://git.openjdk.java.net/jdk/pull/2945/files/5abeece7..bb76b16e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2945&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2945&range=04-05 Stats: 26071 lines in 1258 files changed: 21363 ins; 2179 del; 2529 mod Patch: https://git.openjdk.java.net/jdk/pull/2945.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2945/head:pull/2945 PR: https://git.openjdk.java.net/jdk/pull/2945 From shade at openjdk.java.net Tue Mar 16 10:57:47 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 16 Mar 2021 10:57:47 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is
stable [v6] In-Reply-To: <6SJB82pxF38mpJu9xMMgZNE00rlpkRE0NeflRjb7aic=.90f927f9-e6b2-40b1-8472-13995c765eb9@github.com> References: <6SJB82pxF38mpJu9xMMgZNE00rlpkRE0NeflRjb7aic=.90f927f9-e6b2-40b1-8472-13995c765eb9@github.com> Message-ID: On Tue, 16 Mar 2021 10:54:49 GMT, Roman Kennke wrote: >> We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs. >> >> I believe this might be the root cause for JDK-8262852. >> >> The way out of it is change conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress. >> >> There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. But conc-weakroots-in-progress is turned-off concurrently. I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change. >> >> This stuff makes the verifier unhappy, because it doesn't know about the new bit. And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?) >> >> Testing: >> - [x] New testcase failed without change, passes now >> - [x] hotspot_gc_shenandoah >> - [ ] tier1 (+Shenandoah) > > Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into JDK-8263427 > - Remove unused code. Move ShenandoahRendezvousClosure close to where it's used. > - Rename 'disable weak roots' -> 'final roots' everywhere > - Disable weakroots together with evacuation at safepoint, and in a separate vmop in shortcut cycles > - Correct order of rendezvous, global- and local-flag updates; cleanup rendezvous > - Verify correct weakroots-in-progress state (by Aleksey) > - Ensure test does a complete GC cycle before verification > - 8263427: Shenandoah: Trigger weak-LRB even when heap is stable Marked as reviewed by shade (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2945 From shade at openjdk.java.net Tue Mar 16 10:57:49 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 16 Mar 2021 10:57:49 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v5] In-Reply-To: References: <-EcS4FJW792kabI0eFg2odUsAs23suq4qfIAWFdmaYQ=.c0574380-200a-4065-8bd2-3e2ffa2f13eb@github.com> Message-ID: On Tue, 16 Mar 2021 10:52:56 GMT, Roman Kennke wrote: >> src/hotspot/share/gc/shenandoah/shenandoahNMethod.inline.hpp line 31: >> >>> 29: #include "gc/shared/barrierSetNMethod.hpp" >>> 30: #include "gc/shenandoah/shenandoahNMethod.hpp" >>> 31: #include "gc/shenandoah/shenandoahClosures.inline.hpp" >> >> Is this include necessary? I cannot see why. > > I removed the same include in shenandoahUnload.cpp because it is no longer needed there. But it's needed here because of ShenandoahEvacuateUpdateMetadataClosure. It was only reachable through the include in shenandoahUnload.cpp before. Ah. Makes sense then. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2945 From rkennke at openjdk.java.net Tue Mar 16 10:57:47 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 16 Mar 2021 10:57:47 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v5] In-Reply-To: <-EcS4FJW792kabI0eFg2odUsAs23suq4qfIAWFdmaYQ=.c0574380-200a-4065-8bd2-3e2ffa2f13eb@github.com> References: <-EcS4FJW792kabI0eFg2odUsAs23suq4qfIAWFdmaYQ=.c0574380-200a-4065-8bd2-3e2ffa2f13eb@github.com> Message-ID: On Tue, 16 Mar 2021 09:50:16 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: >> >> - Remove unused code. Move ShenandoahRendezvousClosure close to where it's used. >> - Rename 'disable weak roots' -> 'final roots' everywhere > > src/hotspot/share/gc/shenandoah/shenandoahNMethod.inline.hpp line 31: > >> 29: #include "gc/shared/barrierSetNMethod.hpp" >> 30: #include "gc/shenandoah/shenandoahNMethod.hpp" >> 31: #include "gc/shenandoah/shenandoahClosures.inline.hpp" > > Is this include necessary? I cannot see why. I removed the same include in shenandoahUnload.cpp because it is no longer needed there. But it's needed here because of ShenandoahEvacuateUpdateMetadataClosure. It was only reachable through the include in shenandoahUnload.cpp before. ------------- PR: https://git.openjdk.java.net/jdk/pull/2945 From sjohanss at openjdk.java.net Tue Mar 16 10:50:09 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Tue, 16 Mar 2021 10:50:09 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Tue, 16 Mar 2021 09:37:08 GMT, Jaroslav Bachorik wrote: >>> @kstefanj >>> >>> > Then the heap expands and your application get to a steady state that doesn't require any more marking cycles. 
>>> >>> Is there a way to get the liveness info when the heap expands? If not that would mean we had no way to figure out the new live set size and would assume, conservatively, the last known value. >>> >>> As I mentioned in the PR description the live size value will be a 'best effort estimate' depending on what can each particular GC provide. >> >> Sure, and this is fair, my concern is just that this 'best effort estimate' for G1 will often be worse than just using `used()`. This is not only a problem for when the heap expands, that was just an example, the live value will become more and more stale the longer an application run without triggering a new concurrent cycle. >> >> For the liveness value to be useful it would have to be updated at each GC, and we need to investigate further to see how we can do that in a "cheap" way. I would prefer if G1 did just return `used()` in `live()` as a start and we can create a follow-up task to investigate how to best add a better estimate. Do you see any problem with this? > >> For the liveness value to be useful it would have to be updated at each GC, and we need to investigate further to see how we can do that in a "cheap" way. I would prefer if G1 did just return used() in live() as a start and we can create a follow-up task to investigate how to best add a better estimate. Do you see any problem with this? > > Actually yes - this event was meant to be a cheap way to see whether the *known* live size is growing (has an upwards trend) which would using the `used()` value make more difficult and unreliable. > > I would prefer keeping the implementation to return the lower bound of the live size as is the case for other GCs as well. May add explanatory comments to the code and the event definition to make this clear. > > Also, the `used()` value is already captured in the event so we would have it duplicated. 
Sure, but for the event to be useful we want the reported value to be as close to the reality as possible. I don't understand why you want the lower bound, can you explain why? I would go for the upper bound, which in that case would be `used()` at the end of the GC. I know `used()` is not perfect, but for G1 this is the best "cheap" value we have for liveness at the end of any GC. So as a middle road I would suggest to update `G1CollectedHeap::gc_epilogue(bool full)` to include: set_live(used()); With this you don't need the changes for the `G1FullCollector`. The liveness calculated at Remark would be used until the next young collection and I think here is where some improvements could be made. During the mixed phase a better solution would make use of the liveness information we have for the old regions together with what is newly allocated, but this needs further investigation.
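For illustration only, a minimal self-contained sketch of the middle-road suggestion above. `ToyHeap` and all of its members are hypothetical stand-ins and not the real `G1CollectedHeap`; the only part taken from the proposal is the `set_live(used())` call at the end of every collection.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical toy model of a collected heap (assumption: not HotSpot code).
class ToyHeap {
  std::size_t _used = 0;  // bytes currently occupied: live objects plus garbage
  std::size_t _live = 0;  // last known live-size estimate

public:
  // Mutator allocation only grows used(); the live estimate is not touched.
  void allocate(std::size_t bytes) { _used += bytes; }

  // Pretend a collection reclaimed `bytes` of garbage.
  void reclaim(std::size_t bytes) {
    assert(bytes <= _used);
    _used -= bytes;
  }

  std::size_t used() const { return _used; }
  std::size_t live() const { return _live; }
  void set_live(std::size_t value) { _live = value; }

  // Runs at the end of every collection, young or full. Right after a GC,
  // used() is an upper bound on the live size, so it is recorded as the
  // estimate until the next collection refreshes it.
  void gc_epilogue(bool full) {
    (void)full;  // the toy model treats young and full collections alike
    set_live(used());
  }
};
```

In this model the estimate is refreshed at each GC and stays fixed in between, so allocation between collections moves `used()` but not `live()`.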
------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Tue Mar 16 11:09:09 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 16 Mar 2021 11:09:09 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Tue, 16 Mar 2021 10:47:22 GMT, Stefan Johansson wrote: > Sure, but for the event to be useful we want the reported value to be as close to the reality as possible. I don't understand why you want the lower bound, can you explain why? I would go for the upper bound, which in that case would be used() at the end of the GC. I know used() is not perfect, but for G1 this is the best "cheap" value we have for liveness at the end of any GC. Mostly because `used()` will report all live instances and potential garbage and will make it inconsistent with what the other GCs would report. > So as a middle road I would suggest to update G1CollectedHeap::gc_epilogue(bool full) to include: > > set_live(used()); With this you don't need the changes for the G1FullCollector. The liveness calculated at Remark would be used until the next young collection and I think here is where some improvements could be made. During the mixed phase a better solution would make use of the liveness information we have for the old regions together with what is newly allocated, but this needs further investigation. This sounds interesting. Let me try this out. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From sjohanss at openjdk.java.net Tue Mar 16 11:32:14 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Tue, 16 Mar 2021 11:32:14 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: <8kTv1ENK1FbxIt7NVkNJNgT55bLw5ao4lWi8uq0nktQ=.a969127a-351a-4393-a932-934fbe9b6924@github.com> On Tue, 16 Mar 2021 11:06:22 GMT, Jaroslav Bachorik wrote: > > Sure, but for the event to be useful we want the reported value to be as close to the reality as possible. I don't understand why you want the lower bound, can you explain why? I would go for the upper bound, which in that case would be used() at the end of the GC. I know used() is not perfect, but for G1 this is the best "cheap" value we have for liveness at the end of any GC. > > Mostly because `used()` will report all live instances and potential garbage and will make it inconsistent with what the other GCs would report. > The other STW GCs do report the same, right? > > So as a middle road I would suggest to update G1CollectedHeap::gc_epilogue(bool full) to include: > > set_live(used()); > > With this you don't need the changes for the G1FullCollector. The liveness calculated at Remark would be used until the next young collection and I think here is where some improvements could be made. During the mixed phase a better solution would make use of the liveness information we have for the old regions together with what is newly allocated, but this needs further investigation. > > This sounds interesting. Let me try this out. 
Glad you like the idea :) I did a quick test locally and it shows the trend ok, even if it is an overestimate of live: live = 1.1 GB live = 1.2 GB live = 1.5 GB live = 1.7 GB live = 2.1 GB name = "G1Old" live = 1.4 GB live = 1.6 GB live = 1.8 GB live = 2.0 GB live = 2.3 GB live = 2.5 GB live = 2.8 GB live = 3.1 GB live = 3.3 GB live = 3.7 GB live = 4.0 GB live = 4.3 GB name = "G1Old" live = 1.2 GB G1Old is from concurrent mark events. ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Tue Mar 16 11:41:12 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 16 Mar 2021 11:41:12 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: <8kTv1ENK1FbxIt7NVkNJNgT55bLw5ao4lWi8uq0nktQ=.a969127a-351a-4393-a932-934fbe9b6924@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> <8kTv1ENK1FbxIt7NVkNJNgT55bLw5ao4lWi8uq0nktQ=.a969127a-351a-4393-a932-934fbe9b6924@github.com> Message-ID: On Tue, 16 Mar 2021 11:28:50 GMT, Stefan Johansson wrote: > The other STW GCs do report the same, right? They report the lower bound - basically the real live size right at the end of a GC cycle (either gathered during marking or the used size after compaction). Of course, a few moments later it might (and probably will) not be 100% correct but that's why it is just an estimate.
------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From tschatzl at openjdk.java.net Tue Mar 16 11:54:09 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 16 Mar 2021 11:54:09 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: <_XwpqPW-sKwuvZG26bSgdW6RKtQaIJz-0fpRR5wGa0c=.d6c5394b-7fea-4f0e-ab5f-9092286d3db1@github.com> On Tue, 16 Mar 2021 10:47:22 GMT, Stefan Johansson wrote: > > For the liveness value to be useful it would have to be updated at each GC, and we need to investigate further to see how we can do that in a "cheap" way. I would prefer if G1 did just return used() in live() as a start and we can create a follow-up task to investigate how to best add a better estimate. Do you see any problem with this? > > Actually yes - this event was meant to be a cheap way to see whether the _known_ live size is growing (has an upwards trend) which would using the `used()` value make more difficult and unreliable. So one of the actual purposes seems to be some kind of leak detection: there is this JFR leak detector (I only know the feature name, not completely how it works and what its overhead is) for this purpose, wouldn't that work? Also, for this purpose, why would used() not be a good substitute for liveness? If e.g. used() average grows over time you can deduce the same I would assume (particularly used() after mixed gc phase in g1). Do you have any numbers on what the impact of using used() vs. this live() would be in such a use case? What I'm afraid of is that mixing values taken at different times - used and capacity are taken at the time of the event, and the liveness estimate updated at other, irregular intervals may cause significant amount of confusion in interpreting this value. It might be obvious to you, but there will be other users.
One option could be detaching the liveness estimate from used()/capacity() (I see a value in having some heap usage summary at regular intervals) and send the liveness estimate event just when they are generated? Then the various collectors could send this liveness value at times when they think they are fairly accurate, not when the collectors must and particularly not in conjunction with samples taken at completely different times. Independent of whether used/capacity and liveness are sent, the receiver needs to do statistics (trend lines) on those anyway. ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From zgu at openjdk.java.net Tue Mar 16 12:23:17 2021 From: zgu at openjdk.java.net (Zhengyu Gu) Date: Tue, 16 Mar 2021 12:23:17 GMT Subject: RFR: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable [v6] In-Reply-To: <6SJB82pxF38mpJu9xMMgZNE00rlpkRE0NeflRjb7aic=.90f927f9-e6b2-40b1-8472-13995c765eb9@github.com> References: <6SJB82pxF38mpJu9xMMgZNE00rlpkRE0NeflRjb7aic=.90f927f9-e6b2-40b1-8472-13995c765eb9@github.com> Message-ID: On Tue, 16 Mar 2021 10:57:39 GMT, Roman Kennke wrote: >> We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs. >> >> I believe this might be the root cause for JDK-8262852. >> >> The way out of it is change conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress. >> >> There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. 
But conc-weakroots-in-progress is turned-off concurrently. I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change. >> >> This stuff makes the verifier unhappy, because it doesn't know about the new bit. And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?) >> >> Testing: >> - [x] New testcase failed without change, passes now >> - [x] hotspot_gc_shenandoah >> - [ ] tier1 (+Shenandoah) > > Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge branch 'master' into JDK-8263427 > - Remove unused code. Move ShenandoahRendezvousClosure close to where it's used. > - Rename 'disable weak roots' -> 'final roots' everywhere > - Disable weakroots together with evacuation at safepoint, and in a separate vmop in shortcut cycles > - Correct order of rendezvous, global- and local-flag updates; cleanup rendezvous > - Verify correct weakroots-in-progress state (by Aleksey) > - Ensure test does a complete GC cycle before verification > - 8263427: Shenandoah: Trigger weak-LRB even when heap is stable Looks good. ------------- Marked as reviewed by zgu (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/2945 From jbachorik at openjdk.java.net Tue Mar 16 12:26:14 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 16 Mar 2021 12:26:14 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: <_XwpqPW-sKwuvZG26bSgdW6RKtQaIJz-0fpRR5wGa0c=.d6c5394b-7fea-4f0e-ab5f-9092286d3db1@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> <_XwpqPW-sKwuvZG26bSgdW6RKtQaIJz-0fpRR5wGa0c=.d6c5394b-7fea-4f0e-ab5f-9092286d3db1@github.com> Message-ID: On Tue, 16 Mar 2021 11:51:45 GMT, Thomas Schatzl wrote: > So one of the actual purposes seems to be some kind of leak detection: there is this JFR leak detector (I only know the feature name, not completely how it works and what its overhead is) for this purpose, wouldn't that work? Yes. But enabling that comes with an extra price so it is more of a focused tool than something you could use in continuous monitoring/signal evaluation. > Also, for this purpose, why would used() not be a good substitute for liveness? If e.g. used() average grows over time you can deduce the same I would assume (particularly used() after mixed gc phase in g1). The major problem is that eg. for g1 given large enough heap the used value can keep on growing for quite long time, possibly generating wrong signal about potential memory leak. If the live estimate is set to `used()` after mixed gc phase in g1 I think it still will be a good estimate. The only thing I am opposing is having `live()` call return the current `used()` value which, IMO, might become rather confusing. > Do you have any numbers on what the impact of using used() vs. this live() would be in such a use case? Nope. Do you mean perf impact? 
> What I'm afraid of is that mixing values taken at different times - used and capacity are taken at the time of the event, and the liveness estimate updated at other, irregular intervals - may cause a significant amount of confusion in interpreting this value. It might be obvious to you, but there will be other users. IDK. If the event field explicitly mentioned that this is the **last known live size estimate** it should set the expectations right. > > One option could be detaching the liveness estimate from used()/capacity() (I see a value in having some heap usage summary at regular intervals) and send the liveness estimate event just when they are generated? Then the various collectors could send this liveness value at times when they think they are fairly accurate, not when the collectors must and particularly not in conjunction with samples taken at completely different times. The problem is the irregularity - if the live size is reported only when it is calculated, there might be long periods in the recording with no live size data at all. In order for this information to be useful it should be reported at least at the beginning and end of a JFR chunk. > Independent of whether used/capacity and liveness are sent, the receiver needs to do statistics (trend lines) on those anyway. Yes. It's just that with the live size estimate one wouldn't be getting the false positives one would get with the used heap trend.
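The "last known live size estimate" idea discussed above can be sketched as follows. This is only an illustration under stated assumptions: `HeapModel`, `allocate()` and `finish_gc()` are hypothetical names made up for this example and are not part of HotSpot; only the notion of a `live()` value that falls back to `used()` until a GC has produced an estimate comes from the PR description.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical model of a heap that caches a liveness estimate.
// Not HotSpot code; all names here are assumptions for illustration.
class HeapModel {
  std::atomic<std::size_t> _used{0};
  std::atomic<std::size_t> _live{0};
  std::atomic<bool> _live_valid{false};

public:
  // Mutator allocation: grows used(), leaves the cached estimate untouched.
  void allocate(std::size_t bytes) { _used += bytes; }

  // End of a GC cycle: the surviving bytes become the new used() value and
  // the new "last known" live estimate.
  void finish_gc(std::size_t surviving_bytes) {
    assert(surviving_bytes <= _used.load());
    _used.store(surviving_bytes);
    _live.store(surviving_bytes);
    _live_valid.store(true);
  }

  // Sampled at event time.
  std::size_t used() const { return _used.load(); }

  // Last known estimate; falls back to used() before the first GC.
  std::size_t live() const {
    return _live_valid.load() ? _live.load() : used();
  }
};
```

A periodic event would read `used()` at emission time but `live()` from the cache, which is why the two values can refer to different points in time.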
------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Tue Mar 16 12:32:12 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 16 Mar 2021 12:32:12 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: <8kTv1ENK1FbxIt7NVkNJNgT55bLw5ao4lWi8uq0nktQ=.a969127a-351a-4393-a932-934fbe9b6924@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> <8kTv1ENK1FbxIt7NVkNJNgT55bLw5ao4lWi 8uq0nktQ=.a969127a-351a-4393-a932-934fbe9b6924@github.com> Message-ID: On Tue, 16 Mar 2021 11:28:50 GMT, Stefan Johansson wrote: >>> Sure, but for the event to be useful we want the reported value to be as close to the reality as possible. I don't understand why you want the lower bound, can you explain why? I would go for the upper bound, which in that case would be used() at the end of the GC. I know used() is not perfect, but for G1 this is the best "cheap" value we have for liveness at the end of any GC. >> >> Mostly because `used()` will report all live instances and potential garbage and will make it inconsistent with what the other GCs would report. >> >>> So as a middle road I would suggest to update G1CollectedHeap::gc_epilogue(bool full) to include: >>> >>> set_live(used()); >> With this you don't need the changes for the G1FullCollector. The liveness calculated at Remark would be used until the next young collection and I think here is where some improvements could be made. During the mixed phase a better solution would make use of the liveness information we have for the old regions together with what is newly allocated, but this needs further investigation. >> >> This sounds interesting. Let me try this out. > >> > Sure, but for the event to be useful we want the reported value to be as close to the reality as possible. I don't understand why you want the lower bound, can you explain why? 
I would go for the upper bound, which in that case would be used() at the end of the GC. I know used() is not perfect, but for G1 this is the best "cheap" value we have for liveness at the end of any GC. >> >> Mostly because `used()` will report all live instances and potential garbage and will make it inconsistent with what the other GCs would report. >> > The other STW GCs do report the same, right? > >> > So as a middle road I would suggest to update G1CollectedHeap::gc_epilogue(bool full) to include: >> > set_live(used()); >> > With this you don't need the changes for the G1FullCollector. The liveness calculated at Remark would be used until the next young collection and I think here is where some improvements could be made. During the mixed phase a better solution would make use of the liveness information we have for the old regions together with what is newly allocated, but this needs further investigation. >> >> This sounds interesting. Let me try this out. > > Glad you like the idea :) I did a quick test locally and it shows the trend ok, even if it is an over estimate of live: > live = 1.1 GB > live = 1.2 GB > live = 1.5 GB > live = 1.7 GB > live = 2.1 GB > name = "G1Old" > live = 1.4 GB > live = 1.6 GB > live = 1.8 GB > live = 2.0 GB > live = 2.3 GB > live = 2.5 GB > live = 2.8 GB > live = 3.1 GB > live = 3.3 GB > live = 3.7 GB > live = 4.0 GB > live = 4.3 GB > name = "G1Old" > live = 1.2 GB > G1Old is from concurrent mark events. @kstefanj Just to make sure - `set_live(used())` should be the last call in `G1CollectedHeap::gc_prologue(bool full)` ? I am getting some funny numbers with this change - basically, last known live size is getting bigger than the current used size ?? 
![Screen Shot 2021-03-16 at 1 28 27 PM](https://user-images.githubusercontent.com/738413/111308939-97b06d80-865b-11eb-9e2f-d595b49d6401.png) ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From sjohanss at openjdk.java.net Tue Mar 16 14:15:33 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Tue, 16 Mar 2021 14:15:33 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: <8kTv1ENK1FbxIt7NVkNJNgT55bLw5ao4lWi8uq0nktQ=.a969127a-351a-4393-a932-934fbe9b6924@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> <8kTv1ENK1FbxIt7NVkNJNgT55bLw5ao4lWi 8uq0nktQ=.a969127a-351a-4393-a932-934fbe9b6924@github.com> Message-ID: On Tue, 16 Mar 2021 11:28:50 GMT, Stefan Johansson wrote: >>> Sure, but for the event to be useful we want the reported value to be as close to the reality as possible. I don't understand why you want the lower bound, can you explain why? I would go for the upper bound, which in that case would be used() at the end of the GC. I know used() is not perfect, but for G1 this is the best "cheap" value we have for liveness at the end of any GC. >> >> Mostly because `used()` will report all live instances and potential garbage and will make it inconsistent with what the other GCs would report. >> >>> So as a middle road I would suggest to update G1CollectedHeap::gc_epilogue(bool full) to include: >>> >>> set_live(used()); >> With this you don't need the changes for the G1FullCollector. The liveness calculated at Remark would be used until the next young collection and I think here is where some improvements could be made. During the mixed phase a better solution would make use of the liveness information we have for the old regions together with what is newly allocated, but this needs further investigation. >> >> This sounds interesting. Let me try this out. 
> >> > Sure, but for the event to be useful we want the reported value to be as close to the reality as possible. I don't understand why you want the lower bound, can you explain why? I would go for the upper bound, which in that case would be used() at the end of the GC. I know used() is not perfect, but for G1 this is the best "cheap" value we have for liveness at the end of any GC. >> >> Mostly because `used()` will report all live instances and potential garbage and will make it inconsistent with what the other GCs would report. >> > The other STW GCs do report the same, right? > >> > So as a middle road I would suggest to update G1CollectedHeap::gc_epilogue(bool full) to include: >> > set_live(used()); >> > With this you don't need the changes for the G1FullCollector. The liveness calculated at Remark would be used until the next young collection and I think here is where some improvements could be made. During the mixed phase a better solution would make use of the liveness information we have for the old regions together with what is newly allocated, but this needs further investigation. >> >> This sounds interesting. Let me try this out. > > Glad you like the idea :) I did a quick test locally and it shows the trend ok, even if it is an over estimate of live: > live = 1.1 GB > live = 1.2 GB > live = 1.5 GB > live = 1.7 GB > live = 2.1 GB > name = "G1Old" > live = 1.4 GB > live = 1.6 GB > live = 1.8 GB > live = 2.0 GB > live = 2.3 GB > live = 2.5 GB > live = 2.8 GB > live = 3.1 GB > live = 3.3 GB > live = 3.7 GB > live = 4.0 GB > live = 4.3 GB > name = "G1Old" > live = 1.2 GB > G1Old is from concurrent mark events. > @kstefanj > Just to make sure - `set_live(used())` should be the last call in `G1CollectedHeap::gc_prologue(bool full)` ? > Anywhere in there should be fine, we should look if there is anything related in there it could be grouped with. 
> I am getting some funny numbers with this change - basically, last known live size is getting bigger than the current used size ?? > > ![Screen Shot 2021-03-16 at 1 28 27 PM](https://user-images.githubusercontent.com/738413/111308939-97b06d80-865b-11eb-9e2f-d595b49d6401.png) This is strange, what kind of GCs are happening around those samples? ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Tue Mar 16 14:15:32 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 16 Mar 2021 14:15:32 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v15] In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: <2GhOBO4qaEDTMS1tOwAj3u_-_3__n-M_HVFLE9cSJ-s=.9d7830c1-1f65-46a3-8b16-ce6b77367559@github.com> > The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in the worst case 'at last full GC') in the form of a periodically emitted JFR event. > > ## Introducing new JFR event > > While there is already a 'GC Heap Summary' JFR event, it does not fit the requirements as it is closely tied to the GC cycle, so e.g. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events present in the JFR recording at all. > Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information about the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time.
> > ## Implementation > > The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness. The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value. > > The implementations are based on my (rather shallow) knowledge of the inner workings of the respective GC engines and I am open to suggestions to make them better/correct. > > ### Epsilon GC > > Trivial implementation - just return `used()` instead. > > ### Serial GC > > Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps an internal info about objects being 'dead' but excluded from the compaction effort and we can use these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). > > ### Parallel GC > > For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). > > ### G1 GC > > Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. > > ### Shenandoah > > In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context.
> This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. > > ### ZGC > > `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: Update liveness for G1 mixed GC ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2579/files - new: https://git.openjdk.java.net/jdk/pull/2579/files/81250d1c..f767f257 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=14 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2579&range=13-14 Stats: 5 lines in 2 files changed: 3 ins; 2 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/2579.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2579/head:pull/2579 PR: https://git.openjdk.java.net/jdk/pull/2579 From sjohanss at openjdk.java.net Tue Mar 16 14:15:33 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Tue, 16 Mar 2021 14:15:33 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> <8kTv1ENK1FbxIt7NVkNJNgT55bLw5ao4lWi 8uq0nktQ=.a969127a-351a-4393-a932-934fbe9b6924@github.com> Message-ID: On Tue, 16 Mar 2021 13:24:38 GMT, Stefan Johansson wrote: >>> > Sure, but for the event to be useful we want the reported value to be as close to the reality as possible. I don't understand why you want the lower bound, can you explain why? I would go for the upper bound, which in that case would be used() at the end of the GC. 
I know used() is not perfect, but for G1 this is the best "cheap" value we have for liveness at the end of any GC. >>> >>> Mostly because `used()` will report all live instances and potential garbage and will make it inconsistent with what the other GCs would report. >>> >> The other STW GCs do report the same, right? >> >>> > So as a middle road I would suggest to update G1CollectedHeap::gc_epilogue(bool full) to include: >>> > set_live(used()); >>> > With this you don't need the changes for the G1FullCollector. The liveness calculated at Remark would be used until the next young collection and I think here is where some improvements could be made. During the mixed phase a better solution would make use of the liveness information we have for the old regions together with what is newly allocated, but this needs further investigation. >>> >>> This sounds interesting. Let me try this out. >> >> Glad you like the idea :) I did a quick test locally and it shows the trend ok, even if it is an over estimate of live: >> live = 1.1 GB >> live = 1.2 GB >> live = 1.5 GB >> live = 1.7 GB >> live = 2.1 GB >> name = "G1Old" >> live = 1.4 GB >> live = 1.6 GB >> live = 1.8 GB >> live = 2.0 GB >> live = 2.3 GB >> live = 2.5 GB >> live = 2.8 GB >> live = 3.1 GB >> live = 3.3 GB >> live = 3.7 GB >> live = 4.0 GB >> live = 4.3 GB >> name = "G1Old" >> live = 1.2 GB >> G1Old is from concurrent mark events. > >> @kstefanj >> Just to make sure - `set_live(used())` should be the last call in `G1CollectedHeap::gc_prologue(bool full)` ? >> > Anywhere in there should be fine, we should look if there is anything related in there it could be grouped with. > >> I am getting some funny numbers with this change - basically, last known live size is getting bigger than the current used size ?? 
>> >> ![Screen Shot 2021-03-16 at 1 28 27 PM](https://user-images.githubusercontent.com/738413/111308939-97b06d80-865b-11eb-9e2f-d595b49d6401.png) > > This is strange, what kind of GCs are happening around those samples? Oh, sorry! I messed up, you should put the code in `gc_epilogue()` ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Tue Mar 16 14:15:33 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Tue, 16 Mar 2021 14:15:33 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> <8kTv1ENK1FbxIt7NVkNJNgT55bLw5ao4lWi8uq0nktQ=.a969127a-351a-4393-a932-934fbe9b6924@github.com> Message-ID: <6T_A5WVMZCvYpceI82Ho8ltG-fLddego0zquojRykOI=.36314a7c-1e28-4e0c-8425-ab583d78c8d8@github.com> On Tue, 16 Mar 2021 13:25:35 GMT, Stefan Johansson wrote: > Oh, sorry! I messed up, you should put the code in gc_epilogue() Np! Cool, this version works as expected. ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From rkennke at openjdk.java.net Tue Mar 16 14:16:11 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Tue, 16 Mar 2021 14:16:11 GMT Subject: Integrated: 8263427: Shenandoah: Trigger weak-LRB even when heap is stable In-Reply-To: References: Message-ID: <8heQXgfbyTP4HrJoFnWQh0hleVOEMgpfCsqyhxuYYTE=.e902fffd-aac2-491b-a326-71a7d99f230c@github.com> On Thu, 11 Mar 2021 18:38:26 GMT, Roman Kennke wrote: > We currently guard all LRBs, including weak-LRB, by a test for heap-stable and only enter the LRB when heap is unstable (e.g. evacuation or update-refs in progress). However, the weak LRB must also be entered when heap is stable and concurrent refs is in progress, otherwise we may accidentally resurrect otherwise unreachable weak referents. This can happen when we take the shortcut cycle and skip evac&update-refs.
> > I believe this might be the root cause for JDK-8262852. > > The way out of it is change conc-weakroots-in-progress flag to a bit in gc-state, and test for this in weak-LRB gc-state-check, and enter weak-LRB even when heap is stable, but conc-weakroots-in-progress. > > There's one gotcha here: we used to change gc-state only at safepoints so that the flag can safely be propagated to all Java threads. But conc-weakroots-in-progress is turned off concurrently. I deal with this by propagating the flag change to Java threads via the rendezvous (that we do anyway), and change the global flag only once all threads got the thread-local flag change. > > This stuff makes the verifier unhappy, because it doesn't know about the new bit. And it'd be difficult to properly verify it, because sometimes it is set (conc-cycle) and sometimes it is not (degen-cycle), so instead of adding extra verification, I figured we could keep ignoring the flag (for now?) > > Testing: > - [x] New testcase failed without change, passes now > - [x] hotspot_gc_shenandoah > - [ ] tier1 (+Shenandoah) This pull request has now been integrated. Changeset: 75ef6f58 Author: Roman Kennke URL: https://git.openjdk.java.net/jdk/commit/75ef6f58 Stats: 292 lines in 23 files changed: 215 ins; 66 del; 11 mod 8263427: Shenandoah: Trigger weak-LRB even when heap is stable Reviewed-by: shade, zgu ------------- PR: https://git.openjdk.java.net/jdk/pull/2945 From dcubed at openjdk.java.net Tue Mar 16 14:30:11 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Tue, 16 Mar 2021 14:30:11 GMT Subject: RFR: 8254050: HotSpot Style Guide should permit using the "override" virtual specifier In-Reply-To: References: Message-ID: On Tue, 16 Mar 2021 03:44:56 GMT, Kim Barrett wrote: > Please review and vote on this change to the HotSpot Style Guide to permit > the use of `override` virtual specifiers.
The virtual specifiers `override` > and `final` were added in C++11, and use of `final` is already permitted in > HotSpot code. > > Using the `override` specifier provides error checking that the function is > indeed overriding a virtual function declared in a base class. This can > prevent some often surprisingly difficult to spot bugs. > > This is a modification of the Style Guide, so rough consensus among > the HotSpot Group members is required to make this change. Only Group > members should vote for approval (via the github PR), though reasoned > objections or comments from anyone will be considered. A decision on > this proposal will not be made before Tuesday 30-Mar-2021 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review > process to approve (click on Review Changes > Approve), rather than > sending a "vote: yes" email reply that would be normal for a CFV. > Other responses can still use email of course. Marked as reviewed by dcubed (Reviewer). doc/hotspot-style.md line 753: > 751: ([n2928](http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2009/n2928.htm)), > 752: ([n3206](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3206.htm)), > 753: ([n3272](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3272.htm)) nit - The difference in case caught my attention: JTC1 <-> jtc1 SC22 <-> sc22 WG21 <-> wg21 any particular significance to the difference? 
------------- PR: https://git.openjdk.java.net/jdk/pull/3021 From iklam at openjdk.java.net Tue Mar 16 15:14:09 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 16 Mar 2021 15:14:09 GMT Subject: RFR: 8254050: HotSpot Style Guide should permit using the "override" virtual specifier In-Reply-To: References: Message-ID: <2ByDPwnZoGg47l2hXnyygmTsKUsbANV7SZEcg2aR6OQ=.0437246f-1223-4340-8c89-de8b642bad67@github.com> On Tue, 16 Mar 2021 03:44:56 GMT, Kim Barrett wrote: > Please review and vote on this change to the HotSpot Style Guide to permit > the use of `override` virtual specifiers. The virtual specifiers `override` > and `final` were added in C++11, and use of `final` is already permitted in > HotSpot code. > > Using the `override` specifier provides error checking that the function is > indeed overriding a virtual function declared in a base class. This can > prevent some often surprisingly difficult to spot bugs. > > This is a modification of the Style Guide, so rough consensus among > the HotSpot Group members is required to make this change. Only Group > members should vote for approval (via the github PR), though reasoned > objections or comments from anyone will be considered. A decision on > this proposal will not be made before Tuesday 30-Mar-2021 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review > process to approve (click on Review Changes > Approve), rather than > sending a "vote: yes" email reply that would be normal for a CFV. > Other responses can still use email of course. Marked as reviewed by iklam (Reviewer). 
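The error checking that the quoted proposal attributes to `override` can be illustrated with a small sketch. The names `Closure`, `DoublingClosure` and `run` are hypothetical and chosen only for this example; they are not taken from HotSpot or from the style guide.

```cpp
#include <cassert>

// Hypothetical base class, not HotSpot code.
struct Closure {
  virtual ~Closure() = default;
  virtual int do_work(int value) const { return value; }
};

struct DoublingClosure : Closure {
  // `override` asks the compiler to verify that this declaration matches a
  // virtual function in a base class.
  int do_work(int value) const override { return 2 * value; }

  // Without the `const`, the declaration below would silently introduce a
  // new, unrelated function. With `override` the compiler rejects it:
  //   int do_work(int value) override;  // error: does not override anything
};

// Calls through the base-class interface, so dynamic dispatch is used.
inline int run(const Closure& c, int value) {
  assert(value >= 0);
  return c.do_work(value);
}
```

The commented-out declaration is the kind of hard-to-spot mismatch the proposal refers to: the signature differs only by a missing `const`.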
------------- PR: https://git.openjdk.java.net/jdk/pull/3021 From enikitin at openjdk.java.net Tue Mar 16 16:12:25 2021 From: enikitin at openjdk.java.net (Evgeny Nikitin) Date: Tue, 16 Mar 2021 16:12:25 GMT Subject: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v6] In-Reply-To: References: Message-ID: > Another approach to the JDK-8058176 and #2440 - never allowing the tests to hit CodeCache limits. The most significant consumer is the MH graph builder (the MHTransformationGen), whose consumption is now controlled. List of changes: > > * Code cache size getters are added to WhiteBox; > * MH sequences are now built with remaining Code cache size in mind (always leave 2M clearance); > * Dependencies on WhiteBox added for all affected tests; > * The test cases in question un-problemlisted. > > Testing: the whole vmTestbase/vm/mlvm/ in win-lin-mac x86. Evgeny Nikitin has updated the pull request incrementally with one additional commit since the last revision: Move a comment a bit higher ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2523/files - new: https://git.openjdk.java.net/jdk/pull/2523/files/76b02724..64ae20d5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2523&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2523&range=04-05 Stats: 3 lines in 1 file changed: 2 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/2523.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2523/head:pull/2523 PR: https://git.openjdk.java.net/jdk/pull/2523 From iignatyev at openjdk.java.net Tue Mar 16 16:19:08 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Tue, 16 Mar 2021 16:19:08 GMT Subject: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v6] In-Reply-To: References: Message-ID: On Tue, 16 Mar 2021 16:12:25 GMT, Evgeny Nikitin wrote: >> Another approach to the JDK-8058176 and #2440 - never allowing the tests to hit CodeCache limits.
The most significant consumer is the MH graph builder (the MHTransformationGen), whose consumption is now controlled. List of changes: >> >> * Code cache size getters are added to WhiteBox; >> * MH sequences are now built with remaining Code cache size in mind (always leave 2M clearance); >> * Dependencies on WhiteBox added for all affected tests; >> * The test cases in question un-problemlisted. >> >> Testing: the whole vmTestbase/vm/mlvm/ in win-lin-mac x86. > > Evgeny Nikitin has updated the pull request incrementally with one additional commit since the last revision: > > Move a comment a bit higher Marked as reviewed by iignatyev (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2523 From github.com+168222+mgkwill at openjdk.java.net Tue Mar 16 18:01:22 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 16 Mar 2021 18:01:22 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v24] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved.
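The selection rule in the last sentence of the description can be sketched roughly as follows. This is not the code from the PR: `select_page_size` is a hypothetical helper, a `std::vector` stands in for `_page_sizes`, and treating a page size equal to the reservation as acceptable (`<=`) is an assumption made for the example, since the description says "smaller than".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not the HotSpot implementation: returns the largest
// entry of `page_sizes` that does not exceed `bytes`, or `base_page_size`
// (e.g. 4K) when no large page size fits.
inline std::size_t select_page_size(const std::vector<std::size_t>& page_sizes,
                                    std::size_t bytes,
                                    std::size_t base_page_size) {
  assert(base_page_size > 0);
  std::size_t best = base_page_size;
  for (std::size_t candidate : page_sizes) {
    // Assumption: a page exactly as large as the reservation is acceptable.
    if (candidate <= bytes && candidate > best) {
      best = candidate;
    }
  }
  return best;
}
```

With 2M and 1G pages available, a code heap reservation of a few hundred megabytes would get 2M pages under this rule instead of falling back to the base page size.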
Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: Addressed kstefanj review suggestions Signed-off-by: Marcus G K Williams ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/1153/files - new: https://git.openjdk.java.net/jdk/pull/1153/files/22b27b92..1ecb4e65 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=23 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=22-23 Stats: 12 lines in 1 file changed: 5 ins; 6 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Tue Mar 16 18:01:23 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 16 Mar 2021 18:01:23 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v23] In-Reply-To: <_uIvdHdm5ptjZJ8gEw8AzBJ8PC-16GoEWUuKW5zAXrg=.36c2da1c-685b-4d4d-a4e5-73b3fc48b812@github.com> References: <_uIvdHdm5ptjZJ8gEw8AzBJ8PC-16GoEWUuKW5zAXrg=.36c2da1c-685b-4d4d-a4e5-73b3fc48b812@github.com> Message-ID: On Tue, 16 Mar 2021 09:27:45 GMT, Stefan Johansson wrote: >> Marcus G K Williams has updated the pull request incrementally with one additional commit since the last revision: >> >> Use SIZE_FORMAT in logging >> >> Signed-off-by: Marcus G K Williams > > src/hotspot/os/linux/os_linux.cpp line 3760: > >> 3758: void os::large_page_init() { >> 3759: size_t default_large_page_size = scan_default_large_page_size(); >> 3760: os::Linux::_default_large_page_size = default_large_page_size; > > I would move this below the checking if large pages are enabled. I'm also not sure if this refactoring made the setup easier to follow. I would have preferred the old way, but if you and Thomas prefer this I'm ok with it. 
I don't have strong feelings either way but I think the new refactoring is slightly clearer. ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+168222+mgkwill at openjdk.java.net Tue Mar 16 18:20:09 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 16 Mar 2021 18:20:09 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v23] In-Reply-To: <_uIvdHdm5ptjZJ8gEw8AzBJ8PC-16GoEWUuKW5zAXrg=.36c2da1c-685b-4d4d-a4e5-73b3fc48b812@github.com> References: <_uIvdHdm5ptjZJ8gEw8AzBJ8PC-16GoEWUuKW5zAXrg=.36c2da1c-685b-4d4d-a4e5-73b3fc48b812@github.com> Message-ID: On Tue, 16 Mar 2021 09:59:34 GMT, Stefan Johansson wrote: > I agree that this change should not include too much refactoring. We should do that as separate changes. Some possibly have to be done before this one. > > Regarding your problems: > > 1. This is a somewhat known problem and we have [JDK-8261527](https://bugs.openjdk.java.net/browse/JDK-8261527) for tracking this. I hope to get this in for JDK 17. Up until your change we have always known that only one large page size can be used, so if a ReservedSpace is "special" that page size was used. With your change this has to change somehow and I think it will be hard to achieve without some other refactoring first. > 2. Correct, this test does not yet handle mixed-mappings since none of the mappings traced has used mixed mappings before. The problem you run into is that your new code doesn't honor the request in Parallel to use 2M pages: > https://github.com/openjdk/jdk/blob/a31a23d5e72c4618b5b67e854ef4909110a1b5b4/src/hotspot/share/gc/parallel/parallelArguments.cpp#L120 > > When deciding on a new page size we should honor the passed in alignment if it is larger than the allocation granularity. Because in such case the upper layer has made a decision that the lower layer should honor.
> > One other way forward (not waiting for refactoring) would be to see this change as just enabling use of multiple page sizes and then actually using them will be added later when needed refactoring has been done. Hi Stefan ( @kstefanj ). Thanks for your review. I've addressed the specific suggestions. > When deciding on a new page size we should honor the passed in alignment if it is larger than the allocation granularity. Because in such case the upper layer has made a decision that the lower layer should honor. Also, I'm looking at this today, though not sure we can do this effectively without your refactoring. Thanks for pointing to JDK-8261527. I knew you were looking at large pages code and refactoring but I didn't know about JDK-8261527 or the other issues and PRs. I did run into the TestTracePageSizes.java while testing and saw some of the breadcrumbs. I would be in favor of using https://github.com/openjdk/jdk/compare/master...kstefanj:8262291-one-special-alloc as this looks like a good way forward. However, can we use my current PR as enabling of multiple page sizes and then later incorporate your refactoring? > One other way forward (not waiting for refactoring) would be to see this change as just enabling use of multiple page sizes and then actually using them will be added later when needed refactoring has been done. How would we go about doing your suggestion? There are two reasons I would like to get this PR in sooner rather than waiting, there is some pressure for me to put a bow on this change and I would really appreciate getting some changes in so that I can get my authorship in OpenJDK. I am happy to continue working on incremental changes to make page_size part of the code better and if I had a username I would suggest assigning some of the JDK-Issues to me. As an example I started looking at solving JDK-8263236, but couldn't comment or assign to myself.
Anyways, let me know how we can move forward and what the best path to get this change in for JDK-17. Thanks, Marcus ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From yyang at openjdk.java.net Tue Mar 16 18:51:06 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Tue, 16 Mar 2021 18:51:06 GMT Subject: Integrated: 8263562: Checking if proxy_klass_head is still lambda_proxy_is_available In-Reply-To: References: Message-ID: <7v6iZfqO39tIkoM_iB4eTCDxDHrXPHte7DgtyrRxTpI=.392e86aa-aab6-4330-b700-193941cd5b61@github.com> On Mon, 15 Mar 2021 02:17:36 GMT, Yi Yang wrote: > The `Shared Lambda Dictionary` section in the result of SharedLambdaDictionaryPrinter will mix normal klasses with lambda proxy klasses. Using the following commands can reproduce it: > > Proc1: `./jshell` > Proc2: `jcmd VM.systemdictionary -verbose` > > When all archived lambda proxy classes are used, proxy_klass_head(in RunTimeLambdaProxyClassInfo) is still referring to an instance klass that is no longer lambda_proxy_is_available, and its next_link will be set by classloader to link another normal class. Simply checking if proxy_klass_head is lambda_proxy_is_available can solve this problem. > > Best regards, > Yang This pull request has now been integrated. 
Changeset: 0d2f87e4 Author: Yi Yang Committer: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/0d2f87e4 Stats: 8 lines in 1 file changed: 2 ins; 0 del; 6 mod 8263562: Checking if proxy_klass_head is still lambda_proxy_is_available Reviewed-by: ccheung, iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/3001 From github.com+168222+mgkwill at openjdk.java.net Tue Mar 16 20:59:27 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Tue, 16 Mar 2021 20:59:27 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v25] In-Reply-To: References: Message-ID: > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). > > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. Marcus G K Williams has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 36 commits: - Merge branch 'master' into update_hlp - Addressed kstefanj review suggestions Signed-off-by: Marcus G K Williams - Use SIZE_FORMAT in logging Signed-off-by: Marcus G K Williams - Fix logging issues 2 Signed-off-by: Marcus G K Williams - Fix logging issues Signed-off-by: Marcus G K Williams - Fix reserve_memory_special_huge_tlbfs_mixed, remove logging Signed-off-by: Marcus G K Williams - Fix whitespace error Signed-off-by: Marcus G K Williams - Fix first set of TestTracePageSizes.java issues Signed-off-by: Marcus G K Williams - Update LargePage Setup per review comments Signed-off-by: Marcus G K Williams - Merge remote-tracking branch 'upstream/master' into update_hlp - ... and 26 more: https://git.openjdk.java.net/jdk/compare/9cb9af68...08bf288c ------------- Changes: https://git.openjdk.java.net/jdk/pull/1153/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=24 Stats: 154 lines in 3 files changed: 76 ins; 47 del; 31 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From mseledtsov at openjdk.java.net Wed Mar 17 00:48:07 2021 From: mseledtsov at openjdk.java.net (Mikhailo Seledtsov) Date: Wed, 17 Mar 2021 00:48:07 GMT Subject: RFR: 8246494: introduce vm.flagless at-requires property In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 23:27:21 GMT, Igor Ignatyev wrote: > resurrecting old [RFR](https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-June/041981.html): > >> Hi all, >> >> could you please review the patch which introduces a new @requires property to filter out the tests which ignore externally provided JVM flags? 
>> >> the idea behind this patch is to have a way to clearly mark tests which ignore flags, so >> a) it's obvious that they don't execute a flag-guarded code/feature, and extra care should be taken to use them to verify any flag-guarded changed; >> b) they can be easily excluded from runs w/ flags. >> >> @requires and VMProps allows us to achieve both, so it's been decided to add a new property `vm.flagless`. `vm.flagless` is set to false if there are any XX flags other than `-XX:MaxRAMPercentage` and `-XX:CreateCoredumpOnCrash` (which are known to be set almost always) or any X flags other `-Xmixed`; in other words any tests w/ `@requires vm.flagless` will be excluded from runs w/ any other X / XX flags passed via `-vmoption` / `-javaoption`. in rare cases, when one still wants to run the tests marked by `vm.flagless` w/ external flags, `vm.flagless` can be forcefully set to true by setting any value to `TEST_VM_FLAGLESS` env. variable. >> >> this patch adds necessary common changes and marks common tests, namely Scimark, GTestWrapper and TestNativeProcessBuilder. Component-specific tests will be marked separately by the corresponding subtasks of 8151707[1]. >> >> please note, the patch depends on CODETOOLS-7902336[2], which will be included in the next jtreg version, so this patch is to be integrated only after jtreg5.1 is promoted and we switch to use it by 8246387[3]. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8246494 >> webrev: http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00 >> testing: marked tests w/ different XX and X flags w/ and w/o TEST_VM_FLAGLESS env. 
var, and w/o any flags >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8151707 >> [2] https://bugs.openjdk.java.net/browse/CODETOOLS-7902336 >> [3] https://bugs.openjdk.java.net/browse/JDK-8246387 >> > > after offline discussion with @pliden, it has been decided to reduce the scope of [8246499](https://bugs.openjdk.java.net/browse/JDK-8246499) and not mark the tests that use `UseXGC` flags for selection, e.g. `test/hotspot/jtreg/gc/z/TestSmallHeap.java`. > > Thanks, > -- Igor These changes look good to me. ------------- Marked as reviewed by mseledtsov (Committer). PR: https://git.openjdk.java.net/jdk/pull/2800 From iignatyev at openjdk.java.net Wed Mar 17 00:58:08 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Wed, 17 Mar 2021 00:58:08 GMT Subject: RFR: 8246494: introduce vm.flagless at-requires property In-Reply-To: References: Message-ID: On Wed, 17 Mar 2021 00:45:00 GMT, Mikhailo Seledtsov wrote: >> resurrecting old [RFR](https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-June/041981.html): >> >>> Hi all, >>> >>> could you please review the patch which introduces a new @requires property to filter out the tests which ignore externally provided JVM flags? >>> >>> the idea behind this patch is to have a way to clearly mark tests which ignore flags, so >>> a) it's obvious that they don't execute a flag-guarded code/feature, and extra care should be taken to use them to verify any flag-guarded changed; >>> b) they can be easily excluded from runs w/ flags. >>> >>> @requires and VMProps allows us to achieve both, so it's been decided to add a new property `vm.flagless`. `vm.flagless` is set to false if there are any XX flags other than `-XX:MaxRAMPercentage` and `-XX:CreateCoredumpOnCrash` (which are known to be set almost always) or any X flags other `-Xmixed`; in other words any tests w/ `@requires vm.flagless` will be excluded from runs w/ any other X / XX flags passed via `-vmoption` / `-javaoption`. 
in rare cases, when one still wants to run the tests marked by `vm.flagless` w/ external flags, `vm.flagless` can be forcefully set to true by setting any value to `TEST_VM_FLAGLESS` env. variable. >>> >>> this patch adds necessary common changes and marks common tests, namely Scimark, GTestWrapper and TestNativeProcessBuilder. Component-specific tests will be marked separately by the corresponding subtasks of 8151707[1]. >>> >>> please note, the patch depends on CODETOOLS-7902336[2], which will be included in the next jtreg version, so this patch is to be integrated only after jtreg5.1 is promoted and we switch to use it by 8246387[3]. >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8246494 >>> webrev: http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00 >>> testing: marked tests w/ different XX and X flags w/ and w/o TEST_VM_FLAGLESS env. var, and w/o any flags >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8151707 >>> [2] https://bugs.openjdk.java.net/browse/CODETOOLS-7902336 >>> [3] https://bugs.openjdk.java.net/browse/JDK-8246387 >>> >> >> after offline discussion with @pliden, it has been decided to reduce the scope of [8246499](https://bugs.openjdk.java.net/browse/JDK-8246499) and not mark the tests that use `UseXGC` flags for selection, e.g. `test/hotspot/jtreg/gc/z/TestSmallHeap.java`. >> >> Thanks, >> -- Igor > > These changes look good to me. Thanks, Misha! 
-- Igor ------------- PR: https://git.openjdk.java.net/jdk/pull/2800 From sspitsyn at openjdk.java.net Wed Mar 17 01:44:07 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 17 Mar 2021 01:44:07 GMT Subject: RFR: 8263434: Dangling references after MethodComparator::methods_EMCP [v2] In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 13:50:24 GMT, Aleksey Shipilev wrote: >> SonarCloud reports the following problem in MethodComparator::methods_EMCP: >> "Address of stack memory associated with local variable 's_new' is still referred to by the global variable '_s_new' upon returning to the caller. This will be a dangling reference" >> >> Code inspection reveals the assignment to static variables is only needed to pass them to helper methods. So, while this is not a detectable bug (yet), it is still cleaner not to expose stack variables in globals. >> >> Additional testing: >> - [x] Linux x86_64 fastdebug `tier1` >> - [x] Linux x86_64 fastdebug, `vmTestbase_nsk_jvmti` > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Sprinkling consts Aleksey, Sorry for being late to the party. This looks good to me. One nit: unneeded extra '()' what came from the original code: ` if ((old_cp->klass_at_noresolve(cpi_old) != new_cp->klass_at_noresolve(cpi_new)))` Thanks, Serguei ------------- Changes requested by sspitsyn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2937 From shade at openjdk.java.net Wed Mar 17 06:35:34 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 17 Mar 2021 06:35:34 GMT Subject: RFR: 8263434: Dangling references after MethodComparator::methods_EMCP [v3] In-Reply-To: References: Message-ID: > SonarCloud reports the following problem in MethodComparator::methods_EMCP: > "Address of stack memory associated with local variable 's_new' is still referred to by the global variable '_s_new' upon returning to the caller. 
This will be a dangling reference" > > Code inspection reveals the assignment to static variables is only needed to pass them to helper methods. So, while this is not a detectable bug (yet), it is still cleaner not to expose stack variables in globals. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` > - [x] Linux x86_64 fastdebug, `vmTestbase_nsk_jvmti` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Update copyrights - Drop excess parentheses - Merge branch 'master' into JDK-8263434-dangling-methodcomparator-ecmp - Sprinkling consts - 8263434: Dangling references after MethodComparator::methods_EMCP ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2937/files - new: https://git.openjdk.java.net/jdk/pull/2937/files/5acc4807..82bf43e0 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2937&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2937&range=01-02 Stats: 33632 lines in 1544 files changed: 24507 ins; 4957 del; 4168 mod Patch: https://git.openjdk.java.net/jdk/pull/2937.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2937/head:pull/2937 PR: https://git.openjdk.java.net/jdk/pull/2937 From shade at openjdk.java.net Wed Mar 17 06:35:35 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 17 Mar 2021 06:35:35 GMT Subject: RFR: 8263434: Dangling references after MethodComparator::methods_EMCP [v2] In-Reply-To: References: Message-ID: <_1ZolMabvo2HNY7oQvMH9I_Ncg8Vjwm343YPlyyux8A=.051230b8-206e-42c2-9be4-c495111bf7d8@github.com> On Wed, 17 Mar 2021 01:40:56 GMT, Serguei Spitsyn wrote: > One nit: unneeded extra '()' what came from the original code: > ` if ((old_cp->klass_at_noresolve(cpi_old) != new_cp->klass_at_noresolve(cpi_new)))` Sure, updated. Also updated copyrights. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2937 From sspitsyn at openjdk.java.net Wed Mar 17 07:06:08 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 17 Mar 2021 07:06:08 GMT Subject: RFR: 8246494: introduce vm.flagless at-requires property In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 23:27:21 GMT, Igor Ignatyev wrote: > resurrecting old [RFR](https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-June/041981.html): > >> Hi all, >> >> could you please review the patch which introduces a new @requires property to filter out the tests which ignore externally provided JVM flags? >> >> the idea behind this patch is to have a way to clearly mark tests which ignore flags, so >> a) it's obvious that they don't execute a flag-guarded code/feature, and extra care should be taken to use them to verify any flag-guarded changed; >> b) they can be easily excluded from runs w/ flags. >> >> @requires and VMProps allows us to achieve both, so it's been decided to add a new property `vm.flagless`. `vm.flagless` is set to false if there are any XX flags other than `-XX:MaxRAMPercentage` and `-XX:CreateCoredumpOnCrash` (which are known to be set almost always) or any X flags other `-Xmixed`; in other words any tests w/ `@requires vm.flagless` will be excluded from runs w/ any other X / XX flags passed via `-vmoption` / `-javaoption`. in rare cases, when one still wants to run the tests marked by `vm.flagless` w/ external flags, `vm.flagless` can be forcefully set to true by setting any value to `TEST_VM_FLAGLESS` env. variable. >> >> this patch adds necessary common changes and marks common tests, namely Scimark, GTestWrapper and TestNativeProcessBuilder. Component-specific tests will be marked separately by the corresponding subtasks of 8151707[1]. 
>> >> please note, the patch depends on CODETOOLS-7902336[2], which will be included in the next jtreg version, so this patch is to be integrated only after jtreg5.1 is promoted and we switch to use it by 8246387[3]. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8246494 >> webrev: http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00 >> testing: marked tests w/ different XX and X flags w/ and w/o TEST_VM_FLAGLESS env. var, and w/o any flags >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8151707 >> [2] https://bugs.openjdk.java.net/browse/CODETOOLS-7902336 >> [3] https://bugs.openjdk.java.net/browse/JDK-8246387 >> > > after offline discussion with @pliden, it has been decided to reduce the scope of [8246499](https://bugs.openjdk.java.net/browse/JDK-8246499) and not mark the tests that use `UseXGC` flags for selection, e.g. `test/hotspot/jtreg/gc/z/TestSmallHeap.java`. > > Thanks, > -- Igor Igor, The fix looks good to me. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2800 From sspitsyn at openjdk.java.net Wed Mar 17 07:24:13 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 17 Mar 2021 07:24:13 GMT Subject: RFR: 8263434: Dangling references after MethodComparator::methods_EMCP [v3] In-Reply-To: References: Message-ID: On Wed, 17 Mar 2021 06:35:34 GMT, Aleksey Shipilev wrote: >> SonarCloud reports the following problem in MethodComparator::methods_EMCP: >> "Address of stack memory associated with local variable 's_new' is still referred to by the global variable '_s_new' upon returning to the caller. This will be a dangling reference" >> >> Code inspection reveals the assignment to static variables is only needed to pass them to helper methods. So, while this is not a detectable bug (yet), it is still cleaner not to expose stack variables in globals. 
>> >> Additional testing: >> - [x] Linux x86_64 fastdebug `tier1` >> - [x] Linux x86_64 fastdebug, `vmTestbase_nsk_jvmti` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Update copyrights > - Drop excess parentheses > - Merge branch 'master' into JDK-8263434-dangling-methodcomparator-ecmp > - Sprinkling consts > - 8263434: Dangling references after MethodComparator::methods_EMCP Marked as reviewed by sspitsyn (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2937 From sspitsyn at openjdk.java.net Wed Mar 17 07:30:11 2021 From: sspitsyn at openjdk.java.net (Serguei Spitsyn) Date: Wed, 17 Mar 2021 07:30:11 GMT Subject: RFR: 8261262: Kitchensink24HStress.java crashed with EXCEPTION_ACCESS_VIOLATION [v2] In-Reply-To: References: Message-ID: On Tue, 16 Mar 2021 08:07:39 GMT, Robbin Ehn wrote: >> When returning from the last Java frame back to vm and hitting a safepoint poll on that last return we sometimes have a last java frame but no vframe. >> This seem to be a bug in itself, handled in: 8263576 >> >> Other places which uses vframe NULL checks it before, so let's do that in GetCurrentLocationClosure also. >> >> Testing: nsk jdi/jvmti, jdk jdi, jck vm and t1-3. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Typo > - Merge branch 'master' into 8261262 > - Check vframe non-null Marked as reviewed by sspitsyn (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/3010 From rehn at openjdk.java.net Wed Mar 17 07:30:12 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Wed, 17 Mar 2021 07:30:12 GMT Subject: Integrated: 8261262: Kitchensink24HStress.java crashed with EXCEPTION_ACCESS_VIOLATION In-Reply-To: References: Message-ID: On Mon, 15 Mar 2021 11:48:38 GMT, Robbin Ehn wrote: > When returning from the last Java frame back to vm and hitting a safepoint poll on that last return we sometimes have a last java frame but no vframe. > This seem to be a bug in itself, handled in: 8263576 > > Other places which uses vframe NULL checks it before, so let's do that in GetCurrentLocationClosure also. > > Testing: nsk jdi/jvmti, jdk jdi, jck vm and t1-3. This pull request has now been integrated. Changeset: 7b9d2562 Author: Robbin Ehn URL: https://git.openjdk.java.net/jdk/commit/7b9d2562 Stats: 10 lines in 1 file changed: 0 ins; 3 del; 7 mod 8261262: Kitchensink24HStress.java crashed with EXCEPTION_ACCESS_VIOLATION Reviewed-by: dcubed, sspitsyn ------------- PR: https://git.openjdk.java.net/jdk/pull/3010 From ysuenaga at openjdk.java.net Wed Mar 17 08:37:31 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 17 Mar 2021 08:37:31 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp Message-ID: I tried to build OpenJDK with g++-10.2.1_pre1-r3 on Alpine Linux 3.13.2, but I saw following warning: 668 | alloca(((pid ^ counter++) & 7) * 128); | ^ cc1plus: all warnings being treated as errors ------------- Commit messages: - 8263718: unused-result warning happens at os_linux.cpp Changes: https://git.openjdk.java.net/jdk/pull/3042/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3042&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263718 Stats: 11 lines in 3 files changed: 9 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3042.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3042/head:pull/3042 PR: 
https://git.openjdk.java.net/jdk/pull/3042 From david.holmes at oracle.com Wed Mar 17 09:02:05 2021 From: david.holmes at oracle.com (David Holmes) Date: Wed, 17 Mar 2021 19:02:05 +1000 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: Message-ID: <0547b635-9fe2-5c74-16a6-e30c3edb607e@oracle.com> Hi Yasumasa, On 17/03/2021 6:37 pm, Yasumasa Suenaga wrote: > I tried to build OpenJDK with g++-10.2.1_pre1-r3 on Alpine Linux 3.13.2, but I saw following warning: > > > 668 | alloca(((pid ^ counter++) & 7) * 128); > | ^ > cc1plus: all warnings being treated as errors First I have to wonder whether that alloca actually serves any useful purpose in this day and age? I wonder what was used to measure the performance difference... I'll see what I can find out. But second, will doing: void* padding = alloca(...); not suffice to suppress the warning, or will the compiler then complain about "padding" being unused? The pragma no doubt works, but is a bit ugly. 
:) Thanks, David > ------------- > > Commit messages: > - 8263718: unused-result warning happens at os_linux.cpp > > Changes: https://git.openjdk.java.net/jdk/pull/3042/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3042&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8263718 > Stats: 11 lines in 3 files changed: 9 ins; 0 del; 2 mod > Patch: https://git.openjdk.java.net/jdk/pull/3042.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/3042/head:pull/3042 > > PR: https://git.openjdk.java.net/jdk/pull/3042 > From ysuenaga at openjdk.java.net Wed Mar 17 09:29:08 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 17 Mar 2021 09:29:08 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: Message-ID: On Wed, 17 Mar 2021 06:31:28 GMT, Yasumasa Suenaga wrote: > I tried to build OpenJDK with g++-10.2.1_pre1-r3 on Alpine Linux 3.13.2, but I saw following warning: > > > 668 | alloca(((pid ^ counter++) & 7) * 128); > | ^ > cc1plus: all warnings being treated as errors > First I have to wonder whether that alloca actually serves any useful > purpose in this day and age? I wonder what was used to measure the > performance difference... I'll see what I can find out. I also thought we can remove it, but I could not find any reason to do so. > But second, will doing: > > void* padding = alloca(...); > > not suffice to suppress the warning, or will the compiler then complain > about "padding" being unused? > > The pragma no doubt works, but is a bit ugly. :) Yes, the warning was gone with `void* padding =`, but I don't want to do so because GCC has [-Wunused-variable](https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html). GCC might report the warning in future. 
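For illustration, the two ways of silencing the warning discussed above could look roughly like the sketch below. It is standalone and not the actual os_linux.cpp change: `stack_padding_bytes`, `pad_stack_with_pragma` and `pad_stack_with_variable` are hypothetical names, and whether a given form stays warning-free depends on the compiler and libc headers in use. In the real code the `alloca` call has to sit in the thread's start function, since the padding disappears as soon as the calling function returns.

```cpp
#include <alloca.h>
#include <cstddef>

// Hypothetical helper: the same arithmetic as the alloca argument quoted
// above. It yields one of eight offsets (0, 128, ..., 896 bytes).
size_t stack_padding_bytes(int pid, int counter) {
  return static_cast<size_t>((pid ^ counter) & 7) * 128;
}

// Option A: disable -Wunused-result only around the call, using the
// GCC diagnostic pragmas (also understood by Clang).
void pad_stack_with_pragma(int pid, int counter) {
#if defined(__GNUC__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-result"
#endif
  alloca(stack_padding_bytes(pid, counter));
#if defined(__GNUC__)
#pragma GCC diagnostic pop
#endif
}

// Option B: keep the result in a local and mark the local as used, so
// that neither the unused-result nor the unused-variable warning applies.
void pad_stack_with_variable(int pid, int counter) {
  void* padding = alloca(stack_padding_bytes(pid, counter));
  (void)padding;
}
```

Option A keeps the call site unchanged at the cost of the pragma noise; option B avoids pragmas but relies on the `(void)padding;` idiom to keep the local from being reported as unused.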
------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From stefank at openjdk.java.net Wed Mar 17 10:20:49 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 17 Mar 2021 10:20:49 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: Message-ID: On Wed, 17 Mar 2021 09:26:33 GMT, Yasumasa Suenaga wrote: >> I tried to build OpenJDK with g++-10.2.1_pre1-r3 on Alpine Linux 3.13.2, but I saw following warning: >> >> >> 668 | alloca(((pid ^ counter++) & 7) * 128); >> | ^ >> cc1plus: all warnings being treated as errors > >> First I have to wonder whether that alloca actually serves any useful >> purpose in this day and age? I wonder what was used to measure the >> performance difference... I'll see what I can find out. > > I also thought we can remove it, but I could not find any reason to do so. > >> But second, will doing: >> >> void* padding = alloca(...); >> >> not suffice to suppress the warning, or will the compiler then complain >> about "padding" being unused? >> >> The pragma no doubt works, but is a bit ugly. :) > > Yes, the warning was gone with `void* padding =`, but I don't want to do so because GCC has [-Wunused-variable](https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html). GCC might report the warning in future. 
Isn't the standard way to silence these kind of problem to cast the result to (void): (void)alloca(((pid ^ counter++) & 7) * 128); ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From ysuenaga at openjdk.java.net Wed Mar 17 10:31:47 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 17 Mar 2021 10:31:47 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: Message-ID: On Wed, 17 Mar 2021 10:18:25 GMT, Stefan Karlsson wrote: > Isn't the standard way to silence these kind of problem to cast the result to (void): > > ``` > (void)alloca(((pid ^ counter++) & 7) * 128); > ``` I tried it at first, but I still saw the warning. ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From shade at openjdk.java.net Wed Mar 17 11:01:49 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 17 Mar 2021 11:01:49 GMT Subject: Integrated: 8263434: Dangling references after MethodComparator::methods_EMCP In-Reply-To: References: Message-ID: On Thu, 11 Mar 2021 11:02:43 GMT, Aleksey Shipilev wrote: > SonarCloud reports the following problem in MethodComparator::methods_EMCP: > "Address of stack memory associated with local variable 's_new' is still referred to by the global variable '_s_new' upon returning to the caller. This will be a dangling reference" > > Code inspection reveals the assignment to static variables is only needed to pass them to helper methods. So, while this is not a detectable bug (yet), it is still cleaner not to expose stack variables in globals. > > Additional testing: > - [x] Linux x86_64 fastdebug `tier1` > - [x] Linux x86_64 fastdebug, `vmTestbase_nsk_jvmti` This pull request has now been integrated. 
Changeset: f9f2eef9 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/f9f2eef9 Stats: 93 lines in 2 files changed: 4 ins; 7 del; 82 mod 8263434: Dangling references after MethodComparator::methods_EMCP Reviewed-by: coleenp, sspitsyn ------------- PR: https://git.openjdk.java.net/jdk/pull/2937 From sjohanss at openjdk.java.net Wed Mar 17 11:39:51 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Wed, 17 Mar 2021 11:39:51 GMT Subject: RFR: JDK-8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v23] In-Reply-To: References: <_uIvdHdm5ptjZJ8gEw8AzBJ8PC-16GoEWUuKW5zAXrg=.36c2da1c-685b-4d4d-a4e5-73b3fc48b812@github.com> Message-ID: On Tue, 16 Mar 2021 18:15:24 GMT, Marcus G K Williams wrote: > > Hi Stefan ( @kstefanj ). Thanks for your review. I've addressed the specific suggestions. > > > When deciding on a new page size we should honor the passed in alignment if it is larger than the allocation granularity. Because in such case the upper layer has made a decision that the lower layer should honor. > > Also, I'm looking at this today, though not sure we can do this effectively without your refactoring. > I agree, we need to have the alignment available when choosing page size for this to work. > Thanks for pointing to JDK-8261527. I knew you were looking at large pages code and refactoring but I didn't know about JDK-8261527 or the other issues and PRs. I did run into the TestTracePageSizes.java while testing and saw some of the breadcrumbs. > > I would be in favor of using [master...kstefanj:8262291-one-special-alloc](https://github.com/openjdk/jdk/compare/master...kstefanj:8262291-one-special-alloc) as this looks like a good way forward. However, can we use my current PR as enabling of multiple page sizes and then later incorporate your refactoring? > Depends on what you mean by enabling.
I don't really see a way we can get to a point where we use multiple large page sizes without doing some refactoring. But if by enabling you mean that we just populate the page_size bitmap with all available large page sizes, I think this would be ok. We should rename the issue to reflect this in that case. > > One other way forward (not waiting for refactoring) would be to see this change as just enabling use of multiple page sizes and then actually using them will be added later when needed refactoring has been done. > > How would we go about doing your suggestion? There are two reasons I would like to get this PR in sooner rather than waiting, there is some pressure for me to put a bow on this change and I would really appreciate getting some changes in so that I can get my authorship in OpenJDK. I am happy to continue working on incremental changes to make page_size part of the code better and if I had a username I would suggest assigning some of the JDK-Issues to me. As an example I started looking at solving JDK-8263236, but couldn't comment or assign to myself. > Totally understand that you want to get this in, I just want to make sure we make things fit together in a good way. > Anyways, let me know how we can move forward and what the best path to get this change in for JDK-17. > I would prefer to split out the scanning part that adds the large page sizes and then later on, when everything else is in place, make use of this. This way we should be able to review this change pretty quickly and I can rebase my changes on yours. This would also give us time to think about a question I haven't made up my mind about. What will `LargePageSizeInBytes` mean after this change? Should the JVM use 1g pages (on a system where 2m is the default) even if `LargePageSizeInBytes` is not set? The current description for this flag is: `"Large page size (0 to let VM choose the page size)"` So we might need a CSR, depending on whether the meaning of the flag changes.
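To make the page-size question concrete, here is a minimal sketch of one possible reading of the policy discussed in this thread: prefer the caller's alignment when it is larger than the allocation granularity and is itself an available page size, otherwise pick the largest available page size that does not exceed the reservation. `select_page_size` and its parameters are hypothetical and are not HotSpot APIs; the `<=` comparison and the fallback to the base page size are assumptions made for the example.

```cpp
#include <cstddef>
#include <vector>

constexpr size_t K = 1024;
constexpr size_t M = K * K;
constexpr size_t G = M * K;

// Hypothetical helper, not part of os_linux.cpp.
// sizes:       large page sizes found on the system (for example 2M and 1G)
// bytes:       size of the reservation
// alignment:   alignment requested by the upper layer
// granularity: allocation granularity of the platform
// base_page:   normal page size used when no large page fits
size_t select_page_size(const std::vector<size_t>& sizes, size_t bytes,
                        size_t alignment, size_t granularity,
                        size_t base_page) {
  // Assumption: an alignment above the granularity is an explicit request
  // from the upper layer, so honor it if such a page size exists.
  if (alignment > granularity) {
    for (size_t s : sizes) {
      if (s == alignment) {
        return s;
      }
    }
  }
  // Otherwise take the largest page size that does not exceed the request.
  size_t best = base_page;
  for (size_t s : sizes) {
    if (s <= bytes && s > best) {
      best = s;
    }
  }
  return best;
}
```

With 2M and 1G pages available, a 240M reservation with default alignment would get 2M pages under this sketch, while a 2G reservation would get 1G pages unless the caller passed a 2M alignment.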
> Thanks, > Marcus ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From github.com+76791+alblue at openjdk.java.net Wed Mar 17 12:10:59 2021 From: github.com+76791+alblue at openjdk.java.net (Alex Blewitt) Date: Wed, 17 Mar 2021 12:10:59 GMT Subject: RFR: 8263659: Reflow GTestResultParser for better readability Message-ID: As noted by https://sonarcloud.io/code?id=shipilev_jdk&selected=shipilev_jdk%3Atest%2Fhotspot%2Fjtreg%2Fgtest%2FGTestResultParser.java there are a few fixes that can be applied for the GTestResultParser: * Avoiding nested 'try' statements * Avoiding nested 'switch' statements * Adding a break for each switch case to prevent accidental/unwanted fall-through * Disabling ability to load from remote files when parsing XML files ------------- Commit messages: - 8263659: Reflow GTestResultParser for better readability Changes: https://git.openjdk.java.net/jdk/pull/2991/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2991&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263659 Stats: 25 lines in 1 file changed: 3 ins; 4 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/2991.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2991/head:pull/2991 PR: https://git.openjdk.java.net/jdk/pull/2991 From shade at openjdk.java.net Wed Mar 17 12:11:00 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 17 Mar 2021 12:11:00 GMT Subject: RFR: 8263659: Reflow GTestResultParser for better readability In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 22:13:39 GMT, Alex Blewitt wrote: > As noted by https://sonarcloud.io/code?id=shipilev_jdk&selected=shipilev_jdk%3Atest%2Fhotspot%2Fjtreg%2Fgtest%2FGTestResultParser.java there are a few fixes that can be applied for the GTestResultParser: > > * Avoiding nested 'try' statements > * Avoiding nested 'switch' statements > * Adding a break for each switch case to prevent accidental/unwanted fall-through > * Disabling ability to load from remote files when 
parsing XML files Please change this PR synopsis to "8263659: Reflow GTestResultParser for better readability" to get this hooked properly. Also, enable testing, see "Pre-submit test status" in "Checks". test/hotspot/jtreg/gtest/GTestResultParser.java line 52: > 50: if (XMLStreamConstants.START_ELEMENT == xmlReader.next()) { > 51: switch (xmlReader.getLocalName()) { > 52: case "testsuite": Let's write this as: while (xmlReader.hasNext()) { int code = xmlReader.next(); if (code == XMLStreamConstants.START_ELEMENT) { switch (xmlReader.getLocalName()) { ------------- Changes requested by shade (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2991 From stefank at openjdk.java.net Wed Mar 17 13:09:48 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 17 Mar 2021 13:09:48 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: Message-ID: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> On Wed, 17 Mar 2021 10:28:56 GMT, Yasumasa Suenaga wrote: >> Isn't the standard way to silence these kind of problem to cast the result to (void): >> (void)alloca(((pid ^ counter++) & 7) * 128); > >> Isn't the standard way to silence these kind of problem to cast the result to (void): >> >> ``` >> (void)alloca(((pid ^ counter++) & 7) * 128); >> ``` > > I tried it at first, but I still saw the warning. I see. 
I found this page: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From ysuenaga at openjdk.java.net Wed Mar 17 13:38:49 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 17 Mar 2021 13:38:49 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> Message-ID: On Wed, 17 Mar 2021 13:06:46 GMT, Stefan Karlsson wrote: >>> Isn't the standard way to silence these kind of problem to cast the result to (void): >>> >>> ``` >>> (void)alloca(((pid ^ counter++) & 7) * 128); >>> ``` >> >> I tried it at first, but I still saw the warning. > > I see. I found this page: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 Thanks @stefank ! Status of the Bugzilla ticket which you told is unconfirmed (and it does not seem to be updated since last year!). I want to suppress the warning if we cannot remove `alloca()` call. ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From stuefe at openjdk.java.net Wed Mar 17 14:58:55 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 17 Mar 2021 14:58:55 GMT Subject: RFR: 8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v23] In-Reply-To: References: <_uIvdHdm5ptjZJ8gEw8AzBJ8PC-16GoEWUuKW5zAXrg=.36c2da1c-685b-4d4d-a4e5-73b3fc48b812@github.com> Message-ID: On Wed, 17 Mar 2021 11:37:02 GMT, Stefan Johansson wrote: > This would also give us to to think about a question I haven't made up my mind around. What will `LargePageSizeInBytes` mean after this change? Should the JVM use 1g pages (on a system where 2m i the default) even if `LargePageSizeInBytes` is not set? 
I see two valid scenarios: a) either use huge pages as best as possible; remove fine-grained control from user hands. So, if we have 1G pages, prefer those over 2M over 4K. Then, we could completely remove LargePageSizeInBytes. There is no need for this switch anymore. b) or keep the current behavior. In that case, UseLargePages means "use at most default huge page size" even if larger pages are available; and LargePageSizeInBytes can be used to override the large page size to use. (a): It's elegant, and efficiently uses system resources when available. But it's an incompatible change, and the VM being grabby means we could end up using pages meant for a different process. (b): downward compatible. In a sense. With Marcus's change, we break downward compatibility already: where "LargePageSizeInBytes" before meant "use that or nothing", it now means "use that or whatever smaller page sizes you find". I prefer (a), honestly. ..Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From ihse at openjdk.java.net Wed Mar 17 16:48:48 2021 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Wed, 17 Mar 2021 16:48:48 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> Message-ID: On Wed, 17 Mar 2021 13:35:36 GMT, Yasumasa Suenaga wrote: >> I see. I found this page: >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66425 > > Thanks @stefank ! > Status of the Bugzilla ticket which you told is unconfirmed (and it does not seem to be updated since last year!). > > I want to suppress the warning if we cannot remove `alloca()` call. Did you notice this workaround in the gcc bug report?
"Note that a logical negation prior to cast to void is sufficient to suppress the warning: int void_cast_should_not_warn() { (void) !foo(); // ^-- here return 0; } " ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From tschatzl at openjdk.java.net Wed Mar 17 17:21:51 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Wed, 17 Mar 2021 17:21:51 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> <_XwpqPW-sKwuvZG26bSgdW6RKtQaIJz-0fpRR5wGa0c=.d6c5394b-7fea-4f0e-ab5f-9092286d3db1@github.com> Message-ID: <3FYRI-abe_Q_yAVKNJad9uB-TOTFD9p22wO7VW5ojH8=.6f73c0f0-aa0a-44fe-b38c-a8758c9cb46d@github.com> On Tue, 16 Mar 2021 12:23:52 GMT, Jaroslav Bachorik wrote: > > Also, for this purpose, why would used() not be a good substitute for liveness? If e.g. used() average grows over time you can deduce the same I would assume (particularly used() after mixed gc phase in g1). > > The major problem is that eg. for g1 given large enough heap the used value can keep on growing for quite long time, possibly generating wrong signal about potential memory leak. So can this liveness estimate - just filter used() a bit more. Actually I strongly believe that due to containing more details, used() could be more appropriate for this: first, it updates more often, second, information like distance between valleys and peaks of the sawtooth pattern are indicative of memory running away. If you really want to limit yourselves to something that is similar in update frequency to this liveness estimate, one could use just one value in a "tooth". However I think just strong enough low-pass filtering enough is as fine. > > If the live estimate is set to `used()` after mixed gc phase in g1 I think it still will be a good estimate. 
> The only thing I am opposing is having `live()` call return the current `used()` value which, IMO, might become rather confusing. > > > Do you have any numbers on what the impact of using used() vs. this live() would be in such a use case? > > Nope. Do you mean perf impact? Impact on false positives/negatives. > > > What I'm afraid of is that mixing values taken at different times - used and capacity are taken at the time of the event, and the liveness estimated updated at other, irregular intervals may cause significiant amount of confusion in interpreting this value. It might be obvious to you, but there will be other users. > > IDK. If the event field would explicitly mention that this is the **last known live size estimate** it should set the expectations right. > > > One option could be detaching the liveness estimate from used()/capacity() (I see a value in having some heap usage summary at regular intervals) and send the liveness estimate event just when they are generated? Then the various collectors could send this liveness value at times when they think they are fairly accurate, not when the collectors must and particularly not in conjunction with samples taken at completely different times. > > The problem is the irregularity - when the live size is reported only when it is calculated there might be long periods in the recording missing the live size data at all. In order for this information to be useful it should be reported at least at the beginning and end of a JFR chunk. > > > Independent of whether used/capacity and liveness are sent, the receiver needs to do statistics (trend lines) on those anyway. > > Yes. It's just that with the live size estimate one wouldn't be getting the false positives one would get with used heap trend. We've now discussed this issue within the gc team a bit and came to the following conclusions. 
Before going into that, our guidelines for adding new tracing code: generally we avoid providing functionality in the VM/GC that can be procured easily otherwise or there is a good substitute, particularly if their actual content is unclear. The VM is also generally not the place to store or accumulate data that is only used in external applications for their convenience, particularly where the general public usefulness is questionable or do not drive forward GC algorithms in some way. In the past in the cases we have done this a few times, and this has caused lots of maintenance burden (adding it, keeping it up to date, and finally removing it because nobody used it after all). We are open to providing raw data for events that fits this in the most painless way for the VM if they are well specified. The whole periodic HeapSummary event and its contents are questionable in this light: - used() and capacity() are provided "regularly", and it could either be retrieved at any time by other means (e.g. MXBeans), or even forced to be provided (cause a gc if you are really desperate). Particularly in cases of continuous monitoring, there should be no problem getting them even with existing JFR events. - it can be argued that it is *very* painless for GC to provide the periodic used/capacity, their values are well defined for all collectors. Still I believe particularly if you do continuous monitoring, this is kind of unnecessary and should be polled instead of pushed if required at higher frequency as provided now. - the suggested "liveness estimate" however goes against all of these guidelines: - the value is ill-defined if at all. 
The quality of the implementations is all over the place: - Epsilon returns the used() value - Serial returns the used() value and sometimes some attempt to actually return the live data using the dead-wood heuristic - Parallel returns used() always - G1 returns the amount of bytes marked plus the bytes allocated while marking, used() in other cases (although that may change) - Z returns the amount of bytes marked without the bytes allocated while marking (this is unclear to me actually) - Shenandoah seems to be fairly close to G1, I have no idea what the results are on the various additional modes it uses. - for those collectors that can not give you a good value, the application could as well easily generate it - just use used(). (That deadwood optimization does not change the situation a lot as the difference would be at most 5% or so, well within "estimate" range). - it forces gcs that do not use or need that value at all to first calculate it and then keep it around for just this event - this liveness estimate, which is outdated a few instructions after the application runs, is coupled with current used()/capacity() values - there is no indication if that estimate in that event can actually be used for the suggested purpose: It could have been calculated at any point in time, so its use for trend lines is limited (e.g. regression etc). For such a regression you typically need multiple values anyway, and even more for some output with a significant amount of confidence, so that single value without timestamp does not seem to help. If you track continuously, you would get all values anyway. So overall I believe the current suggestion to have the VM provide all these values and the event is just introducing complexity in the VM for convenience of the application. Still I think there might be need for the raw data if available, and if it's easily obtainable, then fine, do send some duplicate data.
So my suggestion and what we in the gc team can support is to a) provide that HeapSummary event with capacity() and used() (but as mentioned, on a change they are sent out already so I do not see the exact situation in particular with continuous tracking...). b) provide some Liveness (or "Marked bytes" or similar) event when the value is generated (if they are generated) as a one-off event in places it can be derived without significant costs. I.e. some "send_marked_bytes()/live_bytes()" method to be called in Serial, Parallel, G1, Z and Shenandoah when generated and available. No additional storage of that value in the and repeated emission of that event by the VM. I'm intentionally using "marked bytes" here because this is a value that can actually be defined and verified by reviewers that it's actually returned. Some best effort estimate is just misleading, and is a pain to maintain and argue whether the goal has been met (and it's even worse to (dis-)prove that a change introduced a bug). Maybe it could be extended to "marked bytes plus allocated during marking, sent when marking finished" for all collectors that do marking - I do not know (particularly for Z), maybe it can. This approach also allows anyone to easily incrementally add new occurrences of that event as more code to support it has been written (e.g. with PR #2966 for g1 full gc), and allows leaving out collectors that do not support it, or places where this has not been gathered. We think this may be a useful value, that can be explained to others, will cause minimal misunderstanding, can be verified at least in the reviews (and tests where sending of that event is verified for the various situations it should be sent), and maintained. Returning "liveness" in a good way (both in accuracy and overhead) is a completely different issue, and probably worth a few PhDs. Just dodging the issue with appending "estimate" to the name is not the fix given alternatives. 
I would further ask you to at least create two different CRs for adding the two events (more for later additions of the event) for easier and faster review. You can provide a link to a "all-in" diff to let the reviewers see where you want to go with this. However having non-trivial changes across (at this point) 34 files for different collectors for different reasons is nontrivial and very exhausting to review and re-review (10 times so far at least for me). Thanks, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From enikitin at openjdk.java.net Wed Mar 17 19:02:32 2021 From: enikitin at openjdk.java.net (Evgeny Nikitin) Date: Wed, 17 Mar 2021 19:02:32 GMT Subject: RFR: 8058176: [mlvm] tests should not allow code cache exhaustion [v7] In-Reply-To: References: Message-ID: > Another approach to the JDK-8058176 and #2440 - never allowing the tests hit CodeCache limits. The most significant consumer is the MH graph builder (the MHTransformationGen), whose consumption is now controlled. List of changes: > > * Code cache size getters are added to WhiteBox; > * MH sequences are now built with remaining Code cache size in mind (always let 2M clearance); > * Dependencies on WhiteBox added for all affected tests; > * The test cases in question un-problemlisted. > > Testing: the whole vmTestbase/vm/mlvm/ in win-lin-mac x86. Evgeny Nikitin has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 10 additional commits since the last revision: - Merge branch 'master' into JDK-8058176/public - Move a comment a bit higher - Extract allowances into constants - Add non-nmethods pool to the monitoring - Fix 'cycles to build' error output - Add support for segmented CodeCache - Switch to ManagementBeans approach instead of the WhiteBox one - Un-problemlist the OOME tests - Add CodeCache methods to the WhiteBox - 8058176: [mlvm] tests should not allow code cache exhaustion ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2523/files - new: https://git.openjdk.java.net/jdk/pull/2523/files/64ae20d5..e76dd098 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2523&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2523&range=05-06 Stats: 74918 lines in 2689 files changed: 50234 ins; 13865 del; 10819 mod Patch: https://git.openjdk.java.net/jdk/pull/2523.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2523/head:pull/2523 PR: https://git.openjdk.java.net/jdk/pull/2523 From ysuenaga at openjdk.java.net Wed Mar 17 23:11:49 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Wed, 17 Mar 2021 23:11:49 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> Message-ID: <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> On Wed, 17 Mar 2021 16:46:13 GMT, Magnus Ihse Bursie wrote: > "Note that a logical negation prior to cast to void is sufficient to suppress the warning: > > ``` > int void_cast_should_not_warn() { > (void) !foo(); > // ^-- here > return 0; > } > ``` > > " The warning is gone with `(void)!`, but isn't it a bit strange? IMHO a code change or a pragma (or compiler option) would be more suitable.
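For comparison, the options mentioned in this thread can be sketched side by side. This is only an illustration and assumes GCC or Clang; `must_check` is a hypothetical stand-in for a function declared with `warn_unused_result` (it is not code from os_linux.cpp), and the three wrapper names are made up for the example.

```c
/* Hypothetical function, marked so that GCC and Clang warn when a caller
 * discards its result. It increments the counter and returns the new value. */
__attribute__((warn_unused_result))
static int must_check(int *counter) {
    return ++*counter;
}

/* Option 1: the "(void) !" idiom from GCC bug 66425. The negation consumes
 * the result, so there is no discarded call result left to warn about. */
static void discard_with_negation(int *counter) {
    (void) !must_check(counter);
}

/* Option 2: suppress the diagnostic locally with a pragma. */
static void discard_with_pragma(int *counter) {
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-result"
    must_check(counter);
#pragma GCC diagnostic pop
}

/* Option 3: a code change that stores the result and then explicitly
 * marks the variable as intentionally unused. */
static void discard_with_variable(int *counter) {
    int ignored = must_check(counter);
    (void) ignored;
}
```

All three variants still perform the call; they differ only in how the intent to ignore the result is expressed to the compiler.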
------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From minqi at openjdk.java.net Thu Mar 18 03:34:53 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Thu, 18 Mar 2021 03:34:53 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v4] In-Reply-To: References: Message-ID: On Fri, 26 Feb 2021 21:39:48 GMT, Ioi Lam wrote: >> Yumin Qi has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Fix filter more flags to exclude in static dump, add more test cases >> - Merge branch 'master' into jdk-8259070 >> - Fix white space in CDS.java >> - Add function CDS.dumpSharedArchive in java to dump shared archive >> - 8259070: Add jcmd option to dump CDS > > src/hotspot/share/memory/dynamicArchive.cpp line 347: > >> 345: if (Arguments::GetSharedDynamicArchivePath() == NULL) { >> 346: if (!RecordDynamicDumpInfo) { >> 347: // If run with -XX:+RecordDynamicDumpInfo, DynamicDumpSharedSpaces will be turned on, > > Is this check needed? It looks like `MetaspaceShared::cmd_dump_dynamic` will not call `DynamicArchive::dump()` unless the path was set up correctly. Fixed. The warning is harmless so I just reverted it. > src/hotspot/share/memory/metaspaceShared.cpp line 789: > >> 787: char filename[JVM_MAXPATHLEN]; >> 788: const char* file = file_name; >> 789: assert(strcmp(cmd, "static_dump") == 0 || strcmp(cmd, "dynamic_dump") == 0, "Sanity check"); > > Since the caller of this function already performed the string validity check, I think it's better to pass `bool is_static` as a parameter and not pass `cmd`. Moved to CDS.java, where the code is simpler than this.
> src/hotspot/share/memory/metaspaceShared.cpp line 863: > >> 861: MutexLocker lock(ClassLoaderDataGraph_lock); >> 862: DumpClassListCLDClosure collect_classes(stream); >> 863: ClassLoaderDataGraph::loaded_cld_do(&collect_classes); > > Need to close the stream. Changed to use a stack object so it will close the file in its destructor. > src/hotspot/share/runtime/globals.hpp line 1896: > >> 1894: \ >> 1895: product(bool, RecordDynamicDumpInfo, false, \ >> 1896: "Record class info for jcmd Dynamic dump") \ > > "Record class info for jcmd VM.cds dynamic_dump"? Fixed. ------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From minqi at openjdk.java.net Thu Mar 18 03:34:53 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Thu, 18 Mar 2021 03:34:53 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v4] In-Reply-To: <8BIT_U1LoH-XHUajzOHKBe7xSETc4go9iVbaAlPcTlg=.6d34cc08-ed03-400e-9176-2d809bf1e6a5@github.com> References: <8BIT_U1LoH-XHUajzOHKBe7xSETc4go9iVbaAlPcTlg=.6d34cc08-ed03-400e-9176-2d809bf1e6a5@github.com> Message-ID: On Sat, 27 Feb 2021 05:12:25 GMT, Thomas Stuefe wrote: >> src/hotspot/share/memory/metaspaceShared.cpp line 783: >> >>> 781: char* start = buffer + strlen(buffer); >>> 782: snprintf(start, buff_len, "%s ", arg); >>> 783: } >> >> Maybe move the above function to the StringUtils class under share/utilities? >> Use `os::snprintf()` instead of `snprintf()`? > > The calculation is also wrong, this would overflow. You need: > char* start = buffer + strlen(buffer); > snprintf(start, buff_len - (start - buffer), "%s ", arg); > - and maybe add an assert that strlen(buf) < bufflen. > - and as Ioi wrote, I'd use either one of os::snprintf or jio_snprintf since both guarantee zero termination on truncation. > - or, just use strncat() The new solution uses CDS.java to do the work.
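The overflow Thomas points out comes from passing the full `buff_len` to `snprintf` even though the write starts partway into the buffer. A minimal sketch of the corrected calculation in plain C follows; the helper name `append_arg` and its return convention are assumptions made for illustration and do not appear in metaspaceShared.cpp. It relies on C99 `snprintf`, which NUL-terminates whenever the size argument is non-zero.

```c
#include <stdio.h>
#include <string.h>

/* Append "arg " to the NUL-terminated string already in buffer without
 * writing past buffer[buff_len - 1]. Returns 1 if the text fit completely,
 * 0 if it was truncated or the buffer had no room left.
 * Assumes buffer holds a NUL-terminated string on entry. */
static int append_arg(char *buffer, size_t buff_len, const char *arg) {
    size_t used = strlen(buffer);
    if (used >= buff_len) {
        return 0; /* caller passed a length smaller than the current content */
    }
    /* Space left after the existing text, including the terminating NUL. */
    size_t remaining = buff_len - used;
    int written = snprintf(buffer + used, remaining, "%s ", arg);
    /* snprintf returns the length it wanted to write; a value greater than
     * or equal to remaining means the output was truncated. */
    return written >= 0 && (size_t) written < remaining;
}
```

With an 8-byte buffer, appending "ab" succeeds and leaves "ab "; a following append of "cdefgh" is truncated to fit and reports failure instead of writing past the end.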
------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From minqi at openjdk.java.net Thu Mar 18 03:34:58 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Thu, 18 Mar 2021 03:34:58 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v4] In-Reply-To: References: <2ARkZBqUR_xMiSXQYcq-rmOVxB62xIiIjzsJo0ZB9Xo=.1e29924e-5aaf-4356-ac6a-5b4b46c177ee@github.com> Message-ID: On Sat, 27 Feb 2021 18:11:38 GMT, Ioi Lam wrote: >> How would it overflow? But I agree, I would not add jsa extension if the user did not specify one. I dislike when programs do that. > > `file_name` is user input that comes from the jcmd, so it can be arbitrarily long and exceed JVM_MAXPATHLEN characters. This is now in Java code, and is simple to deal with. ------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From minqi at openjdk.java.net Thu Mar 18 03:34:57 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Thu, 18 Mar 2021 03:34:57 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v4] In-Reply-To: References: Message-ID: On Fri, 26 Feb 2021 22:01:50 GMT, Calvin Cheung wrote: >> Yumin Qi has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Fix filter more flags to exclude in static dump, add more test cases >> - Merge branch 'master' into jdk-8259070 >> - Fix white space in CDS.java >> - Add function CDS.dumpSharedArchive in java to dump shared archive >> - 8259070: Add jcmd option to dump CDS > > src/hotspot/share/memory/metaspaceShared.cpp line 788: > >> 786: // The existing file will be overwritten. >> 787: char filename[JVM_MAXPATHLEN]; >> 788: const char* file = file_name; > > Is the variable at line 788 necessary? Could you just pass filename to callees? The new solution uses CDS.java to do the dump.
> src/hotspot/share/memory/metaspaceShared.cpp line 801: > >> 799: file = filename; >> 800: } >> 801: } > > This block of code is very similar to lines 813 - 821 below. Maybe factor it into another function? Changed to use Java to dump. That makes the string handling much simpler. > src/hotspot/share/memory/metaspaceShared.cpp line 831: > >> 829: DumpClassListCLDClosure(fileStream* f) : CLDClosure() { _stream = f; } >> 830: ~DumpClassListCLDClosure() { >> 831: delete _stream; // The file need close since in child process it will be used. > > Can you clarify the above comment? Changed to use Java to do the dump. > test/hotspot/jtreg/runtime/cds/appcds/jcmd/JCmdTest.java line 213: > >> 211: if (!cdsEnabled) { >> 212: System.out.println("CDS is not available for this JDK, skip the test."); >> 213: return; > > Should throw SkippedException instead. Fixed ------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From minqi at openjdk.java.net Thu Mar 18 03:34:58 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Thu, 18 Mar 2021 03:34:58 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v3] In-Reply-To: References: Message-ID: <4juIfgbibTnFJn3B8je1VWCuXCnZVOkzdGLomQk0mrE=.62db438e-f873-49c0-82f5-399017dd5ede@github.com> On Wed, 10 Mar 2021 04:28:04 GMT, Ioi Lam wrote: >> Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix white space in CDS.java > > src/java.base/share/classes/jdk/internal/misc/CDS.java line 256: > >> 254: >> 255: // Do not take parent env which will cause dumping fail. >> 256: Process proc = Runtime.getRuntime().exec(cmds.toArray(new String[0]), > > Could you explain why the parent's env variables will cause dumping to fail? I found that the jtreg env is brought into the child's env, which is not needed in this case. Added a comment.
------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From minqi at openjdk.java.net Thu Mar 18 03:39:53 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Thu, 18 Mar 2021 03:39:53 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v4] In-Reply-To: References: Message-ID: On Fri, 26 Feb 2021 22:46:07 GMT, Ioi Lam wrote: >> Yumin Qi has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Fix filter more flags to exclude in static dump, add more test cases >> - Merge branch 'master' into jdk-8259070 >> - Fix white space in CDS.java >> - Add function CDS.dumpSharedArchive in java to dump shared archive >> - 8259070: Add jcmd option to dump CDS > > src/hotspot/share/runtime/arguments.cpp line 3525: > >> 3523: os::free(SharedDynamicArchivePath); >> 3524: SharedDynamicArchivePath = nullptr; >> 3525: } > > Is this necessary? When doing a dynamic dump, we set SharedDynamicArchivePath to the given file name; after that, we restore the original value, so we free the old one to prevent a memory leak.
------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From tschatzl at openjdk.java.net Thu Mar 18 08:36:50 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 18 Mar 2021 08:36:50 GMT Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: <3FYRI-abe_Q_yAVKNJad9uB-TOTFD9p22wO7VW5ojH8=.6f73c0f0-aa0a-44fe-b38c-a8758c9cb46d@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> <_XwpqPW-sKwuvZG26bSgdW6RKtQaIJz-0fpRR5wGa0c=.d6c5394b-7fea-4f0e-ab5f-9092286d3db1@github.com> <3FYRI-abe_Q_yAVKNJad9uB-TOTFD9p22wO7VW5ojH8=.6f73c0f0-aa0a-44fe-b38c-a8758c9cb46d@github.com> Message-ID: On Wed, 17 Mar 2021 17:19:06 GMT, Thomas Schatzl wrote: > Still I think there might be need for the raw data if available, and if it's easily obtainable, then fine, do send some duplicate data. So my suggestion and what we in the gc team can support is to > > a) provide that HeapSummary event with capacity() and used() (but as mentioned, on a change they are sent out already so I do not see the exact situation in particular with continuous tracking...). There has been some question in the latter part of this statement: the "but as mentioned...." part refers to the situation that if you are already continuously monitoring, you will get all of the `used()` events the VM currently sends anyway (if subscribed). So this periodic event does not give you more information. There may be need for sending `used()` in particular more often as it is done now (and I am open to somebody improving this), not for convenience but because something interesting happens in the Java heap. I am not sure that a "JFR chunk" is the right periodicity though, because it can potentially mean sending (assuming that a "chunk" is some fixed amount of events) every ms to every hour or day. This "random" sampling may send events both too often and too infrequent and not when it matters. 
Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From jaroslav.bachorik at datadoghq.com Thu Mar 18 09:26:53 2021 From: jaroslav.bachorik at datadoghq.com (=?UTF-8?Q?Jaroslav_Bachor=C3=ADk?=) Date: Thu, 18 Mar 2021 10:26:53 +0100 Subject: RFR: 8258431: Provide a JFR event with live set size estimate [v12] In-Reply-To: <3FYRI-abe_Q_yAVKNJad9uB-TOTFD9p22wO7VW5ojH8=.6f73c0f0-aa0a-44fe-b38c-a8758c9cb46d@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> <_XwpqPW-sKwuvZG26bSgdW6RKtQaIJz-0fpRR5wGa0c=.d6c5394b-7fea-4f0e-ab5f-9092286d3db1@github.com> <3FYRI-abe_Q_yAVKNJad9uB-TOTFD9p22wO7VW5ojH8=.6f73c0f0-aa0a-44fe-b38c-a8758c9cb46d@github.com> Message-ID: First of all, it is a pity my [attempt to open a discussion regarding this topic](https://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2021-February/033705.html) didn't get more attention. On Wed, Mar 17, 2021 at 6:22 PM Thomas Schatzl wrote: > > On Tue, 16 Mar 2021 12:23:52 GMT, Jaroslav Bachorik wrote: > > > > Also, for this purpose, why would used() not be a good substitute for liveness? If e.g. used() average grows over time you can deduce the same I would assume (particularly used() after mixed gc phase in g1). > > > > The major problem is that eg. for g1 given large enough heap the used value can keep on growing for quite long time, possibly generating wrong signal about potential memory leak. > > So can this liveness estimate - just filter used() a bit more. > > Actually I strongly believe that due to containing more details, used() could be more appropriate for this: first, it updates more often, second, information like distance between valleys and peaks of the sawtooth pattern are indicative of memory running away. > If you really want to limit yourselves to something that is similar in update frequency to this liveness estimate, one could use just one value in a "tooth". 
However, I think strong enough low-pass filtering is just as fine. > > > > If the live estimate is set to `used()` after mixed gc phase in g1 I think it still will be a good estimate. > > The only thing I am opposing is having `live()` call return the current `used()` value which, IMO, might become rather confusing. > > > > > Do you have any numbers on what the impact of using used() vs. this live() would be in such a use case? > > > > Nope. Do you mean perf impact? > > Impact on false positives/negatives. > > > > > > What I'm afraid of is that mixing values taken at different times - used and capacity are taken at the time of the event, and the liveness estimate updated at other, irregular intervals may cause a significant amount of confusion in interpreting this value. It might be obvious to you, but there will be other users. > > > > IDK. If the event field would explicitly mention that this is the **last known live size estimate** it should set the expectations right. > > > > > One option could be detaching the liveness estimate from used()/capacity() (I see a value in having some heap usage summary at regular intervals) and send the liveness estimate event just when they are generated? Then the various collectors could send this liveness value at times when they think they are fairly accurate, not when the collectors must and particularly not in conjunction with samples taken at completely different times. > > > > The problem is the irregularity - when the live size is reported only when it is calculated there might be long periods in the recording missing the live size data at all. In order for this information to be useful it should be reported at least at the beginning and end of a JFR chunk. > > > > > Independent of whether used/capacity and liveness are sent, the receiver needs to do statistics (trend lines) on those anyway. > > > > Yes.
It's just that with the live size estimate one wouldn't be getting the false positives one would get with used heap trend. > > We've now discussed this issue within the gc team a bit and came to the following conclusions. > > Before going into that, our guidelines for adding new tracing code: generally we avoid providing functionality in the VM/GC that can be procured easily otherwise or there is a good substitute, particularly if their actual content is unclear. The VM is also generally not the place to store or accumulate data that is only used in external applications for their convenience, particularly where the general public usefulness is questionable or do not drive forward GC algorithms in some way. > > In the past in the cases we have done this a few times, and this has caused lots of maintenance burden (adding it, keeping it up to date, and finally removing it because nobody used it after all). > > We are open to providing raw data for events that fits this in the most painless way for the VM if they are well specified. > > The whole periodic HeapSummary event and its contents are questionable in this light: > > - used() and capacity() are provided "regularly", and it could either be retrieved at any time by other means (e.g. MXBeans), or even forced to be provided (cause a gc if you are really desperate). Particularly in cases of continuous monitoring, there should be no problem getting them even with existing JFR events. These values might be quite imprecise - it is not uncommon for G1 to see heap usage increasing for long time which could falsely trigger a leak alarm. Also, triggering GC does not seem to be that great idea considering that it will cause a rather lengthy safepoint. > - it can be argued that it is *very* painless for GC to provide the periodic used/capacity, their values are well defined for all collectors. 
Still I believe particularly if you do continuous monitoring, this is kind of unnecessary and should be polled instead of pushed if required at higher frequency as provided now. Well, we are doing continuous monitoring and it is not unnecessary .. > > - the suggested "liveness estimate" however goes against all of these guidelines: > - the value is ill-defined if at all. The quality of the implementations is all over the place: > - Epsilon the used() value There is literally no other information for epsilon if I am not totally mistaken. > - Serial the used() value and sometimes some attempt to actually return the live data using the dead-wood heuristic > - Parallel returns used() always > - G1 returns the amount of bytes marked plus the bytes allocated while marking, used() in other cases (although that may change) > - Z returns the amount of bytes marked without the bytes allocated while marking (this is unclear to me actually) > - Shenandoah seems to be fairly close to G1, I have no idea what the results are on the various additional modes it uses. Yes. And this was my initial question in the email thread I started before even attempting this PR. Whatever limited response I got seemed to confirm this way of getting the liveness estimate. > - for those collectors that can not give you a good value, the application could as well easily generate it - just use used(). (That deadwood optimization does not change the situation a lot as the difference would be at most 5% or so difference, well within "estimate" range). 
> - it forces gcs that do not use or need that value at all to first calculate it and then keep it around for just this event > - this liveness estimate, which is outdated a few instructions after the application runs, is coupled with current used()/capacity() values > - there is no indication if that estimate in that event can actually be used for the suggested purpose: It could have been calculated at any point in time, so its use for trend lines is limited (e.g. regression etc.). For such a regression you typically need multiple values anyway, and even more for some output with a significant amount of confidence, so that single value without timestamp does not seem to help. If you track continuously, you would get all values anyway. > > So overall I believe the current suggestion to have the VM provide all these values and the event is just introducing complexity in the VM for convenience of the application. > > Still I think there might be need for the raw data if available, and if it's easily obtainable, then fine, do send some duplicate data. So my suggestion and what we in the gc team can support is to > > a) provide that HeapSummary event with capacity() and used() (but as mentioned, on a change they are sent out already so I do not see the exact situation in particular with continuous tracking...). > > b) provide some Liveness (or "Marked bytes" or similar) event when the value is generated (if they are generated) as a one-off event in places it can be derived without significant costs. I.e. some "send_marked_bytes()/live_bytes()" method to be called in Serial, Parallel, G1, Z and Shenandoah when generated and available. No additional storage of that value and no repeated emission of that event by the VM. > I'm intentionally using "marked bytes" here because this is a value that can actually be defined and verified by reviewers that it's actually returned.
Some best effort estimate is just misleading, and is a pain to maintain and argue whether the goal has been met (and it's even worse to (dis-)prove that a change introduced a bug). Maybe it could be extended to "marked bytes plus allocated during marking, sent when marking finished" for all collectors that do marking - I do not know (particularly for Z), maybe it can. > This approach also allows anyone to easily incrementally add new occurrences of that event as more code to support it has been written (e.g. with PR #2966 for g1 full gc), and allows leaving out collectors that do not support it, or places where this has not been gathered. > > We think this may be a useful value, that can be explained to others, will cause minimal misunderstanding, can be verified at least in the reviews (and tests where sending of that event is verified for the various situations it should be sent), and maintained. > > Returning "liveness" in a good way (both in accuracy and overhead) is a completely different issue, and probably worth a few PhDs. Just dodging the issue with appending "estimate" to the name is not the fix given alternatives. > > I would further ask you to at least create two different CRs for adding the two events (more for later additions of the event) for easier and faster review. You can provide a link to a "all-in" diff to let the reviewers see where you want to go with this. However having non-trivial changes across (at this point) 34 files for different collectors for different reasons is nontrivial and very exhausting to review and re-review (10 times so far at least for me). I am sorry for wasting yours and other reviewers time. I really tried getting some consensus before coming with a half-baked PR but as you can see the responses start flowing only when the PR is 'almost done'. I am sensing quite negative attitude toward this change and the whole idea of capturing the liveness estimate. 
At the same time the solutions you propose give me nothing over just doing custom JFR event generation using data obtained via JMX - so I really don't see a point in doing this change in hotspot. I am going to close this PR and I apologize again for wasting the time of all involved engineers. Regards, -JB- > > Thanks, > Thomas > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/2579 From jbachorik at openjdk.java.net Thu Mar 18 09:32:52 2021 From: jbachorik at openjdk.java.net (Jaroslav Bachorik) Date: Thu, 18 Mar 2021 09:32:52 GMT Subject: Withdrawn: 8258431: Provide a JFR event with live set size estimate In-Reply-To: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> References: <7HVs4jngEbNIQIPQByuE6IRYAxdijfa82uhEFWHld5U=.a7784482-d7e1-4d59-88ee-455d8691631e@github.com> Message-ID: On Mon, 15 Feb 2021 17:23:44 GMT, Jaroslav Bachorik wrote: > The purpose of this change is to expose a 'cheap' estimate of the current live set size (the meaning of 'current' is dependent on each particular GC implementation but in worst case 'at last full GC') in form of a periodically emitted JFR event. > > ## Introducing new JFR event > > While there is already 'GC Heap Summary' JFR event it does not fit the requirements as it is closely tied to GC cycle so e.g. for ZGC or Shenandoah it may not happen for quite a long time, increasing the risk of not having the heap summary events being present in the JFR recording at all. > Because of this I am proposing to add a new 'Heap Usage Summary' event which will be emitted periodically, by default on each JFR chunk, and will contain the information about the heap capacity, the used and live bytes. This information is available from all GC implementations and can be provided at literally any time. > > ## Implementation > > The implementation differs from GC to GC because each GC algorithm/implementation provides a slightly different way to track the liveness.
The common part is `size_t live() const` method added to `CollectedHeap` superclass and the use of a cached 'liveness' value computed after the last GC cycle. If `liveness` hasn't been calculated yet the implementation will default to returning 'used' value. > > The implementations are based on my (rather shallow) knowledge of inner working of the respective GC engines and I am open to suggestions to make them better/correct. > > ### Epsilon GC > > Trivial implementation - just return `used()` instead. > > ### Serial GC > > Here we utilize the fact that mark-copy phase is naturally compacting so the number of bytes after copy is 'live' and that the mark-sweep implementation keeps internal info about objects being 'dead' but excluded from the compaction effort and we can use these numbers to derive the old-gen live set size (used bytes minus the cumulative size of the 'un-dead' objects). > > ### Parallel GC > > For Parallel GC the liveness is calculated as the sum of used bytes in all regions after the last GC cycle. This seems to be a safe bet because this collector is always compacting (AFAIK). > > ### G1 GC > > Using `G1ConcurrentMark::remark()` method the live set size is computed as a sum of `_live_words` from the associated `G1RegionMarkStats` objects. Here I am not 100% sure this approach covers all eventualities and it would be great to have someone skilled in G1 implementation to chime in so I can fix it. However, the numbers I am getting for G1 are comparable to other GCs for the same application. > > ### Shenandoah > > In Shenandoah, the regions are keeping the liveness info. However, the VM op that is used for iterating regions is a safe-pointing one so it would be great to run it in an already safe-pointed context.
> This leads to hooking into `ShenandoahConcurrentMark::finish_mark()` and `ShenandoahSTWMark::mark()` where at the end of the marking process the liveness info is summarized and set to `ShenandoahHeap::_live` volatile field - which is later read by the event emitting code. > > ### ZGC > > `ZStatHeap` is already holding the liveness info - so this implementation is just making it accessible via `ZCollectedHeap::live()` method. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/2579 From ihse at openjdk.java.net Thu Mar 18 09:52:15 2021 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Thu, 18 Mar 2021 09:52:15 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> Message-ID: <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> On Wed, 17 Mar 2021 23:08:31 GMT, Yasumasa Suenaga wrote: >> Did you notice this workaround in the gcc bug report? >> >> "Note that a logical negation prior to cast to void is sufficient to suppress the warning: >> >> int void_cast_should_not_warn() { >> (void) !foo(); >> // ^-- here >> return 0; >> } >> " > >> "Note that a logical negation prior to cast to void is sufficient to suppress the warning: >> >> ``` >> int void_cast_should_not_warn() { >> (void) !foo(); >> // ^-- here >> return 0; >> } >> ``` >> >> " > > The warning has gone with `(void)!`, but isn't is strange a bit? IMHO code changes or pragma (or compiler option) are suitable than it. I can't really comment on which is better -- it's up to you hotspot developers. 
Nevertheless, my personal preference is that `(void)` would have been best since it's a well-established idiom. If gcc is buggy about this, then I think I'd preferred to use this workaround with a comment `// the ! is needed to workaround gcc bug`, to keep with the idiom as much as possible. Pragmas also work, but I personally consider them ugly and a last resort. But once again, don't listen to what I'm saying :-) ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From stuefe at openjdk.java.net Thu Mar 18 10:06:00 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 18 Mar 2021 10:06:00 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> Message-ID: On Thu, 18 Mar 2021 09:48:27 GMT, Magnus Ihse Bursie wrote: >>> "Note that a logical negation prior to cast to void is sufficient to suppress the warning: >>> >>> ``` >>> int void_cast_should_not_warn() { >>> (void) !foo(); >>> // ^-- here >>> return 0; >>> } >>> ``` >>> >>> " >> >> The warning has gone with `(void)!`, but isn't is strange a bit? IMHO code changes or pragma (or compiler option) are suitable than it. > > I can't really comment on which is better -- it's up to you hotspot developers. > > Nevertheless, my personal preference is that `(void)` would have been best since it's a well-established idiom. If gcc is buggy about this, then I think I'd preferred to use this workaround with a comment `// the ! is needed to workaround gcc bug`, to keep with the idiom as much as possible. Pragmas also work, but I personally consider them ugly and a last resort. 
But once again, don't listen to what I'm saying :-) I agree with Magnus. BTW, we are sure that alloca() call is not simply optimized away, right? Otherwise I would assign the return value to some file static volatile holder. ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From goetz at openjdk.java.net Thu Mar 18 10:24:40 2021 From: goetz at openjdk.java.net (Goetz Lindenmaier) Date: Thu, 18 Mar 2021 10:24:40 GMT Subject: RFR: 8263260: [s390] Support latest hardware (z14 and z15) In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 17:35:20 GMT, Lutz Schmidt wrote: > 8263260: [s390] Support latest hardware (z14 and z15) LGTM ------------- Marked as reviewed by goetz (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2918 From mdoerr at openjdk.java.net Thu Mar 18 11:49:41 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Thu, 18 Mar 2021 11:49:41 GMT Subject: RFR: 8263260: [s390] Support latest hardware (z14 and z15) In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 17:35:20 GMT, Lutz Schmidt wrote: > 8263260: [s390] Support latest hardware (z14 and z15) Detection looks good to me. Cleanup proposal: Try to make dates more consistent and comprehensive. src/hotspot/cpu/s390/vm_version_s390.cpp line 54: > 52: static const char* z_name[] = {" ", "z900", "z990", "z9 EC", "z10 EC", "z196 EC", "ec12", "z13", "z14", "z15" }; > 53: static const char* z_WDFM[] = {" ", "2006-06-30", "2008-06-30", "2010-06-30", "2012-06-30", "2014-06-30", "2016-12-31", "2019-06-30", "2021-06-30", "tbd" }; > 54: static const char* z_EOS[] = {" ", "2014-12-31", "2014-12-31", "2017-10-31", "2019-12-31", "2021-12-31", "tbd", "tbd", "tbd", "tbd" }; Table provides a nice overview, but seems like only z_name is used in the code. The rest only serves as comments. 
src/hotspot/cpu/s390/vm_version_s390.cpp line 309: > 307: } > 308: if (is_z9()) { > 309: _features_string = "system-z, g3-z9, ldisp_fast, extimm, out-of-support_as_of_2016-04-01"; How does this relate to the table above? src/hotspot/cpu/s390/vm_version_s390.hpp line 48: > 46: // z13: 2015-03 > 47: // z14: 2017-09 > 48: // z15: 2019-09 How does this relate to the table in the .cpp file? I'd prefer to have such kind of information consolidated at one place. ------------- Marked as reviewed by mdoerr (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2918 From ysuenaga at openjdk.java.net Thu Mar 18 12:29:38 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Thu, 18 Mar 2021 12:29:38 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> Message-ID: <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnPKEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> On Thu, 18 Mar 2021 10:03:04 GMT, Thomas Stuefe wrote: >> I can't really comment on which is better -- it's up to you hotspot developers. >> >> Nevertheless, my personal preference is that `(void)` would have been best since it's a well-established idiom. If gcc is buggy about this, then I think I'd preferred to use this workaround with a comment `// the ! is needed to workaround gcc bug`, to keep with the idiom as much as possible. Pragmas also work, but I personally consider them ugly and a last resort. But once again, don't listen to what I'm saying :-) > > I agree with Magnus. BTW, we are sure that alloca() call is not simply optimized away, right? Otherwise I would assign the return value to some file static volatile holder. First, I think it is the best to remove `alloca()`. 
According to the comment, it seems to aim to reduce cache pollution. But I wonder why this `alloca()` call resolves it because stack memory will be allocated in each thread - they should be different physical memory. This code has existed since initial load, so I cannot find JBS ticket for this. So I'm not sure we can remove it yet. (comments are welcome!) If we decide to keep this code, we need to avoid unused-result warning from GCC. I think we can use pragma because other pragmas (format-nonliteral, format-security, stringop-truncation) are used in HotSpot. In addition, this behavior does not seem to have been determined to be a bug in GCC (status is UNCONFIRMED). However I will agree to use `(void)!` if it is still preferred based on them. ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From sjohanss at openjdk.java.net Thu Mar 18 12:50:46 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 18 Mar 2021 12:50:46 GMT Subject: RFR: 8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v23] In-Reply-To: References: <_uIvdHdm5ptjZJ8gEw8AzBJ8PC-16GoEWUuKW5zAXrg=.36c2da1c-685b-4d4d-a4e5-73b3fc48b812@github.com> Message-ID: On Wed, 17 Mar 2021 14:55:48 GMT, Thomas Stuefe wrote: > > This would also give us time to think about a question I haven't made up my mind around. What will `LargePageSizeInBytes` mean after this change? Should the JVM use 1g pages (on a system where 2m is the default) even if `LargePageSizeInBytes` is not set? > > I see two valid scenarios: Me too. > a) either use huge pages as best as possible; remove fine-grained control from user hands. So, if we have 1G pages, prefer those over 2M over 4K. Then, we could completely remove LargePageSizeInBytes. There is no need for this switch anymore. I agree, preferably we can make it so that the upper layers can use something like `page_size_for_region*` and request a certain page size, but fall back to smaller ones.
> b) or keep the current behavior. In that case, UseLargePages means "use at most default huge page size" even if larger pages are available; and LargePageSizeInBytes can be used to override the large page size to use. > So it becomes more like a maximum value, right? Or at least this is how I've thought about this second scenario. On a system with both 2M (the default size) and 1G pages available you would have to set `LargePageSizeInBytes=1g` to use the 1G pages, but the 2M could still be used for smaller mappings. > (a): It's elegant, and efficiently uses system resources when available. But it's an incompatible change, and the VM being grabby means we could end up using pages meant for a different process. > (b): downward compatible. In a sense. With Marcus' change, we break downward compatibility already: where "LargePageSizeInBytes" before meant "use that or nothing", it now means "use that or whatever smaller page sizes you find". > > I prefer (a), honestly. > I would also prefer (a).
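A minimal sketch of the "request a page size, but fall back to smaller ones" idea discussed above. `select_page_size` and its parameters are hypothetical names for illustration only; this is not HotSpot's `page_size_for_region*` and makes no claim about how the real flags are wired up. Under scenario (a) the cap would simply be the largest available size; under scenario (b) the cap would be the default large page size unless `LargePageSizeInBytes` raises it.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a HotSpot function): walk the available page
// sizes from largest to smallest and return the first one that is within
// the configured cap and not larger than the region being mapped.
size_t select_page_size(const std::vector<size_t>& sizes_descending,
                        size_t max_page_size,
                        size_t region_bytes) {
  assert(!sizes_descending.empty());
  for (size_t page_size : sizes_descending) {
    if (page_size <= max_page_size && page_size <= region_bytes) {
      return page_size;
    }
  }
  // Nothing fits: fall back to the smallest (base) page size.
  return sizes_descending.back();
}
```

With sizes of 1G, 2M and 4K available, a cap of 2M keeps a 512M region on 2M pages, while raising the cap to 1G lets a 1G region use a single 1G page; a 1M region falls back to 4K pages in either case.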
------------- PR: https://git.openjdk.java.net/jdk/pull/1153 From david.holmes at oracle.com Thu Mar 18 13:25:02 2021 From: david.holmes at oracle.com (David Holmes) Date: Thu, 18 Mar 2021 23:25:02 +1000 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnPKEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnPKEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: <343eafb7-b2ef-ca04-0090-787470084831@oracle.com> Hi Yasumasa, On 18/03/2021 10:29 pm, Yasumasa Suenaga wrote: > On Thu, 18 Mar 2021 10:03:04 GMT, Thomas Stuefe wrote: > >>> I can't really comment on which is better -- it's up to you hotspot developers. >>> >>> Nevertheless, my personal preference is that `(void)` would have been best since it's a well-established idiom. If gcc is buggy about this, then I think I'd preferred to use this workaround with a comment `// the ! is needed to workaround gcc bug`, to keep with the idiom as much as possible. Pragmas also work, but I personally consider them ugly and a last resort. But once again, don't listen to what I'm saying :-) >> >> I agree with Magnus. BTW, we are sure that alloca() call is not simply optimized away, right? Otherwise I would assign the return value to some file static volatile holder. > > First, I think it is the best to remove `alloca()`. According to the comment, it seems to aim to reduce cache pollution. But I wonder why this `alloca()` call resolves it because stack memory will be allocated in each threads - they should be different physical memory. This code has existed since initial load, so I cannot find JBS ticket for this. 
So I'm not sure we can remove it yet. (comments are welcome!) The alloca was added as a performance boost for hyperthreaded systems back in 2003 for JDK 5: "A per-thread offset was added to each thread's stack to randomize the cachelines of hot stack frames (aka, stack coloring)." The issue is not open, hence why you could not find it, but it says very little beyond what I just quoted. I'm running some of our benchmarks to see if removing it makes a difference ... but chances are I'm not going to be running it on the kind of machines for which it was introduced. > If we decide to remain this code, we need to avoid unused-result warning from GCC. I think we can use pragma because other pragmas (format-nonliteral, format-security, stringop-truncation") are used in HotSpot. In addition, this behavior does not seem to determine to be a bug in GCC (status is UNCONFIRMED). However I will agree to use `(void)!` if it is still preferred based on them. I'd reluctantly prefer to use the pragma as an official mechanism for avoiding the warning. BTW this is not the first time we have had this issue with gcc making a function deprecated and casting to void not fixing it. Unfortunately I can't remember enough of the previous case's details to actually look it up and see what we resolved to do. :( David ----- > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3042 > From sjohanss at openjdk.java.net Thu Mar 18 14:12:52 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 18 Mar 2021 14:12:52 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs Message-ID: Please review this refactoring of the hugetlbfs reservation code. 
**Summary** In recent adventures in this area of the code I noticed a strange condition in `reserve_memory_special_huge_tlbfs` where we take the "mixed-mapping" route even if the size doesn't require any small pages to be used: if (is_aligned(bytes, os::large_page_size()) && alignment <= os::large_page_size()) { return reserve_memory_special_huge_tlbfs_only(bytes, req_addr, exec); } else { return reserve_memory_special_huge_tlbfs_mixed(bytes, alignment, req_addr, exec); } The second condition here is needed because if the alignment is larger than the large page size, we needed to enforce this and can't just trust `mmap` to give us a properly aligned address. Doing this by using the mixed-function feels a bit weird and looking a bit more at this I found a way to refactor this function to avoid having the two helpers. Instead of only having the mixed path honor the passed down alignment, make sure that is always done. This will also have the side-effect that all large pages in a "mixed"-mapping will be at the start and then we will have a tail of small pages. This actually also ensures that we will use large pages for a mixed mapping, in the past there was a corner case where we could end up with just a head and tail of small pages and no large page in between (if the mapping was smaller than 2 large pages and there was no alignment constraint). **Testing** Mach5 tier1-3 and a lot of local testing with different large page configurations. 
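A minimal sketch of the layout arithmetic described in the summary, assuming power-of-two page sizes and alignments. `align_up`, `split_mapping` and `MappingLayout` are illustrative names, not the functions in the patch. The first helper shows how an address inside an over-sized reservation can be rounded up to the requested alignment; the second shows the resulting split into a head of large pages and a tail of small pages.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// True for 1, 2, 4, 8, ...
bool is_power_of_2(size_t v) { return v != 0 && (v & (v - 1)) == 0; }

// Round an address up to the next multiple of a power-of-two alignment.
uintptr_t align_up(uintptr_t addr, size_t alignment) {
  assert(is_power_of_2(alignment));
  return (addr + alignment - 1) & ~static_cast<uintptr_t>(alignment - 1);
}

// Hypothetical result type: bytes backed by large pages, then small pages.
struct MappingLayout {
  size_t large_bytes;  // head of the mapping, multiple of the large page size
  size_t small_bytes;  // tail of the mapping, backed by small pages
};

// Assuming the start address is already aligned to the large page size,
// everything up to the last large-page boundary uses large pages and the
// remainder forms the small-page tail.
MappingLayout split_mapping(size_t bytes, size_t large_page_size) {
  assert(is_power_of_2(large_page_size));
  size_t large = bytes & ~(large_page_size - 1);  // align down
  return MappingLayout{large, bytes - large};
}
```

For a 5M request with 2M large pages this gives a 4M large-page head and a 1M small-page tail, so any request of at least one large page gets at least one large page.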
------------- Commit messages: - 8262291: Refactor reserve_memory_special_huge_tlbfs Changes: https://git.openjdk.java.net/jdk/pull/3073/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3073&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8262291 Stats: 159 lines in 3 files changed: 24 ins; 65 del; 70 mod Patch: https://git.openjdk.java.net/jdk/pull/3073.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3073/head:pull/3073 PR: https://git.openjdk.java.net/jdk/pull/3073 From ysuenaga at openjdk.java.net Thu Mar 18 14:42:38 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Thu, 18 Mar 2021 14:42:38 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnPKEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnPKEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Thu, 18 Mar 2021 12:27:04 GMT, Yasumasa Suenaga wrote: >> I agree with Magnus. BTW, we are sure that alloca() call is not simply optimized away, right? Otherwise I would assign the return value to some file static volatile holder. > > First, I think it is the best to remove `alloca()`. According to the comment, it seems to aim to reduce cache pollution. But I wonder why this `alloca()` call resolves it because stack memory will be allocated in each thread - they should be different physical memory. This code has existed since initial load, so I cannot find JBS ticket for this. So I'm not sure we can remove it yet. (comments are welcome!) > > If we decide to keep this code, we need to avoid unused-result warning from GCC.
I think we can use pragma because other pragmas (format-nonliteral, format-security, stringop-truncation") are used in HotSpot. In addition, this behavior does not seem to determine to be a bug in GCC (status is UNCONFIRMED). However I will agree to use `(void)!` if it is still preferred based on them. > The alloca was added as a performance boost for hyperthreaded systems > back in 2003 for JDK 5: > > "A per-thread offset was added to each thread's stack to randomize the > cachelines of hot stack frames (aka, stack coloring)." IIUC it relates to share DTLB between two logical processors when Hyperthreading is enabled. > I'm running some of our benchmarks to see if removing it makes a > difference ... but chances are I'm not going to be running it on the > kind of machines for which it was introduced. Thanks! > > If we decide to remain this code, we need to avoid unused-result warning from GCC. I think we can use pragma because other pragmas (format-nonliteral, format-security, stringop-truncation") are used in HotSpot. In addition, this behavior does not seem to determine to be a bug in GCC (status is UNCONFIRMED). However I will agree to use `(void)!` if it is still preferred based on them. > > I'd reluctantly prefer to use the pragma as an official mechanism for > avoiding the warning. Ok, OpenJDK folks who discuss in this PR are not prefer to use pragma, so I will use `(void)!` if it is needed. Let's see the result of benchmark and discuss what should we do. 
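A minimal sketch of the two suppression options weighed in this thread, assuming GCC or Clang. `bump` is a hypothetical stand-in for the call whose result is deliberately dropped; it is not the `alloca()` code in `os_linux.cpp`, and the pragma form is shown directly rather than through any HotSpot macro.

```cpp
#include <cassert>

// Hypothetical stand-in for a call whose result the compiler insists on
// being used (GCC/Clang attribute syntax).
__attribute__((warn_unused_result))
int bump(int* counter) {
  assert(counter != nullptr);
  return ++*counter;
}

// Option 1: logical negation before the cast to void. The ! is needed to
// work around GCC still warning on a plain (void) cast.
int discard_with_negation(int* counter) {
  (void) !bump(counter);
  return *counter;
}

// Option 2: disable the diagnostic only around the one call, restoring
// the previous diagnostic state afterwards.
int discard_with_pragma(int* counter) {
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-result"
  bump(counter);
#pragma GCC diagnostic pop
  return *counter;
}
```

Both variants still perform the call; they differ only in how the warning is silenced, which is the trade-off between keeping the `(void)` idiom recognizable and using an explicit, scoped compiler mechanism.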
------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From iignatyev at openjdk.java.net Thu Mar 18 15:37:39 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 18 Mar 2021 15:37:39 GMT Subject: RFR: 8246494: introduce vm.flagless at-requires property In-Reply-To: References: Message-ID: On Wed, 17 Mar 2021 07:02:59 GMT, Serguei Spitsyn wrote: >> resurrecting old [RFR](https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-June/041981.html): >> >>> Hi all, >>> >>> could you please review the patch which introduces a new @requires property to filter out the tests which ignore externally provided JVM flags? >>> >>> the idea behind this patch is to have a way to clearly mark tests which ignore flags, so >>> a) it's obvious that they don't execute a flag-guarded code/feature, and extra care should be taken to use them to verify any flag-guarded changed; >>> b) they can be easily excluded from runs w/ flags. >>> >>> @requires and VMProps allows us to achieve both, so it's been decided to add a new property `vm.flagless`. `vm.flagless` is set to false if there are any XX flags other than `-XX:MaxRAMPercentage` and `-XX:CreateCoredumpOnCrash` (which are known to be set almost always) or any X flags other `-Xmixed`; in other words any tests w/ `@requires vm.flagless` will be excluded from runs w/ any other X / XX flags passed via `-vmoption` / `-javaoption`. in rare cases, when one still wants to run the tests marked by `vm.flagless` w/ external flags, `vm.flagless` can be forcefully set to true by setting any value to `TEST_VM_FLAGLESS` env. variable. >>> >>> this patch adds necessary common changes and marks common tests, namely Scimark, GTestWrapper and TestNativeProcessBuilder. Component-specific tests will be marked separately by the corresponding subtasks of 8151707[1]. 
>>> >>> please note, the patch depends on CODETOOLS-7902336[2], which will be included in the next jtreg version, so this patch is to be integrated only after jtreg5.1 is promoted and we switch to use it by 8246387[3]. >>> >>> JBS: https://bugs.openjdk.java.net/browse/JDK-8246494 >>> webrev: http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00 >>> testing: marked tests w/ different XX and X flags w/ and w/o TEST_VM_FLAGLESS env. var, and w/o any flags >>> >>> [1] https://bugs.openjdk.java.net/browse/JDK-8151707 >>> [2] https://bugs.openjdk.java.net/browse/CODETOOLS-7902336 >>> [3] https://bugs.openjdk.java.net/browse/JDK-8246387 >>> >> >> after offline discussion with @pliden, it has been decided to reduce the scope of [8246499](https://bugs.openjdk.java.net/browse/JDK-8246499) and not mark the tests that use `UseXGC` flags for selection, e.g. `test/hotspot/jtreg/gc/z/TestSmallHeap.java`. >> >> Thanks, >> -- Igor > > Igor, > The fix looks good to me. > Thanks, > Serguei Thanks, Serguei! ------------- PR: https://git.openjdk.java.net/jdk/pull/2800 From iignatyev at openjdk.java.net Thu Mar 18 15:37:40 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 18 Mar 2021 15:37:40 GMT Subject: Integrated: 8246494: introduce vm.flagless at-requires property In-Reply-To: References: Message-ID: On Tue, 2 Mar 2021 23:27:21 GMT, Igor Ignatyev wrote: > resurrecting old [RFR](https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-June/041981.html): > >> Hi all, >> >> could you please review the patch which introduces a new @requires property to filter out the tests which ignore externally provided JVM flags? >> >> the idea behind this patch is to have a way to clearly mark tests which ignore flags, so >> a) it's obvious that they don't execute a flag-guarded code/feature, and extra care should be taken to use them to verify any flag-guarded changed; >> b) they can be easily excluded from runs w/ flags. 
>> >> @requires and VMProps allows us to achieve both, so it's been decided to add a new property `vm.flagless`. `vm.flagless` is set to false if there are any XX flags other than `-XX:MaxRAMPercentage` and `-XX:CreateCoredumpOnCrash` (which are known to be set almost always) or any X flags other `-Xmixed`; in other words any tests w/ `@requires vm.flagless` will be excluded from runs w/ any other X / XX flags passed via `-vmoption` / `-javaoption`. in rare cases, when one still wants to run the tests marked by `vm.flagless` w/ external flags, `vm.flagless` can be forcefully set to true by setting any value to `TEST_VM_FLAGLESS` env. variable. >> >> this patch adds necessary common changes and marks common tests, namely Scimark, GTestWrapper and TestNativeProcessBuilder. Component-specific tests will be marked separately by the corresponding subtasks of 8151707[1]. >> >> please note, the patch depends on CODETOOLS-7902336[2], which will be included in the next jtreg version, so this patch is to be integrated only after jtreg5.1 is promoted and we switch to use it by 8246387[3]. >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8246494 >> webrev: http://cr.openjdk.java.net/~iignatyev//8246494/webrev.00 >> testing: marked tests w/ different XX and X flags w/ and w/o TEST_VM_FLAGLESS env. var, and w/o any flags >> >> [1] https://bugs.openjdk.java.net/browse/JDK-8151707 >> [2] https://bugs.openjdk.java.net/browse/CODETOOLS-7902336 >> [3] https://bugs.openjdk.java.net/browse/JDK-8246387 >> > > after offline discussion with @pliden, it has been decided to reduce the scope of [8246499](https://bugs.openjdk.java.net/browse/JDK-8246499) and not mark the tests that use `UseXGC` flags for selection, e.g. `test/hotspot/jtreg/gc/z/TestSmallHeap.java`. > > Thanks, > -- Igor This pull request has now been integrated. 
Changeset: e333b6e1 Author: Igor Ignatyev URL: https://git.openjdk.java.net/jdk/commit/e333b6e1 Stats: 81 lines in 6 files changed: 75 ins; 0 del; 6 mod 8246494: introduce vm.flagless at-requires property Reviewed-by: mseledtsov, sspitsyn ------------- PR: https://git.openjdk.java.net/jdk/pull/2800 From github.com+76791+alblue at openjdk.java.net Thu Mar 18 15:49:38 2021 From: github.com+76791+alblue at openjdk.java.net (Alex Blewitt) Date: Thu, 18 Mar 2021 15:49:38 GMT Subject: RFR: 8263659: Reflow GTestResultParser for better readability In-Reply-To: References: Message-ID: On Tue, 16 Mar 2021 09:17:30 GMT, Aleksey Shipilev wrote: >> As noted by https://sonarcloud.io/code?id=shipilev_jdk&selected=shipilev_jdk%3Atest%2Fhotspot%2Fjtreg%2Fgtest%2FGTestResultParser.java there are a few fixes that can be applied for the GTestResultParser: >> >> * Avoiding nested 'try' statements >> * Avoiding nested 'switch' statements >> * Adding a break for each switch case to prevent accidental/unwanted fall-through >> * Disabling ability to load from remote files when parsing XML files > > Please change this PR synopsis to "8263659: Reflow GTestResultParser for better readability" to get this hooked properly. Also, enable testing, see "Pre-submit test status" in "Checks". One of the test jobs hasn't finished running according to the above, but has if you drill down into the check itself. Is that blocking this PR moving forwards? Should I try to re-run the checks? ------------- PR: https://git.openjdk.java.net/jdk/pull/2991 From shade at openjdk.java.net Thu Mar 18 15:52:39 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 18 Mar 2021 15:52:39 GMT Subject: RFR: 8263659: Reflow GTestResultParser for better readability In-Reply-To: References: Message-ID: On Thu, 18 Mar 2021 15:46:39 GMT, Alex Blewitt wrote: >> Please change this PR synopsis to "8263659: Reflow GTestResultParser for better readability" to get this hooked properly. 
Also, enable testing, see "Pre-submit test status" in "Checks". > > One of the test jobs hasn't finished running according to the above, but has if you drill down into the check itself. Is that blocking this PR moving forwards? Should I try to re-run the checks? I think all checks have passed: https://github.com/openjdk/jdk/pull/2991/checks. We just need a test maintainer to ack the patch. @iignatev, want to to take a look? ------------- PR: https://git.openjdk.java.net/jdk/pull/2991 From iignatyev at openjdk.java.net Thu Mar 18 15:57:43 2021 From: iignatyev at openjdk.java.net (Igor Ignatyev) Date: Thu, 18 Mar 2021 15:57:43 GMT Subject: RFR: 8263659: Reflow GTestResultParser for better readability In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 22:13:39 GMT, Alex Blewitt wrote: > As noted by https://sonarcloud.io/code?id=shipilev_jdk&selected=shipilev_jdk%3Atest%2Fhotspot%2Fjtreg%2Fgtest%2FGTestResultParser.java there are a few fixes that can be applied for the GTestResultParser: > > * Avoiding nested 'try' statements > * Avoiding nested 'switch' statements > * Adding a break for each switch case to prevent accidental/unwanted fall-through > * Disabling ability to load from remote files when parsing XML files LGTM. @alblue, thanks for fixing that. ------------- Marked as reviewed by iignatyev (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2991 From shade at openjdk.java.net Thu Mar 18 16:00:40 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Thu, 18 Mar 2021 16:00:40 GMT Subject: RFR: 8263659: Reflow GTestResultParser for better readability In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 22:13:39 GMT, Alex Blewitt wrote: > As noted by https://sonarcloud.io/code?id=shipilev_jdk&selected=shipilev_jdk%3Atest%2Fhotspot%2Fjtreg%2Fgtest%2FGTestResultParser.java there are a few fixes that can be applied for the GTestResultParser: > > * Avoiding nested 'try' statements > * Avoiding nested 'switch' statements > * Adding a break for each switch case to prevent accidental/unwanted fall-through > * Disabling ability to load from remote files when parsing XML files Marked as reviewed by shade (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2991 From github.com+76791+alblue at openjdk.java.net Thu Mar 18 16:44:42 2021 From: github.com+76791+alblue at openjdk.java.net (Alex Blewitt) Date: Thu, 18 Mar 2021 16:44:42 GMT Subject: Integrated: 8263659: Reflow GTestResultParser for better readability In-Reply-To: References: Message-ID: On Sat, 13 Mar 2021 22:13:39 GMT, Alex Blewitt wrote: > As noted by https://sonarcloud.io/code?id=shipilev_jdk&selected=shipilev_jdk%3Atest%2Fhotspot%2Fjtreg%2Fgtest%2FGTestResultParser.java there are a few fixes that can be applied for the GTestResultParser: > > * Avoiding nested 'try' statements > * Avoiding nested 'switch' statements > * Adding a break for each switch case to prevent accidental/unwanted fall-through > * Disabling ability to load from remote files when parsing XML files This pull request has now been integrated. 
Changeset: 21db0f67 Author: Alex Blewitt Committer: Igor Ignatyev URL: https://git.openjdk.java.net/jdk/commit/21db0f67 Stats: 25 lines in 1 file changed: 3 ins; 4 del; 18 mod 8263659: Reflow GTestResultParser for better readability Reviewed-by: shade, iignatyev ------------- PR: https://git.openjdk.java.net/jdk/pull/2991 From ysuenaga at openjdk.java.net Fri Mar 19 00:05:39 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 19 Mar 2021 00:05:39 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Thu, 18 Mar 2021 14:40:20 GMT, Yasumasa Suenaga wrote: > > The alloca was added as a performance boost for hyperthreaded systems > > back in 2003 for JDK 5: > > "A per-thread offset was added to each thread's stack to randomize the > > cachelines of hot stack frames (aka, stack coloring)." > > IIUC it relates to share DTLB between two logical processors when Hyperthreading is enabled. `alloca()` call might be less effective if ASLR is enabled. It can be configured by [randomize_va_space](https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html#randomize-va-space), and I guess it is enabled in a lot of x86 systems. 
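As a rough illustration of the stack-coloring idea quoted in this message, the sketch below derives a small per-thread offset from the process id and a counter, then reserves that many bytes with `alloca()` before running the thread body. The mask of 7, the 128-byte step and every name here are illustrative assumptions, not values confirmed in this thread. Storing the pointer in a volatile holder follows the suggestion made earlier in the discussion for keeping the compiler from discarding the call. A Linux-style `<alloca.h>` is assumed.

```cpp
#include <alloca.h>
#include <cassert>
#include <cstddef>

// Volatile holder: writing the alloca() result here gives the call an
// observable side effect, so the compiler has a reason to keep it.
static void* volatile g_color_sink = nullptr;

// Picks one of eight offsets (0, 128, ..., 896 bytes) from pid and counter.
// Both constants are assumptions chosen for illustration.
static size_t stack_color_bytes(int pid, int counter) {
  const size_t bytes = static_cast<size_t>((pid ^ counter) & 7) * 128;
  assert(bytes <= 7 * 128);
  return bytes;
}

// Shifts the stack by the chosen offset, then runs `body` on top of it.
// The padding must be allocated in the frame that calls `body`, because
// alloca() storage is released when the allocating function returns.
template <typename Body>
static void run_with_stack_color(int pid, int counter, Body body) {
  void* pad = alloca(stack_color_bytes(pid, counter));
  g_color_sink = pad;
  body();
}
```

With ASLR enabled the initial stack address is already randomized, which is why the message above questions how much such a manual offset still adds.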
------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From ysuenaga at openjdk.java.net Fri Mar 19 00:17:37 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 19 Mar 2021 00:17:37 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Fri, 19 Mar 2021 00:02:21 GMT, Yasumasa Suenaga wrote: >>> The alloca was added as a performance boost for hyperthreaded systems >>> back in 2003 for JDK 5: >>> >>> "A per-thread offset was added to each thread's stack to randomize the >>> cachelines of hot stack frames (aka, stack coloring)." >> >> IIUC it relates to share DTLB between two logical processors when Hyperthreading is enabled. >> >>> I'm running some of our benchmarks to see if removing it makes a >>> difference ... but chances are I'm not going to be running it on the >>> kind of machines for which it was introduced. >> >> Thanks! >> >>> > If we decide to remain this code, we need to avoid unused-result warning from GCC. I think we can use pragma because other pragmas (format-nonliteral, format-security, stringop-truncation") are used in HotSpot. In addition, this behavior does not seem to determine to be a bug in GCC (status is UNCONFIRMED). However I will agree to use `(void)!` if it is still preferred based on them. >>> >>> I'd reluctantly prefer to use the pragma as an official mechanism for >>> avoiding the warning. >> >> Ok, OpenJDK folks who discuss in this PR are not prefer to use pragma, so I will use `(void)!` if it is needed. >> Let's see the result of benchmark and discuss what should we do. 
> >> > The alloca was added as a performance boost for hyperthreaded systems >> > back in 2003 for JDK 5: >> > "A per-thread offset was added to each thread's stack to randomize the >> > cachelines of hot stack frames (aka, stack coloring)." >> >> IIUC it relates to share DTLB between two logical processors when Hyperthreading is enabled. > > `alloca()` call might be less effective if ASLR is enabled. It can be configured by [randomize_va_space](https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html#randomize-va-space), and I guess it is enabled in a lot of x86 systems. > BTW this is not the first time we have had this issue with gcc making a > function deprecated and casting to void not fixing it. Unfortunately I > can't remember enough of the previous case's details to actually look it > up and see what we resolved to do. :( I found [JDK-6879689](https://bugs.openjdk.java.net/browse/JDK-6879689), and it fixed unused-return as following: https://github.com/openjdk/jdk/blob/9b5a9b61899cf649104c0ff70e14549f64a89561/src/hotspot/share/adlc/archDesc.cpp#L1082-L1083 I prefer it rather than `(void)!`. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From stuefe at openjdk.java.net Fri Mar 19 04:47:42 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 19 Mar 2021 04:47:42 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Fri, 19 Mar 2021 00:14:38 GMT, Yasumasa Suenaga wrote: > > BTW this is not the first time we have had this issue with gcc making a > > function deprecated and casting to void not fixing it. Unfortunately I > > can't remember enough of the previous case's details to actually look it > > up and see what we resolved to do. :( > > I found [JDK-6879689](https://bugs.openjdk.java.net/browse/JDK-6879689), and it fixed unused-return as following: > > https://github.com/openjdk/jdk/blob/9b5a9b61899cf649104c0ff70e14549f64a89561/src/hotspot/share/adlc/archDesc.cpp#L1082-L1083 > > I prefer it rather than `(void)!`. Does that work in release builds too? 
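The release-build question can be made concrete with a small sketch. It assumes the pattern under discussion is "store the result in a local and assert on it", which is a guess at the shape of the linked code rather than a copy of it. `copy_text` and `copy_checked` are hypothetical names, and the standard `assert` stands in for HotSpot's own assert macro.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical must-check function: copies at most cap - 1 characters,
// NUL-terminates, and returns the number of characters copied.
__attribute__((warn_unused_result))
static size_t copy_text(char* dst, size_t cap, const char* src) {
  if (cap == 0) return 0;
  size_t n = std::strlen(src);
  if (n >= cap) n = cap - 1;
  std::memcpy(dst, src, n);
  dst[n] = '\0';
  return n;
}

// The result goes into a local, which satisfies warn_unused_result.
static bool copy_checked(char* dst, size_t cap, const char* src) {
  const size_t copied = copy_text(dst, cap, src);
  // This check compiles away when NDEBUG is defined (a release build).
  assert(copied <= std::strlen(src));
  // If the assert were the only use, a release build could flag `copied`
  // as an unused variable. This second use survives in every build.
  return copied == std::strlen(src);
}
```

In other words, a local that is read only inside an assert may trade one warning for another in builds where asserts are compiled out.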
------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From david.holmes at oracle.com Fri Mar 19 05:09:56 2021 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 Mar 2021 15:09:56 +1000 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: <754b2753-06ee-60ed-464b-fd3dcb552324@oracle.com> On 19/03/2021 10:17 am, Yasumasa Suenaga wrote: > On Fri, 19 Mar 2021 00:02:21 GMT, Yasumasa Suenaga wrote: > >>>> The alloca was added as a performance boost for hyperthreaded systems >>>> back in 2003 for JDK 5: >>>> >>>> "A per-thread offset was added to each thread's stack to randomize the >>>> cachelines of hot stack frames (aka, stack coloring)." >>> >>> IIUC it relates to share DTLB between two logical processors when Hyperthreading is enabled. >>> >>>> I'm running some of our benchmarks to see if removing it makes a >>>> difference ... but chances are I'm not going to be running it on the >>>> kind of machines for which it was introduced. >>> >>> Thanks! >>> >>>>> If we decide to remain this code, we need to avoid unused-result warning from GCC. I think we can use pragma because other pragmas (format-nonliteral, format-security, stringop-truncation") are used in HotSpot. In addition, this behavior does not seem to determine to be a bug in GCC (status is UNCONFIRMED). However I will agree to use `(void)!` if it is still preferred based on them. >>>> >>>> I'd reluctantly prefer to use the pragma as an official mechanism for >>>> avoiding the warning. >>> >>> Ok, OpenJDK folks who discuss in this PR are not prefer to use pragma, so I will use `(void)!` if it is needed. 
>>> Let's see the result of benchmark and discuss what should we do. >> >>>> The alloca was added as a performance boost for hyperthreaded systems >>>> back in 2003 for JDK 5: >>>> "A per-thread offset was added to each thread's stack to randomize the >>>> cachelines of hot stack frames (aka, stack coloring)." >>> >>> IIUC it relates to share DTLB between two logical processors when Hyperthreading is enabled. >> >> `alloca()` call might be less effective if ASLR is enabled. It can be configured by [randomize_va_space](https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html#randomize-va-space), and I guess it is enabled in a lot of x86 systems. It relates to L1 cache index collisions primarily so only impacts hyperthreading on Intel CPUs (it also impacted CMT on SPARC). I've been told that ASLR may render the manual stack-coloring unnecessary (as well as ineffective), but then we'd need to establish which platforms have that enabled, and whether we can tell. The benchmarking has been inconclusive - no observable differences. But the benchmarking machines (and indeed the benchmarks) may not be representative of the context where this optimization is needed. It has also been raised (as it was here) whether the alloca is even left in place by the compiler. Something I have yet to check. >> BTW this is not the first time we have had this issue with gcc making a >> function deprecated and casting to void not fixing it. Unfortunately I >> can't remember enough of the previous case's details to actually look it >> up and see what we resolved to do. :( > > I found [JDK-6879689](https://bugs.openjdk.java.net/browse/JDK-6879689), and it fixed unused-return as following: > > https://github.com/openjdk/jdk/blob/9b5a9b61899cf649104c0ff70e14549f64a89561/src/hotspot/share/adlc/archDesc.cpp#L1082-L1083 > > I prefer it rather than `(void)!`. 
But introducing a local (which was already suggested earlier) that is unused in a product build may also trigger a warning from the compiler and we are back where we started. David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3042 > From iklam at openjdk.java.net Fri Mar 19 05:42:42 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 19 Mar 2021 05:42:42 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v4] In-Reply-To: <9EU_DwWh3XcyBxJkxgPH1qzvbaa2hvWQYuccdRXWKj0=.c6816df0-6e73-45bc-9e52-caa70b0611fd@github.com> References: <9EU_DwWh3XcyBxJkxgPH1qzvbaa2hvWQYuccdRXWKj0=.c6816df0-6e73-45bc-9e52-caa70b0611fd@github.com> Message-ID: On Thu, 11 Mar 2021 04:03:29 GMT, Yumin Qi wrote: >> Hi, Please review >> >> Added jcmd option for dumping CDS archive during application runtime. Before this change, user has to dump shared archive in two steps: first run application with >> `java -XX:DumpLoadedClassList= .... ` >> to collect shareable class names and saved in file `` , then >> `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` >> With this change, user can use jcmd to dump CDS without going through above steps. Also user can choose a moment during the app runtime to dump an archive. >> The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. >> New added jcmd option: >> `jcmd VM.cds static_dump ` >> or >> `jcmd VM.cds dynamic_dump ` >> To dump dynamic archive, requires start app with newly added flag `-XX:+RecordDynamicDumpInfo`, with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changed some object memory locations so for dumping dynamic archive, can only done once for a running app. For static dump, user can dump multiple times against same process. 
>> The file name is optional, if the file name is not supplied, the file name will take format of `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. The `` is the application process ID. >> >> Tests: tier1,tier2,tier3,tier4 >> >> Thanks >> Yumin > > Yumin Qi has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Fix filter more flags to exclude in static dump, add more test cases > - Merge branch 'master' into jdk-8259070 > - Fix white space in CDS.java > - Add function CDS.dumpSharedArchive in java to dump shared archive > - 8259070: Add jcmd option to dump CDS Changes requested by iklam (Reviewer). src/hotspot/share/memory/metaspaceShared.cpp line 50: > 48: #include "memory/cppVtables.hpp" > 49: #include "memory/dumpAllocStats.hpp" > 50: #include "memory/dynamicArchive.hpp" This is not needed anymore. src/hotspot/share/prims/jvm.cpp line 3766: > 3764: JVM_ENTRY(void, JVM_DumpDynamicArchive(JNIEnv *env, jstring archiveName)) > 3765: #if INCLUDE_CDS > 3766: assert(UseSharedSpaces && RecordDynamicDumpInfo, "Sanity check"); I think the message should say why this is true. How about `assert(UseSharedSpaces && RecordDynamicDumpInfo, "already checked in arguments.cpp");`? (same for the next assert statement at line 3774) src/hotspot/share/services/diagnosticCommand.cpp line 1139: > 1137: Handle throwable(THREAD, PENDING_EXCEPTION); > 1138: CLEAR_PENDING_EXCEPTION; > 1139: java_lang_Throwable::print_stack_trace(throwable, output()); Actually, it's not necessary to print the stack trace here. This function is called by `jcmd` in attachListener.cpp which will print the stack trace for you. 
static jint jcmd(AttachOperation* op, outputStream* out) {
  Thread* THREAD = Thread::current();
  // All the supplied jcmd arguments are stored as a single
  // string (op->arg(0)). This is parsed by the Dcmd framework.
  DCmd::parse_and_execute(DCmd_Source_AttachAPI, out, op->arg(0), ' ', THREAD);
  if (HAS_PENDING_EXCEPTION) {
    java_lang_Throwable::print(PENDING_EXCEPTION, out);
    out->cr();
    CLEAR_PENDING_EXCEPTION;
    return JNI_ERR;
  }
  return JNI_OK;
}
------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From iklam at openjdk.java.net Fri Mar 19 05:42:43 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 19 Mar 2021 05:42:43 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v3] In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 04:22:07 GMT, Ioi Lam wrote: >> Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix white space in CDS.java > > src/java.base/share/classes/jdk/internal/misc/CDS.java line 278: > >> 276: dumpDynamicArchive(archiveFile); >> 277: } >> 278: } > > I think we should have some error checks and clean up: > > - Remove the classlist file > - Check if the process exit status is 0 > - Remove the JSA file first, then try to dump it, and check if the file exists afterwards. If not, report the error. (For both dynamic and static dumps) The classlist file is not deleted after the dump has finished.
------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From david.holmes at oracle.com Fri Mar 19 05:49:25 2021 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 Mar 2021 15:49:25 +1000 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: <754b2753-06ee-60ed-464b-fd3dcb552324@oracle.com> References: <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> <754b2753-06ee-60ed-464b-fd3dcb552324@oracle.com> Message-ID: <43ab96d3-1091-5b8b-5d2b-9e2006c17cd6@oracle.com> Okay so this may all be moot :) On 19/03/2021 3:09 pm, David Holmes wrote: > On 19/03/2021 10:17 am, Yasumasa Suenaga wrote: >> On Fri, 19 Mar 2021 00:02:21 GMT, Yasumasa Suenaga >> wrote: >> >>>>> The alloca was added as a performance boost for hyperthreaded systems >>>>> back in 2003 for JDK 5: >>>>> >>>>> "A per-thread offset was added to each thread's stack to randomize the >>>>> cachelines of hot stack frames (aka, stack coloring)." >>>> >>>> IIUC it relates to share DTLB between two logical processors when >>>> Hyperthreading is enabled. >>>> >>>>> I'm running some of our benchmarks to see if removing it makes a >>>>> difference ... but chances are I'm not going to be running it on the >>>>> kind of machines for which it was introduced. >>>> >>>> Thanks! >>>> >>>>>> If we decide to remain this code, we need to avoid unused-result >>>>>> warning from GCC. I think we can use pragma because other pragmas >>>>>> (format-nonliteral, format-security, stringop-truncation") are >>>>>> used in HotSpot. In addition, this behavior does not seem to >>>>>> determine to be a bug in GCC (status is UNCONFIRMED). However I >>>>>> will agree to use `(void)!` if it is still preferred based on them. 
>>>>> >>>>> I'd reluctantly prefer to use the pragma as an official mechanism for >>>>> avoiding the warning. >>>> >>>> Ok, OpenJDK folks who discuss in this PR are not prefer to use >>>> pragma, so I will use `(void)!` if it is needed. >>>> Let's see the result of benchmark and discuss what should we do. >>> >>>>> The alloca was added as a performance boost for hyperthreaded systems >>>>> back in 2003 for JDK 5: >>>>> "A per-thread offset was added to each thread's stack to randomize the >>>>> cachelines of hot stack frames (aka, stack coloring)." >>>> >>>> IIUC it relates to share DTLB between two logical processors when >>>> Hyperthreading is enabled. >>> >>> `alloca()` call might be less effective if ASLR is enabled. It can be >>> configured by >>> [randomize_va_space](https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html#randomize-va-space), >>> and I guess it is enabled in a lot of x86 systems. > > It relates to L1 cache index collisions primarily so only impacts > hyperthreading on Intel CPUs (it also impacted CMT on SPARC). I've been > told that ASLR may render the manual stack-coloring unnecessary (as well > as ineffective), but then we'd need to establish which platforms have > that enabled, and whether we can tell. > > The benchmarking has been inconclusive - no observable differences. But > the benchmarking machines (and indeed the benchmarks) may not be > representative of the context where this optimization is needed. > > It has also been raised (as it was here) whether the alloca is even left > in place by the compiler. Something I have yet to check. 
call _ZN6Thread26record_stack_base_and_sizeEv@PLT
.LVL2492:
.loc 3 666 3 is_stmt 1 view .LVU7882
.loc 3 667 3 view .LVU7883
.LBB6399:
.LBI6399:
.loc 3 1359 5 view .LVU7884
.LBB6400:
.loc 3 1360 3 view .LVU7885
.loc 3 1360 18 is_stmt 0 view .LVU7886
call getpid@PLT
.LVL2493:
.LBE6400:
.LBE6399:
.loc 3 668 3 is_stmt 1 view .LVU7887
.loc 3 670 36 is_stmt 0 view .LVU7888
movq %r13, %rdi
.loc 3 668 3 view .LVU7889
addl $1, _ZZL19thread_native_entryP6ThreadE7counter(%rip)
.loc 3 670 3 is_stmt 1 view .LVU7890
.loc 3 670 36 is_stmt 0 view .LVU7891
call _ZN6Thread25initialize_thread_currentEv@PLT
Appears the alloca has been elided, along with the arithmetic. Don't know if that is true for other platforms. David ----- >>> BTW this is not the first time we have had this issue with gcc making a >>> function deprecated and casting to void not fixing it. Unfortunately I >>> can't remember enough of the previous case's details to actually look it >>> up and see what we resolved to do. :( >> >> I found >> [JDK-6879689](https://bugs.openjdk.java.net/browse/JDK-6879689), and >> it fixed unused-return as following: >> >> https://github.com/openjdk/jdk/blob/9b5a9b61899cf649104c0ff70e14549f64a89561/src/hotspot/share/adlc/archDesc.cpp#L1082-L1083 >> >> >> I prefer it rather than `(void)!`. > > But introducing a local (which was already suggested earlier) that is > unused in a product build may also trigger a warning from the compiler > and we are back where we started.
> > David > >> ------------- >> >> PR: https://git.openjdk.java.net/jdk/pull/3042 >> From thomas.stuefe at gmail.com Fri Mar 19 05:56:20 2021 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Fri, 19 Mar 2021 06:56:20 +0100 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: <43ab96d3-1091-5b8b-5d2b-9e2006c17cd6@oracle.com> References: <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <754b2753-06ee-60ed-464b-fd3dcb552324@oracle.com> <43ab96d3-1091-5b8b-5d2b-9e2006c17cd6@oracle.com> Message-ID: On Fri, Mar 19, 2021 at 6:49 AM David Holmes wrote: > Okay so this may all be moot :) > > On 19/03/2021 3:09 pm, David Holmes wrote: > > On 19/03/2021 10:17 am, Yasumasa Suenaga wrote: > >> On Fri, 19 Mar 2021 00:02:21 GMT, Yasumasa Suenaga > >> wrote: > >> > >>>>> The alloca was added as a performance boost for hyperthreaded systems > >>>>> back in 2003 for JDK 5: > >>>>> > >>>>> "A per-thread offset was added to each thread's stack to randomize > the > >>>>> cachelines of hot stack frames (aka, stack coloring)." > >>>> > >>>> IIUC it relates to share DTLB between two logical processors when > >>>> Hyperthreading is enabled. > >>>> > >>>>> I'm running some of our benchmarks to see if removing it makes a > >>>>> difference ... but chances are I'm not going to be running it on the > >>>>> kind of machines for which it was introduced. > >>>> > >>>> Thanks! > >>>> > >>>>>> If we decide to remain this code, we need to avoid unused-result > >>>>>> warning from GCC. I think we can use pragma because other pragmas > >>>>>> (format-nonliteral, format-security, stringop-truncation") are > >>>>>> used in HotSpot. In addition, this behavior does not seem to > >>>>>> determine to be a bug in GCC (status is UNCONFIRMED). 
However I > >>>>>> will agree to use `(void)!` if it is still preferred based on them. > >>>>> > >>>>> I'd reluctantly prefer to use the pragma as an official mechanism for > >>>>> avoiding the warning. > >>>> > >>>> Ok, OpenJDK folks who discuss in this PR are not prefer to use > >>>> pragma, so I will use `(void)!` if it is needed. > >>>> Let's see the result of benchmark and discuss what should we do. > >>> > >>>>> The alloca was added as a performance boost for hyperthreaded systems > >>>>> back in 2003 for JDK 5: > >>>>> "A per-thread offset was added to each thread's stack to randomize > the > >>>>> cachelines of hot stack frames (aka, stack coloring)." > >>>> > >>>> IIUC it relates to share DTLB between two logical processors when > >>>> Hyperthreading is enabled. > >>> > >>> `alloca()` call might be less effective if ASLR is enabled. It can be > >>> configured by > >>> [randomize_va_space]( > https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html#randomize-va-space), > > >>> and I guess it is enabled in a lot of x86 systems. > > > > It relates to L1 cache index collisions primarily so only impacts > > hyperthreading on Intel CPUs (it also impacted CMT on SPARC). I've been > > told that ASLR may render the manual stack-coloring unnecessary (as well > > as ineffective), but then we'd need to establish which platforms have > > that enabled, and whether we can tell. > > > > The benchmarking has been inconclusive - no observable differences. But > > the benchmarking machines (and indeed the benchmarks) may not be > > representative of the context where this optimization is needed. > > > > It has also been raised (as it was here) whether the alloca is even left > > in place by the compiler. Something I have yet to check. 
> > call _ZN6Thread26record_stack_base_and_sizeEv at PLT > .LVL2492: > .loc 3 666 3 is_stmt 1 view .LVU7882 > .loc 3 667 3 view .LVU7883 > .LBB6399: > .LBI6399: > .loc 3 1359 5 view .LVU7884 > .LBB6400: > .loc 3 1360 3 view .LVU7885 > .loc 3 1360 18 is_stmt 0 view .LVU7886 > call getpid at PLT > .LVL2493: > .LBE6400: > .LBE6399: > .loc 3 668 3 is_stmt 1 view .LVU7887 > .loc 3 670 36 is_stmt 0 view .LVU7888 > movq %r13, %rdi > .loc 3 668 3 view .LVU7889 > addl $1, _ZZL19thread_native_entryP6ThreadE7counter(%rip) > .loc 3 670 3 is_stmt 1 view .LVU7890 > .loc 3 670 36 is_stmt 0 view .LVU7891 > call _ZN6Thread25initialize_thread_currentEv at PLT > > Appears the alloca has been elided, along with the arithmetic. > > Don't know if that is true for other platforms. > > David > ----- > > Which would explain why we don't see effects in benchmarks. Question is, do we repair this or remove it. ..Thomas > >>> BTW this is not the first time we have had this issue with gcc making a > >>> function deprecated and casting to void not fixing it. Unfortunately I > >>> can't remember enough of the previous case's details to actually look > it > >>> up and see what we resolved to do. :( > >> > >> I found > >> [JDK-6879689](https://bugs.openjdk.java.net/browse/JDK-6879689), and > >> it fixed unused-return as following: > >> > >> > https://github.com/openjdk/jdk/blob/9b5a9b61899cf649104c0ff70e14549f64a89561/src/hotspot/share/adlc/archDesc.cpp#L1082-L1083 > >> > >> > >> I prefer it rather than `(void)!`. > > > > But introducing a local (which was already suggested earlier) that is > > unused in a product build may also trigger a warning from the compiler > > and we are back where we started. 
> > > > David > > > >> ------------- > >> > >> PR: https://git.openjdk.java.net/jdk/pull/3042 > >> > From david.holmes at oracle.com Fri Mar 19 06:08:09 2021 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 Mar 2021 16:08:09 +1000 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <754b2753-06ee-60ed-464b-fd3dcb552324@oracle.com> <43ab96d3-1091-5b8b-5d2b-9e2006c17cd6@oracle.com> Message-ID: <67bb986a-b73d-b388-7f2a-d12b8ed29f30@oracle.com> On 19/03/2021 3:56 pm, Thomas St?fe wrote: > > > On Fri, Mar 19, 2021 at 6:49 AM David Holmes > wrote: > > Okay so this may all be moot :) > > On 19/03/2021 3:09 pm, David Holmes wrote: > > On 19/03/2021 10:17 am, Yasumasa Suenaga wrote: > >> On Fri, 19 Mar 2021 00:02:21 GMT, Yasumasa Suenaga > >> > wrote: > >> > >>>>> The alloca was added as a performance boost for hyperthreaded > systems > >>>>> back in 2003 for JDK 5: > >>>>> > >>>>> "A per-thread offset was added to each thread's stack to > randomize the > >>>>> cachelines of hot stack frames (aka, stack coloring)." > >>>> > >>>> IIUC it relates to share DTLB between two logical processors when > >>>> Hyperthreading is enabled. > >>>> > >>>>> I'm running some of our benchmarks to see if removing it makes a > >>>>> difference ... but chances are I'm not going to be running it > on the > >>>>> kind of machines for which it was introduced. > >>>> > >>>> Thanks! > >>>> > >>>>>> If we decide to remain this code, we need to avoid > unused-result > >>>>>> warning from GCC. I think we can use pragma because other > pragmas > >>>>>> (format-nonliteral, format-security, stringop-truncation") are > >>>>>> used in HotSpot. In addition, this behavior does not seem to > >>>>>> determine to be a bug in GCC (status is UNCONFIRMED). 
However I > >>>>>> will agree to use `(void)!` if it is still preferred based > on them. > >>>>> > >>>>> I'd reluctantly prefer to use the pragma as an official > mechanism for > >>>>> avoiding the warning. > >>>> > >>>> Ok, OpenJDK folks who discuss in this PR are not prefer to use > >>>> pragma, so I will use `(void)!` if it is needed. > >>>> Let's see the result of benchmark and discuss what should we do. > >>> > >>>>> The alloca was added as a performance boost for hyperthreaded > systems > >>>>> back in 2003 for JDK 5: > >>>>> "A per-thread offset was added to each thread's stack to > randomize the > >>>>> cachelines of hot stack frames (aka, stack coloring)." > >>>> > >>>> IIUC it relates to share DTLB between two logical processors when > >>>> Hyperthreading is enabled. > >>> > >>> `alloca()` call might be less effective if ASLR is enabled. It > can be > >>> configured by > >>> > [randomize_va_space](https://www.kernel.org/doc/html/latest/admin-guide/sysctl/kernel.html#randomize-va-space > ), > > >>> and I guess it is enabled in a lot of x86 systems. > > > > It relates to L1 cache index collisions primarily so only impacts > > hyperthreading on Intel CPUs (it also impacted CMT on SPARC). > I've been > > told that ASLR may render the manual stack-coloring unnecessary > (as well > > as ineffective), but then we'd need to establish which platforms > have > > that enabled, and whether we can tell. > > > > The benchmarking has been inconclusive - no observable > differences. But > > the benchmarking machines (and indeed the benchmarks) may not be > > representative of the context where this optimization is needed. > > > > It has also been raised (as it was here) whether the alloca is > even left > > in place by the compiler. Something I have yet to check. > > ? ? ? ? call? ? _ZN6Thread26record_stack_base_and_sizeEv at PLT > .LVL2492: > ? ? ? ? .loc 3 666 3 is_stmt 1 view .LVU7882 > ? ? ? ? .loc 3 667 3 view .LVU7883 > .LBB6399: > .LBI6399: > ? ? ? ? 
.loc 3 1359 5 view .LVU7884 > .LBB6400: > ? ? ? ? .loc 3 1360 3 view .LVU7885 > ? ? ? ? .loc 3 1360 18 is_stmt 0 view .LVU7886 > ? ? ? ? call? ? getpid at PLT > .LVL2493: > .LBE6400: > .LBE6399: > ? ? ? ? .loc 3 668 3 is_stmt 1 view .LVU7887 > ? ? ? ? .loc 3 670 36 is_stmt 0 view .LVU7888 > ? ? ? ? movq? ? %r13, %rdi > ? ? ? ? .loc 3 668 3 view .LVU7889 > ? ? ? ? addl? ? $1, _ZZL19thread_native_entryP6ThreadE7counter(%rip) > ? ? ? ? .loc 3 670 3 is_stmt 1 view .LVU7890 > ? ? ? ? .loc 3 670 36 is_stmt 0 view .LVU7891 > ? ? ? ? call? ? _ZN6Thread25initialize_thread_currentEv at PLT > > Appears the alloca has been elided, along with the arithmetic. > > Don't know if that is true for other platforms. > > David > ----- > > > Which would explain why we don't see effects in benchmarks. Question is, > do we repair this or remove it. I'm trying a repair and re-running the benchmarks. The repair is simply: static void* _stack_pad = alloca(...); David ----- > ..Thomas > > >>> BTW this is not the first time we have had this issue with gcc > making a > >>> function deprecated and casting to void not fixing it. > Unfortunately I > >>> can't remember enough of the previous case's details to > actually look it > >>> up and see what we resolved to do. :( > >> > >> I found > >> [JDK-6879689](https://bugs.openjdk.java.net/browse/JDK-6879689 > ), and > >> it fixed unused-return as following: > >> > >> > https://github.com/openjdk/jdk/blob/9b5a9b61899cf649104c0ff70e14549f64a89561/src/hotspot/share/adlc/archDesc.cpp#L1082-L1083 > > > >> > >> > >> I prefer it rather than `(void)!`. > > > > But introducing a local (which was already suggested earlier) > that is > > unused in a product build may also trigger a warning from the > compiler > > and we are back where we started. 
> > > > David > > > >> ------------- > >> > >> PR: https://git.openjdk.java.net/jdk/pull/3042 > > >> > From ysuenaga at openjdk.java.net Fri Mar 19 06:11:38 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 19 Mar 2021 06:11:38 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Fri, 19 Mar 2021 04:44:55 GMT, Thomas Stuefe wrote: >>> BTW this is not the first time we have had this issue with gcc making a >>> function deprecated and casting to void not fixing it. Unfortunately I >>> can't remember enough of the previous case's details to actually look it >>> up and see what we resolved to do. :( >> >> I found [JDK-6879689](https://bugs.openjdk.java.net/browse/JDK-6879689), and it fixed unused-return as following: >> >> https://github.com/openjdk/jdk/blob/9b5a9b61899cf649104c0ff70e14549f64a89561/src/hotspot/share/adlc/archDesc.cpp#L1082-L1083 >> >> I prefer it rather than `(void)!`. > >> > BTW this is not the first time we have had this issue with gcc making a >> > function deprecated and casting to void not fixing it. Unfortunately I >> > can't remember enough of the previous case's details to actually look it >> > up and see what we resolved to do. :( >> >> I found [JDK-6879689](https://bugs.openjdk.java.net/browse/JDK-6879689), and it fixed unused-return as following: >> >> https://github.com/openjdk/jdk/blob/9b5a9b61899cf649104c0ff70e14549f64a89561/src/hotspot/share/adlc/archDesc.cpp#L1082-L1083 >> >> I prefer it rather than `(void)!`. > > Does that work in release builds too? 
I checked the `alloca()` call on Fedora 33 x86_64 and Alpine 3.13 x86_64. Both of them seem to elide `alloca()`.
(This is a fastdebug build - slowdebug might be different, but a product build might be the same.)

659       thread->record_stack_base_and_size();
   0x00007ffff7154d44 <+20>:    call   0x7ffff75b3a20 <_ZN6Thread26record_stack_base_and_sizeEv>

660
661       // Try to randomize the cache line index of hot stack frames.
662       // This helps when threads of the same stack traces evict each other's
663       // cache lines. The threads can be either from the same JVM instance, or
664       // from different JVM instances. The benefit is especially true for
665       // processors with hyperthreading technology.
666       static int counter = 0;

667       int pid = os::current_process_id();

668       alloca(((pid ^ counter++) & 7) * 128);
   0x00007ffff7154d51 <+33>:    addl   $0x1,0xc1db10(%rip)        # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter>

669
670       thread->initialize_thread_current();
=> 0x00007ffff7154d4e <+30>:    mov    %r13,%rdi
   0x00007ffff7154d58 <+40>:    call   0x7ffff75b3800 <_ZN6Thread25initialize_thread_currentEv>

> I prefer it rather than (void)!.
>
> Does that work in release builds too?

It will not work, as David said :) We need to use `(void)!` if we decide to leave it in.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3042

From iklam at openjdk.java.net  Fri Mar 19 06:28:55 2021
From: iklam at openjdk.java.net (Ioi Lam)
Date: Fri, 19 Mar 2021 06:28:55 GMT
Subject: RFR: 8263834: Work around gdb for HashtableEntry
Message-ID:

Please review this one-liner to work around a gdb bug in printing subtypes of `HashtableEntry`:

(gdb) p this
$1 = (const PlaceholderEntry * const) 0x7ffff0242a30
(gdb) p *this
$2 =
(gdb) ptype this
type = const class PlaceholderEntry {

} * const

The fix is to no longer subclass `HashtableEntry` from `CHeapObj`. Apparently that makes gdb happy. None of our code requires this subclass relationship.

Tested with tiers 1-4.

-------------

Commit messages:
 - 8263834: Work around gdb for HashtableEntry

Changes: https://git.openjdk.java.net/jdk/pull/3084/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3084&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8263834
  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3084.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3084/head:pull/3084

PR: https://git.openjdk.java.net/jdk/pull/3084

From david.holmes at oracle.com  Fri Mar 19 06:47:06 2021
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 19 Mar 2021 16:47:06 +1000
Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp
In-Reply-To: <67bb986a-b73d-b388-7f2a-d12b8ed29f30@oracle.com>
References: <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com>
 <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com>
 <754b2753-06ee-60ed-464b-fd3dcb552324@oracle.com>
 <43ab96d3-1091-5b8b-5d2b-9e2006c17cd6@oracle.com>
 <67bb986a-b73d-b388-7f2a-d12b8ed29f30@oracle.com>
Message-ID:

On 19/03/2021 4:08 pm, David Holmes wrote:
> On 19/03/2021 3:56 pm, Thomas Stüfe wrote:
>>      > It has also been raised (as it was here) whether the alloca is
>>      > even left in place by the compiler. Something I have yet to check.
>>
>>             call    _ZN6Thread26record_stack_base_and_sizeEv@PLT
>>     .LVL2492:
>>             .loc 3 666 3 is_stmt 1 view .LVU7882
>>             .loc 3 667 3 view .LVU7883
>>     .LBB6399:
>>     .LBI6399:
>>             .loc 3 1359 5 view .LVU7884
>>     .LBB6400:
>>             .loc 3 1360 3 view .LVU7885
>>             .loc 3 1360 18 is_stmt 0 view .LVU7886
>>             call    getpid@PLT
>>     .LVL2493:
>>     .LBE6400:
>>     .LBE6399:
>>             .loc 3 668 3 is_stmt 1 view .LVU7887
>>             .loc 3 670 36 is_stmt 0 view .LVU7888
>>             movq    %r13, %rdi
>>             .loc 3 668 3 view .LVU7889
>>             addl    $1, _ZZL19thread_native_entryP6ThreadE7counter(%rip)
>>             .loc 3 670 3 is_stmt 1 view .LVU7890
>>             .loc 3 670 36 is_stmt 0 view .LVU7891
>>             call    _ZN6Thread25initialize_thread_currentEv@PLT
>>
>>     Appears the alloca has been elided, along with the arithmetic.
>>
>>     Don't know if that is true for other platforms.
>>
>>     David
>>     -----
>>
>>
>> Which would explain why we don't see effects in benchmarks. Question
>> is, do we repair this or remove it.
>
> I'm trying a repair and re-running the benchmarks.
>
> The repair is simply:
>
> static void* _stack_pad = alloca(...);

That doesn't seem to work either:

        call    getpid@PLT
.LVL2493:
.LBE6400:
.LBE6399:
        .loc 3 668 3 is_stmt 1 view .LVU7887
        .loc 3 668 29 is_stmt 0 view .LVU7888
        movzbl  _ZGVZL19thread_native_entryP6ThreadE10_stack_pad(%rip), %eax
        testb   %al, %al
        je      .L2250
.L2214:
        .loc 3 670 3 is_stmt 1 view .LVU7889
        .loc 3 670 36 is_stmt 0 view .LVU7890
        movq    %r13, %rdi
        call    _ZN6Thread25initialize_thread_currentEv@PLT

and:

.L2250:
        .cfi_restore_state
        .loc 3 668 29 discriminator 1 view .LVU8015
        leaq    _ZGVZL19thread_native_entryP6ThreadE10_stack_pad(%rip), %rdi
        call    __cxa_guard_acquire@PLT
.LVL2520:
        testl   %eax, %eax
        je      .L2214
        .loc 3 668 29 discriminator 2 view .LVU8016
        leaq    _ZGVZL19thread_native_entryP6ThreadE10_stack_pad(%rip), %rdi
        addl    $1, _ZZL19thread_native_entryP6ThreadE7counter(%rip)
        call    __cxa_guard_release@PLT
.LVL2521:
        jmp     .L2214

I'm not really sure what that is doing but now even the counter++ has
been elided, along with the calculation and the alloca. ??

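One possible reading of the `.L2250` block, offered as an assumption rather than a confirmed diagnosis: `_stack_pad` is a function-local `static`, so its initializer (including `counter++`) sits between `__cxa_guard_acquire` and `__cxa_guard_release` and runs only on the first call. That would explain why the `addl` on `counter` moved into the guarded path. A minimal C++ sketch of that once-only behaviour, using hypothetical names (`next_value`, `first_value`) that are not from HotSpot:

```cpp
#include <cassert>

static int counter = 0;

// Hypothetical helper standing in for the `counter++` expression.
static int next_value() { return counter++; }

// The initializer of a function-local static runs once, on the first call.
// Later calls skip it, so `counter` is not incremented again.
static int first_value() {
  static int value = next_value();
  return value;
}
```

Calling `first_value()` twice returns the same value both times and leaves `counter` at 1, so a pad computed this way would not vary from thread to thread.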
Then I tried:

   void* _stack_pad = alloca(((pid ^ counter++) & 7) * 128);
   if (_stack_pad == 0) counter--;

and still no alloca:

        call    _ZN6Thread26record_stack_base_and_sizeEv@PLT
.LVL2492:
        .loc 3 666 3 is_stmt 1 view .LVU7882
        .loc 3 667 3 view .LVU7883
.LBB6399:
.LBI6399:
        .loc 3 1360 5 view .LVU7884
.LBB6400:
        .loc 3 1361 3 view .LVU7885
        .loc 3 1361 18 is_stmt 0 view .LVU7886
        call    getpid@PLT
.LVL2493:
.LBE6400:
.LBE6399:
        .loc 3 668 3 is_stmt 1 view .LVU7887
        .loc 3 671 36 is_stmt 0 view .LVU7888
        movq    %r13, %rdi
        .loc 3 668 22 view .LVU7889
        addl    $1, _ZZL19thread_native_entryP6ThreadE7counter(%rip)
        .loc 3 669 3 is_stmt 1 view .LVU7890
        .loc 3 671 3 view .LVU7891
        .loc 3 671 36 is_stmt 0 view .LVU7892
        call    _ZN6Thread25initialize_thread_currentEv@PLT

Have I missed something obvious here? (This is g++ -S output).

David
-----

>>      >> PR: https://git.openjdk.java.net/jdk/pull/3042
>>
>>

From thomas.stuefe at gmail.com  Fri Mar 19 06:52:32 2021
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Fri, 19 Mar 2021 07:52:32 +0100
Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp
In-Reply-To:
References: <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com>
 <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com>
 <754b2753-06ee-60ed-464b-fd3dcb552324@oracle.com>
 <43ab96d3-1091-5b8b-5d2b-9e2006c17cd6@oracle.com>
 <67bb986a-b73d-b388-7f2a-d12b8ed29f30@oracle.com>
Message-ID:

It seems reasonable, since counter and alloca have no visible effect
apart from affecting each other.

- just assign the return value to a volatile pointer?
- and/or just write something into the alloca'd memory?
(see also _expand_stack_to) ..Thomas On Fri, Mar 19, 2021 at 7:47 AM David Holmes wrote: > On 19/03/2021 4:08 pm, David Holmes wrote: > > On 19/03/2021 3:56 pm, Thomas St?fe wrote: > >> > It has also been raised (as it was here) whether the alloca is > >> even left > >> > in place by the compiler. Something I have yet to check. > >> > >> call _ZN6Thread26record_stack_base_and_sizeEv at PLT > >> .LVL2492: > >> .loc 3 666 3 is_stmt 1 view .LVU7882 > >> .loc 3 667 3 view .LVU7883 > >> .LBB6399: > >> .LBI6399: > >> .loc 3 1359 5 view .LVU7884 > >> .LBB6400: > >> .loc 3 1360 3 view .LVU7885 > >> .loc 3 1360 18 is_stmt 0 view .LVU7886 > >> call getpid at PLT > >> .LVL2493: > >> .LBE6400: > >> .LBE6399: > >> .loc 3 668 3 is_stmt 1 view .LVU7887 > >> .loc 3 670 36 is_stmt 0 view .LVU7888 > >> movq %r13, %rdi > >> .loc 3 668 3 view .LVU7889 > >> addl $1, > _ZZL19thread_native_entryP6ThreadE7counter(%rip) > >> .loc 3 670 3 is_stmt 1 view .LVU7890 > >> .loc 3 670 36 is_stmt 0 view .LVU7891 > >> call _ZN6Thread25initialize_thread_currentEv at PLT > >> > >> Appears the alloca has been elided, along with the arithmetic. > >> > >> Don't know if that is true for other platforms. > >> > >> David > >> ----- > >> > >> > >> Which would explain why we don't see effects in benchmarks. Question > >> is, do we repair this or remove it. > > > > I'm trying a repair and re-running the benchmarks. 
> > > > The repair is simply: > > > > static void* _stack_pad = alloca(...); > > That doesn't seem to work either: > > call getpid at PLT > .LVL2493: > .LBE6400: > .LBE6399: > .loc 3 668 3 is_stmt 1 view .LVU7887 > .loc 3 668 29 is_stmt 0 view .LVU7888 > movzbl _ZGVZL19thread_native_entryP6ThreadE10_stack_pad(%rip), > %eax > testb %al, %al > je .L2250 > .L2214: > .loc 3 670 3 is_stmt 1 view .LVU7889 > .loc 3 670 36 is_stmt 0 view .LVU7890 > movq %r13, %rdi > call _ZN6Thread25initialize_thread_currentEv at PLT > > and: > > .L2250: > .cfi_restore_state > .loc 3 668 29 discriminator 1 view .LVU8015 > leaq _ZGVZL19thread_native_entryP6ThreadE10_stack_pad(%rip), > %rdi > call __cxa_guard_acquire at PLT > .LVL2520: > testl %eax, %eax > je .L2214 > .loc 3 668 29 discriminator 2 view .LVU8016 > leaq _ZGVZL19thread_native_entryP6ThreadE10_stack_pad(%rip), > %rdi > addl $1, _ZZL19thread_native_entryP6ThreadE7counter(%rip) > call __cxa_guard_release at PLT > .LVL2521: > jmp .L2214 > > I'm not really sure what that is doing but now even the counter++ has > been elided, along with the calculation and the alloca. ?? > > Then I tried: > > void* _stack_pad = alloca(((pid ^ counter++) & 7) * 128); > if (_stack_pad == 0) counter--; > > and still no alloca: > > call _ZN6Thread26record_stack_base_and_sizeEv at PLT > .LVL2492: > .loc 3 666 3 is_stmt 1 view .LVU7882 > .loc 3 667 3 view .LVU7883 > .LBB6399: > .LBI6399: > .loc 3 1360 5 view .LVU7884 > .LBB6400: > .loc 3 1361 3 view .LVU7885 > .loc 3 1361 18 is_stmt 0 view .LVU7886 > call getpid at PLT > .LVL2493: > .LBE6400: > .LBE6399: > .loc 3 668 3 is_stmt 1 view .LVU7887 > .loc 3 671 36 is_stmt 0 view .LVU7888 > movq %r13, %rdi > .loc 3 668 22 view .LVU7889 > addl $1, _ZZL19thread_native_entryP6ThreadE7counter(%rip) > .loc 3 669 3 is_stmt 1 view .LVU7890 > .loc 3 671 3 view .LVU7891 > .loc 3 671 36 is_stmt 0 view .LVU7892 > call _ZN6Thread25initialize_thread_currentEv at PLT > > Have I missed something obvious here? 
(This is g++ -S output). > > David > ----- > > >> >> PR: https://git.openjdk.java.net/jdk/pull/3042 > >> > >> >> > >> > From dholmes at openjdk.java.net Fri Mar 19 06:54:38 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 19 Mar 2021 06:54:38 GMT Subject: RFR: 8263834: Work around gdb for HashtableEntry In-Reply-To: References: Message-ID: On Fri, 19 Mar 2021 03:20:21 GMT, Ioi Lam wrote: > Please review this one liner to work around a gdb bug in printing subtypes of `HashtableEntry`: > > (gdb) p this > $1 = (const PlaceholderEntry * const) 0x7ffff0242a30 > (gdb) p *this > $2 = > (gdb) ptype this > type = const class PlaceholderEntry { > > } * const > > The fix is to no longer subclass `HashtableEntry` from `CHeapObj`. Apparently that makes gdb happy. None of our code requires this subclass relationship. > > Tested with tiers 1-4. Does this affect NMT? Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/3084 From iklam at openjdk.java.net Fri Mar 19 07:02:39 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 19 Mar 2021 07:02:39 GMT Subject: RFR: 8263834: Work around gdb for HashtableEntry In-Reply-To: References: Message-ID: <7ugAkgMJWVsEHBK8Ea68ishT__Pdu20ZFfRQCE3DUjo=.8fe43185-13b0-43ac-87c6-ebf9e21635d9@github.com> On Fri, 19 Mar 2021 06:51:42 GMT, David Holmes wrote: > Does this affect NMT? It doesn't. 
The part of the code that interacts with NMT is unchanged:

    template <MEMFLAGS F> BasicHashtableEntry<F>* BasicHashtable<F>::new_entry(unsigned int hashValue) {
      BasicHashtableEntry<F>* entry = new_entry_free_list();

      if (entry == NULL) {
        if (_first_free_entry + _entry_size >= _end_block) {
          int block_size = MAX2((int)_table_size / 2, (int)_number_of_entries); // pick a reasonable value
          block_size = clamp(block_size, 2, 512); // but never go out of this range
          int len = round_down_power_of_2(_entry_size * block_size);
          assert(len >= _entry_size, "");
          _first_free_entry = NEW_C_HEAP_ARRAY2(char, len, F, CURRENT_PC);   <<<<<<< HERE
          _entry_blocks.append(_first_free_entry);
          _end_block = _first_free_entry + len;
        }
        entry = (BasicHashtableEntry<F>*)_first_free_entry;
        _first_free_entry += _entry_size;
      }

      assert(_entry_size % HeapWordSize == 0, "");
      entry->set_hash(hashValue);
      return entry;
    }

-------------

PR: https://git.openjdk.java.net/jdk/pull/3084

From stuefe at openjdk.java.net  Fri Mar 19 07:09:39 2021
From: stuefe at openjdk.java.net (Thomas Stuefe)
Date: Fri, 19 Mar 2021 07:09:39 GMT
Subject: RFR: 8263834: Work around gdb for HashtableEntry
In-Reply-To:
References:
Message-ID:

On Fri, 19 Mar 2021 03:20:21 GMT, Ioi Lam wrote:

> Please review this one liner to work around a gdb bug in printing subtypes of `HashtableEntry`:
>
> (gdb) p this
> $1 = (const PlaceholderEntry * const) 0x7ffff0242a30
> (gdb) p *this
> $2 =
> (gdb) ptype this
> type = const class PlaceholderEntry {
>
> } * const
>
> The fix is to no longer subclass `HashtableEntry` from `CHeapObj`. Apparently that makes gdb happy. None of our code requires this subclass relationship.
>
> Tested with tiers 1-4.

Makes sense.

-------------

Marked as reviewed by stuefe (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3084 From stuefe at openjdk.java.net Fri Mar 19 07:09:40 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 19 Mar 2021 07:09:40 GMT Subject: RFR: 8263834: Work around gdb for HashtableEntry In-Reply-To: References: Message-ID: On Fri, 19 Mar 2021 06:51:42 GMT, David Holmes wrote: > Does this affect NMT? > > Thanks, > David The entries lived in pre-allocated blocks and were placed manually without even invoking constructors. NMT does not care. ------------- PR: https://git.openjdk.java.net/jdk/pull/3084 From dholmes at openjdk.java.net Fri Mar 19 07:41:40 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 19 Mar 2021 07:41:40 GMT Subject: RFR: 8263834: Work around gdb for HashtableEntry In-Reply-To: References: Message-ID: On Fri, 19 Mar 2021 03:20:21 GMT, Ioi Lam wrote: > Please review this one liner to work around a gdb bug in printing subtypes of `HashtableEntry`: > > (gdb) p this > $1 = (const PlaceholderEntry * const) 0x7ffff0242a30 > (gdb) p *this > $2 = > (gdb) ptype this > type = const class PlaceholderEntry { > > } * const > > The fix is to no longer subclass `HashtableEntry` from `CHeapObj`. Apparently that makes gdb happy. None of our code requires this subclass relationship. > > Tested with tiers 1-4. That was my only query so LGTM! Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3084

From david.holmes at oracle.com  Fri Mar 19 07:46:24 2021
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 19 Mar 2021 17:46:24 +1000
Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp
In-Reply-To:
References: <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com>
 <754b2753-06ee-60ed-464b-fd3dcb552324@oracle.com>
 <43ab96d3-1091-5b8b-5d2b-9e2006c17cd6@oracle.com>
 <67bb986a-b73d-b388-7f2a-d12b8ed29f30@oracle.com>
Message-ID: <71a4f9f0-332c-b17c-da96-c1b6830c01d5@oracle.com>

On 19/03/2021 4:52 pm, Thomas Stüfe wrote:
> It seems reasonable, since counter and alloca have no visible effect
> apart from affecting each other.
> - just assign the return value to a volatile pointer?
> - and/or just write something into the alloca'd memory? (see
> also _expand_stack_to)

+  static void* volatile _stack_pad = alloca(((pid ^ counter++) & 7) * 128);
+  if (_stack_pad != 0) {
+    ((char*)_stack_pad)[0] = 1;
+  }

still didn't call alloca, or show the calculation of the value passed to
alloca.

        call    getpid@PLT
.LVL2493:
        movl    %eax, %ebx
.LVL2494:
        .loc 3 1363 18 view .LVU7888
.LBE6400:
.LBE6399:
        .loc 3 668 3 is_stmt 1 view .LVU7889
        .loc 3 668 38 is_stmt 0 view .LVU7890
        movzbl  _ZGVZL19thread_native_entryP6ThreadE10_stack_pad(%rip), %eax
.LVL2495:
        .loc 3 668 38 view .LVU7891
        testb   %al, %al
        je      .L2255
.L2214:
        .loc 3 669 3 is_stmt 1 view .LVU7892
        .loc 3 669 7 is_stmt 0 view .LVU7893
        movq    _ZZL19thread_native_entryP6ThreadE10_stack_pad(%rip), %rax
        .loc 3 669 3 view .LVU7894
        testq   %rax, %rax
        je      .L2216
        .loc 3 670 5 is_stmt 1 view .LVU7895
        .loc 3 670 6 is_stmt 0 view .LVU7896
        movq    _ZZL19thread_native_entryP6ThreadE10_stack_pad(%rip), %rax
        .loc 3 670 28 view .LVU7897
        movb    $1, (%rax)
.L2216:
        .loc 3 673 3 is_stmt 1 view .LVU7898
        .loc 3 673 36 is_stmt 0 view .LVU7899
        movq    %r13, %rdi
        call    _ZN6Thread25initialize_thread_currentEv@PLT

I don't know what to suggest.

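A hedged sketch of the "write something into the alloca'd memory" idea, assuming GCC or Clang on Linux. There `alloca` is normally expanded inline as stack-pointer arithmetic, so a `call alloca` instruction would not be expected in the `-S` output even when the allocation survives. Writing to the block through a `volatile` pointer gives the allocation an observable side effect. The names `stack_pad_size` and `run_with_stack_pad` are hypothetical and not part of HotSpot; the extra byte added to the size is an assumption made here so that the store never touches a zero-sized block.

```cpp
#include <alloca.h>
#include <cassert>
#include <cstddef>

// Pad size as computed in the thread: one of 0, 128, ..., 896 bytes.
static std::size_t stack_pad_size(int pid, int counter) {
  return static_cast<std::size_t>((pid ^ counter) & 7) * 128;
}

// Hypothetical stand-in for a thread entry point: shifts the stack by the
// pad, then runs `body` with its frames placed past the pad.
static int run_with_stack_pad(int pid, int counter, int (*body)()) {
  std::size_t pad = stack_pad_size(pid, counter);
  assert(pad <= 7u * 128u);
  // +1 so the block is never empty (an assumption of this sketch).
  volatile char* p = static_cast<volatile char*>(alloca(pad + 1));
  p[0] = 1;                    // volatile store: a side effect that must be kept
  int result = body();         // runs while the pad is still allocated
  return result + (p[0] - 1);  // volatile read after the call keeps the pad live
}
```

Whether this shifts hot frames enough to matter on current hardware is a separate question that only benchmarking on the affected machines could answer.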
David ----- > ..Thomas > > On Fri, Mar 19, 2021 at 7:47 AM David Holmes > wrote: > > On 19/03/2021 4:08 pm, David Holmes wrote: > > On 19/03/2021 3:56 pm, Thomas St?fe wrote: > >> ???? > It has also been raised (as it was here) whether the > alloca is > >> ??? even left > >> ???? > in place by the compiler. Something I have yet to check. > >> > >> ???? ? ? ? ? call? ? _ZN6Thread26record_stack_base_and_sizeEv at PLT > >> ??? .LVL2492: > >> ???? ? ? ? ? .loc 3 666 3 is_stmt 1 view .LVU7882 > >> ???? ? ? ? ? .loc 3 667 3 view .LVU7883 > >> ??? .LBB6399: > >> ??? .LBI6399: > >> ???? ? ? ? ? .loc 3 1359 5 view .LVU7884 > >> ??? .LBB6400: > >> ???? ? ? ? ? .loc 3 1360 3 view .LVU7885 > >> ???? ? ? ? ? .loc 3 1360 18 is_stmt 0 view .LVU7886 > >> ???? ? ? ? ? call? ? getpid at PLT > >> ??? .LVL2493: > >> ??? .LBE6400: > >> ??? .LBE6399: > >> ???? ? ? ? ? .loc 3 668 3 is_stmt 1 view .LVU7887 > >> ???? ? ? ? ? .loc 3 670 36 is_stmt 0 view .LVU7888 > >> ???? ? ? ? ? movq? ? %r13, %rdi > >> ???? ? ? ? ? .loc 3 668 3 view .LVU7889 > >> ???? ? ? ? ? addl? ? $1, > _ZZL19thread_native_entryP6ThreadE7counter(%rip) > >> ???? ? ? ? ? .loc 3 670 3 is_stmt 1 view .LVU7890 > >> ???? ? ? ? ? .loc 3 670 36 is_stmt 0 view .LVU7891 > >> ???? ? ? ? ? call? ? _ZN6Thread25initialize_thread_currentEv at PLT > >> > >> ??? Appears the alloca has been elided, along with the arithmetic. > >> > >> ??? Don't know if that is true for other platforms. > >> > >> ??? David > >> ??? ----- > >> > >> > >> Which would explain why we don't see effects in benchmarks. > Question > >> is, do we repair this or remove it. > > > > I'm trying a repair and re-running the benchmarks. > > > > The repair is simply: > > > > static void* _stack_pad = alloca(...); > > That doesn't seem to work either: > > ? ? ? ? ?call? ? getpid at PLT > .LVL2493: > .LBE6400: > .LBE6399: > ? ? ? ? ?.loc 3 668 3 is_stmt 1 view .LVU7887 > ? ? ? ? ?.loc 3 668 29 is_stmt 0 view .LVU7888 > ? ? ? ? 
?movzbl > _ZGVZL19thread_native_entryP6ThreadE10_stack_pad(%rip), > %eax > ? ? ? ? ?testb? ?%al, %al > ? ? ? ? ?je? ? ? .L2250 > .L2214: > ? ? ? ? ?.loc 3 670 3 is_stmt 1 view .LVU7889 > ? ? ? ? ?.loc 3 670 36 is_stmt 0 view .LVU7890 > ? ? ? ? ?movq? ? %r13, %rdi > ? ? ? ? ?call? ? _ZN6Thread25initialize_thread_currentEv at PLT > > and: > > .L2250: > ? ? ? ? ?.cfi_restore_state > ? ? ? ? ?.loc 3 668 29 discriminator 1 view .LVU8015 > ? ? ? ? ?leaq > _ZGVZL19thread_native_entryP6ThreadE10_stack_pad(%rip), > %rdi > ? ? ? ? ?call? ? __cxa_guard_acquire at PLT > .LVL2520: > ? ? ? ? ?testl? ?%eax, %eax > ? ? ? ? ?je? ? ? .L2214 > ? ? ? ? ?.loc 3 668 29 discriminator 2 view .LVU8016 > ? ? ? ? ?leaq > _ZGVZL19thread_native_entryP6ThreadE10_stack_pad(%rip), > %rdi > ? ? ? ? ?addl? ? $1, _ZZL19thread_native_entryP6ThreadE7counter(%rip) > ? ? ? ? ?call? ? __cxa_guard_release at PLT > .LVL2521: > ? ? ? ? ?jmp? ? ?.L2214 > > I'm not really sure what that is doing but now even the counter++ has > been elided, along with the calculation and the alloca. ?? > > Then I tried: > > void* _stack_pad = alloca(((pid ^ counter++) & 7) * 128); > if (_stack_pad == 0) counter--; > > and still no alloca: > > ? ? ? ? ?call? ? _ZN6Thread26record_stack_base_and_sizeEv at PLT > .LVL2492: > ? ? ? ? ?.loc 3 666 3 is_stmt 1 view .LVU7882 > ? ? ? ? ?.loc 3 667 3 view .LVU7883 > .LBB6399: > .LBI6399: > ? ? ? ? ?.loc 3 1360 5 view .LVU7884 > .LBB6400: > ? ? ? ? ?.loc 3 1361 3 view .LVU7885 > ? ? ? ? ?.loc 3 1361 18 is_stmt 0 view .LVU7886 > ? ? ? ? ?call? ? getpid at PLT > .LVL2493: > .LBE6400: > .LBE6399: > ? ? ? ? ?.loc 3 668 3 is_stmt 1 view .LVU7887 > ? ? ? ? ?.loc 3 671 36 is_stmt 0 view .LVU7888 > ? ? ? ? ?movq? ? %r13, %rdi > ? ? ? ? ?.loc 3 668 22 view .LVU7889 > ? ? ? ? ?addl? ? $1, _ZZL19thread_native_entryP6ThreadE7counter(%rip) > ? ? ? ? ?.loc 3 669 3 is_stmt 1 view .LVU7890 > ? ? ? ? ?.loc 3 671 3 view .LVU7891 > ? ? ? ? ?.loc 3 671 36 is_stmt 0 view .LVU7892 > ? ? ? ? ?call? ? 
_ZN6Thread25initialize_thread_currentEv@PLT > > Have I missed something obvious here? (This is g++ -S output). > > David > ----- > > >> >> PR: https://git.openjdk.java.net/jdk/pull/3042 > > >> > > >> >> > >> > From ysuenaga at openjdk.java.net Fri Mar 19 08:50:40 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 19 Mar 2021 08:50:40 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnPKEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Fri, 19 Mar 2021 06:09:16 GMT, Yasumasa Suenaga wrote: >>> > BTW this is not the first time we have had this issue with gcc making a >>> > function deprecated and casting to void not fixing it. Unfortunately I >>> > can't remember enough of the previous case's details to actually look it >>> > up and see what we resolved to do. :( >>> >>> I found [JDK-6879689](https://bugs.openjdk.java.net/browse/JDK-6879689), and it fixed unused-return as following: >>> >>> https://github.com/openjdk/jdk/blob/9b5a9b61899cf649104c0ff70e14549f64a89561/src/hotspot/share/adlc/archDesc.cpp#L1082-L1083 >>> >>> I prefer it rather than `(void)!`. >> >> Does that work in release builds too? > > I checked `alloca()` call on Fedora 33 x86_64 and Alpine 3.13 x86_64. Both of them seem to elide `alloca()`. > (It is fastdebug build - slow debug might be different, but production build might be same) > > 659 thread->record_stack_base_and_size(); > 0x00007ffff7154d44 <+20>: call 0x7ffff75b3a20 <_ZN6Thread26record_stack_base_and_sizeEv> > > 660 > 661 // Try to randomize the cache line index of hot stack frames.
> 662 // This helps when threads of the same stack traces evict each other's > 663 // cache lines. The threads can be either from the same JVM instance, or > 664 // from different JVM instances. The benefit is especially true for > 665 // processors with hyperthreading technology. > 666 static int counter = 0; > > 667 int pid = os::current_process_id(); > > 668 alloca(((pid ^ counter++) & 7) * 128); > 0x00007ffff7154d51 <+33>: addl $0x1,0xc1db10(%rip) # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter> > > 669 > 670 thread->initialize_thread_current(); > => 0x00007ffff7154d4e <+30>: mov %r13,%rdi > 0x00007ffff7154d58 <+40>: call 0x7ffff75b3800 <_ZN6Thread25initialize_thread_currentEv> > >> I prefer it rather than (void)!. >> >> Does that work in release builds too? > > It will not work as David said :) we need to use `(void)!` if we should left it. > + static void* volatile _stack_pad = alloca(((pid ^ counter++) & 7) * 128); > + if (_stack_pad != 0) { > + ((char*)_stack_pad)[0] = 1; > + } I guess `_stack_pad` will be overwritten in each `threaad_native_entry()` call, so it might be elided. I modified the code as following, it seems to work - we cannot see `alloca()`, however the stack is expanded. diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp index 5af63befb58..bdb2dc89615 100644 --- a/src/hotspot/os/linux/os_linux.cpp +++ b/src/hotspot/os/linux/os_linux.cpp @@ -665,7 +665,8 @@ static void *thread_native_entry(Thread *thread) { // processors with hyperthreading technology. static int counter = 0; int pid = os::current_process_id(); - alloca(((pid ^ counter++) & 7) * 128); + void *ptr = alloca(((pid ^ counter++) & 7) * 128); + ((char *)ptr)[0] = 1; thread->initialize_thread_current(); 659 thread->record_stack_base_and_size(); 0x00007ffff7154d53 <+35>: call 0x7ffff75b3a80 <_ZN6Thread26record_stack_base_and_sizeEv> 660 661 // Try to randomize the cache line index of hot stack frames. 
662 // This helps when threads of the same stack traces evict each other's 663 // cache lines. The threads can be either from the same JVM instance, or 664 // from different JVM instances. The benefit is especially true for 665 // processors with hyperthreading technology. 666 static int counter = 0; 667 int pid = os::current_process_id(); 668 void *ptr = alloca(((pid ^ counter++) & 7) * 128); 0x00007ffff7154d63 <+51>: mov 0xc1daff(%rip),%eax # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter> 0x00007ffff7154d69 <+57>: lea 0x1(%rax),%edx 0x00007ffff7154d6c <+60>: xor %r8d,%eax 0x00007ffff7154d6f <+63>: shl $0x7,%rax 0x00007ffff7154d73 <+67>: mov %edx,0xc1daef(%rip) # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter> 0x00007ffff7154d79 <+73>: and $0x380,%eax 0x00007ffff7154d7e <+78>: add $0x17,%rax 0x00007ffff7154d82 <+82>: and $0x7f0,%eax 0x00007ffff7154d87 <+87>: sub %rax,%rsp 0x00007ffff7154d8a <+90>: lea 0xf(%rsp),%rax 0x00007ffff7154d8f <+95>: and $0xfffffffffffffff0,%rax 669 ((char *)ptr)[0] = 1; 0x00007ffff7154d93 <+99>: movb $0x1,(%rax) ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From david.holmes at oracle.com Fri Mar 19 08:56:04 2021 From: david.holmes at oracle.com (David Holmes) Date: Fri, 19 Mar 2021 18:56:04 +1000 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On 19/03/2021 6:50 pm, Yasumasa Suenaga wrote: > On Fri, 19 Mar 2021 06:09:16 GMT, Yasumasa Suenaga wrote: > >>>>> BTW this is not the first time we have had this issue with gcc making a >>>>> function deprecated and casting to void not fixing it. Unfortunately I >>>>> can't remember enough of the previous case's details to actually look it >>>>> up and see what we resolved to do. 
:( >>>> >>>> I found [JDK-6879689](https://bugs.openjdk.java.net/browse/JDK-6879689), and it fixed unused-return as following: >>>> >>>> https://github.com/openjdk/jdk/blob/9b5a9b61899cf649104c0ff70e14549f64a89561/src/hotspot/share/adlc/archDesc.cpp#L1082-L1083 >>>> >>>> I prefer it rather than `(void)!`. >>> >>> Does that work in release builds too? >> >> I checked `alloca()` call on Fedora 33 x86_64 and Alpine 3.13 x86_64. Both of them seem to elide `alloca()`. >> (It is fastdebug build - slow debug might be different, but production build might be same) >> >> 659 thread->record_stack_base_and_size(); >> 0x00007ffff7154d44 <+20>: call 0x7ffff75b3a20 <_ZN6Thread26record_stack_base_and_sizeEv> >> >> 660 >> 661 // Try to randomize the cache line index of hot stack frames. >> 662 // This helps when threads of the same stack traces evict each other's >> 663 // cache lines. The threads can be either from the same JVM instance, or >> 664 // from different JVM instances. The benefit is especially true for >> 665 // processors with hyperthreading technology. >> 666 static int counter = 0; >> >> 667 int pid = os::current_process_id(); >> >> 668 alloca(((pid ^ counter++) & 7) * 128); >> 0x00007ffff7154d51 <+33>: addl $0x1,0xc1db10(%rip) # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter> >> >> 669 >> 670 thread->initialize_thread_current(); >> => 0x00007ffff7154d4e <+30>: mov %r13,%rdi >> 0x00007ffff7154d58 <+40>: call 0x7ffff75b3800 <_ZN6Thread25initialize_thread_currentEv> >> >>> I prefer it rather than (void)!. >>> >>> Does that work in release builds too? >> >> It will not work as David said :) we need to use `(void)!` if we should left it. > >> + static void* volatile _stack_pad = alloca(((pid ^ counter++) & 7) * 128); >> + if (_stack_pad != 0) { >> + ((char*)_stack_pad)[0] = 1; >> + } > > I guess `_stack_pad` will be overwritten in each `threaad_native_entry()` call, so it might be elided. But it is a volatile ptr so should not be elided! 
> I modified the code as following, it seems to work - we cannot see `alloca()`, however the stack is expanded. Sorry but I'm not seeing where the stack actually gets expanded? Thanks, David > diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp > index 5af63befb58..bdb2dc89615 100644 > --- a/src/hotspot/os/linux/os_linux.cpp > +++ b/src/hotspot/os/linux/os_linux.cpp > @@ -665,7 +665,8 @@ static void *thread_native_entry(Thread *thread) { > // processors with hyperthreading technology. > static int counter = 0; > int pid = os::current_process_id(); > - alloca(((pid ^ counter++) & 7) * 128); > + void *ptr = alloca(((pid ^ counter++) & 7) * 128); > + ((char *)ptr)[0] = 1; > > thread->initialize_thread_current(); > > 659 thread->record_stack_base_and_size(); > 0x00007ffff7154d53 <+35>: call 0x7ffff75b3a80 <_ZN6Thread26record_stack_base_and_sizeEv> > > 660 > 661 // Try to randomize the cache line index of hot stack frames. > 662 // This helps when threads of the same stack traces evict each other's > 663 // cache lines. The threads can be either from the same JVM instance, or > 664 // from different JVM instances. The benefit is especially true for > 665 // processors with hyperthreading technology. 
> 666 static int counter = 0; > > 667 int pid = os::current_process_id(); > > 668 void *ptr = alloca(((pid ^ counter++) & 7) * 128); > 0x00007ffff7154d63 <+51>: mov 0xc1daff(%rip),%eax # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter> > 0x00007ffff7154d69 <+57>: lea 0x1(%rax),%edx > 0x00007ffff7154d6c <+60>: xor %r8d,%eax > 0x00007ffff7154d6f <+63>: shl $0x7,%rax > 0x00007ffff7154d73 <+67>: mov %edx,0xc1daef(%rip) # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter> > 0x00007ffff7154d79 <+73>: and $0x380,%eax > 0x00007ffff7154d7e <+78>: add $0x17,%rax > 0x00007ffff7154d82 <+82>: and $0x7f0,%eax > 0x00007ffff7154d87 <+87>: sub %rax,%rsp > 0x00007ffff7154d8a <+90>: lea 0xf(%rsp),%rax > 0x00007ffff7154d8f <+95>: and $0xfffffffffffffff0,%rax > > 669 ((char *)ptr)[0] = 1; > 0x00007ffff7154d93 <+99>: movb $0x1,(%rax) > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3042 > From ysuenaga at openjdk.java.net Fri Mar 19 09:06:39 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 19 Mar 2021 09:06:39 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Fri, 19 Mar 2021 08:47:39 GMT, Yasumasa Suenaga wrote: >> I checked `alloca()` call on Fedora 33 x86_64 and Alpine 3.13 x86_64. Both of them seem to elide `alloca()`. 
>> (It is fastdebug build - slow debug might be different, but production build might be same) >> >> 659 thread->record_stack_base_and_size(); >> 0x00007ffff7154d44 <+20>: call 0x7ffff75b3a20 <_ZN6Thread26record_stack_base_and_sizeEv> >> >> 660 >> 661 // Try to randomize the cache line index of hot stack frames. >> 662 // This helps when threads of the same stack traces evict each other's >> 663 // cache lines. The threads can be either from the same JVM instance, or >> 664 // from different JVM instances. The benefit is especially true for >> 665 // processors with hyperthreading technology. >> 666 static int counter = 0; >> >> 667 int pid = os::current_process_id(); >> >> 668 alloca(((pid ^ counter++) & 7) * 128); >> 0x00007ffff7154d51 <+33>: addl $0x1,0xc1db10(%rip) # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter> >> >> 669 >> 670 thread->initialize_thread_current(); >> => 0x00007ffff7154d4e <+30>: mov %r13,%rdi >> 0x00007ffff7154d58 <+40>: call 0x7ffff75b3800 <_ZN6Thread25initialize_thread_currentEv> >> >>> I prefer it rather than (void)!. >>> >>> Does that work in release builds too? >> >> It will not work as David said :) we need to use `(void)!` if we should left it. > >> + static void* volatile _stack_pad = alloca(((pid ^ counter++) & 7) * 128); >> + if (_stack_pad != 0) { >> + ((char*)_stack_pad)[0] = 1; >> + } > > I guess `_stack_pad` will be overwritten in each `threaad_native_entry()` call, so it might be elided. > I modified the code as following, it seems to work - we cannot see `alloca()`, however the stack is expanded. > > diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp > index 5af63befb58..bdb2dc89615 100644 > --- a/src/hotspot/os/linux/os_linux.cpp > +++ b/src/hotspot/os/linux/os_linux.cpp > @@ -665,7 +665,8 @@ static void *thread_native_entry(Thread *thread) { > // processors with hyperthreading technology. 
> static int counter = 0; > int pid = os::current_process_id(); > - alloca(((pid ^ counter++) & 7) * 128); > + void *ptr = alloca(((pid ^ counter++) & 7) * 128); > + ((char *)ptr)[0] = 1; > > thread->initialize_thread_current(); > > 659 thread->record_stack_base_and_size(); > 0x00007ffff7154d53 <+35>: call 0x7ffff75b3a80 <_ZN6Thread26record_stack_base_and_sizeEv> > > 660 > 661 // Try to randomize the cache line index of hot stack frames. > 662 // This helps when threads of the same stack traces evict each other's > 663 // cache lines. The threads can be either from the same JVM instance, or > 664 // from different JVM instances. The benefit is especially true for > 665 // processors with hyperthreading technology. > 666 static int counter = 0; > > 667 int pid = os::current_process_id(); > > 668 void *ptr = alloca(((pid ^ counter++) & 7) * 128); > 0x00007ffff7154d63 <+51>: mov 0xc1daff(%rip),%eax # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter> > 0x00007ffff7154d69 <+57>: lea 0x1(%rax),%edx > 0x00007ffff7154d6c <+60>: xor %r8d,%eax > 0x00007ffff7154d6f <+63>: shl $0x7,%rax > 0x00007ffff7154d73 <+67>: mov %edx,0xc1daef(%rip) # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter> > 0x00007ffff7154d79 <+73>: and $0x380,%eax > 0x00007ffff7154d7e <+78>: add $0x17,%rax > 0x00007ffff7154d82 <+82>: and $0x7f0,%eax > 0x00007ffff7154d87 <+87>: sub %rax,%rsp > 0x00007ffff7154d8a <+90>: lea 0xf(%rsp),%rax > 0x00007ffff7154d8f <+95>: and $0xfffffffffffffff0,%rax > > 669 ((char *)ptr)[0] = 1; > 0x00007ffff7154d93 <+99>: movb $0x1,(%rax) > > I modified the code as following, it seems to work - we cannot see `alloca()`, however the stack is expanded. > > Sorry but I'm not seeing where the stack actually gets expanded? 0x00007ffff7154d87 <+87>: sub %rax,%rsp I guess `%rax` seems to contain the result of `((pid ^ counter++) & 7) * 128`, then `alloca()` is replaced to `sub` for `%RSP`. I saw the warning for this issue as `void* __builtin_alloca(long unsigned int)`. 
It might be it. We can just expand `%RSP` if we want to allocate buffer on the stack. ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From tschatzl at openjdk.java.net Fri Mar 19 09:46:37 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Fri, 19 Mar 2021 09:46:37 GMT Subject: RFR: 8263834: Work around gdb for HashtableEntry In-Reply-To: References: Message-ID: On Fri, 19 Mar 2021 03:20:21 GMT, Ioi Lam wrote: > Please review this one liner to work around a gdb bug in printing subtypes of `HashtableEntry`: > > (gdb) p this > $1 = (const PlaceholderEntry * const) 0x7ffff0242a30 > (gdb) p *this > $2 = > (gdb) ptype this > type = const class PlaceholderEntry { > > } * const > > The fix is to no longer subclass `HashtableEntry` from `CHeapObj`. Apparently that makes gdb happy. None of our code requires this subclass relationship. > > Tested with tiers 1-4. Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3084 From rkennke at openjdk.java.net Fri Mar 19 11:35:43 2021 From: rkennke at openjdk.java.net (Roman Kennke) Date: Fri, 19 Mar 2021 11:35:43 GMT Subject: RFR: 8261492: Shenandoah: reconsider forwardee accesses memory ordering [v3] In-Reply-To: References: Message-ID: <6m_sTEtwhTj1zH7SK6fdyzjavt9cykNU48ktFB96CmM=.a32e4a69-80b7-42f6-90ca-984a88209211@github.com> On Tue, 16 Feb 2021 10:26:06 GMT, Aleksey Shipilev wrote: >> Shenandoah carries forwardee information in object's mark word. Installing the new mark word is effectively "releasing" the object copy, and reading from the new mark word is "acquiring" that object copy. >> >> For the forwardee update side, Hotspot's default for atomic operations is memory_order_conservative, which emits two-way memory fences around the CASes at least on AArch64 and PPC64. This seems to be excessive for Shenandoah forwardee updates, and "release" is enough. >> >> For the forwardee load side, we need to guarantee "acquire". 
We do not do it now, reading the markword without memory semantics. It does not seem to pose a practical problem today, because GC does not access the object contents in the new copy, and mutators get this from the JRT-called stub that separates the fwdptr access and object contents access by a lot. It still should be cleaner to "acquire" the mark on load to avoid surprises. >> >> Additional testing: >> - [x] Linux x86_64 `hotspot_gc_shenandoah` >> - [x] Linux AArch64 `hotspot_gc_shenandoah` >> - [x] Linux AArch64 `tier1` with Shenandoah > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: > > - A few minor touchups > - Add a blurb to x86 code as well > - Use implicit "consume" in AArch64, add more notes. > - Merge branch 'master' into JDK-8261492-shenandoah-forwardee-memord > - Make sure to access fwdptr with acquire semantics in assembler code > - 8261492: Shenandoah: reconsider forwardee accesses memory ordering Looks good to me. ------------- Marked as reviewed by rkennke (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2496 From ysuenaga at openjdk.java.net Fri Mar 19 12:53:40 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 19 Mar 2021 12:53:40 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Fri, 19 Mar 2021 09:04:12 GMT, Yasumasa Suenaga wrote: >>> + static void* volatile _stack_pad = alloca(((pid ^ counter++) & 7) * 128); >>> + if (_stack_pad != 0) { >>> + ((char*)_stack_pad)[0] = 1; >>> + } >> >> I guess `_stack_pad` will be overwritten in each `threaad_native_entry()` call, so it might be elided. >> I modified the code as following, it seems to work - we cannot see `alloca()`, however the stack is expanded. >> >> diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp >> index 5af63befb58..bdb2dc89615 100644 >> --- a/src/hotspot/os/linux/os_linux.cpp >> +++ b/src/hotspot/os/linux/os_linux.cpp >> @@ -665,7 +665,8 @@ static void *thread_native_entry(Thread *thread) { >> // processors with hyperthreading technology. >> static int counter = 0; >> int pid = os::current_process_id(); >> - alloca(((pid ^ counter++) & 7) * 128); >> + void *ptr = alloca(((pid ^ counter++) & 7) * 128); >> + ((char *)ptr)[0] = 1; >> >> thread->initialize_thread_current(); >> >> 659 thread->record_stack_base_and_size(); >> 0x00007ffff7154d53 <+35>: call 0x7ffff75b3a80 <_ZN6Thread26record_stack_base_and_sizeEv> >> >> 660 >> 661 // Try to randomize the cache line index of hot stack frames. 
>> 662 // This helps when threads of the same stack traces evict each other's >> 663 // cache lines. The threads can be either from the same JVM instance, or >> 664 // from different JVM instances. The benefit is especially true for >> 665 // processors with hyperthreading technology. >> 666 static int counter = 0; >> >> 667 int pid = os::current_process_id(); >> >> 668 void *ptr = alloca(((pid ^ counter++) & 7) * 128); >> 0x00007ffff7154d63 <+51>: mov 0xc1daff(%rip),%eax # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter> >> 0x00007ffff7154d69 <+57>: lea 0x1(%rax),%edx >> 0x00007ffff7154d6c <+60>: xor %r8d,%eax >> 0x00007ffff7154d6f <+63>: shl $0x7,%rax >> 0x00007ffff7154d73 <+67>: mov %edx,0xc1daef(%rip) # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter> >> 0x00007ffff7154d79 <+73>: and $0x380,%eax >> 0x00007ffff7154d7e <+78>: add $0x17,%rax >> 0x00007ffff7154d82 <+82>: and $0x7f0,%eax >> 0x00007ffff7154d87 <+87>: sub %rax,%rsp >> 0x00007ffff7154d8a <+90>: lea 0xf(%rsp),%rax >> 0x00007ffff7154d8f <+95>: and $0xfffffffffffffff0,%rax >> >> 669 ((char *)ptr)[0] = 1; >> 0x00007ffff7154d93 <+99>: movb $0x1,(%rax) > >> > I modified the code as following, it seems to work - we cannot see `alloca()`, however the stack is expanded. >> >> Sorry but I'm not seeing where the stack actually gets expanded? > > 0x00007ffff7154d87 <+87>: sub %rax,%rsp > > I guess `%rax` seems to contain the result of `((pid ^ counter++) & 7) * 128`, then `alloca()` is replaced to `sub` for `%RSP`. > I saw the warning for this issue as `void* __builtin_alloca(long unsigned int)`. It might be it. We can just expand `%RSP` if we want to allocate buffer on the stack. 
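The idea quoted above (keep the pointer returned by `alloca()` and write through it, so that the compiler has a reason to keep the allocation) might be sketched as follows. This is only an illustration under assumptions, not the HotSpot change: `pad_stack_and_touch` is a hypothetical name, `<alloca.h>` is assumed to be available (as on glibc), and the write goes through a `volatile` pointer, which is a variation on the plain `((char *)ptr)[0] = 1` in the patch. It also skips the write when the computed size is zero.

```cpp
#include <alloca.h>   // non-standard header; assumed available (glibc/Linux)
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the padding step in thread_native_entry();
// it takes the counter value as a parameter instead of using a static.
static std::size_t pad_stack_and_touch(int pid, int counter) {
  // Same arithmetic as in the thread: one of 0, 128, ..., 896 bytes.
  const std::size_t size =
      static_cast<std::size_t>((pid ^ counter) & 7) * 128;
  if (size != 0) {
    // A store through a volatile pointer counts as an observable side
    // effect, so the compiler should not be able to drop it, and it
    // needs real stack memory to store into.
    volatile char* pad = static_cast<volatile char*>(alloca(size));
    pad[0] = 1;         // first byte of the padding
    pad[size - 1] = 1;  // last byte of the padding
  }
  return size;
}
```

Note that memory obtained from `alloca()` is released when the calling function returns, so a helper like this only pads its own frame. To shift the frames of everything a thread runs, the call would have to stay directly in the thread entry function, as in the original code.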
I objdump'ed libjvm.so in JDK 16 Linux x64 from jdk.java.net , it also does not seem to expand the stack: 0000000000bd8500 : bd8500: 55 push %rbp bd8501: 48 89 e5 mov %rsp,%rbp bd8504: 41 56 push %r14 bd8506: 41 55 push %r13 bd8508: 49 89 fd mov %rdi,%r13 bd850b: 41 54 push %r12 bd850d: 53 push %rbx bd850e: e8 ad 1e 1a 00 callq d7a3c0 bd8513: e8 08 27 66 ff callq 23ac20 bd8518: 4c 89 ef mov %r13,%rdi bd851b: 83 05 e6 a3 64 00 01 addl $0x1,0x64a3e6(%rip) # 1222908 bd8522: e8 39 1e 1a 00 callq d7a360 bd8527: 49 8b 9d 70 02 00 00 mov 0x270(%r13),%rbx bd852e: 31 c0 xor %eax,%eax Result from `getpid()` will be stored into `%RAX`, however it is not used until `xor` at bd852e. And also I could not find out both `alloca()` call and manipulating `%RSP` at here. ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From coleenp at openjdk.java.net Fri Mar 19 13:41:55 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 19 Mar 2021 13:41:55 GMT Subject: RFR: 8263834: Work around gdb for HashtableEntry In-Reply-To: References: Message-ID: On Fri, 19 Mar 2021 03:20:21 GMT, Ioi Lam wrote: > Please review this one liner to work around a gdb bug in printing subtypes of `HashtableEntry`: > > (gdb) p this > $1 = (const PlaceholderEntry * const) 0x7ffff0242a30 > (gdb) p *this > $2 = > (gdb) ptype this > type = const class PlaceholderEntry { > > } * const > > The fix is to no longer subclass `HashtableEntry` from `CHeapObj`. Apparently that makes gdb happy. None of our code requires this subclass relationship. > > Tested with tiers 1-4. I think this is fine and appreciated. I have removed the block allocation but now use placement new, so the constructors are called (vptr won't be 0xf1f1 etc), but that works fine with this change. For some reason, using normal new HashtableEntry<>() crashes malloc so I can't really do that anyway. ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3084 From stuefe at openjdk.java.net Fri Mar 19 17:36:40 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Fri, 19 Mar 2021 17:36:40 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Fri, 19 Mar 2021 12:51:07 GMT, Yasumasa Suenaga wrote: >>> > I modified the code as following, it seems to work - we cannot see `alloca()`, however the stack is expanded. >>> >>> Sorry but I'm not seeing where the stack actually gets expanded? >> >> 0x00007ffff7154d87 <+87>: sub %rax,%rsp >> >> I guess `%rax` seems to contain the result of `((pid ^ counter++) & 7) * 128`, then `alloca()` is replaced to `sub` for `%RSP`. >> I saw the warning for this issue as `void* __builtin_alloca(long unsigned int)`. It might be it. We can just expand `%RSP` if we want to allocate buffer on the stack. > > I objdump'ed libjvm.so in JDK 16 Linux x64 from jdk.java.net , it also does not seem to expand the stack: > > 0000000000bd8500 : > bd8500: 55 push %rbp > bd8501: 48 89 e5 mov %rsp,%rbp > bd8504: 41 56 push %r14 > bd8506: 41 55 push %r13 > bd8508: 49 89 fd mov %rdi,%r13 > bd850b: 41 54 push %r12 > bd850d: 53 push %rbx > bd850e: e8 ad 1e 1a 00 callq d7a3c0 > bd8513: e8 08 27 66 ff callq 23ac20 > bd8518: 4c 89 ef mov %r13,%rdi > bd851b: 83 05 e6 a3 64 00 01 addl $0x1,0x64a3e6(%rip) # 1222908 > bd8522: e8 39 1e 1a 00 callq d7a360 > bd8527: 49 8b 9d 70 02 00 00 mov 0x270(%r13),%rbx > bd852e: 31 c0 xor %eax,%eax > > Result from `getpid()` will be stored into `%RAX`, however it is not used until `xor` at bd852e. 
> And also I could not find out both `alloca()` call and manipulating `%RSP` at here. Writing to the *end* of the allocated area may do the trick. ..Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From fweimer at openjdk.java.net Fri Mar 19 17:53:42 2021 From: fweimer at openjdk.java.net (Florian Weimer) Date: Fri, 19 Mar 2021 17:53:42 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Fri, 19 Mar 2021 17:33:35 GMT, Thomas Stuefe wrote: >> I objdump'ed libjvm.so in JDK 16 Linux x64 from jdk.java.net , it also does not seem to expand the stack: >> >> 0000000000bd8500 : >> bd8500: 55 push %rbp >> bd8501: 48 89 e5 mov %rsp,%rbp >> bd8504: 41 56 push %r14 >> bd8506: 41 55 push %r13 >> bd8508: 49 89 fd mov %rdi,%r13 >> bd850b: 41 54 push %r12 >> bd850d: 53 push %rbx >> bd850e: e8 ad 1e 1a 00 callq d7a3c0 >> bd8513: e8 08 27 66 ff callq 23ac20 >> bd8518: 4c 89 ef mov %r13,%rdi >> bd851b: 83 05 e6 a3 64 00 01 addl $0x1,0x64a3e6(%rip) # 1222908 >> bd8522: e8 39 1e 1a 00 callq d7a360 >> bd8527: 49 8b 9d 70 02 00 00 mov 0x270(%r13),%rbx >> bd852e: 31 c0 xor %eax,%eax >> >> Result from `getpid()` will be stored into `%RAX`, however it is not used until `xor` at bd852e. >> And also I could not find out both `alloca()` call and manipulating `%RSP` at here. > > Writing to the *end* of the allocated area may do the trick. > > ..Thomas The use of `getpid` in this code suggests it dates back to LinuxThreads, where the PID differed from thread to thread. 
On current Linux, `getpid` really returns the PID, so this does nothing to randomize the offset within a single process. I would expect the system thread library to do this if it is beneficial. glibc replaced `COLORING_INCREMENT` (similar to this `alloca`, I believe) with `MULTI_PAGE_ALIASING` on i386 around 2003. `MULTI_PAGE_ALIASING` is implemented in a completely different way; it tweaks stack sizes to avoid accidental higher-level alignment (above the page level) between different threads. @hjl-tools Do you think we need anything like this on current CPUs? ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From thomas.stuefe at gmail.com Fri Mar 19 18:53:58 2021 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Fri, 19 Mar 2021 19:53:58 +0100 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> Message-ID: On Fri, Mar 19, 2021 at 6:54 PM Florian Weimer wrote: > On Fri, 19 Mar 2021 17:33:35 GMT, Thomas Stuefe > wrote: > > >> I objdump'ed libjvm.so in JDK 16 Linux x64 from jdk.java.net , it also > does not seem to expand the stack: > >> > >> 0000000000bd8500 : > >> bd8500: 55 push %rbp > >> bd8501: 48 89 e5 mov %rsp,%rbp > >> bd8504: 41 56 push %r14 > >> bd8506: 41 55 push %r13 > >> bd8508: 49 89 fd mov %rdi,%r13 > >> bd850b: 41 54 push %r12 > >> bd850d: 53 push %rbx > >> bd850e: e8 ad 1e 1a 00 callq d7a3c0 > > >> bd8513: e8 08 27 66 ff callq 23ac20 > >> bd8518: 4c 89 ef mov %r13,%rdi > >> bd851b: 83 05 e6 a3 64 00 01 addl $0x1,0x64a3e6(%rip) > # 1222908 > >> bd8522: e8 39 1e 1a 00 callq d7a360 > > >> bd8527: 49 8b 9d 70 02 00 00 mov 0x270(%r13),%rbx > >> bd852e: 31 c0 xor %eax,%eax > >> > >> Result from
`getpid()` will be stored into `%RAX`, however it is not > used until `xor` at bd852e. > >> And also I could not find out both `alloca()` call and manipulating > `%RSP` at here. > > > > Writing to the *end* of the allocated area may do the trick. > > > > ..Thomas > > The use of `getpid` in this code suggests it dates back to LinuxThreads, > where the PID differed from thread to thread. On current Linux, `getpid` > really returns the PID, so this does nothing to randomize the offset within > a single process. > > I would expect the system thread library to do this if it is beneficial. > glibc replaced `COLORING_INCREMENT` (similar to this `alloca`, I believe) > with `MULTI_PAGE_ALIASING` on i386 around 2003. `MULTI_PAGE_ALIASING` is > implemented in a completely different way; it tweaks stack sizes to avoid > accidental higher-level alignment (above the page level) between different > threads. > > Interesting. Is this observable (e.g. via pthread_attr_getstacksize)? ..Thomas From fweimer at redhat.com Fri Mar 19 18:58:29 2021 From: fweimer at redhat.com (Florian Weimer) Date: Fri, 19 Mar 2021 19:58:29 +0100 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: ("Thomas Stüfe"'s message of "Fri, 19 Mar 2021 19:53:58 +0100") References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> Message-ID: <87lfajj7hm.fsf@oldenburg.str.redhat.com> * Thomas Stüfe: >> I would expect the system thread library to do this if it is beneficial. >> glibc replaced `COLORING_INCREMENT` (similar to this `alloca`, I believe) >> with `MULTI_PAGE_ALIASING` on i386 around 2003.
`MULTI_PAGE_ALIASING` is >> implemented in a completely different way; it tweaks stack sizes to avoid >> accidental higher-level alignment (above the page level) between different >> threads. > > Interesting. Is this observable (e.g. via pthread_attr_getstacksize)? Yes, but that's actually a bug: pthread_getattr_np reports wrong stack size with MULTI_PAGE_ALIASING. H.J. has just updated the bug, indicating that the code is no longer needed. Thanks, Florian From github.com+1072356+hjl-tools at openjdk.java.net Fri Mar 19 19:53:42 2021 From: github.com+1072356+hjl-tools at openjdk.java.net (hjl-tools) Date: Fri, 19 Mar 2021 19:53:42 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnPKEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Fri, 19 Mar 2021 17:51:08 GMT, Florian Weimer wrote: > The use of `getpid` in this code suggests it dates back to LinuxThreads, where the PID differed from thread to thread. On current Linux, `getpid` really returns the PID, so this does nothing to randomize the offset within a single process. > > I would expect the system thread library to do this if it is beneficial. glibc replaced `COLORING_INCREMENT` (similar to this `alloca`, I believe) with `MULTI_PAGE_ALIASING` on i386 around 2003. `MULTI_PAGE_ALIASING` is implemented in a completely different way; it tweaks stack sizes to avoid accidental higher-level alignment (above the page level) between different threads. > > @hjl-tools Do you think we need anything like this on current CPUs? We don't need MULTI_PAGE_ALIASING anymore in glibc.
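The point about `getpid` can be checked with a little arithmetic on the expression from the thread, `((pid ^ counter++) & 7) * 128`. The sketch below is only an illustration, assuming the pid stays constant for the life of the process; `stack_pad_bytes` and `distinct_pads` are hypothetical helper names, not HotSpot functions. It suggests that the pid only changes the order in which the eight possible pad sizes (0 to 896 bytes) are handed out, while the counter alone provides the variation between threads of one process.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical helper: the pad size requested from alloca() for a given
// pid and counter value, using the expression quoted in this thread.
static std::size_t stack_pad_bytes(int pid, int counter) {
  return static_cast<std::size_t>((pid ^ counter) & 7) * 128;
}

// Hypothetical helper: the set of distinct pad sizes seen by `threads`
// consecutive thread starts (counter = 0, 1, 2, ...) in one process.
static std::set<std::size_t> distinct_pads(int pid, int threads) {
  std::set<std::size_t> pads;
  for (int counter = 0; counter < threads; ++counter) {
    pads.insert(stack_pad_bytes(pid, counter));
  }
  return pads;
}
```

For any fixed pid, eight consecutive counter values yield all eight sizes exactly once, because XOR with a constant only permutes the low three bits. Two processes with different pids therefore see the same set of offsets, just in a different order.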
------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From iklam at openjdk.java.net Fri Mar 19 21:26:39 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 19 Mar 2021 21:26:39 GMT Subject: RFR: 8263834: Work around gdb for HashtableEntry In-Reply-To: References: Message-ID: On Fri, 19 Mar 2021 07:06:28 GMT, Thomas Stuefe wrote: >> Please review this one liner to work around a gdb bug in printing subtypes of `HashtableEntry`: >> >> (gdb) p this >> $1 = (const PlaceholderEntry * const) 0x7ffff0242a30 >> (gdb) p *this >> $2 = >> (gdb) ptype this >> type = const class PlaceholderEntry { >> >> } * const >> >> The fix is to no longer subclass `HashtableEntry` from `CHeapObj`. Apparently that makes gdb happy. None of our code requires this subclass relationship. >> >> Tested with tiers 1-4. > > Makes sense. Thanks @tstuefe @tschatzl @coleenp @dholmes-ora for the review! ------------- PR: https://git.openjdk.java.net/jdk/pull/3084 From iklam at openjdk.java.net Fri Mar 19 21:26:40 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 19 Mar 2021 21:26:40 GMT Subject: Integrated: 8263834: Work around gdb for HashtableEntry In-Reply-To: References: Message-ID: On Fri, 19 Mar 2021 03:20:21 GMT, Ioi Lam wrote: > Please review this one liner to work around a gdb bug in printing subtypes of `HashtableEntry`: > > (gdb) p this > $1 = (const PlaceholderEntry * const) 0x7ffff0242a30 > (gdb) p *this > $2 = > (gdb) ptype this > type = const class PlaceholderEntry { > > } * const > > The fix is to no longer subclass `HashtableEntry` from `CHeapObj`. Apparently that makes gdb happy. None of our code requires this subclass relationship. > > Tested with tiers 1-4. This pull request has now been integrated. 
Changeset: 4d9517d2 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/4d9517d2 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8263834: Work around gdb for HashtableEntry Reviewed-by: dholmes, stuefe, tschatzl, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/3084 From david.holmes at oracle.com Fri Mar 19 22:24:08 2021 From: david.holmes at oracle.com (David Holmes) Date: Sat, 20 Mar 2021 08:24:08 +1000 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: <8dff2d94-d1b3-cda5-4b64-b613fe3e80c6@oracle.com> On 19/03/2021 7:06 pm, Yasumasa Suenaga wrote: > On Fri, 19 Mar 2021 08:47:39 GMT, Yasumasa Suenaga wrote: > >>> I checked `alloca()` call on Fedora 33 x86_64 and Alpine 3.13 x86_64. Both of them seem to elide `alloca()`. >>> (It is fastdebug build - slow debug might be different, but production build might be same) >>> >>> 659 thread->record_stack_base_and_size(); >>> 0x00007ffff7154d44 <+20>: call 0x7ffff75b3a20 <_ZN6Thread26record_stack_base_and_sizeEv> >>> >>> 660 >>> 661 // Try to randomize the cache line index of hot stack frames. >>> 662 // This helps when threads of the same stack traces evict each other's >>> 663 // cache lines. The threads can be either from the same JVM instance, or >>> 664 // from different JVM instances. The benefit is especially true for >>> 665 // processors with hyperthreading technology. 
>>> 666 static int counter = 0; >>> >>> 667 int pid = os::current_process_id(); >>> >>> 668 alloca(((pid ^ counter++) & 7) * 128); >>> 0x00007ffff7154d51 <+33>: addl $0x1,0xc1db10(%rip) # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter> >>> >>> 669 >>> 670 thread->initialize_thread_current(); >>> => 0x00007ffff7154d4e <+30>: mov %r13,%rdi >>> 0x00007ffff7154d58 <+40>: call 0x7ffff75b3800 <_ZN6Thread25initialize_thread_currentEv> >>> >>>> I prefer it rather than (void)!. >>>> >>>> Does that work in release builds too? >>> >>> It will not work as David said :) we need to use `(void)!` if we should left it. >> >>> + static void* volatile _stack_pad = alloca(((pid ^ counter++) & 7) * 128); >>> + if (_stack_pad != 0) { >>> + ((char*)_stack_pad)[0] = 1; >>> + } >> >> I guess `_stack_pad` will be overwritten in each `threaad_native_entry()` call, so it might be elided. >> I modified the code as following, it seems to work - we cannot see `alloca()`, however the stack is expanded. >> >> diff --git a/src/hotspot/os/linux/os_linux.cpp b/src/hotspot/os/linux/os_linux.cpp >> index 5af63befb58..bdb2dc89615 100644 >> --- a/src/hotspot/os/linux/os_linux.cpp >> +++ b/src/hotspot/os/linux/os_linux.cpp >> @@ -665,7 +665,8 @@ static void *thread_native_entry(Thread *thread) { >> // processors with hyperthreading technology. >> static int counter = 0; >> int pid = os::current_process_id(); >> - alloca(((pid ^ counter++) & 7) * 128); >> + void *ptr = alloca(((pid ^ counter++) & 7) * 128); >> + ((char *)ptr)[0] = 1; >> >> thread->initialize_thread_current(); >> >> 659 thread->record_stack_base_and_size(); >> 0x00007ffff7154d53 <+35>: call 0x7ffff75b3a80 <_ZN6Thread26record_stack_base_and_sizeEv> >> >> 660 >> 661 // Try to randomize the cache line index of hot stack frames. >> 662 // This helps when threads of the same stack traces evict each other's >> 663 // cache lines. The threads can be either from the same JVM instance, or >> 664 // from different JVM instances. 
The benefit is especially true for >> 665 // processors with hyperthreading technology. >> 666 static int counter = 0; >> >> 667 int pid = os::current_process_id(); >> >> 668 void *ptr = alloca(((pid ^ counter++) & 7) * 128); >> 0x00007ffff7154d63 <+51>: mov 0xc1daff(%rip),%eax # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter> >> 0x00007ffff7154d69 <+57>: lea 0x1(%rax),%edx >> 0x00007ffff7154d6c <+60>: xor %r8d,%eax >> 0x00007ffff7154d6f <+63>: shl $0x7,%rax >> 0x00007ffff7154d73 <+67>: mov %edx,0xc1daef(%rip) # 0x7ffff7d72868 <_ZZL19thread_native_entryP6ThreadE7counter> >> 0x00007ffff7154d79 <+73>: and $0x380,%eax >> 0x00007ffff7154d7e <+78>: add $0x17,%rax >> 0x00007ffff7154d82 <+82>: and $0x7f0,%eax >> 0x00007ffff7154d87 <+87>: sub %rax,%rsp >> 0x00007ffff7154d8a <+90>: lea 0xf(%rsp),%rax >> 0x00007ffff7154d8f <+95>: and $0xfffffffffffffff0,%rax >> >> 669 ((char *)ptr)[0] = 1; >> 0x00007ffff7154d93 <+99>: movb $0x1,(%rax) > >>> I modified the code as following, it seems to work - we cannot see `alloca()`, however the stack is expanded. >> >> Sorry but I'm not seeing where the stack actually gets expanded? > > 0x00007ffff7154d87 <+87>: sub %rax,%rsp Doh! Thanks. I'm re-running some benchmarks on Linux. It would be good to confirm that the alloca is also being elided with clang and VS. David ----- > I guess `%rax` seems to contain the result of `((pid ^ counter++) & 7) * 128`, then `alloca()` is replaced to `sub` for `%RSP`. > I saw the warning for this issue as `void* __builtin_alloca(long unsigned int)`. It might be it. We can just expand `%RSP` if we want to allocate buffer on the stack. 
> > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3042 > From ysuenaga at openjdk.java.net Sat Mar 20 02:31:39 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Sat, 20 Mar 2021 02:31:39 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Fri, 19 Mar 2021 18:38:53 GMT, hjl-tools wrote: >> The use of `getpid` in this code suggests it dates back to LinuxThreads, where the PID differed from thread to thread. On current Linux, `getpid` really returns the PID, so this does nothing to randomize the offset within a single process. >> >> I would expect the system thread library to do this if it is beneficial. glibc replaced `COLORING_INCREMENT` (similar to this `alloca`, I believe) with `MULTI_PAGE_ALIASING` on i386 around 2003. `MULTI_PAGE_ALIASING` is implemented in a completely different way; it tweaks stack sizes to avoid accidental higher-level alignment (above the page level) between different threads. >> >> @hjl-tools Do you think we need anything like this on current CPUs? > >> The use of `getpid` in this code suggests it dates back to LinuxThreads, where the PID differed from thread to thread. On current Linux, `getpid` really returns the PID, so this does nothing to randomize the offset within a single process. >> >> I would expect the system thread library to do this if it is beneficial. glibc replaced `COLORING_INCREMENT` (similar to this `alloca`, I believe) with `MULTI_PAGE_ALIASING` on i386 around 2003. 
`MULTI_PAGE_ALIASING` is implemented in a completely different way; it tweaks stack sizes to avoid accidental higher-level alignment (above the page level) between different threads. >> >> @hjl-tools Do you think we need anything like this on current CPUs? > > We don't need MULTI_PAGE_ALIASING anymore in glibc. > It would be good to confirm that the alloca is also being elided with > clang and VS. I debugged a fastdebug JDK build in Visual Studio, and the `alloca()` seems to take effect there: // Try to randomize the cache line index of hot stack frames. // This helps when threads of the same stack traces evict each other's // cache lines. The threads can be either from the same JVM instance, or // from different JVM instances. The benefit is especially true for // processors with hyperthreading technology. static int counter = 0; int pid = os::current_process_id(); 00007FFFDEBDCFF0 add byte ptr [rbx-7A78FEF3h],cl _alloca(((pid ^ counter++) & 7) * 128); 00007FFFDEBDCFF6 add byte ptr [rbx-2FCC2Fh],cl 00007FFFDEBDCFFC ror dword ptr [rcx-7A790AF3h],0 00007FFFDEBDD003 and edx,7 00007FFFDEBDD006 shl edx,7 00007FFFDEBDD009 mov eax,edx 00007FFFDEBDD00B lea rcx,[rdx+0Fh] 00007FFFDEBDD00F cmp rcx,rax 00007FFFDEBDD012 ja thread_native_entry+6Eh (07FFFDEBDD01Eh) 00007FFFDEBDD014 mov rcx,0FFFFFFFFFFFFFF0h 00007FFFDEBDD01E and rcx,0FFFFFFFFFFFFFFF0h 00007FFFDEBDD022 mov rax,rcx 00007FFFDEBDD025 call __chkstk (07FFFDEEFF430h) 00007FFFDEBDD02A sub rsp,rcx `alloca()` is also used in os_bsd.cpp and os_aix.cpp, but I cannot check those because I do not have access to those platforms.
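
The `and edx,7` / `shl edx,7` pair in the listing above is the compiled form of `((pid ^ counter++) & 7) * 128`. As a rough sketch of that arithmetic only (the names `stack_pad_bytes` and `pad_sequence` are made up for illustration and are not HotSpot functions), the pad takes one of eight values between 0 and 896 bytes, and within one process only the counter changes it from thread to thread:

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper, not part of HotSpot: the pad size produced by
// `((pid ^ counter) & 7) * 128` for a single thread start.
constexpr std::size_t stack_pad_bytes(int pid, int counter) {
  return static_cast<std::size_t>((pid ^ counter) & 7) * 128;
}

// Hypothetical helper: pad sizes seen by the first eight threads of a
// process. The PID is identical for all of them, so it merely permutes
// the sequence; the variation comes from the counter.
inline std::array<std::size_t, 8> pad_sequence(int pid) {
  std::array<std::size_t, 8> pads{};
  for (int counter = 0; counter < 8; ++counter) {
    pads[static_cast<std::size_t>(counter)] = stack_pad_bytes(pid, counter);
  }
  return pads;
}
```

For any PID the eight entries are the multiples of 128 from 0 to 896 in some order, which matches the earlier remark in this thread that `getpid` no longer adds per-thread randomness.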
------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From ysuenaga at openjdk.java.net Sat Mar 20 03:05:39 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Sat, 20 Mar 2021 03:05:39 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Sat, 20 Mar 2021 02:29:16 GMT, Yasumasa Suenaga wrote: > I debug'ed fastdebug JDK on Visual Studio, it seems to work `alloca()`: I used cl.exe from Visual Studio 2019 (16.9.1). The generated code might be different by compiler version of course. ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From yyang at openjdk.java.net Sat Mar 20 05:23:38 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Sat, 20 Mar 2021 05:23:38 GMT Subject: RFR: 8263002: Remove CDS MiscCode region [v3] In-Reply-To: References: <4EsEk0VVQDAbkJzXmrMh2a_B4sFq7diglgCo3UqZpBQ=.5f4541c2-76aa-46da-82b5-dc8e537a85f3@github.com> Message-ID: On Wed, 10 Mar 2021 06:07:04 GMT, Ioi Lam wrote: >> wow. Looks good to me. > > Thanks @coleenp and @dholmes-ora for the review! Hi @iklam, sorry to bother you, I found there are some comments that still refer to [lingering mc regions](https://github.com/openjdk/jdk/compare/master...kelthuzadx:residual_mc_region). 
------------- PR: https://git.openjdk.java.net/jdk/pull/2861 From ihse at openjdk.java.net Sat Mar 20 10:23:39 2021 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Sat, 20 Mar 2021 10:23:39 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Sat, 20 Mar 2021 03:03:09 GMT, Yasumasa Suenaga wrote: >>> It would be good to confirm that the alloca is also being elided with >>> clang and VS. >> >> I debug'ed fastdebug JDK on Visual Studio, it seems to work `alloca()`: >> >> // Try to randomize the cache line index of hot stack frames. >> // This helps when threads of the same stack traces evict each other's >> // cache lines. The threads can be either from the same JVM instance, or >> // from different JVM instances. The benefit is especially true for >> // processors with hyperthreading technology. 
>> static int counter = 0; >> int pid = os::current_process_id(); >> 00007FFFDEBDCFF0 add byte ptr [rbx-7A78FEF3h],cl >> _alloca(((pid ^ counter++) & 7) * 128); >> 00007FFFDEBDCFF6 add byte ptr [rbx-2FCC2Fh],cl >> 00007FFFDEBDCFFC ror dword ptr [rcx-7A790AF3h],0 >> 00007FFFDEBDD003 and edx,7 >> 00007FFFDEBDD006 shl edx,7 >> 00007FFFDEBDD009 mov eax,edx >> 00007FFFDEBDD00B lea rcx,[rdx+0Fh] >> 00007FFFDEBDD00F cmp rcx,rax >> 00007FFFDEBDD012 ja thread_native_entry+6Eh (07FFFDEBDD01Eh) >> 00007FFFDEBDD014 mov rcx,0FFFFFFFFFFFFFF0h >> 00007FFFDEBDD01E and rcx,0FFFFFFFFFFFFFFF0h >> 00007FFFDEBDD022 mov rax,rcx >> 00007FFFDEBDD025 call __chkstk (07FFFDEEFF430h) >> 00007FFFDEBDD02A sub rsp,rcx >> >> `alloca()` is also in os_bsd.cpp and os_aix.cpp, but I cannot check them because I do not have them. > >> I debug'ed fastdebug JDK on Visual Studio, it seems to work `alloca()`: > > I used cl.exe from Visual Studio 2019 (16.9.1). The generated code might be different by compiler version of course. @YaSuenag The code generated by debug builds is often significantly different from release builds. You might want to check the latter instead to figure out if this has any effect on what we actually ship. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From ysuenaga at openjdk.java.net Sat Mar 20 12:37:40 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Sat, 20 Mar 2021 12:37:40 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Sat, 20 Mar 2021 10:20:50 GMT, Magnus Ihse Bursie wrote: >>> I debug'ed fastdebug JDK on Visual Studio, it seems to work `alloca()`: >> >> I used cl.exe from Visual Studio 2019 (16.9.1). The generated code might be different by compiler version of course. > > @YaSuenag The code generated by debug builds is often significantly different from release builds. You might want to check the latter instead to figure out if this has any effect on what we actually ship. @magicus I checked assembly code of release build, it is similar to fastdebug build. The stack (RSP) is expanded. (I confirmed it with Visual Studio 16.9.2 because I received update notification before your reply...) // Try to randomize the cache line index of hot stack frames. // This helps when threads of the same stack traces evict each other's // cache lines. The threads can be either from the same JVM instance, or // from different JVM instances. The benefit is especially true for // processors with hyperthreading technology. 
static int counter = 0; int pid = os::current_process_id(); 00007FF80F4E10ED mov eax,dword ptr [_initial_pid (07FF80F9D8164h)] 00007FF80F4E10F3 test eax,eax 00007FF80F4E10F5 jne thread_native_entry+3Dh (07FF80F4E10FDh) 00007FF80F4E10F7 call qword ptr [__imp__getpid (07FF80F6EF748h)] _alloca(((pid ^ counter++) & 7) * 128); 00007FF80F4E10FD mov ecx,dword ptr [counter (07FF80F9D8300h)] 00007FF80F4E1103 mov edx,ecx 00007FF80F4E1105 xor edx,eax 00007FF80F4E1107 inc ecx 00007FF80F4E1109 mov dword ptr [counter (07FF80F9D8300h)],ecx 00007FF80F4E110F and edx,7 00007FF80F4E1112 shl edx,7 00007FF80F4E1115 mov eax,edx 00007FF80F4E1117 lea rcx,[rdx+0Fh] 00007FF80F4E111B cmp rcx,rax 00007FF80F4E111E ja thread_native_entry+6Ah (07FF80F4E112Ah) 00007FF80F4E1120 mov rcx,0FFFFFFFFFFFFFF0h 00007FF80F4E112A and rcx,0FFFFFFFFFFFFFFF0h 00007FF80F4E112E mov rax,rcx 00007FF80F4E1131 call __chkstk (07FF80F6ED370h) 00007FF80F4E1136 sub rsp,rcx ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From yyang at openjdk.java.net Sun Mar 21 04:30:53 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Sun, 21 Mar 2021 04:30:53 GMT Subject: RFR: 8263028: Windows build fails due to several treat-warning-as-errors Message-ID: cl.exe(19.28.29334) can not build JDK on windows_x64 because it treats many warnings as errors thus prohibiting further compilation. (See detailed failure logs on JBS) 1. methodMatcher.cpp cl.exe can not handle advanced usage of sscanf(i.e. regex-like sscanf) correctly. This looks like an msvc compiler bug, it has been there for a long while, so I temporarily disable these warnings in a limited region. Outside of this region, the compiler still treats them as errors. This change does not affect the functionality of MethodMatcher::parse_method_pattern, it can parse class name and method name in a desired manner. 2. vm_version_x86.cpp Some comments contain characters(Register Trademark) that cannot be represented in the current code page (936). 
Replacing them with ASCII-characters makes the compiler happy. Tested manually. Best Regards, Yang ------------- Commit messages: - fix build failure on windows Changes: https://git.openjdk.java.net/jdk/pull/3107/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3107&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263028 Stats: 41 lines in 2 files changed: 5 ins; 0 del; 36 mod Patch: https://git.openjdk.java.net/jdk/pull/3107.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3107/head:pull/3107 PR: https://git.openjdk.java.net/jdk/pull/3107 From david.holmes at oracle.com Sun Mar 21 05:07:34 2021 From: david.holmes at oracle.com (David Holmes) Date: Sun, 21 Mar 2021 15:07:34 +1000 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: Message-ID: <02db4d31-aba4-6113-4c61-0b4ae03759df@oracle.com> On 20/03/2021 10:37 pm, Yasumasa Suenaga wrote: > On Sat, 20 Mar 2021 10:20:50 GMT, Magnus Ihse Bursie wrote: > >>>> I debug'ed fastdebug JDK on Visual Studio, it seems to work `alloca()`: >>> >>> I used cl.exe from Visual Studio 2019 (16.9.1). The generated code might be different by compiler version of course. >> >> @YaSuenag The code generated by debug builds is often significantly different from release builds. You might want to check the latter instead to figure out if this has any effect on what we actually ship. > > @magicus I checked assembly code of release build, it is similar to fastdebug build. The stack (RSP) is expanded. Thanks for confirming - I was going to make the same point about needing to check the product build. Now we need someone who can check clang output. The rough benchmarking I did did not show any benefit to using the alloca on Linux. So I think we can safely say it can go on Linux. The benchmarking on Windows, with alloca removed, did not show any degradation in performance. So I think we can safely say it is okay to remove on Windows too.
The benchmarking on macOS shows no degradation either, but until I know whether the alloca was being elided by clang that may not mean anything. If it is being elided we need to restore it and then check performance again. But I need to know whether the trick that worked on gcc will also work for clang. David ----- > (I confirmed it with Visual Studio 16.9.2 because I received update notification before your reply...) > // Try to randomize the cache line index of hot stack frames. > // This helps when threads of the same stack traces evict each other's > // cache lines. The threads can be either from the same JVM instance, or > // from different JVM instances. The benefit is especially true for > // processors with hyperthreading technology. > static int counter = 0; > int pid = os::current_process_id(); > 00007FF80F4E10ED mov eax,dword ptr [_initial_pid (07FF80F9D8164h)] > 00007FF80F4E10F3 test eax,eax > 00007FF80F4E10F5 jne thread_native_entry+3Dh (07FF80F4E10FDh) > 00007FF80F4E10F7 call qword ptr [__imp__getpid (07FF80F6EF748h)] > _alloca(((pid ^ counter++) & 7) * 128); > 00007FF80F4E10FD mov ecx,dword ptr [counter (07FF80F9D8300h)] > 00007FF80F4E1103 mov edx,ecx > 00007FF80F4E1105 xor edx,eax > 00007FF80F4E1107 inc ecx > 00007FF80F4E1109 mov dword ptr [counter (07FF80F9D8300h)],ecx > 00007FF80F4E110F and edx,7 > 00007FF80F4E1112 shl edx,7 > 00007FF80F4E1115 mov eax,edx > 00007FF80F4E1117 lea rcx,[rdx+0Fh] > 00007FF80F4E111B cmp rcx,rax > 00007FF80F4E111E ja thread_native_entry+6Ah (07FF80F4E112Ah) > 00007FF80F4E1120 mov rcx,0FFFFFFFFFFFFFF0h > 00007FF80F4E112A and rcx,0FFFFFFFFFFFFFFF0h > 00007FF80F4E112E mov rax,rcx > 00007FF80F4E1131 call __chkstk (07FF80F6ED370h) > 00007FF80F4E1136 sub rsp,rcx > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3042 > From thomas.stuefe at gmail.com Sun Mar 21 05:57:33 2021 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Sun, 21 Mar 2021 06:57:33 +0100 Subject: RFR: 8263718: unused-result warning
happens at os_linux.cpp In-Reply-To: <02db4d31-aba4-6113-4c61-0b4ae03759df@oracle.com> References: <02db4d31-aba4-6113-4c61-0b4ae03759df@oracle.com> Message-ID: On Sun, Mar 21, 2021 at 6:08 AM David Holmes wrote: > On 20/03/2021 10:37 pm, Yasumasa Suenaga wrote: > > On Sat, 20 Mar 2021 10:20:50 GMT, Magnus Ihse Bursie > wrote: > > > >>>> I debug'ed fastdebug JDK on Visual Studio, it seems to work > `alloca()`: > >>> > >>> I used cl.exe from Visual Studio 2019 (16.9.1). The generated code > might be different by compiler version of course. > >> > >> @YaSuenag The code generated by debug builds is often significantly > different from release builds. You might want to check the latter instead > to figure out if this has any effect on what we actually ship. > > > > @magicus I checked assembly code of release build, it is similar to > fastdebug build. The stack (RSP) is expanded. > > Thanks for confirming - I was going to make the same point about needing > to check product build. > > Now we need someone who can check clang output. > > The rough benchmarking I did did not show any benefit to using the > alloca on Linux. So I think we can safely say it can go on Linux. > > The benchmarking on Windows, with alloca removed, did not show any > degradation in performance. So I think we can asafely say it is okay to > remove on Windows too. > > The benchmarking on macOS show no degradation either, but until I know > whether the alloca was being elided by clang that may not mean anything. > If it is being elided we need to restore it and then check performance > again. But I need to know the trick that worked on gcc will also work > for clang. > > Did you measure on Alpine too, with muslc? And the XXXBsds? Are we sure we measure the right thing? I wish there were regression tests telling us when to re-apply this optimization. I dislike that this leaves us at the mercy of the underlying libc for something which is reasonably cheap and simple to do (just one alloca). 
..Thomas (Please leave the alloca in the AIX implementation; we currently don't have the cycles to run regression tests for this) > ----- > > > (I confirmed it with Visual Studio 16.9.2 because I received update > notification before your reply...) > > // Try to randomize the cache line index of hot stack frames. > > // This helps when threads of the same stack traces evict each other's > > // cache lines. The threads can be either from the same JVM instance, > or > > // from different JVM instances. The benefit is especially true for > > // processors with hyperthreading technology. > > static int counter = 0; > > int pid = os::current_process_id(); > > 00007FF80F4E10ED mov eax,dword ptr [_initial_pid > (07FF80F9D8164h)] > > 00007FF80F4E10F3 test eax,eax > > 00007FF80F4E10F5 jne thread_native_entry+3Dh (07FF80F4E10FDh) > > 00007FF80F4E10F7 call qword ptr [__imp__getpid (07FF80F6EF748h)] > > _alloca(((pid ^ counter++) & 7) * 128); > > 00007FF80F4E10FD mov ecx,dword ptr [counter (07FF80F9D8300h)] > > 00007FF80F4E1103 mov edx,ecx > > 00007FF80F4E1105 xor edx,eax > > 00007FF80F4E1107 inc ecx > > 00007FF80F4E1109 mov dword ptr [counter (07FF80F9D8300h)],ecx > > 00007FF80F4E110F and edx,7 > > 00007FF80F4E1112 shl edx,7 > > 00007FF80F4E1115 mov eax,edx > > 00007FF80F4E1117 lea rcx,[rdx+0Fh] > > 00007FF80F4E111B cmp rcx,rax > > 00007FF80F4E111E ja thread_native_entry+6Ah (07FF80F4E112Ah) > > 00007FF80F4E1120 mov rcx,0FFFFFFFFFFFFFF0h > > 00007FF80F4E112A and rcx,0FFFFFFFFFFFFFFF0h > > 00007FF80F4E112E mov rax,rcx > > 00007FF80F4E1131 call __chkstk (07FF80F6ED370h) > > 00007FF80F4E1136 sub rsp,rcx > > > > ------------- > > > > PR: https://git.openjdk.java.net/jdk/pull/3042 > > > From ysuenaga at openjdk.java.net Sun Mar 21 06:42:52 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Sun, 21 Mar 2021 06:42:52 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: 
<2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Sat, 20 Mar 2021 12:35:18 GMT, Yasumasa Suenaga wrote: >> @YaSuenag The code generated by debug builds is often significantly different from release builds. You might want to check the latter instead to figure out if this has any effect on what we actually ship. > > @magicus I checked assembly code of release build, it is similar to fastdebug build. The stack (RSP) is expanded. > (I confirmed it with Visual Studio 16.9.2 because I received update notification before your reply...) > // Try to randomize the cache line index of hot stack frames. > // This helps when threads of the same stack traces evict each other's > // cache lines. The threads can be either from the same JVM instance, or > // from different JVM instances. The benefit is especially true for > // processors with hyperthreading technology. 
> static int counter = 0; > int pid = os::current_process_id(); > 00007FF80F4E10ED mov eax,dword ptr [_initial_pid (07FF80F9D8164h)] > 00007FF80F4E10F3 test eax,eax > 00007FF80F4E10F5 jne thread_native_entry+3Dh (07FF80F4E10FDh) > 00007FF80F4E10F7 call qword ptr [__imp__getpid (07FF80F6EF748h)] > _alloca(((pid ^ counter++) & 7) * 128); > 00007FF80F4E10FD mov ecx,dword ptr [counter (07FF80F9D8300h)] > 00007FF80F4E1103 mov edx,ecx > 00007FF80F4E1105 xor edx,eax > 00007FF80F4E1107 inc ecx > 00007FF80F4E1109 mov dword ptr [counter (07FF80F9D8300h)],ecx > 00007FF80F4E110F and edx,7 > 00007FF80F4E1112 shl edx,7 > 00007FF80F4E1115 mov eax,edx > 00007FF80F4E1117 lea rcx,[rdx+0Fh] > 00007FF80F4E111B cmp rcx,rax > 00007FF80F4E111E ja thread_native_entry+6Ah (07FF80F4E112Ah) > 00007FF80F4E1120 mov rcx,0FFFFFFFFFFFFFF0h > 00007FF80F4E112A and rcx,0FFFFFFFFFFFFFFF0h > 00007FF80F4E112E mov rax,rcx > 00007FF80F4E1131 call __chkstk (07FF80F6ED370h) > 00007FF80F4E1136 sub rsp,rcx > Did you measure on Alpine too, with muslc? And the XXXBsds? Are we sure we > measure the right thing? I wish there were regression tests telling us when > to re-apply this optimization. I think we can decide by whether `alloca()` or equivalent stack operation is contained in current binary. If current binary does not contain it like JDK 17 Linux x64, we can remove it because it already does not work - performance degradation will not happen. In its context, I think Alpine + musl libc is also ok to remove it because I confirmed JDK 17 for Alpine x64 (from jdk.java.net) does not contain stack operation same as x64. 
0000000000b889b0 : b889b0: 55 push %rbp b889b1: 48 89 e5 mov %rsp,%rbp b889b4: 41 56 push %r14 b889b6: 41 55 push %r13 b889b8: 49 89 fd mov %rdi,%r13 b889bb: 41 54 push %r12 b889bd: 53 push %rbx b889be: e8 6d a5 1a 00 callq d32f30 b889c3: e8 e8 fb 6a ff callq 2385b0 b889c8: 4c 89 ef mov %r13,%rdi b889cb: 83 05 b6 98 61 00 01 addl $0x1,0x6198b6(%rip) # 11a2288 b889d2: e8 f9 a4 1a 00 callq d32ed0 b889d7: 49 8b 9d 68 02 00 00 mov 0x268(%r13),%rbx b889de: 31 c0 xor %eax,%eax > (Please leave the alloca in the AIX implementation; we currently don't have > the cycles to run regression tests for this) Ok, I got it. ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From thomas.stuefe at gmail.com Sun Mar 21 07:22:14 2021 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Sun, 21 Mar 2021 08:22:14 +0100 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> Message-ID: On Sun, Mar 21, 2021 at 7:43 AM Yasumasa Suenaga wrote: > On Sat, 20 Mar 2021 12:35:18 GMT, Yasumasa Suenaga > wrote: > > >> @YaSuenag The code generated by debug builds is often significantly > different from release builds. You might want to check the latter instead > to figure out if this has any effect on what we actually ship. > > > > @magicus I checked assembly code of release build, it is similar to > fastdebug build. The stack (RSP) is expanded. > > (I confirmed it with Visual Studio 16.9.2 because I received update > notification before your reply...) > > // Try to randomize the cache line index of hot stack frames. > > // This helps when threads of the same stack traces evict each other's > > // cache lines. 
The threads can be either from the same JVM instance, > or > > // from different JVM instances. The benefit is especially true for > > // processors with hyperthreading technology. > > static int counter = 0; > > int pid = os::current_process_id(); > > 00007FF80F4E10ED mov eax,dword ptr [_initial_pid > (07FF80F9D8164h)] > > 00007FF80F4E10F3 test eax,eax > > 00007FF80F4E10F5 jne thread_native_entry+3Dh (07FF80F4E10FDh) > > 00007FF80F4E10F7 call qword ptr [__imp__getpid > (07FF80F6EF748h)] > > _alloca(((pid ^ counter++) & 7) * 128); > > 00007FF80F4E10FD mov ecx,dword ptr [counter (07FF80F9D8300h)] > > 00007FF80F4E1103 mov edx,ecx > > 00007FF80F4E1105 xor edx,eax > > 00007FF80F4E1107 inc ecx > > 00007FF80F4E1109 mov dword ptr [counter (07FF80F9D8300h)],ecx > > 00007FF80F4E110F and edx,7 > > 00007FF80F4E1112 shl edx,7 > > 00007FF80F4E1115 mov eax,edx > > 00007FF80F4E1117 lea rcx,[rdx+0Fh] > > 00007FF80F4E111B cmp rcx,rax > > 00007FF80F4E111E ja thread_native_entry+6Ah (07FF80F4E112Ah) > > 00007FF80F4E1120 mov rcx,0FFFFFFFFFFFFFF0h > > 00007FF80F4E112A and rcx,0FFFFFFFFFFFFFFF0h > > 00007FF80F4E112E mov rax,rcx > > 00007FF80F4E1131 call __chkstk (07FF80F6ED370h) > > 00007FF80F4E1136 sub rsp,rcx > > > Did you measure on Alpine too, with muslc? And the XXXBsds? Are we sure > we > > measure the right thing? I wish there were regression tests telling us > when > > to re-apply this optimization. > > I think we can decide by whether `alloca()` or equivalent stack operation > is contained in current binary. > If current binary does not contain it like JDK 17 Linux x64, we can remove > it because it already does not work - performance degradation will not > happen. > > How do you know this? We may have had a performance degradation since years without noticing. The question is, if a performance optimization, considered necessary in the past, did bitrot, should we remove or repair it. 
In its context, I think Alpine + musl libc is also ok to remove it because > I confirmed JDK 17 for Alpine x64 (from jdk.java.net) does not contain > stack operation same as x64. > > 0000000000b889b0 : > b889b0: 55 push %rbp > b889b1: 48 89 e5 mov %rsp,%rbp > b889b4: 41 56 push %r14 > b889b6: 41 55 push %r13 > b889b8: 49 89 fd mov %rdi,%r13 > b889bb: 41 54 push %r12 > b889bd: 53 push %rbx > b889be: e8 6d a5 1a 00 callq d32f30 > > b889c3: e8 e8 fb 6a ff callq 2385b0 > b889c8: 4c 89 ef mov %r13,%rdi > b889cb: 83 05 b6 98 61 00 01 addl $0x1,0x6198b6(%rip) > # 11a2288 > b889d2: e8 f9 a4 1a 00 callq d32ed0 > > b889d7: 49 8b 9d 68 02 00 00 mov 0x268(%r13),%rbx > b889de: 31 c0 xor %eax,%eax > > which does not prove it is not beneficial, it just proves it is not there. > > (Please leave the alloca in the AIX implementation; we currently don't > have > > the cycles to run regression tests for this) > > Ok, I got it. > > Cheers, Thomas > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3042 > From david.holmes at oracle.com Sun Mar 21 13:25:38 2021 From: david.holmes at oracle.com (David Holmes) Date: Sun, 21 Mar 2021 23:25:38 +1000 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <02db4d31-aba4-6113-4c61-0b4ae03759df@oracle.com> Message-ID: <2fdd402a-170a-f60a-be74-de22c4f049d3@oracle.com> Hi Thomas, On 21/03/2021 3:57 pm, Thomas Stüfe wrote: > On Sun, Mar 21, 2021 at 6:08 AM David Holmes > wrote: > > On 20/03/2021 10:37 pm, Yasumasa Suenaga wrote: > > On Sat, 20 Mar 2021 10:20:50 GMT, Magnus Ihse Bursie > > wrote: > > > >>>> I debug'ed fastdebug JDK on Visual Studio, it seems to work > `alloca()`: > >>> > >>> I used cl.exe from Visual Studio 2019 (16.9.1). The generated > code might be different by compiler version of course. > >> > >> @YaSuenag The code generated by debug builds is often > significantly different from release builds.
You might want to check > the latter instead to figure out if this has any effect on what we > actually ship. > > > > @magicus I checked assembly code of release build, it is similar > to fastdebug build. The stack (RSP) is expanded. > > Thanks for confirming - I was going to make the same point about > needing > to check product build. > > Now we need someone who can check clang output. > > The rough benchmarking I did did not show any benefit to using the > alloca on Linux. So I think we can safely say it can go on Linux. > > The benchmarking on Windows, with alloca removed, did not show any > degradation in performance. So I think we can safely say it is okay to > remove on Windows too. > > The benchmarking on macOS shows no degradation either, but until I know > whether the alloca was being elided by clang that may not mean > anything. > If it is being elided we need to restore it and then check performance > again. But I need to know whether the trick that worked on gcc will also work > for clang. > > > Did you measure on Alpine too, with muslc? And the XXXBsds? Are we sure > we measure the right thing? I wish there were regression tests telling > us when to re-apply this optimization. No, no, no, and me too. It is really frustrating to have an optimization like this put in place and absolutely zero information as to how the purported benefit of it was measured and under what conditions. > I dislike that this leaves us at the mercy of the underlying libc for > something which is reasonably cheap and simple to do (just one alloca). Not sure what you mean. We have no idea if the alloca is helpful or harmful nor whether that depends on whether libc does, or does not, do something "clever" itself, nor whether the OS (e.g. via ASLR) does, or does not do, something "clever" itself. And even if we do want the alloca we have to fight the compiler to ensure it gets used as intended!
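For reference, a minimal sketch of the idiom being debated. This is not the HotSpot code: `stack_jitter_bytes` and `run_with_stack_jitter` are hypothetical names, and the volatile store is only one assumed way of giving the allocation an observable use so that the compiler is less likely to drop it (the thread does not say which trick worked on gcc). It also assumes a platform that provides `<alloca.h>`, such as Linux or macOS; MSVC spells it `_alloca` in `<malloc.h>`.

```cpp
#include <alloca.h>   // alloca(); assumed available (glibc, musl, macOS)
#include <cassert>
#include <cstddef>

// Hypothetical helper. Same arithmetic as the quoted snippet,
// ((pid ^ counter) & 7) * 128: one of eight offsets, 0, 128, ..., 896 bytes.
inline std::size_t stack_jitter_bytes(int pid, int counter) {
  return static_cast<std::size_t>((pid ^ counter) & 7) * 128;
}

// Hypothetical wrapper: pad the stack by the computed amount, then run body().
template <typename Fn>
auto run_with_stack_jitter(int pid, int counter, Fn body) -> decltype(body()) {
  const std::size_t n = stack_jitter_bytes(pid, counter);
  assert(n <= 896);
  // +1 so there is always at least one byte to touch, even when n == 0.
  volatile char* pad = static_cast<volatile char*>(alloca(n + 1));
  pad[0] = 0;  // store through a volatile pointer: an observable use of the block
  return body();
}
```

A caller would pass the thread's real work as `body`, for example `run_with_stack_jitter(pid, counter++, [&] { thread->run(); })`. Whether the padding really ends up above the frames of interest still has to be confirmed in the disassembly of the product build, as was done for the Windows and Linux binaries in this thread.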
Cheers, David > ..Thomas > > (Please leave the alloca in the AIX implementation; we currently don't > have the cycles to run regression tests for this) > > ----- > > > (I confirmed it with Visual Studio 16.9.2 because I received > update notification before your reply...) > > // Try to randomize the cache line index of hot stack frames. > > // This helps when threads of the same stack traces evict each > other's > > // cache lines. The threads can be either from the same JVM > instance, or > > // from different JVM instances. The benefit is especially > true for > > // processors with hyperthreading technology. > > static int counter = 0; > > int pid = os::current_process_id(); > > 00007FF80F4E10ED mov eax,dword ptr [_initial_pid > (07FF80F9D8164h)] > > 00007FF80F4E10F3 test eax,eax > > 00007FF80F4E10F5 jne thread_native_entry+3Dh > (07FF80F4E10FDh) > > 00007FF80F4E10F7 call qword ptr [__imp__getpid > (07FF80F6EF748h)] > > _alloca(((pid ^ counter++) & 7) * 128); > > 00007FF80F4E10FD mov ecx,dword ptr [counter > (07FF80F9D8300h)] > > 00007FF80F4E1103 mov edx,ecx > > 00007FF80F4E1105 xor edx,eax > > 00007FF80F4E1107 inc ecx > > 00007FF80F4E1109 mov dword ptr [counter > (07FF80F9D8300h)],ecx > > 00007FF80F4E110F and edx,7 > > 00007FF80F4E1112 shl edx,7 > > 00007FF80F4E1115 mov eax,edx > > 00007FF80F4E1117 lea rcx,[rdx+0Fh] > > 00007FF80F4E111B cmp rcx,rax > > 00007FF80F4E111E ja thread_native_entry+6Ah > (07FF80F4E112Ah) > > 00007FF80F4E1120 mov rcx,0FFFFFFFFFFFFFF0h > > 00007FF80F4E112A and rcx,0FFFFFFFFFFFFFFF0h > > 00007FF80F4E112E mov rax,rcx > > 00007FF80F4E1131 call __chkstk (07FF80F6ED370h) > > 00007FF80F4E1136 sub rsp,rcx > > > > ------------- > > > > PR: https://git.openjdk.java.net/jdk/pull/3042 > > > >
From david.holmes at oracle.com Sun Mar 21 13:30:15 2021 From: david.holmes at oracle.com (David Holmes) Date: Sun, 21 Mar 2021 23:30:15 +1000 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: Message-ID: <59beca9b-c30f-635f-80d0-de24d7db8b2e@oracle.com> On 21/03/2021 5:22 pm, Thomas Stüfe wrote: > On Sun, Mar 21, 2021 at 7:43 AM Yasumasa Suenaga > wrote: > >> On Sat, 20 Mar 2021 12:35:18 GMT, Yasumasa Suenaga >> wrote: >> >>>> @YaSuenag The code generated by debug builds is often significantly >> different from release builds. You might want to check the latter instead >> to figure out if this has any effect on what we actually ship. >>> >>> @magicus I checked assembly code of release build, it is similar to >> fastdebug build. The stack (RSP) is expanded. >>> (I confirmed it with Visual Studio 16.9.2 because I received update >> notification before your reply...) >>> // Try to randomize the cache line index of hot stack frames. >>> // This helps when threads of the same stack traces evict each other's >>> // cache lines. The threads can be either from the same JVM instance, >> or >>> // from different JVM instances. The benefit is especially true for >>> // processors with hyperthreading technology.
>>> static int counter = 0; >>> int pid = os::current_process_id(); >>> 00007FF80F4E10ED mov eax,dword ptr [_initial_pid >> (07FF80F9D8164h)] >>> 00007FF80F4E10F3 test eax,eax >>> 00007FF80F4E10F5 jne thread_native_entry+3Dh (07FF80F4E10FDh) >>> 00007FF80F4E10F7 call qword ptr [__imp__getpid >> (07FF80F6EF748h)] >>> _alloca(((pid ^ counter++) & 7) * 128); >>> 00007FF80F4E10FD mov ecx,dword ptr [counter (07FF80F9D8300h)] >>> 00007FF80F4E1103 mov edx,ecx >>> 00007FF80F4E1105 xor edx,eax >>> 00007FF80F4E1107 inc ecx >>> 00007FF80F4E1109 mov dword ptr [counter (07FF80F9D8300h)],ecx >>> 00007FF80F4E110F and edx,7 >>> 00007FF80F4E1112 shl edx,7 >>> 00007FF80F4E1115 mov eax,edx >>> 00007FF80F4E1117 lea rcx,[rdx+0Fh] >>> 00007FF80F4E111B cmp rcx,rax >>> 00007FF80F4E111E ja thread_native_entry+6Ah (07FF80F4E112Ah) >>> 00007FF80F4E1120 mov rcx,0FFFFFFFFFFFFFF0h >>> 00007FF80F4E112A and rcx,0FFFFFFFFFFFFFFF0h >>> 00007FF80F4E112E mov rax,rcx >>> 00007FF80F4E1131 call __chkstk (07FF80F6ED370h) >>> 00007FF80F4E1136 sub rsp,rcx >> >>> Did you measure on Alpine too, with muslc? And the XXXBsds? Are we sure >> we >>> measure the right thing? I wish there were regression tests telling us >> when >>> to re-apply this optimization. >> >> I think we can decide by whether `alloca()` or equivalent stack operation >> is contained in current binary. >> If current binary does not contain it like JDK 17 Linux x64, we can remove >> it because it already does not work - performance degradation will not >> happen. >> >> > How do you know this? We may have had a performance degradation since years > without noticing. The question is, if a performance optimization, > considered necessary in the past, did bitrot, should we remove or repair it. > > In its context, I think Alpine + musl libc is also ok to remove it because >> I confirmed JDK 17 for Alpine x64 (from jdk.java.net) does not contain >> stack operation same as x64. 
>> >> 0000000000b889b0 : >> b889b0: 55 push %rbp >> b889b1: 48 89 e5 mov %rsp,%rbp >> b889b4: 41 56 push %r14 >> b889b6: 41 55 push %r13 >> b889b8: 49 89 fd mov %rdi,%r13 >> b889bb: 41 54 push %r12 >> b889bd: 53 push %rbx >> b889be: e8 6d a5 1a 00 callq d32f30 >> >> b889c3: e8 e8 fb 6a ff callq 2385b0 >> b889c8: 4c 89 ef mov %r13,%rdi >> b889cb: 83 05 b6 98 61 00 01 addl $0x1,0x6198b6(%rip) >> # 11a2288 >> b889d2: e8 f9 a4 1a 00 callq d32ed0 >> >> b889d7: 49 8b 9d 68 02 00 00 mov 0x268(%r13),%rbx >> b889de: 31 c0 xor %eax,%eax >> >> > which does not prove it is not beneficial, it just proves it is not there. Right. It may be that it is of no benefit on glibc because glibc already does something clever here. But musl may not, in which case repairing the alloca call may improve performance on musl. This is frustrating because on the one hand there is no point fixing a warning for something the compiler elides anyway; while on the other we can't even know if this quirky optimisation would be of benefit under some conditions. And we don't know how to try and measure it regardless. :( David ----- > >>> (Please leave the alloca in the AIX implementation; we currently don't >> have >>> the cycles to run regression tests for this) >> >> Ok, I got it. 
>> >> > Cheers, Thomas > > >> ------------- >> >> PR: https://git.openjdk.java.net/jdk/pull/3042 >> From thomas.stuefe at gmail.com Sun Mar 21 15:45:39 2021 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Sun, 21 Mar 2021 16:45:39 +0100 Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: <59beca9b-c30f-635f-80d0-de24d7db8b2e@oracle.com> References: <59beca9b-c30f-635f-80d0-de24d7db8b2e@oracle.com> Message-ID: On Sun, Mar 21, 2021 at 2:30 PM David Holmes wrote: > On 21/03/2021 5:22 pm, Thomas St?fe wrote: > > On Sun, Mar 21, 2021 at 7:43 AM Yasumasa Suenaga < > ysuenaga at openjdk.java.net> > > wrote: > > > >> On Sat, 20 Mar 2021 12:35:18 GMT, Yasumasa Suenaga < > ysuenaga at openjdk.org> > >> wrote: > >> > >>>> @YaSuenag The code generated by debug builds is often significantly > >> different from release builds. You might want to check the latter > instead > >> to figure out if this has any effect on what we actually ship. > >>> > >>> @magicus I checked assembly code of release build, it is similar to > >> fastdebug build. The stack (RSP) is expanded. > >>> (I confirmed it with Visual Studio 16.9.2 because I received update > >> notification before your reply...) > >>> // Try to randomize the cache line index of hot stack frames. > >>> // This helps when threads of the same stack traces evict each > other's > >>> // cache lines. The threads can be either from the same JVM > instance, > >> or > >>> // from different JVM instances. The benefit is especially true for > >>> // processors with hyperthreading technology. 
> >>> static int counter = 0; > >>> int pid = os::current_process_id(); > >>> 00007FF80F4E10ED mov eax,dword ptr [_initial_pid > >> (07FF80F9D8164h)] > >>> 00007FF80F4E10F3 test eax,eax > >>> 00007FF80F4E10F5 jne thread_native_entry+3Dh (07FF80F4E10FDh) > >>> 00007FF80F4E10F7 call qword ptr [__imp__getpid > >> (07FF80F6EF748h)] > >>> _alloca(((pid ^ counter++) & 7) * 128); > >>> 00007FF80F4E10FD mov ecx,dword ptr [counter (07FF80F9D8300h)] > >>> 00007FF80F4E1103 mov edx,ecx > >>> 00007FF80F4E1105 xor edx,eax > >>> 00007FF80F4E1107 inc ecx > >>> 00007FF80F4E1109 mov dword ptr [counter (07FF80F9D8300h)],ecx > >>> 00007FF80F4E110F and edx,7 > >>> 00007FF80F4E1112 shl edx,7 > >>> 00007FF80F4E1115 mov eax,edx > >>> 00007FF80F4E1117 lea rcx,[rdx+0Fh] > >>> 00007FF80F4E111B cmp rcx,rax > >>> 00007FF80F4E111E ja thread_native_entry+6Ah (07FF80F4E112Ah) > >>> 00007FF80F4E1120 mov rcx,0FFFFFFFFFFFFFF0h > >>> 00007FF80F4E112A and rcx,0FFFFFFFFFFFFFFF0h > >>> 00007FF80F4E112E mov rax,rcx > >>> 00007FF80F4E1131 call __chkstk (07FF80F6ED370h) > >>> 00007FF80F4E1136 sub rsp,rcx > >> > >>> Did you measure on Alpine too, with muslc? And the XXXBsds? Are we sure > >> we > >>> measure the right thing? I wish there were regression tests telling us > >> when > >>> to re-apply this optimization. > >> > >> I think we can decide by whether `alloca()` or equivalent stack > operation > >> is contained in current binary. > >> If current binary does not contain it like JDK 17 Linux x64, we can > remove > >> it because it already does not work - performance degradation will not > >> happen. > >> > >> > > How do you know this? We may have had a performance degradation since > years > > without noticing. The question is, if a performance optimization, > > considered necessary in the past, did bitrot, should we remove or repair > it. 
> > > > In its context, I think Alpine + musl libc is also ok to remove it > because > >> I confirmed JDK 17 for Alpine x64 (from jdk.java.net) does not contain > >> stack operation same as x64. > >> > >> 0000000000b889b0 : > >> b889b0: 55 push %rbp > >> b889b1: 48 89 e5 mov %rsp,%rbp > >> b889b4: 41 56 push %r14 > >> b889b6: 41 55 push %r13 > >> b889b8: 49 89 fd mov %rdi,%r13 > >> b889bb: 41 54 push %r12 > >> b889bd: 53 push %rbx > >> b889be: e8 6d a5 1a 00 callq d32f30 > >> > >> b889c3: e8 e8 fb 6a ff callq 2385b0 > >> b889c8: 4c 89 ef mov %r13,%rdi > >> b889cb: 83 05 b6 98 61 00 01 addl $0x1,0x6198b6(%rip) > >> # 11a2288 > >> b889d2: e8 f9 a4 1a 00 callq d32ed0 > >> > >> b889d7: 49 8b 9d 68 02 00 00 mov 0x268(%r13),%rbx > >> b889de: 31 c0 xor %eax,%eax > >> > >> > > which does not prove it is not beneficial, it just proves it is not > there. > > Right. It may be that it is of no benefit on glibc because glibc already > does something clever here. But musl may not, in which case repairing > the alloca call may improve performance on musl. > > This is frustrating because on the one hand there is no point fixing a > warning for something the compiler elides anyway; while on the other we > can't even know if this quirky optimisation would be of benefit under > some conditions. > > And we don't know how to try and measure it regardless. :( > > Sigh. This feels like cargo cult programming. This is really frustrating. ..Thomas David > ----- > > > > >>> (Please leave the alloca in the AIX implementation; we currently don't > >> have > >>> the cycles to run regression tests for this) > >> > >> Ok, I got it. 
> >> > >> > > Cheers, Thomas > > > > > >> ------------- > >> > >> PR: https://git.openjdk.java.net/jdk/pull/3042 > >> > From kbarrett at openjdk.java.net Mon Mar 22 04:42:40 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 22 Mar 2021 04:42:40 GMT Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2] In-Reply-To: References: Message-ID: On Tue, 16 Mar 2021 08:32:09 GMT, Amit Pawar wrote: > [https://bugs.openjdk.java.net/browse/JDK-8254699](JDK-8254699) contains test results in XL file to show PreTouchParallelChunkSize was recently changed from 1GB to 4MB on Linux after testing various sizes. I have downloaded the same XL file and same is updated for Oldgen case during resize and it gives some rough idea about the improvement for this fix and follow up fix. Please check "PretouchOldgenDuringResize" sheet for "Co-operative Fix" and "Adaptive Resize Fix" columns. > > [PreTouchParallelChunkSize_TestResults_UpdatedForOldGenCase.xlsx](https://github.com/openjdk/jdk/files/6147180/PreTouchParallelChunkSize_TestResults_UpdatedForOldGenCase.xlsx) I'm not sure how to interpret this. In particular, it says "Test done using November 27rd Openjdk build". That predates a number of recent changes in this area that seem relevant. Is that correct? Or is that stale and the data has been updated for a newer baseline. Also, can you provide details about how this information is generated and collected? I would like to reproduce the results and do some other experiments. I think that, as proposed, the change is making already messy code still more messy, and some refactoring is needed. I don't know what the details of that might look like yet; I'm still exploring and poking at the code. I also need to look at the corresponding part of G1 that you mentioned. > The "Adaptive Resize Fix" column in the sheet is for next suggested fix and may possibly help to improve further. 
For server JVM, expansion size of 512KB, 2MB (hugepages) and 64MB looks good for first resize but later needs some attention I think. JVM flag "MinHeapDeltaBytes" needs to be known by the user and need to set it upfront. I think this can be consider for first resize in every GC and later dynamically go for higher size like double the previous size to adopt to application nature. This way it may help to reduce the GC pause time during the expansion. I thought to share my observation and my understanding could be wrong. So please check and suggest. I think something like this has a serious risk of growing the oldgen a lot more than needed, which may have serious downsides. ------------- PR: https://git.openjdk.java.net/jdk/pull/2976 From tobias.hartmann at oracle.com Mon Mar 22 07:01:40 2021 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Mon, 22 Mar 2021 08:01:40 +0100 Subject: Result: New HotSpot Group Member: Christian Hagedorn Message-ID: <612ff140-1838-6e6d-2bbf-b080af4685fc@oracle.com> The vote for Christian Hagedorn [1] is now closed. Yes: 9 Veto: 0 Abstain: 0 According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination. Best regards, Tobias [1] https://mail.openjdk.java.net/pipermail/hotspot-dev/2021-March/049618.html From akozlov at openjdk.java.net Mon Mar 22 12:50:14 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 22 Mar 2021 12:50:14 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v29] In-Reply-To: References: Message-ID: > Please review the implementation of JEP 391: macOS/AArch64 Port. > > It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. 
> > Major changes are in: > * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) > * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) > * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. > * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 115 commits: - Merge branch 'master' into jdk-macos - JDK-8262491: bsd_aarch64 part - JDK-8263002: bsd_aarch64 part - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos - Wider #ifdef block - Fix most of issues in java/foreign/ tests Failures related to va_args are tracked in JDK-8263512. - Add Azul copyright - Update Oracle copyright years - Use Thread::current_or_null_safe in SafeFetch - 8262903: [macos_aarch64] Thread::current() called on detached thread - ... 
and 105 more: https://git.openjdk.java.net/jdk/compare/a9d2267f...5add9269 ------------- Changes: https://git.openjdk.java.net/jdk/pull/2200/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2200&range=28 Stats: 2947 lines in 75 files changed: 2838 ins; 27 del; 82 mod Patch: https://git.openjdk.java.net/jdk/pull/2200.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2200/head:pull/2200 PR: https://git.openjdk.java.net/jdk/pull/2200 From lucy at openjdk.java.net Mon Mar 22 14:55:01 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Mon, 22 Mar 2021 14:55:01 GMT Subject: RFR: 8263260: [s390] Support latest hardware (z14 and z15) [v2] In-Reply-To: References: Message-ID: > 8263260: [s390] Support latest hardware (z14 and z15) Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: Changes as requested by TheRealMDoerr ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2918/files - new: https://git.openjdk.java.net/jdk/pull/2918/files/c8dcfb44..1ab0d054 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2918&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2918&range=00-01 Stats: 58 lines in 2 files changed: 24 ins; 16 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/2918.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2918/head:pull/2918 PR: https://git.openjdk.java.net/jdk/pull/2918 From coleenp at openjdk.java.net Mon Mar 22 15:03:58 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 22 Mar 2021 15:03:58 GMT Subject: RFR: 8263974: Move SystemDictionary::verify_protection_domain Message-ID: Please review this mostly trivial fix to move SystemDictionary::validate_protection_domain into Dictionary and hide the functions in Dictionary that it calls. This change also removes some #include dictionary.hpp and a TRAPS parameter where not needed. See CR for more details. 
This function was in the middle of others that I want to keep together in systemDictionary.cpp. Tested with tier1 on 4 Oracle supported platforms. ------------- Commit messages: - move some functions back, delete equals function. - 8263974: Move SystemDictionary::verify_protection_domain Changes: https://git.openjdk.java.net/jdk/pull/3120/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3120&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263974 Stats: 175 lines in 8 files changed: 80 ins; 85 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/3120.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3120/head:pull/3120 PR: https://git.openjdk.java.net/jdk/pull/3120 From mdoerr at openjdk.java.net Mon Mar 22 15:12:49 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 22 Mar 2021 15:12:49 GMT Subject: RFR: 8263260: [s390] Support latest hardware (z14 and z15) [v2] In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 14:55:01 GMT, Lutz Schmidt wrote: >> 8263260: [s390] Support latest hardware (z14 and z15) > > Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: > > Changes as requested by TheRealMDoerr Thanks for cleaning up! LGTM. ------------- Marked as reviewed by mdoerr (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2918 From lucy at openjdk.java.net Mon Mar 22 15:12:50 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Mon, 22 Mar 2021 15:12:50 GMT Subject: RFR: 8263260: [s390] Support latest hardware (z14 and z15) [v2] In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 15:09:05 GMT, Martin Doerr wrote: >> Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: >> >> Changes as requested by TheRealMDoerr > > Thanks for cleaning up! LGTM. Goetz and Martin, thanks a lot for your reviews. 
Lutz > src/hotspot/cpu/s390/vm_version_s390.cpp line 54: > >> 52: static const char* z_name[] = {" ", "z900", "z990", "z9 EC", "z10 EC", "z196 EC", "ec12", "z13", "z14", "z15" }; >> 53: static const char* z_WDFM[] = {" ", "2006-06-30", "2008-06-30", "2010-06-30", "2012-06-30", "2014-06-30", "2016-12-31", "2019-06-30", "2021-06-30", "tbd" }; >> 54: static const char* z_EOS[] = {" ", "2014-12-31", "2014-12-31", "2017-10-31", "2019-12-31", "2021-12-31", "tbd", "tbd", "tbd", "tbd" }; > > Table provides a nice overview, but seems like only z_name is used in the code. The rest only serves as comments. You are right, as of now, the tables are mostly for documentation. z_EOS is used now to fill _features_string. I'd like to keep the tables as they are for possible future use. > src/hotspot/cpu/s390/vm_version_s390.cpp line 309: > >> 307: } >> 308: if (is_z9()) { >> 309: _features_string = "system-z, g3-z9, ldisp_fast, extimm, out-of-support_as_of_2016-04-01"; > > How does this relate to the table above? Changed such that End-of-Support information is taken from table z_EOS[]. > src/hotspot/cpu/s390/vm_version_s390.hpp line 48: > >> 46: // z13: 2015-03 >> 47: // z14: 2017-09 >> 48: // z15: 2019-09 > > How does this relate to the table in the .cpp file? I'd prefer to have such kind of information consolidated at one place. Consolidated information in vm_version_s390.cpp. ------------- PR: https://git.openjdk.java.net/jdk/pull/2918 From coleenp at openjdk.java.net Mon Mar 22 15:55:58 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 22 Mar 2021 15:55:58 GMT Subject: RFR: 8263976: Remove block allocation from BasicHashtable Message-ID: From CR: The useful/general BasicHashtable uses a block allocation scheme to reportedly reduce fragmentation. When the StringTable and SymbolTable used to use this hashtable, performance benefits were reportedly observed because of the block allocation scheme.
Since these tables were moved to the concurrent hashtables, the tables left that use the block allocation scheme are: AdapterHandlerLibrary, ResolutionError, LoaderConstraints, Leak profiler bitset table and Placeholders. 3 of these tables are very small and never needed block allocation to prevent fragmentation at least. Also there are 3 KVHashtables, which are built from BasicHashtable. 2 are used during dumping and 1 is ID2KlassTable which appears small. ModuleEntry, PackageEntry, Dictionary, G1RootSet for nmethods, and JvmtiTagMap tables didn't use the block allocation scheme. Removing this removes 7 pointers per table, and for each ClassLoaderData, which has 3 tables, removes 21 pointers. This change was performance tested on linux and windows. It was also tested with tier1-6. ------------- Commit messages: - 8263976: Remove block allocation from BasicHashtable Changes: https://git.openjdk.java.net/jdk/pull/3123/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3123&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263976 Stats: 169 lines in 16 files changed: 17 ins; 122 del; 30 mod Patch: https://git.openjdk.java.net/jdk/pull/3123.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3123/head:pull/3123 PR: https://git.openjdk.java.net/jdk/pull/3123 From coleenp at openjdk.java.net Mon Mar 22 16:56:41 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 22 Mar 2021 16:56:41 GMT Subject: RFR: 8263976: Remove block allocation from BasicHashtable In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 15:49:24 GMT, Coleen Phillimore wrote: > From CR: > The useful/general BasicHashtable uses a block allocation scheme to reportedly reduce fragmentation. When the StringTable and SymbolTable used to use this hashtable, performance benefits were reportedly observed because of the block allocation scheme. 
Since these tables were moved to the concurrent hashtables, the tables left that use the block allocation scheme are: > > AdapterHandlerLibrary, ResolutionError, LoaderConstraints, Leak profiler bitset table and Placeholders. 3 of these tables are very small and never needed block allocation to prevent fragmentation at least. Also there are 3 KVHashtables, which are built from BasicHashtable. 2 are used during dumping and 1 is ID2KlassTable which appears small. > > ModuleEntry, PackageEntry, Dictionary, G1RootSet for nmethods, and JvmtiTagMap tables didn't use the block allocation scheme. > > Removing this removes 7 pointers per table, and for each ClassLoaderData, which has 3 tables, removes 21 pointers. > > This change was performance tested on linux and windows. > > It was also tested with tier1-6. I forgot to add that we can eventually replace these tables with std::unordered_set once the allocation and other template parameters are decided on. There are also other cleanups that we could do with this code, ie. Hashtable isn't that different from BasicHashtable so really isn't needed. We could make the Entry type a template parameter. This change is only a step in this direction. ------------- PR: https://git.openjdk.java.net/jdk/pull/3123 From github.com+71302734+amitdpawar at openjdk.java.net Mon Mar 22 17:42:40 2021 From: github.com+71302734+amitdpawar at openjdk.java.net (Amit Pawar) Date: Mon, 22 Mar 2021 17:42:40 GMT Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2] In-Reply-To: References: Message-ID: <8hmX1slrUPMWTec3K6Z1xSV7D6JEtbmLEox0JNZ93xQ=.66ea31cd-f702-4653-8a12-01cea1d50843@github.com> On Mon, 22 Mar 2021 04:39:48 GMT, Kim Barrett wrote: >>> >>> >>> _Mailing list message from [Kim Barrett](mailto:kim.barrett at oracle.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ >>> >>> > On Mar 12, 2021, at 3:01 PM, Amit Pawar wrote: >>> > In case of ParallelGC, oldgen expansion can happen during promotion. 
Expanding thread will touch the pages and can't request for task execution as this GC thread is already executing a task. The expanding thread holds the lock on "ExpandHeap_lock" to resize the oldgen and other threads may wait for their turn. This is a blocking call. >>> > This patch changes this behavior by adding another constructor in "MutexLocker" class to enable non blocking or try_lock operation. This way one thread will acquire the lock and other threads can join pretouch work. Threads failed to acquire the lock will join pretouch only when task is marked ready by expanding thread. >>> > Following minimum expansion size are seen during expansion. >>> > 1. 512KB without largepages and without UseNUMA. >>> > 2. 64MB without largepages and with UseNUMA, >>> > 3. 2MB (on x86) with large pages and without UseNUMA, >>> > 4. 64MB without large pages and with UseNUMA. >>> > When Oldgen is expanding repeatedly with smaller size then this change wont help. For such cases, resize size should adapt to application demand to make use of this change. For example if application nature triggers 100 expansion with smaller sizes in same GC then it is better to increase the expansion size during each resize to reduce the number of resizes. If this patch is accepted then will plan to fix this case in another patch. >>> >>> Sorry, but a change like this needs better motivation. What you say >>> above suggests this change doesn't actually help. >>> >>> It's intentional that oldgen expansions aren't generally large, as the >>> oldgen shouldn't be grown unnecessarily. There are already parameters >>> such as MinHeapDeltaBytes to control and manipulate this. >>> >>> It is also preferable to complete an expansion request quickly to make >>> the additional space available to other threads in the main allocation >>> path, rather than making them go to the expand path. 
Making expansions >>> larger could force more threads to take the slower expand path, which >>> doesn't seem like a win even if they then help with the pretouch part >>> of another thread's expansion. (And that also assumes UsePreTouch is >>> even enabled.) >>> >>> So the followup change that you say is needed to make this one >>> profitable seems questionable. >>> >>> The proposed change is also surprisingly large and intrusive for >>> something that seems like it should be very localized. >>> >>> > Jtreg all test passed. >>> >>> A change like this needs a lot more testing than that, both functionally >>> and performance. >> >> [https://bugs.openjdk.java.net/browse/JDK-8254699](JDK-8254699) contains test results in XL file to show PreTouchParallelChunkSize was recently changed from 1GB to 4MB on Linux after testing various sizes. I have downloaded the same XL file and same is updated for Oldgen case during resize and it gives some rough idea about the improvement for this fix and follow up fix. Please check "PretouchOldgenDuringResize" sheet for "Co-operative Fix" and "Adaptive Resize Fix" columns. >> >> [PreTouchParallelChunkSize_TestResults_UpdatedForOldGenCase.xlsx](https://github.com/openjdk/jdk/files/6147180/PreTouchParallelChunkSize_TestResults_UpdatedForOldGenCase.xlsx) >> >> Running SPECJbb composite shows 30-40% reduction in GC pause time when old-gen expands upto ~1GB for UseNUMA case (I tested for minimum oldgen size to trigger the resize). Non UseNUMA case will show improvement only when resize expands more than minimum "PreTouchParallelChunkSize" size to let other thread participate in pretouch work. Two cases 1 & 3 (without UseNUMA and with/without hugpages) above didnt shows any improvement because of "PreTouchParallelChunkSize" limit and that is the reason why I suggested another fix. >> >> The "Adaptive Resize Fix" column in the sheet is for next suggested fix and may possibly help to improve further. 
For server JVM, expansion size of 512KB, 2MB (hugepages) and 64MB looks good for first resize but later needs some attention I think. JVM flag "MinHeapDeltaBytes" needs to be known by the user and need to set it upfront. I think this can be consider for first resize in every GC and later dynamically go for higher size like double the previous size to adopt to application nature. This way it may help to reduce the GC pause time during the expansion. I thought to share my observation and my understanding could be wrong. So please check and suggest. >> >> Again, thanks for your feedback. > >> [https://bugs.openjdk.java.net/browse/JDK-8254699](JDK-8254699) contains test results in XL file to show PreTouchParallelChunkSize was recently changed from 1GB to 4MB on Linux after testing various sizes. I have downloaded the same XL file and same is updated for Oldgen case during resize and it gives some rough idea about the improvement for this fix and follow up fix. Please check "PretouchOldgenDuringResize" sheet for "Co-operative Fix" and "Adaptive Resize Fix" columns. >> >> [PreTouchParallelChunkSize_TestResults_UpdatedForOldGenCase.xlsx](https://github.com/openjdk/jdk/files/6147180/PreTouchParallelChunkSize_TestResults_UpdatedForOldGenCase.xlsx) > > I'm not sure how to interpret this. In particular, it says "Test done using November 27rd Openjdk build". That predates a number of recent changes in this area that seem relevant. Is that correct? Or is that stale and the data has been updated for a newer baseline. > > Also, can you provide details about how this information is generated and collected? I would like to reproduce the results and do some other experiments. > > I think that, as proposed, the change is making already messy code still more messy, and some refactoring is needed. I don't know what the details of that might look like yet; I'm still exploring and poking at the code. I also need to look at the corresponding part of G1 that you mentioned. 
> >> The "Adaptive Resize Fix" column in the sheet is for next suggested fix and may possibly help to improve further. For server JVM, expansion size of 512KB, 2MB (hugepages) and 64MB looks good for first resize but later needs some attention I think. JVM flag "MinHeapDeltaBytes" needs to be known by the user and need to set it upfront. I think this can be consider for first resize in every GC and later dynamically go for higher size like double the previous size to adopt to application nature. This way it may help to reduce the GC pause time during the expansion. I thought to share my observation and my understanding could be wrong. So please check and suggest. > > I think something like this has a serious risk of growing the oldgen a lot more than needed, which may have serious downsides. Thank you Kim for your reply and please see my inline comments. > > > > [https://bugs.openjdk.java.net/browse/JDK-8254699](JDK-8254699) contains test results in XL file to show PreTouchParallelChunkSize was recently changed from 1GB to 4MB on Linux after testing various sizes. I have downloaded the same XL file and same is updated for Oldgen case during resize and it gives some rough idea about the improvement for this fix and follow up fix. Please check "PretouchOldgenDuringResize" sheet for "Co-operative Fix" and "Adaptive Resize Fix" columns. > > [PreTouchParallelChunkSize_TestResults_UpdatedForOldGenCase.xlsx](https://github.com/openjdk/jdk/files/6147180/PreTouchParallelChunkSize_TestResults_UpdatedForOldGenCase.xlsx) > > I'm not sure how to interpret this. In particular, it says "Test done using November 27rd Openjdk build". That predates a number of recent changes in this area that seem relevant. Is that correct? Or is that stale and the data has been updated for a newer baseline. Sorry for the confusion. 
I didn't retest. The pretouch time for different per-thread chunk sizes was already recorded in the spreadsheet, so I used it as a reference when replying to Kim's feedback that "A change like this needs a lot more testing than that, both functionally and performance." > > Also, can you provide details about how this information is generated and collected? I would like to reproduce the results and do some other experiments. I fixed JDK-8254699, and the test was done using the SPECjbb composite with the following command, together with the JVM flag "PreTouchParallelChunkSize" set to 1G (the earlier default on x86) and 4M (the current default on x86). To generate this data, "Ticks::now()" was called to record the time taken by the pretouch operation. The first table (please ignore the resize-related entries) in the sheet "SPECjbb_Summary" was created from the sheet "SPECjbb_128C256T_GCPretouchTime", which contains pretouch log dumps for the SPECjbb composite tests. Command (also mentioned in SPECjbb_128C256T_GCPretouchTime sheet): -Xms24g -Xmx970g -Xmn960g -server -XX:+UseParallelGC -XX:+AlwaysPreTouch -XX:+UseLargePages -XX:LargePageSizeInBytes=2m -XX:MaxTenuringThreshold=15 -XX:+ScavengeBeforeFullGC -XX:+UseAdaptiveSizePolicy -XX:ParallelGCThreads=256 -XX:-UseBiasedLocking -XX:SurvivorRatio=200 -XX:TargetSurvivorRatio=95 -XX:+UseNUMA -XX:-UseNUMAInterleaving -XX:+UseTransparentHugePages -XX:-UseAdaptiveNUMAChunkSizing > > I think that, as proposed, the change is making already messy code still more messy, and some refactoring is needed. I don't know what the details of that might look like yet; I'm still exploring and poking at the code. I also need to look at the corresponding part of G1 that you mentioned. First, I planned to push this change for ParallelGC and then for G1GC. Earlier, ParallelGC pretouch was single-threaded; this was later fixed by moving the multithreaded pretouch code common to G1GC and ParallelGC into the PretouchTask class.
I planned to handle this case similarly. > > > The "Adaptive Resize Fix" column in the sheet is for next suggested fix and may possibly help to improve further. For server JVM, expansion size of 512KB, 2MB (hugepages) and 64MB looks good for first resize but later needs some attention I think. JVM flag "MinHeapDeltaBytes" needs to be known by the user and need to set it upfront. I think this can be consider for first resize in every GC and later dynamically go for higher size like double the previous size to adopt to application nature. This way it may help to reduce the GC pause time during the expansion. I thought to share my observation and my understanding could be wrong. So please check and suggest. > > I think something like this has a serious risk of growing the oldgen a lot more than needed, which may have serious downsides. I believe this depends on the application's runtime behavior, and the oldgen may expand as needed. If you still think this change is not useful, please let me know and I will withdraw this PR. Thanks, Amit ------------- PR: https://git.openjdk.java.net/jdk/pull/2976 From hseigel at openjdk.java.net Mon Mar 22 17:57:40 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 22 Mar 2021 17:57:40 GMT Subject: RFR: 8263974: Move SystemDictionary::verify_protection_domain In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 14:57:02 GMT, Coleen Phillimore wrote: > Please review this mostly trivial fix to move SystemDictionary::validate_protection_domain into Dictionary and hide the functions in Dictionary that it calls. This change also removes some #include dictionary.hpp and a TRAPS parameter where not needed. > > See CR for more details. This function was in the middle of others that I want to keep together in systemDictionary.cpp. > > Tested with tier1 on 4 Oracle supported platforms. Other than that one nit, this looks good.
Thanks, Harold src/hotspot/share/classfile/dictionary.cpp line 362: > 360: TRAPS) { > 361: > 362: // Now we have to call back to java to check if the initating class has access Can you move this comment down about 10 lines? ------------- Marked as reviewed by hseigel (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3120 From coleenp at openjdk.java.net Mon Mar 22 18:06:04 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 22 Mar 2021 18:06:04 GMT Subject: RFR: 8263974: Move SystemDictionary::verify_protection_domain [v2] In-Reply-To: References: Message-ID: > Please review this mostly trivial fix to move SystemDictionary::validate_protection_domain into Dictionary and hide the functions in Dictionary that it calls. This change also removes some #include dictionary.hpp and a TRAPS parameter where not needed. > > See CR for more details. This function was in the middle of others that I want to keep together in systemDictionary.cpp. > > Tested with tier1 on 4 Oracle supported platforms. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Move comment down to the code it describes. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3120/files - new: https://git.openjdk.java.net/jdk/pull/3120/files/33283cbf..46949f21 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3120&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3120&range=00-01 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3120.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3120/head:pull/3120 PR: https://git.openjdk.java.net/jdk/pull/3120 From coleenp at openjdk.java.net Mon Mar 22 18:06:05 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 22 Mar 2021 18:06:05 GMT Subject: RFR: 8263974: Move SystemDictionary::verify_protection_domain [v2] In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 17:54:27 GMT, Harold Seigel wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Move comment down to the code it describes. > > src/hotspot/share/classfile/dictionary.cpp line 362: > >> 360: TRAPS) { >> 361: >> 362: // Now we have to call back to java to check if the initating class has access > > Can you move this comment down about 10 lines? Ok Done. ------------- PR: https://git.openjdk.java.net/jdk/pull/3120 From iklam at openjdk.java.net Mon Mar 22 20:04:40 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 22 Mar 2021 20:04:40 GMT Subject: RFR: 8263002: Remove CDS MiscCode region [v3] In-Reply-To: References: <4EsEk0VVQDAbkJzXmrMh2a_B4sFq7diglgCo3UqZpBQ=.5f4541c2-76aa-46da-82b5-dc8e537a85f3@github.com> Message-ID: On Wed, 10 Mar 2021 06:07:04 GMT, Ioi Lam wrote: >> wow. Looks good to me. > > Thanks @coleenp and @dholmes-ora for the review! > Hi @iklam, sorry to bother you, I found there are some comments that still refer to [lingering mc regions](https://github.com/openjdk/jdk/compare/master...kelthuzadx:residual_mc_region). 
Thanks @kelthuzadx , I have filed [JDK-8263998](https://bugs.openjdk.java.net/browse/JDK-8263998) (Remove mentions of mc region in comments) ------------- PR: https://git.openjdk.java.net/jdk/pull/2861 From lfoltan at openjdk.java.net Mon Mar 22 20:24:40 2021 From: lfoltan at openjdk.java.net (Lois Foltan) Date: Mon, 22 Mar 2021 20:24:40 GMT Subject: RFR: 8263974: Move SystemDictionary::verify_protection_domain [v2] In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 18:06:04 GMT, Coleen Phillimore wrote: >> Please review this mostly trivial fix to move SystemDictionary::validate_protection_domain into Dictionary and hide the functions in Dictionary that it calls. This change also removes some #include dictionary.hpp and a TRAPS parameter where not needed. >> >> See CR for more details. This function was in the middle of others that I want to keep together in systemDictionary.cpp. >> >> Tested with tier1 on 4 Oracle supported platforms. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Move comment down to the code it describes. Looks good! Lois ------------- Marked as reviewed by lfoltan (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3120 From lfoltan at openjdk.java.net Mon Mar 22 20:45:40 2021 From: lfoltan at openjdk.java.net (Lois Foltan) Date: Mon, 22 Mar 2021 20:45:40 GMT Subject: RFR: 8263976: Remove block allocation from BasicHashtable In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 15:49:24 GMT, Coleen Phillimore wrote: > From CR: > The useful/general BasicHashtable uses a block allocation scheme to reportedly reduce fragmentation. When the StringTable and SymbolTable used to use this hashtable, performance benefits were reportedly observed because of the block allocation scheme. 
Since these tables were moved to the concurrent hashtables, the tables left that use the block allocation scheme are: > > AdapterHandlerLibrary, ResolutionError, LoaderConstraints, Leak profiler bitset table and Placeholders. 3 of these tables are very small and never needed block allocation to prevent fragmentation at least. Also there are 3 KVHashtables, which are built from BasicHashtable. 2 are used during dumping and 1 is ID2KlassTable which appears small. > > ModuleEntry, PackageEntry, Dictionary, G1RootSet for nmethods, and JvmtiTagMap tables didn't use the block allocation scheme. > > Removing this removes 7 pointers per table, and for each ClassLoaderData, which has 3 tables, removes 21 pointers. > > This change was performance tested on linux and windows. > > It was also tested with tier1-6. Great clean up Coleen! ModuleEntryTable and PackageEntryTable changes looks good as does the other changes as well. Lois ------------- Marked as reviewed by lfoltan (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3123 From iklam at openjdk.java.net Mon Mar 22 21:13:43 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 22 Mar 2021 21:13:43 GMT Subject: RFR: 8263028: Windows build fails due to several treat-warning-as-errors In-Reply-To: References: Message-ID: On Sun, 21 Mar 2021 04:22:42 GMT, Yi Yang wrote: > cl.exe(19.28.29334) can not build JDK on windows_x64 because it treats many warnings as errors thus prohibiting further compilation. (See detailed failure logs on JBS) > > 1. methodMatcher.cpp > > cl.exe can not handle advanced usage of sscanf(i.e. regex-like sscanf) correctly. This looks like an msvc compiler bug, it has been there for a long while, so I temporarily disable these warnings in a limited region. Outside of this region, the compiler still treats them as errors. This change does not affect the functionality of MethodMatcher::parse_method_pattern, it can parse class name and method name in a desired manner. > > 2. 
vm_version_x86.cpp > > Some comments contain characters(Register Trademark) that cannot be represented in the current code page (936). Replacing them with ASCII-characters makes the compiler happy. > > Best Regards, > Yang The problem in methodMatcher.cpp is caused by this: #define RANGEBASE "\x1\x2\x3\x4\x5\x6\x7\x8\xa\xb\xc\xd\xe\xf" \ "\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \ "\x21\x22\x23\x24\x25\x26\x27\x2a\x2b\x2c\x2d" ..... where the literal `\x25` is the `%` character. It seems like VC++ tries to interpret the `%` character even when it's inside square brackets, like if (1 == sscanf(line, "%1022[[);/" RANGEBASE "]%n", sig+1, &bytes_read)) { -> if (1 == sscanf(line, "%1022[[);.....#$%&.....]%n", sig+1, &bytes_read)) { ^ The [C++ reference](https://en.cppreference.com/w/c/io/fscanf) is unclear about how characters like `%` can be escaped inside square brackets (or whether they should be escaped at all). Trying to use sscanf for this purpose makes the code hard to understand and non-portable. It's better to ditch sscanf and read the characters byte-by-byte. That way, you can get rid of the original `PRAGMA_DISABLE_MSVC_WARNING(4819)` as well. ------------- PR: https://git.openjdk.java.net/jdk/pull/3107 From coleenp at openjdk.java.net Mon Mar 22 21:53:46 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 22 Mar 2021 21:53:46 GMT Subject: RFR: 8263974: Move SystemDictionary::verify_protection_domain [v2] In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 20:22:02 GMT, Lois Foltan wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Move comment down to the code it describes. > > Looks good! > Lois Thanks Harold and Lois!
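Returning to the methodMatcher.cpp discussion in the 8263028 review: the byte-by-byte alternative to `sscanf` suggested there could look roughly like the sketch below. This is only an illustration under stated assumptions, not the change that was made in the JDK. The function name `scan_allowed` and its parameters are hypothetical; the idea is that the set of permitted bytes is an ordinary string, so a `%` in it is just another byte and no scanset parsing is involved.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the JDK sources).
// Copies the longest prefix of `line` whose bytes all occur in `allowed`
// into `out`, which has capacity `out_size` and is always NUL-terminated.
// Returns the number of bytes consumed, the role %n plays for sscanf.
std::size_t scan_allowed(const char* line, const char* allowed,
                         char* out, std::size_t out_size) {
  if (out_size == 0) return 0;
  std::size_t n = 0;
  // Stop at end of input, when the buffer is full, or at a byte not in the set.
  while (line[n] != '\0' && n + 1 < out_size &&
         std::strchr(allowed, line[n]) != nullptr) {
    out[n] = line[n];
    ++n;
  }
  out[n] = '\0';
  return n;
}
```

A caller would treat a return value of zero as "no match", which corresponds to the `1 == sscanf(...)` check failing in the quoted code.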
------------- PR: https://git.openjdk.java.net/jdk/pull/3120 From iklam at openjdk.java.net Mon Mar 22 21:57:41 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 22 Mar 2021 21:57:41 GMT Subject: RFR: 8263976: Remove block allocation from BasicHashtable In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 15:49:24 GMT, Coleen Phillimore wrote: > From CR: > The useful/general BasicHashtable uses a block allocation scheme to reportedly reduce fragmentation. When the StringTable and SymbolTable used to use this hashtable, performance benefits were reportedly observed because of the block allocation scheme. Since these tables were moved to the concurrent hashtables, the tables left that use the block allocation scheme are: > > AdapterHandlerLibrary, ResolutionError, LoaderConstraints, Leak profiler bitset table and Placeholders. 3 of these tables are very small and never needed block allocation to prevent fragmentation at least. Also there are 3 KVHashtables, which are built from BasicHashtable. 2 are used during dumping and 1 is ID2KlassTable which appears small. > > ModuleEntry, PackageEntry, Dictionary, G1RootSet for nmethods, and JvmtiTagMap tables didn't use the block allocation scheme. > > Removing this removes 7 pointers per table, and for each ClassLoaderData, which has 3 tables, removes 21 pointers. > > This change was performance tested on linux and windows. > > It was also tested with tier1-6. Looks good overall. Just some small nits. src/hotspot/share/utilities/hashtable.cpp line 54: > 52: BasicHashtableEntry* entry = ::new (NEW_C_HEAP_ARRAY(char, this->entry_size(), F)) > 53: BasicHashtableEntry(hashValue); > 54: assert(_entry_size % HeapWordSize == 0, ""); Is this assert still needed? What's its purpose? 
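For readers less familiar with the pattern in the lines quoted above, here is a minimal standalone sketch of allocating raw bytes and constructing an entry in them with placement new. It is an assumption-laden illustration, not HotSpot code: `Entry`, `new_entry` and `free_entry` are hypothetical names, `std::malloc` stands in for `NEW_C_HEAP_ARRAY`, and the word-size check only mirrors the assert being questioned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for a hashtable entry: a hash plus a bucket link.
struct Entry {
  unsigned int hash;
  Entry* next;
  explicit Entry(unsigned int h) : hash(h), next(nullptr) {}
};

// Allocate entry_size raw bytes and construct an Entry at their start.
// entry_size may exceed sizeof(Entry) when a derived entry carries a payload.
Entry* new_entry(std::size_t entry_size, unsigned int hash) {
  assert(entry_size >= sizeof(Entry));
  // Analogue of the assert under review: size is a multiple of the word size.
  assert(entry_size % sizeof(void*) == 0);
  void* raw = std::malloc(entry_size);
  if (raw == nullptr) return nullptr;
  // Placement new constructs the object in storage we already own.
  return ::new (raw) Entry(hash);
}

// Destroy the object explicitly, then release the raw bytes.
void free_entry(Entry* e) {
  if (e == nullptr) return;
  e->~Entry();
  std::free(e);
}
```

Because the storage comes from a general-purpose allocator, it is already suitably aligned for `Entry`, which is one reason an extra size check can look redundant.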
src/hotspot/share/utilities/hashtable.cpp line 64: > 62: > 63: if (DumpSharedSpaces) { > 64: // Avoid random bits in structure padding so we can have deterministic content in CDS archive Hmm, the sequence looks a little odd: the constructor initializes some fields, they are then zeroed here, and then initialized again below .... Actually, I think you can get rid of the `if (DumpSharedSpaces)` block for now. I am not sure if it's needed. Deterministic CDS archive is broken anyway (https://bugs.openjdk.java.net/browse/JDK-8253495). That way you can get rid of the entry->set_xxx below as well. ------------- Changes requested by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3123 From coleenp at openjdk.java.net Mon Mar 22 21:57:43 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 22 Mar 2021 21:57:43 GMT Subject: RFR: 8263976: Remove block allocation from BasicHashtable In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 21:42:51 GMT, Ioi Lam wrote: >> From CR: >> The useful/general BasicHashtable uses a block allocation scheme to reportedly reduce fragmentation. When the StringTable and SymbolTable used to use this hashtable, performance benefits were reportedly observed because of the block allocation scheme. Since these tables were moved to the concurrent hashtables, the tables left that use the block allocation scheme are: >> >> AdapterHandlerLibrary, ResolutionError, LoaderConstraints, Leak profiler bitset table and Placeholders. 3 of these tables are very small and never needed block allocation to prevent fragmentation at least. Also there are 3 KVHashtables, which are built from BasicHashtable. 2 are used during dumping and 1 is ID2KlassTable which appears small. >> >> ModuleEntry, PackageEntry, Dictionary, G1RootSet for nmethods, and JvmtiTagMap tables didn't use the block allocation scheme. >> >> Removing this removes 7 pointers per table, and for each ClassLoaderData, which has 3 tables, removes 21 pointers. 
>> >> This change was performance tested on linux and windows. >> >> It was also tested with tier1-6. > > src/hotspot/share/utilities/hashtable.cpp line 54: > >> 52: BasicHashtableEntry* entry = ::new (NEW_C_HEAP_ARRAY(char, this->entry_size(), F)) >> 53: BasicHashtableEntry(hashValue); >> 54: assert(_entry_size % HeapWordSize == 0, ""); > > Is this assert still needed? What's its purpose? Yeah, seems a bit strange. Of course the entry size is a multiple of the heap word size. I'll remove the assert. > src/hotspot/share/utilities/hashtable.cpp line 64: > >> 62: >> 63: if (DumpSharedSpaces) { >> 64: // Avoid random bits in structure padding so we can have deterministic content in CDS archive > > Hmm, the sequence looks a little odd: the constructor initializes some fields, they are then zeroed here, and then initialized again below .... > > Actually, I think you can get rid of the `if (DumpSharedSpaces)` block for now. I am not sure if it's needed. Deterministic CDS archive is broken anyway (https://bugs.openjdk.java.net/browse/JDK-8253495). > > That way you can get rid of the entry->set_xxx below as well. Yes, I'd like to make the constructors initialize the fields, but didn't know what to do about this block zeroing code. Would you have to add it back with deterministic CDS? ------------- PR: https://git.openjdk.java.net/jdk/pull/3123 From iklam at openjdk.java.net Mon Mar 22 22:00:43 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 22 Mar 2021 22:00:43 GMT Subject: RFR: 8263976: Remove block allocation from BasicHashtable In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 21:54:23 GMT, Coleen Phillimore wrote: >> src/hotspot/share/utilities/hashtable.cpp line 64: >> >>> 62: >>> 63: if (DumpSharedSpaces) { >>> 64: // Avoid random bits in structure padding so we can have deterministic content in CDS archive >> >> Hmm, the sequence looks a little odd: the constructor initializes some fields, they are then zeroed here, and then initialized again below ....
>> >> Actually, I think you can get rid of the `if (DumpSharedSpaces)` block for now. I am not sure if it's needed. Deterministic CDS archive is broken anyway (https://bugs.openjdk.java.net/browse/JDK-8253495). >> >> That way you can get rid of the entry->set_xxx below as well. > > Yes, I'd like to make the constructors initialize the fields, but didn't know what to do about this block zeroing code. Would you have to add it back with deterministic GC? > Yes, I'd like to make the constructors initialize the fields, but didn't know what to do about this block zeroing code. Would you have to add it back with deterministic CDS? I am not sure yet. I think we may not need it because CDS doesn't copy HashtableEntries into the archive anymore. ------------- PR: https://git.openjdk.java.net/jdk/pull/3123 From coleenp at openjdk.java.net Mon Mar 22 22:03:38 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 22 Mar 2021 22:03:38 GMT Subject: RFR: 8263976: Remove block allocation from BasicHashtable In-Reply-To: References: Message-ID: <_BbDo3ko5RZXjFskK5vat6YiY9_80_h9uiY3cJqmNd4=.384b4792-1690-4e99-a8ec-cae1a23ef39e@github.com> On Mon, 22 Mar 2021 21:58:09 GMT, Ioi Lam wrote: >> Yes, I'd like to make the constructors initialize the fields, but didn't know what to do about this block zeroing code. Would you have to add it back with deterministic GC? > >> Yes, I'd like to make the constructors initialize the fields, but didn't know what to do about this block zeroing code. Would you have to add it back with deterministic CDS? > > I am not sure yet. I think we may not need it because CDS doesn't copy HashtableEntries into the archive anymore. Instead of block zeroing hashtableEntry, you could add a field to zero in between _hash and _next and that would not have stray unaligned bytes. template class BasicHashtableEntry { friend class VMStructs; private: unsigned int _hash; // 32-bit hash for item // Link to next element in the linked list for this bucket. 
BasicHashtableEntry* _next; ------------- PR: https://git.openjdk.java.net/jdk/pull/3123 From xliu at openjdk.java.net Mon Mar 22 22:17:53 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Mon, 22 Mar 2021 22:17:53 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging Message-ID: This patch provides a buffer to store asynchronous messages and flush them to underlying files periodically. ------------- Commit messages: - 8229517: Support for optional asynchronous/buffered logging - 8229517: Support for optional asynchronous/buffered logging - 8229517: Support for optional asynchronous/buffered logging - 8229517: Support for optional asynchronous/buffered logging Changes: https://git.openjdk.java.net/jdk/pull/3135/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3135&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8229517 Stats: 523 lines in 12 files changed: 508 ins; 5 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/3135.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3135/head:pull/3135 PR: https://git.openjdk.java.net/jdk/pull/3135 From coleenp at openjdk.java.net Mon Mar 22 22:48:05 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 22 Mar 2021 22:48:05 GMT Subject: RFR: 8263976: Remove block allocation from BasicHashtable [v2] In-Reply-To: References: Message-ID: <27GM9pmZuaikjHuuvXVRYgOPnTqMxNOXhd1CR7BzaKo=.ab0ec842-335c-4c85-807b-dc6bcb6b77e3@github.com> > From CR: > The useful/general BasicHashtable uses a block allocation scheme to reportedly reduce fragmentation. When the StringTable and SymbolTable used to use this hashtable, performance benefits were reportedly observed because of the block allocation scheme. Since these tables were moved to the concurrent hashtables, the tables left that use the block allocation scheme are: > > AdapterHandlerLibrary, ResolutionError, LoaderConstraints, Leak profiler bitset table and Placeholders.
3 of these tables are very small and never needed block allocation to prevent fragmentation at least. Also there are 3 KVHashtables, which are built from BasicHashtable. 2 are used during dumping and 1 is ID2KlassTable which appears small. > > ModuleEntry, PackageEntry, Dictionary, G1RootSet for nmethods, and JvmtiTagMap tables didn't use the block allocation scheme. > > Removing this removes 7 pointers per table, and for each ClassLoaderData, which has 3 tables, removes 21 pointers. > > This change was performance tested on linux and windows. > > It was also tested with tier1-6. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix Hashtable constructor and comments. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3123/files - new: https://git.openjdk.java.net/jdk/pull/3123/files/328f72c5..7a0ad9ee Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3123&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3123&range=00-01 Stats: 20 lines in 2 files changed: 0 ins; 16 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/3123.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3123/head:pull/3123 PR: https://git.openjdk.java.net/jdk/pull/3123 From iklam at openjdk.java.net Mon Mar 22 23:10:40 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 22 Mar 2021 23:10:40 GMT Subject: RFR: 8263976: Remove block allocation from BasicHashtable [v2] In-Reply-To: <27GM9pmZuaikjHuuvXVRYgOPnTqMxNOXhd1CR7BzaKo=.ab0ec842-335c-4c85-807b-dc6bcb6b77e3@github.com> References: <27GM9pmZuaikjHuuvXVRYgOPnTqMxNOXhd1CR7BzaKo=.ab0ec842-335c-4c85-807b-dc6bcb6b77e3@github.com> Message-ID: On Mon, 22 Mar 2021 22:48:05 GMT, Coleen Phillimore wrote: >> From CR: >> The useful/general BasicHashtable uses a block allocation scheme to reportedly reduce fragmentation. 
When the StringTable and SymbolTable used to use this hashtable, performance benefits were reportedly observed because of the block allocation scheme. Since these tables were moved to the concurrent hashtables, the tables left that use the block allocation scheme are: >> >> AdapterHandlerLibrary, ResolutionError, LoaderConstraints, Leak profiler bitset table and Placeholders. 3 of these tables are very small and never needed block allocation to prevent fragmentation at least. Also there are 3 KVHashtables, which are built from BasicHashtable. 2 are used during dumping and 1 is ID2KlassTable which appears small. >> >> ModuleEntry, PackageEntry, Dictionary, G1RootSet for nmethods, and JvmtiTagMap tables didn't use the block allocation scheme. >> >> Removing this removes 7 pointers per table, and for each ClassLoaderData, which has 3 tables, removes 21 pointers. >> >> This change was performance tested on linux and windows. >> >> It was also tested with tier1-6. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix Hashtable constructor and comments. Marked as reviewed by iklam (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3123 From minqi at openjdk.java.net Mon Mar 22 23:32:05 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Mon, 22 Mar 2021 23:32:05 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v5] In-Reply-To: References: Message-ID: <7MvNg06FGz8ZD3bUvH9FbCKYmqTX7V0XMUgh3jnQUwc=.0d011816-9758-4173-919c-00c45b203584@github.com> > Hi, Please review > > Added jcmd option for dumping CDS archive during application runtime. Before this change, user has to dump shared archive in two steps: first run application with > `java -XX:DumpLoadedClassList= .... 
` > to collect shareable class names and saved in file `` , then > `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` > With this change, user can use jcmd to dump CDS without going through above steps. Also user can choose a moment during the app runtime to dump an archive. > The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. > New added jcmd option: > `jcmd VM.cds static_dump ` > or > `jcmd VM.cds dynamic_dump ` > To dump dynamic archive, requires start app with newly added flag `-XX:+RecordDynamicDumpInfo`, with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changed some object memory locations so for dumping dynamic archive, can only done once for a running app. For static dump, user can dump multiple times against same process. > The file name is optional, if the file name is not supplied, the file name will take format of `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. The `` is the application process ID. 
> > Tests: tier1,tier2,tier3,tier4 > > Thanks > Yumin Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Fix according to review comment and add more tests ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2737/files - new: https://git.openjdk.java.net/jdk/pull/2737/files/a9010f8f..e882a074 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2737&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2737&range=03-04 Stats: 142 lines in 7 files changed: 61 ins; 35 del; 46 mod Patch: https://git.openjdk.java.net/jdk/pull/2737.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2737/head:pull/2737 PR: https://git.openjdk.java.net/jdk/pull/2737 From iklam at openjdk.java.net Mon Mar 22 23:43:52 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 22 Mar 2021 23:43:52 GMT Subject: RFR: 8263992: Remove dead code NativeLookup::base_library_lookup Message-ID: Please review this trivial removal of dead code. The word `base_library_lookup` does not exist in any C source code in the entire JDK. ------------- Commit messages: - 8263992: Remove dead code NativeLookup::base_library_lookup Changes: https://git.openjdk.java.net/jdk/pull/3139/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3139&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263992 Stats: 24 lines in 2 files changed: 0 ins; 23 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3139.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3139/head:pull/3139 PR: https://git.openjdk.java.net/jdk/pull/3139 From coleenp at openjdk.java.net Tue Mar 23 00:24:41 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 00:24:41 GMT Subject: RFR: 8263992: Remove dead code NativeLookup::base_library_lookup In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 23:26:01 GMT, Ioi Lam wrote: > Please review this trivial removal of dead code. 
The word `base_library_lookup` does not exist in any C source code in the entire JDK. Good going - you got rid of some CATCHes + trivial. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3139 From coleenp at openjdk.java.net Tue Mar 23 01:10:46 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 01:10:46 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown Message-ID: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. Tested with vmTestbase/nsk/jvmti and tier1 (in progress). ------------- Commit messages: - Don't pass TRAPS to functions that don't throw exceptions. Changes: https://git.openjdk.java.net/jdk/pull/3141/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3141&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264004 Stats: 179 lines in 4 files changed: 1 ins; 32 del; 146 mod Patch: https://git.openjdk.java.net/jdk/pull/3141.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3141/head:pull/3141 PR: https://git.openjdk.java.net/jdk/pull/3141 From coleenp at openjdk.java.net Tue Mar 23 01:22:56 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 01:22:56 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v2] In-Reply-To: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: > Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. > Tested with vmTestbase/nsk/jvmti and tier1 (in progress). 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: missed THREAD that should be CHECK_false argument. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3141/files - new: https://git.openjdk.java.net/jdk/pull/3141/files/c3f57eb7..1e3f00d4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3141&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3141&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3141.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3141/head:pull/3141 PR: https://git.openjdk.java.net/jdk/pull/3141 From yyang at openjdk.java.net Tue Mar 23 02:24:40 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Tue, 23 Mar 2021 02:24:40 GMT Subject: RFR: 8263028: Windows build fails due to several treat-warning-as-errors In-Reply-To: References: Message-ID: <6cv_HeWJ9HsBrB7NSFU-TGl4PP0Tp820KzwJ-FRn_so=.e4d8ab11-4203-4278-a829-43b6f1626465@github.com> On Mon, 22 Mar 2021 21:10:33 GMT, Ioi Lam wrote: > The problem in methodMatcher.cpp is caused by this: > > ``` > #define RANGEBASE "\x1\x2\x3\x4\x5\x6\x7\x8\xa\xb\xc\xd\xe\xf" \ > "\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \ > "\x21\x22\x23\x24\x25\x26\x27\x2a\x2b\x2c\x2d" ..... > ``` > > where the literal `\x25` is the `%` character. It seems like VC++ tries to interpret the `%` character even when it's inside square brackets, like > > ``` > if (1 == sscanf(line, "%1022[[);/" RANGEBASE "]%n", sig+1, &bytes_read)) { > -> > if (1 == sscanf(line, "%1022[[);.....#$%&.....]%n", sig+1, &bytes_read)) { > ^ > ``` > > The [C++ reference](https://en.cppreference.com/w/c/io/fscanf) is unclear about how characters like `%` can be escaped inside square brackets (or whether they should be escaped at all). > > Trying to use sscanf for this purpose makes the code hard to understand and non portable. 
It's better to ditch sscanf and read the characters byte-by-byte. That way, you can get rid of the original `PRAGMA_DISABLE_MSVC_WARNING(4819)` as well. The explanation makes sense. We can parse class name and method name via byte-by-byte stream instead of advanced regex-like sscanf. For this reason, I also put a FIXME comment above MethodMatcher::parse_method_pattern. This build failure also appears in the [downstream JDK](https://github.com/alibaba/dragonwell11/issues/70), blocking further development. So the purpose of this PR is to address these treat-warning-as-error problems. I'd like to rewrite this function in another PR. ------------- PR: https://git.openjdk.java.net/jdk/pull/3107 From iklam at openjdk.java.net Tue Mar 23 03:09:39 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 23 Mar 2021 03:09:39 GMT Subject: RFR: 8263028: Windows build fails due to several treat-warning-as-errors In-Reply-To: <6cv_HeWJ9HsBrB7NSFU-TGl4PP0Tp820KzwJ-FRn_so=.e4d8ab11-4203-4278-a829-43b6f1626465@github.com> References: <6cv_HeWJ9HsBrB7NSFU-TGl4PP0Tp820KzwJ-FRn_so=.e4d8ab11-4203-4278-a829-43b6f1626465@github.com> Message-ID: On Tue, 23 Mar 2021 02:22:00 GMT, Yi Yang wrote: > > The problem in methodMatcher.cpp is caused by this: > > ``` > > #define RANGEBASE "\x1\x2\x3\x4\x5\x6\x7\x8\xa\xb\xc\xd\xe\xf" \ > > "\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \ > > "\x21\x22\x23\x24\x25\x26\x27\x2a\x2b\x2c\x2d" ..... > > ``` > > > > > > where the literal `\x25` is the `%` character. 
It seems like VC++ tries to interpret the `%` character even when it's inside square brackets, like > > ``` > > if (1 == sscanf(line, "%1022[[);/" RANGEBASE "]%n", sig+1, &bytes_read)) { > > -> > > if (1 == sscanf(line, "%1022[[);.....#$%&.....]%n", sig+1, &bytes_read)) { > > ^ > > ``` > > > > > > The [C++ reference](https://en.cppreference.com/w/c/io/fscanf) is unclear about how characters like `%` can be escaped inside square brackets (or whether they should be escaped at all). > > Trying to use sscanf for this purpose makes the code hard to understand and non portable. It's better to ditch sscanf and read the characters byte-by-byte. That way, you can get rid of the original `PRAGMA_DISABLE_MSVC_WARNING(4819)` as well. > > The explanation makes sense. We can parse class name and method name via byte-by-byte stream instead of advanced regex-like sscanf. For this reason, I also put a FIXME comment above MethodMatcher::parse_method_pattern. > > This build failure also appears in the [downstream JDK](https://github.com/alibaba/dragonwell11/issues/70), blocking further development. So the purpose of this PR is to address these treat-warning-as-error problems. I'd like to rewrite this function in another PR.

I think we should add `#pragma`s only as a last resort. We need to understand why the problem is happening.

This code has been there for a long time. I wonder what happened to cause it to fail in your build system. Could it be related to a particular version of VC++? I checked the build logs from Oracle's CI as well as GitHub Actions:

19.27.29111 Oracle -- OK
19.28.29334 Alibaba -- warnings
19.28.29910 GitHub -- OK

Or is it related to the system language setting of your build machine (e.g., could it be set to Chinese?)

Most importantly, does the `#pragma` actually make the code work, or does it merely hide the problem? I.e., will sscanf fail at runtime when it sees the `%&`? Have you run any tests to validate that the affected code works with your toolchain?
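A minimal sketch of the byte-by-byte alternative to the `sscanf` scanset discussed above. This is not the HotSpot implementation: `scan_allowed` and its parameters are hypothetical names, and the allowed-character set is passed as a plain string, so a `%` in it is just another byte rather than part of a format string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not HotSpot code. Copies the longest prefix of `line`
// whose bytes all appear in `allowed` into `out`, NUL-terminates `out`, and
// returns the number of bytes consumed. At most out_size - 1 bytes are copied.
std::size_t scan_allowed(const char* line, const char* allowed,
                         char* out, std::size_t out_size) {
  assert(out_size > 0 && "output buffer must hold at least the terminating NUL");
  std::size_t n = 0;
  while (line[n] != '\0' && n + 1 < out_size &&
         std::strchr(allowed, line[n]) != nullptr) {
    out[n] = line[n];
    ++n;
  }
  out[n] = '\0';
  return n;
}
```

Under these assumptions, a call such as `scan_allowed(line, "[);/" RANGEBASE, sig + 1, 1023)` would play roughly the role of the `%1022[...]%n` conversion, with the return value standing in for `bytes_read`.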
------------- PR: https://git.openjdk.java.net/jdk/pull/3107 From dholmes at openjdk.java.net Tue Mar 23 03:24:38 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 23 Mar 2021 03:24:38 GMT Subject: RFR: 8263992: Remove dead code NativeLookup::base_library_lookup In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 23:26:01 GMT, Ioi Lam wrote: > Please review this trivial removal of dead code. The word `base_library_lookup` does not exist in any C source code in the entire JDK. Hi Ioi, Looks good but do we have a problem with the notion of "in base library" ??? Should we perhaps have been using this method elsewhere? Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/3139 From iklam at openjdk.java.net Tue Mar 23 04:12:38 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 23 Mar 2021 04:12:38 GMT Subject: RFR: 8263992: Remove dead code NativeLookup::base_library_lookup In-Reply-To: References: Message-ID: <9R96xCYl9pa_DIxoOIodKnp-jNoEFSwWB9Cca2c6AM8=.7b83c080-b032-4d2d-a553-f0602d92f463@github.com> On Tue, 23 Mar 2021 03:21:52 GMT, David Holmes wrote: > Hi Ioi, > > Looks good but do we have a problem with the notion of "in base library" ??? Should we perhaps have been using this method elsewhere? > > Thanks, > David This function was last used here in 2014. (JDK-8031819: Remove legacy jdk checks and code) So no one has found any use for it in the past 6+ years. I think it's pretty dead.
http://hg.openjdk.java.net/jdk9/jdk9/hotspot/diff/afe58d604f28/src/share/vm/prims/jvm.cpp ------------- PR: https://git.openjdk.java.net/jdk/pull/3139 From dholmes at openjdk.java.net Tue Mar 23 04:40:40 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 23 Mar 2021 04:40:40 GMT Subject: RFR: 8263974: Move SystemDictionary::verify_protection_domain [v2] In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 18:06:04 GMT, Coleen Phillimore wrote: >> Please review this mostly trivial fix to move SystemDictionary::validate_protection_domain into Dictionary and hide the functions in Dictionary that it calls. This change also removes some #include dictionary.hpp and a TRAPS parameter where not needed. >> >> See CR for more details. This function was in the middle of others that I want to keep together in systemDictionary.cpp. >> >> Tested with tier1 on 4 Oracle supported platforms. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Move comment down to the code it describes. Looks good! Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3120 From yyang at openjdk.java.net Tue Mar 23 04:41:39 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Tue, 23 Mar 2021 04:41:39 GMT Subject: RFR: 8263028: Windows build fails due to several treat-warning-as-errors In-Reply-To: References: <6cv_HeWJ9HsBrB7NSFU-TGl4PP0Tp820KzwJ-FRn_so=.e4d8ab11-4203-4278-a829-43b6f1626465@github.com> Message-ID: On Tue, 23 Mar 2021 03:06:42 GMT, Ioi Lam wrote: >>> The problem in methodMatcher.cpp is caused by this: >>> >>> ``` >>> #define RANGEBASE "\x1\x2\x3\x4\x5\x6\x7\x8\xa\xb\xc\xd\xe\xf" \ >>> "\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \ >>> "\x21\x22\x23\x24\x25\x26\x27\x2a\x2b\x2c\x2d" ..... >>> ``` >>> >>> where the literal `\x25` is the `%` character. 
It seems like VC++ tries to interpret the `%` character even when it's inside square brackets, like >>> >>> ``` >>> if (1 == sscanf(line, "%1022[[);/" RANGEBASE "]%n", sig+1, &bytes_read)) { >>> -> >>> if (1 == sscanf(line, "%1022[[);.....#$%&.....]%n", sig+1, &bytes_read)) { >>> ^ >>> ``` >>> >>> The [C++ reference](https://en.cppreference.com/w/c/io/fscanf) is unclear about how characters like `%` can be escaped inside square brackets (or whether they should be escaped at all). >>> >>> Trying to use sscanf for this purpose makes the code hard to understand and non portable. It's better to ditch sscanf and read the characters byte-by-byte. That way, you can get rid of the original `PRAGMA_DISABLE_MSVC_WARNING(4819)` as well. >> >> The explanation makes sense. We can parse class name and method name via byte-by-byte stream instead of advanced regex-like sscanf. For this reason, I also put a FIXME comment above MethodMatcher::parse_method_pattern. >> >> This build failure also appears in the [downstream JDK](https://github.com/alibaba/dragonwell11/issues/70), blocking further development. So the purpose of this PR is to address these treat-warning-as-error problems. I'd like to rewrite this function in another PR. > >> > The problem in methodMatcher.cpp is caused by this: >> > ``` >> > #define RANGEBASE "\x1\x2\x3\x4\x5\x6\x7\x8\xa\xb\xc\xd\xe\xf" \ >> > "\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \ >> > "\x21\x22\x23\x24\x25\x26\x27\x2a\x2b\x2c\x2d" ..... >> > ``` >> > >> > >> > where the literal `\x25` is the `%` character. 
It seems like VC++ tries to interpret the `%` character even when it's inside square brackets, like >> > ``` >> > if (1 == sscanf(line, "%1022[[);/" RANGEBASE "]%n", sig+1, &bytes_read)) { >> > -> >> > if (1 == sscanf(line, "%1022[[);.....#$%&.....]%n", sig+1, &bytes_read)) { >> > ^ >> > ``` >> > >> > >> > The [C++ reference](https://en.cppreference.com/w/c/io/fscanf) is unclear about how characters like `%` can be escaped inside square brackets (or whether they should be escaped at all). >> > Trying to use sscanf for this purpose makes the code hard to understand and non portable. It's better to ditch sscanf and read the characters byte-by-byte. That way, you can get rid of the original `PRAGMA_DISABLE_MSVC_WARNING(4819)` as well. >> >> The explanation makes sense. We can parse class name and method name via byte-by-byte stream instead of advanced regex-like sscanf. For this reason, I also put a FIXME comment above MethodMatcher::parse_method_pattern. >> >> This build failure also appears in the [downstream JDK](https://github.com/alibaba/dragonwell11/issues/70), blocking further development. So the purpose of this PR is to address these treat-warning-as-error problems. I'd like to rewrite this function in another PR. > > I think we should add `#pragmas` one only as a last resort. We need to understand why the problem is happening. > > This code has been there for a long time. I wonder what happened to cause it to fail in your build system. Could it be related to a particular version of VC++? I checked the build logs from Oracle's CI as well as GitHub actions > > 19.27.29111 Oracle -- OK > 19.28.29334 Alibaba -- warnings > 19.28.29910 GitHub -- OK > > Or, is it related to the system language setting of your build machine (e.g., could it be set to Chinese?) > > Most importantly, does the `#pragma` actually make the code work, or does it merely hides the problem? I.e., will sscanf fail at runtime when it sees the `%&`? 
Have you run any tests to validate that the affected code works with your toolchain? I have confirmed that this problem is related to a specific msvc version(At least it happens in msvc 19.28.29334, I have not tested other msvc versions). After applying this patch, I can build successfully, both on upstream JDK and Alibaba JDK. I have written a minimal reproducible demo. When the /WX(Treats all compiler warnings as errors) option is turned on, the compiler issued the same warnings. After disabling these warnings via `#pragma`, the compiler will not complain about anything, the execution result is exactly as I expected. Besides, all test cases under `compiler/compilercontrol/` are passed except ClearDirectivesFileStackTest.java which has been problem-listed before. So I believe this is a false positive of the compilation warning, turning it off or on will not affect the runtime behavior. FYI: See detailed jtreg log and minimal reproducible demo on JBS attachments. Thanks, Yang ------------- PR: https://git.openjdk.java.net/jdk/pull/3107 From david.holmes at oracle.com Tue Mar 23 04:46:24 2021 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Mar 2021 14:46:24 +1000 Subject: RFR: 8263992: Remove dead code NativeLookup::base_library_lookup In-Reply-To: <9R96xCYl9pa_DIxoOIodKnp-jNoEFSwWB9Cca2c6AM8=.7b83c080-b032-4d2d-a553-f0602d92f463@github.com> References: <9R96xCYl9pa_DIxoOIodKnp-jNoEFSwWB9Cca2c6AM8=.7b83c080-b032-4d2d-a553-f0602d92f463@github.com> Message-ID: <9ffeb790-da80-98df-0bc3-dda6b4e7cea4@oracle.com> On 23/03/2021 2:12 pm, Ioi Lam wrote: > On Tue, 23 Mar 2021 03:21:52 GMT, David Holmes wrote: > >> Hi Ioi, >> >> Looks good but do we have a problem with the notion of "in base library" ??? Should we perhaps have been using this method elsewhere? >> >> Thanks, >> David > > This function was last used here in 2014. (JDK-8031819: Remove legacy jdk checks and code) > > So one has found any use of it for the past 6+ years. I think it's pretty dead. 
> > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/diff/afe58d604f28/src/share/vm/prims/jvm.cpp Yes it looks like this should have been removed as part of 8031819. But I remain a little confused as to what "in base library" means in the rest of the code. Cheers, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3139 > From dholmes at openjdk.java.net Tue Mar 23 05:04:39 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 23 Mar 2021 05:04:39 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v2] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 01:22:56 GMT, Coleen Phillimore wrote: >> Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. >> Tested with vmTestbase/nsk/jvmti and tier1 (in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > missed THREAD that should be CHECK_false argument. Hi Coleen, This looks great! Good to see all those false TRAPS usages disappear. I found one more. Thanks, David PS. Annoying that we often need TRAPS through a call chain just because some leaf method may trigger an OOME. No escaping that unfortunately. src/hotspot/share/prims/jvmtiRedefineClasses.hpp line 484: > 482: void rewrite_cp_refs_in_method(methodHandle method, > 483: methodHandle * new_method_p, TRAPS); > 484: bool rewrite_cp_refs_in_methods(InstanceKlass* scratch_class, TRAPS); This method clears any pending exception and so should not be a TRAPS method. ------------- Marked as reviewed by dholmes (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/3141 From iklam at openjdk.java.net Tue Mar 23 05:07:56 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 23 Mar 2021 05:07:56 GMT Subject: RFR: 8263992: Remove dead code NativeLookup::base_library_lookup [v2] In-Reply-To: References: Message-ID: > Please review this removal of dead code, and unused parameter of `in_base_library` for the functions that remain. The word `base_library_lookup` does not exist in any C source code in the entire JDK. Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: removed unused "in_base_library" parameter; removed unnecessary include of nativeLookup.hpp ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3139/files - new: https://git.openjdk.java.net/jdk/pull/3139/files/039ca686..daa3705e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3139&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3139&range=00-01 Stats: 33 lines in 12 files changed: 0 ins; 12 del; 21 mod Patch: https://git.openjdk.java.net/jdk/pull/3139.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3139/head:pull/3139 PR: https://git.openjdk.java.net/jdk/pull/3139 From iklam at openjdk.java.net Tue Mar 23 05:07:56 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 23 Mar 2021 05:07:56 GMT Subject: RFR: 8263992: Remove dead code NativeLookup::base_library_lookup [v2] In-Reply-To: <9R96xCYl9pa_DIxoOIodKnp-jNoEFSwWB9Cca2c6AM8=.7b83c080-b032-4d2d-a553-f0602d92f463@github.com> References: <9R96xCYl9pa_DIxoOIodKnp-jNoEFSwWB9Cca2c6AM8=.7b83c080-b032-4d2d-a553-f0602d92f463@github.com> Message-ID: On Tue, 23 Mar 2021 04:09:22 GMT, Ioi Lam wrote: >> Hi Ioi, >> >> Looks good but do we have a problem with the notion of "in base library" ??? Should we perhaps have been using this method elsewhere? >> >> Thanks, >> David > >> Hi Ioi, >> >> Looks good but do we have a problem with the notion of "in base library" ??? 
Should we perhaps have been using this method elsewhere? >> >> Thanks, >> David > > This function was last used here in 2014. (JDK-8031819: Remove legacy jdk checks and code) > > So one has found any use of it for the past 6+ years. I think it's pretty dead. > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/diff/afe58d604f28/src/share/vm/prims/jvm.cpp > _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ > > On 23/03/2021 2:12 pm, Ioi Lam wrote: > > > On Tue, 23 Mar 2021 03:21:52 GMT, David Holmes wrote: > > > Hi Ioi, > > > Looks good but do we have a problem with the notion of "in base library" ??? Should we perhaps have been using this method elsewhere? > > > Thanks, > > > David > > > > > > This function was last used here in 2014. (JDK-8031819: Remove legacy jdk checks and code) > > So one has found any use of it for the past 6+ years. I think it's pretty dead. > > http://hg.openjdk.java.net/jdk9/jdk9/hotspot/diff/afe58d604f28/src/share/vm/prims/jvm.cpp > > Yes it looks like this should have been removed as part of 8031819. > > But I remain a little confused as to what "in base library" means in the > rest of the code. > > Cheers, > David I removed the `int& in_base_library` parameter. There were only 2 callers that passed this parameter, but they don't use the returned value. I also removed unnecessary inclusion of nativeLookup.hpp. And I edited the PR description as well. ------------- PR: https://git.openjdk.java.net/jdk/pull/3139 From dholmes at openjdk.java.net Tue Mar 23 05:34:40 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 23 Mar 2021 05:34:40 GMT Subject: RFR: 8263992: Remove dead code NativeLookup::base_library_lookup [v2] In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 05:07:56 GMT, Ioi Lam wrote: >> Please review this removal of dead code, and unused parameter of `in_base_library` for the functions that remain. 
The word `base_library_lookup` does not exist in any C source code in the entire JDK. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > removed unused "in_base_library" parameter; removed unnecessary include of nativeLookup.hpp Looks even better! Thanks. Of course now I'm going to have to go and find out why in_base_library was introduced :) Cheers, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3139 From iklam at openjdk.java.net Tue Mar 23 05:56:40 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 23 Mar 2021 05:56:40 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v2] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 01:22:56 GMT, Coleen Phillimore wrote: >> Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. >> Tested with vmTestbase/nsk/jvmti and tier1 (in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > missed THREAD that should be CHECK_false argument. LGTM. I think the suggested TRAPS changes could be done in a separate RFE. ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3141 From iklam at openjdk.java.net Tue Mar 23 05:56:41 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 23 Mar 2021 05:56:41 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v2] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 04:59:10 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> missed THREAD that should be CHECK_false argument.
> > src/hotspot/share/prims/jvmtiRedefineClasses.hpp line 484: > >> 482: void rewrite_cp_refs_in_method(methodHandle method, >> 483: methodHandle * new_method_p, TRAPS); >> 484: bool rewrite_cp_refs_in_methods(InstanceKlass* scratch_class, TRAPS); > > This method clears any pending exception and so should not be a TRAPS method. `VM_RedefineClasses::load_new_class_versions` also seems to never throw. These functions should be changed to take a `Thread*` parameter, and should use `ExceptionMark em(thread);` to guarantee that an exception never leaves the function. ------------- PR: https://git.openjdk.java.net/jdk/pull/3141 From david.holmes at oracle.com Tue Mar 23 06:01:29 2021 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Mar 2021 16:01:29 +1000 Subject: RFR: 8263992: Remove dead code NativeLookup::base_library_lookup [v2] In-Reply-To: References: Message-ID: <145b8c46-aeb7-b1e0-a9c5-7c0ea7c3bc7a@oracle.com> On 23/03/2021 3:34 pm, David Holmes wrote: > On Tue, 23 Mar 2021 05:07:56 GMT, Ioi Lam wrote: > >>> Please review this removal of dead code, and unused parameter of `in_base_library` for the functions that remain. The word `base_library_lookup` does not exist in any C source code in the entire JDK. >> >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> removed unused "in_base_library" parameter; removed unnecessary include of nativeLookup.hpp > > Looks even better! Thanks. > > Of course now I'm going to have to go and find out why in_base_library was introduced :) "Day 1" code from Feb 1998. David > Cheers, > David > > ------------- > > Marked as reviewed by dholmes (Reviewer).
> > PR: https://git.openjdk.java.net/jdk/pull/3139 > From never at openjdk.java.net Tue Mar 23 06:17:54 2021 From: never at openjdk.java.net (Tom Rodriguez) Date: Tue, 23 Mar 2021 06:17:54 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI Message-ID: 8264016: [JVMCI] add some thread local fields for use by JVMCI ------------- Commit messages: - 8264016: [JVMCI] add some thread local fields for use by JVMCI Changes: https://git.openjdk.java.net/jdk/pull/3147/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3147&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264016 Stats: 14 lines in 3 files changed: 14 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3147.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3147/head:pull/3147 PR: https://git.openjdk.java.net/jdk/pull/3147 From dholmes at openjdk.java.net Tue Mar 23 06:31:46 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 23 Mar 2021 06:31:46 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 06:11:44 GMT, Tom Rodriguez wrote: > 8264016: [JVMCI] add some thread local fields for use by JVMCI Hi Tom, Is it feasible to create a JVMCI helper side-object that is only created when needed, rather than embedding all the fields directly in the JavaThread instance? Thanks, David src/hotspot/share/runtime/thread.hpp line 1020: > 1018: intptr_t* _jvmci_reserved0; > 1019: intptr_t* _jvmci_reserved1; > 1020: oop _jvmci_reserved_oop0; Can this use OopStorage? We've been getting rid of oop fields and the corresponding oops_do support. 
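A minimal sketch of the "side-object created only when needed" idea raised above, assuming nothing about the real `JavaThread` layout. `SketchThread` and `JvmciThreadData` are hypothetical stand-ins, `std::unique_ptr` is used only for brevity (HotSpot code would likely manage the allocation differently), and the oop field is left out because it would need OopStorage or similar handling.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical side-object holding the JVMCI-only fields.
struct JvmciThreadData {
  std::intptr_t* reserved0 = nullptr;
  std::intptr_t* reserved1 = nullptr;
};

// Hypothetical thread class; not the HotSpot JavaThread.
class SketchThread {
  std::unique_ptr<JvmciThreadData> _jvmci_data;  // stays null until first use

 public:
  bool has_jvmci_data() const { return _jvmci_data != nullptr; }

  // Allocates the side-object on first access, so threads that never
  // touch JVMCI state pay only for one pointer.
  JvmciThreadData& jvmci_data() {
    if (_jvmci_data == nullptr) {
      _jvmci_data.reset(new JvmciThreadData());
    }
    assert(_jvmci_data != nullptr);
    return *_jvmci_data;
  }
};
```

The trade-off in this shape is an extra indirection on every access, which may matter if compiled code is expected to reach the fields at a fixed offset from the thread.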
------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From stefank at openjdk.java.net Tue Mar 23 07:37:00 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 23 Mar 2021 07:37:00 GMT Subject: RFR: 8263721: Unify oop casting Message-ID:

In fastdebug builds we replace the "oopDesc* to oop" typedef with a wrapper class that holds an oopDesc*. This wrapper class allows any kind of pointer to be implicitly converted to an oop. So you can write code like this:

    Metadata* m = (Metadata*)0x123;
    oop o = m;

and the compiler will unfortunately accept it. Fortunately, this will be caught in release builds, because you can't convert a Method* into an oopDesc*.

One interesting thing is that you can't convert values of integral type to oops:

    uintptr_t m = uintptr_t(123);
    oop o = m;

This fails in both fastdebug and release builds. To be able to convert integral values to oops, there are two helper functions:

    // For CHECK_UNHANDLED_OOPS, it is ambiguous C++ behavior to have the oop
    // structure contain explicit user defined conversions of both numerical
    // and pointer type. Define inline methods to provide the numerical conversions.
    template <typename T> inline oop cast_to_oop(T value) {
      return (oop)(CHECK_UNHANDLED_OOPS_ONLY((void *))(value));
    }
    template <typename T> inline T cast_from_oop(oop o) {
      return (T)(CHECK_UNHANDLED_OOPS_ONLY((oopDesc*))o);
    }

So, the above example would have to be written as:

    uintptr_t m = uintptr_t(123);
    oop o = cast_to_oop(m);

My proposal is that we stop allowing implicit (and explicit) casts from void*, and instead use cast_to_oop whenever we want to cast to oops. We would still allow oopDesc* to be implicitly converted to oop. This would also allow NULL to be converted to oop without casting:

    oop o = NULL;

This will make the code to convert oops a little bit longer. It could be argued that that's a good thing, because everyone should be cautious about converting things into oops.
This will also give us one entry-point where we could add (probably temporary) verification code. An alternative to the suggestion above, could be to completely get rid of cast_to_oop and cast_from_oop. But for that to work we need to stop using NULL, which is an integral 0, and start to use nullptr for oops. I've prototyped this as well, but initial investigations showed that some tended to prefer having the cast_to_oop function. (We could still move from NULL to nullptr, if we think that is a good idea). ------------- Commit messages: - Convert tab to spaces - Merge remote-tracking branch 'origin/master' into 8263721_unify_oop_casting - Stricter oop casts Changes: https://git.openjdk.java.net/jdk/pull/3047/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3047&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263721 Stats: 250 lines in 90 files changed: 3 ins; 4 del; 243 mod Patch: https://git.openjdk.java.net/jdk/pull/3047.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3047/head:pull/3047 PR: https://git.openjdk.java.net/jdk/pull/3047 From aph at openjdk.java.net Tue Mar 23 09:56:56 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 23 Mar 2021 09:56:56 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v24] In-Reply-To: References: Message-ID: <3fAiKgcWOdYNUYMfY0LSvyMswzjlKJWJaxZgGf7tdYE=.aa5e7ae8-2744-4c2c-9e66-b72e19d9ebec@github.com> On Fri, 12 Mar 2021 16:32:10 GMT, Andrew Haley wrote: >> Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 105 commits: >> >> - Merge commit 'refs/pull/11/head' of https://github.com/AntonKozlov/jdk into jdk-macos >> - workaround JDK-8262895 by disabling subtest >> - Fix typo >> - Rename threadWXSetters.hpp -> threadWXSetters.inline.hpp >> - JDK-8259937: bsd_aarch64 part >> - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos >> - Fix after JDK-8259539, partially revert preconditions >> - JDK-8260471: bsd_aarch64 part >> - JDK-8259539: bsd_aarch64 part >> - JDK-8257828: bsd_aarch64 part >> - ... and 95 more: https://git.openjdk.java.net/jdk/compare/a6e34b3d...a72f6834 > >> @theRealAph, could you elaborate on what is need to be done for [#2200 (review)](https://github.com/openjdk/jdk/pull/2200#pullrequestreview-600597066). > > I think that what you've got now is fine. > _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [build-dev](mailto:build-dev at openjdk.java.net):_ > > On 3/15/21 6:56 PM, Anton Kozlov wrote: > > > On Wed, 10 Mar 2021 11:21:44 GMT, Andrew Haley wrote: > > > > We always check for `R18_RESERVED` with `#if(n)def`, is there any reason to define the value for the macro? > > > > > > > > > Robustness, clarity, maintainability, convention. Why not? > > > > > > I've tried to implement the suggestion, but it pulled more unnecessary changes. It makes the intended way to check the condition less clear (`#ifdef` and not `#if`). > > No, no, no! I am not suggesting you change anything else, just that > you do not define contentless macros. You might as well define it > to be something, and true is a reasonable default, that's all. It's > not terribly important, it's just good practice. I'm quite prepared to drop this if it's holding up the port. It's a style thing, but it's not critical. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From aph at openjdk.java.net Tue Mar 23 09:56:57 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 23 Mar 2021 09:56:57 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v24] In-Reply-To: <3fAiKgcWOdYNUYMfY0LSvyMswzjlKJWJaxZgGf7tdYE=.aa5e7ae8-2744-4c2c-9e66-b72e19d9ebec@github.com> References: <3fAiKgcWOdYNUYMfY0LSvyMswzjlKJWJaxZgGf7tdYE=.aa5e7ae8-2744-4c2c-9e66-b72e19d9ebec@github.com> Message-ID: On Tue, 23 Mar 2021 09:53:54 GMT, Andrew Haley wrote: >>> @theRealAph, could you elaborate on what is need to be done for [#2200 (review)](https://github.com/openjdk/jdk/pull/2200#pullrequestreview-600597066). >> >> I think that what you've got now is fine. > >> _Mailing list message from [Andrew Haley](mailto:aph at redhat.com) on [build-dev](mailto:build-dev at openjdk.java.net):_ >> >> On 3/15/21 6:56 PM, Anton Kozlov wrote: >> >> > On Wed, 10 Mar 2021 11:21:44 GMT, Andrew Haley wrote: >> > > > We always check for `R18_RESERVED` with `#if(n)def`, is there any reason to define the value for the macro? >> > > >> > > >> > > Robustness, clarity, maintainability, convention. Why not? >> > >> > >> > I've tried to implement the suggestion, but it pulled more unnecessary changes. It makes the intended way to check the condition less clear (`#ifdef` and not `#if`). >> >> No, no, no! I am not suggesting you change anything else, just that >> you do not define contentless macros. You might as well define it >> to be something, and true is a reasonable default, that's all. It's >> not terribly important, it's just good practice. > > I'm quite prepared to drop this if it's holding up the port. It's a style thing, but it's not critical. So, where are we up to now? Are we done yet? 
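For readers following the style point about contentless macros, a minimal sketch of the difference may help. The macro names below are hypothetical and chosen for illustration; the thread itself concerns `R18_RESERVED`.

```cpp
#include <cassert>

// Hypothetical macros for illustration only.
#define DEMO_FLAG_EMPTY         // contentless: only #ifdef / #ifndef can test it
#define DEMO_FLAG_VALUED true   // valued: usable with #ifdef and with #if

inline bool empty_flag_is_defined() {
#ifdef DEMO_FLAG_EMPTY
  return true;
#else
  return false;
#endif
}

inline bool valued_flag_is_defined() {
#ifdef DEMO_FLAG_VALUED
  return true;
#else
  return false;
#endif
}

inline bool valued_flag_is_enabled() {
  // In a C++ #if expression, `true` evaluates to 1.
#if DEMO_FLAG_VALUED
  return true;
#else
  return false;
#endif
}

// Note: `#if DEMO_FLAG_EMPTY` would not compile, because the macro
// expands to nothing and leaves #if without an expression.
```

A valued macro keeps working if a later change tests it with `#if`, while existing `#ifdef` checks are unaffected either way.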
------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From stefank at openjdk.java.net Tue Mar 23 10:18:06 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 23 Mar 2021 10:18:06 GMT Subject: RFR: 8263721: Unify oop casting [v2] In-Reply-To: References: Message-ID: <9WBFjpLGVwGtmj6k1PosqnNsp5qbQEObjtS6A_B5Apg=.e29ef71a-8355-437a-a868-39ea123733b0@github.com> > In fastdebug builds we replace the "oopDesc* to oop" typedef with a wrapper class that holds an oopDesc*. This wrapper class allows any kind of pointer to be implicitly converted to an oop. So you can write code like this: > > Metadata* m = (Metadata*)0x123; > oop o = m; > > and the compiler will unfortunately accept it. Fortunately, this will be caught in release builds, because you can't convert a Method* into an oopDesc*. > > One interesting thing is that you can't convert values of integral type too oops: > > uintptr_t m = uintptr_t(123); > oop o = m; > > This fails in both fastdebug and release builds. To be able to convert integral values to oops, there are two helper functions: > > // For CHECK_UNHANDLED_OOPS, it is ambiguous C++ behavior to have the oop > // structure contain explicit user defined conversions of both numerical > // and pointer type. Define inline methods to provide the numerical conversions. > template inline oop cast_to_oop(T value) { > return (oop)(CHECK_UNHANDLED_OOPS_ONLY((void *))(value)); > } > template inline T cast_from_oop(oop o) { > return (T)(CHECK_UNHANDLED_OOPS_ONLY((oopDesc*))o); > } > > So, the above example would have to be written as: > > uintptr_t m = uintptr_t(123); > oop o = cast_to_oop(m); > > My proposal is that we stop allowing implicit (and explicit) casts from void*, and instead use cast_to_oop whenever we want to cast to oops. We would still allow oopDesc* to be implicitly converted to oop. 
This would also allow NULL to be converted too oop without casting: > > oop o = NULL; > > This will make the code to convert oops a little bit longer. It could be argued that that's a good thing, because everyone should be cautious about converting things into oops. This will also give us one entry-point where we could add (probably temporary) verification code. > > An alternative to the suggestion above, could be to completely get rid of cast_to_oop and cast_from_oop. But for that to work we need to stop using NULL, which is an integral 0, and start to use nullptr for oops. I've prototyped this as well, but initial investigations showed that some tended to prefer having the cast_to_oop function. (We could still move from NULL to nullptr, if we think that is a good idea). Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - Remove casts in merged changes - Merge remote-tracking branch 'origin/master' into 8263721_unify_oop_casting - Convert tab to spaces - Merge remote-tracking branch 'origin/master' into 8263721_unify_oop_casting - Stricter oop casts ------------- Changes: https://git.openjdk.java.net/jdk/pull/3047/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3047&range=01 Stats: 252 lines in 90 files changed: 3 ins; 4 del; 245 mod Patch: https://git.openjdk.java.net/jdk/pull/3047.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3047/head:pull/3047 PR: https://git.openjdk.java.net/jdk/pull/3047 From kbarrett at openjdk.java.net Tue Mar 23 10:21:43 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 23 Mar 2021 10:21:43 GMT Subject: RFR: 8263721: Unify oop casting [v2] In-Reply-To: <9WBFjpLGVwGtmj6k1PosqnNsp5qbQEObjtS6A_B5Apg=.e29ef71a-8355-437a-a868-39ea123733b0@github.com> References: <9WBFjpLGVwGtmj6k1PosqnNsp5qbQEObjtS6A_B5Apg=.e29ef71a-8355-437a-a868-39ea123733b0@github.com> Message-ID: On Tue, 23 Mar 2021 10:18:06 GMT, Stefan 
Karlsson wrote: >> In fastdebug builds we replace the "oopDesc* to oop" typedef with a wrapper class that holds an oopDesc*. This wrapper class allows any kind of pointer to be implicitly converted to an oop. So you can write code like this: >> >> Metadata* m = (Metadata*)0x123; >> oop o = m; >> >> and the compiler will unfortunately accept it. Fortunately, this will be caught in release builds, because you can't convert a Method* into an oopDesc*. >> >> One interesting thing is that you can't convert values of integral type too oops: >> >> uintptr_t m = uintptr_t(123); >> oop o = m; >> >> This fails in both fastdebug and release builds. To be able to convert integral values to oops, there are two helper functions: >> >> // For CHECK_UNHANDLED_OOPS, it is ambiguous C++ behavior to have the oop >> // structure contain explicit user defined conversions of both numerical >> // and pointer type. Define inline methods to provide the numerical conversions. >> template inline oop cast_to_oop(T value) { >> return (oop)(CHECK_UNHANDLED_OOPS_ONLY((void *))(value)); >> } >> template inline T cast_from_oop(oop o) { >> return (T)(CHECK_UNHANDLED_OOPS_ONLY((oopDesc*))o); >> } >> >> So, the above example would have to be written as: >> >> uintptr_t m = uintptr_t(123); >> oop o = cast_to_oop(m); >> >> My proposal is that we stop allowing implicit (and explicit) casts from void*, and instead use cast_to_oop whenever we want to cast to oops. We would still allow oopDesc* to be implicitly converted to oop. This would also allow NULL to be converted too oop without casting: >> >> oop o = NULL; >> >> This will make the code to convert oops a little bit longer. It could be argued that that's a good thing, because everyone should be cautious about converting things into oops. This will also give us one entry-point where we could add (probably temporary) verification code. >> >> An alternative to the suggestion above, could be to completely get rid of cast_to_oop and cast_from_oop. 
But for that to work we need to stop using NULL, which is an integral 0, and start to use nullptr for oops. I've prototyped this as well, but initial investigations showed that some tended to prefer having the cast_to_oop function. (We could still move from NULL to nullptr, if we think that is a good idea). > > Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Remove casts in merged changes > - Merge remote-tracking branch 'origin/master' into 8263721_unify_oop_casting > - Convert tab to spaces > - Merge remote-tracking branch 'origin/master' into 8263721_unify_oop_casting > - Stricter oop casts Looks good. Just the one minor nit. There are a lot of casts to `void*` as an argument that ought to be cleaned up. And the set of valid types for the cast functions ought to be restricted to "reasonable" types. But that can be a followup. src/hotspot/share/oops/oopsHierarchy.hpp line 89: > 87: > 88: public: > 89: oop() : _o(NULL) { register_if_checking(); } NULL -> nullptr ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3047 From vkempik at openjdk.java.net Tue Mar 23 11:20:52 2021 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Tue, 23 Mar 2021 11:20:52 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v24] In-Reply-To: References: <3fAiKgcWOdYNUYMfY0LSvyMswzjlKJWJaxZgGf7tdYE=.aa5e7ae8-2744-4c2c-9e66-b72e19d9ebec@github.com> Message-ID: On Tue, 23 Mar 2021 09:54:16 GMT, Andrew Haley wrote: > So, where are we up to now? Are we done yet? 
Hello, we would like to get approval for the final version we have now, and then integrate this PR as soon as Mark targets it to JDK 17.

-------------

PR: https://git.openjdk.java.net/jdk/pull/2200

From coleenp at openjdk.java.net Tue Mar 23 11:40:50 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 23 Mar 2021 11:40:50 GMT
Subject: Integrated: 8263974: Move SystemDictionary::verify_protection_domain
In-Reply-To:
References:
Message-ID:

On Mon, 22 Mar 2021 14:57:02 GMT, Coleen Phillimore wrote:

> Please review this mostly trivial fix to move SystemDictionary::validate_protection_domain into Dictionary and hide the functions in Dictionary that it calls. This change also removes some #include dictionary.hpp and a TRAPS parameter where not needed.
>
> See CR for more details. This function was in the middle of others that I want to keep together in systemDictionary.cpp.
>
> Tested with tier1 on 4 Oracle supported platforms.

This pull request has now been integrated.

Changeset: de2ff256
Author: Coleen Phillimore
URL: https://git.openjdk.java.net/jdk/commit/de2ff256
Stats: 175 lines in 8 files changed: 80 ins; 85 del; 10 mod

8263974: Move SystemDictionary::verify_protection_domain

Reviewed-by: hseigel, lfoltan, dholmes

-------------

PR: https://git.openjdk.java.net/jdk/pull/3120

From coleenp at openjdk.java.net Tue Mar 23 11:40:50 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Tue, 23 Mar 2021 11:40:50 GMT
Subject: RFR: 8263974: Move SystemDictionary::verify_protection_domain [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 23 Mar 2021 04:37:24 GMT, David Holmes wrote:

>> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Move comment down to the code it describes.
>
> Looks good!
>
> Thanks,
> David

Thanks David!
------------- PR: https://git.openjdk.java.net/jdk/pull/3120 From lucy at openjdk.java.net Tue Mar 23 11:53:42 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 23 Mar 2021 11:53:42 GMT Subject: RFR: 8263260: [s390] Support latest hardware (z14 and z15) [v2] In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 15:10:15 GMT, Lutz Schmidt wrote: >> Thanks for cleaning up! LGTM. > > Goetz and Martin, > thanks a lot for your reviews. > Lutz Build errors on Win-x64 are unrelated to these s390-only changes. ------------- PR: https://git.openjdk.java.net/jdk/pull/2918 From lucy at openjdk.java.net Tue Mar 23 11:53:42 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 23 Mar 2021 11:53:42 GMT Subject: Integrated: 8263260: [s390] Support latest hardware (z14 and z15) In-Reply-To: References: Message-ID: On Wed, 10 Mar 2021 17:35:20 GMT, Lutz Schmidt wrote: > 8263260: [s390] Support latest hardware (z14 and z15) This pull request has now been integrated. Changeset: fbd57bd4 Author: Lutz Schmidt URL: https://git.openjdk.java.net/jdk/commit/fbd57bd4 Stats: 178 lines in 2 files changed: 89 ins; 53 del; 36 mod 8263260: [s390] Support latest hardware (z14 and z15) Reviewed-by: goetz, mdoerr ------------- PR: https://git.openjdk.java.net/jdk/pull/2918 From coleenp at openjdk.java.net Tue Mar 23 11:54:39 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 11:54:39 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v2] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 05:52:51 GMT, Ioi Lam wrote: >> src/hotspot/share/prims/jvmtiRedefineClasses.hpp line 484: >> >>> 482: void rewrite_cp_refs_in_method(methodHandle method, >>> 483: methodHandle * new_method_p, TRAPS); >>> 484: bool rewrite_cp_refs_in_methods(InstanceKlass* scratch_class, TRAPS); >> >> This method clears any pending exception and so should not 
be a TRAPS method. > > `VM_RedefineClasses::load_new_class_versions` also seems to never throw. These functions should be changed to take a `Thread*` parameter, and should use `HandleMark em(thread);` to guarantee that an exception never leaves the function. Both of these functions are good examples of the convention that we're trying to agree on. In, load_new_class_versions, TRAPS is passed to materialize THREAD. THREAD is used then in a lot of places, and also to pass to SystemDictionary::parse_stream(...TRAPS), which does have an exceptional return that must be handled. Removing TRAPS then adding: JavaThread* current = JavaThread::current(); changing THREAD to current in most of the places seems ok, but passing 'current' to SystemDictionary::resolve_from_stream loses the information visually that this function returns an exception that must be handled. We need some meta-writeup rather than these decisions made in pull requests, because if we made these decisions collectively, I missed out. ------------- PR: https://git.openjdk.java.net/jdk/pull/3141 From coleenp at openjdk.java.net Tue Mar 23 12:14:40 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 12:14:40 GMT Subject: RFR: 8263976: Remove block allocation from BasicHashtable [v2] In-Reply-To: References: <27GM9pmZuaikjHuuvXVRYgOPnTqMxNOXhd1CR7BzaKo=.ab0ec842-335c-4c85-807b-dc6bcb6b77e3@github.com> Message-ID: On Mon, 22 Mar 2021 23:08:09 GMT, Ioi Lam wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix Hashtable constructor and comments. > > Marked as reviewed by iklam (Reviewer). Thanks Lois and Ioi! 
------------- PR: https://git.openjdk.java.net/jdk/pull/3123 From coleenp at openjdk.java.net Tue Mar 23 12:14:41 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 12:14:41 GMT Subject: Integrated: 8263976: Remove block allocation from BasicHashtable In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 15:49:24 GMT, Coleen Phillimore wrote: > From CR: > The useful/general BasicHashtable uses a block allocation scheme to reportedly reduce fragmentation. When the StringTable and SymbolTable used to use this hashtable, performance benefits were reportedly observed because of the block allocation scheme. Since these tables were moved to the concurrent hashtables, the tables left that use the block allocation scheme are: > > AdapterHandlerLibrary, ResolutionError, LoaderConstraints, Leak profiler bitset table and Placeholders. 3 of these tables are very small and never needed block allocation to prevent fragmentation at least. Also there are 3 KVHashtables, which are built from BasicHashtable. 2 are used during dumping and 1 is ID2KlassTable which appears small. > > ModuleEntry, PackageEntry, Dictionary, G1RootSet for nmethods, and JvmtiTagMap tables didn't use the block allocation scheme. > > Removing this removes 7 pointers per table, and for each ClassLoaderData, which has 3 tables, removes 21 pointers. > > This change was performance tested on linux and windows. > > It was also tested with tier1-6. This pull request has now been integrated. 
Changeset: 5bc382fb Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/5bc382fb Stats: 178 lines in 16 files changed: 10 ins; 131 del; 37 mod 8263976: Remove block allocation from BasicHashtable Reviewed-by: lfoltan, iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/3123 From ihse at openjdk.java.net Tue Mar 23 12:52:56 2021 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Tue, 23 Mar 2021 12:52:56 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v29] In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 12:50:14 GMT, Anton Kozlov wrote: >> Please review the implementation of JEP 391: macOS/AArch64 Port. >> >> It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. >> >> Major changes are in: >> * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) >> * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) >> * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. >> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) > > Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 115 commits: > > - Merge branch 'master' into jdk-macos > - JDK-8262491: bsd_aarch64 part > - JDK-8263002: bsd_aarch64 part > - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos > - Wider #ifdef block > - Fix most of issues in java/foreign/ tests > > Failures related to va_args are tracked in JDK-8263512. > - Add Azul copyright > - Update Oracle copyright years > - Use Thread::current_or_null_safe in SafeFetch > - 8262903: [macos_aarch64] Thread::current() called on detached thread > - ... and 105 more: https://git.openjdk.java.net/jdk/compare/a9d2267f...5add9269 Build changes still look good. Hope you can get this done now! :) ------------- Marked as reviewed by ihse (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2200 From erikj at openjdk.java.net Tue Mar 23 12:52:55 2021 From: erikj at openjdk.java.net (Erik Joelsson) Date: Tue, 23 Mar 2021 12:52:55 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v29] In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 12:50:14 GMT, Anton Kozlov wrote: >> Please review the implementation of JEP 391: macOS/AArch64 Port. >> >> It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. >> >> Major changes are in: >> * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) >> * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) >> * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. 
The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. >> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) > > Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 115 commits: > > - Merge branch 'master' into jdk-macos > - JDK-8262491: bsd_aarch64 part > - JDK-8263002: bsd_aarch64 part > - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos > - Wider #ifdef block > - Fix most of issues in java/foreign/ tests > > Failures related to va_args are tracked in JDK-8263512. > - Add Azul copyright > - Update Oracle copyright years > - Use Thread::current_or_null_safe in SafeFetch > - 8262903: [macos_aarch64] Thread::current() called on detached thread > - ... and 105 more: https://git.openjdk.java.net/jdk/compare/a9d2267f...5add9269 Build changes look good. ------------- Marked as reviewed by erikj (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2200 From david.holmes at oracle.com Tue Mar 23 13:01:40 2021 From: david.holmes at oracle.com (David Holmes) Date: Tue, 23 Mar 2021 23:01:40 +1000 Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v2] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: <754425a7-49e7-d84b-f655-30a80e38e358@oracle.com> On 23/03/2021 9:54 pm, Coleen Phillimore wrote: > On Tue, 23 Mar 2021 05:52:51 GMT, Ioi Lam wrote: > >>> src/hotspot/share/prims/jvmtiRedefineClasses.hpp line 484: >>> >>>> 482: void rewrite_cp_refs_in_method(methodHandle method, >>>> 483: methodHandle * new_method_p, TRAPS); >>>> 484: bool rewrite_cp_refs_in_methods(InstanceKlass* scratch_class, TRAPS); >>> >>> This method clears any pending exception and so should not be a TRAPS method. 
>> >> `VM_RedefineClasses::load_new_class_versions` also seems to never throw. These functions should be changed to take a `Thread*` parameter, and should use `HandleMark em(thread);` to guarantee that an exception never leaves the function. > > Both of these functions are good examples of the convention that we're trying to agree on. In, load_new_class_versions, TRAPS is passed to materialize THREAD. THREAD is used then in a lot of places, and also to pass to SystemDictionary::parse_stream(...TRAPS), which does have an exceptional return that must be handled. > Removing TRAPS then adding: > JavaThread* current = JavaThread::current(); > changing THREAD to current in most of the places seems ok, but passing 'current' to SystemDictionary::resolve_from_stream loses the information visually that this function returns an exception that must be handled. Okay ... Only a function that upon return may have directly, or indirectly, caused an exception to be pending should be declared with TRAPS. The caller is then expected to use the CHECK macros under most conditions. If a function is going to call a TRAPS function but clear the exception, then it should manifest a THREAD variable and pass that, both to indicate the called function is TRAPS and to allow its own use of the exception macros that depend on THREAD. But that function should not itself declare TRAPS just to get THREAD. How to manifest THREAD depends on the exact context, and we already have these cases today: - Thread* THREAD = Thread::current(); - Thread* THREAD = In this case I would expect to see: jvmtiError VM_RedefineClasses::load_new_class_versions(Thread* current) { ... ResourceMark rm(current); JvmtiThreadState *state = JvmtiThreadState::state_for(current->as_Java_thread()); ... HandleMark hm(current); ... Handle the_class_loader(current, the_class->class_loader()); Handle protection_domain(current, the_class->protection_domain()); ... 
Thread* THREAD = current; // For exception processing InstanceKlass* scratch_class = SystemDictionary::parse_stream( the_class_sym, the_class_loader, &st, cl_info, THREAD); ... if (HAS_PENDING_EXCEPTION) { ... the_class->link_class(THREAD); if (HAS_PENDING_EXCEPTION) { ... I'll point out, to be clear that I recognise it, that the existing CATCH macro does not fit in with these conventions as you only apply CATCH to a function that never lets an exception escape, and such a function should not be declared TRAPS and should never be passed THREAD, but you need to manifest THREAD to use CATCH. I consider the CATCH macro to be a well-intentioned mistake. > We need some meta-writeup rather than these decisions made in pull requests, because if we made these decisions collectively, I missed out. Once we've fleshed things out we can propose them for the style guide. But it is easier to discuss concrete examples. The important things (to me at least) at the moment are: - we get rid of TRAPS from non-exception throwing code - we pave the way to allow changing of TRAPS to declare "JavaThread* THREAD" so that functions that always expect to be called on a JavaThread can explicitly indicate that. (And that goes for non-TRAPS functions too.) 
- we simplify/clarify code that uses an arbitrary mix of names for the current thread, and establish simple conventions to use going forward and to apply a-posteri on a time available basis when we do cleanups Cheers, David ----- > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3141 > From jvernee at openjdk.java.net Tue Mar 23 13:07:38 2021 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Tue, 23 Mar 2021 13:07:38 GMT Subject: RFR: 8263028: Windows build fails due to several treat-warning-as-errors In-Reply-To: References: <6cv_HeWJ9HsBrB7NSFU-TGl4PP0Tp820KzwJ-FRn_so=.e4d8ab11-4203-4278-a829-43b6f1626465@github.com> Message-ID: <_4snjvydeDKDu6aZgC1Fqr_kc6sHII7F7Ywinh3H9Rw=.61007752-81c3-44a0-911f-4dd184259966@github.com> On Tue, 23 Mar 2021 04:38:59 GMT, Yi Yang wrote: >>> > The problem in methodMatcher.cpp is caused by this: >>> > ``` >>> > #define RANGEBASE "\x1\x2\x3\x4\x5\x6\x7\x8\xa\xb\xc\xd\xe\xf" \ >>> > "\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f" \ >>> > "\x21\x22\x23\x24\x25\x26\x27\x2a\x2b\x2c\x2d" ..... >>> > ``` >>> > >>> > >>> > where the literal `\x25` is the `%` character. It seems like VC++ tries to interpret the `%` character even when it's inside square brackets, like >>> > ``` >>> > if (1 == sscanf(line, "%1022[[);/" RANGEBASE "]%n", sig+1, &bytes_read)) { >>> > -> >>> > if (1 == sscanf(line, "%1022[[);.....#$%&.....]%n", sig+1, &bytes_read)) { >>> > ^ >>> > ``` >>> > >>> > >>> > The [C++ reference](https://en.cppreference.com/w/c/io/fscanf) is unclear about how characters like `%` can be escaped inside square brackets (or whether they should be escaped at all). >>> > Trying to use sscanf for this purpose makes the code hard to understand and non portable. It's better to ditch sscanf and read the characters byte-by-byte. That way, you can get rid of the original `PRAGMA_DISABLE_MSVC_WARNING(4819)` as well. >>> >>> The explanation makes sense. 
We can parse class name and method name via byte-by-byte stream instead of advanced regex-like sscanf. For this reason, I also put a FIXME comment above MethodMatcher::parse_method_pattern. >>> >>> This build failure also appears in the [downstream JDK](https://github.com/alibaba/dragonwell11/issues/70), blocking further development. So the purpose of this PR is to address these treat-warning-as-error problems. I'd like to rewrite this function in another PR. >> >> I think we should add `#pragmas` one only as a last resort. We need to understand why the problem is happening. >> >> This code has been there for a long time. I wonder what happened to cause it to fail in your build system. Could it be related to a particular version of VC++? I checked the build logs from Oracle's CI as well as GitHub actions >> >> 19.27.29111 Oracle -- OK >> 19.28.29334 Alibaba -- warnings >> 19.28.29910 GitHub -- OK >> >> Or, is it related to the system language setting of your build machine (e.g., could it be set to Chinese?) >> >> Most importantly, does the `#pragma` actually make the code work, or does it merely hides the problem? I.e., will sscanf fail at runtime when it sees the `%&`? Have you run any tests to validate that the affected code works with your toolchain? > > I have confirmed that this problem is related to a specific msvc version(At least it happens in msvc 19.28.29334, I have not tested other msvc versions). After applying this patch, I can build successfully, both on upstream JDK and Alibaba JDK. > > I have written a minimal reproducible demo. When the /WX(Treats all compiler warnings as errors) option is turned on, the compiler issued the same warnings. After disabling these warnings via `#pragma`, the compiler will not complain about anything, the execution result is exactly as I expected. Besides, all test cases under `compiler/compilercontrol/` are passed except ClearDirectivesFileStackTest.java which has been problem-listed before. 
So I believe this is a false positive of the compilation warning, turning it off or on will not affect the runtime behavior. > > FYI: See detailed jtreg log and minimal reproducible demo on JBS attachments. > > Thanks, > Yang I happened to have the exact same version of cl.exe (19.28.29334) but have not been seeing this issue, and I can't reproduce it with the reproducer you attached to the JBS issue either. I do note that the reproducer has a bunch of `LS` characters where `LF` are expected, and Visual Studio asks if I want to normalize them, but, whether I do that or not doesn't change the outcome. I just don't get the same warnings. So, it seems like there might be an environment/configuration problem somewhere, and these warnings are just a symptom of that. It might be worth it to investigate further. ------------- PR: https://git.openjdk.java.net/jdk/pull/3107 From hseigel at openjdk.java.net Tue Mar 23 13:16:42 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 23 Mar 2021 13:16:42 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v2] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 01:22:56 GMT, Coleen Phillimore wrote: >> Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. >> Tested with vmTestbase/nsk/jvmti and tier1 (in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > missed THREAD that should be CHECK_false argument. src/hotspot/share/oops/constantPool.cpp line 1426: > 1424: bool match_entry = compare_entry_to(k1, cp2, k2); > 1425: bool match_operand = compare_operand_to(i1, cp2, i2); > 1426: return (match_entry && match_operand); Is it worth changing this to: If (compare_entry_to(...) && compare_operand_to(..)) { .. } Then if the first one is false the second call isn't needed? 
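The suggestion above relies on `&&` short-circuiting. A minimal sketch of the two shapes follows, assuming stand-in comparison functions that simply count their calls; `compare_entry`, `compare_operand` and `Counters` are hypothetical helpers, not the `ConstantPool` members.

```cpp
#include <cassert>

// Hypothetical stand-ins that record how often each comparison runs.
struct Counters {
  int entry_calls = 0;
  int operand_calls = 0;
};

inline bool compare_entry(Counters& c, bool result) {
  ++c.entry_calls;
  return result;
}
inline bool compare_operand(Counters& c, bool result) {
  ++c.operand_calls;
  return result;
}

// Shape of the code under review: both comparisons always run.
inline bool match_both_eager(Counters& c, bool entry, bool operand) {
  bool match_entry = compare_entry(c, entry);
  bool match_operand = compare_operand(c, operand);
  return match_entry && match_operand;
}

// Suggested shape: && skips the second call when the first is false.
inline bool match_both_short_circuit(Counters& c, bool entry, bool operand) {
  return compare_entry(c, entry) && compare_operand(c, operand);
}
```

Both functions return the same result; they differ only in whether the second comparison is evaluated after the first has already failed, which matters if that comparison is costly or has side effects.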
------------- PR: https://git.openjdk.java.net/jdk/pull/3141 From akozlov at openjdk.java.net Tue Mar 23 13:28:52 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 23 Mar 2021 13:28:52 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v29] In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 12:49:34 GMT, Magnus Ihse Bursie wrote: >> Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 115 commits: >> >> - Merge branch 'master' into jdk-macos >> - JDK-8262491: bsd_aarch64 part >> - JDK-8263002: bsd_aarch64 part >> - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos >> - Wider #ifdef block >> - Fix most of issues in java/foreign/ tests >> >> Failures related to va_args are tracked in JDK-8263512. >> - Add Azul copyright >> - Update Oracle copyright years >> - Use Thread::current_or_null_safe in SafeFetch >> - 8262903: [macos_aarch64] Thread::current() called on detached thread >> - ... and 105 more: https://git.openjdk.java.net/jdk/compare/a9d2267f...5add9269 > > Build changes still look good. Hope you can get this done now! :) > > No, no, no! I am not suggesting you change anything else, just that > > you do not define contentless macros. You might as well define it > > to be something, and true is a reasonable default, that's all. It's > > not terribly important, it's just good practice. > > I'm quite prepared to drop this if it's holding up the port. It's a style thing, but it's not critical. Sorry, I missed your reply. R18_RESERVED is also defined in https://github.com/openjdk/jdk/blob/master/make/hotspot/gensrc/GensrcAdlc.gmk#L96. I think changing the value here and there would be slightly out of the scope of this PR, so I would prefer to avoid the suggested change. The biggest argument from my side is that the current macro value is consistent with the rest of the macros in this file. 
For example https://github.com/openjdk/jdk/blob/8c1ab38ee20ed61fefbb64b6a9ee605c52d2cb4e/src/hotspot/cpu/aarch64/globalDefinitions_aarch64.hpp#L35 and https://github.com/openjdk/jdk/blob/b7b391b2ac4208eabdee4e93acd5b0e364953f94/src/hotspot/share/runtime/mutexLocker.cpp#L137 But https://github.com/openjdk/jdk/blob/8c1ab38ee20ed61fefbb64b6a9ee605c52d2cb4e/src/hotspot/cpu/aarch64/globalDefinitions_aarch64.hpp#L59 and https://github.com/openjdk/jdk/blob/b23228d152ff8fa27bd32d9ef1307bf315039dea/src/hotspot/share/runtime/arguments.cpp#L1540 ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From hseigel at openjdk.java.net Tue Mar 23 13:37:40 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 23 Mar 2021 13:37:40 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v2] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: <5XKapnZtZxjCTYxbXI4iep2_lzry40YppJM554ao4vI=.abb36d69-f855-4fbb-a8cd-e1ea32ab1c40@github.com> On Tue, 23 Mar 2021 01:22:56 GMT, Coleen Phillimore wrote: >> Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. >> Tested with vmTestbase/nsk/jvmti and tier1 (in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > missed THREAD that should be CHECK_false argument. The changes look good! Thanks, Harold ------------- Marked as reviewed by hseigel (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3141 From coleenp at openjdk.java.net Tue Mar 23 13:37:41 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 13:37:41 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v2] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 13:14:15 GMT, Harold Seigel wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> missed THREAD that should be CHECK_false argument. > > src/hotspot/share/oops/constantPool.cpp line 1426: > >> 1424: bool match_entry = compare_entry_to(k1, cp2, k2); >> 1425: bool match_operand = compare_operand_to(i1, cp2, i2); >> 1426: return (match_entry && match_operand); > > Is it worth changing this to: return (compare_entry_to(...) && compare_operand_to(..)); > Then if the first one is false the second call isn't needed? I kind of thought these would make a long complicated expression and so the single use variables is helpful. I don't think performance is important here. ------------- PR: https://git.openjdk.java.net/jdk/pull/3141 From coleenp at openjdk.java.net Tue Mar 23 13:42:42 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 13:42:42 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v2] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 13:33:49 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/constantPool.cpp line 1426: >> >>> 1424: bool match_entry = compare_entry_to(k1, cp2, k2); >>> 1425: bool match_operand = compare_operand_to(i1, cp2, i2); >>> 1426: return (match_entry && match_operand); >> >> Is it worth changing this to: return (compare_entry_to(...) 
&& compare_operand_to(..)); >> Then if the first one is false the second call isn't needed? > > I kind of thought these would make a long complicated expression and so the single use variables is helpful. I don't think performance is important here. {code} - match = compare_entry_to(recur1, cp2, recur2); - if (match) { + if (compare_entry_to(recur1, cp2, recur2)) { return true; } I could do this ------------- PR: https://git.openjdk.java.net/jdk/pull/3141 From vkempik at openjdk.java.net Tue Mar 23 13:47:49 2021 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Tue, 23 Mar 2021 13:47:49 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v29] In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 13:26:19 GMT, Anton Kozlov wrote: >> Build changes still look good. Hope you can get this done now! :) > >> > No, no, no! I am not suggesting you change anything else, just that >> > you do not define contentless macros. You might as well define it >> > to be something, and true is a reasonable default, that's all. It's >> > not terribly important, it's just good practice. >> >> I'm quite prepared to drop this if it's holding up the port. It's a style thing, but it's not critical. > > Sorry, I missed your reply. > > R18_RESERVED is also defined in https://github.com/openjdk/jdk/blob/master/make/hotspot/gensrc/GensrcAdlc.gmk#L96. I think changing the value here and there would be slightly out of the scope of this PR, so I would prefer to avoid the suggested change. > > The biggest argument from my side is that the current macro value is consistent with the rest of the macros in this file. 
For example https://github.com/openjdk/jdk/blob/8c1ab38ee20ed61fefbb64b6a9ee605c52d2cb4e/src/hotspot/cpu/aarch64/globalDefinitions_aarch64.hpp#L35 > and https://github.com/openjdk/jdk/blob/b7b391b2ac4208eabdee4e93acd5b0e364953f94/src/hotspot/share/runtime/mutexLocker.cpp#L137 > > But https://github.com/openjdk/jdk/blob/8c1ab38ee20ed61fefbb64b6a9ee605c52d2cb4e/src/hotspot/cpu/aarch64/globalDefinitions_aarch64.hpp#L59 > and > https://github.com/openjdk/jdk/blob/b23228d152ff8fa27bd32d9ef1307bf315039dea/src/hotspot/share/runtime/arguments.cpp#L1540 Hello That depends on the will of openjdk11 maintainers to accept this (and few other, like jep-388, as we depend on it) contribution. > On 23 March 2021, at 16:39, drej1 ***@***.***> wrote: > > > So, where are we up to now? Are we done yet? > > Hello > we would like to get approval for the final version we have now and then integrate this pr as soon as Mark will target it to jdk17 > > Hi there, will this be also supported backwards? To support java11 LTS version? ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From jvernee at openjdk.java.net Tue Mar 23 13:50:42 2021 From: jvernee at openjdk.java.net (Jorn Vernee) Date: Tue, 23 Mar 2021 13:50:42 GMT Subject: RFR: 8263028: Windows build fails due to several treat-warning-as-errors In-Reply-To: <_4snjvydeDKDu6aZgC1Fqr_kc6sHII7F7Ywinh3H9Rw=.61007752-81c3-44a0-911f-4dd184259966@github.com> References: <6cv_HeWJ9HsBrB7NSFU-TGl4PP0Tp820KzwJ-FRn_so=.e4d8ab11-4203-4278-a829-43b6f1626465@github.com> <_4snjvydeDKDu6aZgC1Fqr_kc6sHII7F7Ywinh3H9Rw=.61007752-81c3-44a0-911f-4dd184259966@github.com> Message-ID: On Tue, 23 Mar 2021 13:04:26 GMT, Jorn Vernee wrote: >> I have confirmed that this problem is related to a specific msvc version(At least it happens in msvc 19.28.29334, I have not tested other msvc versions).
After applying this patch, I can build successfully, both on upstream JDK and Alibaba JDK. >> >> I have written a minimal reproducible demo. When the /WX(Treats all compiler warnings as errors) option is turned on, the compiler issued the same warnings. After disabling these warnings via `#pragma`, the compiler will not complain about anything, the execution result is exactly as I expected. Besides, all test cases under `compiler/compilercontrol/` are passed except ClearDirectivesFileStackTest.java which has been problem-listed before. So I believe this is a false positive of the compilation warning, turning it off or on will not affect the runtime behavior. >> >> FYI: See detailed jtreg log and minimal reproducible demo on JBS attachments. >> >> Thanks, >> Yang > > I happened to have the exact same version of cl.exe (19.28.29334) but have not been seeing this issue, and I can't reproduce it with the reproducer you attached to the JBS issue either. > > I do note that the reproducer has a bunch of `LS` characters where `LF` are expected, and Visual Studio asks if I want to normalize them, but, whether I do that or not doesn't change the outcome. I just don't get the same warnings. > > So, it seems like there might be an environment/configuration problem somewhere, and these warnings are just a symptom of that. It might be worth it to investigate further. Based on Ioi's suggestion I decided to try with a different locale as well. I tried setting my system locale to `Chinese (Simplified, China)` and with that I was able to reproduce the warnings you report, so it indeed seems to be an issue with locale settings. AFAIK only `en-us` is supported. I've had problems in the past as well because I had the wrong locale set, and some of the tests were failing because of that. 
So, maybe rather than disabling the warnings, it might be more prudent to change the system locale of the used build systems to prevent similar issues in the future (FWIW, the display language doesn't seem to affect `cl` so that could still be whatever is convenient). ------------- PR: https://git.openjdk.java.net/jdk/pull/3107 From aph at openjdk.java.net Tue Mar 23 13:56:59 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 23 Mar 2021 13:56:59 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v29] In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 12:50:14 GMT, Anton Kozlov wrote: >> Please review the implementation of JEP 391: macOS/AArch64 Port. >> >> It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. >> >> Major changes are in: >> * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) >> * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) >> * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. >> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) > > Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 115 commits: > > - Merge branch 'master' into jdk-macos > - JDK-8262491: bsd_aarch64 part > - JDK-8263002: bsd_aarch64 part > - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos > - Wider #ifdef block > - Fix most of issues in java/foreign/ tests > > Failures related to va_args are tracked in JDK-8263512. > - Add Azul copyright > - Update Oracle copyright years > - Use Thread::current_or_null_safe in SafeFetch > - 8262903: [macos_aarch64] Thread::current() called on detached thread > - ... and 105 more: https://git.openjdk.java.net/jdk/compare/a9d2267f...5add9269 Marked as reviewed by aph (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From aph at openjdk.java.net Tue Mar 23 14:00:58 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 23 Mar 2021 14:00:58 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v29] In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 13:54:24 GMT, Andrew Haley wrote: >> Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 115 commits: >> >> - Merge branch 'master' into jdk-macos >> - JDK-8262491: bsd_aarch64 part >> - JDK-8263002: bsd_aarch64 part >> - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos >> - Wider #ifdef block >> - Fix most of issues in java/foreign/ tests >> >> Failures related to va_args are tracked in JDK-8263512. >> - Add Azul copyright >> - Update Oracle copyright years >> - Use Thread::current_or_null_safe in SafeFetch >> - 8262903: [macos_aarch64] Thread::current() called on detached thread >> - ... and 105 more: https://git.openjdk.java.net/jdk/compare/a9d2267f...5add9269 > > Marked as reviewed by aph (Reviewer). > [ Back-porting this patch to JDK 11] depends on the will of openjdk11 maintainers to accept this (and few other, like jep-388, as we depend on it) contribution. 
To the extent that 11u has fixed policies :) we definitely have a policy of accepting patches to keep 11u working on current hardware. So yes. ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From vkempik at openjdk.java.net Tue Mar 23 14:03:52 2021 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Tue, 23 Mar 2021 14:03:52 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v29] In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 13:58:03 GMT, Andrew Haley wrote: > > [ Back-porting this patch to JDK 11] depends on the will of openjdk11 maintainers to accept this (and few other, like jep-388, as we depend on it) contribution. > > To the extent that 11u has fixed policies :) we definitely have a policy of accepting patches to keep 11u working on current hardware. So yes. @lewurm That sounds like a green flag for you and jep-388 (with its R18_RESERVED functionality) ;) ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From mbeckwit at openjdk.java.net Tue Mar 23 14:27:53 2021 From: mbeckwit at openjdk.java.net (Monica Beckwith) Date: Tue, 23 Mar 2021 14:27:53 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v29] In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 14:01:12 GMT, Vladimir Kempik wrote: > > > [ Back-porting this patch to JDK 11] depends on the will of openjdk11 maintainers to accept this (and few other, like jep-388, as we depend on it) contribution. > > > > > > To the extent that 11u has fixed policies :) we definitely have a policy of accepting patches to keep 11u working on current hardware. So yes. > > @lewurm That sounds like a green flag for you and jep-388 (with its R18_RESERVED functionality) ;) Thanks, @theRealAph, and @VladimirKempik . We are on it! 
------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From coleenp at openjdk.java.net Tue Mar 23 15:05:59 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 15:05:59 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v3] In-Reply-To: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: > Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. > Tested with vmTestbase/nsk/jvmti and tier1 (in progress). Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix some obvious single use variables. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3141/files - new: https://git.openjdk.java.net/jdk/pull/3141/files/1e3f00d4..36006162 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3141&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3141&range=01-02 Stats: 14 lines in 1 file changed: 0 ins; 7 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/3141.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3141/head:pull/3141 PR: https://git.openjdk.java.net/jdk/pull/3141 From shade at openjdk.java.net Tue Mar 23 15:11:50 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Tue, 23 Mar 2021 15:11:50 GMT Subject: RFR: 8264050: Remove unused field VM_HeapWalkOperation::_collecting_heap_roots Message-ID: SonarCloud reports field `_collecting_heap_roots` is not initialized after constructor ends. In fact, that field is not used anywhere. It was like that since the initial load. We can trivially remove it. 
------------- Commit messages: - 8264050: Remove unused field VM_HeapWalkOperation::_collecting_heap_roots Changes: https://git.openjdk.java.net/jdk/pull/3153/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3153&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264050 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3153/head:pull/3153 PR: https://git.openjdk.java.net/jdk/pull/3153 From coleenp at openjdk.java.net Tue Mar 23 15:19:41 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 15:19:41 GMT Subject: RFR: 8264050: Remove unused field VM_HeapWalkOperation::_collecting_heap_roots In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 15:06:02 GMT, Aleksey Shipilev wrote: > SonarCloud reports field `_collecting_heap_roots` is not initialized after constructor ends. In fact, that field is not used anywhere. It was like that since the initial load. We can trivially remove it. SCCS says it was used once. Thanks for removing it. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3153 From mbeckwit at openjdk.java.net Tue Mar 23 15:27:56 2021 From: mbeckwit at openjdk.java.net (Monica Beckwith) Date: Tue, 23 Mar 2021 15:27:56 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v29] In-Reply-To: References: Message-ID: <8y8WYVmVF_qrroxDo516Ucz1qWX5kMpPQHeqZgJNI2Q=.004fe86c-2b18-4a90-8499-1a34e64da3db@github.com> On Mon, 22 Mar 2021 12:50:14 GMT, Anton Kozlov wrote: >> Please review the implementation of JEP 391: macOS/AArch64 Port. >> >> It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. 
>> >> Major changes are in: >> * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) >> * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) >> * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. >> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) > > Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 115 commits: > > - Merge branch 'master' into jdk-macos > - JDK-8262491: bsd_aarch64 part > - JDK-8263002: bsd_aarch64 part > - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos > - Wider #ifdef block > - Fix most of issues in java/foreign/ tests > > Failures related to va_args are tracked in JDK-8263512. > - Add Azul copyright > - Update Oracle copyright years > - Use Thread::current_or_null_safe in SafeFetch > - 8262903: [macos_aarch64] Thread::current() called on detached thread > - ... and 105 more: https://git.openjdk.java.net/jdk/compare/a9d2267f...5add9269 Marked as reviewed by mbeckwit (Author). 
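For readers following the W^X part of the description quoted above, here is a rough sketch of a per-thread mode with a temporary switch around a generated stub. It is not the code from the PR: `WXScope`, `current_wx_mode`, `call_generated_stub` and `safe_fetch_like` are illustrative names, and a thread-local variable stands in for the real switch, which the description says is done with Apple's `pthread_jit_write_protect_np`.

```cpp
#include <cassert>

// Per-thread W^X mode. On macOS/AArch64 the actual switch would be made
// through pthread_jit_write_protect_np; this thread-local variable is only
// a stand-in so that the sketch can run on any platform.
enum WXMode { WXWrite, WXExec };

thread_local WXMode current_wx_mode = WXWrite;

// Scope guard: switches the current thread's mode on entry and restores
// the previous mode when the scope is left.
class WXScope {
  WXMode _saved;
 public:
  explicit WXScope(WXMode mode) : _saved(current_wx_mode) {
    current_wx_mode = mode;
  }
  ~WXScope() { current_wx_mode = _saved; }
  WXScope(const WXScope&) = delete;
  WXScope& operator=(const WXScope&) = delete;
};

// Hypothetical stand-in for a generated stub such as SafeFetch: it may
// only be called while the thread is in execute mode.
int call_generated_stub(int value) {
  assert(current_wx_mode == WXExec);
  return value;
}

// VM code normally runs in write mode and switches only around the stub.
int safe_fetch_like(int value) {
  WXScope scope(WXExec);  // temporary switch to execute mode
  return call_generated_stub(value);
  // the guard's destructor restores the previous mode on return
}
```

Because the mode is thread-local, a guard on one thread does not affect any other thread, which matches the statement in the description that the W^X mode is local to a thread.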
------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From hseigel at openjdk.java.net Tue Mar 23 15:38:42 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 23 Mar 2021 15:38:42 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v3] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 15:05:59 GMT, Coleen Phillimore wrote: >> Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. >> Tested with vmTestbase/nsk/jvmti and tier1 (in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix some obvious single use variables. Latest changes look good. Thanks, Harold ------------- Marked as reviewed by hseigel (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3141 From coleenp at openjdk.java.net Tue Mar 23 15:48:43 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 15:48:43 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v3] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 15:36:02 GMT, Harold Seigel wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix some obvious single use variables. > > Latest changes look good. > Thanks, Harold In your comments above, this makes sense to me: Thread* THREAD = current; // For exception processing This way we can keep the existing macros HAS_PENDING_EXCEPTION, etc. I'll make this change here. 
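A minimal sketch of the pattern mentioned in the message above, assuming heavily simplified stand-ins: the `Thread` struct and both macros below are NOT HotSpot's real definitions, and `load_something` and `do_work` are hypothetical. It only illustrates why a local variable named `THREAD` lets macros such as `HAS_PENDING_EXCEPTION` keep working in a function that no longer takes `TRAPS`.

```cpp
#include <cassert>

// Simplified stand-ins, not HotSpot's real definitions: a thread that can
// carry one pending exception, and macros that read a variable named THREAD.
struct Thread {
  const char* pending_exception = nullptr;
};

#define HAS_PENDING_EXCEPTION (THREAD->pending_exception != nullptr)
#define CLEAR_PENDING_EXCEPTION (THREAD->pending_exception = nullptr)

// Hypothetical callee that may leave an exception pending on the thread.
bool load_something(Thread* thread, bool fail) {
  if (fail) {
    thread->pending_exception = "java/lang/OutOfMemoryError";
    return false;
  }
  return true;
}

// The function takes an ordinary 'current' parameter instead of TRAPS,
// which signals that no exception escapes to the caller. The local alias
// named THREAD keeps the existing macros usable inside the body.
bool do_work(Thread* current, bool fail) {
  assert(current != nullptr);
  Thread* THREAD = current; // For exception processing
  bool ok = load_something(THREAD, fail);
  if (HAS_PENDING_EXCEPTION) {
    CLEAR_PENDING_EXCEPTION; // handled here, never propagated
    return false;
  }
  return ok;
}
```

The point of the design is readability at the call site: a caller that passes `current` can see from the signature that it does not need to check for a pending exception afterwards.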
------------- PR: https://git.openjdk.java.net/jdk/pull/3141 From coleenp at openjdk.java.net Tue Mar 23 16:08:59 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 16:08:59 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v4] In-Reply-To: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: > Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. > Tested with vmTestbase/nsk/jvmti and tier1 (in progress). Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix load_new_class_versions and remove more traps. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3141/files - new: https://git.openjdk.java.net/jdk/pull/3141/files/36006162..9134ce0b Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3141&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3141&range=02-03 Stats: 21 lines in 2 files changed: 4 ins; 2 del; 15 mod Patch: https://git.openjdk.java.net/jdk/pull/3141.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3141/head:pull/3141 PR: https://git.openjdk.java.net/jdk/pull/3141 From luhenry at openjdk.java.net Tue Mar 23 16:23:54 2021 From: luhenry at openjdk.java.net (Ludovic Henry) Date: Tue, 23 Mar 2021 16:23:54 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v29] In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 12:50:14 GMT, Anton Kozlov wrote: >> Please review the implementation of JEP 391: macOS/AArch64 Port. >> >> It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. 
>> >> Major changes are in: >> * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) >> * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) >> * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. >> * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) > > Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 115 commits: > > - Merge branch 'master' into jdk-macos > - JDK-8262491: bsd_aarch64 part > - JDK-8263002: bsd_aarch64 part > - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos > - Wider #ifdef block > - Fix most of issues in java/foreign/ tests > > Failures related to va_args are tracked in JDK-8263512. > - Add Azul copyright > - Update Oracle copyright years > - Use Thread::current_or_null_safe in SafeFetch > - 8262903: [macos_aarch64] Thread::current() called on detached thread > - ... and 105 more: https://git.openjdk.java.net/jdk/compare/a9d2267f...5add9269 Marked as reviewed by luhenry (Author). 
------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From dcubed at openjdk.java.net Tue Mar 23 16:35:43 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Tue, 23 Mar 2021 16:35:43 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v4] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 16:08:59 GMT, Coleen Phillimore wrote: >> Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. >> Tested with vmTestbase/nsk/jvmti and tier1 (in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix load_new_class_versions and remove more traps. Looks good. You should also do JDI test runs for this changeset and you should wait to hear from the Serviceability team before integration. ------------- Marked as reviewed by dcubed (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3141 From dcubed at openjdk.java.net Tue Mar 23 16:35:43 2021 From: dcubed at openjdk.java.net (Daniel D.Daugherty) Date: Tue, 23 Mar 2021 16:35:43 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v2] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 13:39:06 GMT, Coleen Phillimore wrote: >> I kind of thought these would make a long complicated expression and so the single use variables is helpful. I don't think performance is important here. > > {code} > - match = compare_entry_to(recur1, cp2, recur2); > - if (match) { > + if (compare_entry_to(recur1, cp2, recur2)) { > return true; > } > I could do this You can't return `true` after just `compare_entry_to()` because you still have to check `compare_operand_to()`. 
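The three shapes discussed in this sub-thread can be compared side by side with a small sketch. The functions below are hypothetical stand-ins for `compare_entry_to()` and `compare_operand_to()` (they just return a supplied result and count their calls); none of this is the code from constantPool.cpp.

```cpp
#include <cassert>

// Call counters make skipped evaluations visible.
struct Counters {
  int entry_calls = 0;
  int operand_calls = 0;
};

// Hypothetical stand-in for compare_entry_to().
bool entry_matches(Counters& c, bool result) {
  c.entry_calls++;
  return result;
}

// Hypothetical stand-in for compare_operand_to().
bool operand_matches(Counters& c, bool result) {
  c.operand_calls++;
  return result;
}

// Single-use variables: both checks always run.
bool match_with_variables(Counters& c, bool entry, bool operand) {
  bool match_entry = entry_matches(c, entry);
  bool match_operand = operand_matches(c, operand);
  return match_entry && match_operand;
}

// Combined expression: '&&' short-circuits, so the operand check is
// skipped whenever the entry check fails. The result is unchanged.
bool match_short_circuit(Counters& c, bool entry, bool operand) {
  return entry_matches(c, entry) && operand_matches(c, operand);
}

// Incorrect variant: returns true after the first check alone, so a
// mismatching operand is never noticed.
bool match_early_return(Counters& c, bool entry, bool operand) {
  if (entry_matches(c, entry)) {
    return true;
  }
  return operand_matches(c, operand);
}

// The two correct forms agree on all four input combinations.
void check_equivalence() {
  for (int bits = 0; bits < 4; bits++) {
    Counters a, b;
    bool entry = (bits & 1) != 0;
    bool operand = (bits & 2) != 0;
    assert(match_with_variables(a, entry, operand) ==
           match_short_circuit(b, entry, operand));
  }
}
```

The first two forms differ only in how many calls are made, which is the performance point raised earlier; the third changes the result when the entry matches and the operand does not, which is the objection raised here.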
------------- PR: https://git.openjdk.java.net/jdk/pull/3141 From aph at openjdk.java.net Tue Mar 23 16:36:55 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 23 Mar 2021 16:36:55 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v29] In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 16:20:47 GMT, Ludovic Henry wrote: >> Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 115 commits: >> >> - Merge branch 'master' into jdk-macos >> - JDK-8262491: bsd_aarch64 part >> - JDK-8263002: bsd_aarch64 part >> - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos >> - Wider #ifdef block >> - Fix most of issues in java/foreign/ tests >> >> Failures related to va_args are tracked in JDK-8263512. >> - Add Azul copyright >> - Update Oracle copyright years >> - Use Thread::current_or_null_safe in SafeFetch >> - 8262903: [macos_aarch64] Thread::current() called on detached thread >> - ... and 105 more: https://git.openjdk.java.net/jdk/compare/a9d2267f...5add9269 > > Marked as reviewed by luhenry (Author). > > > > [ Back-porting this patch to JDK 11] depends on the will of openjdk11 maintainers to accept this (and few other, like jep-388, as we depend on it) contribution. > > > > > > > > > To the extent that 11u has fixed policies :) we definitely have a policy of accepting patches to keep 11u working on current hardware. So yes. > > > > > > @lewurm That sounds like a green flag for you and jep-388 (with its R18_RESERVED functionality) ;) > > Thanks, @theRealAph, and @VladimirKempik . We are on it! It's going to be tricky to do in a really clean way, given some of the weirdnesses of the ABI. 
However, I think there's probably a need for it ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From coleenp at openjdk.java.net Tue Mar 23 16:45:52 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 16:45:52 GMT Subject: RFR: 8264051: Remove unused TRAPS parameters from runtime functions Message-ID: This change removes the TRAPS parameter from compute_modifier_flags(), lookup_instance_method_in_klasses and nest_host_error. There's a progressive effort to remove cases where the last parameter of a function is THREAD, and it's unclear why it is ignoring an exception or whether an exception is expected, if it doesn't subsequently have a check for HAS_PENDING_EXCEPTION. Tested locally with tier1 tests and tier1 tests on 4 Oracle platforms in progress. ------------- Commit messages: - removed more TRAPS - 8264051: Remove unused TRAPS parameters from runtime functions Changes: https://git.openjdk.java.net/jdk/pull/3157/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3157&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264051 Stats: 52 lines in 16 files changed: 0 ins; 17 del; 35 mod Patch: https://git.openjdk.java.net/jdk/pull/3157.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3157/head:pull/3157 PR: https://git.openjdk.java.net/jdk/pull/3157 From tschatzl at openjdk.java.net Tue Mar 23 17:02:39 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Tue, 23 Mar 2021 17:02:39 GMT Subject: RFR: 8264050: Remove unused field VM_HeapWalkOperation::_collecting_heap_roots In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 15:06:02 GMT, Aleksey Shipilev wrote: > SonarCloud reports field `_collecting_heap_roots` is not initialized after constructor ends. In fact, that field is not used anywhere. It was like that since the initial load. We can trivially remove it. Lgtm and trivial. ------------- Marked as reviewed by tschatzl (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3153 From iwalulya at openjdk.java.net Tue Mar 23 17:32:42 2021 From: iwalulya at openjdk.java.net (Ivan Walulya) Date: Tue, 23 Mar 2021 17:32:42 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs In-Reply-To: References: Message-ID: On Thu, 18 Mar 2021 14:00:00 GMT, Stefan Johansson wrote: > Please review this refactoring of the hugetlbfs reservation code. > > **Summary** > In recent adventures in this area of the code I noticed a strange condition in `reserve_memory_special_huge_tlbfs` where we take the "mixed-mapping" route even if the size doesn't require any small pages to be used: > if (is_aligned(bytes, os::large_page_size()) && alignment <= os::large_page_size()) { > return reserve_memory_special_huge_tlbfs_only(bytes, req_addr, exec); > } else { > return reserve_memory_special_huge_tlbfs_mixed(bytes, alignment, req_addr, exec); > } > > The second condition here is needed because if the alignment is larger than the large page size, we needed to enforce this and can't just trust `mmap` to give us a properly aligned address. Doing this by using the mixed-function feels a bit weird and looking a bit more at this I found a way to refactor this function to avoid having the two helpers. > > Instead of only having the mixed path honor the passed down alignment, make sure that is always done. This will also have the side-effect that all large pages in a "mixed"-mapping will be at the start and then we will have a tail of small pages. This actually also ensures that we will use large pages for a mixed mapping, in the past there was a corner case where we could end up with just a head and tail of small pages and no large page in between (if the mapping was smaller than 2 large pages and there was no alignment constraint). > > **Testing** > Mach5 tier1-3 and a lot of local testing with different large page configurations. Changes requested by iwalulya (Committer). 
src/hotspot/os/linux/os_linux.cpp line 3929: > 3927: } > 3928: > 3929: char* os::Linux::reserve_and_commit_special(size_t bytes, method name `reserve_and_commit_` implicitly suggests that other methods with just `reserve_memory_` do not commit (to me). src/hotspot/os/linux/os_linux.cpp line 3981: > 3979: // and the given alignment. The larger of the two will be used. > 3980: size_t required_alignment = MAX(os::large_page_size(), alignment); > 3981: char* const aligned_start = anon_mmap_aligned(req_addr, bytes, required_alignment); Do we need to add back the comment "// First reserve - but not commit"? src/hotspot/os/linux/os_linux.cpp line 3990: > 3988: char* large_mapping = reserve_and_commit_special(large_bytes, os::large_page_size(), aligned_start, exec); > 3989: > 3990: if (bytes == large_bytes) { Shouldn't we do the check `if (large_mapping == NULL) {` before this? src/hotspot/os/linux/os_linux.cpp line 3998: > 3996: char* small_start = aligned_start + large_bytes; > 3997: size_t small_size = bytes - large_bytes; > 3998: if (large_mapping == NULL) { See comment above ------------- PR: https://git.openjdk.java.net/jdk/pull/3073 From iklam at openjdk.java.net Tue Mar 23 17:33:43 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 23 Mar 2021 17:33:43 GMT Subject: RFR: 8264051: Remove unused TRAPS parameters from runtime functions In-Reply-To: References: Message-ID: <9uxwV28iiTNXulJXZMM5HTpRa0DJ8cske7FRh_cyTvw=.1f44c247-929a-46af-be32-8f0c462deca4@github.com> On Tue, 23 Mar 2021 16:40:44 GMT, Coleen Phillimore wrote: > This change removes the TRAPS parameter from compute_modifier_flags(), lookup_instance_method_in_klasses and nest_host_error. > > There's a progressive effort to remove cases where the last parameter of a function is THREAD, and it's unclear why it is ignoring an exception or whether an exception is expected, if it doesn't subsequently have a check for HAS_PENDING_EXCEPTION. 
> > Tested locally with tier1 tests and tier1 tests on 4 Oracle platforms in progress. LGTM. Thanks for doing the cleanup. ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3157 From xliu at openjdk.java.net Tue Mar 23 18:24:39 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Tue, 23 Mar 2021 18:24:39 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 22:12:14 GMT, Xin Liu wrote: > This patch provides a buffer to store asynchronous messages and flush them to > underlying files periodically. Hi, Reviewers, I would like to restart the RFR process for the async logging feature. We (AWS) have deployed this feature for over a year in a few critical services. It helps us to reduce long-tail GC pauses. On Linux, we used to experience intermittent second-level delays due to GC log writes. If those undesirable events happen to appear at safepoints, hotspot has to prolong the pause intervals, which then increases the response time of the Java application/service. Originally, we observed and solved this issue on a Linux system with software RAID. In the absence of hardware assistance, multiple writes have to be synchronized and it is that operation that yields long pauses. This issue may become more prevalent if Linux servers adopt ZFS in the future. We don't think redirecting log files to tmpfs is a final solution. Hotspot should provide a self-contained and cross-platform solution. **Our solution is to provide a buffer and flush it in a standalone thread periodically.** Since then, we found more unexpected but interesting scenarios. For example, some cloud-based applications run entirely on an AWS EBS partition. The `write` syscall could be a blocking operation if the underlying infrastructure is experiencing an intermittent issue. Even stdout/stderr are not absolutely non-blocking. Someone may send `XOFF` (software flow control) and pause reads from the tty.
As a result, the JVM process which is emitting logs to the tty is then blocked. Yes, that action may freeze the Java service accidentally! Those pain points are not AWS-exclusive. We found relevant questions on Stack Overflow[1] and it seems that J9 provides an option `-Xgc:bufferedLogging` to mitigate it[2]. We hope hotspot would consider our feature. Back to the implementation, this is the 2nd revision based on Unified logging. The previous RFR[3] was a top-down design. We provided a parallel header file `asynclog.hpp` and hoped log sites would opt in. That design was very stiff because asynclog.hpp is full of template parameters, and it was turned down[4]. The new patch has dropped the old design and achieves asynchronous logging in a bottom-up way. We provide an output option which conforms to JEP-158[5]. Developers can choose asynchronous mode for a file-based output by providing an extra option **async=true**, e.g. `-Xlog:gc=debug:file=gc.log::async=true` May we know more about LogMessageBuffer.hpp/cpp? We haven't found a real use of it. That's why we are hesitating to support LogFileOutput::write(LogMessageBuffer::Iterator msg_iterator). Further, we haven't supported async_mode for LogStdoutOutput and LogStderrOutput either. It's not difficult but needs a big code change.
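For readers who want a concrete picture of the buffering idea described above, here is a minimal sketch. It is not the code from the patch: `ToyAsyncLog`, its bounded capacity, and the drop-when-full policy are all assumptions made for illustration. It only shows the buffering half of the scheme, in which log sites append to an in-memory queue under a short lock and a separate flusher drains the queue and does the potentially blocking write.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <sstream>
#include <string>

// Hypothetical toy class, not part of HotSpot or of the patch under review.
class ToyAsyncLog {
  std::mutex _lock;
  std::deque<std::string> _buffer;
  std::size_t _capacity;
  std::size_t _dropped = 0;

 public:
  explicit ToyAsyncLog(std::size_t capacity) : _capacity(capacity) {}

  // Called by log sites. Holds the lock only long enough to append,
  // and never touches the output stream.
  void enqueue(const std::string& msg) {
    std::lock_guard<std::mutex> guard(_lock);
    if (_buffer.size() >= _capacity) {
      _dropped++;  // assumed policy: drop the message when the buffer is full
      return;
    }
    _buffer.push_back(msg);
  }

  // Intended to be called periodically by a dedicated flusher thread.
  // The buffer is swapped out under the lock; the slow write happens outside it.
  std::size_t flush(std::ostream& out) {
    std::deque<std::string> pending;
    {
      std::lock_guard<std::mutex> guard(_lock);
      pending.swap(_buffer);
    }
    for (const std::string& msg : pending) {
      out << msg << '\n';
    }
    return pending.size();
  }

  std::size_t dropped() {
    std::lock_guard<std::mutex> guard(_lock);
    return _dropped;
  }
};
```

In a real setup a standalone thread would call `flush` on a timer, so a stalled `write` delays only that thread and not the threads that log at a safepoint.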
[1] https://stackoverflow.com/questions/27072340/is-gc-log-writing-asynchronous-safe-to-put-gc-log-on-nfs-mount [2] https://stackoverflow.com/questions/54994943/is-openj9-gc-log-asynchronous [3] https://cr.openjdk.java.net/~xliu/8229517/01/webrev/ [4] https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-March/041034.html [5] https://openjdk.java.net/jeps/158 ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From kvn at openjdk.java.net Tue Mar 23 19:11:41 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 23 Mar 2021 19:11:41 GMT Subject: RFR: 8254050: HotSpot Style Guide should permit using the "override" virtual specifier In-Reply-To: References: Message-ID: On Tue, 16 Mar 2021 03:44:56 GMT, Kim Barrett wrote: > Please review and vote on this change to the HotSpot Style Guide to permit > the use of `override` virtual specifiers. The virtual specifiers `override` > and `final` were added in C++11, and use of `final` is already permitted in > HotSpot code. > > Using the `override` specifier provides error checking that the function is > indeed overriding a virtual function declared in a base class. This can > prevent some often surprisingly difficult to spot bugs. > > This is a modification of the Style Guide, so rough consensus among > the HotSpot Group members is required to make this change. Only Group > members should vote for approval (via the github PR), though reasoned > objections or comments from anyone will be considered. A decision on > this proposal will not be made before Tuesday 30-Mar-2021 at 12h00 UTC. > > Since we're piggybacking on github PRs here, please use the PR review > process to approve (click on Review Changes > Approve), rather than > sending a "vote: yes" email reply that would be normal for a CFV. > Other responses can still use email of course. Marked as reviewed by kvn (Reviewer). 
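As a rough illustration of the error checking that `override` provides, consider the sketch below. The class names (`Closure`, `CountingClosure`, `ForgetfulClosure`) and the `describe` helper are made up for this example and do not come from HotSpot.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class with one virtual function.
struct Closure {
  virtual ~Closure() = default;
  virtual std::string name() const { return "Closure"; }
};

struct CountingClosure : Closure {
  // `override` asks the compiler to verify that a matching virtual function
  // exists in a base class. Dropping `const` here would be a compile error.
  std::string name() const override { return "CountingClosure"; }
};

struct ForgetfulClosure : Closure {
  // No `override` and no `const`: this compiles, but it declares a new,
  // unrelated function that hides Closure::name() instead of overriding it.
  std::string name() { return "ForgetfulClosure"; }
};

// Calls name() through a reference to the base class (virtual dispatch).
inline std::string describe(const Closure& c) { return c.name(); }
```

Calling `describe` on a `ForgetfulClosure` yields the base-class result, which is the kind of hard-to-spot bug the proposal refers to. Adding `override` to that declaration would turn the mistake into a compile-time error.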
------------- PR: https://git.openjdk.java.net/jdk/pull/3021 From coleenp at openjdk.java.net Tue Mar 23 19:12:42 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 19:12:42 GMT Subject: RFR: 8263721: Unify oop casting [v2] In-Reply-To: <9WBFjpLGVwGtmj6k1PosqnNsp5qbQEObjtS6A_B5Apg=.e29ef71a-8355-437a-a868-39ea123733b0@github.com> References: <9WBFjpLGVwGtmj6k1PosqnNsp5qbQEObjtS6A_B5Apg=.e29ef71a-8355-437a-a868-39ea123733b0@github.com> Message-ID: On Tue, 23 Mar 2021 10:18:06 GMT, Stefan Karlsson wrote: >> In fastdebug builds we replace the "oopDesc* to oop" typedef with a wrapper class that holds an oopDesc*. This wrapper class allows any kind of pointer to be implicitly converted to an oop. So you can write code like this: >> >> Metadata* m = (Metadata*)0x123; >> oop o = m; >> >> and the compiler will unfortunately accept it. Fortunately, this will be caught in release builds, because you can't convert a Method* into an oopDesc*. >> >> One interesting thing is that you can't convert values of integral type to oops: >> >> uintptr_t m = uintptr_t(123); >> oop o = m; >> >> This fails in both fastdebug and release builds. To be able to convert integral values to oops, there are two helper functions: >> >> // For CHECK_UNHANDLED_OOPS, it is ambiguous C++ behavior to have the oop >> // structure contain explicit user defined conversions of both numerical >> // and pointer type. Define inline methods to provide the numerical conversions. >> template <class T> inline oop cast_to_oop(T value) { >> return (oop)(CHECK_UNHANDLED_OOPS_ONLY((void *))(value)); >> } >> template <class T> inline T cast_from_oop(oop o) { >> return (T)(CHECK_UNHANDLED_OOPS_ONLY((oopDesc*))o); >> } >> >> So, the above example would have to be written as: >> >> uintptr_t m = uintptr_t(123); >> oop o = cast_to_oop(m); >> >> My proposal is that we stop allowing implicit (and explicit) casts from void*, and instead use cast_to_oop whenever we want to cast to oops.
We would still allow oopDesc* to be implicitly converted to oop. This would also allow NULL to be converted to oop without casting: >> >> oop o = NULL; >> >> This will make the code to convert oops a little bit longer. It could be argued that that's a good thing, because everyone should be cautious about converting things into oops. This will also give us one entry-point where we could add (probably temporary) verification code. >> >> An alternative to the suggestion above, could be to completely get rid of cast_to_oop and cast_from_oop. But for that to work we need to stop using NULL, which is an integral 0, and start to use nullptr for oops. I've prototyped this as well, but initial investigations showed that some tended to prefer having the cast_to_oop function. (We could still move from NULL to nullptr, if we think that is a good idea). > > Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Remove casts in merged changes > - Merge remote-tracking branch 'origin/master' into 8263721_unify_oop_casting > - Convert tab to spaces > - Merge remote-tracking branch 'origin/master' into 8263721_unify_oop_casting > - Stricter oop casts Looks safe to me. I didn't realize there were still so many files with oop casts. I only scrolled by the GC ones though. ------------- Marked as reviewed by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3047 From coleenp at openjdk.java.net Tue Mar 23 19:18:40 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 23 Mar 2021 19:18:40 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v4] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 16:33:15 GMT, Daniel D.
Daugherty wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix load_new_class_versions and remove more traps. > > Looks good. > > You should also do JDI test runs for this changeset and you should wait > to hear from the Serviceability team before integration. I ran the jvmti and jdi tests, as well as the serviceability/jvmti/RedefineClasses tests, and jdk/java/lang/instrument tests. ------------- PR: https://git.openjdk.java.net/jdk/pull/3141 From sjohanss at openjdk.java.net Tue Mar 23 20:29:45 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Tue, 23 Mar 2021 20:29:45 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs In-Reply-To: References: Message-ID: On Thu, 18 Mar 2021 14:00:00 GMT, Stefan Johansson wrote: > Please review this refactoring of the hugetlbfs reservation code. > > **Summary** > In recent adventures in this area of the code I noticed a strange condition in `reserve_memory_special_huge_tlbfs` where we take the "mixed-mapping" route even if the size doesn't require any small pages to be used: > if (is_aligned(bytes, os::large_page_size()) && alignment <= os::large_page_size()) { > return reserve_memory_special_huge_tlbfs_only(bytes, req_addr, exec); > } else { > return reserve_memory_special_huge_tlbfs_mixed(bytes, alignment, req_addr, exec); > } > > The second condition here is needed because if the alignment is larger than the large page size, we needed to enforce this and can't just trust `mmap` to give us a properly aligned address. Doing this by using the mixed-function feels a bit weird and looking a bit more at this I found a way to refactor this function to avoid having the two helpers. > > Instead of only having the mixed path honor the passed down alignment, make sure that is always done. 
This will also have the side-effect that all large pages in a "mixed"-mapping will be at the start and then we will have a tail of small pages. This actually also ensures that we will use large pages for a mixed mapping, in the past there was a corner case where we could end up with just a head and tail of small pages and no large page in between (if the mapping was smaller than 2 large pages and there was no alignment constraint). > > **Testing** > Mach5 tier1-3 and a lot of local testing with different large page configurations. Thanks for reviewing Ivan, I will wait to update the PR until we come up with a function name that is better than the current. ------------- PR: https://git.openjdk.java.net/jdk/pull/3073 From sjohanss at openjdk.java.net Tue Mar 23 20:29:47 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Tue, 23 Mar 2021 20:29:47 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 16:58:19 GMT, Ivan Walulya wrote: >> Please review this refactoring of the hugetlbfs reservation code. >> >> **Summary** >> In recent adventures in this area of the code I noticed a strange condition in `reserve_memory_special_huge_tlbfs` where we take the "mixed-mapping" route even if the size doesn't require any small pages to be used: >> if (is_aligned(bytes, os::large_page_size()) && alignment <= os::large_page_size()) { >> return reserve_memory_special_huge_tlbfs_only(bytes, req_addr, exec); >> } else { >> return reserve_memory_special_huge_tlbfs_mixed(bytes, alignment, req_addr, exec); >> } >> >> The second condition here is needed because if the alignment is larger than the large page size, we needed to enforce this and can't just trust `mmap` to give us a properly aligned address. Doing this by using the mixed-function feels a bit weird and looking a bit more at this I found a way to refactor this function to avoid having the two helpers. 
>> >> Instead of only having the mixed path honor the passed down alignment, make sure that is always done. This will also have the side-effect that all large pages in a "mixed"-mapping will be at the start and then we will have a tail of small pages. This actually also ensures that we will use large pages for a mixed mapping, in the past there was a corner case where we could end up with just a head and tail of small pages and no large page in between (if the mapping was smaller than 2 large pages and there was no alignment constraint). >> >> **Testing** >> Mach5 tier1-3 and a lot of local testing with different large page configurations. > > src/hotspot/os/linux/os_linux.cpp line 3929: > >> 3927: } >> 3928: >> 3929: char* os::Linux::reserve_and_commit_special(size_t bytes, > > method name `reserve_and_commit_` implicitly suggests that other methods with just `reserve_memory_` do not commit (to me). That is a good point, I've struggled a bit with this function name. Since we actually do the reservation using the call to `anon_mmap_aligned()` maybe this one should just be called: `commit_memory_special()` > src/hotspot/os/linux/os_linux.cpp line 3981: > >> 3979: // and the given alignment. The larger of the two will be used. >> 3980: size_t required_alignment = MAX(os::large_page_size(), alignment); >> 3981: char* const aligned_start = anon_mmap_aligned(req_addr, bytes, required_alignment); > > Do we need to add back the comment "// First reserve - but not commit"? My hope was that the new comment would be sufficient, but if you think it is needed I could add that the reserved range is not committed here. > src/hotspot/os/linux/os_linux.cpp line 3990: > >> 3988: char* large_mapping = reserve_and_commit_special(large_bytes, os::large_page_size(), aligned_start, exec); >> 3989: >> 3990: if (bytes == large_bytes) { > > Shouldn't we do the check `if (large_mapping == NULL) {` before this? We could, and I thought about adding a comment here. I probably should have. 
The reason we don't do the explicit check is that if `large_mapping == NULL` then NULL will be returned as expected and there is no additional work to be done if `bytes == large_bytes`. > src/hotspot/os/linux/os_linux.cpp line 3998: > >> 3996: char* small_start = aligned_start + large_bytes; >> 3997: size_t small_size = bytes - large_bytes; >> 3998: if (large_mapping == NULL) { > > See comment above I do the check here because I need the `small_start` and `small_size` values if `large_mapping == NULL` and there was additional memory reserved for small pages. ------------- PR: https://git.openjdk.java.net/jdk/pull/3073 From iklam at openjdk.java.net Tue Mar 23 21:15:44 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 23 Mar 2021 21:15:44 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v4] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 16:08:59 GMT, Coleen Phillimore wrote: >> Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. >> Tested with vmTestbase/nsk/jvmti and tier1 (in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix load_new_class_versions and remove more traps. `Thread* THREAD = current; // for exception processing` is used only when the current method does not declare TRAPS, which means it should never throw. As a convention, I think the above code should be accompanied by `ExceptionMark em(THREAD);` to ensure that the exceptions are not unintentionally leaked.
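To make the intent of that convention concrete, here is a simplified, self-contained model of it. `ToyThread` and `ToyExceptionMark` are toy stand-ins invented for this sketch and are NOT the HotSpot classes; in particular, the toy mark only counts violations, whereas a real guard would be expected to stop the VM. The helper functions are likewise hypothetical.

```cpp
#include <cassert>

// Toy stand-in for a thread that can carry a pending exception.
struct ToyThread {
  bool pending_exception = false;
  int  leaked_exceptions = 0;  // how many times a mark observed a leak
};

// Toy scope guard: no exception may be pending on entry or on exit.
class ToyExceptionMark {
  ToyThread* _thread;

 public:
  explicit ToyExceptionMark(ToyThread* t) : _thread(t) {
    if (_thread->pending_exception) _thread->leaked_exceptions++;
  }
  ~ToyExceptionMark() {
    if (_thread->pending_exception) _thread->leaked_exceptions++;
  }
};

// A callee that may "throw" by setting the pending flag.
inline void may_throw(ToyThread* t, bool fail) {
  if (fail) t->pending_exception = true;
}

// A caller without TRAPS that handles (clears) the exception itself.
inline bool call_and_clear(ToyThread* current, bool fail) {
  ToyThread* THREAD = current;  // for exception processing
  ToyExceptionMark em(THREAD);
  may_throw(THREAD, fail);
  if (THREAD->pending_exception) {
    THREAD->pending_exception = false;  // the exception is consumed here
    return false;
  }
  return true;
}

// A caller that forgets to clear it: the mark notices on scope exit.
inline void call_and_forget(ToyThread* current, bool fail) {
  ToyThread* THREAD = current;  // for exception processing
  ToyExceptionMark em(THREAD);
  may_throw(THREAD, fail);
}
```

The point of the pattern is that a function which does not declare TRAPS documents, and checks, that nothing it calls can leave an exception behind for its own callers.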
src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 1389: > 1387: state->set_class_being_redefined(the_class, _class_load_kind); > 1388: > 1389: Thread* THREAD = current; // for exception processing Add `ExceptionMark em(THREAD);` src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 2118: > 2116: methodHandle method(current, methods->at(i)); > 2117: methodHandle new_method; > 2118: Thread* THREAD = current; // For exception handling Add `ExceptionMark em(THREAD);` ------------- Changes requested by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3141 From iklam at openjdk.java.net Tue Mar 23 21:18:40 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 23 Mar 2021 21:18:40 GMT Subject: RFR: 8263992: Remove dead code NativeLookup::base_library_lookup [v2] In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 05:31:51 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> removed unused "in_base_library" parameter; removed unnecessary include of nativeLookup.hpp > > Looks even better! Thanks. > > Of course now I'm going to have to go and find out why in_base_library was introduced :) > > Cheers, > David Thanks David for the review. Builds from tiers 1-5 passed. ------------- PR: https://git.openjdk.java.net/jdk/pull/3139 From iklam at openjdk.java.net Tue Mar 23 21:18:41 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Tue, 23 Mar 2021 21:18:41 GMT Subject: Integrated: 8263992: Remove dead code NativeLookup::base_library_lookup In-Reply-To: References: Message-ID: <80tgusi8W7TpnpOhD4uIJflNXBdBAT6BIIjAtMRwakQ=.35d06deb-59a5-4495-a94a-311bcc73dba6@github.com> On Mon, 22 Mar 2021 23:26:01 GMT, Ioi Lam wrote: > Please review this removal of dead code, and unused parameter of `in_base_library` for the functions that remain. The word `base_library_lookup` does not exist in any C source code in the entire JDK. This pull request has now been integrated. 
Changeset: 35102cb0 Author: Ioi Lam URL: https://git.openjdk.java.net/jdk/commit/35102cb0 Stats: 57 lines in 12 files changed: 0 ins; 35 del; 22 mod 8263992: Remove dead code NativeLookup::base_library_lookup Reviewed-by: coleenp, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/3139 From iwalulya at openjdk.java.net Tue Mar 23 23:04:42 2021 From: iwalulya at openjdk.java.net (Ivan Walulya) Date: Tue, 23 Mar 2021 23:04:42 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 20:23:23 GMT, Stefan Johansson wrote: >> src/hotspot/os/linux/os_linux.cpp line 3990: >> >>> 3988: char* large_mapping = reserve_and_commit_special(large_bytes, os::large_page_size(), aligned_start, exec); >>> 3989: >>> 3990: if (bytes == large_bytes) { >> >> Shouldn't we do the check `if (large_mapping == NULL) {` before this? > > We could, and I thought about adding a comment here. I probably should have. The reason we don't do the explicit check it is that if `large_mapping == NULL` then NULL will be returned as expected and there is no additional work to be done if `bytes == large_bytes`. When `bytes == large_bytes` don't we need to `unmap` the original reservation in case `reserve_and_commit_special` fails i.e `large_mapping == NULL`? Below we have the comment // Large mapping failed, so we need to unmap the reminder // of the orinal reservation. Maybe I am missing something. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3073 From iwalulya at openjdk.java.net Tue Mar 23 23:12:39 2021 From: iwalulya at openjdk.java.net (Ivan Walulya) Date: Tue, 23 Mar 2021 23:12:39 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 20:20:15 GMT, Stefan Johansson wrote: >> src/hotspot/os/linux/os_linux.cpp line 3929: >> >>> 3927: } >>> 3928: >>> 3929: char* os::Linux::reserve_and_commit_special(size_t bytes, >> >> method name `reserve_and_commit_` implicitly suggests that other methods with just `reserve_memory_` do not commit (to me). > > That is a good point, I've struggled a bit with this function name. Since we actually do the reservation using the call to `anon_mmap_aligned()` maybe this one should just be called: `commit_memory_special()` yes, seems better to me >> src/hotspot/os/linux/os_linux.cpp line 3990: >> >>> 3988: char* large_mapping = reserve_and_commit_special(large_bytes, os::large_page_size(), aligned_start, exec); >>> 3989: >>> 3990: if (bytes == large_bytes) { >> >> Shouldn't we do the check `if (large_mapping == NULL) {` before this? > > We could, and I thought about adding a comment here. I probably should have. The reason we don't do the explicit check it is that if `large_mapping == NULL` then NULL will be returned as expected and there is no additional work to be done if `bytes == large_bytes`. ok, now I get it. I had missed that when mmap fails, the original reservation is unmapped for this range so no need to do any additional clean up. 
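The layout discussed in this thread (large pages at the start of the range, followed by a tail of small pages) can be sketched as plain arithmetic. This is an illustration only: `MixedLayout`, `align_down_sz` and `plan_mixed_mapping` are hypothetical names, and the way `large_bytes` is derived (rounding `bytes` down to a multiple of the large page size) is an assumption, since the quoted diff does not show that line.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical result type describing how a reservation would be split.
struct MixedLayout {
  std::size_t required_alignment;  // larger of large page size and requested alignment
  std::size_t large_bytes;         // head of the range, backed by large pages
  std::size_t small_bytes;         // tail of the range, backed by small pages
};

// Round value down to a multiple of alignment (alignment must be non-zero).
inline std::size_t align_down_sz(std::size_t value, std::size_t alignment) {
  return value - (value % alignment);
}

inline MixedLayout plan_mixed_mapping(std::size_t bytes,
                                      std::size_t alignment,
                                      std::size_t large_page_size) {
  MixedLayout layout;
  layout.required_alignment = std::max(large_page_size, alignment);
  layout.large_bytes = align_down_sz(bytes, large_page_size);  // assumed derivation
  layout.small_bytes = bytes - layout.large_bytes;
  return layout;
}
```

With a 2M large page size, a 5M request would be planned as 4M of large pages plus a 1M small-page tail, while a 4M request has no tail at all, which corresponds to the `bytes == large_bytes` early return discussed above.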
------------- PR: https://git.openjdk.java.net/jdk/pull/3073 From dholmes at openjdk.java.net Tue Mar 23 23:28:40 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 23 Mar 2021 23:28:40 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v4] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 16:08:59 GMT, Coleen Phillimore wrote: >> Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. >> Tested with vmTestbase/nsk/jvmti and tier1 (in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix load_new_class_versions and remove more traps. Updates look good - thanks. I agree with Ioi about adding ExceptionMark as part of this usage pattern - it captures the intent that no exceptions are allowed to escape. Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3141 From manc at openjdk.java.net Wed Mar 24 01:16:40 2021 From: manc at openjdk.java.net (Man Cao) Date: Wed, 24 Mar 2021 01:16:40 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v2] In-Reply-To: References: Message-ID: On Sun, 14 Mar 2021 02:50:37 GMT, Kim Barrett wrote: >> Man Cao has updated the pull request incrementally with one additional commit since the last revision: >> >> Address comment and add a gtest. > > Tests? There should be some gtests to go with this. Could anyone continue reviewing the updated pull request? 
------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From coleenp at openjdk.java.net Wed Mar 24 02:10:53 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 02:10:53 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v5] In-Reply-To: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: > Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. > Tested with vmTestbase/nsk/jvmti and tier1 (in progress). Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add ExceptionMarks. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3141/files - new: https://git.openjdk.java.net/jdk/pull/3141/files/9134ce0b..2080b946 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3141&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3141&range=03-04 Stats: 6 lines in 1 file changed: 3 ins; 1 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3141.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3141/head:pull/3141 PR: https://git.openjdk.java.net/jdk/pull/3141 From dholmes at openjdk.java.net Wed Mar 24 02:25:41 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 24 Mar 2021 02:25:41 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v5] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Wed, 24 Mar 2021 02:10:53 GMT, Coleen Phillimore wrote: >> Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. >> Tested with vmTestbase/nsk/jvmti and tier1 (in progress). 
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add ExceptionMarks. Looks good! Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3141 From dholmes at openjdk.java.net Wed Mar 24 02:57:42 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 24 Mar 2021 02:57:42 GMT Subject: RFR: 8264051: Remove unused TRAPS parameters from runtime functions In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 16:40:44 GMT, Coleen Phillimore wrote: > This change removes the TRAPS parameter from compute_modifier_flags(), lookup_instance_method_in_klasses and nest_host_error. > > There's a progressive effort to remove cases where the last parameter of a function is THREAD, and it's unclear why it is ignoring an exception or whether an exception is expected, if it doesn't subsequently have a check for HAS_PENDING_EXCEPTION. > > Tested locally with tier1 tests and tier1 tests on 4 Oracle platforms in progress. Nice cleanup Coleen! I have a further suggested change to simplify things further - see below. Thanks, David src/hotspot/share/oops/instanceKlass.hpp line 468: > 466: void set_nest_host_index(u2 i) { _nest_host_index = i; } > 467: // dynamic nest member support > 468: void set_nest_host(InstanceKlass* host, JavaThread* current); I think we should drop the thread parameter for this method and nest_host_error and avoid the tramp data. We only need the current thread for a ResourceMark in a logging clause in set_nest_host, and for a ConstantPoolHandle in nest_host_error. Neither of these codepaths are hot so we can just manifest Thread::current() when needed there. Then you can also remove the thread parameter from print_nest_host_error_on. ------------- Changes requested by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3157 From coleenp at openjdk.java.net Wed Mar 24 03:10:39 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 03:10:39 GMT Subject: RFR: 8264051: Remove unused TRAPS parameters from runtime functions In-Reply-To: References: Message-ID: <_Kcj8x3UIVkABjqGI3izel_-s2Xsak7UsGQiXQ0Xtms=.ecc09e06-2f9b-44b6-9596-df3319ccfd26@github.com> On Wed, 24 Mar 2021 02:45:29 GMT, David Holmes wrote: >> This change removes the TRAPS parameter from compute_modifier_flags(), lookup_instance_method_in_klasses and nest_host_error. >> >> There's a progressive effort to remove cases where the last parameter of a function is THREAD, and it's unclear why it is ignoring an exception or whether an exception is expected, if it doesn't subsequently have a check for HAS_PENDING_EXCEPTION. >> >> Tested locally with tier1 tests and tier1 tests on 4 Oracle platforms in progress. > > src/hotspot/share/oops/instanceKlass.hpp line 468: > >> 466: void set_nest_host_index(u2 i) { _nest_host_index = i; } >> 467: // dynamic nest member support >> 468: void set_nest_host(InstanceKlass* host, JavaThread* current); > > I think we should drop the thread parameter for this method and nest_host_error and avoid the tramp data. We only need the current thread for a ResourceMark in a logging clause in set_nest_host, and for a ConstantPoolHandle in nest_host_error. Neither of these codepaths are hot so we can just manifest Thread::current() when needed there. > Then you can also remove the thread parameter from print_nest_host_error_on. You're right. I can remove a lot of thread arguments this way. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3157 From coleenp at openjdk.java.net Wed Mar 24 03:20:41 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 03:20:41 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v4] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 21:07:02 GMT, Ioi Lam wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix load_new_class_versions and remove more traps. > > src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 1389: > >> 1387: state->set_class_being_redefined(the_class, _class_load_kind); >> 1388: >> 1389: Thread* THREAD = current; // for exception processing > > Add `ExceptionMark em(THREAD);` Done. ------------- PR: https://git.openjdk.java.net/jdk/pull/3141 From coleenp at openjdk.java.net Wed Mar 24 03:22:05 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 03:22:05 GMT Subject: RFR: 8264051: Remove unused TRAPS parameters from runtime functions [v2] In-Reply-To: References: Message-ID: > This change removes the TRAPS parameter from compute_modifier_flags(), lookup_instance_method_in_klasses and nest_host_error. > > There's a progressive effort to remove cases where the last parameter of a function is THREAD, and it's unclear why it is ignoring an exception or whether an exception is expected, if it doesn't subsequently have a check for HAS_PENDING_EXCEPTION. > > Tested locally with tier1 tests and tier1 tests on 4 Oracle platforms in progress. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: removed thread parameter from nest_host_error and callers. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3157/files - new: https://git.openjdk.java.net/jdk/pull/3157/files/57f5254e..3ce07844 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3157&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3157&range=00-01 Stats: 15 lines in 5 files changed: 0 ins; 0 del; 15 mod Patch: https://git.openjdk.java.net/jdk/pull/3157.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3157/head:pull/3157 PR: https://git.openjdk.java.net/jdk/pull/3157 From coleenp at openjdk.java.net Wed Mar 24 03:22:05 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 03:22:05 GMT Subject: RFR: 8264051: Remove unused TRAPS parameters from runtime functions [v2] In-Reply-To: <_Kcj8x3UIVkABjqGI3izel_-s2Xsak7UsGQiXQ0Xtms=.ecc09e06-2f9b-44b6-9596-df3319ccfd26@github.com> References: <_Kcj8x3UIVkABjqGI3izel_-s2Xsak7UsGQiXQ0Xtms=.ecc09e06-2f9b-44b6-9596-df3319ccfd26@github.com> Message-ID: On Wed, 24 Mar 2021 03:07:42 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/instanceKlass.hpp line 468: >> >>> 466: void set_nest_host_index(u2 i) { _nest_host_index = i; } >>> 467: // dynamic nest member support >>> 468: void set_nest_host(InstanceKlass* host, JavaThread* current); >> >> I think we should drop the thread parameter for this method and nest_host_error and avoid the tramp data. We only need the current thread for a ResourceMark in a logging clause in set_nest_host, and for a ConstantPoolHandle in nest_host_error. Neither of these codepaths are hot so we can just manifest Thread::current() when needed there. >> Then you can also remove the thread parameter from print_nest_host_error_on. > > You're right. I can remove a lot of thread arguments this way. Done. Reran runtime/Nestmates tests and tier1. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3157 From dholmes at openjdk.java.net Wed Mar 24 03:52:40 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 24 Mar 2021 03:52:40 GMT Subject: RFR: 8264051: Remove unused TRAPS parameters from runtime functions [v2] In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 03:22:05 GMT, Coleen Phillimore wrote: >> This change removes the TRAPS parameter from compute_modifier_flags(), lookup_instance_method_in_klasses and nest_host_error. >> >> There's a progressive effort to remove cases where the last parameter of a function is THREAD, and it's unclear why it is ignoring an exception or whether an exception is expected, if it doesn't subsequently have a check for HAS_PENDING_EXCEPTION. >> >> Tested locally with tier1 tests and tier1 tests on 4 Oracle platforms in progress. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > removed thread parameter from nest_host_error and callers. Looks good! Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3157 From iklam at openjdk.java.net Wed Mar 24 05:44:39 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 24 Mar 2021 05:44:39 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v5] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Wed, 24 Mar 2021 02:10:53 GMT, Coleen Phillimore wrote: >> Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. >> Tested with vmTestbase/nsk/jvmti and tier1 (in progress). > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add ExceptionMarks. Marked as reviewed by iklam (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/3141 From never at openjdk.java.net Wed Mar 24 06:06:39 2021 From: never at openjdk.java.net (Tom Rodriguez) Date: Wed, 24 Mar 2021 06:06:39 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 06:28:55 GMT, David Holmes wrote: >> 8264016: [JVMCI] add some thread local fields for use by JVMCI > > Hi Tom, > > Is it feasible to create a JVMCI helper side-object that is only created when needed, rather than embedding all the fields directly in the JavaThread instance? > > Thanks, > David Well the goal is to have storage that is only one or two loads away so adding a level of indirection isn't great. Many of the other JVMCI fields aren't really performance critical so they could be an extra indirection away but that's not a compatible JVMCI change since they are directly written by Graal for deopt. It also wouldn't save much space. I know there is sensitivity around making JavaThread larger so I tried not to go too crazy. Obviously I'd prefer to stick with what I have. Out of curiosity I'd used the clang option ``-Xclang -fdump-record-layouts`` to look at JavaThread in 8, 11 and 17. It's definitely getting fatter. 8 is 1024, 11 is 1264 and 17 is 1408, plus the huge alignment wastage caused by biased locking, though I guess that's not a problem in 17. Anyway, I could probably get back the space I'm using by rearranging some of the fields of JavaThread. There's a lot of wastage from switching between pointer, int and bool. I haven't done a deep analysis of how much could be recovered but I could look into reducing the overall size JavaThread by repacking if it makes adding these fields more palatable. > src/hotspot/share/runtime/thread.hpp line 1020: > >> 1018: intptr_t* _jvmci_reserved0; >> 1019: intptr_t* _jvmci_reserved1; >> 1020: oop _jvmci_reserved_oop0; > > Can this use OopStorage? 
We've been getting rid of oop fields and the corresponding oops_do support. Wouldn't using OopStorage require an extra level of indirection for the field? ------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From shade at openjdk.java.net Wed Mar 24 06:53:40 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 24 Mar 2021 06:53:40 GMT Subject: Integrated: 8264050: Remove unused field VM_HeapWalkOperation::_collecting_heap_roots In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 15:06:02 GMT, Aleksey Shipilev wrote: > SonarCloud reports field `_collecting_heap_roots` is not initialized after constructor ends. In fact, that field is not used anywhere. It was like that since the initial load. We can trivially remove it. This pull request has now been integrated. Changeset: da512bf5 Author: Aleksey Shipilev URL: https://git.openjdk.java.net/jdk/commit/da512bf5 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8264050: Remove unused field VM_HeapWalkOperation::_collecting_heap_roots Reviewed-by: coleenp, tschatzl ------------- PR: https://git.openjdk.java.net/jdk/pull/3153 From david.holmes at oracle.com Wed Mar 24 06:59:38 2021 From: david.holmes at oracle.com (David Holmes) Date: Wed, 24 Mar 2021 16:59:38 +1000 Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: <6a3c4e67-c5ce-f2fa-01c2-363123af8284@oracle.com> Hi Tom, On 24/03/2021 4:06 pm, Tom Rodriguez wrote: > On Tue, 23 Mar 2021 06:28:55 GMT, David Holmes wrote: > >>> 8264016: [JVMCI] add some thread local fields for use by JVMCI >> >> Hi Tom, >> >> Is it feasible to create a JVMCI helper side-object that is only created when needed, rather than embedding all the fields directly in the JavaThread instance? >> >> Thanks, >> David > > Well the goal is to have storage that is only one or two loads away so adding a level of indirection isn't great. 
Many of the other JVMCI fields aren't really performance critical so they could be an extra indirection away but that's not a compatible JVMCI change since they are directly written by Graal for deopt. It also wouldn't save much space. I know there is sensitivity around making JavaThread larger so I tried not to go too crazy. Obviously I'd prefer to stick with what I have. Understood. > Out of curiosity I'd used the clang option ``-Xclang -fdump-record-layouts`` to look at JavaThread in 8, 11 and 17. It's definitely getting fatter. 8 is 1024, 11 is 1264 and 17 is 1408, plus the huge alignment wastage caused by biased locking, though I guess that's not a problem in 17. Anyway, I could probably get back the space I'm using by rearranging some of the fields of JavaThread. There's a lot of wastage from switching between pointer, int and bool. I haven't done a deep analysis of how much could be recovered but I could look into reducing the overall size JavaThread by repacking if it makes adding these fields more palatable. Thanks for that info - that's quite a large growth since 11. And Biased-locking is still in 17 (removal deferred to 18) so we're still paying for additional alignment there. Does the C++ compiler attempt to do any field packing or is everything laid out as we write it? I could see a couple of int fields in Thread that might trigger gaps, and then in JavaThread there is a very suspect layout here: // suspend/resume support volatile bool _suspend_equivalent; // Suspend equivalent condition jint _in_deopt_handler; // count of deoptimization // handlers thread is in volatile bool _doing_unsafe_access; // Thread may fault due to unsafe access bool _do_not_unlock_if_synchronized; I don't know how frugal the C++ compiler actually is with bools, but perhaps some simple reordering there might reclaim some space? 
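To make the layout question concrete, here is a small sketch using two hypothetical structs that hold only the four fields quoted above (`jint` is modelled as `int32_t`; the struct names are invented for illustration). C++ does not let the compiler reorder non-static data members that share the same access control, so the padding follows the declaration order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// The four fields in the order quoted above. On a typical ABI the int
// needs 4-byte alignment, so padding is inserted after the first bool.
struct AsDeclared {
  volatile bool _suspend_equivalent;
  int32_t       _in_deopt_handler;
  volatile bool _doing_unsafe_access;
  bool          _do_not_unlock_if_synchronized;
};

// Same fields with the int first and the bools grouped together.
struct Reordered {
  int32_t       _in_deopt_handler;
  volatile bool _suspend_equivalent;
  volatile bool _doing_unsafe_access;
  bool          _do_not_unlock_if_synchronized;
};

// Bytes reclaimed by the reordering on the current platform.
inline std::size_t bytes_saved() {
  return sizeof(AsDeclared) - sizeof(Reordered);
}
```

On common 64-bit ABIs one would expect something like 12 bytes versus 8 bytes here, but the exact numbers are platform dependent; the `-fdump-record-layouts` option mentioned earlier in the thread is the way to check the real `JavaThread`.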
Obviously I'd prefer not to have Thread/JavaThread keep growing, but just because you're the most recent requestor doesn't mean the burden should fall to you to fix the overall layout problem. So I'm okay with what you propose and will file a RFE to have someone look into optimising the layout to see if we can recoup some space. >> src/hotspot/share/runtime/thread.hpp line 1020: >> >>> 1018: intptr_t* _jvmci_reserved0; >>> 1019: intptr_t* _jvmci_reserved1; >>> 1020: oop _jvmci_reserved_oop0; >> >> Can this use OopStorage? We've been getting rid of oop fields and the corresponding oops_do support. > > Wouldn't using OopStorage require an extra level of indirection for the field? Possibly - but there has been a very strong move to using oopStorage in any case. Probably best to ask Erik O./Kim/Coleen about that. Thanks, David ----- > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3147 > From dholmes at openjdk.java.net Wed Mar 24 07:15:41 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 24 Mar 2021 07:15:41 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 06:11:44 GMT, Tom Rodriguez wrote: > 8264016: [JVMCI] add some thread local fields for use by JVMCI Marked as reviewed by dholmes (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From stefank at openjdk.java.net Wed Mar 24 08:01:58 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 24 Mar 2021 08:01:58 GMT Subject: RFR: 8263721: Unify oop casting [v3] In-Reply-To: References: Message-ID: > In fastdebug builds we replace the "oopDesc* to oop" typedef with a wrapper class that holds an oopDesc*. This wrapper class allows any kind of pointer to be implicitly converted to an oop. So you can write code like this: > > Metadata* m = (Metadata*)0x123; > oop o = m; > > and the compiler will unfortunately accept it. 
Fortunately, this will be caught in release builds, because you can't convert a Method* into an oopDesc*. > > One interesting thing is that you can't convert values of integral type to oops: > > uintptr_t m = uintptr_t(123); > oop o = m; > > This fails in both fastdebug and release builds. To be able to convert integral values to oops, there are two helper functions: > > // For CHECK_UNHANDLED_OOPS, it is ambiguous C++ behavior to have the oop > // structure contain explicit user defined conversions of both numerical > // and pointer type. Define inline methods to provide the numerical conversions. > template <class T> inline oop cast_to_oop(T value) { > return (oop)(CHECK_UNHANDLED_OOPS_ONLY((void *))(value)); > } > template <class T> inline T cast_from_oop(oop o) { > return (T)(CHECK_UNHANDLED_OOPS_ONLY((oopDesc*))o); > } > > So, the above example would have to be written as: > > uintptr_t m = uintptr_t(123); > oop o = cast_to_oop(m); > > My proposal is that we stop allowing implicit (and explicit) casts from void*, and instead use cast_to_oop whenever we want to cast to oops. We would still allow oopDesc* to be implicitly converted to oop. This would also allow NULL to be converted to oop without casting: > > oop o = NULL; > > This will make the code to convert oops a little bit longer. It could be argued that that's a good thing, because everyone should be cautious about converting things into oops. This will also give us one entry-point where we could add (probably temporary) verification code. > > An alternative to the suggestion above could be to completely get rid of cast_to_oop and cast_from_oop. But for that to work we need to stop using NULL, which is an integral 0, and start to use nullptr for oops. I've prototyped this as well, but initial investigations showed that some tended to prefer having the cast_to_oop function. (We could still move from NULL to nullptr, if we think that is a good idea).
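The proposed rules can be illustrated with a self-contained sketch. Everything here is a simplified assumption: `oopDesc` and `Metadata` are empty stand-ins, the `oop` wrapper is a toy rather than the fastdebug class in HotSpot, and the helpers use plain C-style casts instead of the `CHECK_UNHANDLED_OOPS_ONLY` macro. The point is only to show a wrapper that converts implicitly from `oopDesc*` (and a null pointer) but from nothing else, with `cast_to_oop` as the single explicit entry point.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

struct oopDesc  { int dummy; };  // stand-in for the heap object header
struct Metadata { int dummy; };  // stand-in for an unrelated VM type

// Toy wrapper: the only converting constructor takes an oopDesc*.
class oop {
  oopDesc* _o;
 public:
  oop() : _o(nullptr) {}
  oop(oopDesc* o) : _o(o) {}
  oopDesc* obj() const { return _o; }
};

// Explicit entry points for every other conversion.
template <class T> inline oop cast_to_oop(T value) {
  return oop((oopDesc*)value);
}
template <class T> inline T cast_from_oop(oop o) {
  return (T)o.obj();
}

// Implicit conversions that are accepted under the proposal...
static_assert(std::is_convertible<oopDesc*, oop>::value, "oopDesc* converts");
static_assert(std::is_convertible<std::nullptr_t, oop>::value, "null converts");
// ...and the ones that are rejected at compile time.
static_assert(!std::is_convertible<Metadata*, oop>::value, "no Metadata*");
static_assert(!std::is_convertible<void*, oop>::value, "no void*");
static_assert(!std::is_convertible<uintptr_t, oop>::value, "no integers");
```

With this shape, `oop o = NULL;` still compiles because a null pointer constant converts to `oopDesc*`, while an integral value such as `uintptr_t(123)` has to go through `cast_to_oop`.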
Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: - Kims review comment - Merge remote-tracking branch 'origin/master' into 8263721_unify_oop_casting - Remove casts in merged changes - Merge remote-tracking branch 'origin/master' into 8263721_unify_oop_casting - Convert tab to spaces - Merge remote-tracking branch 'origin/master' into 8263721_unify_oop_casting - Stricter oop casts ------------- Changes: https://git.openjdk.java.net/jdk/pull/3047/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3047&range=02 Stats: 252 lines in 90 files changed: 3 ins; 4 del; 245 mod Patch: https://git.openjdk.java.net/jdk/pull/3047.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3047/head:pull/3047 PR: https://git.openjdk.java.net/jdk/pull/3047 From pli at openjdk.java.net Wed Mar 24 08:09:53 2021 From: pli at openjdk.java.net (Pengfei Li) Date: Wed, 24 Mar 2021 08:09:53 GMT Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line Message-ID: Recently we tested OpenJDK on some CPUs with 256-byte dcache line size. HotSpot AOT tests failed because the shared library compiled with the same VM options on the same machine are skipped when loaded back. Below command sequence shows a simple way to reproduce this issue. $ getconf -a | grep LEVEL1_DCACHE_LINESIZE LEVEL1_DCACHE_LINESIZE 256 $ jaotc --output a.so Hello.class $ java -XX:+UnlockExperimentalVMOptions -XX:+UseAOT -XX:AOTLibrary=./a.so -XX:+PrintAOT Hello Shared file ./a.so error: ContendedPaddingWidth has different value '256' from current '128' 4 1 skipped ./a.so aot library The default value of VM option `ContendedPaddingWidth` is 128. But on CPUs with L1 dcache line size larger than 128 bytes, the value is adjusted to the cache line size in `VM_Version_init()`. This adjustment is done after AOT library loading in `codeCache_init()`. 
So the AOT lib verifier still assumes the `ContendedPaddingWidth` in the compiled library should be 128 and thus causes the loaded library skipped. In my proposed fix, `AOTLoader::initialize()` is moved out of the general codecache initialization and placed after `VM_Version_init()`. The order of `codeCache_init()` and `VM_Version_init()` is not changed since there may be code emitted during `VM_Version_init()`, which depends on the general codecache init. Tested `hotspot::hotspot_all_no_apps`, `jdk::jdk_core` and `langtools::tier1`. ------------- Commit messages: - 8264006: Fix AOT library loading on CPUs with 256-byte dcache line Changes: https://git.openjdk.java.net/jdk/pull/3169/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3169&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264006 Stats: 11 lines in 4 files changed: 5 ins; 0 del; 6 mod Patch: https://git.openjdk.java.net/jdk/pull/3169.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3169/head:pull/3169 PR: https://git.openjdk.java.net/jdk/pull/3169 From aph at openjdk.java.net Wed Mar 24 10:24:40 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Wed, 24 Mar 2021 10:24:40 GMT Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 08:04:47 GMT, Pengfei Li wrote: > Recently we tested OpenJDK on some CPUs with 256-byte dcache line size. > HotSpot AOT tests failed because the shared library compiled with the > same VM options on the same machine are skipped when loaded back. > > Below command sequence shows a simple way to reproduce this issue. 
> > $ getconf -a | grep LEVEL1_DCACHE_LINESIZE > LEVEL1_DCACHE_LINESIZE 256 > > $ jaotc --output a.so Hello.class > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseAOT -XX:AOTLibrary=./a.so -XX:+PrintAOT Hello > Shared file ./a.so error: ContendedPaddingWidth has different value '256' from current '128' > 4 1 skipped ./a.so aot library > > The default value of VM option `ContendedPaddingWidth` is 128. But on CPUs > with L1 dcache line size larger than 128 bytes, the value is adjusted to > the cache line size in `VM_Version_init()`. This adjustment is done after > AOT library loading in `codeCache_init()`. So the AOT lib verifier still > assumes the `ContendedPaddingWidth` in the compiled library should be 128 > and thus causes the loaded library skipped. > > In my proposed fix, `AOTLoader::initialize()` is moved out of the general > codecache initialization and placed after `VM_Version_init()`. The order > of `codeCache_init()` and `VM_Version_init()` is not changed since there may > be code emitted during `VM_Version_init()`, which depends on the general > codecache init. > > Tested `hotspot::hotspot_all_no_apps`, `jdk::jdk_core` and `langtools::tier1`. I must confess I don't like this solution at all: it sounds very delicate. Couldn't you define a function `VM_Version::get_ContendedPaddingWidth()`and call that? ------------- PR: https://git.openjdk.java.net/jdk/pull/3169 From stefank at openjdk.java.net Wed Mar 24 10:32:41 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 24 Mar 2021 10:32:41 GMT Subject: RFR: 8263721: Unify oop casting [v2] In-Reply-To: References: <9WBFjpLGVwGtmj6k1PosqnNsp5qbQEObjtS6A_B5Apg=.e29ef71a-8355-437a-a868-39ea123733b0@github.com> Message-ID: On Tue, 23 Mar 2021 19:10:00 GMT, Coleen Phillimore wrote: >> Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains five commits: >> >> - Remove casts in merged changes >> - Merge remote-tracking branch 'origin/master' into 8263721_unify_oop_casting >> - Convert tab to spaces >> - Merge remote-tracking branch 'origin/master' into 8263721_unify_oop_casting >> - Stricter oop casts > > Looks like a safe change to me. I didn't realize there were still so many files with oop casts. I only scrolled by the GC ones though. Thanks for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/3047 From stefank at openjdk.java.net Wed Mar 24 10:32:42 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 24 Mar 2021 10:32:42 GMT Subject: Integrated: 8263721: Unify oop casting In-Reply-To: References: Message-ID: On Wed, 17 Mar 2021 13:11:31 GMT, Stefan Karlsson wrote: > In fastdebug builds we replace the "oopDesc* to oop" typedef with a wrapper class that holds an oopDesc*. This wrapper class allows any kind of pointer to be implicitly converted to an oop. So you can write code like this: > > Metadata* m = (Metadata*)0x123; > oop o = m; > > and the compiler will unfortunately accept it. Fortunately, this will be caught in release builds, because you can't convert a Method* into an oopDesc*. > > One interesting thing is that you can't convert values of integral type to oops: > > uintptr_t m = uintptr_t(123); > oop o = m; > > This fails in both fastdebug and release builds. To be able to convert integral values to oops, there are two helper functions: > > // For CHECK_UNHANDLED_OOPS, it is ambiguous C++ behavior to have the oop > // structure contain explicit user defined conversions of both numerical > // and pointer type. Define inline methods to provide the numerical conversions.
> template <class T> inline oop cast_to_oop(T value) { > return (oop)(CHECK_UNHANDLED_OOPS_ONLY((void *))(value)); > } > template <class T> inline T cast_from_oop(oop o) { > return (T)(CHECK_UNHANDLED_OOPS_ONLY((oopDesc*))o); > } > > So, the above example would have to be written as: > > uintptr_t m = uintptr_t(123); > oop o = cast_to_oop(m); > > My proposal is that we stop allowing implicit (and explicit) casts from void*, and instead use cast_to_oop whenever we want to cast to oops. We would still allow oopDesc* to be implicitly converted to oop. This would also allow NULL to be converted to oop without casting: > > oop o = NULL; > > This will make the code to convert oops a little bit longer. It could be argued that that's a good thing, because everyone should be cautious about converting things into oops. This will also give us one entry-point where we could add (probably temporary) verification code. > > An alternative to the suggestion above could be to completely get rid of cast_to_oop and cast_from_oop. But for that to work we need to stop using NULL, which is an integral 0, and start to use nullptr for oops. I've prototyped this as well, but initial investigations showed that some tended to prefer having the cast_to_oop function. (We could still move from NULL to nullptr, if we think that is a good idea). This pull request has now been integrated.
Changeset: a79f0956 Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/a79f0956 Stats: 252 lines in 90 files changed: 3 ins; 4 del; 245 mod 8263721: Unify oop casting Reviewed-by: kbarrett, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/3047 From rpressler at openjdk.java.net Wed Mar 24 11:13:41 2021 From: rpressler at openjdk.java.net (Ron Pressler) Date: Wed, 24 Mar 2021 11:13:41 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 07:12:26 GMT, David Holmes wrote: >> 8264016: [JVMCI] add some thread local fields for use by JVMCI > > Marked as reviewed by dholmes (Reviewer). Note that in Loom, j.l.Thread is no longer tied to a JavaThread, and could migrate among JavaThreads, potentially at any safepoint, unless specifically pinned. Thread-local information that is logically associated with the j.l.Thread can be placed as a field on JavaThread if it's only set and used while the j.l.Thread is mounted on the JavaThread (i.e. between consecutive safepoints); otherwise, it should go on the j.l.Thread class. ------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From coleenp at openjdk.java.net Wed Mar 24 12:19:41 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 12:19:41 GMT Subject: RFR: 8264004: Don't use TRAPS if no exceptions are thrown [v5] In-Reply-To: References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: <0T26-UOgZBHAhgtrUuJHstrWl-GdyZ-CiqqowSrGdHI=.d837da6e-b8ee-43ef-b92d-4bf9fcc2d530@github.com> On Wed, 24 Mar 2021 05:42:06 GMT, Ioi Lam wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Add ExceptionMarks. > > Marked as reviewed by iklam (Reviewer). Thanks for reviewing! 
------------- PR: https://git.openjdk.java.net/jdk/pull/3141 From coleenp at openjdk.java.net Wed Mar 24 12:19:42 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 12:19:42 GMT Subject: Integrated: 8264004: Don't use TRAPS if no exceptions are thrown In-Reply-To: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> References: <1KBghJAnPzq1F4LTwaaa8sUTy-BiLlO4uXSmYC1XGPA=.46cc3327-0af0-409f-bf1a-07066c60b0de@github.com> Message-ID: On Tue, 23 Mar 2021 01:04:00 GMT, Coleen Phillimore wrote: > Removed the TRAPS in function declarations in jvmtiRedefineClasses and in ConstantPool merging functions. > Tested with vmTestbase/nsk/jvmti and tier1 (in progress). This pull request has now been integrated. Changeset: 5d7e93c8 Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/5d7e93c8 Stats: 209 lines in 4 files changed: 7 ins; 41 del; 161 mod 8264004: Don't use TRAPS if no exceptions are thrown Reviewed-by: dholmes, iklam, hseigel, dcubed ------------- PR: https://git.openjdk.java.net/jdk/pull/3141 From coleenp at openjdk.java.net Wed Mar 24 12:22:41 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 12:22:41 GMT Subject: Integrated: 8264051: Remove unused TRAPS parameters from runtime functions In-Reply-To: References: Message-ID: <6a-CuUWVEzmQ3dbzXYGI6wCB0nnWEJGiSABpwRDSaMQ=.d5cd639f-040c-4a6d-87ef-6189dcb84c03@github.com> On Tue, 23 Mar 2021 16:40:44 GMT, Coleen Phillimore wrote: > This change removes the TRAPS parameter from compute_modifier_flags(), lookup_instance_method_in_klasses and nest_host_error. > > There's a progressive effort to remove cases where the last parameter of a function is THREAD, and it's unclear why it is ignoring an exception or whether an exception is expected, if it doesn't subsequently have a check for HAS_PENDING_EXCEPTION. > > Tested locally with tier1 tests and tier1 tests on 4 Oracle platforms in progress. 
This pull request has now been integrated. Changeset: bc91596c Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/bc91596c Stats: 52 lines in 16 files changed: 0 ins; 17 del; 35 mod 8264051: Remove unused TRAPS parameters from runtime functions Reviewed-by: iklam, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/3157 From coleenp at openjdk.java.net Wed Mar 24 12:22:40 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 12:22:40 GMT Subject: RFR: 8264051: Remove unused TRAPS parameters from runtime functions [v2] In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 03:50:14 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> removed thread parameter from nest_host_error and callers. > > Looks good! > > Thanks, > David Thanks for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/3157 From sjohanss at openjdk.java.net Wed Mar 24 13:10:03 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Wed, 24 Mar 2021 13:10:03 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v2] In-Reply-To: References: Message-ID: > Please review this refactoring of the hugetlbfs reservation code. > > **Summary** > In recent adventures in this area of the code I noticed a strange condition in `reserve_memory_special_huge_tlbfs` where we take the "mixed-mapping" route even if the size doesn't require any small pages to be used: > if (is_aligned(bytes, os::large_page_size()) && alignment <= os::large_page_size()) { > return reserve_memory_special_huge_tlbfs_only(bytes, req_addr, exec); > } else { > return reserve_memory_special_huge_tlbfs_mixed(bytes, alignment, req_addr, exec); > } > > The second condition here is needed because if the alignment is larger than the large page size, we needed to enforce this and can't just trust `mmap` to give us a properly aligned address. 
Doing this by using the mixed-function feels a bit weird and looking a bit more at this I found a way to refactor this function to avoid having the two helpers. > > Instead of only having the mixed path honor the passed down alignment, make sure that is always done. This will also have the side-effect that all large pages in a "mixed"-mapping will be at the start and then we will have a tail of small pages. This actually also ensures that we will use large pages for a mixed mapping, in the past there was a corner case where we could end up with just a head and tail of small pages and no large page in between (if the mapping was smaller than 2 large pages and there was no alignment constraint). > > **Testing** > Mach5 tier1-3 and a lot of local testing with different large page configurations. Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: Ivan review Renamed helper to commit_memory_special and updated the comments. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3073/files - new: https://git.openjdk.java.net/jdk/pull/3073/files/6faf7e19..38a13144 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3073&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3073&range=00-01 Stats: 19 lines in 2 files changed: 2 ins; 0 del; 17 mod Patch: https://git.openjdk.java.net/jdk/pull/3073.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3073/head:pull/3073 PR: https://git.openjdk.java.net/jdk/pull/3073 From iwalulya at openjdk.java.net Wed Mar 24 13:17:41 2021 From: iwalulya at openjdk.java.net (Ivan Walulya) Date: Wed, 24 Mar 2021 13:17:41 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v2] In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 13:10:03 GMT, Stefan Johansson wrote: >> Please review this refactoring of the hugetlbfs reservation code. 
>> >> **Summary** >> In recent adventures in this area of the code I noticed a strange condition in `reserve_memory_special_huge_tlbfs` where we take the "mixed-mapping" route even if the size doesn't require any small pages to be used: >> if (is_aligned(bytes, os::large_page_size()) && alignment <= os::large_page_size()) { >> return reserve_memory_special_huge_tlbfs_only(bytes, req_addr, exec); >> } else { >> return reserve_memory_special_huge_tlbfs_mixed(bytes, alignment, req_addr, exec); >> } >> >> The second condition here is needed because if the alignment is larger than the large page size, we needed to enforce this and can't just trust `mmap` to give us a properly aligned address. Doing this by using the mixed-function feels a bit weird and looking a bit more at this I found a way to refactor this function to avoid having the two helpers. >> >> Instead of only having the mixed path honor the passed down alignment, make sure that is always done. This will also have the side-effect that all large pages in a "mixed"-mapping will be at the start and then we will have a tail of small pages. This actually also ensures that we will use large pages for a mixed mapping, in the past there was a corner case where we could end up with just a head and tail of small pages and no large page in between (if the mapping was smaller than 2 large pages and there was no alignment constraint). >> >> **Testing** >> Mach5 tier1-3 and a lot of local testing with different large page configurations. > > Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: > > Ivan review > > Renamed helper to commit_memory_special and updated the comments. lgtm! ------------- Marked as reviewed by iwalulya (Committer). 
PR: https://git.openjdk.java.net/jdk/pull/3073 From dholmes at openjdk.java.net Wed Mar 24 13:29:39 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 24 Mar 2021 13:29:39 GMT Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 10:21:28 GMT, Andrew Haley wrote: >> Recently we tested OpenJDK on some CPUs with 256-byte dcache line size. >> HotSpot AOT tests failed because the shared library compiled with the >> same VM options on the same machine are skipped when loaded back. >> >> Below command sequence shows a simple way to reproduce this issue. >> >> $ getconf -a | grep LEVEL1_DCACHE_LINESIZE >> LEVEL1_DCACHE_LINESIZE 256 >> >> $ jaotc --output a.so Hello.class >> >> $ java -XX:+UnlockExperimentalVMOptions -XX:+UseAOT -XX:AOTLibrary=./a.so -XX:+PrintAOT Hello >> Shared file ./a.so error: ContendedPaddingWidth has different value '256' from current '128' >> 4 1 skipped ./a.so aot library >> >> The default value of VM option `ContendedPaddingWidth` is 128. But on CPUs >> with L1 dcache line size larger than 128 bytes, the value is adjusted to >> the cache line size in `VM_Version_init()`. This adjustment is done after >> AOT library loading in `codeCache_init()`. So the AOT lib verifier still >> assumes the `ContendedPaddingWidth` in the compiled library should be 128 >> and thus causes the loaded library skipped. >> >> In my proposed fix, `AOTLoader::initialize()` is moved out of the general >> codecache initialization and placed after `VM_Version_init()`. The order >> of `codeCache_init()` and `VM_Version_init()` is not changed since there may >> be code emitted during `VM_Version_init()`, which depends on the general >> codecache init. >> >> Tested `hotspot::hotspot_all_no_apps`, `jdk::jdk_core` and `langtools::tier1`. > > I must confess I don't like this solution at all: it sounds very delicate. 
Couldn't you define a function `VM_Version::get_ContendedPaddingWidth()`and call that? The compiler folk will need to see if VM_Version::initialize itself has any dependencies on the AOTLoader initialization. Changing the initialization order is always risky. On a side note please don't modify any copyright line except for Oracle's when modifying files, unless instructed to by the owner of that copyright. Thanks, David ------------- PR: https://git.openjdk.java.net/jdk/pull/3169 From dholmes at openjdk.java.net Wed Mar 24 13:32:43 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 24 Mar 2021 13:32:43 GMT Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 13:26:31 GMT, David Holmes wrote: >> I must confess I don't like this solution at all: it sounds very delicate. Couldn't you define a function `VM_Version::get_ContendedPaddingWidth()`and call that? > > The compiler folk will need to see if VM_Version::initialize itself has any dependencies on the AOTLoader initialization. Changing the initialization order is always risky. > > On a side note please don't modify any copyright line except for Oracle's when modifying files, unless instructed to by the owner of that copyright. > > Thanks, > David It may be possible to factor out the necessary logic to the VM_Version::early_initialize() function instead. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3169 From fdavid at openjdk.java.net Wed Mar 24 14:44:44 2021 From: fdavid at openjdk.java.net (Florian David) Date: Wed, 24 Mar 2021 14:44:44 GMT Subject: Withdrawn: 8258414: OldObjectSample events too expensive In-Reply-To: <1_LsNBt-Yy5NlHbfwtRSRNvGa2AbTuhMGYuiw3Hy8gU=.3b79e283-87fe-451e-8e60-25b59c5e837a@github.com> References: <1_LsNBt-Yy5NlHbfwtRSRNvGa2AbTuhMGYuiw3Hy8gU=.3b79e283-87fe-451e-8e60-25b59c5e837a@github.com> Message-ID: <5MSScgUTaCRsk8cAhcUyHFkJ59F9teapMWImRepZNPU=.1e5bae48-8816-4c40-a23d-ed89165bfc1e@github.com> On Fri, 19 Feb 2021 14:45:00 GMT, Florian David wrote: > The purpose of this change is to reduce the size of JFR recordings when the OldObjectSample event is enabled. > > ## Problem > JFR recording size blows up when the OldObjectSample event is enabled. The memory allocation events are known to be very high traffic and produce a lot of data through the sheer number of events alone; if stack traces are added to this, the associated metadata can be huge as well. Sampled objects are stored in a priority queue and their associated stack traces are stored in JFRStackTraceRepository. When sample candidates are removed from the priority queue, their stack traces remain in the repository and are later written at chunk rotation even though the sample has been removed. > > ## Implementation > This PR adds a JFRStackTraceRepository dedicated to storing stack traces for the OldObjectSample event. At chunk rotation, every sample stack trace is looked up in this repository and is serialized. Other stack traces are simply removed. > > ## Benchmarks > On an AWS c5.metal instance (96 cores, 192 GiB), running SPECjvm2008 with the default profile.jfc configuration and the OldObjectSample event enabled gives: > - a recording size of 20.73 MB without the PR fix > - a recording size of 2.78 MB with the PR fix This pull request has been closed without being integrated.
------------- PR: https://git.openjdk.java.net/jdk/pull/2645 From minqi at openjdk.java.net Wed Mar 24 15:35:55 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Wed, 24 Mar 2021 15:35:55 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v6] In-Reply-To: References: Message-ID: > Hi, Please review > > Added a jcmd option for dumping a CDS archive during application runtime. Before this change, the user had to dump a shared archive in two steps: first run the application with > `java -XX:DumpLoadedClassList= .... ` > to collect shareable class names and save them in file `` , then > `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` > With this change, the user can use jcmd to dump CDS without going through the above steps. The user can also choose a moment during the app runtime to dump an archive. > The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. > Newly added jcmd option: > `jcmd VM.cds static_dump ` > or > `jcmd VM.cds dynamic_dump ` > Dumping a dynamic archive requires starting the app with the newly added flag `-XX:+RecordDynamicDumpInfo`; with this flag, some information related to the dynamic dump, such as loader constraints, is recorded. Note that the dumping process changes some object memory locations, so a dynamic archive can only be dumped once for a running app. A static dump can be performed multiple times against the same process. > The file name is optional; if it is not supplied, the file name takes the format `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. The `` is the application process ID.
> > Tests: tier1,tier2,tier3,tier4 > > Thanks > Yumin Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Remove redundant check for if a class is shareable ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2737/files - new: https://git.openjdk.java.net/jdk/pull/2737/files/e882a074..3834f042 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2737&range=05 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2737&range=04-05 Stats: 52 lines in 2 files changed: 1 ins; 33 del; 18 mod Patch: https://git.openjdk.java.net/jdk/pull/2737.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2737/head:pull/2737 PR: https://git.openjdk.java.net/jdk/pull/2737 From coleenp at openjdk.java.net Wed Mar 24 16:25:50 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 16:25:50 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions Message-ID: find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. Also: is_shared_class_visible{_impl} Tested with tier1 on 4 Oracle platforms (in progress) ------------- Commit messages: - Remove more TRAPS parameters for functions that don't throw or propagate exceptions. - Remove more TRAPS parameters for functions that don't throw or propagate exceptions. - Remove more TRAPS parameters for functions that don't throw or propagate exceptions. 
Changes: https://git.openjdk.java.net/jdk/pull/3176/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3176&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264126 Stats: 50 lines in 11 files changed: 0 ins; 5 del; 45 mod Patch: https://git.openjdk.java.net/jdk/pull/3176.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3176/head:pull/3176 PR: https://git.openjdk.java.net/jdk/pull/3176 From iklam at openjdk.java.net Wed Mar 24 16:51:54 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 24 Mar 2021 16:51:54 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 06:11:44 GMT, Tom Rodriguez wrote: > 8264016: [JVMCI] add some thread local fields for use by JVMCI LGTM ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3147 From coleenp at openjdk.java.net Wed Mar 24 17:02:42 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 17:02:42 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 06:11:44 GMT, Tom Rodriguez wrote: > 8264016: [JVMCI] add some thread local fields for use by JVMCI We made the _threadObj field in JavaThread an OopStorage to avoid a crash during thread deletion. It's not recommended to add oops directly to runtime data structures. I have to dig up the bug first. > I haven't done a deep analysis of how much could be recovered but I could look into reducing the overall size JavaThread by > repacking if it makes adding these fields more palatable. We have this sort of on our (internal) list with other improvements, so don't do anything here. We also had a JVMCI bug that suggested adding these fields to a side .hpp file and referring to it in JavaThread by that container. Like HandshakeState. You can optimize the field layout inside this as you want. 
I'm marking "Request changes" until I've had time to dig up the bug. ------------- Changes requested by coleenp (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3147 From ccheung at openjdk.java.net Wed Mar 24 17:05:45 2021 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Wed, 24 Mar 2021 17:05:45 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 16:20:30 GMT, Coleen Phillimore wrote: > find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. > > The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. > > check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. > > Also: is_shared_class_visible{_impl} > > Tested with tier1 on 4 Oracle platforms (in progress) LGTM. ------------- Marked as reviewed by ccheung (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3176 From github.com+168222+mgkwill at openjdk.java.net Wed Mar 24 17:22:45 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Wed, 24 Mar 2021 17:22:45 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v2] In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 13:10:03 GMT, Stefan Johansson wrote: >> Please review this refactoring of the hugetlbfs reservation code. 
>> >> **Summary** >> In recent adventures in this area of the code I noticed a strange condition in `reserve_memory_special_huge_tlbfs` where we take the "mixed-mapping" route even if the size doesn't require any small pages to be used: >> if (is_aligned(bytes, os::large_page_size()) && alignment <= os::large_page_size()) { >> return reserve_memory_special_huge_tlbfs_only(bytes, req_addr, exec); >> } else { >> return reserve_memory_special_huge_tlbfs_mixed(bytes, alignment, req_addr, exec); >> } >> >> The second condition here is needed because if the alignment is larger than the large page size, we needed to enforce this and can't just trust `mmap` to give us a properly aligned address. Doing this by using the mixed-function feels a bit weird and looking a bit more at this I found a way to refactor this function to avoid having the two helpers. >> >> Instead of only having the mixed path honor the passed down alignment, make sure that is always done. This will also have the side-effect that all large pages in a "mixed"-mapping will be at the start and then we will have a tail of small pages. This actually also ensures that we will use large pages for a mixed mapping, in the past there was a corner case where we could end up with just a head and tail of small pages and no large page in between (if the mapping was smaller than 2 large pages and there was no alignment constraint). >> >> **Testing** >> Mach5 tier1-3 and a lot of local testing with different large page configurations. > > Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: > > Ivan review > > Renamed helper to commit_memory_special and updated the comments. Changes requested by mgkwill at github.com (no known OpenJDK username). src/hotspot/os/linux/os_linux.cpp line 3976: > 3974: // If the size is not a multiple of the large page size, we > 3975: // will mix the type of pages used, but in a decending order. 
> 3976: // Start of by reserving a range of the given size that is Small typo, ` // Start of by reserving` should be ` // Start off by reserving` or ` // Start by reserving` src/hotspot/os/linux/os_linux.cpp line 3987: > 3985: } > 3986: > 3987: // Start of by committing large pages. Small Typo ` // Start of` should be ` // Start off` or ` // Start by` src/hotspot/os/linux/os_linux.cpp line 4003: > 4001: // Failed to commit large pages, so we need to unmap the > 4002: // reminder of the orinal reservation. > 4003: ::munmap(small_start, small_size); I'm assuming that if mmap fails for large pages, it un-maps the reservation area requested for large pages and thus here we only need to munmap for remaining reservation (small pages)? src/hotspot/os/linux/os_linux.cpp line 4007: > 4005: } > 4006: > 4007: // Commit the reminding bytes using small pages. ` // Commit the reminding` should be ` // Commit the remaining` ------------- PR: https://git.openjdk.java.net/jdk/pull/3073 From github.com+168222+mgkwill at openjdk.java.net Wed Mar 24 17:22:45 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Wed, 24 Mar 2021 17:22:45 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v2] In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 20:16:52 GMT, Stefan Johansson wrote: >> src/hotspot/os/linux/os_linux.cpp line 3981: >> >>> 3979: // and the given alignment. The larger of the two will be used. >>> 3980: size_t required_alignment = MAX(os::large_page_size(), alignment); >>> 3981: char* const aligned_start = anon_mmap_aligned(req_addr, bytes, required_alignment); >> >> Do we need to add back the comment "// First reserve - but not commit"? > > My hope was that the new comment would be sufficient, but if you think it is needed I could add that the reserved range is not committed here. I agree that current comment encompasses `// First reserve - but not commit`. 
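For readers following the thread, the head-of-large-pages / tail-of-small-pages layout described in the PR summary can be sketched as plain arithmetic. This is only an illustration under stated assumptions, not the code in os_linux.cpp: `MixedMappingSplit`, `align_down_to`, `split_mixed_mapping` and `required_alignment` are hypothetical names, the large page size is assumed to be a power of two, and the start of the range is assumed to be already aligned to the value `required_alignment` returns (mirroring the `MAX(os::large_page_size(), alignment)` line quoted above).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type (not HotSpot code): how a reservation of
// `bytes` divides into a large-page head and a small-page tail.
struct MixedMappingSplit {
  std::size_t large_bytes;  // committed with large pages, at the start
  std::size_t small_bytes;  // remainder, committed with small pages
};

// Round `value` down to a multiple of a power-of-two `alignment`.
constexpr std::size_t align_down_to(std::size_t value, std::size_t alignment) {
  return value & ~(alignment - 1);
}

// Assumes the start address is already aligned to at least the large page
// size, so every whole large page fits at the front of the range.
inline MixedMappingSplit split_mixed_mapping(std::size_t bytes,
                                             std::size_t large_page_size) {
  assert(large_page_size != 0 &&
         (large_page_size & (large_page_size - 1)) == 0);
  const std::size_t large_bytes = align_down_to(bytes, large_page_size);
  return MixedMappingSplit{large_bytes, bytes - large_bytes};
}

// The larger of the requested alignment and the large page size.
inline std::size_t required_alignment(std::size_t alignment,
                                      std::size_t large_page_size) {
  return alignment > large_page_size ? alignment : large_page_size;
}
```

With a 2 MiB large page size, a 5 MiB request would split into a 4 MiB large-page head and a 1 MiB small-page tail under these assumptions; a request that is an exact multiple of the large page size gets an empty tail.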
------------- PR: https://git.openjdk.java.net/jdk/pull/3073 From github.com+168222+mgkwill at openjdk.java.net Wed Mar 24 17:47:47 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Wed, 24 Mar 2021 17:47:47 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v2] In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 13:10:03 GMT, Stefan Johansson wrote: >> Please review this refactoring of the hugetlbfs reservation code. >> >> **Summary** >> In recent adventures in this area of the code I noticed a strange condition in `reserve_memory_special_huge_tlbfs` where we take the "mixed-mapping" route even if the size doesn't require any small pages to be used: >> if (is_aligned(bytes, os::large_page_size()) && alignment <= os::large_page_size()) { >> return reserve_memory_special_huge_tlbfs_only(bytes, req_addr, exec); >> } else { >> return reserve_memory_special_huge_tlbfs_mixed(bytes, alignment, req_addr, exec); >> } >> >> The second condition here is needed because if the alignment is larger than the large page size, we needed to enforce this and can't just trust `mmap` to give us a properly aligned address. Doing this by using the mixed-function feels a bit weird and looking a bit more at this I found a way to refactor this function to avoid having the two helpers. >> >> Instead of only having the mixed path honor the passed down alignment, make sure that is always done. This will also have the side-effect that all large pages in a "mixed"-mapping will be at the start and then we will have a tail of small pages. This actually also ensures that we will use large pages for a mixed mapping, in the past there was a corner case where we could end up with just a head and tail of small pages and no large page in between (if the mapping was smaller than 2 large pages and there was no alignment constraint). >> >> **Testing** >> Mach5 tier1-3 and a lot of local testing with different large page configurations. 
> > Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: > > Ivan review > > Renamed helper to commit_memory_special and updated the comments. Suggestions are mostly nits, if the assumption in my comment about un-mapping is correct (I suspect so based on your conversation with Ivan (@walulyai)). Looks good otherwise from my perspective. ------------- Marked as reviewed by mgkwill at github.com (no known OpenJDK username). PR: https://git.openjdk.java.net/jdk/pull/3073 From coleenp at openjdk.java.net Wed Mar 24 18:27:44 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 18:27:44 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 17:02:38 GMT, Calvin Cheung wrote: >> find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. >> >> The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. >> >> check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. >> >> Also: is_shared_class_visible{_impl} >> >> Tested with tier1 on 4 Oracle platforms (in progress) > > LGTM. Thanks Calvin! 
------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From coleenp at openjdk.java.net Wed Mar 24 18:46:40 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 18:46:40 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 06:04:15 GMT, Tom Rodriguez wrote: >> src/hotspot/share/runtime/thread.hpp line 1020: >> >>> 1018: intptr_t* _jvmci_reserved0; >>> 1019: intptr_t* _jvmci_reserved1; >>> 1020: oop _jvmci_reserved_oop0; >> >> Can this use OopStorage? We've been getting rid of oop fields and the corresponding oops_do support. > > Wouldn't using OopStorage require an extra level of indirection for the field? It was https://bugs.openjdk.java.net/browse/JDK-8244997 - you were co-author :) Maybe this jvmci_reserved_oop0 won't crash for the same reasons. I don't know that. I still would like to not see 45 lines of declarations for JVMCI added to JavaThread. These should be in a separate header file and declared, as in https://bugs.openjdk.java.net/browse/JDK-8137018. If you promise to fix 8137018, I'm fine with this change. ------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From never at openjdk.java.net Wed Mar 24 18:55:40 2021 From: never at openjdk.java.net (Tom Rodriguez) Date: Wed, 24 Mar 2021 18:55:40 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 17:00:17 GMT, Coleen Phillimore wrote: >> 8264016: [JVMCI] add some thread local fields for use by JVMCI > > We made the _threadObj field in JavaThread an OopStorage to avoid a crash during thread deletion. It's not recommended to add oops directly to runtime data structures. I have to dig up the bug first. > >> I haven't done a deep analysis of how much could be recovered but I could look into reducing the overall size JavaThread by > repacking if it makes adding these fields more palatable. 
> > We have this sort of on our (internal) list with other improvements, so don't do anything here. > We also had a JVMCI bug that suggested adding these fields to a side .hpp file and referring to it in JavaThread by that container. Like HandshakeState. You can optimize the field layout inside this as you want. > > I'm marking "Request changes" until I've had time to dig up the bug. No the C++ compiler just emits them in the declared order. Since we group the field declarations by their relatedness it's not easy to get a completely good overall packing. There's actually less direct waste than I expected in 17, though there are lots of cases where int are used to stored bools. clang doesn't appear to optimize the storage for enums so there are lots of small enums stored in ints instead of bytes. I believe gcc will use the smallest storage available. There's also dubious stuff like the ``char[64]`` in the HandshakeState and lots of statistics that could live elsewhere. Is there a bug I could attach my observations to? It's nothing earth shattering but the clang reported layout is interesting to see. ------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From iklam at openjdk.java.net Wed Mar 24 19:06:41 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 24 Mar 2021 19:06:41 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 16:20:30 GMT, Coleen Phillimore wrote: > find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. > > The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. > > check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. 
> > Also: is_shared_class_visible{_impl} > > Tested with tier1 on 4 Oracle platforms (in progress) Changes requested by iklam (Reviewer). src/hotspot/share/ci/ciEnv.cpp line 428: > 426: } > 427: > 428: Handle loader; I think it's better to get rid of the EXCEPTION_CONTEXT above, and add Thread* current = Thread::current(); Also, the extensive use of EXCEPTION_CONTEXT in the JVMCI code should be reviewed. I think they probably need to be either removed or changed to EXCEPTION_MARK. src/hotspot/share/classfile/systemDictionary.cpp line 1234: > 1232: Symbol* class_name = ik->name(); > 1233: > 1234: bool visible = is_shared_class_visible(class_name, ik, pkg_entry, class_loader); No longer need the local `visible`. Such locals were needed because we couldn't do if (foobar(a, b, c, CHECK_NULL)) { return NULL; } src/hotspot/share/classfile/systemDictionary.cpp line 1842: > 1840: klass = Universe::typeArrayKlassObj(t); > 1841: } else { > 1842: MutexLocker mu(SystemDictionary_lock); Since this is a clean up RFE, I think it's better to avoid changes that may impact performance. I would avoid adding calls to Thread::current() -- except for cases inside logging code. Maybe change TRAPS to Thread* current and move it to the first parameter? I.e., how you changed SystemDictionaryShared::check_linking_constraints(). 
src/hotspot/share/classfile/systemDictionaryShared.cpp line 1864: > 1862: } > 1863: > 1864: if (Thread::current()->is_VM_thread()) { For performance, maybe it's better: if (DynamicDumpSharedSpaces) { if (Thread::current()->is_VM_thread()) { return; } } else { assert(!Thread::current()->is_VM_thread(), "....."); } ------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From coleenp at openjdk.java.net Wed Mar 24 19:22:39 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 19:22:39 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 18:53:18 GMT, Tom Rodriguez wrote: >> We made the _threadObj field in JavaThread an OopStorage to avoid a crash during thread deletion. It's not recommended to add oops directly to runtime data structures. I have to dig up the bug first. >> >>> I haven't done a deep analysis of how much could be recovered but I could look into reducing the overall size JavaThread by > repacking if it makes adding these fields more palatable. >> >> We have this sort of on our (internal) list with other improvements, so don't do anything here. >> We also had a JVMCI bug that suggested adding these fields to a side .hpp file and referring to it in JavaThread by that container. Like HandshakeState. You can optimize the field layout inside this as you want. >> >> I'm marking "Request changes" until I've had time to dig up the bug. > > No the C++ compiler just emits them in the declared order. Since we group the field declarations by their relatedness it's not easy to get a completely good overall packing. There's actually less direct waste than I expected in 17, though there are lots of cases where int are used to stored bools. clang doesn't appear to optimize the storage for enums so there are lots of small enums stored in ints instead of bytes. I believe gcc will use the smallest storage available. 
There's also dubious stuff like the ``char[64]`` in the HandshakeState and lots of statistics that could live elsewhere. Is there a bug I could attach my observations to? It's nothing earth shattering but the clang reported layout is interesting to see. I have a fix for the Mutex in HandshakeState. Here's a bug for you to fill in. That would be great. https://bugs.openjdk.java.net/browse/JDK-8264145 ------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From coleenp at openjdk.java.net Wed Mar 24 19:22:39 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 19:22:39 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 19:15:01 GMT, Coleen Phillimore wrote: >> No the C++ compiler just emits them in the declared order. Since we group the field declarations by their relatedness it's not easy to get a completely good overall packing. There's actually less direct waste than I expected in 17, though there are lots of cases where int are used to stored bools. clang doesn't appear to optimize the storage for enums so there are lots of small enums stored in ints instead of bytes. I believe gcc will use the smallest storage available. There's also dubious stuff like the ``char[64]`` in the HandshakeState and lots of statistics that could live elsewhere. Is there a bug I could attach my observations to? It's nothing earth shattering but the clang reported layout is interesting to see. > > I have a fix for the Mutex in HandshakeState. Here's a bug for you to fill in. That would be great. > https://bugs.openjdk.java.net/browse/JDK-8264145 One of the things we've talked about is making the hierarchy be something like: Thread NamedThread -> WatcherThread SafepointAwareThread -> CompilerThread -> JavaThread So that CompilerThread isn't a JavaThread. But we're not going to do that very soon. 
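The packing observation quoted above (fields are laid out in declared order, and small values stored in wide types waste space) can be illustrated with a toy example. This is a sketch only: `LooseLayout`, `TightLayout` and `SmallState` are made-up types, not JavaThread fields, and the exact sizes depend on the ABI.

```cpp
#include <cstddef>
#include <cstdint>

// Fields grouped "by relatedness": each bool is followed by a member with
// stricter alignment, so the compiler has to insert padding after it.
struct LooseLayout {
  bool         flag_a;
  void*        ptr;
  bool         flag_b;
  std::int32_t counter;
};

// Same members, declared from widest to narrowest. Only tail padding
// remains on typical ABIs.
struct TightLayout {
  void*        ptr;
  std::int32_t counter;
  bool         flag_a;
  bool         flag_b;
};

// An enum with an explicit one-byte underlying type, so its storage does
// not depend on what the compiler picks for an unscoped enum.
enum class SmallState : std::uint8_t { Idle, Running, Done };

// Holds on common ABIs (typically 16 vs 24 bytes on LP64); this is an
// assumption about the platform, not a guarantee of the C++ standard.
static_assert(sizeof(TightLayout) <= sizeof(LooseLayout),
              "reordering should not make the struct larger");
static_assert(sizeof(SmallState) == 1, "explicit underlying type is one byte");
```

Reordering like this trades declaration locality for size, which is the tension mentioned in the quoted message; moving related fields into a separate container type, as suggested earlier in the thread, allows the layout to be tuned inside that container instead.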
------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From never at openjdk.java.net Wed Mar 24 19:22:40 2021 From: never at openjdk.java.net (Tom Rodriguez) Date: Wed, 24 Mar 2021 19:22:40 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 18:43:48 GMT, Coleen Phillimore wrote: >> Wouldn't using OopStorage require an extra level of indirection for the field? > > It was https://bugs.openjdk.java.net/browse/JDK-8244997 - you were co-author :) > > Maybe this jvmci_reserved_oop0 won't crash for the same reasons. I don't know that. > > I still would like to not see 45 lines of declarations for JVMCI added to JavaThread. These should be in a separate header file and declared, as in https://bugs.openjdk.java.net/browse/JDK-8137018. If you promise to fix 8137018, I'm fine with this change. co-author is a strong word. :) But that does ring a bell. Why was threadObj problematic but the other existing oop fields were not? JavaThread::oops_do_no_frames visits a lot of roots that aren't OopStorage. I can tackle JDK-8137018. I think we'll need to add an alias mechanism to vmStructs_jvmci.cpp to maintain backward compatibility but that's fairly straightforward. ------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From coleenp at openjdk.java.net Wed Mar 24 19:27:39 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 19:27:39 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 19:02:26 GMT, Ioi Lam wrote: >> find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. >> >> The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. 
>> >> check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. >> >> Also: is_shared_class_visible{_impl} >> >> Tested with tier1 on 4 Oracle platforms (in progress) > > src/hotspot/share/classfile/systemDictionary.cpp line 1842: > >> 1840: klass = Universe::typeArrayKlassObj(t); >> 1841: } else { >> 1842: MutexLocker mu(SystemDictionary_lock); > > Since this is a clean up RFE, I think it's better to avoid changes that may impact performance. I would avoid adding calls to Thread::current() -- except for cases inside logging code. Maybe change TRAPS to Thread* current and move it to the first parameter? I.e., how you changed SystemDictionaryShared::check_linking_constraints(). This will not impact performance though, since we're already taking a lock. ------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From never at openjdk.java.net Wed Mar 24 19:35:41 2021 From: never at openjdk.java.net (Tom Rodriguez) Date: Wed, 24 Mar 2021 19:35:41 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 19:17:19 GMT, Coleen Phillimore wrote: >> I have a fix for the Mutex in HandshakeState. Here's a bug for you to fill in. That would be great. >> https://bugs.openjdk.java.net/browse/JDK-8264145 > > One of the things we've talked about is making the hierarchy be something like: > Thread > NamedThread > -> WatcherThread > SafepointAwareThread > -> CompilerThread > -> JavaThread > > So that CompilerThread isn't a JavaThread. But we're not going to do that very soon. Regarding loom, it seems like there would have to be a major migration of lots of state into java.lang.Thread. Or some more explicit backup and restores of various chunks of JavaThread during migration. 
I know from previous discussions that things like the deferred_locals are problematic but JFR and any TLAB statistics/allocation tracking also seems problematic. Is there some systematic solution to the general problem? If these fields needed to live in Thread, then we could inject these fields into java.lang.Thread when JVMCI is enabled. The access wouldn't have to be hugely different. Is a resolution to the loom issue something that can be deferred? ------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From coleenp at openjdk.java.net Wed Mar 24 19:35:44 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 19:35:44 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 18:52:56 GMT, Ioi Lam wrote: >> find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. >> >> The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. >> >> check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. >> >> Also: is_shared_class_visible{_impl} >> >> Tested with tier1 on 4 Oracle platforms (in progress) > > src/hotspot/share/ci/ciEnv.cpp line 428: > >> 426: } >> 427: >> 428: Handle loader; > > I think it's better to get rid of the EXCEPTION_CONTEXT above, and add > > Thread* current = Thread::current(); > > Also, the extensive use of EXCEPTION_CONTEXT in the JVMCI code should be reviewed. I think they probably need to be either removed or changed to EXCEPTION_MARK. Ok, that works for ciEnv.cpp. I'd rather not change the JVMCI code any more and leave it up to the maintainers of that code. It's just a cleanliness issue not correctness. 
> src/hotspot/share/classfile/systemDictionary.cpp line 1234: > >> 1232: Symbol* class_name = ik->name(); >> 1233: >> 1234: bool visible = is_shared_class_visible(class_name, ik, pkg_entry, class_loader); > > No longer need the local `visible`. Such locals were needed because we couldn't do > > if (foobar(a, b, c, CHECK_NULL)) { > return NULL; > } ok. > src/hotspot/share/classfile/systemDictionaryShared.cpp line 1864: > >> 1862: } >> 1863: >> 1864: if (Thread::current()->is_VM_thread()) { > > For performance, maybe it's better: > if (DynamicDumpSharedSpaces) { > if (Thread::current()->is_VM_thread()) { > return; > } > } else { > assert(!Thread::current()->is_VM_thread(), "....."); > } So we always dump dynamic shared spaces from the VMThread? ------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From coleenp at openjdk.java.net Wed Mar 24 19:51:00 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 19:51:00 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v2] In-Reply-To: References: Message-ID: <8bqTFtd4HUjMmF4RwVDoJH973LxAfJOnnXI_JshWpHo=.a36a6c50-eedd-438e-b365-d15637ad6742@github.com> > find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. > > The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. > > check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. > > Also: is_shared_class_visible{_impl} > > Tested with tier1 on 4 Oracle platforms (in progress) Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Improvements suggested by Ioi. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3176/files - new: https://git.openjdk.java.net/jdk/pull/3176/files/82b984e3..1e9e29cf Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3176&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3176&range=00-01 Stats: 19 lines in 3 files changed: 3 ins; 1 del; 15 mod Patch: https://git.openjdk.java.net/jdk/pull/3176.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3176/head:pull/3176 PR: https://git.openjdk.java.net/jdk/pull/3176 From sjohanss at openjdk.java.net Wed Mar 24 20:29:11 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Wed, 24 Mar 2021 20:29:11 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v3] In-Reply-To: References: Message-ID: > Please review this refactoring of the hugetlbfs reservation code. > > **Summary** > In recent adventures in this area of the code I noticed a strange condition in `reserve_memory_special_huge_tlbfs` where we take the "mixed-mapping" route even if the size doesn't require any small pages to be used: > if (is_aligned(bytes, os::large_page_size()) && alignment <= os::large_page_size()) { > return reserve_memory_special_huge_tlbfs_only(bytes, req_addr, exec); > } else { > return reserve_memory_special_huge_tlbfs_mixed(bytes, alignment, req_addr, exec); > } > > The second condition here is needed because if the alignment is larger than the large page size, we needed to enforce this and can't just trust `mmap` to give us a properly aligned address. Doing this by using the mixed-function feels a bit weird and looking a bit more at this I found a way to refactor this function to avoid having the two helpers. > > Instead of only having the mixed path honor the passed down alignment, make sure that is always done. This will also have the side-effect that all large pages in a "mixed"-mapping will be at the start and then we will have a tail of small pages. 
This actually also ensures that we will use large pages for a mixed mapping, in the past there was a corner case where we could end up with just a head and tail of small pages and no large page in between (if the mapping was smaller than 2 large pages and there was no alignment constraint). > > **Testing** > Mach5 tier1-3 and a lot of local testing with different large page configurations. Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: Marcus review. Updated comments. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3073/files - new: https://git.openjdk.java.net/jdk/pull/3073/files/38a13144..2bf1d5a5 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3073&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3073&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/3073.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3073/head:pull/3073 PR: https://git.openjdk.java.net/jdk/pull/3073 From sjohanss at openjdk.java.net Wed Mar 24 20:29:11 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Wed, 24 Mar 2021 20:29:11 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v2] In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 13:15:04 GMT, Ivan Walulya wrote: >> Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: >> >> Ivan review >> >> Renamed helper to commit_memory_special and updated the comments. > > lgtm! > Suggestions are mostly nits, if the assumption in my comment about un-mapping is correct (I suspect so based on your conversation with Ivan (@walulyai)). Looks good otherwise from my perspective. Thanks for reviewing Marcus, updated the comments per your suggestion and also changed one to better fit the flow. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3073 From sjohanss at openjdk.java.net Wed Mar 24 20:29:13 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Wed, 24 Mar 2021 20:29:13 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v2] In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 16:55:35 GMT, Marcus G K Williams wrote: >> Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: >> >> Ivan review >> >> Renamed helper to commit_memory_special and updated the comments. > > src/hotspot/os/linux/os_linux.cpp line 3987: > >> 3985: } >> 3986: >> 3987: // Start of by committing large pages. > > Small Typo ` // Start of` should be ` // Start off` or ` // Start by` Good catch, changed this to `// First commit using large pages.` since I've already used "start off" above. > src/hotspot/os/linux/os_linux.cpp line 4003: > >> 4001: // Failed to commit large pages, so we need to unmap the >> 4002: // reminder of the orinal reservation. >> 4003: ::munmap(small_start, small_size); > > I'm assuming that if mmap fails for large pages, it un-maps the reservation area requested for large pages and thus here we only need to munmap for remaining reservation (small pages)? Yes, if `mmap()` fails we lose the reservation for that range and need to start over only using small pages. 
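For readers following the thread: the layout described in the PR summary (large pages at the start of the mapping, small pages as a tail) comes down to a size split. Below is a minimal sketch of that arithmetic only. The names `MappingSplit` and `split_mapping` are hypothetical and do not appear in os_linux.cpp, and the actual reservation and commit calls are deliberately left out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration only; these names are not from os_linux.cpp.
struct MappingSplit {
  std::size_t large_bytes;  // head of the mapping, backed by large pages
  std::size_t small_bytes;  // tail of the mapping, backed by small pages
};

// Split a reservation of `bytes` into a large-page head and a small-page tail.
// Assumes large_page_size is a non-zero power of two.
inline MappingSplit split_mapping(std::size_t bytes, std::size_t large_page_size) {
  assert(large_page_size != 0 && (large_page_size & (large_page_size - 1)) == 0);
  std::size_t large_bytes = bytes & ~(large_page_size - 1);  // align down
  return MappingSplit{large_bytes, bytes - large_bytes};
}
```

With 2M large pages, a 5M request would split into a 4M head and a 1M tail under this sketch. The tail is the `small_start`/`small_size` range that the quoted `munmap` call releases when committing the head with large pages fails.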
------------- PR: https://git.openjdk.java.net/jdk/pull/3073 From iklam at openjdk.java.net Wed Mar 24 20:38:40 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 24 Mar 2021 20:38:40 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v2] In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 19:32:54 GMT, Coleen Phillimore wrote: >> src/hotspot/share/classfile/systemDictionaryShared.cpp line 1864: >> >>> 1862: } >>> 1863: >>> 1864: if (Thread::current()->is_VM_thread()) { >> >> For performance, maybe it's better: >> if (DynamicDumpSharedSpaces) { >> if (Thread::current()->is_VM_thread()) { >> return; >> } >> } else { >> assert(!Thread::current()->is_VM_thread(), "....."); >> } > > So we always dump dynamic shared spaces from the VMThread? When DynamicDumpSharedSpaces is true, this code can be executed in a Java thread or a VM thread. When it's in the VM thread, we cannot proceed to the code below and must return immediately. When DynamicDumpSharedSpaces is false, this code cannot be executed in a VM thread. ------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From github.com+168222+mgkwill at openjdk.java.net Wed Mar 24 20:43:41 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Wed, 24 Mar 2021 20:43:41 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v3] In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 20:29:11 GMT, Stefan Johansson wrote: >> Please review this refactoring of the hugetlbfs reservation code. 
>> >> **Summary** >> In recent adventures in this area of the code I noticed a strange condition in `reserve_memory_special_huge_tlbfs` where we take the "mixed-mapping" route even if the size doesn't require any small pages to be used: >> if (is_aligned(bytes, os::large_page_size()) && alignment <= os::large_page_size()) { >> return reserve_memory_special_huge_tlbfs_only(bytes, req_addr, exec); >> } else { >> return reserve_memory_special_huge_tlbfs_mixed(bytes, alignment, req_addr, exec); >> } >> >> The second condition here is needed because if the alignment is larger than the large page size, we needed to enforce this and can't just trust `mmap` to give us a properly aligned address. Doing this by using the mixed-function feels a bit weird and looking a bit more at this I found a way to refactor this function to avoid having the two helpers. >> >> Instead of only having the mixed path honor the passed down alignment, make sure that is always done. This will also have the side-effect that all large pages in a "mixed"-mapping will be at the start and then we will have a tail of small pages. This actually also ensures that we will use large pages for a mixed mapping, in the past there was a corner case where we could end up with just a head and tail of small pages and no large page in between (if the mapping was smaller than 2 large pages and there was no alignment constraint). >> >> **Testing** >> Mach5 tier1-3 and a lot of local testing with different large page configurations. > > Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: > > Marcus review. > > Updated comments. +1 ------------- Marked as reviewed by mgkwill at github.com (no known OpenJDK username). 
PR: https://git.openjdk.java.net/jdk/pull/3073 From ccheung at openjdk.java.net Wed Mar 24 21:20:41 2021 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Wed, 24 Mar 2021 21:20:41 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v6] In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 15:35:55 GMT, Yumin Qi wrote: >> Hi, Please review >> >> Added jcmd option for dumping CDS archive during application runtime. Before this change, user has to dump shared archive in two steps: first run application with >> `java -XX:DumpLoadedClassList= .... ` >> to collect shareable class names and saved in file `` , then >> `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` >> With this change, user can use jcmd to dump CDS without going through above steps. Also user can choose a moment during the app runtime to dump an archive. >> The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. >> New added jcmd option: >> `jcmd VM.cds static_dump ` >> or >> `jcmd VM.cds dynamic_dump ` >> To dump dynamic archive, requires start app with newly added flag `-XX:+RecordDynamicDumpInfo`, with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changed some object memory locations so for dumping dynamic archive, can only done once for a running app. For static dump, user can dump multiple times against same process. >> The file name is optional, if the file name is not supplied, the file name will take format of `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. The `` is the application process ID. 
>> >> Tests: tier1,tier2,tier3,tier4 >> >> Thanks >> Yumin > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Remove redundant check for if a class is shareable src/hotspot/share/oops/instanceKlass.cpp line 4269: > 4267: } > 4268: > 4269: if (class_loader == NULL && ClassLoader::contains_append_entry(stream->source())) { Since the above has been removed, I think the `ClassLoader::contains_append_entry()` function can be removed too. test/hotspot/jtreg/runtime/cds/appcds/jcmd/JCmdTest.java line 59: > 57: public class JCmdTest { > 58: static final String TEST_CLASS[] = {"LingeredTestApp", "jdk/test/lib/apps/LingeredApp"}; > 59: static final String TEST_JAR = "test.jar"; `TEST_JAR` is unused. test/hotspot/jtreg/runtime/cds/appcds/jcmd/JCmdTest.java line 168: > 166: "-XX:ArchiveClassesAtExit=tmp.jsa", > 167: "-Xshare:auto", > 168: "-Xshare:on"}; The `excludeFlags` doesn't match the ones in CDS.java. 226 private static String[] excludeFlags = { 227 "-XX:DumpLoadedClassList=", 228 "-XX:+DumpSharedSpaces", 229 "-XX:+DynamicDumpSharedSpaces", 230 "-XX:+RecordDynamicDumpInfo", 231 "-Xshare:", 232 "-XX:SharedClassListFile=", 233 "-XX:SharedArchiveFile=", 234 "-XX:ArchiveClassesAtExit=", 235 "-XX:+UseSharedSpaces", 236 "-XX:+RequireSharedSpaces"}; ------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From ccheung at openjdk.java.net Wed Mar 24 21:25:42 2021 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Wed, 24 Mar 2021 21:25:42 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v6] In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 15:35:55 GMT, Yumin Qi wrote: >> Hi, Please review >> >> Added jcmd option for dumping CDS archive during application runtime. Before this change, user has to dump shared archive in two steps: first run application with >> `java -XX:DumpLoadedClassList= .... 
` >> to collect shareable class names and saved in file `` , then >> `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` >> With this change, user can use jcmd to dump CDS without going through above steps. Also user can choose a moment during the app runtime to dump an archive. >> The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. >> New added jcmd option: >> `jcmd VM.cds static_dump ` >> or >> `jcmd VM.cds dynamic_dump ` >> To dump dynamic archive, requires start app with newly added flag `-XX:+RecordDynamicDumpInfo`, with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changed some object memory locations so for dumping dynamic archive, can only done once for a running app. For static dump, user can dump multiple times against same process. >> The file name is optional, if the file name is not supplied, the file name will take format of `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. The `` is the application process ID. >> >> Tests: tier1,tier2,tier3,tier4 >> >> Thanks >> Yumin > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Remove redundant check for if a class is shareable test/hotspot/jtreg/runtime/cds/appcds/jcmd/JCmdTest.java line 184: > 182: test(SUBCMD_STATIC_DUMP, null, pid, EXPECT_PASS); > 183: } > 184: app.stopApp(); For successful dumping cases like the above, you may want to add a runtime case making sure one of the classes in the jar is loaded from the archive. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From kvn at openjdk.java.net Wed Mar 24 21:35:40 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 24 Mar 2021 21:35:40 GMT Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 13:29:38 GMT, David Holmes wrote: >> The compiler folk will need to see if VM_Version::initialize itself has any dependencies on the AOTLoader initialization. Changing the initialization order is always risky. >> >> On a side note please don't modify any copyright line except for Oracle's when modifying files, unless instructed to by the owner of that copyright. >> >> Thanks, >> David > > It may be possible to factor out the necessary logic to the VM_Version::early_initialize() function instead. VM_Version::initialize doesn't have dependencies on AOT initialization. ------------- PR: https://git.openjdk.java.net/jdk/pull/3169 From kvn at openjdk.java.net Wed Mar 24 21:39:57 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Wed, 24 Mar 2021 21:39:57 GMT Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 08:04:47 GMT, Pengfei Li wrote: > Recently we tested OpenJDK on some CPUs with 256-byte dcache line size. > HotSpot AOT tests failed because the shared library compiled with the > same VM options on the same machine are skipped when loaded back. > > Below command sequence shows a simple way to reproduce this issue. > > $ getconf -a | grep LEVEL1_DCACHE_LINESIZE > LEVEL1_DCACHE_LINESIZE 256 > > $ jaotc --output a.so Hello.class > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseAOT -XX:AOTLibrary=./a.so -XX:+PrintAOT Hello > Shared file ./a.so error: ContendedPaddingWidth has different value '256' from current '128' > 4 1 skipped ./a.so aot library > > The default value of VM option `ContendedPaddingWidth` is 128. 
But on CPUs > with L1 dcache line size larger than 128 bytes, the value is adjusted to > the cache line size in `VM_Version_init()`. This adjustment is done after > AOT library loading in `codeCache_init()`. So the AOT lib verifier still > assumes the `ContendedPaddingWidth` in the compiled library should be 128 > and thus causes the loaded library skipped. > > In my proposed fix, `AOTLoader::initialize()` is moved out of the general > codecache initialization and placed after `VM_Version_init()`. The order > of `codeCache_init()` and `VM_Version_init()` is not changed since there may > be code emitted during `VM_Version_init()`, which depends on the general > codecache init. > > Tested `hotspot::hotspot_all_no_apps`, `jdk::jdk_core` and `langtools::tier1`. looks fine from compiler/aot POV. ------------- Marked as reviewed by kvn (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3169 From coleenp at openjdk.java.net Wed Mar 24 22:11:45 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 22:11:45 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: <2M27kSDpdNKnizFBorZ2pbhnqGA2JFuWkxJ6PmJ0lrs=.59e0f142-c2ae-4a81-99d1-5c2df97fb69e@github.com> On Wed, 24 Mar 2021 19:19:28 GMT, Tom Rodriguez wrote: >> It was https://bugs.openjdk.java.net/browse/JDK-8244997 - you were co-author :) >> >> Maybe this jvmci_reserved_oop0 won't crash for the same reasons. I don't know that. >> >> I still would like to not see 45 lines of declarations for JVMCI added to JavaThread. These should be in a separate header file and declared, as in https://bugs.openjdk.java.net/browse/JDK-8137018. If you promise to fix 8137018, I'm fine with this change. > > co-author is a strong word. :) But that does ring a bell. Why was threadObj problematic but the other existing oop fields were not? JavaThread::oops_do_no_frames visits a lot of roots that aren't OopStorage. > I can tackle JDK-8137018. 
I think we'll need to add an alias mechanism to vmStructs_jvmci.cpp to maintain backward compatibility but that's fairly straightforward. well, I didn't even list myself as author of that one. _threadObj was a problem because some code (like thread dump management code) was accessing it from a terminating thread and the barriers were messed up. It was more complicated than that. I don't know if this will be an issue for these declarations, and if you hide them in another place, we'll never know. ------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From coleenp at openjdk.java.net Wed Mar 24 22:15:41 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 24 Mar 2021 22:15:41 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v2] In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 20:35:36 GMT, Ioi Lam wrote: >> So we always dump dynamic shared spaces from the VMThread? > > When DynamicDumpSharedSpaces is true, this code can be executed in a Java thread or a VM thread. When it's in the VM thread, we cannot proceed to the code below and must return immediately. > > When DynamicDumpSharedSpaces is false, this code cannot be executed in a VM thread. Ok, I think I got the logic right then. ------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From never at openjdk.java.net Thu Mar 25 00:20:38 2021 From: never at openjdk.java.net (Tom Rodriguez) Date: Thu, 25 Mar 2021 00:20:38 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: <2M27kSDpdNKnizFBorZ2pbhnqGA2JFuWkxJ6PmJ0lrs=.59e0f142-c2ae-4a81-99d1-5c2df97fb69e@github.com> References: <2M27kSDpdNKnizFBorZ2pbhnqGA2JFuWkxJ6PmJ0lrs=.59e0f142-c2ae-4a81-99d1-5c2df97fb69e@github.com> Message-ID: <1u8BV8JLiWajupAKlU8SY3gPMFZ3yTpTkAlq6FPcVu4=.5df999cd-e7f2-47ed-abcb-fe8a4699fd1b@github.com> On Wed, 24 Mar 2021 22:09:19 GMT, Coleen Phillimore wrote: >> co-author is a strong word. :) But that does ring a bell. 
Why was threadObj problematic but the other existing oop fields were not? JavaThread::oops_do_no_frames visits a lot of roots that aren't OopStorage. >> I can tackle JDK-8137018. I think we'll need to add an alias mechanism to vmStructs_jvmci.cpp to maintain backward compatibility but that's fairly straightforward. > > well, I didn't even list myself as author of that one. _threadObj was a problem because some code (like thread dump management code) was accessing it from a terminating thread and the barriers were messed up. It was more complicated than that. I don't know if this will be an issue for these declarations, and if you hide them in another place, we'll never know. I see. These fields will only be accessed from generated code so I don't think there are the same runtime considerations with them. Obviously it will be our problem to diagnose and fix if issues like that crop up. We just don't want to break any obvious invariants that would require the use of OopStorage. Do you now approve of these changes? ------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From coleenp at openjdk.java.net Thu Mar 25 00:38:43 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 25 Mar 2021 00:38:43 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 06:11:44 GMT, Tom Rodriguez wrote: > 8264016: [JVMCI] add some thread local fields for use by JVMCI Marked as reviewed by coleenp (Reviewer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From coleenp at openjdk.java.net Thu Mar 25 00:38:43 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 25 Mar 2021 00:38:43 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: <1u8BV8JLiWajupAKlU8SY3gPMFZ3yTpTkAlq6FPcVu4=.5df999cd-e7f2-47ed-abcb-fe8a4699fd1b@github.com> References: <2M27kSDpdNKnizFBorZ2pbhnqGA2JFuWkxJ6PmJ0lrs=.59e0f142-c2ae-4a81-99d1-5c2df97fb69e@github.com> <1u8BV8JLiWajupAKlU8SY3gPMFZ3yTpTkAlq6FPcVu4=.5df999cd-e7f2-47ed-abcb-fe8a4699fd1b@github.com> Message-ID: On Thu, 25 Mar 2021 00:14:49 GMT, Tom Rodriguez wrote: >> well, I didn't even list myself as author of that one. _threadObj was a problem because some code (like thread dump management code) was accessing it from a terminating thread and the barriers were messed up. It was more complicated than that. I don't know if this will be an issue for these declarations, and if you hide them in another place, we'll never know. > > I see. These fields will only be accessed from generated code so I don't think there are the same runtime considerations with them. Obviously it will be our problem to diagnose and fix if issues like that crop up. We just don't want to break any obvious invariants that would require the use of OopStorage. Do you now approve of these changes? Ok! 
------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From dholmes at openjdk.java.net Thu Mar 25 02:22:41 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 25 Mar 2021 02:22:41 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v2] In-Reply-To: <8bqTFtd4HUjMmF4RwVDoJH973LxAfJOnnXI_JshWpHo=.a36a6c50-eedd-438e-b365-d15637ad6742@github.com> References: <8bqTFtd4HUjMmF4RwVDoJH973LxAfJOnnXI_JshWpHo=.a36a6c50-eedd-438e-b365-d15637ad6742@github.com> Message-ID: On Wed, 24 Mar 2021 19:51:00 GMT, Coleen Phillimore wrote: >> find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. >> >> The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. >> >> check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. >> >> Also: is_shared_class_visible{_impl} >> >> Tested with tier1 on 4 Oracle platforms (in progress) > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Improvements suggested by Ioi. Hi Coleen, Generally this looks good, but I'm with Ioi here that where we pass through the current thread to avoid needing to manifest Thread::current down the stack (for MutexLockers or ResourceMarks etc) then I would prefer to see that kept - and if necessary relocate the Thread parameter to the beginning. In the past we have consciously added these thread parameters to avoid the Thread::current calls and I prefer not to see that undone. But it is very subjective - I removed one only needed for logging code for example. 
Thanks, David src/hotspot/share/ci/ciEnv.cpp line 445: > 443: { > 444: ttyUnlocker ttyul; // release tty lock to avoid ordering problems > 445: MutexLocker ml(Compile_lock); We could pass current here too. src/hotspot/share/jvmci/jvmciRuntime.cpp line 1268: > 1266: { > 1267: ttyUnlocker ttyul; // release tty lock to avoid ordering problems > 1268: MutexLocker ml(Compile_lock); Can pass THREAD here ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3176 From coleenp at openjdk.java.net Thu Mar 25 02:56:02 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 25 Mar 2021 02:56:02 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v3] In-Reply-To: References: Message-ID: > find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. > > The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. > > check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. > > Also: is_shared_class_visible{_impl} > > Tested with tier1 on 4 Oracle platforms (in progress) Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: add back Thread parameter to find_constrained_instance_or_array_klass. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3176/files - new: https://git.openjdk.java.net/jdk/pull/3176/files/1e9e29cf..ca4d8cee Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3176&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3176&range=01-02 Stats: 9 lines in 4 files changed: 1 ins; 0 del; 8 mod Patch: https://git.openjdk.java.net/jdk/pull/3176.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3176/head:pull/3176 PR: https://git.openjdk.java.net/jdk/pull/3176 From coleenp at openjdk.java.net Thu Mar 25 02:56:03 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 25 Mar 2021 02:56:03 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v2] In-Reply-To: References: <8bqTFtd4HUjMmF4RwVDoJH973LxAfJOnnXI_JshWpHo=.a36a6c50-eedd-438e-b365-d15637ad6742@github.com> Message-ID: On Thu, 25 Mar 2021 02:08:32 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Improvements suggested by Ioi. > > src/hotspot/share/ci/ciEnv.cpp line 445: > >> 443: { >> 444: ttyUnlocker ttyul; // release tty lock to avoid ordering problems >> 445: MutexLocker ml(Compile_lock); > > We could pass current here too. 
ok > src/hotspot/share/jvmci/jvmciRuntime.cpp line 1268: > >> 1266: { >> 1267: ttyUnlocker ttyul; // release tty lock to avoid ordering problems >> 1268: MutexLocker ml(Compile_lock); > > Can pass THREAD here ok ------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From coleenp at openjdk.java.net Thu Mar 25 02:56:04 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 25 Mar 2021 02:56:04 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v3] In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 19:24:54 GMT, Coleen Phillimore wrote: >> src/hotspot/share/classfile/systemDictionary.cpp line 1842: >> >>> 1840: klass = Universe::typeArrayKlassObj(t); >>> 1841: } else { >>> 1842: MutexLocker mu(SystemDictionary_lock); >> >> Since this is a clean up RFE, I think it's better to avoid changes that may impact performance. I would avoid adding calls to Thread::current() -- except for cases inside logging code. Maybe change TRAPS to Thread* current and move it to the first parameter? I.e., how you changed SystemDictionaryShared::check_linking_constraints(). > > This will not impact performance though, since we're already taking a lock. Modern Thread::current() is relatively cheap. It looks a lot nicer without the thread parameter. Ok, I added it back as requested by Ioi and David. It still looks worse and will have no benefit for performance. 
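The two styles debated in this review (let the locker look up the current thread, or pass the thread down from the caller) can be sketched side by side. This is only an illustration under stated assumptions: `Thread`, `Mutex`, `Locker`, `find_without_thread` and `find_with_thread` below are simplified stand-ins written for this note, not the HotSpot classes or functions, and the lookup counter exists only to make the difference observable. It says nothing about the real cost of `Thread::current()`.

```cpp
#include <cassert>

// Hypothetical stand-ins; not the HotSpot Thread, Mutex, or MutexLocker classes.
struct Thread {
  static int lookups;  // counts calls to current(), for illustration only
  static Thread* current() {
    static thread_local Thread self;
    ++lookups;
    return &self;
  }
};
int Thread::lookups = 0;

struct Mutex {
  Thread* owner = nullptr;
};

struct Locker {
  // Variant without a thread argument: performs the lookup itself.
  explicit Locker(Mutex* m) : Locker(Thread::current(), m) {}
  // Variant with a thread argument: uses the thread the caller already has.
  Locker(Thread* t, Mutex* m) : _m(m) { _m->owner = t; }
  ~Locker() { _m->owner = nullptr; }
  Mutex* _m;
};

// Style 1: no thread parameter in the signature.
inline void find_without_thread(Mutex* lock) { Locker ml(lock); }

// Style 2: the current thread is the first parameter and is passed through.
inline void find_with_thread(Thread* current, Mutex* lock) { Locker ml(current, lock); }
```

In this sketch the first style keeps the signature shorter at the price of one lookup per call, while the second avoids the lookup by threading an extra argument through the call chain. That is the trade-off the reviewers weigh above.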
------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From coleenp at openjdk.java.net Thu Mar 25 02:59:40 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 25 Mar 2021 02:59:40 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v2] In-Reply-To: References: <8bqTFtd4HUjMmF4RwVDoJH973LxAfJOnnXI_JshWpHo=.a36a6c50-eedd-438e-b365-d15637ad6742@github.com> Message-ID: <8umW_RoQ4c1FvINNfdUFIkIThu1XMVE0buJ-O_UP0O4=.421caac7-bebe-44b9-a356-ad66227cc89f@github.com> On Thu, 25 Mar 2021 02:20:15 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Improvements suggested by Ioi. > > Hi Coleen, > > Generally this looks good, but I'm with Ioi here that where we pass through the current thread to avoid needing to manifest Thread::current down the stack (for MutexLockers or ResourceMarks etc) then I would prefer to see that kept - and if necessary relocate the Thread parameter to the beginning. In the past we have consciously added these thread parameters to avoid the Thread::current calls and I prefer not to see that undone. But it is very subjective - I removed one only needed for logging code for example. > > Thanks, > David Added back a thread parameter and rerunning tier1 sanity tests. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From iklam at openjdk.java.net Thu Mar 25 03:22:43 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 25 Mar 2021 03:22:43 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v2] In-Reply-To: <8bqTFtd4HUjMmF4RwVDoJH973LxAfJOnnXI_JshWpHo=.a36a6c50-eedd-438e-b365-d15637ad6742@github.com> References: <8bqTFtd4HUjMmF4RwVDoJH973LxAfJOnnXI_JshWpHo=.a36a6c50-eedd-438e-b365-d15637ad6742@github.com> Message-ID: On Wed, 24 Mar 2021 19:51:00 GMT, Coleen Phillimore wrote: >> find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. >> >> The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. >> >> check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. >> >> Also: is_shared_class_visible{_impl} >> >> Tested with tier1 on 4 Oracle platforms (in progress) > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Improvements suggested by Ioi. Marked as reviewed by iklam (Reviewer). src/hotspot/share/classfile/systemDictionaryShared.cpp line 1873: > 1871: } else { > 1872: assert(!Thread::current()->is_VM_thread(), "must not be"); > 1873: Arguments::assert_is_dumping_archive(); The code after the assert should always be executed, not predicated on DynamicDumpSharedSpaces. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From dholmes at openjdk.java.net Thu Mar 25 04:16:46 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 25 Mar 2021 04:16:46 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v3] In-Reply-To: References: Message-ID: <5mDMWZD4KDnFoBozhtfS9hXow27gJ4yMkoD2lWcmSrk=.3bf3d5e9-c482-40b1-a39a-93c428512ce1@github.com> On Thu, 25 Mar 2021 02:56:02 GMT, Coleen Phillimore wrote: >> find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. >> >> The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. >> >> check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. >> >> Also: is_shared_class_visible{_impl} >> >> Tested with tier1 on 4 Oracle platforms (in progress) > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > add back Thread parameter to find_constrained_instance_or_array_klass. Updates look good - thanks But I'm having an issue with the systemDictionaryShared change to record_linking_constraints as well. Thanks, David ------------- Changes requested by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3176 From dholmes at openjdk.java.net Thu Mar 25 04:16:48 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 25 Mar 2021 04:16:48 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v2] In-Reply-To: References: <8bqTFtd4HUjMmF4RwVDoJH973LxAfJOnnXI_JshWpHo=.a36a6c50-eedd-438e-b365-d15637ad6742@github.com> Message-ID: On Wed, 24 Mar 2021 20:37:35 GMT, Ioi Lam wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Improvements suggested by Ioi. > > src/hotspot/share/classfile/systemDictionaryShared.cpp line 1873: > >> 1871: } else { >> 1872: assert(!Thread::current()->is_VM_thread(), "must not be"); >> 1873: Arguments::assert_is_dumping_archive(); > > The code after the assert should always be executed, not predicated on DynamicDumpSharedSpaces. I'm finding the logic change here very hard to follow. Being in the VMThread may imply that DynamicDumpSharedSpaces must be true, but that doesn't necessarily mean that if not in the VMThread then DynamicDumpSharedSpaces must be false. ------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From pli at openjdk.java.net Thu Mar 25 04:24:40 2021 From: pli at openjdk.java.net (Pengfei Li) Date: Thu, 25 Mar 2021 04:24:40 GMT Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line In-Reply-To: References: Message-ID: <85W8mR3a29uVLP1n-WY05s1MRTZCxqGvau8tMvLEl08=.99d4991a-d938-4ef8-b0ca-46b672e4c9e2@github.com> On Wed, 24 Mar 2021 21:37:14 GMT, Vladimir Kozlov wrote: >> Recently we tested OpenJDK on some CPUs with 256-byte dcache line size. >> HotSpot AOT tests failed because the shared library compiled with the >> same VM options on the same machine are skipped when loaded back. >> >> Below command sequence shows a simple way to reproduce this issue. 
>> >> $ getconf -a | grep LEVEL1_DCACHE_LINESIZE >> LEVEL1_DCACHE_LINESIZE 256 >> >> $ jaotc --output a.so Hello.class >> >> $ java -XX:+UnlockExperimentalVMOptions -XX:+UseAOT -XX:AOTLibrary=./a.so -XX:+PrintAOT Hello >> Shared file ./a.so error: ContendedPaddingWidth has different value '256' from current '128' >> 4 1 skipped ./a.so aot library >> >> The default value of VM option `ContendedPaddingWidth` is 128. But on CPUs >> with L1 dcache line size larger than 128 bytes, the value is adjusted to >> the cache line size in `VM_Version_init()`. This adjustment is done after >> AOT library loading in `codeCache_init()`. So the AOT lib verifier still >> assumes the `ContendedPaddingWidth` in the compiled library should be 128 >> and thus causes the loaded library skipped. >> >> In my proposed fix, `AOTLoader::initialize()` is moved out of the general >> codecache initialization and placed after `VM_Version_init()`. The order >> of `codeCache_init()` and `VM_Version_init()` is not changed since there may >> be code emitted during `VM_Version_init()`, which depends on the general >> codecache init. >> >> Tested `hotspot::hotspot_all_no_apps`, `jdk::jdk_core` and `langtools::tier1`. > > looks fine from compiler/aot POV. > I must confess I don't like this solution at all: it sounds very delicate. Couldn't you define a function `VM_Version::get_ContendedPaddingWidth()`and call that? Hi Andrew, I agree that tuning the initialization order is a tricky fix to some extent. But I don't think it's a trivial work to define a function to get the final value of `ContendedPaddingWidth` before `VM_Version::initialize()`. As the value depends on CPU dcache line size, the problem is that the way querying that CPU size info varies significantly on different platforms. In current implementation, we run a few assembly code on AArch64. But on some other architectures, typically ppc and s390, we emit much more code into a code buffer (and thus depends on `CodeCache::initialize()`). 
And on Windows, we need to call some Windows API to retrieve processor info. If we do too much in the newly defined function, it would be no much difference from moving `VM_Version::initialize()` before `AOTLoader::initialize()`. ------------- PR: https://git.openjdk.java.net/jdk/pull/3169 From pli at openjdk.java.net Thu Mar 25 04:34:39 2021 From: pli at openjdk.java.net (Pengfei Li) Date: Thu, 25 Mar 2021 04:34:39 GMT Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line In-Reply-To: <85W8mR3a29uVLP1n-WY05s1MRTZCxqGvau8tMvLEl08=.99d4991a-d938-4ef8-b0ca-46b672e4c9e2@github.com> References: <85W8mR3a29uVLP1n-WY05s1MRTZCxqGvau8tMvLEl08=.99d4991a-d938-4ef8-b0ca-46b672e4c9e2@github.com> Message-ID: On Thu, 25 Mar 2021 04:22:11 GMT, Pengfei Li wrote: >> looks fine from compiler/aot POV. > >> I must confess I don't like this solution at all: it sounds very delicate. Couldn't you define a function `VM_Version::get_ContendedPaddingWidth()`and call that? > > Hi Andrew, I agree that tuning the initialization order is a tricky fix to some extent. But I don't think it's a trivial work to define a function to get the final value of `ContendedPaddingWidth` before `VM_Version::initialize()`. As the value depends on CPU dcache line size, the problem is that the way querying that CPU size info varies significantly on different platforms. In current implementation, we run a few assembly code on AArch64. But on some other architectures, typically ppc and s390, we emit much more code into a code buffer (and thus depends on `CodeCache::initialize()`). And on Windows, we need to call some Windows API to retrieve processor info. If we do too much in the newly defined function, it would be no much difference from moving `VM_Version::initialize()` before `AOTLoader::initialize()`. > It may be possible to factor out the necessary logic to the VM_Version::early_initialize() function instead. Personally I didn't see it's possible. 
`VM_Version::early_initialize()` is called too early. As I mentioned above. On some architectures (typically arm32, ppc and s390), code is emitted into a code buffer to query the CPU dcache line size info. And this depends on `CodeCache::initialize()`. So we still need to place the necessary logic after `CodeCache::initialize()`. Please let me know if you have other suggestions. ------------- PR: https://git.openjdk.java.net/jdk/pull/3169 From kbarrett at openjdk.java.net Thu Mar 25 07:33:55 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 25 Mar 2021 07:33:55 GMT Subject: RFR: 8264166: OopStorage should support specifying MEMFLAGS for allocations Message-ID: Please review this change to OopStorage to allow the MEMFLAGS value for associated allocations to be specified when the storage object is constructed. This allows a subsystem that needs an OopStorage object to associate its allocation with others for that subsystem in NMT tracking and reporting. Testing: mach5 tier1. Manually compared NMT output before and after this change for a test that generated a lot of one particular OopStorage entries. 
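As a rough illustration of the idea in this request (not the actual patch), the sketch below shows a storage object that receives its memory category at construction and charges every later allocation to it. `MemTag`, `Tracker` and `StorageSketch` are hypothetical names invented here; HotSpot's real `MEMFLAGS` values and NMT bookkeeping are different and more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical memory categories; HotSpot's real MEMFLAGS enum differs.
enum class MemTag { Internal, GC, Class };

// Toy stand-in for NMT accounting: bytes attributed per category.
struct Tracker {
  std::map<MemTag, std::size_t> bytes;
  void record(MemTag tag, std::size_t n) { bytes[tag] += n; }
};

// Toy storage whose allocations are all charged to the tag fixed
// when the storage object is constructed.
class StorageSketch {
  std::string _name;
  MemTag _tag;
  Tracker* _tracker;
 public:
  StorageSketch(const char* name, MemTag tag, Tracker* tracker)
      : _name(name), _tag(tag), _tracker(tracker) {}
  void allocate_block(std::size_t n) { _tracker->record(_tag, n); }
  MemTag tag() const { return _tag; }
};
```

With the tag fixed per storage object, two subsystems that each own a storage instance show up under their own categories in the tracker instead of being lumped together.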
------------- Commit messages: - add MEMFLAGS support to OopStorage Changes: https://git.openjdk.java.net/jdk/pull/3188/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3188&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264166 Stats: 64 lines in 16 files changed: 30 ins; 0 del; 34 mod Patch: https://git.openjdk.java.net/jdk/pull/3188.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3188/head:pull/3188 PR: https://git.openjdk.java.net/jdk/pull/3188 From david.holmes at oracle.com Thu Mar 25 07:41:22 2021 From: david.holmes at oracle.com (David Holmes) Date: Thu, 25 Mar 2021 17:41:22 +1000 Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line In-Reply-To: References: <85W8mR3a29uVLP1n-WY05s1MRTZCxqGvau8tMvLEl08=.99d4991a-d938-4ef8-b0ca-46b672e4c9e2@github.com> Message-ID: On 25/03/2021 2:34 pm, Pengfei Li wrote: > On Thu, 25 Mar 2021 04:22:11 GMT, Pengfei Li wrote: > >>> looks fine from compiler/aot POV. >> >>> I must confess I don't like this solution at all: it sounds very delicate. Couldn't you define a function `VM_Version::get_ContendedPaddingWidth()`and call that? >> >> Hi Andrew, I agree that tuning the initialization order is a tricky fix to some extent. But I don't think it's a trivial work to define a function to get the final value of `ContendedPaddingWidth` before `VM_Version::initialize()`. As the value depends on CPU dcache line size, the problem is that the way querying that CPU size info varies significantly on different platforms. In current implementation, we run a few assembly code on AArch64. But on some other architectures, typically ppc and s390, we emit much more code into a code buffer (and thus depends on `CodeCache::initialize()`). And on Windows, we need to call some Windows API to retrieve processor info. If we do too much in the newly defined function, it would be no much difference from moving `VM_Version::initialize()` before `AOTLoader::initialize()`. 
> >> It may be possible to factor out the necessary logic to the VM_Version::early_initialize() function instead. > > Personally I didn't see it's possible. `VM_Version::early_initialize()` is called too early. As I mentioned above. On some architectures (typically arm32, ppc and s390), code is emitted into a code buffer to query the CPU dcache line size info. And this depends on `CodeCache::initialize()`. So we still need to place the necessary logic after `CodeCache::initialize()`. > > Please let me know if you have other suggestions. No it was just something to investigate to see if it was feasible. Vladimir has indicated there is no issue with changing the order as you have done so that is fine as far as I am concerned. Cheers, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3169 > From dholmes at openjdk.java.net Thu Mar 25 07:45:42 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 25 Mar 2021 07:45:42 GMT Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line In-Reply-To: References: Message-ID: <3-v1OHDwR3s9YqkEDvycRjVnZefpIhpXyExvOhSRkaQ=.4cd1a348-e6b3-47ac-b5e3-52708bc587e5@github.com> On Wed, 24 Mar 2021 08:04:47 GMT, Pengfei Li wrote: > Recently we tested OpenJDK on some CPUs with 256-byte dcache line size. > HotSpot AOT tests failed because the shared library compiled with the > same VM options on the same machine are skipped when loaded back. > > Below command sequence shows a simple way to reproduce this issue. > > $ getconf -a | grep LEVEL1_DCACHE_LINESIZE > LEVEL1_DCACHE_LINESIZE 256 > > $ jaotc --output a.so Hello.class > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseAOT -XX:AOTLibrary=./a.so -XX:+PrintAOT Hello > Shared file ./a.so error: ContendedPaddingWidth has different value '256' from current '128' > 4 1 skipped ./a.so aot library > > The default value of VM option `ContendedPaddingWidth` is 128. 
But on CPUs > with L1 dcache line size larger than 128 bytes, the value is adjusted to > the cache line size in `VM_Version_init()`. This adjustment is done after > AOT library loading in `codeCache_init()`. So the AOT lib verifier still > assumes the `ContendedPaddingWidth` in the compiled library should be 128 > and thus causes the loaded library skipped. > > In my proposed fix, `AOTLoader::initialize()` is moved out of the general > codecache initialization and placed after `VM_Version_init()`. The order > of `codeCache_init()` and `VM_Version_init()` is not changed since there may > be code emitted during `VM_Version_init()`, which depends on the general > codecache init. > > Tested `hotspot::hotspot_all_no_apps`, `jdk::jdk_core` and `langtools::tier1`. Marked as reviewed by dholmes (Reviewer). src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 3: > 1: /* > 2: * Copyright (c) 1997, 2021, Oracle and/or its affiliates. All rights reserved. > 3: * Copyright (c) 2015, 2021, Red Hat Inc. All rights reserved. Please restore this copyright line unless instructed to update it by Red Hat. ------------- PR: https://git.openjdk.java.net/jdk/pull/3169 From sjohanss at openjdk.java.net Thu Mar 25 08:20:05 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Thu, 25 Mar 2021 08:20:05 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v4] In-Reply-To: References: Message-ID: > Please review this refactoring of the hugetlbfs reservation code. 
> > **Summary** > In recent adventures in this area of the code I noticed a strange condition in `reserve_memory_special_huge_tlbfs` where we take the "mixed-mapping" route even if the size doesn't require any small pages to be used: > if (is_aligned(bytes, os::large_page_size()) && alignment <= os::large_page_size()) { > return reserve_memory_special_huge_tlbfs_only(bytes, req_addr, exec); > } else { > return reserve_memory_special_huge_tlbfs_mixed(bytes, alignment, req_addr, exec); > } > > The second condition here is needed because if the alignment is larger than the large page size, we needed to enforce this and can't just trust `mmap` to give us a properly aligned address. Doing this by using the mixed-function feels a bit weird and looking a bit more at this I found a way to refactor this function to avoid having the two helpers. > > Instead of only having the mixed path honor the passed down alignment, make sure that is always done. This will also have the side-effect that all large pages in a "mixed"-mapping will be at the start and then we will have a tail of small pages. This actually also ensures that we will use large pages for a mixed mapping, in the past there was a corner case where we could end up with just a head and tail of small pages and no large page in between (if the mapping was smaller than 2 large pages and there was no alignment constraint). > > **Testing** > Mach5 tier1-3 and a lot of local testing with different large page configurations. Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: Self review. Update helper name to better match commit_memory_special(). 
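The layout described in the summary (large pages at the start, followed by a tail of small pages) can be sketched as simple arithmetic. This is only an illustration under stated assumptions: `split_reservation`, `MappingSplit` and `align_down_sketch` are hypothetical helpers, not functions from the patch, and the large page size is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper type, not a HotSpot structure: how a reservation of
// `bytes` splits into a large-page head and a small-page tail.
struct MappingSplit {
  std::size_t large_bytes;
  std::size_t small_bytes;
};

// Rounds value down to a multiple of alignment (power of two assumed).
inline std::size_t align_down_sketch(std::size_t value, std::size_t alignment) {
  return value & ~(alignment - 1);
}

// Head: as many whole large pages as fit. Tail: the remainder, which
// would be backed by small pages.
inline MappingSplit split_reservation(std::size_t bytes,
                                      std::size_t large_page_size) {
  std::size_t large = align_down_sketch(bytes, large_page_size);
  return MappingSplit{large, bytes - large};
}
```

For example, with 2 MB large pages a 3 MB request splits into a 2 MB large-page head and a 1 MB small-page tail, so a mapping smaller than two large pages still gets one large page in this model.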
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3073/files - new: https://git.openjdk.java.net/jdk/pull/3073/files/2bf1d5a5..f70ca6a3 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3073&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3073&range=02-03 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/3073.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3073/head:pull/3073 PR: https://git.openjdk.java.net/jdk/pull/3073 From tschatzl at openjdk.java.net Thu Mar 25 08:23:39 2021 From: tschatzl at openjdk.java.net (Thomas Schatzl) Date: Thu, 25 Mar 2021 08:23:39 GMT Subject: RFR: 8264166: OopStorage should support specifying MEMFLAGS for allocations In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 07:27:58 GMT, Kim Barrett wrote: > Please review this change to OopStorage to allow the MEMFLAGS value for associated allocations to be specified when the storage object is constructed. This allows a subsystem that needs an OopStorage object to associate its allocation with others for that subsystem in NMT tracking and reporting. > > Testing: > mach5 tier1. > Manually compared NMT output before and after this change for a test that generated a lot of one particular OopStorage entries. Lgtm. ------------- Marked as reviewed by tschatzl (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3188 From stefank at openjdk.java.net Thu Mar 25 08:44:44 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Thu, 25 Mar 2021 08:44:44 GMT Subject: RFR: 8264166: OopStorage should support specifying MEMFLAGS for allocations In-Reply-To: References: Message-ID: <4UnYhJAxCldsUhFA9awgPMJyvm9jHFZZRW6pVTYYOc0=.f7b45db1-fe2c-492d-a999-e1a6f529b66e@github.com> On Thu, 25 Mar 2021 07:27:58 GMT, Kim Barrett wrote: > Please review this change to OopStorage to allow the MEMFLAGS value for associated allocations to be specified when the storage object is constructed. 
This allows a subsystem that needs an OopStorage object to associate its allocation with others for that subsystem in NMT tracking and reporting. > > Testing: > mach5 tier1. > Manually compared NMT output before and after this change for a test that generated a lot of one particular OopStorage entries. Looks good. Just one nit src/hotspot/share/gc/shared/oopStorage.inline.hpp line 28: > 26: #define SHARE_GC_SHARED_OOPSTORAGE_INLINE_HPP > 27: > 28: #include "memory/allocation.hpp" Sort order ------------- Marked as reviewed by stefank (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3188 From pli at openjdk.java.net Thu Mar 25 09:06:07 2021 From: pli at openjdk.java.net (Pengfei Li) Date: Thu, 25 Mar 2021 09:06:07 GMT Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line [v2] In-Reply-To: References: Message-ID: > Recently we tested OpenJDK on some CPUs with 256-byte dcache line size. > HotSpot AOT tests failed because the shared library compiled with the > same VM options on the same machine are skipped when loaded back. > > Below command sequence shows a simple way to reproduce this issue. > > $ getconf -a | grep LEVEL1_DCACHE_LINESIZE > LEVEL1_DCACHE_LINESIZE 256 > > $ jaotc --output a.so Hello.class > > $ java -XX:+UnlockExperimentalVMOptions -XX:+UseAOT -XX:AOTLibrary=./a.so -XX:+PrintAOT Hello > Shared file ./a.so error: ContendedPaddingWidth has different value '256' from current '128' > 4 1 skipped ./a.so aot library > > The default value of VM option `ContendedPaddingWidth` is 128. But on CPUs > with L1 dcache line size larger than 128 bytes, the value is adjusted to > the cache line size in `VM_Version_init()`. This adjustment is done after > AOT library loading in `codeCache_init()`. So the AOT lib verifier still > assumes the `ContendedPaddingWidth` in the compiled library should be 128 > and thus causes the loaded library skipped. 
> > In my proposed fix, `AOTLoader::initialize()` is moved out of the general > codecache initialization and placed after `VM_Version_init()`. The order > of `codeCache_init()` and `VM_Version_init()` is not changed since there may > be code emitted during `VM_Version_init()`, which depends on the general > codecache init. > > Tested `hotspot::hotspot_all_no_apps`, `jdk::jdk_core` and `langtools::tier1`. Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: Restore Red Hat copyright line ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3169/files - new: https://git.openjdk.java.net/jdk/pull/3169/files/17b36ef6..012565fe Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3169&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3169&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3169.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3169/head:pull/3169 PR: https://git.openjdk.java.net/jdk/pull/3169 From pli at openjdk.java.net Thu Mar 25 09:06:09 2021 From: pli at openjdk.java.net (Pengfei Li) Date: Thu, 25 Mar 2021 09:06:09 GMT Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line [v2] In-Reply-To: <3-v1OHDwR3s9YqkEDvycRjVnZefpIhpXyExvOhSRkaQ=.4cd1a348-e6b3-47ac-b5e3-52708bc587e5@github.com> References: <3-v1OHDwR3s9YqkEDvycRjVnZefpIhpXyExvOhSRkaQ=.4cd1a348-e6b3-47ac-b5e3-52708bc587e5@github.com> Message-ID: On Thu, 25 Mar 2021 07:42:08 GMT, David Holmes wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Restore Red Hat copyright line > > src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 3: > >> 1: /* >> 2: * Copyright (c) 1997, 2021, Oracle and/or its affiliates. All rights reserved. >> 3: * Copyright (c) 2015, 2021, Red Hat Inc. All rights reserved. 
> > Please restore this copyright line unless instructed to update it by Red Hat. Reverted, thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/3169 From kbarrett at openjdk.java.net Thu Mar 25 10:56:40 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Thu, 25 Mar 2021 10:56:40 GMT Subject: RFR: 8264166: OopStorage should support specifying MEMFLAGS for allocations In-Reply-To: <4UnYhJAxCldsUhFA9awgPMJyvm9jHFZZRW6pVTYYOc0=.f7b45db1-fe2c-492d-a999-e1a6f529b66e@github.com> References: <4UnYhJAxCldsUhFA9awgPMJyvm9jHFZZRW6pVTYYOc0=.f7b45db1-fe2c-492d-a999-e1a6f529b66e@github.com> Message-ID: On Thu, 25 Mar 2021 08:42:06 GMT, Stefan Karlsson wrote: >> Please review this change to OopStorage to allow the MEMFLAGS value for associated allocations to be specified when the storage object is constructed. This allows a subsystem that needs an OopStorage object to associate its allocation with others for that subsystem in NMT tracking and reporting. >> >> Testing: >> mach5 tier1. >> Manually compared NMT output before and after this change for a test that generated a lot of one particular OopStorage entries. > > src/hotspot/share/gc/shared/oopStorage.inline.hpp line 28: > >> 26: #define SHARE_GC_SHARED_OOPSTORAGE_INLINE_HPP >> 27: >> 28: #include "memory/allocation.hpp" > > Sort order Oops, will fix. ------------- PR: https://git.openjdk.java.net/jdk/pull/3188 From ysuenaga at openjdk.java.net Thu Mar 25 12:45:40 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Thu, 25 Mar 2021 12:45:40 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: Message-ID: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com> On Mon, 22 Mar 2021 22:12:14 GMT, Xin Liu wrote: > This patch provides a buffer to store asynchrounous messages and flush them to > underlying files periodically. I think this PR is very useful for us! 
> May we know more about LogMessageBuffer.hpp/cpp? We haven't found a real use of it. That's why we are hesitating to support LogFileOutput::write(LogMessageBuffer::Iterator msg_iterator). Further, we haven't supported async_mode for LogStdoutOutput and LogStderrOutput either. It's not difficult but needs a big code change.

`LogMessageBuffer` is used in `LogMessage`. For example, we can see it in the following backtrace. Frame #1 is `LogFileOutput::write(LogMessageBuffer::Iterator msg_iterator)`. IMHO we do not need to change LogStdout/errOutput, but it is better to change LogMessageBuffer.

#0 LogFileStreamOutput::write (this=this@entry=0x7ffff0002af0, msg_iterator=...) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logDecorators.hpp:108
#1 0x00007ffff6e80e8e in LogFileOutput::write (this=this@entry=0x7ffff0002af0, msg_iterator=...) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logFileOutput.cpp:314
#2 0x00007ffff6e876eb in LogTagSet::log ( this=this@entry=0x7ffff7d4a640 ::_tagset>, msg=...) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logTagSet.cpp:85
#3 0x00007ffff6a194df in LogImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::write ( msg=...) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logTagSet.hpp:150
#4 LogMessageImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::flush ( this=0x7ffff58675d0) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logMessage.hpp:79
#5 LogMessageImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::~LogMessageImpl ( this=0x7ffff58675d0, __in_chrg=) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logMessage.hpp:74
#6 InstanceKlass::print_class_load_logging (this=this@entry=0x800007430, loader_data=loader_data@entry=0x7ffff00f5200, module_entry=module_entry@entry=0x0, cfs=cfs@entry=0x0) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/oops/instanceKlass.cpp:3647

src/hotspot/share/runtime/globals.hpp line 2033: > 2031: "Milliseconds between asynchronous log flushing") \ > 2032: \ > 2033: product(bool, AsyncLogging, false, \
I think this option is not needed. `async` should be set to `false` by default, and we should control it through the `-Xlog` option like other log output options (e.g. `filecount`).

src/hotspot/share/runtime/globals.hpp line 2036: > 2034: "Enble asynchronous GC logging") \ > 2035: \ > 2036: product(size_t, GCLogBufferSize, 2*K, \
This PR is for UL, not only the GC log, so it should be renamed.

------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From coleenp at openjdk.java.net Thu Mar 25 13:06:00 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 25 Mar 2021 13:06:00 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v4] In-Reply-To: References: Message-ID: > find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance.
> > The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. > > check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. > > Also: is_shared_class_visible{_impl} > > Tested with tier1 on 4 Oracle platforms (in progress) Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: fix broken logic in systemDictionaryShared. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3176/files - new: https://git.openjdk.java.net/jdk/pull/3176/files/ca4d8cee..76892893 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3176&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3176&range=02-03 Stats: 14 lines in 1 file changed: 0 ins; 2 del; 12 mod Patch: https://git.openjdk.java.net/jdk/pull/3176.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3176/head:pull/3176 PR: https://git.openjdk.java.net/jdk/pull/3176 From coleenp at openjdk.java.net Thu Mar 25 13:06:00 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Thu, 25 Mar 2021 13:06:00 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v2] In-Reply-To: References: <8bqTFtd4HUjMmF4RwVDoJH973LxAfJOnnXI_JshWpHo=.a36a6c50-eedd-438e-b365-d15637ad6742@github.com> Message-ID: On Thu, 25 Mar 2021 04:11:36 GMT, David Holmes wrote: >> src/hotspot/share/classfile/systemDictionaryShared.cpp line 1873: >> >>> 1871: } else { >>> 1872: assert(!Thread::current()->is_VM_thread(), "must not be"); >>> 1873: Arguments::assert_is_dumping_archive(); >> >> The code after the assert should always be executed, not predicated on DynamicDumpSharedSpaces. > > I'm finding the logic change here very hard to follow. 
Being in the VMThread may imply that DynamicDumpSharedSpaces must be true, but that doesn't necessarily mean that if not in the VMThread then DynamicDumpSharedSpaces must be false. I got this logic wrong, now fixed. ------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From ccheung at openjdk.java.net Thu Mar 25 13:23:46 2021 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Thu, 25 Mar 2021 13:23:46 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v6] In-Reply-To: References: Message-ID: On Wed, 24 Mar 2021 15:35:55 GMT, Yumin Qi wrote: >> Hi, Please review >> >> Added jcmd option for dumping CDS archive during application runtime. Before this change, user has to dump shared archive in two steps: first run application with >> `java -XX:DumpLoadedClassList= .... ` >> to collect shareable class names and saved in file `` , then >> `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` >> With this change, user can use jcmd to dump CDS without going through above steps. Also user can choose a moment during the app runtime to dump an archive. >> The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. >> New added jcmd option: >> `jcmd VM.cds static_dump ` >> or >> `jcmd VM.cds dynamic_dump ` >> To dump dynamic archive, requires start app with newly added flag `-XX:+RecordDynamicDumpInfo`, with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changed some object memory locations so for dumping dynamic archive, can only done once for a running app. For static dump, user can dump multiple times against same process. >> The file name is optional, if the file name is not supplied, the file name will take format of `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. The `` is the application process ID. 
>> >> Tests: tier1,tier2,tier3,tier4 >> >> Thanks >> Yumin > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Remove redundant check for if a class is shareable Looks good. Few comments below. Also, the vmSymbols.h and diagnosticCommand.hpp need copyright update. Thanks, Calvin ------------- Changes requested by ccheung (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2737 From iklam at openjdk.java.net Thu Mar 25 13:35:23 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Thu, 25 Mar 2021 13:35:23 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v4] In-Reply-To: References: Message-ID: <1fncQEyM1By3oWrXIHGuA8KBJQxPtLpkJmbPyYYvHDA=.49f0fbe4-bf2b-429b-a55d-18036f8e4207@github.com> On Thu, 25 Mar 2021 13:06:00 GMT, Coleen Phillimore wrote: >> find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. >> >> The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. >> >> check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. >> >> Also: is_shared_class_visible{_impl} >> >> Tested with tier1 on 4 Oracle platforms (in progress) > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > fix broken logic in systemDictionaryShared. Marked as reviewed by iklam (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From matthias.baesken at sap.com Thu Mar 25 13:49:11 2021 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Thu, 25 Mar 2021 13:49:11 +0000 Subject: os_windows.cpp : simplify is_thread_cpu_time_supported ? 
Message-ID: Hello, I wonder , should we just return true in os::is_thread_cpu_time_supported() on Windows ? See https://github.com/openjdk/jdk/blob/master/src/hotspot/os/windows/os_windows.cpp#L4588 According to MSDN https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getthreadtimes GetThreadTimes is supported on Win2003/XP and higher . This should be fine for OpenJDK . Best regards, Matthias From chagedorn at openjdk.java.net Thu Mar 25 15:06:42 2021 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Thu, 25 Mar 2021 15:06:42 GMT Subject: RFR: 8263582: WB_IsMethodCompilable ignores compiler directives Message-ID: While playing around with `WB_IsMethodCompilable` together with `compileonly` I ran into some surprising results for methods that should never be compiled (not part of `compileonly`): `isMethodCompilable` returns true instead of false when such an excluded method was not yet tried to be compiled. The reason for it is that `WB_IsMethodCompilable` directly checks `CompilationPolicy::can_be_compiled()` which calls `Method::is_not_compilable()`. However, the `ExcludeOption` compiler directive is only evaluated lazily upon a compilation attempt. Therefore, if a method was not tried to be compiled, yet, `Method::is_not_compilable()` always returns false, regardless of any set compiler directive. I therefore suggest to additionally check the `ExcludeOption` in `WB_IsMethodCompilable`. I also cleaned up some wrong use of `CompLevel_any` and `CompLevel_all` as suggested by @veresov: `CompLevel_any` should only be used to query the state as in `is_*()` methods and `CompLevel_all` when changing the state is in `set_*()` methods. 
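
The lazy evaluation described above can be modelled with a small standalone C++ sketch. This is a toy model under stated assumptions, not HotSpot code: `ToyMethod`, `ToyDirectives`, `try_compile` and the two query functions are hypothetical names that only mimic the roles of `Method::is_not_compilable()`, the `ExcludeOption` directive and `WB_IsMethodCompilable`.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, simplified stand-in for a method with a cached "not compilable" flag.
struct ToyMethod {
  std::string name;
  bool not_compilable = false;  // only ever set during a compilation attempt
};

// Hypothetical stand-in for the set of methods excluded by a compiler directive.
struct ToyDirectives {
  std::set<std::string> excluded;
  bool is_excluded(const ToyMethod& m) const { return excluded.count(m.name) != 0; }
};

// Mimics a compilation attempt: the exclusion is applied lazily, right here.
inline bool try_compile(ToyMethod& m, const ToyDirectives& d) {
  if (d.is_excluded(m)) {
    m.not_compilable = true;  // the flag is updated only now
    return false;
  }
  return true;
}

// Query that looks at the cached flag only (the surprising behaviour).
inline bool is_compilable_flag_only(const ToyMethod& m) {
  return !m.not_compilable;
}

// Query that additionally consults the directive (the suggested behaviour).
inline bool is_compilable_with_directive(const ToyMethod& m, const ToyDirectives& d) {
  return !m.not_compilable && !d.is_excluded(m);
}
```

In this model an excluded method that has never been through `try_compile` is still reported as compilable by the flag-only query, while the directive-aware query reports it as not compilable from the start.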
Thanks, Christian ------------- Commit messages: - Fix some CompLevel_all to CompLevel_any - 8263582: WB_IsMethodCompilable ignores compiler directives Changes: https://git.openjdk.java.net/jdk/pull/3195/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3195&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8263582 Stats: 141 lines in 5 files changed: 128 ins; 0 del; 13 mod Patch: https://git.openjdk.java.net/jdk/pull/3195.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3195/head:pull/3195 PR: https://git.openjdk.java.net/jdk/pull/3195 From lucy at openjdk.java.net Thu Mar 25 15:34:41 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Thu, 25 Mar 2021 15:34:41 GMT Subject: RFR: 8264173: [s390] Improve Hardware Feature Detection And Reporting Message-ID: <-OjFHEcBr4ajS6JQWPsPHXm2w8MNQ5b028UlabrDv84=.174c9149-ca67-4929-a3b5-0bc6f561df5e@github.com> This enhancement is intended to improve the hardware feature detection and reporting, in particular for more recently introduced hardware. The enhancement is a prerequisite for possible future feature exploitation. Reviews are highly welcome and appreciated. 
------------- Commit messages: - 8264173: [s390] Improve Hardware Feature Detection And Reporting Changes: https://git.openjdk.java.net/jdk/pull/3196/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3196&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264173 Stats: 549 lines in 4 files changed: 340 ins; 74 del; 135 mod Patch: https://git.openjdk.java.net/jdk/pull/3196.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3196/head:pull/3196 PR: https://git.openjdk.java.net/jdk/pull/3196 From iveresov at openjdk.java.net Thu Mar 25 15:36:26 2021 From: iveresov at openjdk.java.net (Igor Veresov) Date: Thu, 25 Mar 2021 15:36:26 GMT Subject: RFR: 8263582: WB_IsMethodCompilable ignores compiler directives In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 15:00:39 GMT, Christian Hagedorn wrote: > While playing around with `WB_IsMethodCompilable` together with `compileonly` I ran into some surprising results for methods that should never be compiled (not part of `compileonly`): `isMethodCompilable` returns true instead of false when such an excluded method was not yet tried to be compiled. > > The reason for it is that `WB_IsMethodCompilable` directly checks `CompilationPolicy::can_be_compiled()` which calls `Method::is_not_compilable()`. However, the `ExcludeOption` compiler directive is only evaluated lazily upon a compilation attempt. Therefore, if a method was not tried to be compiled, yet, `Method::is_not_compilable()` always returns false, regardless of any set compiler directive. > > I therefore suggest to additionally check the `ExcludeOption` in `WB_IsMethodCompilable`. I also cleaned up some wrong use of `CompLevel_any` and `CompLevel_all` as suggested by @veresov: `CompLevel_any` should only be used to query the state as in `is_*()` methods and `CompLevel_all` when changing the state is in `set_*()` methods. > > Thanks, > Christian Marked as reviewed by iveresov (Reviewer). 
test/hotspot/jtreg/compiler/whitebox/TestMethodCompilableCompilerDirectives.java line 64: > 62: // to prevent a compilation is evaluated lazily and is only applied when a compilation for m is attempted. > 63: // Another problem is that Method::is_not_compilable() only returns true for CompLevel_any if C1 AND C2 cannot compile it. > 64: // This means that a compilation of m must have been attempted for C1 and C2 before WB::isMethodCompilable(m, CompLevl_any) will A typo "CompLevl_any" -> "CompLevel_any". ------------- PR: https://git.openjdk.java.net/jdk/pull/3195 From chagedorn at openjdk.java.net Thu Mar 25 15:41:46 2021 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Thu, 25 Mar 2021 15:41:46 GMT Subject: RFR: 8263582: WB_IsMethodCompilable ignores compiler directives [v2] In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 15:33:18 GMT, Igor Veresov wrote: >> Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: >> >> fix typo > > Marked as reviewed by iveresov (Reviewer). Thanks Igor for your review! > test/hotspot/jtreg/compiler/whitebox/TestMethodCompilableCompilerDirectives.java line 64: > >> 62: // to prevent a compilation is evaluated lazily and is only applied when a compilation for m is attempted. >> 63: // Another problem is that Method::is_not_compilable() only returns true for CompLevel_any if C1 AND C2 cannot compile it. >> 64: // This means that a compilation of m must have been attempted for C1 and C2 before WB::isMethodCompilable(m, CompLevl_any) will > > A typo "CompLevl_any" -> "CompLevel_any". Thanks, fixed it. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3195 From chagedorn at openjdk.java.net Thu Mar 25 15:41:45 2021 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Thu, 25 Mar 2021 15:41:45 GMT Subject: RFR: 8263582: WB_IsMethodCompilable ignores compiler directives [v2] In-Reply-To: References: Message-ID: > While playing around with `WB_IsMethodCompilable` together with `compileonly` I ran into some surprising results for methods that should never be compiled (not part of `compileonly`): `isMethodCompilable` returns true instead of false when such an excluded method was not yet tried to be compiled. > > The reason for it is that `WB_IsMethodCompilable` directly checks `CompilationPolicy::can_be_compiled()` which calls `Method::is_not_compilable()`. However, the `ExcludeOption` compiler directive is only evaluated lazily upon a compilation attempt. Therefore, if a method was not tried to be compiled, yet, `Method::is_not_compilable()` always returns false, regardless of any set compiler directive. > > I therefore suggest to additionally check the `ExcludeOption` in `WB_IsMethodCompilable`. I also cleaned up some wrong use of `CompLevel_any` and `CompLevel_all` as suggested by @veresov: `CompLevel_any` should only be used to query the state as in `is_*()` methods and `CompLevel_all` when changing the state is in `set_*()` methods. 
> > Thanks, > Christian Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: fix typo ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3195/files - new: https://git.openjdk.java.net/jdk/pull/3195/files/952049c5..67e31f21 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3195&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3195&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3195.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3195/head:pull/3195 PR: https://git.openjdk.java.net/jdk/pull/3195 From stuefe at openjdk.java.net Thu Mar 25 16:04:28 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 25 Mar 2021 16:04:28 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v4] In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 08:20:05 GMT, Stefan Johansson wrote: >> Please review this refactoring of the hugetlbfs reservation code. >> >> **Summary** >> In recent adventures in this area of the code I noticed a strange condition in `reserve_memory_special_huge_tlbfs` where we take the "mixed-mapping" route even if the size doesn't require any small pages to be used: >> if (is_aligned(bytes, os::large_page_size()) && alignment <= os::large_page_size()) { >> return reserve_memory_special_huge_tlbfs_only(bytes, req_addr, exec); >> } else { >> return reserve_memory_special_huge_tlbfs_mixed(bytes, alignment, req_addr, exec); >> } >> >> The second condition here is needed because if the alignment is larger than the large page size, we needed to enforce this and can't just trust `mmap` to give us a properly aligned address. Doing this by using the mixed-function feels a bit weird and looking a bit more at this I found a way to refactor this function to avoid having the two helpers. 
>> >> Instead of only having the mixed path honor the passed down alignment, make sure that is always done. This will also have the side-effect that all large pages in a "mixed"-mapping will be at the start and then we will have a tail of small pages. This actually also ensures that we will use large pages for a mixed mapping, in the past there was a corner case where we could end up with just a head and tail of small pages and no large page in between (if the mapping was smaller than 2 large pages and there was no alignment constraint). >> >> **Testing** >> Mach5 tier1-3 and a lot of local testing with different large page configurations. > > Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: > > Self review. > > Update helper name to better match commit_memory_special(). Hi Stefan, this is a welcome cleanup! Remarks inline. Cheers, Thomas src/hotspot/os/linux/os_linux.cpp line 3932: > 3930: size_t page_size, > 3931: char* req_addr, > 3932: bool exec) { I'd prefer if this were file scope static and not exported (I don't think this needs anything from the os::Linux namespace, does it?). Also, mid term we could probably merge this with commit_memory. AFAICS the only differences from the latter are that this can do huge pages and that the error handling is a bit different. I am also not sure the return type makes a lot of sense. Either we handle commit errors inside this function, in which case we should return void, or we return a boolean and handle it in the caller. Long term I would prefer to handle allocation errors not by aborting but by leaving it up to the caller what to do, because it makes sense to have the option to "reserve if we get large pages, otherwise just use small pages". E.g., I would like to do this in a future Metaspace.
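
The "reserve if we get large pages, otherwise just use small pages" option mentioned above could look roughly like the following Linux-only sketch. It is not the `os_linux.cpp` code: `Reservation`, `reserve_prefer_large_pages` and `release_reservation` are hypothetical names, and the sketch assumes plain `mmap` with `MAP_HUGETLB` and the default huge page size. The point is only that the function reports what happened and leaves the decision to the caller instead of aborting.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

// Hypothetical result type: tells the caller what it actually got.
struct Reservation {
  void*       addr;         // nullptr if both attempts failed
  std::size_t bytes;        // size of the mapping, 0 on failure
  bool        large_pages;  // true only if the MAP_HUGETLB mapping succeeded
};

// Try a huge-page mapping first; on any failure fall back to small pages.
// Errors are reported through the return value rather than by aborting.
inline Reservation reserve_prefer_large_pages(std::size_t bytes) {
  const int prot = PROT_READ | PROT_WRITE;
  const int base_flags = MAP_PRIVATE | MAP_ANONYMOUS;
#ifdef MAP_HUGETLB
  // Fails (e.g. EINVAL or ENOMEM) if bytes is not a multiple of the huge page
  // size or if no huge pages are available.
  void* large = mmap(nullptr, bytes, prot, base_flags | MAP_HUGETLB, -1, 0);
  if (large != MAP_FAILED) {
    return Reservation{large, bytes, true};
  }
#endif
  void* small = mmap(nullptr, bytes, prot, base_flags, -1, 0);
  if (small != MAP_FAILED) {
    return Reservation{small, bytes, false};
  }
  return Reservation{nullptr, 0, false};
}

// Unmap a reservation obtained from reserve_prefer_large_pages().
inline void release_reservation(const Reservation& r) {
  if (r.addr != nullptr) {
    munmap(r.addr, r.bytes);
  }
}
```

A caller can inspect `large_pages` to decide whether the fallback is acceptable, and `addr == nullptr` to handle a complete failure itself.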
src/hotspot/os/linux/os_linux.cpp line 3965: > 3963: size_t alignment, > 3964: char* req_addr, > 3965: bool exec) { So the contract is that this function will allocate huge paged memory with whatever page size is the default (controlled with UseLargePages and LargePageSizeInBytes). And with Markus future changes we will mix-and-match page sizes best as we can. So, control of page size is out of the hands of the caller. Are there callers misusing alignment for page size? Otherwise this seems fine to me. ------------- Changes requested by stuefe (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3073 From kvn at openjdk.java.net Thu Mar 25 17:39:27 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Thu, 25 Mar 2021 17:39:27 GMT Subject: RFR: 8263582: WB_IsMethodCompilable ignores compiler directives [v2] In-Reply-To: References: Message-ID: <14AlrpsyzREqo4XE-uDRzIsIeLnLcRzB5SIC9UU66Oc=.9d544536-3d41-40e7-a888-1bfd3484f8f5@github.com> On Thu, 25 Mar 2021 15:41:45 GMT, Christian Hagedorn wrote: >> While playing around with `WB_IsMethodCompilable` together with `compileonly` I ran into some surprising results for methods that should never be compiled (not part of `compileonly`): `isMethodCompilable` returns true instead of false when such an excluded method was not yet tried to be compiled. >> >> The reason for it is that `WB_IsMethodCompilable` directly checks `CompilationPolicy::can_be_compiled()` which calls `Method::is_not_compilable()`. However, the `ExcludeOption` compiler directive is only evaluated lazily upon a compilation attempt. Therefore, if a method was not tried to be compiled, yet, `Method::is_not_compilable()` always returns false, regardless of any set compiler directive. >> >> I therefore suggest to additionally check the `ExcludeOption` in `WB_IsMethodCompilable`. 
I also cleaned up some wrong use of `CompLevel_any` and `CompLevel_all` as suggested by @veresov: `CompLevel_any` should only be used to query the state as in `is_*()` methods and `CompLevel_all` when changing the state is in `set_*()` methods. >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: > > fix typo I would like to hear from @neliasso why ExcludeOption is treated so specially (). Why m->is_not_compilable() returns `false` when ExcludeOption is used? src/hotspot/share/prims/whitebox.cpp line 869: > 867: // Both compilers could have ExcludeOption set. Check all combinations. > 868: bool excluded_c1 = is_excluded_for_compiler(CompileBroker::compiler1(), mh); > 869: bool excluded_c2 = is_excluded_for_compiler(CompileBroker::compiler2(), mh); May be use next instead as we do in `WhiteBox::compile_method` at line #992: *comp = CompileBroker::compiler(comp_level);``` ------------- PR: https://git.openjdk.java.net/jdk/pull/3195 From akozlov at openjdk.java.net Thu Mar 25 18:14:06 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 25 Mar 2021 18:14:06 GMT Subject: Integrated: 8253795: Implementation of JEP 391: macOS/AArch64 Port In-Reply-To: References: Message-ID: On Fri, 22 Jan 2021 18:49:42 GMT, Anton Kozlov wrote: > Please review the implementation of JEP 391: macOS/AArch64 Port. > > It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. > > Major changes are in: > * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) > * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) > * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. 
The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. > * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) This pull request has now been integrated. Changeset: dbc9e4b5 Author: Anton Kozlov Committer: Vladimir Kempik URL: https://git.openjdk.java.net/jdk/commit/dbc9e4b5 Stats: 2960 lines in 75 files changed: 2851 ins; 27 del; 82 mod 8253795: Implementation of JEP 391: macOS/AArch64 Port 8253816: Support macOS W^X 8253817: Support macOS Aarch64 ABI in Interpreter 8253818: Support macOS Aarch64 ABI for compiled wrappers 8253819: Implement os/cpu for macOS/AArch64 8253839: Update tests and JDK code for macOS/Aarch64 8254941: Implement Serviceability Agent for macOS/AArch64 8255776: Change build system for macOS/AArch64 8262903: [macos_aarch64] Thread::current() called on detached thread Co-authored-by: Vladimir Kempik Co-authored-by: Bernhard Urban-Forster Co-authored-by: Ludovic Henry Co-authored-by: Monica Beckwith Reviewed-by: erikj, ihse, prr, cjplummer, stefank, gziemski, aph, mbeckwit, luhenry ------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Thu Mar 25 18:14:03 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 25 Mar 2021 18:14:03 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v30] In-Reply-To: References: Message-ID: > Please review the implementation of JEP 391: macOS/AArch64 Port. > > It's heavily based on existing ports to linux/aarch64, macos/x86_64, and windows/aarch64. 
> > Major changes are in: > * src/hotspot/cpu/aarch64: support of the new calling convention (subtasks JDK-8253817, JDK-8253818) > * src/hotspot/os_cpu/bsd_aarch64: copy of os_cpu/linux_aarch64 with necessary adjustments (JDK-8253819) > * src/hotspot/share, test/hotspot/gtest: support of write-xor-execute (W^X), required on macOS/AArch64 platform. It's implemented with pthread_jit_write_protect_np provided by Apple. The W^X mode is local to a thread, so W^X mode change relates to the java thread state change (for java threads). In most cases, JVM executes in write-only mode, except when calling a generated stub like SafeFetch, which requires a temporary switch to execute-only mode. The same execute-only mode is enabled when a java thread executes in java or native states. This approach of managing W^X mode turned out to be simple and efficient enough. > * src/jdk.hotspot.agent: serviceability agent implementation (JDK-8254941) Anton Kozlov has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 117 commits: - JDK-8261397: bsd_aarch64 part - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos - Merge branch 'master' into jdk-macos - JDK-8262491: bsd_aarch64 part - JDK-8263002: bsd_aarch64 part - Merge remote-tracking branch 'upstream/jdk/master' into jdk-macos - Wider #ifdef block - Fix most of issues in java/foreign/ tests Failures related to va_args are tracked in JDK-8263512. - Add Azul copyright - Update Oracle copyright years - ... 
and 107 more: https://git.openjdk.java.net/jdk/compare/b006f22f...d3629967 ------------- Changes: https://git.openjdk.java.net/jdk/pull/2200/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2200&range=29 Stats: 2960 lines in 75 files changed: 2851 ins; 27 del; 82 mod Patch: https://git.openjdk.java.net/jdk/pull/2200.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2200/head:pull/2200 PR: https://git.openjdk.java.net/jdk/pull/2200 From akozlov at openjdk.java.net Thu Mar 25 18:14:03 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Thu, 25 Mar 2021 18:14:03 GMT Subject: RFR: 8253795: Implementation of JEP 391: macOS/AArch64 Port [v29] In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 16:33:50 GMT, Andrew Haley wrote: >> Marked as reviewed by luhenry (Author). > >> > > > [ Back-porting this patch to JDK 11] depends on the will of openjdk11 maintainers to accept this (and few other, like jep-388, as we depend on it) contribution. >> > > >> > > >> > > To the extent that 11u has fixed policies :) we definitely have a policy of accepting patches to keep 11u working on current hardware. So yes. >> > >> > >> > @lewurm That sounds like a green flag for you and jep-388 (with its R18_RESERVED functionality) ;) >> >> Thanks, @theRealAph, and @VladimirKempik . We are on it! > > It's going to be tricky to do in a really clean way, given some of the weirdnesses of the ABI. However, I think there's probably a need for it The JEP was targeted to JDK17. So I propose to integrate this. Thank you all for the reviews, suggestions, discussions, and support! 
------------- PR: https://git.openjdk.java.net/jdk/pull/2200 From xliu at openjdk.java.net Thu Mar 25 19:00:32 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Thu, 25 Mar 2021 19:00:32 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com> References: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com> Message-ID: On Thu, 25 Mar 2021 12:42:34 GMT, Yasumasa Suenaga wrote: > I think this PR is very useful for us! > > > May we know more about LogMessageBuffer.hpp/cpp? We haven't found a real use of it. That's why we are hesitating to support LogFileOutput::write(LogMessageBuffer::Iterator msg_iterator). Further, we haven't supported async_mode for LogStdoutOutput and LogStderrOutput either. It's not difficult but needs to big code change. > > `LogMessageBuffer` is used in `LogMessage`. For example, we can see it as following. Frame # 1 is `LogFileOutput::write(LogMessageBuffer::Iterator msg_iterator)`. IMHO we do not need to change LogStdout/errOutput, but it is better to change LogMessageBuffer. > > ``` > #0 LogFileStreamOutput::write (this=this at entry=0x7ffff0002af0, msg_iterator=...) > at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logDecorators.hpp:108 > #1 0x00007ffff6e80e8e in LogFileOutput::write (this=this at entry=0x7ffff0002af0, msg_iterator=...) > at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logFileOutput.cpp:314 > #2 0x00007ffff6e876eb in LogTagSet::log ( > this=this at entry=0x7ffff7d4a640 ::_tagset>, msg=...) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logTagSet.cpp:85 > #3 0x00007ffff6a194df in LogImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::write ( > msg=...)
at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logTagSet.hpp:150 > #4 LogMessageImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::flush ( > this=0x7ffff58675d0) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logMessage.hpp:79 > #5 LogMessageImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::~LogMessageImpl ( > this=0x7ffff58675d0, __in_chrg=) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logMessage.hpp:74 > #6 InstanceKlass::print_class_load_logging (this=this at entry=0x800007430, loader_data=loader_data at entry=0x7ffff00f5200, > module_entry=module_entry at entry=0x0, cfs=cfs at entry=0x0) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/oops/instanceKlass.cpp:3647 > ``` hi, @YaSuenag, Thank you for providing the stacktrace! I didn't notice until you point out. Now I understand the rationale and usecases of logMessageBuffer. Let me figure out how to support it. IIUC, the most important attribute of `LogMessage` is to guarantee messages are consecutive, or free from interleaving. I will focus on it. ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From xliu at openjdk.java.net Thu Mar 25 19:04:29 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Thu, 25 Mar 2021 19:04:29 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com> References: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com> Message-ID: On Thu, 25 Mar 2021 12:21:41 GMT, Yasumasa Suenaga wrote: >> This patch provides a buffer to store asynchronous messages and flush them to >> underlying files periodically.
> > src/hotspot/share/runtime/globals.hpp line 2033: > >> 2031: "Milliseconds between asynchronous log flushing") \ >> 2032: \ >> 2033: product(bool, AsyncLogging, false, \ > > I think this option is not needed - `async` should be set to `false` by default, and we should control it through `-Xlog` option like other log output options (e.g. `filecount`). It's possible that a Java process has multiple file-based outputs. A global option `AsyncLogging` can set them all. Otherwise, developers have to set async=true individually. It's part of CSR, right? > src/hotspot/share/runtime/globals.hpp line 2036: > >> 2034: "Enble asynchronous GC logging") \ >> 2035: \ >> 2036: product(size_t, GCLogBufferSize, 2*K, \ > > This PR is for UL, not only GC log. So it should be renamed. ack. I will rename it AsyncLogBufferSize. ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From stuefe at openjdk.java.net Thu Mar 25 20:14:31 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 25 Mar 2021 20:14:31 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com> Message-ID: On Thu, 25 Mar 2021 18:57:32 GMT, Xin Liu wrote: >> I think this PR is very useful for us! >> >>> May we know more about LogMessageBuffer.hpp/cpp? We haven't found a real use of it. That's why we are hesitating to support LogFileOutput::write(LogMessageBuffer::Iterator msg_iterator). Further, we haven't supported async_mode for LogStdoutOutput and LogStderrOutput either. It's not difficult but needs to big code change. >> >> `LogMessageBuffer` is used in `LogMessage`. For example, we can see it as following. Frame # 1 is `LogFileOutput::write(LogMessageBuffer::Iterator msg_iterator)`. IMHO we do not need to change LogStdout/errOutput, but it is better to change LogMessageBuffer.
>> >> #0 LogFileStreamOutput::write (this=this at entry=0x7ffff0002af0, msg_iterator=...) >> at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logDecorators.hpp:108 >> #1 0x00007ffff6e80e8e in LogFileOutput::write (this=this at entry=0x7ffff0002af0, msg_iterator=...) >> at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logFileOutput.cpp:314 >> #2 0x00007ffff6e876eb in LogTagSet::log ( >> this=this at entry=0x7ffff7d4a640 ::_tagset>, msg=...) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logTagSet.cpp:85 >> #3 0x00007ffff6a194df in LogImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::write ( >> msg=...) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logTagSet.hpp:150 >> #4 LogMessageImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::flush ( >> this=0x7ffff58675d0) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logMessage.hpp:79 >> #5 LogMessageImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::~LogMessageImpl ( >> this=0x7ffff58675d0, __in_chrg=) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logMessage.hpp:74 >> #6 InstanceKlass::print_class_load_logging (this=this at entry=0x800007430, loader_data=loader_data at entry=0x7ffff00f5200, >> module_entry=module_entry at entry=0x0, cfs=cfs at entry=0x0) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/oops/instanceKlass.cpp:3647 > >> I think this PR is very useful for us! >> >> > May we know more about LogMessageBuffer.hpp/cpp? We haven't found a real use of it. That's why we are hesitating to support LogFileOutput::write(LogMessageBuffer::Iterator msg_iterator). Further, we haven't supported async_mode for LogStdoutOutput and LogStderrOutput either. It's not difficult but needs to big code change. >> >> `LogMessageBuffer` is used in `LogMessage`.
For example, we can see it as following. Frame # 1 is `LogFileOutput::write(LogMessageBuffer::Iterator msg_iterator)`. IMHO we do not need to change LogStdout/errOutput, but it is better to change LogMessageBuffer. >> >> ``` >> #0 LogFileStreamOutput::write (this=this at entry=0x7ffff0002af0, msg_iterator=...) >> at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logDecorators.hpp:108 >> #1 0x00007ffff6e80e8e in LogFileOutput::write (this=this at entry=0x7ffff0002af0, msg_iterator=...) >> at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logFileOutput.cpp:314 >> #2 0x00007ffff6e876eb in LogTagSet::log ( >> this=this at entry=0x7ffff7d4a640 ::_tagset>, msg=...) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logTagSet.cpp:85 >> #3 0x00007ffff6a194df in LogImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::write ( >> msg=...) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logTagSet.hpp:150 >> #4 LogMessageImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::flush ( >> this=0x7ffff58675d0) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logMessage.hpp:79 >> #5 LogMessageImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::~LogMessageImpl ( >> this=0x7ffff58675d0, __in_chrg=) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logMessage.hpp:74 >> #6 InstanceKlass::print_class_load_logging (this=this at entry=0x800007430, loader_data=loader_data at entry=0x7ffff00f5200, >> module_entry=module_entry at entry=0x0, cfs=cfs at entry=0x0) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/oops/instanceKlass.cpp:3647 >> ``` > > hi, @YaSuenag, > > Thank you for providing the stacktrace! I didn't notice until you point out. Now I understand the rationale and usecases of logMessageBuffer. Let me figure out how to support it. 
> > IIUC, the most important attribute of `LogMessage` is to guarantee messages are consecutive, or free from interleaving. I will focus on it. Hi Xin, I skimmed over the patch, but have a number of high-level questions - things which have not been clear from your description. - Who does the writing, and who is affected when the writing stalls? - Do you then block or throw output away? - If the former, how do you mitigate the ripple effect? - If the latter, how does the reader of the log file know that something is missing? - How often do you flush? How do you prevent missing output in the log file in case of crashes? - Can this really handle the full brunt of logging (-Xlog:*=trace) over many threads? - Does this work with multiple targets and multiple IO files? - Does it cost anything if logging is off or not async? Update: Okay, I see you use PeriodicTask and the WatcherThread. Is this really enough? I would be concerned that it either runs too rarely to be able to swallow all output or that it runs so often that it monopolizes the WatcherThread. I actually expected a separate Thread - or multiple, one per output - for this, waking up when there is something to write. That would also be more efficient than constant periodic polling. - What is the performance impact when we have lots of concurrent writes from many threads? I see that you use a Mutex to synchronize the logging threads with the flush service. Before, these threads would have done concurrent IO and that would be handled by the libc, potentially without locking. --- I think this feature could be useful. I am a bit concerned with the increased complexity this brings. UL is already a very (I think unnecessarily) complex codebase. Maybe we should try to reduce its complexity first before adding new features to it. This is just my opinion, let's see what others think.
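
As an illustration of the dedicated-thread alternative raised above (a writer that sleeps until there is something to write, instead of periodic polling), here is a minimal standalone C++ sketch. It is not taken from the patch under review: `ToyAsyncLog`, its bounded queue and the drop-when-full policy are hypothetical, the in-memory `_sink` stands in for the log file, and standard library primitives are used in place of HotSpot's own `Mutex` and thread classes.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical toy logger: producers enqueue, one dedicated thread writes.
class ToyAsyncLog {
 public:
  explicit ToyAsyncLog(std::size_t capacity)
      : _capacity(capacity), _writer([this] { run(); }) {}

  ~ToyAsyncLog() { stop(); }

  // Called by logging threads. Never waits for I/O; drops the message when full.
  bool enqueue(std::string msg) {
    std::lock_guard<std::mutex> lock(_mutex);
    if (_queue.size() >= _capacity) {
      ++_dropped;  // counted so that a reader could be told output is missing
      return false;
    }
    _queue.push_back(std::move(msg));
    _cv.notify_one();  // wake the writer only when there is work
    return true;
  }

  // Flush whatever is still queued and join the writer thread.
  void stop() {
    {
      std::lock_guard<std::mutex> lock(_mutex);
      _stopping = true;
    }
    _cv.notify_one();
    if (_writer.joinable()) _writer.join();
  }

  std::size_t dropped() const {
    std::lock_guard<std::mutex> lock(_mutex);
    return _dropped;
  }

  // Only meaningful after stop(): the writer thread owns _sink while running.
  const std::vector<std::string>& written() const { return _sink; }

 private:
  void run() {
    std::unique_lock<std::mutex> lock(_mutex);
    for (;;) {
      _cv.wait(lock, [this] { return _stopping || !_queue.empty(); });
      if (_queue.empty()) return;  // woken only because we are stopping
      std::deque<std::string> batch;
      batch.swap(_queue);
      lock.unlock();  // do the (simulated) I/O without holding the lock
      for (std::string& m : batch) _sink.push_back(std::move(m));
      lock.lock();
    }
  }

  const std::size_t _capacity;
  mutable std::mutex _mutex;
  std::condition_variable _cv;
  std::deque<std::string> _queue;
  std::vector<std::string> _sink;
  std::size_t _dropped = 0;
  bool _stopping = false;
  std::thread _writer;  // declared last so every other member exists before it starts
};
```

In this sketch the producers hold the lock only long enough to append to the queue, the writer swaps out a whole batch before doing I/O, and a full queue results in a counted drop rather than a blocked logging thread. Whether dropping or blocking is preferable is exactly the open question in the list above.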
Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From stuefe at openjdk.java.net Thu Mar 25 20:21:26 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Thu, 25 Mar 2021 20:21:26 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com> Message-ID: On Thu, 25 Mar 2021 20:11:46 GMT, Thomas Stuefe wrote: >>> I think this PR is very useful for us! >>> >>> > May we know more about LogMessageBuffer.hpp/cpp? We haven?t found a real use of it. That?s why we are hesitating to support LogFileOutput::write(LogMessageBuffer::Iterator msg_iterator). Further, we haven?t supported async_mode for LogStdoutOutput and LogStderrOutput either. It?s not difficult but needs to big code change. >>> >>> `LogMessageBuffer` is used in `LogMessage`. For example, we can see it as following. Frame # 1 is `LogFileOutput::write(LogMessageBuffer::Iterator msg_iterator)`. IMHO we do not need to change LogStdout/errOutput, but it is better to change LogMessageBuffer. >>> >>> ``` >>> #0 LogFileStreamOutput::write (this=this at entry=0x7ffff0002af0, msg_iterator=...) >>> at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logDecorators.hpp:108 >>> #1 0x00007ffff6e80e8e in LogFileOutput::write (this=this at entry=0x7ffff0002af0, msg_iterator=...) >>> at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logFileOutput.cpp:314 >>> #2 0x00007ffff6e876eb in LogTagSet::log ( >>> this=this at entry=0x7ffff7d4a640 ::_tagset>, msg=...) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logTagSet.cpp:85 >>> #3 0x00007ffff6a194df in LogImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::write ( >>> msg=...) 
at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logTagSet.hpp:150 >>> #4 LogMessageImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::flush ( >>> this=0x7ffff58675d0) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logMessage.hpp:79 >>> #5 LogMessageImpl<(LogTag::type)16, (LogTag::type)68, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0, (LogTag::type)0>::~LogMessageImpl ( >>> this=0x7ffff58675d0, __in_chrg=) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/logging/logMessage.hpp:74 >>> #6 InstanceKlass::print_class_load_logging (this=this at entry=0x800007430, loader_data=loader_data at entry=0x7ffff00f5200, >>> module_entry=module_entry at entry=0x0, cfs=cfs at entry=0x0) at /home/ysuenaga/github-forked/jdk/src/hotspot/share/oops/instanceKlass.cpp:3647 >>> ``` >> >> hi, @YaSuenag, >> >> Thank you for providing the stacktrace! I didn't notice until you point out. Now I understand the rationale and usecases of logMessageBuffer. Let me figure out how to support it. >> >> IIUC, the most important attribute of `LogMessage` is to guarantee messages are consecutive, or free from interleaving. I will focus on it. > > Hi Xin, > > I skimmed over the patch, but have a number of high level questions - things which have not been clear from your description. > > - Who does the writing, and who is affected when the writing stalls? > - Do you then block or throw output away? > - If the former, how do you mitigate the ripple effect? > - If the latter, how does the reader of the log file know that something is missing? > - How often do you flush? How do you prevent missing output in the log file in case of crashes? > - Can this really the full brunt of logging (-Xlog:*=trace) over many threads? > - Does this work with multiple target and multiple IO files? > - Does it cost anything if logging is off or not async? > > Update: Okay, I see you use PeriodicTask and the WatcherThread. 
Is this really enough? I would be concerned that it either runs too rarely to be able to swallow all output or that it runs that often that it monopolizes the WatcherThread. > > I actually expected a separate Thread - or multiple, one per output - for this, waking up when there is something to write. That would also be more efficient than constant periodic polling. > > - How is the performance impact when we have lots of concurrent writes from many threads? I see that you use a Mutex to synchronize the logging threads with the flush service. Before, these threads would have done concurrent IO and that would be handled by the libc, potentially without locking. > > --- > > I think this feature could be useful. I am a bit concerned with the increased complexity this brings. UL is already a very (I think unnecessarily) complex codebase. Maybe we should try to reduce its complexity first before adding new features to it. This is just my opinion, lets see what others think. > > Cheers, Thomas p.s. I like the integration into UL via a target modification btw. That feels very organic. ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From dholmes at openjdk.java.net Thu Mar 25 22:29:26 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Thu, 25 Mar 2021 22:29:26 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v4] In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 13:06:00 GMT, Coleen Phillimore wrote: >> find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. >> >> The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. >> >> check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. 
>> >> Also: is_shared_class_visible{_impl} >> >> Tested with tier1 on 4 Oracle platforms (in progress) > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > fix broken logic in systemDictionaryShared. src/hotspot/share/classfile/systemDictionaryShared.cpp line 1864: > 1862: } > 1863: > 1864: if (DynamicDumpSharedSpaces && Thread::current()->is_VM_thread()) { This is still a functional change. If there is a bug and we get here in the VMThread when DynamicDumpSharedSpaces is not set then we will no longer immediately return. ------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From david.holmes at oracle.com Thu Mar 25 22:30:30 2021 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 Mar 2021 08:30:30 +1000 Subject: os_windows.cpp : simplify is_thread_cpu_time_supported ? In-Reply-To: References: Message-ID: Hi Matthias, On 25/03/2021 11:49 pm, Baesken, Matthias wrote: > Hello, I wonder , should we just return true in os::is_thread_cpu_time_supported() on Windows ? > > See > > https://github.com/openjdk/jdk/blob/master/src/hotspot/os/windows/os_windows.cpp#L4588 > > > According to MSDN > https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getthreadtimes > > > GetThreadTimes is supported on Win2003/XP and higher . This should be fine for OpenJDK . Yes it should be fine. There may be other Windows archaisms in the code that could be cleaned up now. 
Cheers, David > Best regards, Matthias > From iklam at openjdk.java.net Fri Mar 26 00:00:28 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 26 Mar 2021 00:00:28 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v4] In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 22:26:28 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> fix broken logic in systemDictionaryShared. > > src/hotspot/share/classfile/systemDictionaryShared.cpp line 1864: > >> 1862: } >> 1863: >> 1864: if (DynamicDumpSharedSpaces && Thread::current()->is_VM_thread()) { > > This is still a functional change. If there is a bug and we get here in the VMThread when DynamicDumpSharedSpaces is not set then we will no longer immediately return. That's fine. In a debug build, we will assert both before/after this change. In a product build, after this change, some duplicated info will be added via info->record_linking_constraint but it won't hurt. ------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From david.holmes at oracle.com Fri Mar 26 02:28:26 2021 From: david.holmes at oracle.com (David Holmes) Date: Fri, 26 Mar 2021 12:28:26 +1000 Subject: os_windows.cpp : simplify is_thread_cpu_time_supported ? In-Reply-To: References: Message-ID: <3fbc20cb-02ff-178a-32c8-18934a5949e9@oracle.com> Sorry, correction ... On 26/03/2021 8:30 am, David Holmes wrote: > Hi Matthias, > > On 25/03/2021 11:49 pm, Baesken, Matthias wrote: >> Hello, I wonder, should we just return true in >> os::is_thread_cpu_time_supported() on Windows? >> >> See >> >> https://github.com/openjdk/jdk/blob/master/src/hotspot/os/windows/os_windows.cpp#L4588 >> >> According to MSDN >> https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getthreadtimes >> >> GetThreadTimes is supported on Win2003/XP and higher. This should be >> fine for OpenJDK.
> > Yes it should be fine. There may be other Windows archaisms in the code > that could be cleaned up now. The issue was not API availability (we got rid of that check a long time ago) but security permissions. We actually just returned "true" prior to JDK 5 but that was changed by JDK-4884677 when the JVM TI support was added. It is a bit messy. We use the result of os::is_thread_cpu_time_supported() at initialization time, on the main thread to then decide the global availability of this feature. And via the normal launcher that thread will have all access bits set and so we will flag thread_cpu_time as being available. At runtime we might encounter a thread for which the access bits are not present and so the actual get_thread_cpu_time call may return -1. In theory the JVM could be loaded on a thread without full permissions and so we would then globally disable thread_cpu_time. So I think this code has to stay. David ----- > Cheers, > David > >> Best regards, Matthias >> From dholmes at openjdk.java.net Fri Mar 26 03:08:28 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 26 Mar 2021 03:08:28 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v4] In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 13:06:00 GMT, Coleen Phillimore wrote: >> find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. >> >> The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. >> >> check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. 
>> >> Also: is_shared_class_visible{_impl} >> >> Tested with tier1 on 4 Oracle platforms (in progress) > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > fix broken logic in systemDictionaryShared. LGTM. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3176 From dholmes at openjdk.java.net Fri Mar 26 03:08:28 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 26 Mar 2021 03:08:28 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v4] In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 23:57:42 GMT, Ioi Lam wrote: >> src/hotspot/share/classfile/systemDictionaryShared.cpp line 1864: >> >>> 1862: } >>> 1863: >>> 1864: if (DynamicDumpSharedSpaces && Thread::current()->is_VM_thread()) { >> >> This is still a functional change. If there is a bug and we get here in the VMThread when DynamicDumpSharedSpaces is not set then we will no longer immediately return. > > That's fine. In debug build, we will assert both before/after this change. In product build, after this change, some duplicated info will be added via info->record_linking_constraint but it won't hurt. Okay. Thanks for clarifying. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From ysuenaga at openjdk.java.net Fri Mar 26 07:21:26 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Fri, 26 Mar 2021 07:21:26 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com> Message-ID: On Thu, 25 Mar 2021 19:00:49 GMT, Xin Liu wrote: >> src/hotspot/share/runtime/globals.hpp line 2033: >> >>> 2031: "Milliseconds between asynchronous log flushing") \ >>> 2032: \ >>> 2033: product(bool, AsyncLogging, false, \ >> >> I think this option is not needed - `async` should be set to `false` by default, and we should control it through `-Xlog` option like other log output options (e.g. `filecount`). > > It's possible that a Java process has multiple file-based outputs. A global option `AsyncLogging` can set them all. Otherwise, developers have to set async=true individually. It's part of CSR, right? Yes, it's part of CSR. I think it is preferable to set `async` to false by default because it should be treated the same as `file` / `filecount` / `filesize` on the logger. The user should add `async=true` to each logger that should write asynchronously. ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From thomas.stuefe at gmail.com Fri Mar 26 07:43:46 2021 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Fri, 26 Mar 2021 08:43:46 +0100 Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com> Message-ID: Can you please link the CSR to the issue?
On Fri, Mar 26, 2021 at 8:21 AM Yasumasa Suenaga wrote: > On Thu, 25 Mar 2021 19:00:49 GMT, Xin Liu wrote: > > >> src/hotspot/share/runtime/globals.hpp line 2033: > >> > >>> 2031: "Milliseconds between asynchronous log flushing") > \ > >>> 2032: > \ > >>> 2033: product(bool, AsyncLogging, false, > \ > >> > >> I think this option is not needed - `async` should be set to `false` by > default, and we should control it through `-Xlog` option like other log > output options (e.g. `filecount`). > > > > It's possible that a Java process have multiple file-based outputs. A > global option `AsyncLogging` can set them all Otherwise, developers have to > set async=true individually. It's part of CSR, right? > > Yes, it's part of CSR. > > I think it is prefer to set `async` to false by default because it should > be treated same with `file` / `filecount` / `filesize` on the logger. The > user should add `async=true` to the logger what the user want to set to. > > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3135 > From kbarrett at openjdk.java.net Fri Mar 26 07:47:46 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 26 Mar 2021 07:47:46 GMT Subject: RFR: 8264166: OopStorage should support specifying MEMFLAGS for allocations [v2] In-Reply-To: References: Message-ID: <7sMGu0o72hVvkvsdpInVQ5Wkvxo6Qiv1K-V-y8Kz6BA=.70a8297c-0717-4130-82b5-b5293a5a5408@github.com> > Please review this change to OopStorage to allow the MEMFLAGS value for associated allocations to be specified when the storage object is constructed. This allows a subsystem that needs an OopStorage object to associate its allocation with others for that subsystem in NMT tracking and reporting. > > Testing: > mach5 tier1. > Manually compared NMT output before and after this change for a test that generated a lot of one particular OopStorage entries. Kim Barrett has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge branch 'master' into oopstorage_memflags - fix include sort order - add MEMFLAGS support to OopStorage ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3188/files - new: https://git.openjdk.java.net/jdk/pull/3188/files/a9c808be..6a30454f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3188&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3188&range=00-01 Stats: 7366 lines in 175 files changed: 6712 ins; 240 del; 414 mod Patch: https://git.openjdk.java.net/jdk/pull/3188.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3188/head:pull/3188 PR: https://git.openjdk.java.net/jdk/pull/3188 From kbarrett at openjdk.java.net Fri Mar 26 07:47:46 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Fri, 26 Mar 2021 07:47:46 GMT Subject: Integrated: 8264166: OopStorage should support specifying MEMFLAGS for allocations In-Reply-To: References: Message-ID: <0lZbxFyo0UuHHM2ew5Nz_YhCj75hw_3VGG2oGeMEZdU=.d60f7968-1cba-4a6b-97a4-1043f8ab1bb9@github.com> On Thu, 25 Mar 2021 07:27:58 GMT, Kim Barrett wrote: > Please review this change to OopStorage to allow the MEMFLAGS value for associated allocations to be specified when the storage object is constructed. This allows a subsystem that needs an OopStorage object to associate its allocation with others for that subsystem in NMT tracking and reporting. > > Testing: > mach5 tier1. > Manually compared NMT output before and after this change for a test that generated a lot of one particular OopStorage entries. This pull request has now been integrated. 
Changeset: bb354b9d Author: Kim Barrett URL: https://git.openjdk.java.net/jdk/commit/bb354b9d Stats: 64 lines in 16 files changed: 30 ins; 0 del; 34 mod 8264166: OopStorage should support specifying MEMFLAGS for allocations Reviewed-by: tschatzl, stefank ------------- PR: https://git.openjdk.java.net/jdk/pull/3188 From matthias.baesken at sap.com Fri Mar 26 08:06:12 2021 From: matthias.baesken at sap.com (Baesken, Matthias) Date: Fri, 26 Mar 2021 08:06:12 +0000 Subject: os_windows.cpp : simplify is_thread_cpu_time_supported ? In-Reply-To: <3fbc20cb-02ff-178a-32c8-18934a5949e9@oracle.com> References: <3fbc20cb-02ff-178a-32c8-18934a5949e9@oracle.com> Message-ID: Hi David, thanks for the info . I found https://docs.microsoft.com/en-us/windows/win32/procthread/thread-security-and-access-rights so it looks like we need THREAD_QUERY_INFORMATION or THREAD_QUERY_LIMITED_INFORMATION access right for GetThreadTimes . On the other hand , the test in os::is_thread_cpu_time_supported() on Windows might (temporary ?) fail for other reasons too , it is not clear to me if this is really always related to the wrong access rights ? And at some places in HS code like jfrThreadCPULoadEvent.cpp , os::thread_cpu_time is called anyway without checking for os::is_thread_cpu_time_supported() ; same for thread.cpp / Thread::print_on but this is just printing some output so it is most likely not really a big issue on Windows . Best regards, Matthias >> On 25/03/2021 11:49 pm, Baesken, Matthias wrote: >>> Hello,? I wonder , should we just return? true? in >>> os::is_thread_cpu_time_supported()?? on Windows? ? >>> >>> See >>> >>> https://github.com/openjdk/jdk/blob/master/src/hotspot/os/windows/os_windows.cpp#L4588 >>> >>> According to? MSDN >>> https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getthreadtimes >>> >>> GetThreadTimes is supported? on Win2003/XP and higher . This should be >>> fine for OpenJDK . >> >> Yes it should be fine. 
There may be other Windows archaisms in the code >> that could be cleaned up now. > >The issue was not API availability (we got rid of that check a long time >ago) but security permissions. We actually just returned "true" prior to >JDK 5 but that was changed by JDK-4884677 when the JVM TI support was added. > >It is a bit messy. We use the result of >os::is_thread_cpu_time_supported() at initialization time, on the main >thread to then decide the global availability of this feature. And via >the normal launcher that thread will have all access bits set and so we >will flag thread_cpu_time as being available. At runtime we might >encounter a thread for which the access bits are not present and so the >actual get_thread_cpu_time call may return -1. In theory the JVM could >be loaded on a thread without full permissions and so we would then >globally disable thread_cpu_time. > >So I think this code has to stay. From rehn at openjdk.java.net Fri Mar 26 08:21:25 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Fri, 26 Mar 2021 08:21:25 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com> Message-ID: <5D_18NJHbCcd9gmqY4BioAiAljBB_RYa2lZqfzmU0qI=.8b61c6ad-e15a-4d5b-9398-5dc95aa0ade6@github.com> On Thu, 25 Mar 2021 20:19:03 GMT, Thomas Stuefe wrote: >> Hi Xin, >> >> I skimmed over the patch, but have a number of high level questions - things which have not been clear from your description. >> >> - Who does the writing, and who is affected when the writing stalls? >> - Do you then block or throw output away? >> - If the former, how do you mitigate the ripple effect? >> - If the latter, how does the reader of the log file know that something is missing? >> - How often do you flush? How do you prevent missing output in the log file in case of crashes? 
>> - Can this really the full brunt of logging (-Xlog:*=trace) over many threads? >> - Does this work with multiple target and multiple IO files? >> - Does it cost anything if logging is off or not async? >> >> Update: Okay, I see you use PeriodicTask and the WatcherThread. Is this really enough? I would be concerned that it either runs too rarely to be able to swallow all output or that it runs that often that it monopolizes the WatcherThread. >> >> I actually expected a separate Thread - or multiple, one per output - for this, waking up when there is something to write. That would also be more efficient than constant periodic polling. >> >> - How is the performance impact when we have lots of concurrent writes from many threads? I see that you use a Mutex to synchronize the logging threads with the flush service. Before, these threads would have done concurrent IO and that would be handled by the libc, potentially without locking. >> >> --- >> >> I think this feature could be useful. I am a bit concerned with the increased complexity this brings. UL is already a very (I think unnecessarily) complex codebase. Maybe we should try to reduce its complexity first before adding new features to it. This is just my opinion, lets see what others think. >> >> Cheers, Thomas > > p.s. I like the integration into UL via a target modification btw. That feels very organic. Hi, This is flushed by the watcher thread (non-JavaThread). Flushing can thus happen during a safepoint and one or more safepoints may have passed between the actual logging and the flushing. If the VM thread logs it can be delayed while watcher thread does "pop_all()" it seems like. I suppose pop_all can take a while if you have a couple of thousands of logs messages? We can also change log-configuration during run-time, e.g. turn on/off logs via jcmd. Wouldn't it be more natural to flush the async logs-lines before we update the log configuration? (e.g. 
if you turn off a log via jcmd, we flush the async buffer before) Thanks ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From chagedorn at openjdk.java.net Fri Mar 26 08:58:27 2021 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Fri, 26 Mar 2021 08:58:27 GMT Subject: RFR: 8263582: WB_IsMethodCompilable ignores compiler directives [v2] In-Reply-To: <14AlrpsyzREqo4XE-uDRzIsIeLnLcRzB5SIC9UU66Oc=.9d544536-3d41-40e7-a888-1bfd3484f8f5@github.com> References: <14AlrpsyzREqo4XE-uDRzIsIeLnLcRzB5SIC9UU66Oc=.9d544536-3d41-40e7-a888-1bfd3484f8f5@github.com> Message-ID: On Thu, 25 Mar 2021 17:19:13 GMT, Vladimir Kozlov wrote: >> Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: >> >> fix typo > > src/hotspot/share/prims/whitebox.cpp line 869: > >> 867: // Both compilers could have ExcludeOption set. Check all combinations. >> 868: bool excluded_c1 = is_excluded_for_compiler(CompileBroker::compiler1(), mh); >> 869: bool excluded_c2 = is_excluded_for_compiler(CompileBroker::compiler2(), mh); > > Maybe use the following instead, as we do in `WhiteBox::compile_method` at line #992: > `*comp = CompileBroker::compiler(comp_level);` The problem is that `CompileBroker::compiler()` returns `NULL` for `CompLevel_any`. And even if it returned one compiler, I would also need to check the other one to decide if the method is completely non-compilable. That's why I added this additional logic for `CompLevel_any`.
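To make the `CompLevel_any` reasoning concrete, here is a minimal sketch under simplifying assumptions. The `Method` struct, its two flags, and `is_excluded()` are hypothetical stand-ins, not the code in `whitebox.cpp`; the point is only that "any" has to consult both compilers, while a specific level consults just one.

```cpp
#include <cassert>

// Hypothetical, simplified compilation levels; the real HotSpot enum has more values.
enum CompLevel {
  CompLevel_any = -1,
  CompLevel_simple = 1,             // stands for C1 in this sketch
  CompLevel_full_optimization = 4   // stands for C2 in this sketch
};

// Hypothetical method descriptor: one "excluded by directive" flag per compiler.
struct Method {
  bool excluded_c1;
  bool excluded_c2;
};

// For CompLevel_any the method counts as excluded only if *both* compilers
// exclude it; for a specific level only the matching compiler is consulted.
inline bool is_excluded(const Method& m, CompLevel level) {
  switch (level) {
    case CompLevel_any:
      return m.excluded_c1 && m.excluded_c2;
    case CompLevel_simple:
      return m.excluded_c1;
    case CompLevel_full_optimization:
      return m.excluded_c2;
  }
  return false;
}
```

With this shape, a method excluded only for C1 is still reported as compilable when queried with `CompLevel_any`, because C2 could still compile it.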
------------- PR: https://git.openjdk.java.net/jdk/pull/3195 From pli at openjdk.java.net Fri Mar 26 09:47:25 2021 From: pli at openjdk.java.net (Pengfei Li) Date: Fri, 26 Mar 2021 09:47:25 GMT Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line [v2] In-Reply-To: <3-v1OHDwR3s9YqkEDvycRjVnZefpIhpXyExvOhSRkaQ=.4cd1a348-e6b3-47ac-b5e3-52708bc587e5@github.com> References: <3-v1OHDwR3s9YqkEDvycRjVnZefpIhpXyExvOhSRkaQ=.4cd1a348-e6b3-47ac-b5e3-52708bc587e5@github.com> Message-ID: On Thu, 25 Mar 2021 07:42:42 GMT, David Holmes wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Restore Red Hat copyright line > > Marked as reviewed by dholmes (Reviewer). > > I must confess I don't like this solution at all: it sounds very delicate. Couldn't you define a function `VM_Version::get_ContendedPaddingWidth()` and call that? > > Hi Andrew, I agree that tuning the initialization order is a tricky fix to some extent. But I don't think it's trivial work to define a function to get the final value of `ContendedPaddingWidth` before `VM_Version::initialize()`. As the value depends on the CPU dcache line size, the problem is that the way of querying that CPU info varies significantly between platforms. In the current implementation, we run a few assembly instructions on AArch64. But on some other architectures, typically ppc and s390, we emit much more code into a code buffer (and thus depend on `CodeCache::initialize()`). And on Windows, we need to call some Windows API to retrieve processor info. If we do too much in the newly defined function, it would not be much different from moving `VM_Version::initialize()` before `AOTLoader::initialize()`. @theRealAph Just wondering if you have other solutions or advice?
------------- PR: https://git.openjdk.java.net/jdk/pull/3169 From simonis at openjdk.java.net Fri Mar 26 10:05:32 2021 From: simonis at openjdk.java.net (Volker Simonis) Date: Fri, 26 Mar 2021 10:05:32 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: Message-ID: On Mon, 22 Mar 2021 22:12:14 GMT, Xin Liu wrote: > This patch provides a buffer to store asynchronous messages and flush them to > underlying files periodically. Hi Xin, thanks for finally addressing this issue. In general your change looks good. Please find my detailed comments inline. Best regards, Volker src/hotspot/share/logging/logAsyncFlusher.cpp line 38: > 36: > 37: #ifndef PRODUCT > 38: template<> void LinkedListDeque::log_drop(AsyncLogMessage* e) { See comment on `pop_front()`. I'd remove this function entirely and handle this functionality right in `LogAsyncFlusher::enqueue()`. src/hotspot/share/logging/logAsyncFlusher.cpp line 72: > 70: > 71: if (_buffer.size() >= GCLogBufferSize) { > 72: _buffer.pop_front(); See comment on `pop_front()`. Instead of relying on `pop_front()` calling `log_drop()` I propose to remove `log_drop()` and implement its functionality right here where I think it is more appropriate. You can call `buffer.front()` to get a pointer to the `AsyncLogMessage` and easily implement the functionality of `log_drop()` here. src/hotspot/share/logging/logAsyncFlusher.cpp line 81: > 79: LogAsyncFlusher* LogAsyncFlusher::_instance = NULL; > 80: > 81: void LogAsyncFlusher::initialize() { I don't think you need this. Please merge the initialization into `LogAsyncFlusher::instance()` (see my comments on your changes in `init.cpp`). src/hotspot/share/logging/logAsyncFlusher.hpp line 34: > 32: > 33: template > 34: class LinkedListDeque : private LinkedListImpl { The name `LinkedListDeque` implies that this is a general-purpose Deque implementation, which is not true.
It's fine to have an implementation for your very specific needs (otherwise it should probably be in its own file under `share/utilities/dequeue.hpp`). But to make this explicitly clear to the reader, can you please rename it to something like `AsyncFlusherDeque` and specify its semantics in a comment on top of the class? E.g. this class doesn't support the usage of the inherited `add()` method because that would break the `size()` functionality. src/hotspot/share/logging/logAsyncFlusher.hpp line 42: > 40: LinkedListDeque() : _tail(NULL), _size(0) {} > 41: void push_back(const E& e) { > 42: if (!_tail) I think the convention is to use curly braces even for one-line blocks (as you've done in your other code). src/hotspot/share/logging/logAsyncFlusher.hpp line 64: > 62: if (h != NULL) { > 63: --_size; > 64: log_drop(h->data()); I'm a little unhappy that some semantics of the Dequeue's basic datatype become visible in the implementation of a basic Dequeue method. E.g. "*log dropping*" is not something common for a Dequeue but for the LogAsyncFlusher. I think it would be better to drop the call to `log_drop()` here and implement this functionality right in `LogAsyncFlusher::enqueue()`. src/hotspot/share/logging/logAsyncFlusher.hpp line 79: > 77: } > 78: > 79: void log_drop(E* e) {} See comment on `pop_front()`. I'd remove this function entirely and handle it right in `LogAsyncFlusher::enqueue()`. src/hotspot/share/logging/logAsyncFlusher.hpp line 95: > 93: : _output(output), _decorators(decorations.get_decorators()), > 94: _level(decorations.get_level()), _tagset(decorations.get_logTagSet()) { > 95: // allow to fail here, then _message is NULL Why do you extract and store the `LogDecorators`, `_level` and `_tagset` separately and re-create the `LogDecorations` in `AsyncLogMessage::writeback()`? Is it to save memory (because `LogDecorators` are much smaller than the `LogDecorations`) at the expense of time for recreating?
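The suggestion to move the drop handling out of `pop_front()` and into the enqueue path can be sketched as follows. This is only an illustration in standard C++: `BoundedLogBuffer` and its members are hypothetical names, `std::deque` stands in for the patch's `LinkedListDeque`, and a plain counter stands in for whatever `log_drop()` actually records.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical stand-in for the flusher's buffer; not the patch's LinkedListDeque.
class BoundedLogBuffer {
 public:
  explicit BoundedLogBuffer(std::size_t capacity)
      : _capacity(capacity), _dropped(0) {
    assert(capacity > 0 && "capacity must be positive");
  }

  // The drop policy lives here, in the logging-specific caller. The container
  // underneath stays a plain deque that knows nothing about log messages.
  void enqueue(const std::string& msg) {
    if (_queue.size() >= _capacity) {
      ++_dropped;          // account for the oldest message before discarding it
      _queue.pop_front();
    }
    _queue.push_back(msg);
  }

  std::size_t size() const { return _queue.size(); }
  std::size_t dropped() const { return _dropped; }
  const std::string& front() const { return _queue.front(); }

 private:
  std::deque<std::string> _queue;
  std::size_t _capacity;
  std::size_t _dropped;
};
```

Keeping the accounting in `enqueue()` means the container's `pop_front()` has no logging side effects, which is the separation the review comment asks for.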
src/hotspot/share/logging/logAsyncFlusher.hpp line 96: > 94: _level(decorations.get_level()), _tagset(decorations.get_logTagSet()) { > 95: // allow to fail here, then _message is NULL > 96: _message = os::strdup(msg, mtLogging); If you think `msg` can't be NULL here please add an assertion, otherwise please handle it. src/hotspot/share/logging/logAsyncFlusher.hpp line 111: > 109: o._message = NULL; // transfer the ownership of _message to this > 110: } > 111: Maybe add an explicit copy assignment operator with a `ShouldNotReachHere` to make sure `AsyncLogMessages` are not assigned unintentionally. src/hotspot/share/logging/logAsyncFlusher.hpp line 116: > 114: bool equals(const AsyncLogMessage& o) const { > 115: return (&_output == &o._output) && (_message == o._message || !strcmp(_message, o._message)); > 116: } [`strcmp()` is not defined for `NULL`](https://en.cppreference.com/w/cpp/string/byte/strcmp) but you can have `_message == NULL` if you've transferred ownership in the copy constructor. src/hotspot/share/logging/logAsyncFlusher.hpp line 124: > 122: > 123: class LogAsyncFlusher : public PeriodicTask { > 124: private: As far as I know, `PeriodicTask` is designed for short-running tasks. But `LogAsyncFlusher::task()` will now call `AsyncLogMessage::writeback()` which does blocking I/O and can block for quite some time (that's why we have this change in the first place :). How does this affect the other periodic tasks and the `WatcherThread`? What's the worst case scenario if the `WatcherThread` is blocked? Is this any better than before? src/hotspot/share/logging/logConfiguration.cpp line 544: > 542: " If set to 0, log rotation is disabled." > 543: " This will cause existing log files to be overwritten."); > 544: out->print_cr(" async=true|false - write asynchronously or not."); Mention the default here, which should be "false".
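On the `PeriodicTask` concern, the alternative raised earlier in this thread is a dedicated writer thread that sleeps until there is something to write. A minimal sketch of that idea, assuming plain standard C++ threads instead of HotSpot's own thread and monitor classes; `AsyncWriter` and all of its members are hypothetical, and appending to a vector stands in for the blocking file I/O:

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for an asynchronous log writer; not HotSpot code.
class AsyncWriter {
 public:
  AsyncWriter() : _stop(false), _worker(&AsyncWriter::run, this) {}
  ~AsyncWriter() { shutdown(); }

  // Called by logging threads: a short critical section and no I/O.
  void enqueue(const std::string& msg) {
    {
      std::lock_guard<std::mutex> guard(_lock);
      _queue.push_back(msg);
    }
    _cv.notify_one();  // wake the writer only when there is work
  }

  // Asks the writer to drain what is left, then joins it. Safe to call twice.
  void shutdown() {
    {
      std::lock_guard<std::mutex> guard(_lock);
      if (_stop) return;
      _stop = true;
    }
    _cv.notify_one();
    _worker.join();
  }

  // Only meaningful after shutdown(); the join orders the writer's stores.
  const std::vector<std::string>& written() const { return _written; }

 private:
  void run() {
    std::unique_lock<std::mutex> guard(_lock);
    for (;;) {
      // Sleep until a message arrives or shutdown is requested (no polling).
      _cv.wait(guard, [this] { return _stop || !_queue.empty(); });
      std::deque<std::string> batch;
      batch.swap(_queue);
      const bool stopping = _stop;
      guard.unlock();
      // The slow part runs without the lock, so enqueue() is never blocked by it.
      for (const std::string& m : batch) _written.push_back(m);
      guard.lock();
      if (stopping && _queue.empty()) return;
    }
  }

  std::mutex _lock;
  std::condition_variable _cv;
  std::deque<std::string> _queue;
  std::vector<std::string> _written;
  bool _stop;
  std::thread _worker;  // declared last so it starts after the other members exist
};
```

The trade-off against a periodic task is one extra thread, in exchange for no fixed flush interval and no blocking I/O on a thread shared with other periodic work.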
src/hotspot/share/logging/logDecorations.cpp line 68: > 66: #undef DECORATOR > 67: > 68: assert(get_decorators() == decorators, "insanity check"); I think this should read "sanity check". src/hotspot/share/logging/logDecorators.hpp line 89: > 87: } > 88: > 89: LogDecorators(const LogDecorators& o) : _decorators(o._decorators) { Why do you need this new copy constructor? src/hotspot/share/logging/logDecorators.hpp line 92: > 90: } > 91: > 92: LogDecorators& operator=(const LogDecorators& rhs) { Why do you need this new assignment operator? src/hotspot/share/logging/logFileOutput.cpp line 50: > 48: : LogFileStreamOutput(NULL), _name(os::strdup_check_oom(name, mtLogging)), > 49: _file_name(NULL), _archive_name(NULL), _current_file(0), > 50: _file_count(DefaultFileCount), _is_default_file_count(true), _async_mode(AsyncLogging), _archive_name_len(0), See comments on `globals.hpp`. No need for an extra option. Make this `false` by default. And can you please also add the `_async_mode` to the following log trace in `LogFileOutput::initialize()`: log_trace(logging)("Initializing logging to file '%s' (filecount: %u" ", filesize: " SIZE_FORMAT " KiB).", _file_name, _file_count, _rotate_size / K); src/hotspot/share/logging/logFileOutput.cpp line 322: > 320: > 321: LogAsyncFlusher* flusher = LogAsyncFlusher::instance(); > 322: if (_async_mode && flusher != NULL) { Why you don't check for `flusher == NULL` in `LogAsyncFlusher::instance()` and call `LogAsyncFlusher::initialize()` in case it is NULL. I think it's no difference where the NULL check is and doing it in `LogAsyncFlusher::instance()` will save you from calling `LogAsyncFlusher::initialize()` in `init_globals()`. Put `LogAsyncFlusher::instance()` into the `if (_async_mode)` block and add an assertion that `flusher != NULL`. 
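The lazy initialization suggested above could look roughly like the following. This is only a sketch under stated assumptions: `FlusherSketch` and `write_sketch` are hypothetical stand-ins for `LogAsyncFlusher` and `LogFileOutput::write()`, and the sketch ignores thread safety, which the real `instance()` would have to address if it can be reached from several threads concurrently.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for LogAsyncFlusher; only the lazy-init shape matters.
class FlusherSketch {
 public:
  // Creates the singleton on first use, so no explicit initialize() call
  // is needed during VM startup. Not thread-safe as written.
  static FlusherSketch* instance() {
    if (_instance == nullptr) {
      _instance = new FlusherSketch();
    }
    return _instance;
  }

  void enqueue(const char* msg) {
    if (msg != nullptr) {
      ++_enqueued;
    }
  }

  size_t enqueued() const { return _enqueued; }

 private:
  FlusherSketch() : _enqueued(0) {}
  static FlusherSketch* _instance;
  size_t _enqueued;
};

FlusherSketch* FlusherSketch::_instance = nullptr;

// Hypothetical write path: the flusher is only looked up in async mode,
// and its existence is asserted rather than tested.
int write_sketch(bool async_mode, const char* msg) {
  if (async_mode) {
    FlusherSketch* flusher = FlusherSketch::instance();
    assert(flusher != nullptr && "flusher must exist in async mode");
    flusher->enqueue(msg);
    return 0;
  }
  return -1;  // synchronous path elided in this sketch
}
```

With this shape, the first asynchronous write creates the flusher, and the separate initialization call in `init_globals()` is no longer needed.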
src/hotspot/share/logging/logFileOutput.cpp line 324: > 322: if (_async_mode && flusher != NULL) { > 323: flusher->enqueue(*this, decorations, msg); > 324: return 0; I think the contract of `LogFileOutput::write()` is not clear. Should this return the number of characters that have actually been written out or the number of characters that have been consumed? For the time being this doesn't seem to be a problem though, because the current callers of `LogFileOutput::write()` don't seem to check the return value anyway. src/hotspot/share/logging/logFileOutput.cpp line 336: > 334: } > 335: > 336: assert(!_async_mode, "AsyncLogging is not supported yet"); Can you please explain in which circumstances `LogFileOutput::write(LogMessageBuffer::Iterator msg_iterator)` will be called and why it is not necessary to support `_async_mode` here? src/hotspot/share/runtime/init.cpp line 126: > 124: return status; > 125: > 126: LogAsyncFlusher::initialize(); I don't think this is required here. See my comment on `LogFileOutput::write()`. Just do the initialization in `LogAsyncFlusher::instance()` when it is called for the first time (i.e. while `LogAsyncFlusher::_instance` is still NULL). test/hotspot/gtest/logging/test_asynclog.cpp line 168: > 166: EXPECT_FALSE(file_contains_substring(TestLogFileName, "log_trace-test")); // trace message is masked out > 167: EXPECT_TRUE(file_contains_substring(TestLogFileName, "log_debug-test")); > 168: } Should have a newline at the end. ------------- Changes requested by simonis (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/3135 From simonis at openjdk.java.net Fri Mar 26 10:05:32 2021 From: simonis at openjdk.java.net (Volker Simonis) Date: Fri, 26 Mar 2021 10:05:32 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com> Message-ID: On Fri, 26 Mar 2021 07:18:34 GMT, Yasumasa Suenaga wrote: >> It's possible that a Java process have multiple file-based outputs. A global option `AsyncLogging` can set them all Otherwise, developers have to set async=true individually. It's part of CSR, right? > > Yes, it's part of CSR. > > I think it is prefer to set `async` to false by default because it should be treated same with `file` / `filecount` / `filesize` on the logger. The user should add `async=true` to the logger what the user want to set to. Agreed. No need for an extra option. Handle it like other logging framework defaults (e.g. `DefaultFileCount` or `DefaultFileSize`) ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From simonis at openjdk.java.net Fri Mar 26 10:05:33 2021 From: simonis at openjdk.java.net (Volker Simonis) Date: Fri, 26 Mar 2021 10:05:33 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com> References: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com> Message-ID: <-tEJ2xzQNVtH8Lc7twtzbSVmWvMAne88K3Dm9zQuNZ4=.063dbfa6-3c9a-4d8b-acde-308e35b61c70@github.com> On Thu, 25 Mar 2021 12:21:44 GMT, Yasumasa Suenaga wrote: >> This patch provides a buffer to store asynchrounous messages and flush them to >> underlying files periodically. 
> > src/hotspot/share/runtime/globals.hpp line 2036: > >> 2034: "Enble asynchronous GC logging") \ >> 2035: \ >> 2036: product(size_t, GCLogBufferSize, 2*K, \ > > This PR is for UL, not only GC log. So it should be renamed. Agree, this should be something like "AsyncLoggingBufferSize". ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From sjohanss at openjdk.java.net Fri Mar 26 11:07:30 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Fri, 26 Mar 2021 11:07:30 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v4] In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 09:00:37 GMT, Thomas Stuefe wrote: >> Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: >> >> Self review. >> >> Update helper name to better match commit_memory_special(). > > src/hotspot/os/linux/os_linux.cpp line 3932: > >> 3930: size_t page_size, >> 3931: char* req_addr, >> 3932: bool exec) { > > I'd prefer if this were file scope static and not exported (don't think this needs anything from the os::Linux namespace, or?). > > Also, mid term we could probably merge this with commit_memory. AFAICS the only differences to the latter is that this can do huge pages, and the error handling is a bit different. > > I am also not sure the return type makes a lot of sense. Either we handle commit errors inside this function, then we should return void. Or, we return a boolean and handle it in the caller. > > Long term I would prefer to handle allocation errors not by aborting but by leaving it up to the caller what to do. Because it makes sense to have the option to "reserve if we get large pages, otherwise just use small pages". Eg this I would like to do in a future Metaspace. That was my first plan as well, but it uses `os::Linux::hugetlbfs_page_size_flag()` which in turn depends on `os::Linux::default_large_page_size()`. So I went with this. 
I agree that we long term should look at using/merging `commit_memory()` but didn't want to change too much functionality at once here. What we get now is very much like what we got before, but with all large pages at the beginning of the mapping. I also agree that now when this is really just a commit function, returning the address is a bit strange. I don't see a good way to handle potential failures in here so I opted for returning a `bool`. In some sense it is left to the caller (`ReservedSpace`) right now. If using large pages fails, we will try using small pages instead. But not sure if you would want this logic to move to another layer. ------------- PR: https://git.openjdk.java.net/jdk/pull/3073 From sjohanss at openjdk.java.net Fri Mar 26 11:22:27 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Fri, 26 Mar 2021 11:22:27 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v4] In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 09:04:06 GMT, Thomas Stuefe wrote: >> Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: >> >> Self review. >> >> Update helper name to better match commit_memory_special(). > > src/hotspot/os/linux/os_linux.cpp line 3965: > >> 3963: size_t alignment, >> 3964: char* req_addr, >> 3965: bool exec) { > > So the contract is that this function will allocate huge paged memory with whatever page size is the default (controlled with UseLargePages and LargePageSizeInBytes). > > And with Markus future changes we will mix-and-match page sizes best as we can. So, control of page size is out of the hands of the caller. Are there callers misusing alignment for page size? > > Otherwise this seems fine to me. Yes, right now we will only use a single large page size at a time. But using more than one (after Marcus' change) needs to take the alignment into account.
As you know I have plans to do more changes in this area so that we can pass down what page size we want for a given reservation. But if we want to apply that change before everything else is in place I will suggest to use the alignment as an upper limit for the page size. A mixed mapping would still only use one large page size and then fill up with small pages. This is not optimal, but trying to mix and match even more will lead to a lot more error handling and if doing that it should be a separate change. For cases where we only expect to use large pages (the old only case) the alignment is always set to `large_page_size` or larger. ------------- PR: https://git.openjdk.java.net/jdk/pull/3073 From sjohanss at openjdk.java.net Fri Mar 26 11:27:40 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Fri, 26 Mar 2021 11:27:40 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v5] In-Reply-To: References: Message-ID: > Please review this refactoring of the hugetlbfs reservation code. > > **Summary** > In recent adventures in this area of the code I noticed a strange condition in `reserve_memory_special_huge_tlbfs` where we take the "mixed-mapping" route even if the size doesn't require any small pages to be used: > if (is_aligned(bytes, os::large_page_size()) && alignment <= os::large_page_size()) { > return reserve_memory_special_huge_tlbfs_only(bytes, req_addr, exec); > } else { > return reserve_memory_special_huge_tlbfs_mixed(bytes, alignment, req_addr, exec); > } > > The second condition here is needed because if the alignment is larger than the large page size, we needed to enforce this and can't just trust `mmap` to give us a properly aligned address. Doing this by using the mixed-function feels a bit weird and looking a bit more at this I found a way to refactor this function to avoid having the two helpers. > > Instead of only having the mixed path honor the passed down alignment, make sure that is always done. 
This will also have the side-effect that all large pages in a "mixed"-mapping will be at the start and then we will have a tail of small pages. This actually also ensures that we will use large pages for a mixed mapping, in the past there was a corner case where we could end up with just a head and tail of small pages and no large page in between (if the mapping was smaller than 2 large pages and there was no alignment constraint). > > **Testing** > Mach5 tier1-3 and a lot of local testing with different large page configurations. Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: Thomas review. Changed commit_memory_special to return bool to signal if the request succeeded or not. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3073/files - new: https://git.openjdk.java.net/jdk/pull/3073/files/f70ca6a3..787b87fe Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3073&range=04 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3073&range=03-04 Stats: 17 lines in 2 files changed: 1 ins; 0 del; 16 mod Patch: https://git.openjdk.java.net/jdk/pull/3073.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3073/head:pull/3073 PR: https://git.openjdk.java.net/jdk/pull/3073 From sjohanss at openjdk.java.net Fri Mar 26 11:27:40 2021 From: sjohanss at openjdk.java.net (Stefan Johansson) Date: Fri, 26 Mar 2021 11:27:40 GMT Subject: RFR: 8262291: Refactor reserve_memory_special_huge_tlbfs [v4] In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 16:01:36 GMT, Thomas Stuefe wrote: >> Stefan Johansson has updated the pull request incrementally with one additional commit since the last revision: >> >> Self review. >> >> Update helper name to better match commit_memory_special(). > > Hi Stefan, > > this is a welcome cleanup! > > Remarks inline. 
> > Cheers, Thomas Thanks for reviewing @tstuefe, if you are ok with these changes I will push this tonight as I will be on parental leave for the coming two weeks. ------------- PR: https://git.openjdk.java.net/jdk/pull/3073 From iwalulya at openjdk.java.net Fri Mar 26 11:46:26 2021 From: iwalulya at openjdk.java.net (Ivan Walulya) Date: Fri, 26 Mar 2021 11:46:26 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v2] In-Reply-To: References: Message-ID: On Tue, 16 Mar 2021 09:06:38 GMT, Man Cao wrote: >> Hi all, >> >> Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. >> >> The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. >> >> -Man > > Man Cao has updated the pull request incrementally with one additional commit since the last revision: > > Address comment and add a gtest. src/hotspot/share/utilities/lockFreeQueue.hpp line 33: > 31: > 32: // The LockFreeQueue template provides a lock-free FIFO. Its structure > 33: // and usage is similar to LockFreeStack. It has inner paddings, and probably need to add the conditional critical sections to the LockFreeStack for this description to be correct. But that can be done in a separate PR. src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 33: > 31: #include "utilities/lockFreeQueue.hpp" > 32: #include "logging/log.hpp" > 33: Don't we need inline specifiers for the functions below? src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 108: > 106: // returns released objects to a free list for reuse, it could cause > 107: // excessive allocations. > 108: GlobalCounter::ConditionalCriticalSection cs(use_rcu ? 
` GlobalCounter::ConditionalCriticalSection cs(Thread::current());` should be fine, not sure how much is gained by skipping the `Thread::current()` call. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From stefank at openjdk.java.net Fri Mar 26 11:59:31 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Fri, 26 Mar 2021 11:59:31 GMT Subject: RFR: 8264268: Don't use oop types for derived pointers Message-ID: <2UO90DoNZ6JCfM5sT-N5Wow2vHpk9lOdtlgqHY5ix6A=.662b95a7-977b-4fef-94c6-fa5b08c18733@github.com> The JIT compiler embeds pointers to addresses within an object. These are called derived pointers. When the GC moves objects, these pointers need to be updated explicitly, because the GC only deals with the real oops of the objects (base pointer). The code that deals with this uses oop* for the address containing the base pointer. This is fine, the address contains an oop. However, it also uses oop* for the interior pointer, even though the contents is not a valid oop. This creates temporary oops that does not conform to the normal requirements for oops. For example, the lower three bits could be set. This makes it problematic to write stricter verification code. I propose that we use intptr_t* instead of oop*, and only use oop* when the location is known to contain a valid oop. 
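The idea can be sketched as follows. This is a simplified illustration, not the code from the PR: `DerivedEntry`, `record_derived` and `update_derived` are hypothetical names, and for self-containment the base slot is also modelled as `intptr_t` here, whereas in HotSpot the base location would keep its `oop*` type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bookkeeping record for one derived pointer. The interior
// address is handled as a plain integer, so no invalid oop is ever created.
struct DerivedEntry {
  intptr_t* base_loc;     // slot holding the base (real object) address
  intptr_t* derived_loc;  // slot holding the interior (derived) address
  intptr_t  offset;       // derived - base, recorded before objects move
};

// Before the GC moves objects: remember how far the derived pointer
// lies inside its base object.
inline DerivedEntry record_derived(intptr_t* base_loc, intptr_t* derived_loc) {
  return DerivedEntry{base_loc, derived_loc, *derived_loc - *base_loc};
}

// After the GC has updated the base slot: recompute the derived slot
// from the new base address plus the recorded offset.
inline void update_derived(const DerivedEntry& e) {
  *e.derived_loc = *e.base_loc + e.offset;
}
```

Because the derived slot is only ever read and written as an integer, an interior address with low bits set (which could not be a valid oop) never passes through an `oop`-typed value.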
------------- Commit messages: - 8264268: Don't use oop types for derived pointers Changes: https://git.openjdk.java.net/jdk/pull/3214/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3214&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264268 Stats: 51 lines in 5 files changed: 7 ins; 6 del; 38 mod Patch: https://git.openjdk.java.net/jdk/pull/3214.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3214/head:pull/3214 PR: https://git.openjdk.java.net/jdk/pull/3214 From stefank at openjdk.java.net Fri Mar 26 12:07:36 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Fri, 26 Mar 2021 12:07:36 GMT Subject: RFR: 8264271: Avoid creating non_oop_word oops Message-ID: Some parts of the JVM puts an marker to show that a location does not contain a valid oop. The code that handles this typically look like this: oop* p = ... if (*p != Universe::non_oop_word()) This means that sometimes the *p will create an oop that contains the non_oop_word. This makes it problematic to add stricter oop verification. I propose that we add a new function that checks the value of locations without converting it to an oop. (Note: I'm testing the new dependent pull Skara feature with this PR. 
It depends on the pr/3214 branch) ------------- Depends on: https://git.openjdk.java.net/jdk/pull/3214 Commit messages: - 8264271: Avoid creating non_oop_word oops Changes: https://git.openjdk.java.net/jdk/pull/3215/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3215&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264271 Stats: 58 lines in 7 files changed: 38 ins; 10 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/3215.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3215/head:pull/3215 PR: https://git.openjdk.java.net/jdk/pull/3215 From lutz.schmidt at sap.com Fri Mar 26 12:23:41 2021 From: lutz.schmidt at sap.com (Schmidt, Lutz) Date: Fri, 26 Mar 2021 12:23:41 +0000 Subject: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line [v2] Message-ID: Hi, just stumbled over this conversation. If I got it right, the basic issue is that ContendedPaddingWidth may change its value after having been used by AOT code in the early stages of initialization. Wouldn't it be a pragmatic solution to make ContendedPaddingWidth platform-dependent? It could then be set to 128 for CPUs where the D-cache line size does not exceed 128, and to the D-cache line size for other CPUs. Side question: Is there any other CPU except s390 which has a D-cache line size greater than 128? Just curious. Thanks, Lutz On 26.03.21, 10:47, "hotspot-dev on behalf of Pengfei Li" wrote: On Thu, 25 Mar 2021 07:42:42 GMT, David Holmes wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Restore Red Hat copyright line > > Marked as reviewed by dholmes (Reviewer). > > I must confess I don't like this solution at all: it sounds very delicate. Couldn't you define a function `VM_Version::get_ContendedPaddingWidth()` and call that? > > Hi Andrew, I agree that tuning the initialization order is a tricky fix to some extent.
But I don't think it's a trivial work to define a function to get the final value of `ContendedPaddingWidth` before `VM_Version::initialize()`. As the value depends on CPU dcache line size, the problem is that the way querying that CPU size info varies significantly on different platforms. In current implementation, we run a few assembly code on AArch64. But on some other architectures, typically ppc and s390, we emit much more code into a code buffer (and thus depends on `CodeCache::initialize()`). And on Windows, we need to call some Windows API to retrieve processor info. If we do too much in the newly defined function, it would be no much difference from moving `VM_Version::initialize()` before `AOTLoader::initialize()`. @theRealAph Just wondering if you have other solutions or advice? ------------- PR: https://git.openjdk.java.net/jdk/pull/3169 From aph at openjdk.java.net Fri Mar 26 13:07:27 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 26 Mar 2021 13:07:27 GMT Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line [v2] In-Reply-To: References: <3-v1OHDwR3s9YqkEDvycRjVnZefpIhpXyExvOhSRkaQ=.4cd1a348-e6b3-47ac-b5e3-52708bc587e5@github.com> Message-ID: On Fri, 26 Mar 2021 09:44:19 GMT, Pengfei Li wrote: >> Marked as reviewed by dholmes (Reviewer). > >> > I must confess I don't like this solution at all: it sounds very delicate. Couldn't you define a function `VM_Version::get_ContendedPaddingWidth()`and call that? >> >> Hi Andrew, I agree that tuning the initialization order is a tricky fix to some extent. But I don't think it's a trivial work to define a function to get the final value of `ContendedPaddingWidth` before `VM_Version::initialize()`. As the value depends on CPU dcache line size, the problem is that the way querying that CPU size info varies significantly on different platforms. In current implementation, we run a few assembly code on AArch64. 
But on some other architectures, typically ppc and s390, we emit much more code into a code buffer (and thus depends on `CodeCache::initialize()`). And on Windows, we need to call some Windows API to retrieve processor info. If we do too much in the newly defined function, it would be no much difference from moving `VM_Version::initialize()` before `AOTLoader::initialize()`. > > @theRealAph Just wondering if you have other solutions or advice? > Vladimir has indicated there is no issue with changing the order as you > have done so that is fine as far as I am concerned. OK, then. >> src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 3: >> >>> 1: /* >>> 2: * Copyright (c) 1997, 2021, Oracle and/or its affiliates. All rights reserved. >>> 3: * Copyright (c) 2015, 2021, Red Hat Inc. All rights reserved. >> >> Please restore this copyright line unless instructed to update it by Red Hat. > > Reverted, thanks. > > I must confess I don't like this solution at all: it sounds very delicate. Couldn't you define a function `VM_Version::get_ContendedPaddingWidth()`and call that? > > Hi Andrew, I agree that tuning the initialization order is a tricky fix to some extent. But I don't think it's a trivial work to define a function to get the final value of `ContendedPaddingWidth` before `VM_Version::initialize()`. As the value depends on CPU dcache line size, the problem is that the way querying that CPU size info varies significantly on different platforms. In current implementation, we run a few assembly code on AArch64. But on some other architectures, typically ppc and s390, we emit much more code into a code buffer (and thus depends on `CodeCache::initialize()`). And on Windows, we need to call some Windows API to retrieve processor info. If we do too much in the newly defined function, it would be no much difference from moving `VM_Version::initialize()` before `AOTLoader::initialize()`. 
So why not, on the platforms where it's a pain, define a function `VM_Version::get_ContendedPaddingWidth()` that returns the largest value it could be on that platform? On platforms where it's easy to find out, return the true value. ------------- PR: https://git.openjdk.java.net/jdk/pull/3169 From aph at openjdk.java.net Fri Mar 26 13:07:27 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Fri, 26 Mar 2021 13:07:27 GMT Subject: RFR: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line [v2] In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 09:06:07 GMT, Pengfei Li wrote: >> Recently we tested OpenJDK on some CPUs with 256-byte dcache line size. >> HotSpot AOT tests failed because the shared library compiled with the >> same VM options on the same machine are skipped when loaded back. >> >> Below command sequence shows a simple way to reproduce this issue. >> >> $ getconf -a | grep LEVEL1_DCACHE_LINESIZE >> LEVEL1_DCACHE_LINESIZE 256 >> >> $ jaotc --output a.so Hello.class >> >> $ java -XX:+UnlockExperimentalVMOptions -XX:+UseAOT -XX:AOTLibrary=./a.so -XX:+PrintAOT Hello >> Shared file ./a.so error: ContendedPaddingWidth has different value '256' from current '128' >> 4 1 skipped ./a.so aot library >> >> The default value of VM option `ContendedPaddingWidth` is 128. But on CPUs >> with L1 dcache line size larger than 128 bytes, the value is adjusted to >> the cache line size in `VM_Version_init()`. This adjustment is done after >> AOT library loading in `codeCache_init()`. So the AOT lib verifier still >> assumes the `ContendedPaddingWidth` in the compiled library should be 128 >> and thus causes the loaded library skipped. >> >> In my proposed fix, `AOTLoader::initialize()` is moved out of the general >> codecache initialization and placed after `VM_Version_init()`. 
The order >> of `codeCache_init()` and `VM_Version_init()` is not changed since there may >> be code emitted during `VM_Version_init()`, which depends on the general >> codecache init. >> >> Tested `hotspot::hotspot_all_no_apps`, `jdk::jdk_core` and `langtools::tier1`. > > Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: > > Restore Red Hat copyright line Marked as reviewed by aph (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3169 From coleenp at openjdk.java.net Fri Mar 26 13:17:25 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 26 Mar 2021 13:17:25 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v4] In-Reply-To: References: Message-ID: On Fri, 26 Mar 2021 03:05:15 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> fix broken logic in systemDictionaryShared. > > LGTM. Thanks for reviewing. ------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From coleenp at openjdk.java.net Fri Mar 26 13:17:25 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 26 Mar 2021 13:17:25 GMT Subject: RFR: 8264126: Remove TRAPS/THREAD parameter for class loading functions [v4] In-Reply-To: References: Message-ID: On Fri, 26 Mar 2021 03:05:07 GMT, David Holmes wrote: >> That's fine. In debug build, we will assert both before/after this change. In product build, after this change, some duplicated info will be added via info->record_linking_constraint but it won't hurt. > > Okay. Thanks for clarifying. Thanks for looking carefully at this. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From coleenp at openjdk.java.net Fri Mar 26 13:17:26 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 26 Mar 2021 13:17:26 GMT Subject: Integrated: 8264126: Remove TRAPS/THREAD parameter for class loading functions In-Reply-To: References: Message-ID: <14xBpgR29mv50OXEH9Padl5b-8ZQSDVcK2ngox7De9s=.faaeb398-8aeb-4b60-bb6a-fe0679e5eba6@github.com> On Wed, 24 Mar 2021 16:20:30 GMT, Coleen Phillimore wrote: > find_constrained_instance_or_array_klass only passes THREAD so that it can be used in a MutexLocker for SystemDictionary_lock. This can use the MutexLocker that gets Thread::current() without any harm to performance. > > The other functions add_loader_constraint, record_linking_constraints, and check_signature_loaders fall out from that. > > check_signature_loaders should throw an exception but it unfortunately makes the caller construct the exception message so it doesn't. > > Also: is_shared_class_visible{_impl} > > Tested with tier1 on 4 Oracle platforms (in progress) This pull request has now been integrated. 
Changeset: 507b690f Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/507b690f Stats: 60 lines in 11 files changed: 2 ins; 6 del; 52 mod 8264126: Remove TRAPS/THREAD parameter for class loading functions Reviewed-by: ccheung, iklam, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/3176 From lucy at openjdk.java.net Fri Mar 26 14:07:52 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Fri, 26 Mar 2021 14:07:52 GMT Subject: RFR: 8264173: [s390] Improve Hardware Feature Detection And Reporting [v2] In-Reply-To: <-OjFHEcBr4ajS6JQWPsPHXm2w8MNQ5b028UlabrDv84=.174c9149-ca67-4929-a3b5-0bc6f561df5e@github.com> References: <-OjFHEcBr4ajS6JQWPsPHXm2w8MNQ5b028UlabrDv84=.174c9149-ca67-4929-a3b5-0bc6f561df5e@github.com> Message-ID: > This enhancement is intended to improve the hardware feature detection and reporting, in particular for more recently introduced hardware. The enhancement is a prerequisite for possible future feature exploitation. > > Reviews are highly welcome and appreciated. 
Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: update copyright headers ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3196/files - new: https://git.openjdk.java.net/jdk/pull/3196/files/3f524240..d15d1157 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3196&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3196&range=00-01 Stats: 7 lines in 4 files changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/3196.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3196/head:pull/3196 PR: https://git.openjdk.java.net/jdk/pull/3196 From iklam at openjdk.java.net Fri Mar 26 17:01:34 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Fri, 26 Mar 2021 17:01:34 GMT Subject: RFR: 8264285: Do not support FLAG_SET_XXX for VM flags of string type Message-ID: We have two versions of `JVMFlagAccess::ccstrAtPut()` that are slightly different. The following version is supposed to be used only by the `FLAG_SET_{CMDLINE,ERGO,MGMT}` macros. However, it's not used anywhere in the HotSpot source code: JVMFlag::Error JVMFlagAccess::ccstrAtPut(JVMFlagsEnum flag, ccstr value, JVMFlagOrigin origin) { JVMFlag* faddr = JVMFlag::flag_from_enum(flag); assert(faddr->is_ccstr(), "wrong flag type"); ccstr old_value = faddr->get_ccstr(); trace_flag_changed(faddr, old_value, value, origin); char* new_value = os::strdup_check_oom(value); faddr->set_ccstr(new_value); if (!faddr->is_default() && old_value != NULL) { // Prior value is heap allocated so free it. FREE_C_HEAP_ARRAY(char, old_value); } faddr->set_origin(origin); return JVMFlag::SUCCESS; } It's not clear whether this unused version is actually correct since the last JVMFlag rewrite in [JDK-8081833](https://bugs.openjdk.java.net/browse/JDK-8081833), due to complete lack of testing. Let's remove this version to simplify code maintenance. 
If you need to modify flags of the string type, do not use `FLAG_SET_{CMDLINE,ERGO,MGMT}`. (A `static_assert` is added to prevent this). Instead, use the remaining version of `JVMFlagAccess::ccstrAtPut()`. ------------- Commit messages: - 8264285: Do not support FLAG_SET_XXX for VM flags of string type Changes: https://git.openjdk.java.net/jdk/pull/3219/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3219&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264285 Stats: 33 lines in 3 files changed: 7 ins; 17 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/3219.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3219/head:pull/3219 PR: https://git.openjdk.java.net/jdk/pull/3219 From xliu at openjdk.java.net Fri Mar 26 19:08:28 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Fri, 26 Mar 2021 19:08:28 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com> Message-ID: On Fri, 26 Mar 2021 08:37:41 GMT, Volker Simonis wrote: >> Yes, it's part of CSR. >> >> I think it is prefer to set `async` to false by default because it should be treated same with `file` / `filecount` / `filesize` on the logger. The user should add `async=true` to the logger what the user want to set to. > > Agreed. No need for an extra option. Handle it like other logging framework defaults (e.g. 
`DefaultFileCount` or `DefaultFileSize`) > > You also have to check that `LogAsyncInterval` is zero modulo 10 otherwise you'll get: > $ ./images/jdk/bin/java -XX:LogAsyncInterval=13 -Xlog:gc:file=/tmp/class.log::async=true -version > # To suppress the following error report, specify this argument > # after -XX: or in .hotspotrc: SuppressErrorAt=/task.cpp:73 > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/priv/simonisv/OpenJDK/Git/jdk/src/hotspot/share/runtime/task.cpp:73), pid=21168, tid=21169 > # assert(_interval >= PeriodicTask::min_interval && _interval % PeriodicTask::interval_gran == 0) failed: improper PeriodicTask interval time > # > # JRE version: (17.0) (slowdebug build ) > # Java VM: OpenJDK 64-Bit Server VM (slowdebug 17-internal+0-adhoc.simonisv.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # Problematic frame: > # V [libjvm.so+0x1244301] PeriodicTask::PeriodicTask(unsigned long)+0x83 > # > # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /priv/simonisv/output/jdk-dbg/hs_err_pid21168.log > # > # > Aborted oh, I don't know this constraint. I will add a constraint check for the option. ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From xliu at openjdk.java.net Fri Mar 26 19:29:28 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Fri, 26 Mar 2021 19:29:28 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: Message-ID: <5Wk-7de3G5IG2XOPsckXFSEnXCb_iyzHgabZ563VI5o=.9714b6c8-a8de-443b-864c-2633ee81817b@github.com> On Thu, 25 Mar 2021 19:37:25 GMT, Volker Simonis wrote: >> This patch provides a buffer to store asynchrounous messages and flush them to >> underlying files periodically. 
> > src/hotspot/share/logging/logAsyncFlusher.hpp line 64: > >> 62: if (h != NULL) { >> 63: --_size; >> 64: log_drop(h->data()); > > I'm a little unhappy that some semantics of the Dequeues basic datatype become visible in the implementation of a basic Dequeue method. E.g. "*log dropping*" is not something common for a Dequeue but for the LogAsyncFlusher. I think it would be better to drop the call to `log_drop()` here and implement this functionality right in `LogAsyncFlusher::enqueue()`. ACK. I would like to have a general-purpose linked-list deque here. I will move the log_drop() logic to LogAsyncFlusher. > src/hotspot/share/logging/logAsyncFlusher.hpp line 95: > >> 93: : _output(output), _decorators(decorations.get_decorators()), >> 94: _level(decorations.get_level()), _tagset(decorations.get_logTagSet()) { >> 95: // allow to fail here, then _message is NULL > > Why do you extract and store the `LogDecorators`, `_level` and `_tagset` set separately and re-create the `LogDecorations` in `AsyncLogMessage::writeback()`? Is it to save memory (because `LogDecorators` are much smaller than the `LogDecorations`) at the expense of time for recreating? Saving memory is my intention. To keep things simple, I used to copy the objects directly. Then I found that `LogDecorations` contains a char array (256 bytes). This information can be compressed into a uint mask, which is `LogDecorators`. Because AsyncLogMessage is the payload, and HotSpot stores thousands of them in the buffer, size matters. > src/hotspot/share/logging/logFileOutput.cpp line 336: > >> 334: } >> 335: >> 336: assert(!_async_mode, "AsyncLogging is not supported yet"); > > Can you please explain in which circumstances `LogFileOutput::write(LogMessageBuffer::Iterator msg_iterator)` will be called and why it is not necessary to support `_async_mode` here? > > Looks like this method is used e.g.
when doing a class loading log (`-Xlog:class+load`) and will now result in a hard crash in debug builds which is certainly not appropriate: > $ ./images/jdk/bin/java -Xlog:class+load:file=/tmp/class.log::async=true -version > # To suppress the following error report, specify this argument > # after -XX: or in .hotspotrc: SuppressErrorAt=/logFileOutput.cpp:336 > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/priv/simonisv/OpenJDK/Git/jdk/src/hotspot/share/logging/logFileOutput.cpp:336), pid=20491, tid=20492 > # assert(!_async_mode) failed: AsyncLogging is not supported yet > # > # JRE version: (17.0) (slowdebug build ) > # Java VM: OpenJDK 64-Bit Server VM (slowdebug 17-internal+0-adhoc.simonisv.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # Problematic frame: > # V [libjvm.so+0xde5ae4] LogFileOutput::write(LogMessageBuffer::Iterator)+0x3c > # > # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again > # > # An error report file with more information is saved as: > # /priv/simonisv/output/jdk-dbg/hs_err_pid20491.log > # > # > Aborted I overlook it. Sorry. I didn't notice the logging construct `LogMessage`. I've fixed this issue. I will update it soon. ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From erikj at openjdk.java.net Fri Mar 26 19:34:25 2021 From: erikj at openjdk.java.net (Erik Joelsson) Date: Fri, 26 Mar 2021 19:34:25 GMT Subject: RFR: 8264240: [macos_aarch64] enable appcds support after JDK-8263002 In-Reply-To: References: Message-ID: <9Omimx03OQ8un-iKii6PJNQEiBe1SmGnWrlcR3LSvx0=.2feee97b-728c-4a1a-b62f-a0474ab44dc1@github.com> On Fri, 26 Mar 2021 17:18:22 GMT, Vladimir Kempik wrote: > Please review this small patch for macos_aarch64. > It reverts small part of jep-391 where we disabled cds for macos_aarch64. 
> After JDK-8263002 is fixed, the appcds can be enabled back on macos_aarch64. > CDS related tests in tier1 now pass Marked as reviewed by erikj (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3221 From vkempik at openjdk.java.net Fri Mar 26 19:53:26 2021 From: vkempik at openjdk.java.net (Vladimir Kempik) Date: Fri, 26 Mar 2021 19:53:26 GMT Subject: Integrated: 8264240: [macos_aarch64] enable appcds support after JDK-8263002 In-Reply-To: References: Message-ID: On Fri, 26 Mar 2021 17:18:22 GMT, Vladimir Kempik wrote: > Please review this small patch for macos_aarch64. > It reverts small part of jep-391 where we disabled cds for macos_aarch64. > After JDK-8263002 is fixed, the appcds can be enabled back on macos_aarch64. > CDS related tests in tier1 now pass This pull request has now been integrated. Changeset: d6bb1537 Author: Vladimir Kempik URL: https://git.openjdk.java.net/jdk/commit/d6bb1537 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod 8264240: [macos_aarch64] enable appcds support after JDK-8263002 Reviewed-by: erikj ------------- PR: https://git.openjdk.java.net/jdk/pull/3221 From kvn at openjdk.java.net Fri Mar 26 22:32:33 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Fri, 26 Mar 2021 22:32:33 GMT Subject: RFR: 8263582: WB_IsMethodCompilable ignores compiler directives [v2] In-Reply-To: References: <14AlrpsyzREqo4XE-uDRzIsIeLnLcRzB5SIC9UU66Oc=.9d544536-3d41-40e7-a888-1bfd3484f8f5@github.com> Message-ID: On Fri, 26 Mar 2021 08:55:39 GMT, Christian Hagedorn wrote: >> src/hotspot/share/prims/whitebox.cpp line 869: >> >>> 867: // Both compilers could have ExcludeOption set. Check all combinations. 
>>> 868: bool excluded_c1 = is_excluded_for_compiler(CompileBroker::compiler1(), mh); >>> 869: bool excluded_c2 = is_excluded_for_compiler(CompileBroker::compiler2(), mh); >> >> May be use next instead as we do in `WhiteBox::compile_method` at line #992: >> *comp = CompileBroker::compiler(comp_level);``` > > The problem is that `CompileBroker::compiler()` returns `NULL` for `CompLevel_any`. And even if it returned one compiler, I also need to check the other one to decide if the method is completly non-compilable. That's why I added this additional logic for `CompLevel_any`. I thought `exclude` command does not specify which compilation level (and corresponding compiler) is disabled - it disables all compilations. But may be it is not true for directives. @neliasso, please, correct me if I am wrong. ------------- PR: https://git.openjdk.java.net/jdk/pull/3195 From coleenp at openjdk.java.net Fri Mar 26 23:08:37 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 26 Mar 2021 23:08:37 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread Message-ID: This change creates a Metaspace::allocate function that doesn't pass TRAPS to be used by MethodCounters. TRAPS and exceptions shouldn't be thrown from non-JavaThreads. Tested with tier1-7. ------------- Commit messages: - Merge branch 'master' into metaspace - Comment updates. - Add Metaspace::allocate for non-java threads that return null. - Add Metaspace::allocate for non-java threads that return null. 
Changes: https://git.openjdk.java.net/jdk/pull/3207/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3207&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264149 Stats: 89 lines in 10 files changed: 42 ins; 13 del; 34 mod Patch: https://git.openjdk.java.net/jdk/pull/3207.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3207/head:pull/3207 PR: https://git.openjdk.java.net/jdk/pull/3207 From dholmes at openjdk.java.net Fri Mar 26 23:08:38 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Fri, 26 Mar 2021 23:08:38 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread In-Reply-To: References: Message-ID: <7vek4vBeq_m6kHjh6V5999ZsZYJ-SAvTg3LCtajlBrk=.25130296-3ca8-4437-9528-2a38447569db@github.com> On Thu, 25 Mar 2021 21:47:46 GMT, Coleen Phillimore wrote: > This change creates a Metaspace::allocate function that doesn't pass TRAPS to be used by MethodCounters. TRAPS and exceptions shouldn't be thrown from non-JavaThreads. > > Tested with tier1-7. src/hotspot/share/memory/metaspace.cpp line 816: > 814: } > 815: > 816: return result; Shouldn't we still try to find more memory by calling Universe::heap()->satisfy_failed_metadata_allocation before giving up? Or can that not work if called from the VMThread? src/hotspot/share/memory/metaspace.hpp line 127: > 125: MetaspaceObj::Type type, TRAPS); > 126: > 127: // Nothrow version of allocate which can be called by a non-Java thread. "Nothrow" is a C++ concept. I would just say "Non-TRAPS version ... Returns NULL on failure." 
src/hotspot/share/oops/method.cpp line 571: > 569: CompileBroker::log_metaspace_failure(); > 570: ClassLoaderDataGraph::set_metaspace_oom(true); > 571: return NULL; // return the exception (which is cleared) Comment needs updating ------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From coleenp at openjdk.java.net Fri Mar 26 23:08:40 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 26 Mar 2021 23:08:40 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread In-Reply-To: <7vek4vBeq_m6kHjh6V5999ZsZYJ-SAvTg3LCtajlBrk=.25130296-3ca8-4437-9528-2a38447569db@github.com> References: <7vek4vBeq_m6kHjh6V5999ZsZYJ-SAvTg3LCtajlBrk=.25130296-3ca8-4437-9528-2a38447569db@github.com> Message-ID: On Thu, 25 Mar 2021 22:14:37 GMT, David Holmes wrote: >> This change creates a Metaspace::allocate function that doesn't pass TRAPS to be used by MethodCounters. TRAPS and exceptions shouldn't be thrown from non-JavaThreads. >> >> Tested with tier1-7. > > src/hotspot/share/memory/metaspace.cpp line 816: > >> 814: } >> 815: >> 816: return result; > > Shouldn't we still try to find more memory by calling Universe::heap()->satisfy_failed_metadata_allocation before giving up? Or can that not work if called from the VMThread? It can't work from the VMThread. Patricio and I were chatting yesterday and he pointed out neither of these VM operations can nest (VM_ChangeBreakpoints and VM_MetaspaceGC) making up names but you get the point. > src/hotspot/share/memory/metaspace.hpp line 127: > >> 125: MetaspaceObj::Type type, TRAPS); >> 126: >> 127: // Nothrow version of allocate which can be called by a non-Java thread. > > "Nothrow" is a C++ concept. I would just say "Non-TRAPS version ... Returns NULL on failure." Yes, true. Particularly since it had to be declared with throws(). 
> src/hotspot/share/oops/method.cpp line 571: > >> 569: CompileBroker::log_metaspace_failure(); >> 570: ClassLoaderDataGraph::set_metaspace_oom(true); >> 571: return NULL; // return the exception (which is cleared) > > Comment needs updating got it. ------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From coleenp at openjdk.java.net Fri Mar 26 23:08:40 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Fri, 26 Mar 2021 23:08:40 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread In-Reply-To: References: <7vek4vBeq_m6kHjh6V5999ZsZYJ-SAvTg3LCtajlBrk=.25130296-3ca8-4437-9528-2a38447569db@github.com> Message-ID: On Fri, 26 Mar 2021 13:25:24 GMT, Coleen Phillimore wrote: >> src/hotspot/share/memory/metaspace.hpp line 127: >> >>> 125: MetaspaceObj::Type type, TRAPS); >>> 126: >>> 127: // Nothrow version of allocate which can be called by a non-Java thread. >> >> "Nothrow" is a C++ concept. I would just say "Non-TRAPS version ... Returns NULL on failure." > > Yes, true. Particularly since it had to be declared with throws(). got it! ------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From xliu at openjdk.java.net Fri Mar 26 23:25:31 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Fri, 26 Mar 2021 23:25:31 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: Message-ID: On Fri, 26 Mar 2021 10:02:37 GMT, Volker Simonis wrote: >> This patch provides a buffer to store asynchrounous messages and flush them to >> underlying files periodically. > > Hi Xin, > > thanks for finally addressing this issue. In general your change looks good. Please find my detailed comments inline. > > Best regards, > Volker hi, Thomas, Thank you for reviewing this PR. > Hi Xin, > > I skimmed over the patch, but have a number of high level questions - things which have not been clear from your description. > > * Who does the writing, and who is affected when the writing stalls? 
The WatcherThread eventually flushes the buffered messages. If the writing stalls, it blocks periodic tasks. If it blocks long enough, other periodic tasks are skipped. > * Do you then block or throw output away? > > * If the former, how do you mitigate the ripple effect? > * If the latter, how does the reader of the log file know that something is missing? The capacity of the buffer is limited by `AsyncLogBufferSize` (2k by default). Also, logTagSet.cpp limits the maximum length of a vwrite to 512. That means the maximum memory used by this buffer is 1M (= 2k * 0.5k). If the buffer overflows, it starts dropping messages from the head. This behavior simulates a ring buffer. If you enable `-XX:+Verbose`, the drops are reported on the tty console. I prefer dropping messages to letting the buffer grow, because unbounded growth may later trigger an out-of-memory error. > * How often do you flush? How do you prevent missing output in the log file in case of crashes? The interval is defined by `LogAsyncInterval` (300ms by default). I inserted a statement `async->flusher()` in `ostream_abort()`. > * Can this really the full brunt of logging (-Xlog:*=trace) over many threads? To be honest, it can't. I see a lot of drop messages on the console with -XX:+Verbose. I have tuned the parameters so that it does not easily drop messages for normal GC activity at info verbosity. `-Xlog:*=trace` will indeed drop messages, but this is tunable. I have a [stress test](https://github.com/navyxliu/JavaGCworkload/blob/master/runJavaUL.sh) to tweak parameters. > * Does this work with multiple target and multiple IO files? Yes, it works if you have multiple outputs. `LogAsyncFlusher` is a singleton. One single buffer and one thread serve them all. > * Does it cost anything if logging is off or not async? > So far, LogAsyncFlusher as a periodic task remains active even when no output is in async_mode. It wakes up every `LogAsyncInterval` ms. It is a dummy task because the deque is always empty, so the cost is almost nothing.
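The drop-the-head behavior and the 1M bound described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch's `LogAsyncFlusher`: `BoundedLogBuffer` and `worst_case_bytes` are made-up names, "2k" is read as 2048 entries, and there is no locking here at all.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Worst-case payload: every slot holds a maximum-length message.
constexpr std::size_t worst_case_bytes(std::size_t entries, std::size_t max_len) {
  return entries * max_len;
}

// Hypothetical bounded buffer that discards the oldest entry when full.
class BoundedLogBuffer {
 public:
  explicit BoundedLogBuffer(std::size_t capacity) : _capacity(capacity) {
    assert(capacity > 0 && "buffer needs at least one slot");
  }

  // Appends a message; if the buffer is full, the head is dropped first.
  void enqueue(std::string msg) {
    if (_entries.size() == _capacity) {
      _entries.pop_front();
      ++_dropped;
    }
    _entries.push_back(std::move(msg));
  }

  // Hands all buffered messages to the caller and empties the buffer.
  std::vector<std::string> pop_all() {
    std::vector<std::string> out(_entries.begin(), _entries.end());
    _entries.clear();
    return out;
  }

  std::size_t size() const { return _entries.size(); }
  std::size_t dropped() const { return _dropped; }

 private:
  std::size_t _capacity;
  std::deque<std::string> _entries;
  std::size_t _dropped = 0;
};
```

Keeping a drop counter next to the queue is what would let a flusher report the loss somewhere, whether on the console or in the log file itself.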
> Update: Okay, I see you use PeriodicTask and the WatcherThread. Is this really enough? I would be concerned that it either runs too rarely to be able to swallow all output or that it runs that often that it monopolizes the WatcherThread. > > I actually expected a separate Thread - or multiple, one per output - for this, waking up when there is something to write. That would also be more efficient than constant periodic polling. > You concern is reasonable. I don't understand why there is only one watchThread and up to 10 periodic tasks are crowded in it. If it's a bottleneck, I plan to improve this infrastructure. I can make hotspot supports multiple watcher threads and spread periodic tasks among them. All watcher threads are connected using linked list to manage. Can we treat it as a separated task? for normal usage, I think the delay is quite managed. Writing thousands of lines to a file usually can be done in sub-ms. > * How is the performance impact when we have lots of concurrent writes from many threads? I see that you use a Mutex to synchronize the logging threads with the flush service. Before, these threads would have done concurrent IO and that would be handled by the libc, potentially without locking. IMHO, logging shouldn't hurt performance a lot. At least, those do impact on performance are not supposed to enable by default. On the other side, I hope logging messages from other threads avoid from interweaving when I enable them to read. That leads me to use mutex. That actually improves readability. My design target is non-blocking. pop_all() is an ad-hoc operation which pop up all elements and release the mutex immediately. writeback() does IO without it. In our real applications, we haven't seen this feature downgrade GC performance yet. > > I think this feature could be useful. I am a bit concerned with the increased complexity this brings. UL is already a very (I think unnecessarily) complex codebase. 
Maybe we should try to reduce its complexity first before adding new features to it. This is just my opinion, lets see what others think. > > Cheers, Thomas I believe UL has its own reasons. In my defense, I don't make UL more complex. I only changed a couple of lines in one of its implementation file(logFileOutput.cpp) and didn't change its interfaces. I try my best to reuse existing codebase. We can always refactor existing code([JDK-8239066](https://bugs.openjdk.java.net/browse/JDK-8239066), [JDK-8263840](https://bugs.openjdk.java.net/browse/JDK-8263840)), but it's not this PR's purpose. thanks, --lx ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From david.holmes at oracle.com Sat Mar 27 00:08:20 2021 From: david.holmes at oracle.com (David Holmes) Date: Sat, 27 Mar 2021 10:08:20 +1000 Subject: os_windows.cpp : simplify is_thread_cpu_time_supported ? In-Reply-To: References: <3fbc20cb-02ff-178a-32c8-18934a5949e9@oracle.com> Message-ID: <3b9acb57-c17f-c767-efcd-4e3c929118d5@oracle.com> On 26/03/2021 6:06 pm, Baesken, Matthias wrote: > Hi David, thanks for the info . > > I found https://docs.microsoft.com/en-us/windows/win32/procthread/thread-security-and-access-rights > so it looks like we need THREAD_QUERY_INFORMATION or THREAD_QUERY_LIMITED_INFORMATION access right for GetThreadTimes . > > On the other hand , the test in os::is_thread_cpu_time_supported() on Windows might (temporary ?) fail for other reasons too , it is not clear to me if this is really always > related to the wrong access rights ? Sure it could fail for other reasons (unfortunately win32 docs don't actually list them). So I tend to agree that this kind of check as a global "do we support thread cpu times" is not the right test to do. The question is really about "does this platform provide an API for getting the thread cpu time" - and it does. Whether you can query a given target thread at runtime is a different matter altogether. 
I wish I knew exactly what caused this check to be put in place as it doesn't seem appropriate. Maybe someone from serviceability (cc'd) remembers? > And at some places in HS code like jfrThreadCPULoadEvent.cpp, os::thread_cpu_time is called anyway without checking for os::is_thread_cpu_time_supported(); > same for thread.cpp / Thread::print_on but this is just printing some output so it is most likely not really a big issue on Windows. As long as they can tolerate the -1 return if it fails then it is up to that code whether or not to bother with the check. But I think the check is primarily for the M&M/JVMTI capability checking. Cheers, David > Best regards, Matthias > > > > >>> On 25/03/2021 11:49 pm, Baesken, Matthias wrote: >>>> Hello, I wonder, should we just return true in >>>> os::is_thread_cpu_time_supported() on Windows? >>>> >>>> See >>>> >>>> https://github.com/openjdk/jdk/blob/master/src/hotspot/os/windows/os_windows.cpp#L4588 >>>> >>>> According to MSDN >>>> https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getthreadtimes >>>> >>>> GetThreadTimes is supported on Win2003/XP and higher. This should be >>>> fine for OpenJDK. >>> >>> Yes it should be fine. There may be other Windows archaisms in the code >>> that could be cleaned up now. >> >> The issue was not API availability (we got rid of that check a long time >> ago) but security permissions. We actually just returned "true" prior to >> JDK 5 but that was changed by JDK-4884677 when the JVM TI support was added. >> >> It is a bit messy. We use the result of >> os::is_thread_cpu_time_supported() at initialization time, on the main >> thread to then decide the global availability of this feature. And via >> the normal launcher that thread will have all access bits set and so we >> will flag thread_cpu_time as being available.
At runtime we might >> encounter a thread for which the access bits are not present and so the >> actual get_thread_cpu_time call may return -1. In theory the JVM could >> be loaded on a thread without full permissions and so we would then >> globally disable thread_cpu_time. >> >> So I think this code has to stay. > > From xliu at openjdk.java.net Sat Mar 27 01:11:28 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Sat, 27 Mar 2021 01:11:28 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: Message-ID: <3fsM3PwlHovO7vBN-PucRgnfOIhSn-pIiAgQ7rpha2Y=.3029718b-4af9-43e1-b9c6-722c8489e025@github.com> On Fri, 26 Mar 2021 23:22:25 GMT, Xin Liu wrote: >> Hi Xin, >> >> thanks for finally addressing this issue. In general your change looks good. Please find my detailed comments inline. >> >> Best regards, >> Volker > > hi, Thomas, > > Thank you for reviewing this PR. > >> Hi Xin, >> >> I skimmed over the patch, but have a number of high level questions - things which have not been clear from your description. >> >> * Who does the writing, and who is affected when the writing stalls? > > The WatchThread eventually flushes those buffered messages. if the writing stalls, it blocks periodic tasks. > It blocks long enough, other periodic tasks are skipped. > >> * Do you then block or throw output away? >> >> * If the former, how do you mitigate the ripple effect? >> * If the latter, how does the reader of the log file know that something is missing? > > The capacity of buffer is limited, which is `AsyncLogBufferSize` (2k by default). > Actually, logTagSet.cpp limits the maximal length of a vwrite is 512. That means that maximal memory used by this buffer is 1M (=2k * 0.5k). > > If the buffer overflows, it starts dropping the heads. this behavior simulates a ringbuffer. > If you enable `-XX:+Verbose`, the dropping message will be printed to the tty console. 
> > I prefer to drop messages than keeping them growing because later may trigger out-of-memory error. > >> * How often do you flush? How do you prevent missing output in the log file in case of crashes? > > The interval is defined by `LogAsyncInterval` (300ms by default). I insert a statement `async->flusher()` in `ostream_abort()`. > >> * Can this really the full brunt of logging (-Xlog:*=trace) over many threads? > to be honest, it can't. I see a lot of dropping messages on console with -XX:+Verbose. > > I have tuned parameters that it won't drop messages easily for normal GC activity with info verbosity. > `-Xlog:*=trace` will drop messages indeed, but this is tunable. I have a [stress test](https://github.com/navyxliu/JavaGCworkload/blob/master/runJavaUL.sh) to tweak parameters. > >> * Does this work with multiple target and multiple IO files? > > Yes, it works if you have multiple outputs. `LogAsyncFlusher` is singleton. one single buffer and one thread serve them all. > >> * Does it cost anything if logging is off or not async? >> > so far, LogAsyncFlusher as a periodic task remains active even no output is in async_mode. > it wakes up every `LogAsyncInterval` ms. it's a dummy task because the deque is always empty. the cost is almost nothing. > > >> Update: Okay, I see you use PeriodicTask and the WatcherThread. Is this really enough? I would be concerned that it either runs too rarely to be able to swallow all output or that it runs that often that it monopolizes the WatcherThread. >> >> I actually expected a separate Thread - or multiple, one per output - for this, waking up when there is something to write. That would also be more efficient than constant periodic polling. >> > > You concern is reasonable. I don't understand why there is only one watchThread and up to 10 periodic tasks are crowded in it. > If it's a bottleneck, I plan to improve this infrastructure. I can make hotspot supports multiple watcher threads and spread periodic tasks among them. 
All watcher threads are connected using linked list to manage. > > Can we treat it as a separated task? for normal usage, I think the delay is quite managed. Writing thousands of lines to a file usually can be done in sub-ms. > >> * How is the performance impact when we have lots of concurrent writes from many threads? I see that you use a Mutex to synchronize the logging threads with the flush service. Before, these threads would have done concurrent IO and that would be handled by the libc, potentially without locking. > > IMHO, logging shouldn't hurt performance a lot. At least, those do impact on performance are not supposed to enable by default. On the other side, I hope logging messages from other threads avoid from interweaving when I enable them to read. > That leads me to use mutex. That actually improves readability. > > My design target is non-blocking. pop_all() is an ad-hoc operation which pop up all elements and release the mutex immediately. writeback() does IO without it. > > In our real applications, we haven't seen this feature downgrade GC performance yet. >> >> I think this feature could be useful. I am a bit concerned with the increased complexity this brings. UL is already a very (I think unnecessarily) complex codebase. Maybe we should try to reduce its complexity first before adding new features to it. This is just my opinion, lets see what others think. >> >> Cheers, Thomas > > I believe UL has its own reasons. In my defense, I don't make UL more complex. I only changed a couple of lines in one of its implementation file(logFileOutput.cpp) and didn't change its interfaces. > > I try my best to reuse existing codebase. We can always refactor existing code([JDK-8239066](https://bugs.openjdk.java.net/browse/JDK-8239066), [JDK-8263840](https://bugs.openjdk.java.net/browse/JDK-8263840)), but it's not this PR's purpose. 
> > thanks, > --lx > _Mailing list message from [Thomas Stüfe](mailto:thomas.stuefe at gmail.com) on [hotspot-dev](mailto:hotspot-dev at openjdk.java.net):_ > > Can you please link the CSR to the issue? > > On Fri, Mar 26, 2021 at 8:21 AM Yasumasa Suenaga > wrote: https://bugs.openjdk.java.net/browse/JDK-8264323 ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From stuefe at openjdk.java.net Sat Mar 27 07:30:31 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Sat, 27 Mar 2021 07:30:31 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: Message-ID: <5lZuUwWVD5NwXo_gUOnUDUD4tdYUvils5Cx5X5r8elo=.5d1f1074-730a-4f88-ba67-67977ffe58d0@github.com> On Mon, 22 Mar 2021 22:12:14 GMT, Xin Liu wrote: > This patch provides a buffer to store asynchronous messages and flush them to > underlying files periodically. Hi Xin, thank you for your detailed answers. As I wrote, I think this is a useful change. A prior design discussion with a rough sketch would have made things easier. Also, it would have been good to have the CSR discussion beforehand, since it affects how complex the implementation needs to be. I don't know whether there had been design discussions beforehand; if I missed them, I apologize. I am keenly aware that design discussions often lead nowhere because no-one answers. So I understand why you started with a patch. About your proposal: I do not think it can be made airtight, and I think that is okay - if we work with a limited flush buffer and we log too much, things will get dropped, that is unavoidable. But it has to be reliable and comprehensible after the fact. As you write, the patch you propose works well with AWS, but I suspect that is an environment with limited variables, and outside use of the VM could be much more diverse. We must make sure to roll out only well designed solutions which work for us all. E.g.
a log system which randomly omits log entries because some internal buffer is full without giving any indication *in the log itself* is a terrible idea :). Since log files are a cornerstone for our support, I am interested in a good solution.

First off, the CSR:

---

1) With what you propose, we could have an arbitrary combination of targets with different log files and different async options:

java -Xlog:os*:file=os.log::async=false -Xlog:os+thread:file=thread.log::async=true

Do we really need that much freedom? How probable is it that someone wants different async options for different trace sinks? The more freedom we have here, the more complex the implementation gets. All that stuff has to be tested. Why not just make "async" a global setting?

2) AsyncLogBufferSize should be a user-definable memory size, not "number of slots". The fact that internally we keep a vector of disjoint memory snippets is an implementation detail; the user should just give a memory size and the VM should interpret this. This leaves us the freedom to later change the implementation as we see fit.

3) LogAsyncInterval should not exist at all. If it has to exist, it should be a diagnostic switch, not a production one; but ideally, we would just log as soon as there is something to log, see below.

---

Implementation:

The use of the WatcherThread and PeriodicTask. Polling is plain inefficient, besides the concerns Robbin voiced about blocking things. This is a typical producer-consumer problem, and I would implement it using its own dedicated flusher thread and a monitor. The flusher thread should wake only if there is something to write. This is something I would not do in a separate RFE but now. It would also disarm any arguments against blocking the WatcherThread.

----

The fact that every log message gets strduped could be done better. This can be left for a future RFE - but it explains why I dislike "AsyncLogBufferSize" being "number of entries" instead of a memory size.
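The producer-consumer shape suggested above (a dedicated flusher thread, a monitor, a byte budget instead of a slot count, and drops made visible in the output) might look roughly like the sketch below. It uses standard C++ threading rather than HotSpot's `Monitor` and thread classes, writes into an in-memory vector instead of a file, and every name in it (`AsyncFlusher`, `written`, the `[dropped: N]` marker) is hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

class AsyncFlusher {
 public:
  explicit AsyncFlusher(std::size_t max_bytes)
      : _max_bytes(max_bytes), _worker([this] { run(); }) {}
  ~AsyncFlusher() { stop(); }

  // Called by logging threads; never waits for I/O.
  void enqueue(std::string msg) {
    {
      std::lock_guard<std::mutex> lock(_mutex);
      // Make room by discarding the oldest messages, counting each drop.
      while (!_queue.empty() && _bytes + msg.size() > _max_bytes) {
        _bytes -= _queue.front().size();
        _queue.pop_front();
        ++_dropped;
      }
      if (msg.size() > _max_bytes) {
        ++_dropped;  // can never fit, even into an empty buffer
      } else {
        _bytes += msg.size();
        _queue.push_back(std::move(msg));
      }
    }
    _cv.notify_one();
  }

  // Flushes what is still queued and joins the flusher thread.
  void stop() {
    {
      std::lock_guard<std::mutex> lock(_mutex);
      _stop = true;
    }
    _cv.notify_one();
    if (_worker.joinable()) _worker.join();
  }

  // Lines the sink received; only meaningful after stop().
  const std::vector<std::string>& written() const {
    assert(!_worker.joinable() && "call stop() first");
    return _written;
  }

 private:
  void run() {
    for (;;) {
      std::deque<std::string> batch;
      std::size_t dropped = 0;
      bool stopping = false;
      {
        std::unique_lock<std::mutex> lock(_mutex);
        // Sleep until there is something to write, report, or shut down.
        _cv.wait(lock, [this] { return _stop || !_queue.empty() || _dropped > 0; });
        batch.swap(_queue);
        _bytes = 0;
        std::swap(dropped, _dropped);
        stopping = _stop;
      }
      // The "I/O" below runs without the lock held.
      if (dropped > 0) _written.push_back("[dropped: " + std::to_string(dropped) + "]");
      for (std::string& m : batch) _written.push_back(std::move(m));
      if (stopping) return;
    }
  }

  std::size_t _max_bytes;
  std::mutex _mutex;
  std::condition_variable _cv;
  std::deque<std::string> _queue;
  std::size_t _bytes = 0;
  std::size_t _dropped = 0;
  bool _stop = false;
  std::vector<std::string> _written;
  std::thread _worker;  // declared last so all other members exist before it starts
};
```

The drop marker is emitted ahead of the batch because the discarded messages are older than anything still queued. A real implementation would also have to decide what happens during VM abort, which this sketch does not address.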
I think processing a memory-size AsyncLogBufferSize can be kept simple: it would be okay to just guess an average log line length and go with that. Lets say 256 chars. An AsyncLogBufferSize=1M could thus be translated to 4096 entries in your solution. If the sum of all 4096 allocated lines overshoots 1M from time to time, well so be it. A future better solution could use a preallocated fixed sized buffer. There are two ways to do this, the naive but memory inefficient way - array of fixed sized text slots like the event system does. And a smart way: a ring buffer of variable sized strings, '\0' separated, laid out one after the other in memory. The latter is a bit more involved, but can be done, and it would be fast and very memory efficient. But as I wrote, this is an optimization which can be postponed. ---- I may misunderstand the patch, but do you resolve decorators when the flusher is printing? Would this not distort time-dependent decorators (timemillis, timenanos, uptime etc)? Since we record the time of printing, not the time of logging?. If yes, it may be better to resolve the message early and just store the plain string and print that. Basically this would mean to move the whole buffering down a layer or two right at where the raw strings get printed. This would be vastly simplified if we abandon the "async for every trace sink" notion in favor of just a global flag. This would also save a bit of space, since we would not have to carry all the meta information in `AsyncLogMessage` around. I count at least three 64bit slots, possibly 4-5, which alone makes for ~40 bytes per message. Resolved decorators are often smaller than that. Please find further remarks inline. > hi, Thomas, > > Thank you for reviewing this PR. > > > Hi Xin, > > I skimmed over the patch, but have a number of high level questions - things which have not been clear from your description. > > > > * Who does the writing, and who is affected when the writing stalls? 
> > The WatchThread eventually flushes those buffered messages. if the writing stalls, it blocks periodic tasks. > It blocks long enough, other periodic tasks are skipped. > > > * Do you then block or throw output away? > > > > * If the former, how do you mitigate the ripple effect? > > * If the latter, how does the reader of the log file know that something is missing? > > The capacity of buffer is limited, which is `AsyncLogBufferSize` (2k by default). > Actually, logTagSet.cpp limits the maximal length of a vwrite is 512. That means that maximal memory used by this buffer is 1M (=2k * 0.5k). > > If the buffer overflows, it starts dropping the heads. this behavior simulates a ringbuffer. > If you enable `-XX:+Verbose`, the dropping message will be printed to the tty console. > > I prefer to drop messages than keeping them growing because later may trigger out-of-memory error. > > > * How often do you flush? How do you prevent missing output in the log file in case of crashes? > > The interval is defined by `LogAsyncInterval` (300ms by default). I insert a statement `async->flusher()` in `ostream_abort()`. > If the flusher blocks, this could block VM shutdown? Would this be different from what we do now, e.g. since all log output is serialized and done by one thread? Its probably fine, but we should think about this. > > * Can this really the full brunt of logging (-Xlog:*=trace) over many threads? > > to be honest, it can't. I see a lot of dropping messages on console with -XX:+Verbose. > > I have tuned parameters that it won't drop messages easily for normal GC activity with info verbosity. > `-Xlog:*=trace` will drop messages indeed, but this is tunable. I have a [stress test](https://github.com/navyxliu/JavaGCworkload/blob/master/runJavaUL.sh) to tweak parameters. > > > * Does this work with multiple target and multiple IO files? > > Yes, it works if you have multiple outputs. `LogAsyncFlusher` is singleton. one single buffer and one thread serve them all. 
The question was how we handle multiple trace sinks, see my "CSR" remarks. > > > * Does it cost anything if logging is off or not async? > > so far, LogAsyncFlusher as a periodic task remains active even no output is in async_mode. > it wakes up every `LogAsyncInterval` ms. it's a dummy task because the deque is always empty. the cost is almost nothing. > > > Update: Okay, I see you use PeriodicTask and the WatcherThread. Is this really enough? I would be concerned that it either runs too rarely to be able to swallow all output or that it runs that often that it monopolizes the WatcherThread. > > I actually expected a separate Thread - or multiple, one per output - for this, waking up when there is something to write. That would also be more efficient than constant periodic polling. > > You concern is reasonable. I don't understand why there is only one watchThread and up to 10 periodic tasks are crowded in it. > If it's a bottleneck, I plan to improve this infrastructure. I can make hotspot supports multiple watcher threads and spread periodic tasks among them. All watcher threads are connected using linked list to manage. > > Can we treat it as a separated task? for normal usage, I think the delay is quite managed. Writing thousands of lines to a file usually can be done in sub-ms. > > > * How is the performance impact when we have lots of concurrent writes from many threads? I see that you use a Mutex to synchronize the logging threads with the flush service. Before, these threads would have done concurrent IO and that would be handled by the libc, potentially without locking. > > IMHO, logging shouldn't hurt performance a lot. At least, those do impact on performance are not supposed to enable by default. On the other side, I hope logging messages from other threads avoid from interweaving when I enable them to read. > That leads me to use mutex. That actually improves readability. > > My design target is non-blocking. 
pop_all() is an ad-hoc operation which pop up all elements and release the mutex immediately. writeback() does IO without it. Since you use a mutex it introduces synchronization, however short, across all logging threads. So it influences runtime behavior. For the record, I think this is okay; maybe a future RFE could improve this with a lockless algorithm. I just wanted to know if you measured anything, and I was curious whether there is a difference now between synchronous and asynchronous logging. (Funnily, asynchronous logging is really more synchronous in a sense, since it synchronizes all logging threads across a common resource). > > In our real applications, we haven't seen this feature downgrade GC performance yet. > > > I think this feature could be useful. I am a bit concerned with the increased complexity this brings. UL is already a very (I think unnecessarily) complex codebase. Maybe we should try to reduce its complexity first before adding new features to it. This is just my opinion, lets see what others think. > > Cheers, Thomas > > I believe UL has its own reasons. In my defense, I don't make UL more complex. I only changed a couple of lines in one of its implementation file(logFileOutput.cpp) and didn't change its interfaces. > > I try my best to reuse existing codebase. We can always refactor existing code([JDK-8239066](https://bugs.openjdk.java.net/browse/JDK-8239066), [JDK-8263840](https://bugs.openjdk.java.net/browse/JDK-8263840)), but it's not this PR's purpose. > I understand. Its fine to do this in a later RFE. > thanks, > --lx Cheers, Thomas src/hotspot/share/logging/logAsyncFlusher.cpp line 33: > 31: // should cache this object somehow > 32: LogDecorations decorations(_level, _tagset, _decorators); > 33: _output.write_blocking(decorations, _message); Would this mean that time dependent decorators get resolved at print time, not when the log happen? 
-------------

PR: https://git.openjdk.java.net/jdk/pull/3135

From dongbo at openjdk.java.net Sat Mar 27 09:05:45 2021
From: dongbo at openjdk.java.net (Dong Bo)
Date: Sat, 27 Mar 2021 09:05:45 GMT
Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic
Message-ID:

In JDK-8248188, an IntrinsicCandidate and API were added for Base64 decoding.
Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3; the basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords.

The patch passed jtreg tier1-3 tests with a linux-aarch64-server-fastdebug build.
Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` were run specifically to check the correctness of the implementation.

There can be illegal characters at the start of the input if the data is MIME encoded.
There would be no benefit in using SIMD for this case, so the stub currently uses non-SIMD instructions for MIME encoded data.

A JMH micro-benchmark, Base64Decode.java, is added for performance testing.
With different input lengths (upper-bounded by parameter `maxNumBytes` in the JMH micro-benchmark),
we see ~2.5x improvement with long inputs and no regression with short inputs for raw Base64 decoding, and a minor improvement (~10.95%) for MIME on Kunpeng916.

The Base64Decode.java JMH micro-benchmark results:

Benchmark                          (lineSize)  (maxNumBytes)  Mode  Cnt       Score      Error  Units

# Kunpeng916, intrinsic
Base64Decode.testBase64Decode               4              1  avgt    5      48.614 ±    0.609  ns/op
Base64Decode.testBase64Decode               4              3  avgt    5      58.199 ±    1.650  ns/op
Base64Decode.testBase64Decode               4              7  avgt    5      69.400 ±    0.931  ns/op
Base64Decode.testBase64Decode               4             32  avgt    5      96.818 ±    1.687  ns/op
Base64Decode.testBase64Decode               4             64  avgt    5     122.856 ±    9.217  ns/op
Base64Decode.testBase64Decode               4             80  avgt    5     130.935 ±    1.667  ns/op
Base64Decode.testBase64Decode               4             96  avgt    5     143.627 ±    1.751  ns/op
Base64Decode.testBase64Decode               4            112  avgt    5     152.311 ±    1.178  ns/op
Base64Decode.testBase64Decode               4            512  avgt    5     342.631 ±    0.584  ns/op
Base64Decode.testBase64Decode               4           1000  avgt    5     573.635 ±    1.050  ns/op
Base64Decode.testBase64Decode               4          20000  avgt    5    9534.136 ±   45.172  ns/op
Base64Decode.testBase64Decode               4          50000  avgt    5   22718.726 ±  192.070  ns/op
Base64Decode.testBase64MIMEDecode           4              1  avgt   10      63.558 ±    0.336  ns/op
Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.504 ±    0.848  ns/op
Base64Decode.testBase64MIMEDecode           4              7  avgt   10     120.591 ±    0.608  ns/op
Base64Decode.testBase64MIMEDecode           4             32  avgt   10     324.314 ±    6.236  ns/op
Base64Decode.testBase64MIMEDecode           4             64  avgt   10     532.678 ±    4.670  ns/op
Base64Decode.testBase64MIMEDecode           4             80  avgt   10     678.126 ±    4.324  ns/op
Base64Decode.testBase64MIMEDecode           4             96  avgt   10     771.603 ±    6.393  ns/op
Base64Decode.testBase64MIMEDecode           4            112  avgt   10     889.608 ±    0.759  ns/op
Base64Decode.testBase64MIMEDecode           4            512  avgt   10    3663.557 ±    3.422  ns/op
Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7017.784 ±    9.128  ns/op
Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  128670.660 ± 7951.521  ns/op
Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  317113.667 ±  161.758  ns/op

# Kunpeng916, default
Base64Decode.testBase64Decode               4              1  avgt    5      48.455 ±    0.571  ns/op
Base64Decode.testBase64Decode               4              3  avgt    5      57.937 ±    0.505  ns/op
Base64Decode.testBase64Decode               4              7  avgt    5      73.823 ±    1.452  ns/op
Base64Decode.testBase64Decode               4             32  avgt    5     106.484 ±    1.243  ns/op
Base64Decode.testBase64Decode               4             64  avgt    5     141.004 ±    1.188  ns/op
Base64Decode.testBase64Decode               4             80  avgt    5     156.284 ±    0.572  ns/op
Base64Decode.testBase64Decode               4             96  avgt    5     174.137 ±    0.177  ns/op
Base64Decode.testBase64Decode               4            112  avgt    5     188.445 ±    0.572  ns/op
Base64Decode.testBase64Decode               4            512  avgt    5     610.847 ±    1.559  ns/op
Base64Decode.testBase64Decode               4           1000  avgt    5    1155.368 ±    0.813  ns/op
Base64Decode.testBase64Decode               4          20000  avgt    5   19751.477 ±   24.669  ns/op
Base64Decode.testBase64Decode               4          50000  avgt    5   50046.586 ±  523.155  ns/op
Base64Decode.testBase64MIMEDecode           4              1  avgt   10      64.130 ±    0.238  ns/op
Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.096 ±    0.205  ns/op
Base64Decode.testBase64MIMEDecode           4              7  avgt   10     118.849 ±    0.610  ns/op
Base64Decode.testBase64MIMEDecode           4             32  avgt   10     331.177 ±    4.732  ns/op
Base64Decode.testBase64MIMEDecode           4             64  avgt   10     549.117 ±    0.177  ns/op
Base64Decode.testBase64MIMEDecode           4             80  avgt   10     702.951 ±    4.572  ns/op
Base64Decode.testBase64MIMEDecode           4             96  avgt   10     799.566 ±    0.301  ns/op
Base64Decode.testBase64MIMEDecode           4            112  avgt   10     923.749 ±    0.389  ns/op
Base64Decode.testBase64MIMEDecode           4            512  avgt   10    4000.725 ±    2.519  ns/op
Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7674.994 ±    9.281  ns/op
Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  142059.001 ±  157.920  ns/op
Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  355698.369 ±  216.542  ns/op

-------------

Commit messages:
 - 8256245: AArch64: Implement Base64 decoding intrinsic

Changes: https://git.openjdk.java.net/jdk/pull/3228/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8256245
Stats: 410 lines in 3 files changed: 410 ins; 0 del; 0 mod
Patch: https://git.openjdk.java.net/jdk/pull/3228.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/3228/head:pull/3228

PR: https://git.openjdk.java.net/jdk/pull/3228

From aph at openjdk.java.net Sat Mar 27 09:56:27 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Sat, 27 Mar 2021 09:56:27 GMT
Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic
In-Reply-To:
References:
Message-ID:

On Sat, 27 Mar 2021 08:58:03 GMT, Dong Bo wrote:

> In JDK-8248188, IntrinsicCandidate and API is added for Base64 decoding.
> Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords.
>
> Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build.
> Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` runned specially for the correctness of the implementation.
>
> There can be illegal characters at the start of the input if the data is MIME encoded.
> It would be no benefits to use SIMD for this case, so the stub use no-simd instructions for MIME encoded data now.
>
> A JMH micro, Base64Decode.java, is added for performance test.
> With different input length (upper-bounded by parameter `maxNumBytes` in the JMH micro),
> we witness ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decodeing, minor improvements (~10.95%) for MIME on Kunpeng916.
>
> The Base64Decode.java JMH micro-benchmark results:
>
> Benchmark (lineSize) (maxNumBytes) Mode Cnt Score Error Units
> [...]

Firstly, I wonder how important this is for most applications. I don't actually know, but let's put that to one side.

There's a lot of unrolling, particularly in the non-SIMD case. Please consider taking out some of the unrolling; I suspect it'd not increase time by very much but would greatly reduce the code cache pollution.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3228

From aph at openjdk.java.net Sat Mar 27 10:00:27 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Sat, 27 Mar 2021 10:00:27 GMT
Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic
In-Reply-To:
References:
Message-ID:

On Sat, 27 Mar 2021 08:58:03 GMT, Dong Bo wrote:

> In JDK-8248188, IntrinsicCandidate and API is added for Base64 decoding.
> Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords.
>
> Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build.
> Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` runned specially for the correctness of the implementation.
>
> There can be illegal characters at the start of the input if the data is MIME encoded.
> It would be no benefits to use SIMD for this case, so the stub use no-simd instructions for MIME encoded data now.
>
> A JMH micro, Base64Decode.java, is added for performance test.
> With different input length (upper-bounded by parameter `maxNumBytes` in the JMH micro),
> we witness ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decodeing, minor improvements (~10.95%) for MIME on Kunpeng916.
>
> The Base64Decode.java JMH micro-benchmark results:
>
> Benchmark (lineSize) (maxNumBytes) Mode Cnt Score Error Units
> [...]

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5578:

> 5576: void generate_base64_decode_nosimdround(Register src, Register dst,
> 5577: Register nosimd_codec, Label &Exit)
> 5578: {

We'd want enter/leave here so profiling tools can walk the stack.

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5728:

> 5726:
> 5727: static const uint8_t fromBase64ForNoSIMD[256] = {
> 5728: 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u,

There seems to be no documentation of these magic tables of constants.
-------------

PR: https://git.openjdk.java.net/jdk/pull/3228

From aph at openjdk.java.net Sat Mar 27 10:07:24 2021
From: aph at openjdk.java.net (Andrew Haley)
Date: Sat, 27 Mar 2021 10:07:24 GMT
Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic
In-Reply-To:
References:
Message-ID:

On Sat, 27 Mar 2021 09:53:37 GMT, Andrew Haley wrote:

>> In JDK-8248188, IntrinsicCandidate and API is added for Base64 decoding.
>> Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords.
>>
>> Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build.
>> Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` runned specially for the correctness of the implementation.
>>
>> There can be illegal characters at the start of the input if the data is MIME encoded.
>> It would be no benefits to use SIMD for this case, so the stub use no-simd instructions for MIME encoded data now.
>>
>> A JMH micro, Base64Decode.java, is added for performance test.
>> With different input length (upper-bounded by parameter `maxNumBytes` in the JMH micro),
>> we witness ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decodeing, minor improvements (~10.95%) for MIME on Kunpeng916.
>>
>> The Base64Decode.java JMH micro-benchmark results:
>>
>> Benchmark (lineSize) (maxNumBytes) Mode Cnt Score Error Units
>> [...]
>
> Firstly, I wonder how important this is for most applications. I don't actually know, but let's put that to one side.
>
> There's a lot of unrolling, particularly in the non-SIMD case. Please consider taking out some of the unrolling; I suspect it'd not increase time by very much but would greatly reduce the code cache pollution.

It's very tempting to unroll everything to make a benchmark run quickly, but we have to take a balanced approach. Please consider losing the non-SIMD case. It doesn't result in any significant gain.

> src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5728:
>
>> 5726:
>> 5727: static const uint8_t fromBase64ForNoSIMD[256] = {
>> 5728: 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u, 255u,
>
> There seems to be no documentation of these magic tables of constants.

We're either going to need a proper description of the algorithm here or a permalink to one.
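For readers wondering what such a table of constants encodes, here is one plausible reading, offered only as a sketch. It assumes the table maps each input byte to its 6-bit value in the standard RFC 4648 alphabet and uses 255 to mark bytes outside the alphabet, which is consistent with the leading `255u` entries quoted above but is not confirmed by the patch itself. `make_decode_table`, `decode_quad` and `demo_decode_quad` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: builds a byte -> 6-bit value lookup table for the
// standard Base64 alphabet. Every byte outside the alphabet maps to 255.
std::array<uint8_t, 256> make_decode_table() {
  static const char alphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  static_assert(sizeof(alphabet) == 65, "64 symbols plus the terminating NUL");
  std::array<uint8_t, 256> table;
  table.fill(255u);
  for (uint8_t i = 0; i < 64; i++) {
    table[static_cast<unsigned char>(alphabet[i])] = i;
  }
  return table;
}

// Scalar decode of one group of four characters into three bytes.
// Returns false as soon as a byte outside the alphabet is seen.
bool decode_quad(const std::array<uint8_t, 256>& table, const char* in, uint8_t* out) {
  assert(in != nullptr && out != nullptr);
  uint8_t v[4];
  for (int i = 0; i < 4; i++) {
    v[i] = table[static_cast<unsigned char>(in[i])];
    if (v[i] == 255u) return false;
  }
  out[0] = static_cast<uint8_t>((v[0] << 2) | (v[1] >> 4));
  out[1] = static_cast<uint8_t>(((v[1] & 0x0f) << 4) | (v[2] >> 2));
  out[2] = static_cast<uint8_t>(((v[2] & 0x03) << 6) | v[3]);
  return true;
}

// Usage: decodes four characters, or reports an illegal input byte.
std::string demo_decode_quad(const char* quad) {
  static const std::array<uint8_t, 256> table = make_decode_table();
  uint8_t out[3];
  if (!decode_quad(table, quad, out)) return "<illegal>";
  return std::string(reinterpret_cast<const char*>(out), 3);
}
```

Generating the table from the alphabet, or at least stating the rule in a comment next to it, is the kind of documentation the review comment asks for. Padding (`=`) and the URL-safe alphabet are deliberately left out of this sketch.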
------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From jrose at openjdk.java.net Sat Mar 27 23:06:31 2021 From: jrose at openjdk.java.net (John R Rose) Date: Sat, 27 Mar 2021 23:06:31 GMT Subject: RFR: 8264268: Don't use oop types for derived pointers In-Reply-To: <2UO90DoNZ6JCfM5sT-N5Wow2vHpk9lOdtlgqHY5ix6A=.662b95a7-977b-4fef-94c6-fa5b08c18733@github.com> References: <2UO90DoNZ6JCfM5sT-N5Wow2vHpk9lOdtlgqHY5ix6A=.662b95a7-977b-4fef-94c6-fa5b08c18733@github.com> Message-ID: On Fri, 26 Mar 2021 11:52:58 GMT, Stefan Karlsson wrote: > The JIT compiler embeds pointers to addresses within an object. These are called derived pointers. When the GC moves objects, these pointers need to be updated explicitly, because the GC only deals with the real oops of the objects (base pointer). > > The code that deals with this uses oop* for the address containing the base pointer. This is fine, the address contains an oop. However, it also uses oop* for the interior pointer, even though the contents is not a valid oop. > > This creates temporary oops that does not conform to the normal requirements for oops. For example, the lower three bits could be set. This makes it problematic to write stricter verification code. > > I propose that we use intptr_t* instead of oop*, and only use oop* when the location is known to contain a valid oop. Looks good. Maybe add `typedef intptr_t derived_oop_t` to make the operations on d-oops a little more distinctive. Maybe add a static assert that `sizeof(oop) == sizeof(intptr_t)`, just in case there's a problem with compressed oops in the future. ------------- Marked as reviewed by jrose (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3214 From xliu at openjdk.java.net Sun Mar 28 01:15:28 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Sun, 28 Mar 2021 01:15:28 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging In-Reply-To: References: Message-ID: <0mJuAYPmQ-XtnGIaac3yODDMPpXGqRf8wbag9AFK2YU=.5a9f5e5d-c45c-4efb-af64-f97f24b4aabf@github.com> On Fri, 26 Mar 2021 09:41:01 GMT, Volker Simonis wrote: >> This patch provides a buffer to store asynchrounous messages and flush them to >> underlying files periodically. > > src/hotspot/share/logging/logFileOutput.cpp line 324: > >> 322: if (_async_mode && flusher != NULL) { >> 323: flusher->enqueue(*this, decorations, msg); >> 324: return 0; > > I think the contract of `LogFileOutput::write()` is not clear. Should this return the number of characters that have been actually written out or the number of characters that have been consumed. For the time beeing this doesn't seem to be a problem though, because the current callers of `LogFileOutput::write()` don't seem to check the return value anyway. yes, it seems that it doesn't matter. logTagSet.cpp don't read the return value of LogOutput::write(). 'return 0' may not be good here. IMHO, we can return -1, which means unknown or NA. ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From xliu at openjdk.java.net Sun Mar 28 01:46:47 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Sun, 28 Mar 2021 01:46:47 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging [v2] In-Reply-To: References: Message-ID: <7Sga0HYSWMnzwfPg6xQMYYmFU30RFBm4J6zJQvWMaoE=.c6ad82e4-09c9-4b1b-8dec-3cbed124f045@github.com> > This patch provides a buffer to store asynchrounous messages and flush them to > underlying files periodically. 
Xin Liu has updated the pull request incrementally with two additional commits since the last revision:

 - 8229517: Support for optional asynchronous/buffered logging

   add a constraint for the option LogAsyncInterval.
 - 8229517: Support for optional asynchronous/buffered logging

   LogMessage supports async_mode.
   remove the option AsyncLogging
   rename the option GCLogBufferSize to AsyncLogBufferSize
   move drop_log() to LogAsyncFlusher.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3135/files
  - new: https://git.openjdk.java.net/jdk/pull/3135/files/28718943..bcefbecb

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3135&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3135&range=00-01

  Stats: 120 lines in 8 files changed: 84 ins; 11 del; 25 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3135.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3135/head:pull/3135

PR: https://git.openjdk.java.net/jdk/pull/3135

From xliu at openjdk.java.net Sun Mar 28 01:46:47 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Sun, 28 Mar 2021 01:46:47 GMT
Subject: RFR: 8229517: Support for optional asynchronous/buffered logging [v2]
In-Reply-To:
References: <5Q-bprPy9TFVeMmBi-SlVfk9akgnkAxXIEFm2oz-GDU=.af4d4660-b6f3-454c-83b3-b40adb043590@github.com>
Message-ID:

On Fri, 26 Mar 2021 19:05:28 GMT, Xin Liu wrote:

>> Agreed. No need for an extra option. Handle it like other logging framework defaults (e.g. `DefaultFileCount` or `DefaultFileSize`)
>>
>> You also have to check that `LogAsyncInterval` is zero modulo 10, otherwise you'll get:
>> $ ./images/jdk/bin/java -XX:LogAsyncInterval=13 -Xlog:gc:file=/tmp/class.log::async=true -version
>> # To suppress the following error report, specify this argument
>> # after -XX: or in .hotspotrc: SuppressErrorAt=/task.cpp:73
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> # Internal Error (/priv/simonisv/OpenJDK/Git/jdk/src/hotspot/share/runtime/task.cpp:73), pid=21168, tid=21169
>> # assert(_interval >= PeriodicTask::min_interval && _interval % PeriodicTask::interval_gran == 0) failed: improper PeriodicTask interval time
>> #
>> # JRE version: (17.0) (slowdebug build )
>> # Java VM: OpenJDK 64-Bit Server VM (slowdebug 17-internal+0-adhoc.simonisv.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
>> # Problematic frame:
>> # V [libjvm.so+0x1244301] PeriodicTask::PeriodicTask(unsigned long)+0x83
>> #
>> # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
>> #
>> # An error report file with more information is saved as:
>> # /priv/simonisv/output/jdk-dbg/hs_err_pid21168.log
>> #
>> #
>> Aborted
>
> Oh, I didn't know about this constraint. I will add a constraint check for the option.

I have added a constraint for it. Now the VM reports an error for 13 instead of hitting the assert:

java -XX:LogAsyncInterval=13 -Xlog:gc:file=/tmp/class.log::async=true -version
LogAsyncInterval (13) must be evenly divisible by PeriodicTask::interval_gran (10)
Improperly specified VM option 'LogAsyncInterval=13'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
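For readers who want to see the shape of such a check, here is a minimal sketch. It assumes only the two conditions in the assert quoted above. The constants and the function name `check_log_async_interval` are placeholders for illustration, not the constraint function from the patch; the granularity of 10 comes from the error message, while using 10 as the minimum is an assumption.

```cpp
#include <cstdio>

// Illustrative stand-ins for PeriodicTask::min_interval and
// PeriodicTask::interval_gran. 10 is the granularity named in the error
// message; treating the minimum as 10 too is an assumption of this sketch.
static const unsigned kMinInterval  = 10;
static const unsigned kIntervalGran = 10;

// Returns true if `value` could be used as a periodic-task interval.
// Otherwise prints a diagnostic and returns false, so that startup can
// fail cleanly instead of tripping an assert later.
// Hypothetical helper, not HotSpot code.
bool check_log_async_interval(unsigned value) {
  if (value < kMinInterval) {
    std::fprintf(stderr, "LogAsyncInterval (%u) must be at least %u\n",
                 value, kMinInterval);
    return false;
  }
  if (value % kIntervalGran != 0) {
    std::fprintf(stderr,
                 "LogAsyncInterval (%u) must be evenly divisible by %u\n",
                 value, kIntervalGran);
    return false;
  }
  return true;
}
```

With these assumed limits, `check_log_async_interval(13)` is rejected and `check_log_async_interval(20)` is accepted. Validating the option at argument-parsing time is what turns the debug-build assert into an ordinary startup error.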

-------------

PR: https://git.openjdk.java.net/jdk/pull/3135

From xliu at openjdk.java.net Sun Mar 28 02:05:28 2021
From: xliu at openjdk.java.net (Xin Liu)
Date: Sun, 28 Mar 2021 02:05:28 GMT
Subject: RFR: 8229517: Support for optional asynchronous/buffered logging [v2]
In-Reply-To: <5lZuUwWVD5NwXo_gUOnUDUD4tdYUvils5Cx5X5r8elo=.5d1f1074-730a-4f88-ba67-67977ffe58d0@github.com>
References: <5lZuUwWVD5NwXo_gUOnUDUD4tdYUvils5Cx5X5r8elo=.5d1f1074-730a-4f88-ba67-67977ffe58d0@github.com>
Message-ID:

On Sat, 27 Mar 2021 06:41:35 GMT, Thomas Stuefe wrote:

>> Xin Liu has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - 8229517: Support for optional asynchronous/buffered logging
>>
>>   add a constraint for the option LogAsyncInterval.
>> - 8229517: Support for optional asynchronous/buffered logging
>>
>>   LogMessage supports async_mode.
>>   remove the option AsyncLogging
>>   rename the option GCLogBufferSize to AsyncLogBufferSize
>>   move drop_log() to LogAsyncFlusher.
>
> src/hotspot/share/logging/logAsyncFlusher.cpp line 33:
>
>> 31: // should cache this object somehow
>> 32: LogDecorations decorations(_level, _tagset, _decorators);
>> 33: _output.write_blocking(decorations, _message);
>
> Would this mean that time dependent decorators get resolved at print time, not when the logging happens?

Yes, it would. I just realized that this does distort the timestamps of the relevant decorators. I explained to Volker why I chose to compress `LogDecorations` to `LogDecorators`:
https://github.com/openjdk/jdk/pull/3135#discussion_r602538305

This issue is similar to "safepoint bias" in Java profilers. Those timestamps are biased toward the flusher, but the error is bounded by `LogAsyncInterval`. We could spend extra effort to keep the original timestamps. Is it worth it?
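To make the trade-off concrete, here is a minimal sketch of what keeping the original timestamps could look like: the time is captured when a message is enqueued rather than when it is flushed. The types `AsyncLogEntry` and `AsyncLogQueue` are invented for this illustration and are not part of the patch. Real decorations carry more than one time-dependent field, and locking is left out.

```cpp
#include <chrono>
#include <deque>
#include <string>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical queue entry: the message plus the time it was logged.
struct AsyncLogEntry {
  Clock::time_point logged_at;  // captured by the logging thread
  std::string message;
};

// Hypothetical queue, single-threaded for brevity (no locking).
class AsyncLogQueue {
 public:
  // Called at the log site. The timestamp is resolved here, so a later
  // flush cannot shift it.
  void enqueue(std::string msg) {
    _entries.push_back(AsyncLogEntry{Clock::now(), std::move(msg)});
  }

  // Called by the flusher. Returns all pending entries, each still
  // carrying its original time, and leaves the queue empty.
  std::vector<AsyncLogEntry> drain() {
    std::vector<AsyncLogEntry> out(_entries.begin(), _entries.end());
    _entries.clear();
    return out;
  }

 private:
  std::deque<AsyncLogEntry> _entries;
};
```

In this sketch the cost is one clock read and one extra field per queued message. Whether that is acceptable on hot logging paths is the open question raised above.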

-------------

PR: https://git.openjdk.java.net/jdk/pull/3135

From yyang at openjdk.java.net Sun Mar 28 04:43:41 2021
From: yyang at openjdk.java.net (Yi Yang)
Date: Sun, 28 Mar 2021 04:43:41 GMT
Subject: RFR: 8263028: Windows build fails due to several treat-warning-as-errors
In-Reply-To:
References: <6cv_HeWJ9HsBrB7NSFU-TGl4PP0Tp820KzwJ-FRn_so=.e4d8ab11-4203-4278-a829-43b6f1626465@github.com> <_4snjvydeDKDu6aZgC1Fqr_kc6sHII7F7Ywinh3H9Rw=.61007752-81c3-44a0-911f-4dd184259966@github.com>
Message-ID:

On Tue, 23 Mar 2021 13:47:39 GMT, Jorn Vernee wrote:

> Based on Ioi's suggestion I decided to try with a different locale as well. I tried setting my system locale to something else and with that I was able to reproduce the warnings you report, so it _could_ be an issue with locale settings. AFAIK only `en-us` is supported. Maybe you could confirm/check your locale settings as well? (can run `systeminfo` to get the current setting)
>
> I've had problems in the past as well because I had the wrong locale set, and some of the tests were failing because of that. So, maybe rather than disabling the warnings, it might be more prudent to change the system locale of the used build systems to prevent similar issues in the future (FWIW, the display language doesn't seem to affect `cl` so that could still be whatever is convenient).

Hi Jorn,

Sorry for the delayed response. I set the locale of my Cygwin environment to en-us via `export LC_ALL="en_US.UTF-8"`, but the same warnings are still generated when compiling. Should I change this setting globally instead of just changing it in Cygwin?

Anyway, it seems that this problem is caused by the locale setting because, as you mentioned, the problem appears when you change the locale setting to Chinese. Setting the locale to English does not have this problem. I checked the build documentation, but there is no mention of the need to set the locale to en-us before building the JDK.
If this is really a necessary step for building, I think we should add it to the build documentation; otherwise, I think we should fix this problem in HotSpot.

Best Regards
Yang

-------------

PR: https://git.openjdk.java.net/jdk/pull/3107

From kbarrett at openjdk.java.net Sun Mar 28 12:02:42 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Sun, 28 Mar 2021 12:02:42 GMT
Subject: RFR: 8264268: Don't use oop types for derived pointers
In-Reply-To: <2UO90DoNZ6JCfM5sT-N5Wow2vHpk9lOdtlgqHY5ix6A=.662b95a7-977b-4fef-94c6-fa5b08c18733@github.com>
References: <2UO90DoNZ6JCfM5sT-N5Wow2vHpk9lOdtlgqHY5ix6A=.662b95a7-977b-4fef-94c6-fa5b08c18733@github.com>
Message-ID:

On Fri, 26 Mar 2021 11:52:58 GMT, Stefan Karlsson wrote:

> The JIT compiler embeds pointers to addresses within an object. These are called derived pointers. When the GC moves objects, these pointers need to be updated explicitly, because the GC only deals with the real oops of the objects (base pointer).
>
> The code that deals with this uses oop* for the address containing the base pointer. This is fine, the address contains an oop. However, it also uses oop* for the interior pointer, even though the contents are not a valid oop.
>
> This creates temporary oops that do not conform to the normal requirements for oops. For example, the lower three bits could be set. This makes it problematic to write stricter verification code.
>
> I propose that we use intptr_t* instead of oop*, and only use oop* when the location is known to contain a valid oop.

I agree with jrose that having a type name for derived pointers would be better. But I wondered if it might be possible to make it stronger than just an alias type for intptr_t. I tried making it an enum class with a few supporting operations, and it doesn't look too bad.
https://github.com/kimbarrett/openjdk-jdk/tree/derived_ptr

src/hotspot/share/compiler/oopMap.cpp line 196:

> 194: }
> 195:
> 196: static void add_derived_oop(oop* base, intptr_t* derived, OopClosure* oop_fn) {

`add_derived_oop` and `ignore_derived_oop` ignore `oop_fn`. Maybe they should assert that it's `&do_nothing_cl`?

src/hotspot/share/compiler/oopMap.cpp line 290:

> 288: intptr_t* derived_loc = (intptr_t*)fr->oopmapreg_to_location(omv.reg(),reg_map);
> 289: guarantee(derived_loc != NULL, "missing saved register");
> 290: oop *base_loc = fr->oopmapreg_to_oop_location(omv.content_reg(), reg_map);

`oop *base_loc` -> `oop* base_loc`

-------------

PR: https://git.openjdk.java.net/jdk/pull/3214

From pli at openjdk.java.net Mon Mar 29 01:14:30 2021
From: pli at openjdk.java.net (Pengfei Li)
Date: Mon, 29 Mar 2021 01:14:30 GMT
Subject: Integrated: 8264006: Fix AOT library loading on CPUs with 256-byte dcache line
In-Reply-To:
References:
Message-ID: <_eWKirz3bzEYVelRbmQYhMbDYIqEYzIrGhpOFW7GUZM=.775fde05-7a99-4d2d-a0bf-e6b7b6783974@github.com>

On Wed, 24 Mar 2021 08:04:47 GMT, Pengfei Li wrote:

> Recently we tested OpenJDK on some CPUs with 256-byte dcache line size.
> HotSpot AOT tests failed because the shared library compiled with the
> same VM options on the same machine is skipped when loaded back.
>
> The command sequence below shows a simple way to reproduce this issue.
>
> $ getconf -a | grep LEVEL1_DCACHE_LINESIZE
> LEVEL1_DCACHE_LINESIZE 256
>
> $ jaotc --output a.so Hello.class
>
> $ java -XX:+UnlockExperimentalVMOptions -XX:+UseAOT -XX:AOTLibrary=./a.so -XX:+PrintAOT Hello
> Shared file ./a.so error: ContendedPaddingWidth has different value '256' from current '128'
> 4 1 skipped ./a.so aot library
>
> The default value of VM option `ContendedPaddingWidth` is 128. But on CPUs
> with L1 dcache line size larger than 128 bytes, the value is adjusted to
> the cache line size in `VM_Version_init()`.
> This adjustment is done after
> AOT library loading in `codeCache_init()`. So the AOT lib verifier still
> assumes the `ContendedPaddingWidth` in the compiled library should be 128
> and thus causes the loaded library to be skipped.
>
> In my proposed fix, `AOTLoader::initialize()` is moved out of the general
> codecache initialization and placed after `VM_Version_init()`. The order
> of `codeCache_init()` and `VM_Version_init()` is not changed since there may
> be code emitted during `VM_Version_init()`, which depends on the general
> codecache init.
>
> Tested `hotspot::hotspot_all_no_apps`, `jdk::jdk_core` and `langtools::tier1`.

This pull request has now been integrated.

Changeset: 2fa6a3c4
Author: Pengfei Li
URL: https://git.openjdk.java.net/jdk/commit/2fa6a3c4
Stats: 10 lines in 4 files changed: 5 ins; 0 del; 5 mod

8264006: Fix AOT library loading on CPUs with 256-byte dcache line

Reviewed-by: kvn, dholmes, aph

-------------

PR: https://git.openjdk.java.net/jdk/pull/3169

From david.holmes at oracle.com Mon Mar 29 01:49:50 2021
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 29 Mar 2021 11:49:50 +1000
Subject: RFR: 8229517: Support for optional asynchronous/buffered logging
In-Reply-To: <5lZuUwWVD5NwXo_gUOnUDUD4tdYUvils5Cx5X5r8elo=.5d1f1074-730a-4f88-ba67-67977ffe58d0@github.com>
References: <5lZuUwWVD5NwXo_gUOnUDUD4tdYUvils5Cx5X5r8elo=.5d1f1074-730a-4f88-ba67-67977ffe58d0@github.com>
Message-ID:

Hi Xin,

On 27/03/2021 5:30 pm, Thomas Stuefe wrote:
> On Mon, 22 Mar 2021 22:12:14 GMT, Xin Liu wrote:
>
>> This patch provides a buffer to store asynchronous messages and flush them to
>> underlying files periodically.

IMO the discussions last year left it still an open question whether this was a general purpose facility that we really needed/wanted in hotspot. I think there is still a larger design discussion to be had for a general purpose facility versus the in-house version you have been using that suited your needs.

I'm piggy-backing on some of Thomas's comments below.

> Hi Xin,
>
> thank you for your detailed answers.
>
> As I wrote, I think this is a useful change. A prior design discussion with a rough sketch would have made things easier. Also, it would have been good to have the CSR discussion beforehand, since it affects how complex the implementation needs to be. I don't know whether there had been design discussions beforehand; if I missed them, I apologize.
>
> I am keenly aware that design discussions often lead nowhere because no-one answers. So I understand why you started with a patch.
>
> About your proposal:
>
> I do not think it can be made airtight, and I think that is okay - if we work with a limited flush buffer and we log too much, things will get dropped, that is unavoidable. But it has to be reliable and comprehensible after the fact.
>
> As you write, the patch you propose works well with AWS, but I suspect that is an environment with limited variables, and outside use of the VM could be much more diverse. We must make sure to roll out only well designed solutions which work for us all.
>
> E.g. a log system which randomly omits log entries because some internal buffer is full without giving any indication *in the log itself* is a terrible idea :). Since log files are a cornerstone for our support, I am interested in a good solution.
>
> First off, the CSR:
>
> ---
> 1) With what you propose, we could have an arbitrary combination of targets with different log files and different async options:
> java -Xlog:os*:file=os.log::async=false -Xlog:os+thread:file=thread.log::async=true
>
>
> Do we really need that much freedom? How probable is it that someone wants different async options for different trace sinks? The more freedom we have here the more complex the implementation gets. All that stuff has to be tested. Why not just make "async" a global setting.

Truly global or global for all actual file-based logging? I think perhaps the latter. If we need per log file control that could be added later.

> 2) AsyncLogBufferSize should be a user-definable memory size, not "number of slots". The fact that internally we keep a vector of disjoint memory snippets is an implementation detail; the user should just give a memory size and the VM should interpret this. This leaves us the freedom to later change the implementation as we see fit.

I'm not sure it should be a bounded size at all. I don't like the idea of needing yet another VM flag to control this. Why can't the design accommodate unlimited buffer space as determined by available memory?

> 3) LogAsyncInterval should not exist at all. If it has to exist, it should be a diagnostic switch, not a production one; but ideally, we would just log as soon as there is something to log, see below.

The logging interval should be configurable IMO, so it either needs a product switch, or preferably is set as a global logging option via the Xlog command if that is possible.

>
> ---
>
> Implementation:
>
> The use of the WatcherThread and PeriodicTask. Polling is plain inefficient, besides the concerns Robbin voiced about blocking things. This is a typical producer-consumer problem, and I would implement it using an own dedicated flusher thread and a monitor. The flusher thread should wake only if there is something to write. This is something I would not do in a separate RFE but now. It would also disarm any arguments against blocking the WatcherThread.

I agree with Thomas here. Using the WatcherThread for this is not really appropriate, though I understand it is convenient to use a thread that is already present and, crucially, one that does not block for safepoints. And we don't need multiple WatcherThreads or a way to "spread the load" as the Watcher thread is very lightly loaded. So a new kind of NonJavaThread should be introduced for this purpose - if it is truly required - and directly synchronize with it rather than using polling.
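As an illustration of the monitor-based alternative to polling, here is a minimal producer/consumer sketch. It uses standard C++ threads in place of HotSpot's `Monitor` and `NonJavaThread` classes, so it shows the synchronization pattern only. The class name `AsyncFlusher` and its methods are hypothetical, and the unbounded queue with no drop policy is a simplification.

```cpp
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <utility>

// Hypothetical flusher: a dedicated thread that sleeps until there is
// something to write, instead of waking up on a fixed interval.
class AsyncFlusher {
 public:
  explicit AsyncFlusher(std::function<void(const std::string&)> sink)
      : _sink(std::move(sink)), _thread([this] { run(); }) {}

  // Asks the thread to stop and waits for it. Messages that are still
  // queued are written before the thread exits.
  ~AsyncFlusher() {
    {
      std::lock_guard<std::mutex> lock(_mutex);
      _stopping = true;
    }
    _cv.notify_one();
    _thread.join();
  }

  // Producer side: move the message into the queue and wake the flusher.
  void enqueue(std::string msg) {
    {
      std::lock_guard<std::mutex> lock(_mutex);
      _queue.push_back(std::move(msg));
    }
    _cv.notify_one();
  }

 private:
  void run() {
    std::unique_lock<std::mutex> lock(_mutex);
    for (;;) {
      // Sleep until there is work or a stop request.
      _cv.wait(lock, [this] { return _stopping || !_queue.empty(); });
      if (_queue.empty()) return;  // only reachable when stopping
      std::deque<std::string> batch;
      batch.swap(_queue);
      lock.unlock();               // do the slow output without the lock
      for (const std::string& m : batch) _sink(m);
      lock.lock();
    }
  }

  std::function<void(const std::string&)> _sink;
  std::mutex _mutex;
  std::condition_variable _cv;
  std::deque<std::string> _queue;
  bool _stopping = false;
  std::thread _thread;  // declared last: starts after the other members exist
};
```

A caller would construct it with a sink such as a function that writes one line to a file, then call `enqueue()` from any thread. Producers hold the lock only long enough to push one entry, so in this sketch a logging thread never waits on the output itself.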
Further, we must ensure that the flushing thread cannot block the VMThread, if the VMThread is doing logging.

> ----
>
> The fact that every log message gets strduped could be done better. This can be left for a future RFE - but it explains why I dislike "AsyncLogBufferSize" being "number of entries" instead of a memory size.

If we had had async logging from day one then the way we construct log messages would have been done differently. With all the logging sites assuming synchronous logging, with stack-local allocations/resources, we have no choice but to copy messages to memory with a suitable lifetime.

As I said at the start, I think there needs to be a return to the high level architectural design discussion for this feature. The PR serves as a proof-of-concept, but should not IMO be presented as the right general solution for all hotspot users.

Regards,
David
-----

> -------------
>
> PR: https://git.openjdk.java.net/jdk/pull/3135
>

From kbarrett at openjdk.java.net Mon Mar 29 02:20:27 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Mon, 29 Mar 2021 02:20:27 GMT
Subject: RFR: 8264271: Avoid creating non_oop_word oops
In-Reply-To:
References:
Message-ID:

On Fri, 26 Mar 2021 12:01:35 GMT, Stefan Karlsson wrote:

> Some parts of the JVM put a marker to show that a location does not contain a valid oop. The code that handles this typically looks like this:
>
> oop* p = ...
> if (*p != Universe::non_oop_word())
>
> This means that sometimes the *p will create an oop that contains the non_oop_word. This makes it problematic to add stricter oop verification. I propose that we add a new function that checks the value of locations without converting it to an oop.
>
> (Note: I'm testing the new dependent pull Skara feature with this PR. It depends on the pr/3214 branch)

Looks good.

-------------

Marked as reviewed by kbarrett (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3215

From ngasson at openjdk.java.net Mon Mar 29 03:15:26 2021
From: ngasson at openjdk.java.net (Nick Gasson)
Date: Mon, 29 Mar 2021 03:15:26 GMT
Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic
In-Reply-To:
References:
Message-ID:

On Sat, 27 Mar 2021 09:54:57 GMT, Andrew Haley wrote:

>> In JDK-8248188, an IntrinsicCandidate and API were added for Base64 decoding.
>> Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords.
>>
>> Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build.
>> Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` were run specifically to check the correctness of the implementation.
>>
>> There can be illegal characters at the start of the input if the data is MIME encoded.
>> There would be no benefit from using SIMD in this case, so the stub uses non-SIMD instructions for MIME encoded data now.
>>
>> A JMH micro, Base64Decode.java, is added for performance testing.
>> With different input lengths (upper-bounded by parameter `maxNumBytes` in the JMH micro),
>> we see ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decoding, and minor improvements (~10.95%) for MIME on Kunpeng916.
>>
>> The Base64Decode.java JMH micro-benchmark results:
>>
>> Benchmark (lineSize) (maxNumBytes) Mode Cnt Score Error Units
>>
>> # Kunpeng916, intrinsic
>> Base64Decode.testBase64Decode 4 1 avgt 5 48.614 ± 0.609 ns/op
>> Base64Decode.testBase64Decode 4 3 avgt 5 58.199 ± 1.650 ns/op
>> Base64Decode.testBase64Decode 4 7 avgt 5 69.400 ± 0.931 ns/op
>> Base64Decode.testBase64Decode 4 32 avgt 5 96.818 ± 1.687 ns/op
>> Base64Decode.testBase64Decode 4 64 avgt 5 122.856 ± 9.217 ns/op
>> Base64Decode.testBase64Decode 4 80 avgt 5 130.935 ± 1.667 ns/op
>> Base64Decode.testBase64Decode 4 96 avgt 5 143.627 ± 1.751 ns/op
>> Base64Decode.testBase64Decode 4 112 avgt 5 152.311 ± 1.178 ns/op
>> Base64Decode.testBase64Decode 4 512 avgt 5 342.631 ± 0.584 ns/op
>> Base64Decode.testBase64Decode 4 1000 avgt 5 573.635 ± 1.050 ns/op
>> Base64Decode.testBase64Decode 4 20000 avgt 5 9534.136 ± 45.172 ns/op
>> Base64Decode.testBase64Decode 4 50000 avgt 5 22718.726 ± 192.070 ns/op
>> Base64Decode.testBase64MIMEDecode 4 1 avgt 10 63.558 ± 0.336 ns/op
>> Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.504 ± 0.848 ns/op
>> Base64Decode.testBase64MIMEDecode 4 7 avgt 10 120.591 ± 0.608 ns/op
>> Base64Decode.testBase64MIMEDecode 4 32 avgt 10 324.314 ± 6.236 ns/op
>> Base64Decode.testBase64MIMEDecode 4 64 avgt 10 532.678 ± 4.670 ns/op
>> Base64Decode.testBase64MIMEDecode 4 80 avgt 10 678.126 ± 4.324 ns/op
>> Base64Decode.testBase64MIMEDecode 4 96 avgt 10 771.603 ± 6.393 ns/op
>> Base64Decode.testBase64MIMEDecode 4 112 avgt 10 889.608 ± 0.759 ns/op
>> Base64Decode.testBase64MIMEDecode 4 512 avgt 10 3663.557 ± 3.422 ns/op
>> Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7017.784 ± 9.128 ns/op
>> Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 128670.660 ± 7951.521 ns/op
>> Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 317113.667 ± 161.758 ns/op
>>
>> # Kunpeng916, default
>> Base64Decode.testBase64Decode 4 1 avgt 5 48.455 ± 0.571 ns/op
>> Base64Decode.testBase64Decode 4 3 avgt 5 57.937 ± 0.505 ns/op
>> Base64Decode.testBase64Decode 4 7 avgt 5 73.823 ± 1.452 ns/op
>> Base64Decode.testBase64Decode 4 32 avgt 5 106.484 ± 1.243 ns/op
>> Base64Decode.testBase64Decode 4 64 avgt 5 141.004 ± 1.188 ns/op
>> Base64Decode.testBase64Decode 4 80 avgt 5 156.284 ± 0.572 ns/op
>> Base64Decode.testBase64Decode 4 96 avgt 5 174.137 ± 0.177 ns/op
>> Base64Decode.testBase64Decode 4 112 avgt 5 188.445 ± 0.572 ns/op
>> Base64Decode.testBase64Decode 4 512 avgt 5 610.847 ± 1.559 ns/op
>> Base64Decode.testBase64Decode 4 1000 avgt 5 1155.368 ± 0.813 ns/op
>> Base64Decode.testBase64Decode 4 20000 avgt 5 19751.477 ± 24.669 ns/op
>> Base64Decode.testBase64Decode 4 50000 avgt 5 50046.586 ± 523.155 ns/op
>> Base64Decode.testBase64MIMEDecode 4 1 avgt 10 64.130 ± 0.238 ns/op
>> Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.096 ± 0.205 ns/op
>> Base64Decode.testBase64MIMEDecode 4 7 avgt 10 118.849 ± 0.610 ns/op
>> Base64Decode.testBase64MIMEDecode 4 32 avgt 10 331.177 ± 4.732 ns/op
>> Base64Decode.testBase64MIMEDecode 4 64 avgt 10 549.117 ± 0.177 ns/op
>> Base64Decode.testBase64MIMEDecode 4 80 avgt 10 702.951 ± 4.572 ns/op
>> Base64Decode.testBase64MIMEDecode 4 96 avgt 10 799.566 ± 0.301 ns/op
>> Base64Decode.testBase64MIMEDecode 4 112 avgt 10 923.749 ± 0.389 ns/op
>> Base64Decode.testBase64MIMEDecode 4 512 avgt 10 4000.725 ± 2.519 ns/op
>> Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7674.994 ± 9.281 ns/op
>> Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 142059.001 ± 157.920 ns/op
>> Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 355698.369 ± 216.542 ns/op
>
> src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5578:
>
>> 5576: void generate_base64_decode_nosimdround(Register src, Register dst,
>> 5577: Register nosimd_codec, Label &Exit)
>> 5578: {
>
> We'd want enter/leave here so profiling tools can walk the stack.

That probably ought to go around the whole routine in generate_base64_decodeBlock rather than here?

-------------

PR: https://git.openjdk.java.net/jdk/pull/3228

From ngasson at openjdk.java.net Mon Mar 29 03:15:28 2021
From: ngasson at openjdk.java.net (Nick Gasson)
Date: Mon, 29 Mar 2021 03:15:28 GMT
Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic
In-Reply-To:
References:
Message-ID: <4q0UZGolKaHArH3ZQS7G_-EBspOa_zWomuGzWfYGNK0=.7b2a9668-a123-435a-822d-5ab2dc01a8c7@github.com>

On Sat, 27 Mar 2021 08:58:03 GMT, Dong Bo wrote:

> In JDK-8248188, an IntrinsicCandidate and API were added for Base64 decoding.
> Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3, a basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords.
>
> Patch passed jtreg tier1-3 tests with linux-aarch64-server-fastdebug build.
> Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` were run specifically to check the correctness of the implementation.
>
> There can be illegal characters at the start of the input if the data is MIME encoded.
> There would be no benefit from using SIMD in this case, so the stub uses non-SIMD instructions for MIME encoded data now.
>
> A JMH micro, Base64Decode.java, is added for performance testing.
> With different input lengths (upper-bounded by parameter `maxNumBytes` in the JMH micro),
> we see ~2.5x improvements with long inputs and no regression with short inputs for raw base64 decoding, and minor improvements (~10.95%) for MIME on Kunpeng916.
>
> The Base64Decode.java JMH micro-benchmark results:
>
> Benchmark (lineSize) (maxNumBytes) Mode Cnt Score Error Units
>
> # Kunpeng916, intrinsic
> Base64Decode.testBase64Decode 4 1 avgt 5 48.614 ± 0.609 ns/op
> Base64Decode.testBase64Decode 4 3 avgt 5 58.199 ± 1.650 ns/op
> Base64Decode.testBase64Decode 4 7 avgt 5 69.400 ± 0.931 ns/op
> Base64Decode.testBase64Decode 4 32 avgt 5 96.818 ± 1.687 ns/op
> Base64Decode.testBase64Decode 4 64 avgt 5 122.856 ± 9.217 ns/op
> Base64Decode.testBase64Decode 4 80 avgt 5 130.935 ± 1.667 ns/op
> Base64Decode.testBase64Decode 4 96 avgt 5 143.627 ± 1.751 ns/op
> Base64Decode.testBase64Decode 4 112 avgt 5 152.311 ± 1.178 ns/op
> Base64Decode.testBase64Decode 4 512 avgt 5 342.631 ± 0.584 ns/op
> Base64Decode.testBase64Decode 4 1000 avgt 5 573.635 ± 1.050 ns/op
> Base64Decode.testBase64Decode 4 20000 avgt 5 9534.136 ± 45.172 ns/op
> Base64Decode.testBase64Decode 4 50000 avgt 5 22718.726 ± 192.070 ns/op
> Base64Decode.testBase64MIMEDecode 4 1 avgt 10 63.558 ± 0.336 ns/op
> Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.504 ± 0.848 ns/op
> Base64Decode.testBase64MIMEDecode 4 7 avgt 10 120.591 ± 0.608 ns/op
> Base64Decode.testBase64MIMEDecode 4 32 avgt 10 324.314 ± 6.236 ns/op
> Base64Decode.testBase64MIMEDecode 4 64 avgt 10 532.678 ± 4.670 ns/op
> Base64Decode.testBase64MIMEDecode 4 80 avgt 10 678.126 ± 4.324 ns/op
> Base64Decode.testBase64MIMEDecode 4 96 avgt 10 771.603 ± 6.393 ns/op
> Base64Decode.testBase64MIMEDecode 4 112 avgt 10 889.608 ± 0.759 ns/op
> Base64Decode.testBase64MIMEDecode 4 512 avgt 10 3663.557 ± 3.422 ns/op
> Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7017.784 ± 9.128 ns/op
> Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 128670.660 ± 7951.521 ns/op
> Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 317113.667 ± 161.758 ns/op
>
> # Kunpeng916, default
> Base64Decode.testBase64Decode 4 1 avgt 5 48.455 ± 0.571 ns/op
> Base64Decode.testBase64Decode 4 3 avgt 5 57.937 ± 0.505 ns/op
> Base64Decode.testBase64Decode 4 7 avgt 5 73.823 ± 1.452 ns/op
> Base64Decode.testBase64Decode 4 32 avgt 5 106.484 ± 1.243 ns/op
> Base64Decode.testBase64Decode 4 64 avgt 5 141.004 ± 1.188 ns/op
> Base64Decode.testBase64Decode 4 80 avgt 5 156.284 ± 0.572 ns/op
> Base64Decode.testBase64Decode 4 96 avgt 5 174.137 ± 0.177 ns/op
> Base64Decode.testBase64Decode 4 112 avgt 5 188.445 ± 0.572 ns/op
> Base64Decode.testBase64Decode 4 512 avgt 5 610.847 ± 1.559 ns/op
> Base64Decode.testBase64Decode 4 1000 avgt 5 1155.368 ± 0.813 ns/op
> Base64Decode.testBase64Decode 4 20000 avgt 5 19751.477 ± 24.669 ns/op
> Base64Decode.testBase64Decode 4 50000 avgt 5 50046.586 ± 523.155 ns/op
> Base64Decode.testBase64MIMEDecode 4 1 avgt 10 64.130 ± 0.238 ns/op
> Base64Decode.testBase64MIMEDecode 4 3 avgt 10 82.096 ± 0.205 ns/op
> Base64Decode.testBase64MIMEDecode 4 7 avgt 10 118.849 ± 0.610 ns/op
> Base64Decode.testBase64MIMEDecode 4 32 avgt 10 331.177 ± 4.732 ns/op
> Base64Decode.testBase64MIMEDecode 4 64 avgt 10 549.117 ± 0.177 ns/op
> Base64Decode.testBase64MIMEDecode 4 80 avgt 10 702.951 ± 4.572 ns/op
> Base64Decode.testBase64MIMEDecode 4 96 avgt 10 799.566 ± 0.301 ns/op
> Base64Decode.testBase64MIMEDecode 4 112 avgt 10 923.749 ± 0.389 ns/op
> Base64Decode.testBase64MIMEDecode 4 512 avgt 10 4000.725 ± 2.519 ns/op
> Base64Decode.testBase64MIMEDecode 4 1000 avgt 10 7674.994 ± 9.281 ns/op
> Base64Decode.testBase64MIMEDecode 4 20000 avgt 10 142059.001 ± 157.920 ns/op
> Base64Decode.testBase64MIMEDecode 4 50000 avgt 10 355698.369 ± 216.542 ns/op

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5624:

> 5622: __ ld4(in0, in1, in2, in3, arrangement, __ post(src, 4 * size));
> 5623:
> 5624: // we need unsigned saturationg substract, to make sure all input values

"saturating subtract"

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5649:

> 5647: __ orr(decL3, arrangement, decL3, decH3);
> 5648:
> 5649: // check iilegal inputs, value larger than 63 (maximum of 6 bits)

"illegal inputs". Are there existing jtreg tests that cover these cases?

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5772:

> 5770: // The value of index 64 is set to 0, so that we know that we already get the
> 5771: // decoded data with the 1st lookup.
> 5772: static const uint8_t fromBase64ForSIMD[128] = {

This table and the one below seem to be identical to the first half of the NoSIMD tables. Can't you just use one set of 256-entry tables for both SIMD and non-SIMD algorithms?

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5803:

> 5801: Register dst = c_rarg3; // dest array
> 5802: Register doff = c_rarg4; // position for writing to dest array
> 5803: Register isURL = c_rarg5; // Base64 or URL chracter set

"character set"

src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 5830:

> 5828:
> 5829: // The 1st character of the input can be illegal if the data is MIME encoded.
> 5830: // We can not benefits from SIMD for this case. The max line size of MIME

"cannot benefit"

-------------

PR: https://git.openjdk.java.net/jdk/pull/3228

From ngasson at openjdk.java.net Mon Mar 29 03:18:27 2021
From: ngasson at openjdk.java.net (Nick Gasson)
Date: Mon, 29 Mar 2021 03:18:27 GMT
Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic
In-Reply-To:
References:
Message-ID:

On Sat, 27 Mar 2021 09:53:37 GMT, Andrew Haley wrote:

>
> There's a lot of unrolling, particularly in the non-SIMD case. Please consider taking out some of the unrolling; I suspect it'd not increase time by very much but would greatly reduce the code cache pollution. It's very tempting to unroll everything to make a benchmark run quickly, but we have to take a balanced approach.

But there's only ever one of these generated at startup, right? It's not like the string intrinsics that are expanded at every call site.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3228

From dongbo at openjdk.java.net Mon Mar 29 03:33:34 2021
From: dongbo at openjdk.java.net (Dong Bo)
Date: Mon, 29 Mar 2021 03:33:34 GMT
Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic
In-Reply-To:
References:
Message-ID: <_ZrhnM9OyXLckhtT27laLzWPZbCFZTPjm6ePbZdbyOs=.fcc6aaba-1578-443a-aa57-8141a99231f6@github.com>

On Mon, 29 Mar 2021 03:15:57 GMT, Nick Gasson wrote:

>> Firstly, I wonder how important this is for most applications. I don't actually know, but let's put that to one side.
>>
>> There's a lot of unrolling, particularly in the non-SIMD case. Please consider taking out some of the unrolling; I suspect it'd not increase time by very much but would greatly reduce the code cache pollution. It's very tempting to unroll everything to make a benchmark run quickly, but we have to take a balanced approach.
>
>>
>> There's a lot of unrolling, particularly in the non-SIMD case. Please consider taking out some of the unrolling; I suspect it'd not increase time by very much but would greatly reduce the code cache pollution.
>> It's very tempting to unroll everything to make a benchmark run quickly, but we have to take a balanced approach.
>
> But there's only ever one of these generated at startup, right? It's not like the string intrinsics that are expanded at every call site.

Thanks for the comments.

> Firstly, I wonder how important this is for most applications. I don't actually know, but let's put that to one side.

As claimed in JEP 135, Base64 is frequently used to encode binary/octet sequences that are transmitted as textual data.
It is commonly used by applications using Multipurpose Internet Mail Extensions (MIME), encoding passwords for HTTP headers, message digests, etc.

> There's a lot of unrolling, particularly in the non-SIMD case. Please consider taking out some of the unrolling; I suspect it'd not increase time by very much but would greatly reduce the code cache pollution. It's very tempting to unroll everything to make a benchmark run quickly, but we have to take a balanced approach.

There is no code unrolling in the non-SIMD case. The instructions just load, process and store data within loops.
About half of the code size is the error handling in the SIMD case:

    // handle illegal input
    if (size == 16) {
      Label ErrorInLowerHalf;
      __ umov(rscratch1, in2, __ D, 0);
      __ cbnz(rscratch1, ErrorInLowerHalf);

      // illegal input is in higher half, store the lower half now.
      __ st3(out0, out1, out2, __ T8B, __ post(dst, 24));

      for (int i = 8; i < 15; i++) {
        __ umov(rscratch2, in2, __ B, (u1) i);
        __ cbnz(rscratch2, Exit);
        __ umov(r10, out0, __ B, (u1) i);
        __ umov(r11, out1, __ B, (u1) i);
        __ umov(r12, out2, __ B, (u1) i);
        __ strb(r10, __ post(dst, 1));
        __ strb(r11, __ post(dst, 1));
        __ strb(r12, __ post(dst, 1));
      }
      __ b(Exit);

I think I can rewrite this part as loops.
With an initial implementation, the code size is reduced by almost half (1312B -> 748B). Does that sound OK to you?

> Please consider losing the non-SIMD case. It doesn't result in any significant gain.
> The non-SIMD case is useful for MIME decoding performance. MIME Base64-encoded data is arranged in lines (the line size can be set by the user, up to a maximum of 76B). Newline characters, e.g. `\r\n`, are illegal but can be ignored by MIME decoding. The SIMD case works as `load data -> two vector table lookups -> combining -> error detection -> store data`. When using SIMD for MIME decoding, the first byte of the input may be a newline character. The SIMD case then executes a lot of wasted code before it can detect the error and exit, whereas in the non-SIMD case only a few ldrs, orrs, and strs are needed for error detection. ------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From dongbo at openjdk.java.net Mon Mar 29 03:54:26 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Mon, 29 Mar 2021 03:54:26 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 03:15:57 GMT, Nick Gasson wrote: >> Firstly, I wonder how important this is for most applications. I don't actually know, but let's put that to one side. >> >> There's a lot of unrolling, particularly in the non-SIMD case. Please consider taking out some of the unrolling; I suspect it'd not increase time by very much but would greatly reduce the code cache pollution. It's very tempting to unroll everything to make a benchmark run quickly, but we have to take a balanced approach. > >> >> There's a lot of unrolling, particularly in the non-SIMD case. Please consider taking out some of the unrolling; I suspect it'd not increase time by very much but would greatly reduce the code cache pollution. It's very tempting to unroll everything to make a benchmark run quickly, but we have to take a balanced approach. > > But there's only ever one of these generated at startup, right? It's not like the string intrinsics that are expanded at every call site. @nick-arm Thank you for watching this.
> That probably ought to go around the whole routine in generate_base64_decodeBlock rather than here? > There are two non-SIMD blocks in this intrinsic. The 1st is at the beginning, mainly to route MIME decoding to non-SIMD processing because of the performance issue I described earlier. The 2nd is at the end to handle trailing inputs. So I guess we need generate_base64_decode_nosimdround here. > "illegal inputs". Are there existing jtreg tests that cover these cases? > Yes, they are covered by `test/hotspot/jtreg/compiler/intrinsics/base64/TestBase64.java`. > This table and the one below seem to be identical to first half of the NoSIMD tables. Can't you just use one set of 256-entry tables for both SIMD and non-SIMD algorithms? > They are not identical, `*ForSIMD[64]==0`, `*forNoSIMD[64]=255`. In the SIMD case, `*ForSIMD[64]` acts as a pivot that tells us, when performing the 2nd lookup, that the decoded data was already obtained by the 1st lookup. ------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From dholmes at openjdk.java.net Mon Mar 29 05:03:24 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 29 Mar 2021 05:03:24 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 21:47:46 GMT, Coleen Phillimore wrote: > This change creates a Metaspace::allocate function that doesn't pass TRAPS to be used by MethodCounters. TRAPS and exceptions shouldn't be thrown from non-JavaThreads. > > Tested with tier1-7. Looks good! Thanks, David ------------- Marked as reviewed by dholmes (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/3207 From dholmes at openjdk.java.net Mon Mar 29 05:03:25 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 29 Mar 2021 05:03:25 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread In-Reply-To: References: <7vek4vBeq_m6kHjh6V5999ZsZYJ-SAvTg3LCtajlBrk=.25130296-3ca8-4437-9528-2a38447569db@github.com> Message-ID: On Fri, 26 Mar 2021 13:24:38 GMT, Coleen Phillimore wrote: >> src/hotspot/share/memory/metaspace.cpp line 816: >> >>> 814: } >>> 815: >>> 816: return result; >> >> Shouldn't we still try to find more memory by calling Universe::heap()->satisfy_failed_metadata_allocation before giving up? Or can that not work if called from the VMThread? > > It can't work from the VMThread. Patricio and I were chatting yesterday and he pointed out neither of these VM operations can nest (VM_ChangeBreakpoints and VM_MetaspaceGC) making up names but you get the point. Okay. ------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From david.holmes at oracle.com Mon Mar 29 05:10:59 2021 From: david.holmes at oracle.com (David Holmes) Date: Mon, 29 Mar 2021 15:10:59 +1000 Subject: RFR: 8264240: [macos_aarch64] enable appcds support after JDK-8263002 In-Reply-To: <9Omimx03OQ8un-iKii6PJNQEiBe1SmGnWrlcR3LSvx0=.2feee97b-728c-4a1a-b62f-a0474ab44dc1@github.com> References: <9Omimx03OQ8un-iKii6PJNQEiBe1SmGnWrlcR3LSvx0=.2feee97b-728c-4a1a-b62f-a0474ab44dc1@github.com> Message-ID: <22767f9f-c85c-f83f-6ebc-88ee35c8175b@oracle.com> On Fri, 26 Mar 2021 17:18:22 GMT, Vladimir Kempik wrote: > > Please review this small patch for macos_aarch64. > It reverts small part of jep-391 where we disabled cds for macos_aarch64. > After JDK-8263002 is fixed, the appcds can be enabled back on macos_aarch64. > CDS related tests in tier1 now pass That tier 1 testing may not have been sufficient. With CDS enabled you will likely hit crashes due to JDK-8262894. 
David ----- > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3221 > From david.holmes at oracle.com Mon Mar 29 05:25:57 2021 From: david.holmes at oracle.com (David Holmes) Date: Mon, 29 Mar 2021 15:25:57 +1000 Subject: RFR: 8264285: Do not support FLAG_SET_XXX for VM flags of string type In-Reply-To: References: Message-ID: <1ae91ffe-b160-4c4a-0beb-17f50baa7401@oracle.com> Hi Ioi, On 27/03/2021 3:01 am, Ioi Lam wrote: > We have two versions of `JVMFlagAccess::ccstrAtPut()` that are slightly different. > > The following version is supposed to be used only by the `FLAG_SET_{CMDLINE,ERGO,MGMT}` macros. However, it's not used anywhere in the HotSpot source code: > > JVMFlag::Error JVMFlagAccess::ccstrAtPut(JVMFlagsEnum flag, ccstr value, JVMFlagOrigin origin) { > JVMFlag* faddr = JVMFlag::flag_from_enum(flag); > assert(faddr->is_ccstr(), "wrong flag type"); > ccstr old_value = faddr->get_ccstr(); > trace_flag_changed(faddr, old_value, value, origin); > char* new_value = os::strdup_check_oom(value); > faddr->set_ccstr(new_value); > if (!faddr->is_default() && old_value != NULL) { > // Prior value is heap allocated so free it. > FREE_C_HEAP_ARRAY(char, old_value); > } > faddr->set_origin(origin); > return JVMFlag::SUCCESS; > } > > It's not clear whether this unused version is actually correct since the last JVMFlag rewrite in [JDK-8081833](https://bugs.openjdk.java.net/browse/JDK-8081833), due to complete lack of testing. Let's remove this version to simplify code maintenance. I agree it may not work because it is untested but I don't agree it should be removed as we may want to set a string flag this way ... > If you need to modify flags of the string type, do not use `FLAG_SET_{CMDLINE,ERGO,MGMT}`. (A `static_assert` is added to prevent this). Instead, use the remaining version of `JVMFlagAccess::ccstrAtPut()`. ... and using ccstrAtPut doesn't update the origin of the flag as might be desired when using the macros. 
I'd rather see a test introduced to sanity check the operation if possible. There is a tension between writing balanced API's and not introducing "dead code". We seem to be swinging the style pendulum more to the "get rid of all dead code" end, rather than considering the utility of writing a balanced API that makes a subsystem functionally complete. :( David ----- > ------------- > > Commit messages: > - 8264285: Do not support FLAG_SET_XXX for VM flags of string type > > Changes: https://git.openjdk.java.net/jdk/pull/3219/files > Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3219&range=00 > Issue: https://bugs.openjdk.java.net/browse/JDK-8264285 > Stats: 33 lines in 3 files changed: 7 ins; 17 del; 9 mod > Patch: https://git.openjdk.java.net/jdk/pull/3219.diff > Fetch: git fetch https://git.openjdk.java.net/jdk pull/3219/head:pull/3219 > > PR: https://git.openjdk.java.net/jdk/pull/3219 > From iklam at openjdk.java.net Mon Mar 29 05:53:29 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 29 Mar 2021 05:53:29 GMT Subject: RFR: 8264285: Do not support FLAG_SET_XXX for VM flags of string type In-Reply-To: References: Message-ID: On Fri, 26 Mar 2021 16:21:03 GMT, Ioi Lam wrote: > We have two versions of `JVMFlagAccess::ccstrAtPut()` that are slightly different. > > The following version is supposed to be used only by the `FLAG_SET_{CMDLINE,ERGO,MGMT}` macros. However, it's not used anywhere in the HotSpot source code: > > JVMFlag::Error JVMFlagAccess::ccstrAtPut(JVMFlagsEnum flag, ccstr value, JVMFlagOrigin origin) { > JVMFlag* faddr = JVMFlag::flag_from_enum(flag); > assert(faddr->is_ccstr(), "wrong flag type"); > ccstr old_value = faddr->get_ccstr(); > trace_flag_changed(faddr, old_value, value, origin); > char* new_value = os::strdup_check_oom(value); > faddr->set_ccstr(new_value); > if (!faddr->is_default() && old_value != NULL) { > // Prior value is heap allocated so free it. 
> FREE_C_HEAP_ARRAY(char, old_value); > } > faddr->set_origin(origin); > return JVMFlag::SUCCESS; > } > > It's not clear whether this unused version is actually correct since the last JVMFlag rewrite in [JDK-8081833](https://bugs.openjdk.java.net/browse/JDK-8081833), due to complete lack of testing. Let's remove this version to simplify code maintenance. > > If you need to modify flags of the string type, do not use `FLAG_SET_{CMDLINE,ERGO,MGMT}`. (A `static_assert` is added to prevent this). Instead, use the remaining version of `JVMFlagAccess::ccstrAtPut()`. > > If you need to modify flags of the string type, do not use `FLAG_SET_{CMDLINE,ERGO,MGMT}`. (A `static_assert` is added to prevent this). Instead, use the remaining version of `JVMFlagAccess::ccstrAtPut()`. > > ... and using ccstrAtPut doesn't update the origin of the flag as might > be desired when using the macros. This is the version I removed: static JVMFlag::Error ccstrAtPut(JVMFlagsEnum flag, ccstr value, JVMFlagOrigin origin); This is the remaining version: static JVMFlag::Error ccstrAtPut(JVMFlag* flag, ccstr* value, JVMFlagOrigin origin); So they are practically the same API. The origin is changed in both. The only difference is a subtle, non-obvious one in how they handle the buffer allocation. ------------- PR: https://git.openjdk.java.net/jdk/pull/3219 From kbarrett at openjdk.java.net Mon Mar 29 08:03:32 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 29 Mar 2021 08:03:32 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v2] In-Reply-To: References: Message-ID: On Tue, 16 Mar 2021 09:06:38 GMT, Man Cao wrote: >> Hi all, >> >> Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. >> >> The shared LockFreeQueue is similar to the existing LockFreeStack.
The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. >> >> -Man > > Man Cao has updated the pull request incrementally with one additional commit since the last revision: > > Address comment and add a gtest. Sorry I've been slow to get back to this. I wanted to think about it some more and then lost track of it. src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 100: > 98: template > 99: template > 100: T* LockFreeQueue::pop() { On further consideration I don't think this `use_rcu` conditionalized `pop` is the right path. The current behavior (with the embedded critical section) was for the specific use case in `G1DirtyCardQueueSet`. But for a general tool, I think a different approach is needed. I think better would be to not provide `pop()` at all, and instead provide `try_pop()`, which has a tri-status result: success, lost a race, or lost to an in-progress operation. So something like: enum class LockFreeQueuePopStatus { success, lost_race, operation_in_progress }; // Member of LockFreeQueue // Executes the body of the old pop loop once, with appropriate // adjustments to return value and returning rather than retrying. Pair try_pop(); Then let the specific use-case determine the context in which try_pop should be called and how to handle the various possible results. This eliminates `ConditionalCriticalSection` (which seems strange). This also eliminates the `G1DirtyCardQueueSet`-specific subclass of `LockFreeQueue`. 
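As a rough, self-contained illustration of the tri-status idea described above (this is not the HotSpot code: `SimpleQueue`, `Node`, `PopStatus`, and `pop_or_give_up` are names made up for this sketch, `std::atomic` stands in for HotSpot's `Atomic`, and node reclamation and ABA protection are deliberately ignored), a single-attempt `try_pop()` and a caller-owned retry loop might look like this:

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Hypothetical status enum, mirroring the three outcomes named in the review.
enum class PopStatus { success, lost_race, operation_in_progress };

struct Node {
  int value;
  std::atomic<Node*> next;
  explicit Node(int v) : value(v), next(nullptr) {}
};

// Illustrative queue only; it assumes nodes are never freed while in use.
class SimpleQueue {
  std::atomic<Node*> _head{nullptr};
  std::atomic<Node*> _tail{nullptr};

public:
  using Result = std::pair<PopStatus, Node*>;

  void push(Node* n) {
    n->next.store(nullptr, std::memory_order_relaxed);
    Node* old_tail = _tail.exchange(n);   // full fence
    if (old_tail == nullptr) {
      _head.store(n);                     // queue was empty
    } else {
      old_tail->next.store(n);            // link behind the previous tail
    }
  }

  // Runs the body of a pop loop exactly once and reports what happened.
  Result try_pop() {
    Node* head = _head.load(std::memory_order_acquire);
    if (head == nullptr) {
      return Result(PopStatus::success, nullptr);   // empty queue
    }
    Node* next = head->next.load(std::memory_order_acquire);
    if (next != nullptr) {
      if (_head.compare_exchange_strong(head, next)) {
        return Result(PopStatus::success, head);
      }
      return Result(PopStatus::lost_race, nullptr); // another pop won
    }
    // head looks like the only element: try to claim it via the tail.
    Node* expected = head;
    if (_tail.compare_exchange_strong(expected, nullptr)) {
      expected = head;
      _head.compare_exchange_strong(expected, nullptr);
      return Result(PopStatus::success, head);
    }
    // The tail moved on: a push has swapped the tail but not yet linked next.
    return Result(PopStatus::operation_in_progress, nullptr);
  }
};

// The caller, not the queue, decides how to react to each status.
inline Node* pop_or_give_up(SimpleQueue& q) {
  while (true) {
    SimpleQueue::Result r = q.try_pop();
    switch (r.first) {
      case PopStatus::success:               return r.second;
      case PopStatus::operation_in_progress: return nullptr;
      case PopStatus::lost_race:             break;  // try again
    }
  }
}
```

In this sketch an empty queue is reported as `success` with a null node; a real implementation might choose a different convention.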
Instead we have (private) `G1DirtyCardQueueSet::pop_queue()`: BufferNode* G1DirtyCardQueueSet::pop_queue() { using Status = LockFreeQueuePopStatus; Thread* current_thread = Thread::current(); while (true) { GlobalCounter::CriticalSection cs(current_thread); Pair pop_result = _completed.try_pop(); switch (pop_result.first) { case Status::success: return pop_result.second; case Status::operation_in_progress: return nullptr; case Status::lost_race: break; // Try again. } } } I'm also not sure whether the G1 case actually needs the critical section inside the loop anymore. That might be a holdover from an earlier version where the operation-in-progress case did just loop to try again. It definitely does need a critical section though; the life cycle management for the BufferNodes depends on it. ------------- Changes requested by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2986 From kbarrett at openjdk.java.net Mon Mar 29 08:03:32 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 29 Mar 2021 08:03:32 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v2] In-Reply-To: References: Message-ID: On Tue, 16 Mar 2021 05:47:58 GMT, Man Cao wrote: >> The ordering of xchg in append only affects the writing thread. It does nothing for the reader side. The second load_acquire pairs with the set_next (a release_store) in append. However, it's always (one way or another) followed by a conservative cmpxchg, so does seem possible to weaken. The first load_acquire is a "consume", but we don't have that and upgrade to acquire. > > Agreed that the second load_acquire can be weakened, and thanks for noting the first one is a "consume". > However, I'm a bit confused about the explanation. >> The second load_acquire pairs with the set_next (a release_store) in append. > > The set_next() is Atomic::store() as in LockFreeStack, right? Then it is a relaxed store, but not a release_store. 
In this case the full fence provided by xchg is necessary to make set_next() a release_store. > > IIUC, the following is sufficient to establish a release-acquire ordering: > Writer thread: Reader thread: > StoreStoreFence(); > relaxed_store(p); > relaxed_load(p); > LoadLoadFence(); > In this case: > append() (Writer thread): pop() (Reader thread): > // Provides full fence > Atomic::xchg(&_tail, ...); > Atomic::store(&p._next, ...); > // Suppose we don't use load_acquire > Atomic::load(&p._next); > // Provides full fence > Atomic::cmpxchg(&_head /* or &_tail */, ...); > I think it is the two full fences that enable us to use relaxed store and relaxed load. I think you are correct. I was confused about set_next(). ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From ysuenaga at openjdk.java.net Mon Mar 29 08:03:58 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Mon, 29 Mar 2021 08:03:58 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp [v2] In-Reply-To: References: Message-ID: <5W8i9Wro1OWbzlUbEyeTy4TBLQmhWysLSjDcjadMygc=.a8348509-517b-48c4-be70-68b3ddb1088b@github.com> > I tried to build OpenJDK with g++-10.2.1_pre1-r3 on Alpine Linux 3.13.2, but I saw the following warning: > > > 668 | alloca(((pid ^ counter++) & 7) * 128); > | ^ > cc1plus: all warnings being treated as errors Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: Remove alloca() from some platforms ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3042/files - new: https://git.openjdk.java.net/jdk/pull/3042/files/006cf7d1..4485a021 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3042&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3042&range=00-01 Stats: 35 lines in 5 files changed: 3 ins; 27 del; 5 mod Patch: https://git.openjdk.java.net/jdk/pull/3042.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3042/head:pull/3042 PR:
https://git.openjdk.java.net/jdk/pull/3042 From ysuenaga at openjdk.java.net Mon Mar 29 08:13:26 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Mon, 29 Mar 2021 08:13:26 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp In-Reply-To: References: <2LTmcTAinL0BQn5t5ltR32S3uyQ-bwjKF8xRQ71DXf0=.9e9c1a88-7daf-4360-9faa-844a8bd6f489@github.com> <0l5lZg45-QTxenQ0RtyrX2q7PCHaO9Tm3FwzSC_ALK0=.6a7ee9d2-6019-45f9-8ffd-8c706a17aa51@github.com> <2jyfuXZyygeYZAtIiwDCiGC1BKSi1Ga_sot9xmW9EAc=.a9c24bd5-262e-44c1-aa6b-4a7de331e431@github.com> <4FT5wC9-SYkH2FekmMKbKGfHOOUwfS8vUnP KEylN2fk=.c00769fc-b287-40d4-ad18-f3c738e4e483@github.com> Message-ID: On Sun, 21 Mar 2021 06:40:18 GMT, Yasumasa Suenaga wrote: >> @magicus I checked assembly code of release build, it is similar to fastdebug build. The stack (RSP) is expanded. >> (I confirmed it with Visual Studio 16.9.2 because I received update notification before your reply...) >> // Try to randomize the cache line index of hot stack frames. >> // This helps when threads of the same stack traces evict each other's >> // cache lines. The threads can be either from the same JVM instance, or >> // from different JVM instances. The benefit is especially true for >> // processors with hyperthreading technology. 
>> static int counter = 0; >> int pid = os::current_process_id(); >> 00007FF80F4E10ED mov eax,dword ptr [_initial_pid (07FF80F9D8164h)] >> 00007FF80F4E10F3 test eax,eax >> 00007FF80F4E10F5 jne thread_native_entry+3Dh (07FF80F4E10FDh) >> 00007FF80F4E10F7 call qword ptr [__imp__getpid (07FF80F6EF748h)] >> _alloca(((pid ^ counter++) & 7) * 128); >> 00007FF80F4E10FD mov ecx,dword ptr [counter (07FF80F9D8300h)] >> 00007FF80F4E1103 mov edx,ecx >> 00007FF80F4E1105 xor edx,eax >> 00007FF80F4E1107 inc ecx >> 00007FF80F4E1109 mov dword ptr [counter (07FF80F9D8300h)],ecx >> 00007FF80F4E110F and edx,7 >> 00007FF80F4E1112 shl edx,7 >> 00007FF80F4E1115 mov eax,edx >> 00007FF80F4E1117 lea rcx,[rdx+0Fh] >> 00007FF80F4E111B cmp rcx,rax >> 00007FF80F4E111E ja thread_native_entry+6Ah (07FF80F4E112Ah) >> 00007FF80F4E1120 mov rcx,0FFFFFFFFFFFFFF0h >> 00007FF80F4E112A and rcx,0FFFFFFFFFFFFFFF0h >> 00007FF80F4E112E mov rax,rcx >> 00007FF80F4E1131 call __chkstk (07FF80F6ED370h) >> 00007FF80F4E1136 sub rsp,rcx > >> Did you measure on Alpine too, with muslc? And the XXXBsds? Are we sure we >> measure the right thing? I wish there were regression tests telling us when >> to re-apply this optimization. > > I think we can decide by whether `alloca()` or equivalent stack operation is contained in current binary. > If current binary does not contain it like JDK 17 Linux x64, we can remove it because it already does not work - performance degradation will not happen. > > In its context, I think Alpine + musl libc is also ok to remove it because I confirmed JDK 17 for Alpine x64 (from jdk.java.net) does not contain stack operation same as x64. 
> > 0000000000b889b0 : > b889b0: 55 push %rbp > b889b1: 48 89 e5 mov %rsp,%rbp > b889b4: 41 56 push %r14 > b889b6: 41 55 push %r13 > b889b8: 49 89 fd mov %rdi,%r13 > b889bb: 41 54 push %r12 > b889bd: 53 push %rbx > b889be: e8 6d a5 1a 00 callq d32f30 > b889c3: e8 e8 fb 6a ff callq 2385b0 > b889c8: 4c 89 ef mov %r13,%rdi > b889cb: 83 05 b6 98 61 00 01 addl $0x1,0x6198b6(%rip) # 11a2288 > b889d2: e8 f9 a4 1a 00 callq d32ed0 > b889d7: 49 8b 9d 68 02 00 00 mov 0x268(%r13),%rbx > b889de: 31 c0 xor %eax,%eax > >> (Please leave the alloca in the AIX implementation; we currently don't have >> the cycles to run regression tests for this) > > Ok, I got it. My colleague confirmed that `alloca()` (or equivalent stack operation) does not exist in JDK 16 from jdk.java.net: (lldb) disass libjvm.dylib`thread_native_entry: -> 0x103892030 <+0>: pushq %rbp 0x103892031 <+1>: movq %rsp, %rbp 0x103892034 <+4>: pushq %r15 0x103892036 <+6>: pushq %r14 0x103892038 <+8>: pushq %r13 0x10389203a <+10>: pushq %r12 0x10389203c <+12>: pushq %rbx 0x10389203d <+13>: pushq %rax 0x10389203e <+14>: movq %rdi, %r14 0x103892041 <+17>: callq 0x103a03690 ; Thread::record_stack_base_and_size() 0x103892046 <+22>: callq 0x103aed89a ; symbol stub for: getpid 0x10389204b <+27>: incl 0x452203(%rip) ; thread_native_entry(Thread*)::counter 0x103892051 <+33>: movq %r14, %rdi 0x103892054 <+36>: callq 0x103a03650 ; Thread::initialize_thread_current() 0x103892059 <+41>: movq 0x270(%r14), %rbx 0x103892060 <+48>: movq 0x50(%rbx), %r15 So I think we can safely remove `alloca()` from os_bsd and os_windows. On Linux, David confirmed it did not show any degradation in performance; however, we cannot evaluate platforms other than glibc. So I kept the `alloca()` call for builds that do not run on glibc, and modified it so that it takes effect. Could you review again? (At Thomas's request, I've not changed os_aix.cpp in this PR.)
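To make the arithmetic of the quoted `alloca(((pid ^ counter++) & 7) * 128)` concrete: the expression picks one of eight stack offsets, from 0 to 896 bytes in steps of 128. The helper below only mirrors that expression for illustration; the name `stack_skew_bytes` is hypothetical and is not part of the patch.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper mirroring ((pid ^ counter) & 7) * 128 from the thread.
// The low three bits of (pid ^ counter) select one of eight 128-byte steps.
inline std::size_t stack_skew_bytes(int pid, int counter) {
  return static_cast<std::size_t>((pid ^ counter) & 7) * 128;
}
```

Successive threads (increasing `counter`) therefore start their hot frames at different 128-byte offsets, which is the cache-line randomization the quoted source comment describes.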
------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From aph at openjdk.java.net Mon Mar 29 08:34:45 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Mon, 29 Mar 2021 08:34:45 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 03:15:57 GMT, Nick Gasson wrote: > > There's a lot of unrolling, particularly in the non-SIMD case. Please consider taking out some of the unrolling; I suspect it'd not increase time by very much but would greatly reduce the code cache pollution. It's very tempting to unroll everything to make a benchmark run quickly, but we have to take a balanced approach. > > But there's only ever one of these generated at startup, right? It's not like the string intrinsics that are expanded at every call site. I'm talking about icache pollution. This stuff could be quite small. ------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From aph at openjdk.java.net Mon Mar 29 08:41:27 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Mon, 29 Mar 2021 08:41:27 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic In-Reply-To: <_ZrhnM9OyXLckhtT27laLzWPZbCFZTPjm6ePbZdbyOs=.fcc6aaba-1578-443a-aa57-8141a99231f6@github.com> References: <_ZrhnM9OyXLckhtT27laLzWPZbCFZTPjm6ePbZdbyOs=.fcc6aaba-1578-443a-aa57-8141a99231f6@github.com> Message-ID: On Mon, 29 Mar 2021 03:28:54 GMT, Dong Bo wrote: > I think I can rewrite this part as loops. > With an intial implemention, we can have almost half of the code size reduced (1312B -> 748B). Sounds OK to you? Sounds great, but I'm still somewhat concerned that the non-SIMD case only offers 3-12% performance gain. Make it just 748 bytes, and therefore not icache-hostile, then perhaps the balance of risk and reward is justified. 
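For readers following the SIMD versus non-SIMD discussion in this thread, the scalar path can be pictured as one table lookup per input byte, with a sentinel value for illegal characters and a single OR-based check per four-byte group. The sketch below is plain C++ rather than the AArch64 stub; `make_decode_table` and `decode_groups` are hypothetical names, the 255 sentinel follows the `*forNoSIMD[64]=255` remark earlier in the thread, and `=` padding is not handled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Build a 256-entry table: 0..63 for legal characters, 255 for everything else.
inline std::array<std::uint8_t, 256> make_decode_table() {
  std::array<std::uint8_t, 256> table;
  table.fill(255);
  const char alphabet[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  for (int i = 0; i < 64; i++) {
    table[static_cast<unsigned char>(alphabet[i])] = static_cast<std::uint8_t>(i);
  }
  return table;
}

// Decodes whole 4-character groups and stops at the first group containing an
// illegal byte (for example a MIME line break). Returns the number of input
// characters consumed, so a caller can skip the illegal byte and resume.
inline std::size_t decode_groups(const std::string& src, std::string& dst) {
  static const std::array<std::uint8_t, 256> table = make_decode_table();
  std::size_t pos = 0;
  while (pos + 4 <= src.size()) {
    const std::uint8_t b0 = table[static_cast<unsigned char>(src[pos])];
    const std::uint8_t b1 = table[static_cast<unsigned char>(src[pos + 1])];
    const std::uint8_t b2 = table[static_cast<unsigned char>(src[pos + 2])];
    const std::uint8_t b3 = table[static_cast<unsigned char>(src[pos + 3])];
    // Legal values fit in 6 bits, so the OR exceeds 63 only if one is illegal.
    if ((b0 | b1 | b2 | b3) > 63) {
      break;
    }
    dst.push_back(static_cast<char>((b0 << 2) | (b1 >> 4)));
    dst.push_back(static_cast<char>(((b1 & 0x0f) << 4) | (b2 >> 2)));
    dst.push_back(static_cast<char>(((b2 & 0x03) << 6) | b3));
    pos += 4;
  }
  return pos;
}
```

The early exit costs only a few loads and one OR per group, which is the property claimed for the non-SIMD path when a newline appears near the start of the input.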
------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From Alan.Hayward at arm.com Mon Mar 29 09:01:51 2021 From: Alan.Hayward at arm.com (Alan Hayward) Date: Mon, 29 Mar 2021 09:01:51 +0000 Subject: RFC: JEP drafts PAC for Linux/AArch64 (JDK-8264130) and Arm64 for MacOS/AArch64 (JDK-8264131) Message-ID: <55F74A14-5AB8-44F7-8903-BE03AE087484@arm.com> Hi all, I've been investigating PAC for the AArch64 ports - figuring out what should be supported and trying it out in code. PAC is an AArch64 extension that provides instructions for signing and authenticating values and addresses; it can be used to bring protection against various types of attacks, for a small performance cost. If OpenJDK is running on a system with PAC protection enabled in the kernel, then it should use the feature. I've started by implementing the same support as GCC/LLVM - namely signing return addresses. So far I have this seemingly fully working in interpreter only; and C1/C2 crashing in deoptimization. I've also got an early attempt at MacOS arm64e. In addition to signing return addresses, arm64e requires signing function pointers. The upcoming PAuth ABI for Linux includes all of the above plus additional features. I've not made any attempt at this yet. All of this comes at a cost. Current estimate is 3% on average for signing return addresses. This almost vanishes on non PAC hardware, or when the feature is disabled, as in both cases the PAC instructions are treated as NOPs. Arm64e has the advantage that it is compiled twice within the same fat binary meaning the arm64e version will not have the extra NOPs. I've opened JEPs for both the Linux and Arm64e work. These are my first attempts at writing a JEP, so any comments would be greatly appreciated. PAC-RET protection for Linux/AArch64: https://bugs.openjdk.java.net/browse/JDK-8264130 Arm64e support for MacOS/AArch64: https://bugs.openjdk.java.net/browse/JDK-8264131 Thanks, Alan.
From kbarrett at openjdk.java.net Mon Mar 29 09:25:40 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Mon, 29 Mar 2021 09:25:40 GMT Subject: RFR: JDK-8260332: ParallelGC: Cooperative pretouch for oldgen expansion [v2] In-Reply-To: <8hmX1slrUPMWTec3K6Z1xSV7D6JEtbmLEox0JNZ93xQ=.66ea31cd-f702-4653-8a12-01cea1d50843@github.com> References: <8hmX1slrUPMWTec3K6Z1xSV7D6JEtbmLEox0JNZ93xQ=.66ea31cd-f702-4653-8a12-01cea1d50843@github.com> Message-ID: On Mon, 22 Mar 2021 17:39:35 GMT, Amit Pawar wrote: > > > [PreTouchParallelChunkSize_TestResults_UpdatedForOldGenCase.xlsx](https://github.com/openjdk/jdk/files/6147180/PreTouchParallelChunkSize_TestResults_UpdatedForOldGenCase.xlsx) > > > > > > I'm not sure how to interpret this. In particular, it says "Test done using November 27rd Openjdk build". That predates a number of recent changes in this area that seem relevant. Is that correct? Or is that stale and the data has been updated for a newer baseline. > > Sorry for the confusion. I didnt test again and pre-touch time taken with different chunk size per thread was already recorded in the spreadsheet and thought to use it for reference to reply to David feedback "A change like this needs a lot more testing than that, both functionally and performance.". Getting any benefit from parallel pretouch here suggests we have many threads piling up on the expand mutex. Before JDK-8260045 (pushed together with JDK-8260044 on 2/12/2021) that would lead to "expansion storms", where several threads piled up on that mutex and serially entered and did a new expansion (with associated pretouch, and possibly of significant size).
Getting rid of the expansion storms may alleviate the serial pretouch quite a bit. There may still be some benefit to cooperative parallelization, but new measurements are needed to determine how much of a problem still exists. If it is still worth addressing, I think the proposed approach has problems, as it makes messy complicated code messier and more complicated. I think some preliminary refactoring is needed. For example, splitting MutableSpace::initialize into a setup-pages part and set-the-range part. I've not explored in detail what's needed yet, pending new measurements. (I haven't had time to do those measurements yet myself.) ------------- PR: https://git.openjdk.java.net/jdk/pull/2976 From stefank at openjdk.java.net Mon Mar 29 09:26:18 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 29 Mar 2021 09:26:18 GMT Subject: RFR: 8264268: Don't use oop types for derived pointers [v2] In-Reply-To: <2UO90DoNZ6JCfM5sT-N5Wow2vHpk9lOdtlgqHY5ix6A=.662b95a7-977b-4fef-94c6-fa5b08c18733@github.com> References: <2UO90DoNZ6JCfM5sT-N5Wow2vHpk9lOdtlgqHY5ix6A=.662b95a7-977b-4fef-94c6-fa5b08c18733@github.com> Message-ID: > The JIT compiler embeds pointers to addresses within an object. These are called derived pointers. When the GC moves objects, these pointers need to be updated explicitly, because the GC only deals with the real oops of the objects (base pointer). > > The code that deals with this uses oop* for the address containing the base pointer. This is fine, the address contains an oop. However, it also uses oop* for the interior pointer, even though the contents is not a valid oop. > > This creates temporary oops that does not conform to the normal requirements for oops. For example, the lower three bits could be set. This makes it problematic to write stricter verification code. > > I propose that we use intptr_t* instead of oop*, and only use oop* when the location is known to contain a valid oop.
Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: - star alignment cleanup - Merge remote-tracking branch 'origin/master' into 8264268_dervied_pointer_types - Add static assert - Cleanups - derived_pointer enum class - 8264268: Don't use oop types for derived pointers ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3214/files - new: https://git.openjdk.java.net/jdk/pull/3214/files/38f494a4..d231e3ba Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3214&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3214&range=00-01 Stats: 1808 lines in 122 files changed: 1075 ins; 187 del; 546 mod Patch: https://git.openjdk.java.net/jdk/pull/3214.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3214/head:pull/3214 PR: https://git.openjdk.java.net/jdk/pull/3214 From stefank at openjdk.java.net Mon Mar 29 09:26:19 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 29 Mar 2021 09:26:19 GMT Subject: RFR: 8264268: Don't use oop types for derived pointers [v2] In-Reply-To: References: <2UO90DoNZ6JCfM5sT-N5Wow2vHpk9lOdtlgqHY5ix6A=.662b95a7-977b-4fef-94c6-fa5b08c18733@github.com> Message-ID: On Sat, 27 Mar 2021 23:03:49 GMT, John R Rose wrote: > Looks good. Thanks. > > Maybe add `typedef intptr_t derived_oop_t` to make the operations on d-oops a little more distinctive. I used Kim's enum suggestion. > > Maybe add a static assert that `sizeof(oop) == sizeof(intptr_t)`, just in case there's a problem with compressed oops in the future. Done. 
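A minimal sketch of the strongly typed approach mentioned above, assuming a scoped enum over `intptr_t` that supports only the arithmetic needed to rebase an interior pointer when its base object moves. The helper names `offset_from_base` and `rebase` are illustrative and may not match the actual patch; the `static_assert` here compares against `void*` as a stand-in for the `oop` size check discussed in the review.

```cpp
#include <cassert>
#include <cstdint>

// An opaque, pointer-sized integer: it cannot be used as an oop or
// dereferenced without an explicit cast.
enum class derived_pointer : std::intptr_t {};

static_assert(sizeof(derived_pointer) == sizeof(void*),
              "derived pointers are assumed to be pointer-sized");

// Distance of the interior pointer from its base object.
inline std::intptr_t offset_from_base(derived_pointer d, const void* base) {
  return static_cast<std::intptr_t>(d) - reinterpret_cast<std::intptr_t>(base);
}

// Recompute the interior pointer after the base object has been moved.
inline derived_pointer rebase(const void* new_base, std::intptr_t offset) {
  return static_cast<derived_pointer>(
      reinterpret_cast<std::intptr_t>(new_base) + offset);
}
```

Because the enum has no implicit conversions, code that accidentally treats the interior value as a real object pointer fails to compile, which is the kind of stricter checking the proposal is after.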
------------- PR: https://git.openjdk.java.net/jdk/pull/3214 From stefank at openjdk.java.net Mon Mar 29 09:26:21 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 29 Mar 2021 09:26:21 GMT Subject: RFR: 8264268: Don't use oop types for derived pointers [v2] In-Reply-To: References: <2UO90DoNZ6JCfM5sT-N5Wow2vHpk9lOdtlgqHY5ix6A=.662b95a7-977b-4fef-94c6-fa5b08c18733@github.com> Message-ID: On Sun, 28 Mar 2021 12:00:03 GMT, Kim Barrett wrote: > I agree with jrose that having a type name for derived pointers would be better. But I wondered if it might be possible to make it stronger than just an alias type for intptr_t. I tried making it an enum class with a few supporting operations, and it doesn't look too bad. > https://github.com/kimbarrett/openjdk-jdk/tree/derived_ptr Sounds good. I've taken your patch and did some minor cleanups. > src/hotspot/share/compiler/oopMap.cpp line 290: > >> 288: intptr_t* derived_loc = (intptr_t*)fr->oopmapreg_to_location(omv.reg(),reg_map); >> 289: guarantee(derived_loc != NULL, "missing saved register"); >> 290: oop *base_loc = fr->oopmapreg_to_oop_location(omv.content_reg(), reg_map); > > `oop *base_loc` -> `oop* base_loc` Done ------------- PR: https://git.openjdk.java.net/jdk/pull/3214 From stefank at openjdk.java.net Mon Mar 29 09:35:58 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Mon, 29 Mar 2021 09:35:58 GMT Subject: RFR: 8264268: Don't use oop types for derived pointers [v3] In-Reply-To: <2UO90DoNZ6JCfM5sT-N5Wow2vHpk9lOdtlgqHY5ix6A=.662b95a7-977b-4fef-94c6-fa5b08c18733@github.com> References: <2UO90DoNZ6JCfM5sT-N5Wow2vHpk9lOdtlgqHY5ix6A=.662b95a7-977b-4fef-94c6-fa5b08c18733@github.com> Message-ID: <4K7raDpK97UwvrxPZ9L_DD6mkxmxbQz-PzRQ3lafs2E=.0f86a291-e10d-4512-a892-4e32d0b4d528@github.com> > The JIT compiler embeds pointers to addresses within an object. These are called derived pointers.
When the GC moves objects, these pointers need to be updated explicitly, because the GC only deals with the real oops of the objects (base pointer).
>
> The code that deals with this uses oop* for the address containing the base pointer. This is fine: the address contains an oop. However, it also uses oop* for the interior pointer, even though the contents are not a valid oop.
>
> This creates temporary oops that do not conform to the normal requirements for oops. For example, the lower three bits could be set. This makes it problematic to write stricter verification code.
>
> I propose that we use intptr_t* instead of oop*, and only use oop* when the location is known to contain a valid oop.

Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:

  Remove unused value_fn parameter

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/3214/files
  - new: https://git.openjdk.java.net/jdk/pull/3214/files/d231e3ba..1318ab6a

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3214&range=02
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3214&range=01-02

  Stats: 7 lines in 2 files changed: 0 ins; 2 del; 5 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3214.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3214/head:pull/3214

PR: https://git.openjdk.java.net/jdk/pull/3214

From kbarrett at openjdk.java.net  Mon Mar 29 09:36:01 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Mon, 29 Mar 2021 09:36:01 GMT
Subject: RFR: 8264268: Don't use oop types for derived pointers [v2]
In-Reply-To:
References: <2UO90DoNZ6JCfM5sT-N5Wow2vHpk9lOdtlgqHY5ix6A=.662b95a7-977b-4fef-94c6-fa5b08c18733@github.com>
Message-ID:

On Mon, 29 Mar 2021 09:26:18 GMT, Stefan Karlsson wrote:

>> The JIT compiler embeds pointers to addresses within an object. These are called derived pointers.
When the GC moves objects, these pointers need to be updated explicitly, because the GC only deals with the real oops of the objects (base pointer).
>>
>> The code that deals with this uses oop* for the address containing the base pointer. This is fine: the address contains an oop. However, it also uses oop* for the interior pointer, even though the contents are not a valid oop.
>>
>> This creates temporary oops that do not conform to the normal requirements for oops. For example, the lower three bits could be set. This makes it problematic to write stricter verification code.
>>
>> I propose that we use intptr_t* instead of oop*, and only use oop* when the location is known to contain a valid oop.
>
> Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision:
>
>  - star alignment cleanup
>  - Merge remote-tracking branch 'origin/master' into 8264268_dervied_pointer_types
>  - Add static assert
>  - Cleanups
>  - derived_pointer enum class
>  - 8264268: Don't use oop types for derived pointers

Marked as reviewed by kbarrett (Reviewer).
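The update step described in the quoted text (derived pointers must be fixed up explicitly when the GC moves the base object) can be illustrated with a small sketch. It rests on assumptions: `DerivedEntry`, `record_derived` and `update_derived` are hypothetical names, objects are modelled as raw byte buffers, and this is not how oopMap.cpp is actually structured.

```cpp
#include <cassert>
#include <cstdint>

using std::intptr_t;

// One (base location, derived location) pair. The base location holds a real
// object pointer; the derived location holds an interior pointer value, which
// is kept as intptr_t because it is not a valid object pointer.
struct DerivedEntry {
  void**    base_loc;     // location holding the base pointer
  intptr_t* derived_loc;  // location holding the interior pointer value
  intptr_t  offset;       // derived - base, captured before the object moves
};

// Before the GC moves anything: remember how far into the object the
// interior pointer points.
inline DerivedEntry record_derived(void** base_loc, intptr_t* derived_loc) {
  assert(base_loc != nullptr && derived_loc != nullptr);
  intptr_t base = reinterpret_cast<intptr_t>(*base_loc);
  return DerivedEntry{base_loc, derived_loc, *derived_loc - base};
}

// After the GC has updated *base_loc to the object's new address: rebuild the
// interior pointer from the new base plus the saved offset.
inline void update_derived(const DerivedEntry& e) {
  *e.derived_loc = reinterpret_cast<intptr_t>(*e.base_loc) + e.offset;
}
```

Note that the saved offset need not be aligned (12 bytes, say), which is why the interior value can have low bits set that a real oop never would.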
-------------

PR: https://git.openjdk.java.net/jdk/pull/3214

From stefank at openjdk.java.net  Mon Mar 29 09:36:02 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Mon, 29 Mar 2021 09:36:02 GMT
Subject: RFR: 8264268: Don't use oop types for derived pointers [v3]
In-Reply-To:
References: <2UO90DoNZ6JCfM5sT-N5Wow2vHpk9lOdtlgqHY5ix6A=.662b95a7-977b-4fef-94c6-fa5b08c18733@github.com>
Message-ID:

On Sun, 28 Mar 2021 04:43:37 GMT, Kim Barrett wrote:

>> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Remove unused value_fn parameter
>
> src/hotspot/share/compiler/oopMap.cpp line 196:
>
>> 194: }
>> 195:
>> 196: static void add_derived_oop(oop* base, intptr_t* derived, OopClosure* oop_fn) {
>
> `add_derived_oop` and `ignore_derived_oop` ignore `oop_fn`. Maybe they should assert that it's `&do_nothing_cl`?

Hmm. `&do_nothing_cl` isn't being passed down to `oop_fn`. It's being passed down as `value_fn`, which isn't used at all. I've pushed a commit to remove this, to make it less confusing.

-------------

PR: https://git.openjdk.java.net/jdk/pull/3214

From stefank at openjdk.java.net  Mon Mar 29 09:38:29 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Mon, 29 Mar 2021 09:38:29 GMT
Subject: RFR: 8264271: Avoid creating non_oop_word oops
In-Reply-To:
References:
Message-ID:

On Mon, 29 Mar 2021 02:17:21 GMT, Kim Barrett wrote:

>> Some parts of the JVM put a marker to show that a location does not contain a valid oop. The code that handles this typically looks like this:
>>
>>     oop* p = ...
>>     if (*p != Universe::non_oop_word())
>>
>> This means that sometimes `*p` will create an oop that contains the non_oop_word. This makes it problematic to add stricter oop verification. I propose that we add a new function that checks the value of a location without converting it to an oop.
>>
>> (Note: I'm testing the new dependent pull Skara feature with this PR.
It depends on the pr/3214 branch)
>
> Looks good.

Thanks, Kim.

Now let's see what happens now that #3214 has been updated ...

-------------

PR: https://git.openjdk.java.net/jdk/pull/3215

From stefank at openjdk.java.net  Mon Mar 29 09:46:57 2021
From: stefank at openjdk.java.net (Stefan Karlsson)
Date: Mon, 29 Mar 2021 09:46:57 GMT
Subject: RFR: 8264271: Avoid creating non_oop_word oops [v2]
In-Reply-To:
References:
Message-ID:

> Some parts of the JVM put a marker to show that a location does not contain a valid oop. The code that handles this typically looks like this:
>
>     oop* p = ...
>     if (*p != Universe::non_oop_word())
>
> This means that sometimes `*p` will create an oop that contains the non_oop_word. This makes it problematic to add stricter oop verification. I propose that we add a new function that checks the value of a location without converting it to an oop.
>
> (Note: I'm testing the new dependent pull Skara feature with this PR. It depends on the pr/3214 branch)

Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains two additional commits since the last revision: - Merge branch '8264268_dervied_pointer_types' into 8264271_avoid_creating_non_oop_word_oops - 8264271: Avoid creating non_oop_word oops ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3215/files - new: https://git.openjdk.java.net/jdk/pull/3215/files/b3c1b292..41ce5fd1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3215&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3215&range=00-01 Stats: 1813 lines in 122 files changed: 1075 ins; 189 del; 549 mod Patch: https://git.openjdk.java.net/jdk/pull/3215.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3215/head:pull/3215 PR: https://git.openjdk.java.net/jdk/pull/3215 From ihse at openjdk.java.net Mon Mar 29 09:47:28 2021 From: ihse at openjdk.java.net (Magnus Ihse Bursie) Date: Mon, 29 Mar 2021 09:47:28 GMT Subject: RFR: 8263028: Windows build fails due to several treat-warning-as-errors In-Reply-To: References: <6cv_HeWJ9HsBrB7NSFU-TGl4PP0Tp820KzwJ-FRn_so=.e4d8ab11-4203-4278-a829-43b6f1626465@github.com> <_4snjvydeDKDu6aZgC1Fqr_kc6sHII7F7Ywinh3H9Rw=.61007752-81c3-44a0-911f-4dd184259966@github.com> Message-ID: On Sun, 28 Mar 2021 04:40:46 GMT, Yi Yang wrote: >> Based on Ioi's suggestion I decided to try with a different locale as well. I tried setting my system locale to something else and with that I was able to reproduce the warnings you report, so it _could_ be an issue with locale settings. AFAIK only `en-us` is supported. Maybe you could confirm/check your locale settings as well? (can run `systeminfo` to get the current setting) >> >> I've had problems in the past as well because I had the wrong locale set, and some of the tests were failing because of that. 
So, maybe rather than disabling the warnings, it might be more prudent to change the system locale of the used build systems to prevent similar issues in the future (FWIW, the display language doesn't seem to affect `cl` so that could still be whatever is convenient). > >> Based on Ioi's suggestion I decided to try with a different locale as well. I tried setting my system locale to something else and with that I was able to reproduce the warnings you report, so it _could_ be an issue with locale settings. AFAIK only `en-us` is supported. Maybe you could confirm/check your locale settings as well? (can run `systeminfo` to get the current setting) >> >> I've had problems in the past as well because I had the wrong locale set, and some of the tests were failing because of that. So, maybe rather than disabling the warnings, it might be more prudent to change the system locale of the used build systems to prevent similar issues in the future (FWIW, the display language doesn't seem to affect `cl` so that could still be whatever is convenient). > > Hi Jorn, > > Sorry for the delayed response. I set the locale of my Cygwin environment to en-us via `export LC_ALL="en_US.UTF-8"`, these warnings are generated when compiling as well as before. Should I change this setting globally instead of just changing it in Cygwin? > > Anyway, it seems that this problem is caused by the locale setting because as you mentioned, this problem appears when you change the locale setting to Chinese. Setting the locale to English does not have this problem. I checked the building document, but there is no mention of the need to set the locale option to en-us before building JDK. If this is really a necessary step for building, I think we should add this step in the building document, otherwise, I think we should fix this problem in HotSpot. > > Best Regards > Yang @kelthuzadx Hi Yang, Setting locale to US English used to be documented as a build requirement. 
When the "new" build-infra system was introduced several years ago, we thought that all locale-dependent issues were solved, and removed that requirement. Later on, issues crept in on non-Windows platforms, but these were handled by setting LC_ALL=C in the build system itself while building. The problem with requiring US English as locale on Windows is that you cannot set that for a single process, but must change the entire system locale for the user (which also often requires a reboot). Otoh, if we do *not* require US English, the test matrix grows almost without bounds, and we might run into a lot of weird problems (like this one!). So I'm not really comfortable with just patching around this issue, since: a) it does not occur in what is at least the "recommended" locale, and b) more issues are likely to creep up in the future (in fact, there might already be testing issues as Jorn says) On the other hand, I am not really comfortable either with just stating in the build document that US English is the only supported Windows locale, since it has such far-reaching consequences for the individual developers. In short, I'm torn between two bad solutions, but I'm definitely leaning towards the latter. If only there were some way of setting the locale just for cl.exe! 
:-( ------------- PR: https://git.openjdk.java.net/jdk/pull/3107 From stuefe at openjdk.java.net Mon Mar 29 09:59:30 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 29 Mar 2021 09:59:30 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging [v2] In-Reply-To: <5lZuUwWVD5NwXo_gUOnUDUD4tdYUvils5Cx5X5r8elo=.5d1f1074-730a-4f88-ba67-67977ffe58d0@github.com> References: <5lZuUwWVD5NwXo_gUOnUDUD4tdYUvils5Cx5X5r8elo=.5d1f1074-730a-4f88-ba67-67977ffe58d0@github.com> Message-ID: <37Cm5ItlFJ8nzQW_HI6-oyO_TuTK3f09BqiZ2-0l-iE=.fdde37ed-aa6b-4c28-bc30-0403542a518b@github.com> On Sat, 27 Mar 2021 07:27:17 GMT, Thomas Stuefe wrote: >> Xin Liu has updated the pull request incrementally with two additional commits since the last revision: >> >> - 8229517: Support for optional asynchronous/buffered logging >> >> add a constraint for the option LogAsyncInterval. >> - 8229517: Support for optional asynchronous/buffered logging >> >> LogMessage supports async_mode. >> remove the option AsyncLogging >> renanme the option GCLogBufferSize to AsyncLogBufferSize >> move drop_log() to LogAsyncFlusher. > > Hi Xin, > > thank you for your detailed answers. > > As I wrote, I think this is a useful change. A prior design discussion with a rough sketch would have made things easier. Also, it would have been good to have the CSR discussion beforehand, since it affects how complex the implementation needs to be. I don't know whether there had been design discussions beforehand; if I missed them, I apologize. > > I am keenly aware that design discussions often lead nowhere because no-one answers. So I understand why you started with a patch. > > About your proposal: > > I do not think it can be made airtight, and I think that is okay - if we work with a limited flush buffer and we log too much, things will get dropped, that is unavoidable. But it has to be reliable and comprehensible after the fact. 
>
> As you write, the patch you propose works well with AWS, but I suspect that is an environment with limited variables, and outside use of the VM could be much more diverse. We must make sure to roll out only well-designed solutions which work for us all.
>
> E.g. a log system which randomly omits log entries because some internal buffer is full without giving any indication *in the log itself* is a terrible idea :). Since log files are a cornerstone for our support, I am interested in a good solution.
>
> First off, the CSR:
>
> ---
> 1) With what you propose, we could have an arbitrary combination of targets with different log files and different async options:
>     java -Xlog:os*:file=os.log::async=false -Xlog:os+thread:file=thread.log::async=true
>
>
> Do we really need that much freedom? How probable is it that someone wants different async options for different trace sinks? The more freedom we have here, the more complex the implementation gets. All that stuff has to be tested. Why not just make "async" a global setting?
>
> 2) AsyncLogBufferSize should be a user-definable memory size, not "number of slots". The fact that internally we keep a vector of disjoint memory snippets is an implementation detail; the user should just give a memory size and the VM should interpret this. This leaves us the freedom to later change the implementation as we see fit.
>
> 3) LogAsyncInterval should not exist at all. If it has to exist, it should be a diagnostic switch, not a production one; but ideally, we would just log as soon as there is something to log, see below.
>
> ---
>
> Implementation:
>
> The use of the WatcherThread and PeriodicTask. Polling is plain inefficient, besides the concerns Robbin voiced about blocking things. This is a typical producer-consumer problem, and I would implement it using its own dedicated flusher thread and a monitor. The flusher thread should wake only if there is something to write. This is something I would not do in a separate RFE but now.
It would also disarm any arguments against blocking the WatcherThread.
>
> ----
>
> The fact that every log message gets strduped could be done better. This can be left for a future RFE - but it explains why I dislike "AsyncLogBufferSize" being "number of entries" instead of a memory size.
>
> I think processing a memory-size AsyncLogBufferSize can be kept simple: it would be okay to just guess an average log line length and go with that. Let's say 256 chars. An AsyncLogBufferSize=1M could thus be translated to 4096 entries in your solution. If the sum of all 4096 allocated lines overshoots 1M from time to time, well, so be it.
>
> A better future solution could use a preallocated fixed-size buffer. There are two ways to do this: the naive but memory-inefficient way, an array of fixed-size text slots like the event system uses, and a smart way, a ring buffer of variable-sized strings, '\0' separated, laid out one after the other in memory. The latter is a bit more involved, but can be done, and it would be fast and very memory efficient. But as I wrote, this is an optimization which can be postponed.
>
> ----
>
> I may misunderstand the patch, but do you resolve decorators when the flusher is printing? Would this not distort time-dependent decorators (timemillis, timenanos, uptime etc.), since we would record the time of printing, not the time of logging?
>
> If yes, it may be better to resolve the message early and just store the plain string and print that. Basically this would mean moving the whole buffering down a layer or two, right to where the raw strings get printed. This would be vastly simplified if we abandon the "async for every trace sink" notion in favor of just a global flag.
>
> This would also save a bit of space, since we would not have to carry all the meta information in `AsyncLogMessage` around. I count at least three 64-bit slots, possibly 4-5, which alone makes for ~40 bytes per message. Resolved decorators are often smaller than that.
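To make the buffering ideas in the quoted review concrete, here is a minimal sketch of a bounded buffer that drops the oldest entries when full and reports the drop in the flushed output itself, together with the byte-budget-to-entries guess (1M / 256 = 4096). It rests on assumptions: `BoundedLogBuffer`, `entries_for_budget` and the marker text are hypothetical, this is not the PR's `LogAsyncFlusher`, and locking is left out entirely (a real implementation shared between logging threads and a flusher would need a mutex or monitor around both operations).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// Assumed average log line length, used to turn a byte budget into a slot count.
constexpr std::size_t kAssumedAvgLineLength = 256;

inline std::size_t entries_for_budget(std::size_t bytes) {
  return bytes / kAssumedAvgLineLength;
}

// Single-threaded sketch of a bounded log buffer. No synchronization here.
class BoundedLogBuffer {
 public:
  explicit BoundedLogBuffer(std::size_t capacity) : _capacity(capacity), _dropped(0) {
    assert(capacity > 0);
  }

  // Enqueue a message; when the buffer is full, drop the oldest entry and
  // count the drop so it can be reported later.
  void enqueue(std::string msg) {
    if (_queue.size() == _capacity) {
      _queue.pop_front();
      ++_dropped;
    }
    _queue.push_back(std::move(msg));
  }

  // Hand all buffered messages to the caller (the flusher). If anything was
  // dropped since the last call, a marker line comes first, so that a reader
  // of the log can see that output is missing.
  std::vector<std::string> pop_all() {
    std::vector<std::string> out;
    if (_dropped > 0) {
      out.push_back("[" + std::to_string(_dropped) + " messages dropped]");
      _dropped = 0;
    }
    for (std::string& m : _queue) {
      out.push_back(std::move(m));
    }
    _queue.clear();
    return out;
  }

 private:
  std::size_t _capacity;
  std::size_t _dropped;
  std::deque<std::string> _queue;
};
```

Keeping the drop counter next to the queue means the indication ends up in the log file rather than only on the console, which is the property asked for earlier in the review.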
>
> Please find further remarks inline.
>
>> Hi Thomas,
>>
>> Thank you for reviewing this PR.
>>
>> > Hi Xin,
>> > I skimmed over the patch, but have a number of high level questions - things which have not been clear from your description.
>> >
>> > * Who does the writing, and who is affected when the writing stalls?
>>
>> The WatcherThread eventually flushes those buffered messages. If the writing stalls, it blocks periodic tasks.
>> If it blocks long enough, other periodic tasks are skipped.
>>
>> > * Do you then block or throw output away?
>> >
>> >   * If the former, how do you mitigate the ripple effect?
>> >   * If the latter, how does the reader of the log file know that something is missing?
>>
>> The capacity of the buffer is limited to `AsyncLogBufferSize` entries (2k by default).
>> Actually, logTagSet.cpp limits the maximal length of a vwrite to 512. That means that the maximal memory used by this buffer is 1M (=2k * 0.5k).
>>
>> If the buffer overflows, it starts dropping the heads. This behavior simulates a ring buffer.
>> If you enable `-XX:+Verbose`, the dropping message will be printed to the tty console.
>>
>> I prefer dropping messages to letting the buffer keep growing, because that may later trigger an out-of-memory error.
>>
>> > * How often do you flush? How do you prevent missing output in the log file in case of crashes?
>>
>> The interval is defined by `LogAsyncInterval` (300ms by default). I insert a statement `async->flusher()` in `ostream_abort()`.
>>
>
> If the flusher blocks, this could block VM shutdown? Would this be different from what we do now, e.g. since all log output is serialized and done by one thread? It's probably fine, but we should think about this.
>
>> > * Can this really handle the full brunt of logging (-Xlog:*=trace) over many threads?
>> > To be honest, it can't. I see a lot of dropping messages on console with -XX:+Verbose.
>>
>> I have tuned the parameters so that it won't drop messages easily for normal GC activity with info verbosity.
>> `-Xlog:*=trace` will indeed drop messages, but this is tunable. I have a [stress test](https://github.com/navyxliu/JavaGCworkload/blob/master/runJavaUL.sh) to tweak the parameters.
>>
>> > * Does this work with multiple targets and multiple IO files?
>>
>> Yes, it works if you have multiple outputs. `LogAsyncFlusher` is a singleton. One single buffer and one thread serve them all.
>
> The question was how we handle multiple trace sinks, see my "CSR" remarks.
>
>>
>> > * Does it cost anything if logging is off or not async?
>>
>> So far, LogAsyncFlusher as a periodic task remains active even if no output is in async_mode.
>> It wakes up every `LogAsyncInterval` ms. It's a dummy task because the deque is always empty. The cost is almost nothing.
>>
>> > Update: Okay, I see you use PeriodicTask and the WatcherThread. Is this really enough? I would be concerned that it either runs too rarely to be able to swallow all output or that it runs so often that it monopolizes the WatcherThread.
>> > I actually expected a separate Thread - or multiple, one per output - for this, waking up when there is something to write. That would also be more efficient than constant periodic polling.
>>
>> Your concern is reasonable. I don't understand why there is only one WatcherThread and up to 10 periodic tasks are crowded into it.
>> If it's a bottleneck, I plan to improve this infrastructure. I can make HotSpot support multiple watcher threads and spread periodic tasks among them. All watcher threads would be managed in a linked list.
>>
>> Can we treat it as a separate task? For normal usage, I think the delay is quite manageable. Writing thousands of lines to a file can usually be done in under a millisecond.
>>
>> > * What is the performance impact when we have lots of concurrent writes from many threads? I see that you use a Mutex to synchronize the logging threads with the flush service.
Before, these threads would have done concurrent IO and that would be handled by the libc, potentially without locking.
>>
>> IMHO, logging shouldn't hurt performance a lot. At least, the logs that do impact performance are not supposed to be enabled by default. On the other hand, I want log messages from different threads not to interleave when I read them.
>> That leads me to use a mutex. That actually improves readability.
>>
>> My design target is non-blocking. pop_all() is an ad-hoc operation which pops all elements and releases the mutex immediately. writeback() does IO without it.
>
> Since you use a mutex it introduces synchronization, however short, across all logging threads. So it influences runtime behavior. For the record, I think this is okay; maybe a future RFE could improve this with a lockless algorithm. I just wanted to know if you measured anything, and I was curious whether there is a difference now between synchronous and asynchronous logging.
>
> (Funnily, asynchronous logging is really more synchronous in a sense, since it synchronizes all logging threads across a common resource).
>
>>
>> In our real applications, we haven't seen this feature degrade GC performance yet.
>>
>> > I think this feature could be useful. I am a bit concerned with the increased complexity this brings. UL is already a very (I think unnecessarily) complex codebase. Maybe we should try to reduce its complexity first before adding new features to it. This is just my opinion, let's see what others think.
>> > Cheers, Thomas
>>
>> I believe UL has its own reasons. In my defense, I don't make UL more complex. I only changed a couple of lines in one of its implementation files (logFileOutput.cpp) and didn't change its interfaces.
>>
>> I try my best to reuse the existing codebase.
We can always refactor existing code ([JDK-8239066](https://bugs.openjdk.java.net/browse/JDK-8239066), [JDK-8263840](https://bugs.openjdk.java.net/browse/JDK-8263840)), but it's not this PR's purpose.
>>
>
> I understand. It's fine to do this in a later RFE.
>
>> thanks,
>> --lx
>
> Cheers, Thomas

Note that I am really in favor of bringing async logging to UL; this issue popped up again and again, brought in various forms by various people. It will be good to finally tackle this.

But I agree that talking about the design first would be helpful. Maybe have a little mailing list thread to stop polluting this PR?

-------------

PR: https://git.openjdk.java.net/jdk/pull/3135

From ihse at openjdk.java.net  Mon Mar 29 09:59:30 2021
From: ihse at openjdk.java.net (Magnus Ihse Bursie)
Date: Mon, 29 Mar 2021 09:59:30 GMT
Subject: RFR: 8263028: Windows build fails due to several treat-warning-as-errors
In-Reply-To:
References: <6cv_HeWJ9HsBrB7NSFU-TGl4PP0Tp820KzwJ-FRn_so=.e4d8ab11-4203-4278-a829-43b6f1626465@github.com> <_4snjvydeDKDu6aZgC1Fqr_kc6sHII7F7Ywinh3H9Rw=.61007752-81c3-44a0-911f-4dd184259966@github.com>
Message-ID:

On Mon, 29 Mar 2021 09:44:51 GMT, Magnus Ihse Bursie wrote:

>>> Based on Ioi's suggestion I decided to try with a different locale as well. I tried setting my system locale to something else and with that I was able to reproduce the warnings you report, so it _could_ be an issue with locale settings. AFAIK only `en-us` is supported. Maybe you could confirm/check your locale settings as well? (can run `systeminfo` to get the current setting)
>>>
>>> I've had problems in the past as well because I had the wrong locale set, and some of the tests were failing because of that. So, maybe rather than disabling the warnings, it might be more prudent to change the system locale of the used build systems to prevent similar issues in the future (FWIW, the display language doesn't seem to affect `cl` so that could still be whatever is convenient).
>> >> Hi Jorn, >> >> Sorry for the delayed response. I set the locale of my Cygwin environment to en-us via `export LC_ALL="en_US.UTF-8"`, these warnings are generated when compiling as well as before. Should I change this setting globally instead of just changing it in Cygwin? >> >> Anyway, it seems that this problem is caused by the locale setting because as you mentioned, this problem appears when you change the locale setting to Chinese. Setting the locale to English does not have this problem. I checked the building document, but there is no mention of the need to set the locale option to en-us before building JDK. If this is really a necessary step for building, I think we should add this step in the building document, otherwise, I think we should fix this problem in HotSpot. >> >> Best Regards >> Yang > > @kelthuzadx Hi Yang, > > Setting locale to US English used to be documented as a build requirement. When the "new" build-infra system was introduced several years ago, we thought that all locale-dependent issues were solved, and removed that requirement. Later on, issues crept in on non-Windows platforms, but these were handled by setting LC_ALL=C in the build system itself while building. > > The problem with requiring US English as locale on Windows is that you cannot set that for a single process, but must change the entire system locale for the user (which also often requires a reboot). Otoh, if we do *not* require US English, the test matrix grows almost without bounds, and we might run into a lot of weird problems (like this one!). 
>
> So I'm not really comfortable with just patching around this issue, since:
> a) it does not occur in what is at least the "recommended" locale, and
> b) more issues are likely to creep up in the future (in fact, there might already be testing issues as Jorn says)
>
> On the other hand, I am not really comfortable either with just stating in the build document that US English is the only supported Windows locale, since it has such far-reaching consequences for the individual developers.
>
> In short, I'm torn between two bad solutions, but I'm definitely leaning towards the latter. If only there were some way of setting the locale just for cl.exe! :-(

I searched the net once more for setting the locale, and this time I found some creative workarounds on superuser. The suggestion is to create a *secondary* user account, and set US English as locale for that account. Then you can go back to your main account, and use the "Run as..." functionality to execute an arbitrary command as that user.

This could be done by:
`%comspec% runas /profile /user:yourotheruser "the_application_you_want_to_run_in_english"`
or using the GUI (shift+right click on the icon, select `Run as different user`).

I assume you would be able to start a Cygwin shell like this, and have all processes started in that shell belonging to this US English user.

@kelthuzadx Can you please verify if this method works? If so, I believe it is convenient enough for us to be able to require US English locale for building on Windows.
------------- PR: https://git.openjdk.java.net/jdk/pull/3107 From david.holmes at oracle.com Mon Mar 29 10:12:54 2021 From: david.holmes at oracle.com (David Holmes) Date: Mon, 29 Mar 2021 20:12:54 +1000 Subject: RFR: 8264285: Do not support FLAG_SET_XXX for VM flags of string type In-Reply-To: References: Message-ID: On 29/03/2021 3:53 pm, Ioi Lam wrote: > On Fri, 26 Mar 2021 16:21:03 GMT, Ioi Lam wrote: > >> We have two versions of `JVMFlagAccess::ccstrAtPut()` that are slightly different. >> >> The following version is supposed to be used only by the `FLAG_SET_{CMDLINE,ERGO,MGMT}` macros. However, it's not used anywhere in the HotSpot source code: >> >> JVMFlag::Error JVMFlagAccess::ccstrAtPut(JVMFlagsEnum flag, ccstr value, JVMFlagOrigin origin) { >> JVMFlag* faddr = JVMFlag::flag_from_enum(flag); >> assert(faddr->is_ccstr(), "wrong flag type"); >> ccstr old_value = faddr->get_ccstr(); >> trace_flag_changed(faddr, old_value, value, origin); >> char* new_value = os::strdup_check_oom(value); >> faddr->set_ccstr(new_value); >> if (!faddr->is_default() && old_value != NULL) { >> // Prior value is heap allocated so free it. >> FREE_C_HEAP_ARRAY(char, old_value); >> } >> faddr->set_origin(origin); >> return JVMFlag::SUCCESS; >> } >> >> It's not clear whether this unused version is actually correct since the last JVMFlag rewrite in [JDK-8081833](https://bugs.openjdk.java.net/browse/JDK-8081833), due to complete lack of testing. Let's remove this version to simplify code maintenance. >> >> If you need to modify flags of the string type, do not use `FLAG_SET_{CMDLINE,ERGO,MGMT}`. (A `static_assert` is added to prevent this). Instead, use the remaining version of `JVMFlagAccess::ccstrAtPut()`. > >> >> ... and using ccstrAtPut doesn't update the origin of the flag as might >> be desired when using the macros. 
> > This is the version I removed: > > static JVMFlag::Error ccstrAtPut(JVMFlagsEnum flag, ccstr value, JVMFlagOrigin origin); > > This is the remaining version: > > static JVMFlag::Error ccstrAtPut(JVMFlag* flag, ccstr* value, JVMFlagOrigin origin); > > So they are practically the same API. The origin is changed in both. The only difference is they have unobvious subtle difference in how they handle the buffer allocation. Sorry I'm confused, if both support setting the origin why can't you use this version with the macros? Thanks, David ----- > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3219 > From mdoerr at openjdk.java.net Mon Mar 29 10:21:46 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Mon, 29 Mar 2021 10:21:46 GMT Subject: RFR: 8264173: [s390] Improve Hardware Feature Detection And Reporting [v2] In-Reply-To: References: <-OjFHEcBr4ajS6JQWPsPHXm2w8MNQ5b028UlabrDv84=.174c9149-ca67-4929-a3b5-0bc6f561df5e@github.com> Message-ID: On Fri, 26 Mar 2021 14:07:52 GMT, Lutz Schmidt wrote: >> This enhancement is intended to improve the hardware feature detection and reporting, in particular for more recently introduced hardware. The enhancement is a prerequisite for possible future feature exploitation. >> >> Reviews are highly welcome and appreciated. > > Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: > > update copyright headers I didn't expect the change to become that large. But it looks good to me. The lengthy output only gets generated with -XX:+Verbose. That's fine. 
src/hotspot/cpu/s390/vm_version_s390.cpp line 87: > 85: "system-z, g8-z14, ldisp_fast, extimm, pcrel_load/store, cmpb, cond_load/store, interlocked_update, txm, vectorinstr, instrext2, venh1)", > 86: "system-z, g9-z15, ldisp_fast, extimm, pcrel_load/store, cmpb, cond_load/store, interlocked_update, txm, vectorinstr, instrext2, venh1, instrext3, VEnh2 )" > 87: }; Would be nice to generate the feature string from a table it instead of having so many copies. But I'm ok with it for now. src/hotspot/cpu/s390/vm_version_s390.cpp line 94: > 92: > 93: if (Verbose || PrintAssembly || PrintStubCode) { > 94: print_features_internal("CPU Version as detected internally: ", PrintAssembly || PrintStubCode); 2 spaces while other usages have no spaces? ------------- Marked as reviewed by mdoerr (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3196 From lucy at openjdk.java.net Mon Mar 29 10:36:48 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Mon, 29 Mar 2021 10:36:48 GMT Subject: RFR: 8264173: [s390] Improve Hardware Feature Detection And Reporting [v2] In-Reply-To: References: <-OjFHEcBr4ajS6JQWPsPHXm2w8MNQ5b028UlabrDv84=.174c9149-ca67-4929-a3b5-0bc6f561df5e@github.com> Message-ID: <5OhEhVbnzUEotih1ykgz3Omnt3jQVEYG4B2uMFbCROY=.cbe91475-a6a5-41c3-a11c-4e23d9df9937@github.com> On Mon, 29 Mar 2021 10:19:02 GMT, Martin Doerr wrote: >> Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: >> >> update copyright headers > > I didn't expect the change to become that large. But it looks good to me. The lengthy output only gets generated with -XX:+Verbose. That's fine. Thank you for the review, Martin! > src/hotspot/cpu/s390/vm_version_s390.cpp line 94: > >> 92: >> 93: if (Verbose || PrintAssembly || PrintStubCode) { >> 94: print_features_internal("CPU Version as detected internally: ", PrintAssembly || PrintStubCode); > > 2 spaces while other usages have no spaces? You are right. These spaces were excessive. 
Now they are gone. ------------- PR: https://git.openjdk.java.net/jdk/pull/3196 From lucy at openjdk.java.net Mon Mar 29 10:36:47 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Mon, 29 Mar 2021 10:36:47 GMT Subject: RFR: 8264173: [s390] Improve Hardware Feature Detection And Reporting [v3] In-Reply-To: <-OjFHEcBr4ajS6JQWPsPHXm2w8MNQ5b028UlabrDv84=.174c9149-ca67-4929-a3b5-0bc6f561df5e@github.com> References: <-OjFHEcBr4ajS6JQWPsPHXm2w8MNQ5b028UlabrDv84=.174c9149-ca67-4929-a3b5-0bc6f561df5e@github.com> Message-ID: <-dA2uhXwH7eME201-lzohAOpXigVwsUjtGACZZWMRXc=.0a4e611f-a3c0-470f-a310-5d17715492e1@github.com> > This enhancement is intended to improve the hardware feature detection and reporting, in particular for more recently introduced hardware. The enhancement is a prerequisite for possible future feature exploitation. > > Reviews are highly welcome and appreciated. Lutz Schmidt has updated the pull request incrementally with one additional commit since the last revision: cleaned up some spaces ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3196/files - new: https://git.openjdk.java.net/jdk/pull/3196/files/d15d1157..f894238e Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3196&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3196&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3196.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3196/head:pull/3196 PR: https://git.openjdk.java.net/jdk/pull/3196 From akozlov at openjdk.java.net Mon Mar 29 11:45:34 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 29 Mar 2021 11:45:34 GMT Subject: RFR: 8262894: [macos_aarch64] SIGBUS in Assembler::ld_st2 Message-ID: <8z-yqACOKf8qU8N_NQbctwtwxojByYis8FJRfdBXxWE=.f9cb3eec-8219-4a8b-9791-ba6596667ca7@github.com> Please review a fix for compiler/debug/VerifyAdapterSharing.java failure on macos/aarch64 platform. 
The root cause is in missing W^X switch in JNI DestroyJavaVM. I reviewed the rest of the JNI Invoke Interface functions. DetachCurrentThread needs a similar change, although nothing fails immediately. So DetachCurrentThread is changed as a precaution. ------------- Commit messages: - Switch W^X in DestroyJavaVM, DetachCurrentThread Changes: https://git.openjdk.java.net/jdk/pull/3241/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3241&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8262894 Stats: 8 lines in 1 file changed: 8 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3241.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3241/head:pull/3241 PR: https://git.openjdk.java.net/jdk/pull/3241 From ysuenaga at openjdk.java.net Mon Mar 29 12:28:25 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Mon, 29 Mar 2021 12:28:25 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging [v2] In-Reply-To: <37Cm5ItlFJ8nzQW_HI6-oyO_TuTK3f09BqiZ2-0l-iE=.fdde37ed-aa6b-4c28-bc30-0403542a518b@github.com> References: <5lZuUwWVD5NwXo_gUOnUDUD4tdYUvils5Cx5X5r8elo=.5d1f1074-730a-4f88-ba67-67977ffe58d0@github.com> <37Cm5ItlFJ8nzQW_HI6-oyO_TuTK3f09BqiZ2-0l-iE=.fdde37ed-aa6b-4c28-bc30-0403542a518b@github.com> Message-ID: <0qKPSmxPH02xshv9gpcarh-LIc6xiZnlYVCDrRPtCP0=.94eaa31a-501b-4d48-ade0-ae1abf6acddf@github.com> On Mon, 29 Mar 2021 09:56:10 GMT, Thomas Stuefe wrote: > But I agree that talking about the design first would be helpful. Maybe have a little mailing list thread to stop polluting this PR? I posted a similar discussion to hotspot-runtime-dev last November. It aims to implement sending UL via a network socket. I believe this PR helps with that.
https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-November/043427.html ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From coleenp at openjdk.java.net Mon Mar 29 13:26:35 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 29 Mar 2021 13:26:35 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 05:00:25 GMT, David Holmes wrote: >> This change creates a Metaspace::allocate function that doesn't pass TRAPS to be used by MethodCounters. TRAPS and exceptions shouldn't be thrown from non-JavaThreads. >> >> Tested with tier1-7. > > Looks good! > > Thanks, > David Thanks David. @tstuefe Can you review this? ------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From dholmes at openjdk.java.net Mon Mar 29 13:47:35 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Mon, 29 Mar 2021 13:47:35 GMT Subject: RFR: 8262894: [macos_aarch64] SIGBUS in Assembler::ld_st2 In-Reply-To: <8z-yqACOKf8qU8N_NQbctwtwxojByYis8FJRfdBXxWE=.f9cb3eec-8219-4a8b-9791-ba6596667ca7@github.com> References: <8z-yqACOKf8qU8N_NQbctwtwxojByYis8FJRfdBXxWE=.f9cb3eec-8219-4a8b-9791-ba6596667ca7@github.com> Message-ID: On Mon, 29 Mar 2021 11:39:31 GMT, Anton Kozlov wrote: > Please review a fix for compiler/debug/VerifyAdapterSharing.java failure on macos/aarch64 platform. The root cause is in missing W^X switch in JNI DestroyJavaVM. > > I reviewed the rest of the JNI Invoke Interface functions. DetachCurrentThread needs a similar change, although nothing fails immediately. So DetachCurrentThread is changed as a precaution. Hi Anton, In so much as I understand the fact transitions are missing the introduction of those transitions seems fine - but can be simplified due to an existing code quirk (see comments below). My main concern with this W^X stuff is that I don't see a clear way to know exactly where a transition needs to be placed. 
The missing cases here suggest it should be handled in the thread-state transition code, but you've previously written: " when we execute JVM code (owned by libjvm.so, starting from JVM entry function), we switch to Write state. When we leave JVM to execute generated or JNI code, we switch to Executable state. I would like to highlight that JVM code does not mean the VM state of the java thread" so I'm unclear exactly how we identify the points where these transitions must occur? What kind of "VM code" must be guarded this way? I don't see this documented in the code anywhere. Thanks, David src/hotspot/share/prims/jni.cpp line 3728: > 3726: > 3727: // We are going to VM, change W^X state to the expected one. > 3728: MACOS_AARCH64_ONLY(WXMode oldmode = thread->enable_wx(WXWrite)); No need to save old state - see below. src/hotspot/share/prims/jni.cpp line 3738: > 3736: } else { > 3737: ThreadStateTransition::transition(thread, _thread_in_vm, _thread_in_native); > 3738: MACOS_AARCH64_ONLY(thread->enable_wx(oldmode)); This is actually unnecessary as Threads::destroy_vm never returns anything but true (we should change it to a void method and clean this up). If it were to fail you would have to know at what point it failed to determine whether you can actually touch the current thread any more. ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3241 From yyang at openjdk.java.net Mon Mar 29 14:44:28 2021 From: yyang at openjdk.java.net (Yi Yang) Date: Mon, 29 Mar 2021 14:44:28 GMT Subject: RFR: 8263028: Windows build fails due to several treat-warning-as-errors In-Reply-To: References: <6cv_HeWJ9HsBrB7NSFU-TGl4PP0Tp820KzwJ-FRn_so=.e4d8ab11-4203-4278-a829-43b6f1626465@github.com> <_4snjvydeDKDu6aZgC1Fqr_kc6sHII7F7Ywinh3H9Rw=.61007752-81c3-44a0-911f-4dd184259966@github.com> Message-ID: On Mon, 29 Mar 2021 09:56:09 GMT, Magnus Ihse Bursie wrote: >> @kelthuzadx Hi Yang, >> >> Setting locale to US English used to be documented as a build requirement. When the "new" build-infra system was introduced several years ago, we thought that all locale-dependent issues were solved, and removed that requirement. Later on, issues crept in on non-Windows platforms, but these were handled by setting LC_ALL=C in the build system itself while building. >> >> The problem with requiring US English as locale on Windows is that you cannot set that for a single process, but must change the entire system locale for the user (which also often requires a reboot). Otoh, if we do *not* require US English, the test matrix grows almost without bounds, and we might run into a lot of weird problems (like this one!). >> >> So I'm not really comfortable with just patching around this issue, since: >> a) it does not occur in what is at least the "recommended" locale, and >> b) more issues are likely to creep up in the future (in fact, there might already be testing issues as Jorn says) >> >> On the other hand, I am not really comfortable either with just stating in the build document that US English is the only supported Windows locale, since it has such far-reaching consequences for the individual developers. >> >> In short, I'm torn between two bad solutions, but I'm definitely leaning towards the latter. If only there were some way of setting the locale just for cl.exe! 
:-( > > I searched the net once more for setting the locale, and this time I found some creative workarounds on superuser. The suggestion is to create a *secondary* user account, and set US English as locale for that account. Then you can go back to your main account, and use the "Run as..." functionality to execute an arbitrary command as that user. > > This could be done by: `%comspec% runas /profile /user:yourotheruser "the_application_you_want_to_run_in_english"` or using the GUI (shift+right click on the icon, select `Run as different user`). > > I assume you would be able to start a cygwin shell like this, and have all processes started in that shell belonging to this US English user. > > @kelthuzadx Can you please verify if this method works? If so, I believe it is convenient enough for us to be able to require US English locale for building on Windows. Hi Magnus, > I searched the net once more for setting the locale, and this time I found some creative workarounds on superuser. The suggestion is to create a secondary user account, and set US English as locale for that account. Then you can go back to your main account, and use the "Run as..." functionality to execute an arbitrary command as that user. > This could be done by: %comspec% runas /profile /user:yourotheruser "the_application_you_want_to_run_in_english" or using the GUI (shift+right click on the icon, select Run as different user). Thanks for your investigations and kind suggestions. It is more troublesome to add a new user to the Chinese system and set its system locale to English. Instead of doing this, I prefer to directly change the system locale to English. When I set the system locale to English (`Control Panel->Change date, time,...->Administrative->Change System locale->English`), it indeed works for building! No warnings were generated. Everything works fine.
> a) it does not occur in what is at least the "recommended" locale, and > b) more issues are likely to creep up in the future (in fact, there might already be testing issues as Jorn says) > On the other hand, I am not really comfortable either with just stating in the build document that US English is the only supported Windows locale, since it has such far-reaching consequences for the individual developers. You convinced me, I agree with you that stating these has far-reaching consequences and your internal test matrix will become incredibly heavy. However, I think we can add a section in the FAQ or other places in the building document to give a solution for such problems as much as possible, e.g. Q: Why can't I build the JDK on a non-English system? What should I do next? A: Maybe you can change your system locale to English and try again. Just IMHO, :-) Best Regards, Yang ------------- PR: https://git.openjdk.java.net/jdk/pull/3107 From aph at redhat.com Mon Mar 29 15:03:01 2021 From: aph at redhat.com (Andrew Haley) Date: Mon, 29 Mar 2021 16:03:01 +0100 Subject: RFC: JEP drafts PAC for Linux/AArch64 (JDK-8264130) and Arm64 for MacOS/AArch64 (JDK-8264131) In-Reply-To: <55F74A14-5AB8-44F7-8903-BE03AE087484@arm.com> References: <55F74A14-5AB8-44F7-8903-BE03AE087484@arm.com> Message-ID: <077be8df-e810-ba35-c465-54b008f69ad7@redhat.com> On 3/29/21 10:01 AM, Alan Hayward wrote: > I've been investigating PAC for the AArch64 ports - figuring out > what should be supported and trying it out in code. PAC is an > AArch64 extension that provides instructions for signing and > authenticating values and addresses; it can be used to bring > protection against various types of attacks, for a small performance > cost. If OpenJDK is running on a system with PAC protection enabled > in the kernel, then it should use the feature. I question this "should", and would like to see a "because."
I understand why this stuff is attractive, but as I understand it PAC is mostly a band-aid for unsafe programming languages with nasty features like stack-allocated buffer overflows. In a sense, what PAC is trying to do is bring C and C++ closer to languages such as Java by strengthening pointer checks, which is no bad thing. I am aware, of course, that mistakes can be made in HotSpot itself, which is written in C++, so there may be some point to it in the JVM. I'd like to see how complex the implementation is likely to be, especially given that we deliberately, as a matter of design, redirect return address pointers in several places. (That's not a bug, it's a feature.) PAC support potentially makes the AArch64 port significantly more complex, and therefore potentially more buggy. I know that Arm wants this feature to be used where possible, but it has to be justified by the benefit of users. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From github.com+168222+mgkwill at openjdk.java.net Mon Mar 29 15:38:45 2021 From: github.com+168222+mgkwill at openjdk.java.net (Marcus G K Williams) Date: Mon, 29 Mar 2021 15:38:45 GMT Subject: RFR: 8256155: os::Linux Populate all large_page_sizes, select smallest page size in reserve_memory_special_huge_tlbfs* [v26] In-Reply-To: References: Message-ID: <5HfK00K9K9Z9E9vnFgvW1Z2bpZLVmXYy72vM4ZMN1Uk=.985bdf35-b2e3-4eed-8933-9ce3af5edd26@github.com> > When using LargePageSizeInBytes=1G, os::Linux::reserve_memory_special_huge_tlbfs* cannot select large pages smaller than 1G. Code heap usually uses less than 1G, so currently the code precludes code heap from using > Large pages in this circumstance and when os::Linux::reserve_memory_special_huge_tlbfs* is called page sizes fall back to Linux::page_size() (usually 4k). 
> > This change allows the above use case by populating all large_page_sizes present in /sys/kernel/mm/hugepages in _page_sizes upon calling os::Linux::setup_large_page_size(). > > In os::Linux::reserve_memory_special_huge_tlbfs* we then select the largest large page size available in _page_sizes that is smaller than bytes being reserved. Marcus G K Williams has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 43 commits: - Rebase on pull/3073 Signed-off-by: Marcus G K Williams - Merge branch 'pull/3073' into update_hlp - Thomas review. Changed commit_memory_special to return bool to signal if the request succeeded or not. - Self review. Update helper name to better match commit_memory_special(). - Marcus review. Updated comments. - Ivan review Renamed helper to commit_memory_special and updated the comments. - 8262291: Refactor reserve_memory_special_huge_tlbfs - Merge branch 'master' into update_hlp - Addressed kstefanj review suggestions Signed-off-by: Marcus G K Williams - Use SIZE_FORMAT in logging Signed-off-by: Marcus G K Williams - ... 
and 33 more: https://git.openjdk.java.net/jdk/compare/41276eb8...a9b3dee4 ------------- Changes: https://git.openjdk.java.net/jdk/pull/1153/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=1153&range=25 Stats: 282 lines in 4 files changed: 80 ins; 101 del; 101 mod Patch: https://git.openjdk.java.net/jdk/pull/1153.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/1153/head:pull/1153 PR: https://git.openjdk.java.net/jdk/pull/1153 From stuefe at openjdk.java.net Mon Mar 29 16:49:27 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 29 Mar 2021 16:49:27 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread In-Reply-To: References: <7vek4vBeq_m6kHjh6V5999ZsZYJ-SAvTg3LCtajlBrk=.25130296-3ca8-4437-9528-2a38447569db@github.com> Message-ID: On Mon, 29 Mar 2021 04:59:49 GMT, David Holmes wrote: >> It can't work from the VMThread. Patricio and I were chatting yesterday and he pointed out neither of these VM operations can nest (VM_ChangeBreakpoints and VM_MetaspaceGC) making up names but you get the point. > > Okay. Are you sure? This means that calls to your new non-TRAPS Metaspace::allocate() fail every time the GC threshold is touched. Since CLMS->allocate() fails not only for "hard" OOMS, when we ran against a hard limit, but also when we hit the GC threshold. satisfy_failed_xxx handles both cases; in the latter case specifically it increases the threshold and retries the allocation. I guess the next "real" allocate, one done with TRAPS, would then remove the blockage and increase the threshold and life goes on, but meanwhile a number of metaspace allocations could have unnecessarily failed. It looks like you could solve that by manually calling `ClassLoaderMetaspace::expand_and_allocate` after a failing allocate in your non-TRAPS version; but I'm not completely sure that is right either, since it would mean we miss one GC opportunity. This stuff feels way too complex. 
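To make the concern above concrete, here is a minimal toy sketch of the "soft GC threshold versus hard limit" behaviour being described. It is not HotSpot code: `ToyMetaspace`, `allocate_no_traps` and all member names are hypothetical, and the real `ClassLoaderMetaspace` logic is considerably more involved. It only illustrates why a plain allocate can fail while an expand-and-retry fallback would still succeed.

```cpp
#include <cassert>
#include <cstddef>

// Toy model only: none of these names are HotSpot APIs.
class ToyMetaspace {
  std::size_t _used = 0;
  std::size_t _gc_threshold;      // soft limit; crossing it would normally suggest a GC
  const std::size_t _hard_limit;  // hard limit; loosely comparable to a maximum size setting

public:
  ToyMetaspace(std::size_t gc_threshold, std::size_t hard_limit)
    : _gc_threshold(gc_threshold), _hard_limit(hard_limit) {
    assert(gc_threshold <= hard_limit);
  }

  // Plain attempt: fails as soon as the request would cross the soft threshold,
  // even when the hard limit still leaves room.
  bool allocate(std::size_t bytes) {
    if (bytes > _gc_threshold - _used) {
      return false;
    }
    _used += bytes;
    return true;
  }

  // Fallback: raise the soft threshold (never past the hard limit) and retry.
  // Note that no GC is modelled here, which is exactly the missed opportunity
  // mentioned in the discussion.
  bool expand_and_allocate(std::size_t bytes) {
    if (bytes > _hard_limit - _used) {
      return false;  // genuinely out of space
    }
    if (_used + bytes > _gc_threshold) {
      _gc_threshold = _used + bytes;
    }
    return allocate(bytes);
  }

  std::size_t used() const { return _used; }
  std::size_t gc_threshold() const { return _gc_threshold; }
};

// Hypothetical non-throwing allocation path: try once, then expand and retry.
bool allocate_no_traps(ToyMetaspace& ms, std::size_t bytes) {
  return ms.allocate(bytes) || ms.expand_and_allocate(bytes);
}
```

In this toy, a caller that uses only `allocate` sees failures whenever the soft threshold is reached, while `allocate_no_traps` keeps succeeding until the hard limit, at the cost of never scheduling a collection.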
------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From stuefe at openjdk.java.net Mon Mar 29 16:59:32 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Mon, 29 Mar 2021 16:59:32 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 13:24:11 GMT, Coleen Phillimore wrote: >> Looks good! >> >> Thanks, >> David > > Thanks David. @tstuefe Can you review this? Hi Coleen, GH swallowed my in code comment, so I repeat it here... I'm not sure this is correct. Your new non-TRAPS Metaspace::allocate() would fail every time the GC threshold is touched. Where the old TRAPS version would break through the threshold and allocate successfully. CLMS->allocate() fails not only for "hard" OOMS, when we ran against MaxMetaspaceSize or CompressedClassSpaceSize, but also when we hit the GC threshold. `satisfy_failed_xxx` handles the latter case by increasing the threshold and retrying the allocation, in addition to scheduling a new GC maybe. I guess the next "real" allocate, done with TRAPS, would then remove the blockage and increase the threshold and life goes on, but a number of metaspace allocations could have unnecessarily failed. This means that periodically we end up without profile counters on methods, and whatever the effect is of CLDG->had_metaspace_oom. It looks like you could solve that by manually calling `ClassLoaderMetaspace::expand_and_allocate` after a failing `ClassLoaderMetaspace::allocate` in your non-TRAPS version. That may mean though that we miss out on GC possibilities, basically ignoring some instances of a triggered GC threshold without doing a GC. Would be nice to simplify this. See also https://github.com/openjdk/jdk/pull/2289. This stuff is very complex. 
Cheers, Thomas ------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From hseigel at openjdk.java.net Mon Mar 29 17:45:56 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Mon, 29 Mar 2021 17:45:56 GMT Subject: RFR: 8264193: Remove TRAPS parameters for modules and defaultmethods Message-ID: <6WUbVyj36re-dOR20BOerxSJckUtyoHjwo6z9r1g-as=.fddf28f1-8cdd-4954-b734-65b36f5b8de4@github.com> Please review this change for JDK-8264193 to remove unneeded TRAPS parameters from modules and default methods files. Besides removing TRAPS, Modules::get_named_module() was changed to return an oop instead of a jobject, removing its need for a TRAPS parameter. This change was tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. Thanks, Harold ------------- Commit messages: - 8264193: Remove TRAPS parameters for modules and defaultmethods Changes: https://git.openjdk.java.net/jdk/pull/3247/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3247&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264193 Stats: 61 lines in 8 files changed: 1 ins; 13 del; 47 mod Patch: https://git.openjdk.java.net/jdk/pull/3247.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3247/head:pull/3247 PR: https://git.openjdk.java.net/jdk/pull/3247 From akozlov at openjdk.java.net Mon Mar 29 19:18:34 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Mon, 29 Mar 2021 19:18:34 GMT Subject: RFR: 8262894: [macos_aarch64] SIGBUS in Assembler::ld_st2 In-Reply-To: References: <8z-yqACOKf8qU8N_NQbctwtwxojByYis8FJRfdBXxWE=.f9cb3eec-8219-4a8b-9791-ba6596667ca7@github.com> Message-ID: <03NlfuqqZVMecpcrcJ6-GqRCohFtUzRpZl2ZvlZPk7o=.226e91f5-1b82-4215-a500-0d7925fad3a9@github.com> On Mon, 29 Mar 2021 13:44:18 GMT, David Holmes wrote: >> Please review a fix for compiler/debug/VerifyAdapterSharing.java failure on macos/aarch64 platform. The root cause is in missing W^X switch in JNI DestroyJavaVM. 
>> >> I reviewed the rest of the JNI Invoke Interface functions. DetachCurrentThread needs a similar change, although nothing fails immediately. So DetachCurrentThread is changed as a precaution. > > Hi Anton, > > In so much as I understand the fact transitions are missing the introduction of those transitions seems fine - but can be simplified due to an existing code quirk (see comments below). > > My main concern with this W^X stuff is that I don't see a clear way to know exactly where a transition needs to be placed. The missing cases here suggest it should be handled in the thread-state transition code, but you've previously written: > > " when we execute JVM code (owned by libjvm.so, starting from JVM entry function), we switch to Write state. When we leave JVM to execute generated or JNI code, we switch to Executable state. I would like to highlight that JVM code does not mean the VM state of the java thread" > > so I'm unclear exactly how we identify the points where these transitions must occur? What kind of "VM code" must be guarded this way? I don't see this documented in the code anywhere. > > Thanks, > David Hi David, Thank you for the review. > so I'm unclear exactly how we identify the points where these transitions must occur? What kind of "VM code" must be guarded this way? I don't see this documented in the code anywhere. For usual JNI function implementation we switch to WXWrite in JNI_ENTRY (JNI_ENTRY_NO_PRESERVE to be precise), so xxx_ENTRY style macro defines a border between JVM and the rest of the code. JNI Invocation functions are also called directly from native code, so they should have W^X transition. But since they are implementing something rather special, they don't use any special ENTRY macro, which makes them less evident to be JVM entry points. 
Here and in other cases, when we do W^X in an apparently random function, it is because the function is called directly from the interpreter or native or generated code, but the function is not defined with ENTRY macro. Hope it clarifies the logic a bit. I've checked `Threads::destroy_vm`, you're right! But I'd consider missing handling after possible `false` as a bug. If you don't mind, I prefer the extra code in the existing else clause. But I agree the whole clause can be cleaned up. Thanks for pointing! ------------- PR: https://git.openjdk.java.net/jdk/pull/3241 From lfoltan at openjdk.java.net Mon Mar 29 19:20:36 2021 From: lfoltan at openjdk.java.net (Lois Foltan) Date: Mon, 29 Mar 2021 19:20:36 GMT Subject: RFR: 8264193: Remove TRAPS parameters for modules and defaultmethods In-Reply-To: <6WUbVyj36re-dOR20BOerxSJckUtyoHjwo6z9r1g-as=.fddf28f1-8cdd-4954-b734-65b36f5b8de4@github.com> References: <6WUbVyj36re-dOR20BOerxSJckUtyoHjwo6z9r1g-as=.fddf28f1-8cdd-4954-b734-65b36f5b8de4@github.com> Message-ID: On Mon, 29 Mar 2021 17:40:09 GMT, Harold Seigel wrote: > Please review this change for JDK-8264193 to remove unneeded TRAPS parameters from modules and default methods files. Besides removing TRAPS, Modules::get_named_module() was changed to return an oop instead of a jobject, removing its need for a TRAPS parameter. > > This change was tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold Looks good Harold! Lois ------------- Marked as reviewed by lfoltan (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3247 From coleenp at openjdk.java.net Mon Mar 29 20:07:46 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 29 Mar 2021 20:07:46 GMT Subject: Withdrawn: 8264149 BreakpointInfo::set allocates metaspace object in VM thread In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 21:47:46 GMT, Coleen Phillimore wrote: > This change creates a Metaspace::allocate function that doesn't pass TRAPS to be used by MethodCounters. TRAPS and exceptions shouldn't be thrown from non-JavaThreads. > > Tested with tier1-7. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From coleenp at openjdk.java.net Mon Mar 29 20:18:25 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 29 Mar 2021 20:18:25 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 16:56:23 GMT, Thomas Stuefe wrote: >> Thanks David. @tstuefe Can you review this? > > Hi Coleen, > > GH swallowed my in code comment, so I repeat it here... > > I'm not sure this is correct. Your new non-TRAPS Metaspace::allocate() would fail every time the GC threshold is touched. Where the old TRAPS version would break through the threshold and allocate successfully. > > CLMS->allocate() fails not only for "hard" OOMS, when we ran against MaxMetaspaceSize or CompressedClassSpaceSize, but also when we hit the GC threshold. `satisfy_failed_xxx` handles the latter case by increasing the threshold and retrying the allocation, in addition to scheduling a new GC maybe. > > I guess the next "real" allocate, done with TRAPS, would then remove the blockage and increase the threshold and life goes on, but a number of metaspace allocations could have unnecessarily failed. This means that periodically we end up without profile counters on methods, and whatever the effect is of CLDG->had_metaspace_oom. 
> > It looks like you could solve that by manually calling `ClassLoaderMetaspace::expand_and_allocate` after a failing `ClassLoaderMetaspace::allocate` in your non-TRAPS version. That may mean though that we miss out on GC possibilities, basically ignoring some instances of a triggered GC threshold without doing a GC. > > Would be nice to simplify this. See also https://github.com/openjdk/jdk/pull/2289. This stuff is very complex. > > Cheers, Thomas I deleted this branch by mistake, now restored. > I'm not sure this is correct. Your new non-TRAPS Metaspace::allocate() would fail every time the GC threshold is touched. Where the old TRAPS version would break through the threshold and allocate successfully. I realize this. It's just an attempt to allocate and it's designed to be used during a safepoint for only this allocation. I could change this to only call the non-TRAPS version of MethodCounters if we're at a safepoint? Would that help? Then the only time we'll miss out on metaspace counters periodically is if they were created to set breakpoints in a safepoint. I'd hate for this special case to know more about metaspace, ala calling ClassLoaderMetaspace::expand_and_allocate. ------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From ccheung at openjdk.java.net Mon Mar 29 20:34:41 2021 From: ccheung at openjdk.java.net (Calvin Cheung) Date: Mon, 29 Mar 2021 20:34:41 GMT Subject: RFR: 8264193: Remove TRAPS parameters for modules and defaultmethods In-Reply-To: <6WUbVyj36re-dOR20BOerxSJckUtyoHjwo6z9r1g-as=.fddf28f1-8cdd-4954-b734-65b36f5b8de4@github.com> References: <6WUbVyj36re-dOR20BOerxSJckUtyoHjwo6z9r1g-as=.fddf28f1-8cdd-4954-b734-65b36f5b8de4@github.com> Message-ID: On Mon, 29 Mar 2021 17:40:09 GMT, Harold Seigel wrote: > Please review this change for JDK-8264193 to remove unneeded TRAPS parameters from modules and default methods files. 
Besides removing TRAPS, Modules::get_named_module() was changed to return an oop instead of a jobject, removing its need for a TRAPS parameter. > > This change was tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold LGTM. ------------- Marked as reviewed by ccheung (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3247 From coleenp at openjdk.java.net Mon Mar 29 21:22:06 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 29 Mar 2021 21:22:06 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v2] In-Reply-To: References: Message-ID: > This change creates a Metaspace::allocate function that doesn't pass TRAPS to be used by MethodCounters. TRAPS and exceptions shouldn't be thrown from non-JavaThreads. > > Tested with tier1-7. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Only allow non-TRAPS version of Metaspace::allocate at a safepoint (or by non-Java thread) ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3207/files - new: https://git.openjdk.java.net/jdk/pull/3207/files/3da8d1b5..bd7c32d4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3207&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3207&range=00-01 Stats: 17 lines in 4 files changed: 13 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/jdk/pull/3207.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3207/head:pull/3207 PR: https://git.openjdk.java.net/jdk/pull/3207 From iklam at openjdk.java.net Mon Mar 29 21:40:25 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 29 Mar 2021 21:40:25 GMT Subject: RFR: 8264285: Do not support FLAG_SET_XXX for VM flags of string type In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 05:50:40 GMT, Ioi Lam wrote: >> We have two versions of `JVMFlagAccess::ccstrAtPut()` that are slightly different. 
>> >> The following version is supposed to be used only by the `FLAG_SET_{CMDLINE,ERGO,MGMT}` macros. However, it's not used anywhere in the HotSpot source code: >> >> JVMFlag::Error JVMFlagAccess::ccstrAtPut(JVMFlagsEnum flag, ccstr value, JVMFlagOrigin origin) { >> JVMFlag* faddr = JVMFlag::flag_from_enum(flag); >> assert(faddr->is_ccstr(), "wrong flag type"); >> ccstr old_value = faddr->get_ccstr(); >> trace_flag_changed(faddr, old_value, value, origin); >> char* new_value = os::strdup_check_oom(value); >> faddr->set_ccstr(new_value); >> if (!faddr->is_default() && old_value != NULL) { >> // Prior value is heap allocated so free it. >> FREE_C_HEAP_ARRAY(char, old_value); >> } >> faddr->set_origin(origin); >> return JVMFlag::SUCCESS; >> } >> >> It's not clear whether this unused version is actually correct since the last JVMFlag rewrite in [JDK-8081833](https://bugs.openjdk.java.net/browse/JDK-8081833), due to complete lack of testing. Let's remove this version to simplify code maintenance. >> >> If you need to modify flags of the string type, do not use `FLAG_SET_{CMDLINE,ERGO,MGMT}`. (A `static_assert` is added to prevent this). Instead, use the remaining version of `JVMFlagAccess::ccstrAtPut()`. > >> > If you need to modify flags of the string type, do not use `FLAG_SET_{CMDLINE,ERGO,MGMT}`. (A `static_assert` is added to prevent this). Instead, use the remaining version of `JVMFlagAccess::ccstrAtPut()`. >> >> ... and using ccstrAtPut doesn't update the origin of the flag as might >> be desired when using the macros. > > This is the version I removed: > > static JVMFlag::Error ccstrAtPut(JVMFlagsEnum flag, ccstr value, JVMFlagOrigin origin); > > This is the remaining version: > > static JVMFlag::Error ccstrAtPut(JVMFlag* flag, ccstr* value, JVMFlagOrigin origin); > > So they are practically the same API. The origin is changed in both. The only difference is they have unobvious subtle difference in how they handle the buffer allocation. 
Thanks to @dholmes-ora for pointing out the problems with this PR. I have changed the REF to fix the problem differently. To avoid confusion, I am closing this PR. I opened a new PR (https://github.com/openjdk/jdk/pull/3254) for the new fix. ------------- PR: https://git.openjdk.java.net/jdk/pull/3219 From iklam at openjdk.java.net Mon Mar 29 21:40:25 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 29 Mar 2021 21:40:25 GMT Subject: Withdrawn: 8264285: Do not support FLAG_SET_XXX for VM flags of string type In-Reply-To: References: Message-ID: On Fri, 26 Mar 2021 16:21:03 GMT, Ioi Lam wrote: > We have two versions of `JVMFlagAccess::ccstrAtPut()` that are slightly different. > > The following version is supposed to be used only by the `FLAG_SET_{CMDLINE,ERGO,MGMT}` macros. However, it's not used anywhere in the HotSpot source code: > > JVMFlag::Error JVMFlagAccess::ccstrAtPut(JVMFlagsEnum flag, ccstr value, JVMFlagOrigin origin) { > JVMFlag* faddr = JVMFlag::flag_from_enum(flag); > assert(faddr->is_ccstr(), "wrong flag type"); > ccstr old_value = faddr->get_ccstr(); > trace_flag_changed(faddr, old_value, value, origin); > char* new_value = os::strdup_check_oom(value); > faddr->set_ccstr(new_value); > if (!faddr->is_default() && old_value != NULL) { > // Prior value is heap allocated so free it. > FREE_C_HEAP_ARRAY(char, old_value); > } > faddr->set_origin(origin); > return JVMFlag::SUCCESS; > } > > It's not clear whether this unused version is actually correct since the last JVMFlag rewrite in [JDK-8081833](https://bugs.openjdk.java.net/browse/JDK-8081833), due to complete lack of testing. Let's remove this version to simplify code maintenance. > > If you need to modify flags of the string type, do not use `FLAG_SET_{CMDLINE,ERGO,MGMT}`. (A `static_assert` is added to prevent this). Instead, use the remaining version of `JVMFlagAccess::ccstrAtPut()`. This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3219 From iklam at openjdk.java.net Mon Mar 29 21:47:02 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Mon, 29 Mar 2021 21:47:02 GMT Subject: RFR: 8264285: Clean the modification of ccstr JVM flags Message-ID: There are two versions of JVMFlagAccess::ccstrAtPut() for modifying JVM flags of the ccstr type (i.e., strings). - One version requires the caller to free the old value, but some callers don't do that (writeableFlags.cpp). - The other version frees the old value on behalf of the caller. However, this version is accessible only via FLAG_SET_XXX macros and is currently unused. So it's unclear whether it actually works. We should combine these two versions into a single function, fix problems in the callers, and add test cases. The old value should be freed automatically, because typically the caller isn't interested in the old value. Note that the FLAG_SET_XXX macros do not return the old value. Requiring the caller of FLAG_SET_XXX to free the old value would be tedious and error prone. 
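A minimal sketch of the combined behaviour proposed above, in which the setter copies the new string and frees the previous copy itself, so the caller never sees (or has to free) the old value. `StringFlag` and `set_string_flag` are hypothetical names invented for this illustration; they are not HotSpot's `JVMFlag` or `JVMFlagAccess::ccstrAtPut()`, and plain `malloc`/`free` is assumed in place of the VM's C-heap helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a string-typed VM flag (not HotSpot's JVMFlag).
struct StringFlag {
  const char* value = nullptr;  // current value; may point at a literal default
  bool heap_allocated = false;  // true once value is a copy owned by the flag
};

// Replace the flag's value with a private copy of new_value and release the
// previous copy if the flag owns it. The old value is not returned.
inline void set_string_flag(StringFlag& flag, const char* new_value) {
  char* copy = nullptr;
  if (new_value != nullptr) {
    std::size_t len = std::strlen(new_value) + 1;
    copy = static_cast<char*>(std::malloc(len));
    if (copy == nullptr) std::abort();  // assumption: treat OOM as fatal
    std::memcpy(copy, new_value, len);
  }
  // Copy first, free second: this stays safe if new_value aliases flag.value.
  if (flag.heap_allocated) {
    std::free(const_cast<char*>(flag.value));
  }
  flag.value = copy;
  flag.heap_allocated = (copy != nullptr);
  assert(flag.heap_allocated == (flag.value != nullptr));
}
```

Because the default value may be a string literal, the sketch tracks ownership explicitly and only frees values it allocated itself.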
------------- Commit messages: - restored SET_FLAG_XXX for ccstr type, and fixed bugs in existing ccstr modification code - 8264285: Do not support FLAG_SET_XXX for VM flags of string type Changes: https://git.openjdk.java.net/jdk/pull/3254/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3254&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264285 Stats: 205 lines in 9 files changed: 160 ins; 24 del; 21 mod Patch: https://git.openjdk.java.net/jdk/pull/3254.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3254/head:pull/3254 PR: https://git.openjdk.java.net/jdk/pull/3254 From coleenp at openjdk.java.net Mon Mar 29 23:10:42 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Mon, 29 Mar 2021 23:10:42 GMT Subject: RFR: 8264193: Remove TRAPS parameters for modules and defaultmethods In-Reply-To: <6WUbVyj36re-dOR20BOerxSJckUtyoHjwo6z9r1g-as=.fddf28f1-8cdd-4954-b734-65b36f5b8de4@github.com> References: <6WUbVyj36re-dOR20BOerxSJckUtyoHjwo6z9r1g-as=.fddf28f1-8cdd-4954-b734-65b36f5b8de4@github.com> Message-ID: On Mon, 29 Mar 2021 17:40:09 GMT, Harold Seigel wrote: > Please review this change for JDK-8264193 to remove unneeded TRAPS parameters from modules and default methods files. Besides removing TRAPS, Modules::get_named_module() was changed to return an oop instead of a jobject, removing its need for a TRAPS parameter. > > This change was tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold Looks good!! ------------- Marked as reviewed by coleenp (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3247 From david.holmes at oracle.com Tue Mar 30 00:57:44 2021 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 Mar 2021 10:57:44 +1000 Subject: RFR: 8262894: [macos_aarch64] SIGBUS in Assembler::ld_st2 In-Reply-To: <03NlfuqqZVMecpcrcJ6-GqRCohFtUzRpZl2ZvlZPk7o=.226e91f5-1b82-4215-a500-0d7925fad3a9@github.com> References: <8z-yqACOKf8qU8N_NQbctwtwxojByYis8FJRfdBXxWE=.f9cb3eec-8219-4a8b-9791-ba6596667ca7@github.com> <03NlfuqqZVMecpcrcJ6-GqRCohFtUzRpZl2ZvlZPk7o=.226e91f5-1b82-4215-a500-0d7925fad3a9@github.com> Message-ID: On 30/03/2021 5:18 am, Anton Kozlov wrote: > On Mon, 29 Mar 2021 13:44:18 GMT, David Holmes wrote: > >>> Please review a fix for compiler/debug/VerifyAdapterSharing.java failure on macos/aarch64 platform. The root cause is in missing W^X switch in JNI DestroyJavaVM. >>> >>> I reviewed the rest of the JNI Invoke Interface functions. DetachCurrentThread needs a similar change, although nothing fails immediately. So DetachCurrentThread is changed as a precaution. >> >> Hi Anton, >> >> In so much as I understand the fact transitions are missing the introduction of those transitions seems fine - but can be simplified due to an existing code quirk (see comments below). >> >> My main concern with this W^X stuff is that I don't see a clear way to know exactly where a transition needs to be placed. The missing cases here suggest it should be handled in the thread-state transition code, but you've previously written: >> >> " when we execute JVM code (owned by libjvm.so, starting from JVM entry function), we switch to Write state. When we leave JVM to execute generated or JNI code, we switch to Executable state. I would like to highlight that JVM code does not mean the VM state of the java thread" >> >> so I'm unclear exactly how we identify the points where these transitions must occur? What kind of "VM code" must be guarded this way? I don't see this documented in the code anywhere. 
>> >> Thanks, >> David > > Hi David, > > Thank you for the review. > >> so I'm unclear exactly how we identify the points where these transitions must occur? What kind of "VM code" must be guarded this way? I don't see this documented in the code anywhere. > > For usual JNI function implementation we switch to WXWrite in JNI_ENTRY (JNI_ENTRY_NO_PRESERVE to be precise), so xxx_ENTRY style macro defines a border between JVM and the rest of the code. JNI Invocation functions are also called directly from native code, so they should have W^X transition. But since they are implementing something rather special, they don't use any special ENTRY macro, which makes them less evident to be JVM entry points. Here and in other cases, when we do W^X in an apparently random function, it is because the function is called directly from the interpreter or native or generated code, but the function is not defined with ENTRY macro. Hope it clarifies the logic a bit. I get the gist of where you have placed the transitions around the various "entry" points to VM code, but I don't understand what kind of VM code actually requires these transitions. For example a LEAF function doesn't have the transition so the code called from there can't require it - but what is the characteristic of code that can require it? > I've checked `Threads::destroy_vm`, you're right! But I'd consider missing handling after possible `false` as a bug. If you don't mind, I prefer the extra code in the existing else clause. But I agree the whole clause can be cleaned up. Thanks for pointing! I'll clean this part up under JDK-8264372. 
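The entry-point convention Anton describes (switch to write mode when VM code is entered, restore the previous mode on the way out) can be sketched as a scope guard. This is only an illustration under stated assumptions: `WXMode`, `current_wx_mode`, `WXScope` and `vm_entry_without_macro` are hypothetical names, and a plain thread-local variable stands in for the platform's per-thread write-protect toggle.

```cpp
#include <cassert>

// Hypothetical per-thread W^X state: either code pages are executable
// or they are writable, never both.
enum class WXMode { Exec, Write };

// Assumption: a thread-local variable models the per-thread toggle; a real
// port would call the platform API wherever this variable is assigned.
thread_local WXMode current_wx_mode = WXMode::Exec;

// Scope guard: switches to the requested mode on construction and restores
// the mode that was active before on destruction, so nesting works.
class WXScope {
 public:
  explicit WXScope(WXMode mode) : _saved(current_wx_mode) {
    current_wx_mode = mode;
  }
  ~WXScope() { current_wx_mode = _saved; }
  WXScope(const WXScope&) = delete;
  WXScope& operator=(const WXScope&) = delete;

 private:
  WXMode _saved;
};

// A function called directly from native code, without an ENTRY-style macro,
// has to establish the write state itself before running VM code.
int vm_entry_without_macro() {
  WXScope wx(WXMode::Write);
  assert(current_wx_mode == WXMode::Write);
  return 0;  // the previous mode is restored when wx goes out of scope
}
```

Tying the switch to a scope is one way to make the restore happen on every return path, including early returns.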
Cheers, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3241 > From dholmes at openjdk.java.net Tue Mar 30 01:20:44 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 30 Mar 2021 01:20:44 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v2] In-Reply-To: References: Message-ID: <8lcINJaJDOg62ESQch_n30qQTKXSauOH5qGJuD98T4I=.97409776-53ef-4308-bc1f-dffb6f2e907d@github.com> On Mon, 29 Mar 2021 21:22:06 GMT, Coleen Phillimore wrote: >> This change creates a Metaspace::allocate function that doesn't pass TRAPS to be used by MethodCounters. TRAPS and exceptions shouldn't be thrown from non-JavaThreads. >> >> Tested with tier1-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Only allow non-TRAPS version of Metaspace::allocate at a safepoint (or by non-Java thread) Changes requested by dholmes (Reviewer). src/hotspot/share/oops/method.cpp line 570: > 568: if (current->is_Java_thread()) { > 569: // For when TRAPS is JavaThread. > 570: counters = MethodCounters::allocate(mh, current->as_Java_thread()); I'm not at all clear what we are doing here and it seems premature to anticipate the TRAPS change to JavaThread. The code will need reworking for that change because you still check for a pending exception regardless of what type of Thread current is. I don't see why we need to call a TRAPS version of allocate when we are going to clear any exception - just call the non-traps version (and remove the assert you added). Just because we are in a JavaThread it doesn't mean we have to throw exceptions. 
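The dispatch under review in method.cpp (a throwing allocation path for Java threads, a non-throwing one for everything else) can be sketched in isolation. Everything here is hypothetical: `Arena`, `try_allocate`, `allocate_or_throw` and `build_counters` are invented for the illustration, a C++ exception stands in for a pending Java exception, and the idea that only the throwing path may raise a soft limit before retrying is an assumption of the sketch rather than a description of Metaspace.

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-ins; none of these names are HotSpot APIs.
struct Counters { int invocations = 0; };

struct Arena {
  std::size_t used = 0;
  std::size_t threshold = 0;  // soft limit; only the throwing path raises it
  std::size_t capacity = 0;   // hard limit
  int failures_logged = 0;
};

// Non-throwing version: returns nullptr once the soft limit is reached.
inline Counters* try_allocate(Arena& a) {
  if (a.used >= a.threshold) return nullptr;
  Counters* c = new (std::nothrow) Counters();
  if (c != nullptr) ++a.used;
  return c;
}

// Throwing version: wraps the non-throwing one, may raise the soft limit
// once and retry, and reports failure by throwing.
inline Counters* allocate_or_throw(Arena& a) {
  Counters* c = try_allocate(a);
  if (c == nullptr && a.threshold < a.capacity) {
    ++a.threshold;
    c = try_allocate(a);
  }
  if (c == nullptr) throw std::bad_alloc();
  return c;
}

// Caller: picks the version by thread kind, swallows the exception in the
// throwing case, and handles failure once for both paths.
inline Counters* build_counters(Arena& a, bool is_java_thread) {
  Counters* c = nullptr;
  if (is_java_thread) {
    try {
      c = allocate_or_throw(a);
    } catch (const std::bad_alloc&) {
      c = nullptr;  // analogous to clearing a pending exception
    }
  } else {
    c = try_allocate(a);
  }
  if (c == nullptr) {
    ++a.failures_logged;  // single failure path shared by both cases
  }
  return c;
}
```

In this model the throwing path is not redundant even though its exception is discarded, because it can succeed where the non-throwing path gives up.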
------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From dholmes at openjdk.java.net Tue Mar 30 01:42:46 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 30 Mar 2021 01:42:46 GMT Subject: RFR: 8264193: Remove TRAPS parameters for modules and defaultmethods In-Reply-To: <6WUbVyj36re-dOR20BOerxSJckUtyoHjwo6z9r1g-as=.fddf28f1-8cdd-4954-b734-65b36f5b8de4@github.com> References: <6WUbVyj36re-dOR20BOerxSJckUtyoHjwo6z9r1g-as=.fddf28f1-8cdd-4954-b734-65b36f5b8de4@github.com> Message-ID: On Mon, 29 Mar 2021 17:40:09 GMT, Harold Seigel wrote: > Please review this change for JDK-8264193 to remove unneeded TRAPS parameters from modules and default methods files. Besides removing TRAPS, Modules::get_named_module() was changed to return an oop instead of a jobject, removing its need for a TRAPS parameter. > > This change was tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold Great cleanup Harold! Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3247 From coleenp at openjdk.java.net Tue Mar 30 02:41:43 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 30 Mar 2021 02:41:43 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v2] In-Reply-To: <8lcINJaJDOg62ESQch_n30qQTKXSauOH5qGJuD98T4I=.97409776-53ef-4308-bc1f-dffb6f2e907d@github.com> References: <8lcINJaJDOg62ESQch_n30qQTKXSauOH5qGJuD98T4I=.97409776-53ef-4308-bc1f-dffb6f2e907d@github.com> Message-ID: On Tue, 30 Mar 2021 01:17:27 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Only allow non-TRAPS version of Metaspace::allocate at a safepoint (or by non-Java thread) > > src/hotspot/share/oops/method.cpp line 570: > >> 568: if (current->is_Java_thread()) { >> 569: // For when TRAPS is JavaThread. 
>> 570: counters = MethodCounters::allocate(mh, current->as_Java_thread()); > > I'm not at all clear what we are doing here and it seems premature to anticipate the TRAPS change to JavaThread. The code will need reworking for that change because you still check for a pending exception regardless of what type of Thread current is. > I don't see why we need to call a TRAPS version of allocate when we are going to clear any exception - just call the non-traps version (and remove the assert you added). Just because we are in a JavaThread it doesn't mean we have to throw exceptions. With this change, if the thread is not a Java thread, say the VMThread, we cannot throw an exception because we can't allocate a Java object for that exception. That's why we call the non-TRAPS version for !JavaThread. So there's no need to check for a pending exception because it can't throw one. Also, eventually we want to enforce that only JavaThreads can throw/catch Java exceptions. The change requested by Thomas was that if we *can* thrown an exception, we should call the TRAPS version because that version will adjust the GC threshold. We can't call GC with the non-TRAPS version. So this is the simplest thing. We have two versions of this function (the TRAPS version calls the non-TRAPS version). The non-TRAPS version is the special one that can be used by the VMThread for allocating metadata, all other callers stay the same. I can remove the as_Java_thread() and line 569 and you can add it back when we're able to change TRAPS to JavaThread. ------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From coleenp at openjdk.java.net Tue Mar 30 03:06:11 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 30 Mar 2021 03:06:11 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v3] In-Reply-To: References: Message-ID: > This change creates a Metaspace::allocate function that doesn't pass TRAPS to be used by MethodCounters. 
TRAPS and exceptions shouldn't be thrown from non-JavaThreads. > > Tested with tier1-7. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Make which version of MethodCounters::allocate() is called clearer. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3207/files - new: https://git.openjdk.java.net/jdk/pull/3207/files/bd7c32d4..459f63bf Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3207&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3207&range=01-02 Stats: 16 lines in 1 file changed: 8 ins; 1 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/3207.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3207/head:pull/3207 PR: https://git.openjdk.java.net/jdk/pull/3207 From dongbo at openjdk.java.net Tue Mar 30 03:22:12 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Tue, 30 Mar 2021 03:22:12 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic [v2] In-Reply-To: References: Message-ID: > In JDK-8248188, IntrinsicCandidate and API were added for Base64 decoding. > Base64 decoding can be improved on aarch64 with ld4/tbl/tbx/st3; the basic idea can be found at http://0x80.pl/articles/base64-simd-neon.html#encoding-quadwords. > > The patch passed jtreg tier1-3 tests with a linux-aarch64-server-fastdebug build. > Tests in `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java` were run specifically to check the correctness of the implementation. > > There can be illegal characters at the start of the input if the data is MIME encoded. > There would be no benefit in using SIMD for this case, so the stub now uses non-SIMD instructions for MIME encoded data. > > A JMH micro, Base64Decode.java, is added for performance testing.
> With different input lengths (upper-bounded by parameter `maxNumBytes` in the JMH micro),
> we witness ~2.5x improvements with long inputs and no regression with short inputs for raw Base64 decoding, and minor improvements (~10.95%) for MIME on Kunpeng916.
>
> The Base64Decode.java JMH micro-benchmark results:
>
> Benchmark                          (lineSize)  (maxNumBytes)  Mode  Cnt       Score      Error  Units
>
> # Kunpeng916, intrinsic
> Base64Decode.testBase64Decode               4              1  avgt    5      48.614 ±    0.609  ns/op
> Base64Decode.testBase64Decode               4              3  avgt    5      58.199 ±    1.650  ns/op
> Base64Decode.testBase64Decode               4              7  avgt    5      69.400 ±    0.931  ns/op
> Base64Decode.testBase64Decode               4             32  avgt    5      96.818 ±    1.687  ns/op
> Base64Decode.testBase64Decode               4             64  avgt    5     122.856 ±    9.217  ns/op
> Base64Decode.testBase64Decode               4             80  avgt    5     130.935 ±    1.667  ns/op
> Base64Decode.testBase64Decode               4             96  avgt    5     143.627 ±    1.751  ns/op
> Base64Decode.testBase64Decode               4            112  avgt    5     152.311 ±    1.178  ns/op
> Base64Decode.testBase64Decode               4            512  avgt    5     342.631 ±    0.584  ns/op
> Base64Decode.testBase64Decode               4           1000  avgt    5     573.635 ±    1.050  ns/op
> Base64Decode.testBase64Decode               4          20000  avgt    5    9534.136 ±   45.172  ns/op
> Base64Decode.testBase64Decode               4          50000  avgt    5   22718.726 ±  192.070  ns/op
> Base64Decode.testBase64MIMEDecode           4              1  avgt   10      63.558 ±    0.336  ns/op
> Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.504 ±    0.848  ns/op
> Base64Decode.testBase64MIMEDecode           4              7  avgt   10     120.591 ±    0.608  ns/op
> Base64Decode.testBase64MIMEDecode           4             32  avgt   10     324.314 ±    6.236  ns/op
> Base64Decode.testBase64MIMEDecode           4             64  avgt   10     532.678 ±    4.670  ns/op
> Base64Decode.testBase64MIMEDecode           4             80  avgt   10     678.126 ±    4.324  ns/op
> Base64Decode.testBase64MIMEDecode           4             96  avgt   10     771.603 ±    6.393  ns/op
> Base64Decode.testBase64MIMEDecode           4            112  avgt   10     889.608 ±    0.759  ns/op
> Base64Decode.testBase64MIMEDecode           4            512  avgt   10    3663.557 ±    3.422  ns/op
> Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7017.784 ±    9.128  ns/op
> Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  128670.660 ± 7951.521  ns/op
> Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  317113.667 ±  161.758  ns/op
>
> # Kunpeng916, default
> Base64Decode.testBase64Decode               4              1  avgt    5      48.455 ±    0.571  ns/op
> Base64Decode.testBase64Decode               4              3  avgt    5      57.937 ±    0.505  ns/op
> Base64Decode.testBase64Decode               4              7  avgt    5      73.823 ±    1.452  ns/op
> Base64Decode.testBase64Decode               4             32  avgt    5     106.484 ±    1.243  ns/op
> Base64Decode.testBase64Decode               4             64  avgt    5     141.004 ±    1.188  ns/op
> Base64Decode.testBase64Decode               4             80  avgt    5     156.284 ±    0.572  ns/op
> Base64Decode.testBase64Decode               4             96  avgt    5     174.137 ±    0.177  ns/op
> Base64Decode.testBase64Decode               4            112  avgt    5     188.445 ±    0.572  ns/op
> Base64Decode.testBase64Decode               4            512  avgt    5     610.847 ±    1.559  ns/op
> Base64Decode.testBase64Decode               4           1000  avgt    5    1155.368 ±    0.813  ns/op
> Base64Decode.testBase64Decode               4          20000  avgt    5   19751.477 ±   24.669  ns/op
> Base64Decode.testBase64Decode               4          50000  avgt    5   50046.586 ±  523.155  ns/op
> Base64Decode.testBase64MIMEDecode           4              1  avgt   10      64.130 ±    0.238  ns/op
> Base64Decode.testBase64MIMEDecode           4              3  avgt   10      82.096 ±    0.205  ns/op
> Base64Decode.testBase64MIMEDecode           4              7  avgt   10     118.849 ±    0.610  ns/op
> Base64Decode.testBase64MIMEDecode           4             32  avgt   10     331.177 ±    4.732  ns/op
> Base64Decode.testBase64MIMEDecode           4             64  avgt   10     549.117 ±    0.177  ns/op
> Base64Decode.testBase64MIMEDecode           4             80  avgt   10     702.951 ±    4.572  ns/op
> Base64Decode.testBase64MIMEDecode           4             96  avgt   10     799.566 ±    0.301  ns/op
> Base64Decode.testBase64MIMEDecode           4            112  avgt   10     923.749 ±    0.389  ns/op
> Base64Decode.testBase64MIMEDecode           4            512  avgt   10    4000.725 ±    2.519  ns/op
> Base64Decode.testBase64MIMEDecode           4           1000  avgt   10    7674.994 ±    9.281  ns/op
> Base64Decode.testBase64MIMEDecode           4          20000  avgt   10  142059.001 ±  157.920  ns/op
> Base64Decode.testBase64MIMEDecode           4          50000  avgt   10  355698.369 ±  216.542  ns/op

Dong Bo has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains four additional commits since the last revision: - trivial fixes - Handling error in SIMD case with loops, combining two non-SIMD cases into one code blob, addressing other comments - Merge branch 'master' into aarch64.base64.decode - 8256245: AArch64: Implement Base64 decoding intrinsic ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3228/files - new: https://git.openjdk.java.net/jdk/pull/3228/files/8a898aec..e658ebf4 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3228&range=00-01 Stats: 9524 lines in 363 files changed: 7727 ins; 450 del; 1347 mod Patch: https://git.openjdk.java.net/jdk/pull/3228.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3228/head:pull/3228 PR: https://git.openjdk.java.net/jdk/pull/3228 From dongbo at openjdk.java.net Tue Mar 30 03:27:40 2021 From: dongbo at openjdk.java.net (Dong Bo) Date: Tue, 30 Mar 2021 03:27:40 GMT Subject: RFR: 8256245: AArch64: Implement Base64 decoding intrinsic In-Reply-To: References: <_ZrhnM9OyXLckhtT27laLzWPZbCFZTPjm6ePbZdbyOs=.fcc6aaba-1578-443a-aa57-8141a99231f6@github.com> Message-ID: On Mon, 29 Mar 2021 08:38:59 GMT, Andrew Haley wrote: > > With an initial implementation, we can have almost half of the code size reduced (1312B -> 748B). Sounds OK to you? > > Sounds great, but I'm still somewhat concerned that the non-SIMD case only offers 3-12% performance gain. Make it just 748 bytes, and therefore not icache-hostile, then perhaps the balance of risk and reward is justified. Hi, @theRealAph @nick-arm The code is updated. The error handling in the SIMD case was rewritten as loops. I also combined the two non-SIMD code blocks into one. Since we have only one non-SIMD loop now, it is moved into `generate_base64_decodeBlock`. The size of the stub is 692 bytes; the non-SIMD loop takes about 92 bytes if my calculation is right.
Verified with tests `test/jdk/java/util/Base64/` and `compiler/intrinsics/base64/TestBase64.java`. Compared with previous implementation, the performance changes are negligible. Other comments are addressed too. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/3228 From minqi at openjdk.java.net Tue Mar 30 03:48:08 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Tue, 30 Mar 2021 03:48:08 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v7] In-Reply-To: References: Message-ID: > Hi, Please review > > Added jcmd option for dumping CDS archive during application runtime. Before this change, user has to dump shared archive in two steps: first run application with > `java -XX:DumpLoadedClassList= .... ` > to collect shareable class names and saved in file `` , then > `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` > With this change, user can use jcmd to dump CDS without going through above steps. Also user can choose a moment during the app runtime to dump an archive. > The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. > New added jcmd option: > `jcmd VM.cds static_dump ` > or > `jcmd VM.cds dynamic_dump ` > To dump dynamic archive, requires start app with newly added flag `-XX:+RecordDynamicDumpInfo`, with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changed some object memory locations so for dumping dynamic archive, can only done once for a running app. For static dump, user can dump multiple times against same process. > The file name is optional, if the file name is not supplied, the file name will take format of `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. The `` is the application process ID. 
> > Tests: tier1,tier2,tier3,tier4 > > Thanks > Yumin Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: Remove CDS.getVMArguments, changed to use VM.getRuntimeVMArguments. Removed unused function from ClassLoader. Improved InstanceKlass::is_shareable() and related test. Added more test scenarios. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2737/files - new: https://git.openjdk.java.net/jdk/pull/2737/files/3834f042..cef6328f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2737&range=06 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2737&range=05-06 Stats: 236 lines in 12 files changed: 89 ins; 69 del; 78 mod Patch: https://git.openjdk.java.net/jdk/pull/2737.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2737/head:pull/2737 PR: https://git.openjdk.java.net/jdk/pull/2737 From dholmes at openjdk.java.net Tue Mar 30 03:51:43 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 30 Mar 2021 03:51:43 GMT Subject: RFR: 8264285: Clean the modification of ccstr JVM flags In-Reply-To: References: Message-ID: <71SWS17lpVrTS_4--6mimeyjCYjYzP_VO_lJ-rImnxg=.a75340ae-d17e-4f0d-8868-21d4449d64f6@github.com> On Mon, 29 Mar 2021 21:35:52 GMT, Ioi Lam wrote: > There are two versions of JVMFlagAccess::ccstrAtPut() for modifying JVM flags of the ccstr type (i.e., strings). > > - One version requires the caller to free the old value, but some callers don't do that (writeableFlags.cpp). > - The other version frees the old value on behalf of the caller. However, this version is accessible only via FLAG_SET_XXX macros and is currently unused. So it's unclear whether it actually works. > > We should combine these two versions into a single function, fix problems in the callers, and add test cases. The old value should be freed automatically, because typically the caller isn't interested in the old value. > > Note that the FLAG_SET_XXX macros do not return the old value. 
Requiring the caller of FLAG_SET_XXX to free the old value would be tedious and error prone. Hi Ioi, This looks good to me. Thanks for fixing it up. One minor nit below. Thanks, David src/hotspot/share/services/writeableFlags.cpp line 250: > 248: if (err == JVMFlag::SUCCESS) { > 249: assert(value == NULL, "old value is freed automatically and not returned"); > 250: } The whole block should be ifdef DEBUG. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3254 From david.holmes at oracle.com Tue Mar 30 03:59:16 2021 From: david.holmes at oracle.com (David Holmes) Date: Tue, 30 Mar 2021 13:59:16 +1000 Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v2] In-Reply-To: References: <8lcINJaJDOg62ESQch_n30qQTKXSauOH5qGJuD98T4I=.97409776-53ef-4308-bc1f-dffb6f2e907d@github.com> Message-ID: <8536da1d-5049-7fee-76b0-db6f16195bdc@oracle.com> On 30/03/2021 12:41 pm, Coleen Phillimore wrote: > On Tue, 30 Mar 2021 01:17:27 GMT, David Holmes wrote: > >>> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >>> >>> Only allow non-TRAPS version of Metaspace::allocate at a safepoint (or by non-Java thread) >> >> src/hotspot/share/oops/method.cpp line 570: >> >>> 568: if (current->is_Java_thread()) { >>> 569: // For when TRAPS is JavaThread. >>> 570: counters = MethodCounters::allocate(mh, current->as_Java_thread()); >> >> I'm not at all clear what we are doing here and it seems premature to anticipate the TRAPS change to JavaThread. The code will need reworking for that change because you still check for a pending exception regardless of what type of Thread current is. >> I don't see why we need to call a TRAPS version of allocate when we are going to clear any exception - just call the non-traps version (and remove the assert you added). Just because we are in a JavaThread it doesn't mean we have to throw exceptions. 
> > With this change, if the thread is not a Java thread, say the VMThread, we cannot throw an exception because we can't allocate a Java object for that exception. That's why we call the non-TRAPS version for !JavaThread. So there's no need to check for a pending exception because it can't throw one. Also, eventually we want to enforce that only JavaThreads can throw/catch Java exceptions. > The change requested by Thomas was that if we *can* thrown an exception, we should call the TRAPS version because that version will adjust the GC threshold. We can't call GC with the non-TRAPS version. Ah I see. That makes sense. > So this is the simplest thing. We have two versions of this function (the TRAPS version calls the non-TRAPS version). The non-TRAPS version is the special one that can be used by the VMThread for allocating metadata, all other callers stay the same. > I can remove the as_Java_thread() and line 569 and you can add it back when we're able to change TRAPS to JavaThread. No that's fine. I've added feedback on the latest version. Thanks, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3207 > From dholmes at openjdk.java.net Tue Mar 30 04:01:04 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 30 Mar 2021 04:01:04 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v3] In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 03:06:11 GMT, Coleen Phillimore wrote: >> This change creates a Metaspace::allocate function that doesn't pass TRAPS to be used by MethodCounters. TRAPS and exceptions shouldn't be thrown from non-JavaThreads. >> >> Tested with tier1-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Make which version of MethodCounters::allocate() is called clearer. Hi Coleen, Updated code is much clearer but I have a suggested refactoring to avoid the duplicated OOM code. 
Thanks, David src/hotspot/share/oops/method.cpp line 570: > 568: if (current->is_Java_thread()) { > 569: Thread* THREAD = current; > 570: counters = MethodCounters::allocate(mh, THREAD); Can you add a comment before this line: // Use the TRAPS version for a JavaThread so it will adjust the GC threshold if needed. Thanks. src/hotspot/share/oops/method.cpp line 572: > 570: counters = MethodCounters::allocate(mh, THREAD); > 571: if (HAS_PENDING_EXCEPTION) { > 572: CLEAR_PENDING_EXCEPTION; // MethodData above doesn't clear exception I don't understand the comment. src/hotspot/share/oops/method.cpp line 575: > 573: CompileBroker::log_metaspace_failure(); > 574: ClassLoaderDataGraph::set_metaspace_oom(true); > 575: return NULL; You could factor this out for both cases by testing "counters == NULL". ------------- Changes requested by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3207 From stuefe at openjdk.java.net Tue Mar 30 04:04:42 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 30 Mar 2021 04:04:42 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v2] In-Reply-To: References: <8lcINJaJDOg62ESQch_n30qQTKXSauOH5qGJuD98T4I=.97409776-53ef-4308-bc1f-dffb6f2e907d@github.com> Message-ID: On Tue, 30 Mar 2021 02:38:44 GMT, Coleen Phillimore wrote: >> src/hotspot/share/oops/method.cpp line 570: >> >>> 568: if (current->is_Java_thread()) { >>> 569: // For when TRAPS is JavaThread. >>> 570: counters = MethodCounters::allocate(mh, current->as_Java_thread()); >> >> I'm not at all clear what we are doing here and it seems premature to anticipate the TRAPS change to JavaThread. The code will need reworking for that change because you still check for a pending exception regardless of what type of Thread current is. >> I don't see why we need to call a TRAPS version of allocate when we are going to clear any exception - just call the non-traps version (and remove the assert you added). 
Just because we are in a JavaThread it doesn't mean we have to throw exceptions. > > With this change, if the thread is not a Java thread, say the VMThread, we cannot throw an exception because we can't allocate a Java object for that exception. That's why we call the non-TRAPS version for !JavaThread. So there's no need to check for a pending exception because it can't throw one. Also, eventually we want to enforce that only JavaThreads can throw/catch Java exceptions. > The change requested by Thomas was that if we *can* thrown an exception, we should call the TRAPS version because that version will adjust the GC threshold. We can't call GC with the non-TRAPS version. > So this is the simplest thing. We have two versions of this function (the TRAPS version calls the non-TRAPS version). The non-TRAPS version is the special one that can be used by the VMThread for allocating metadata, all other callers stay the same. > I can remove the as_Java_thread() and line 569 and you can add it back when we're able to change TRAPS to JavaThread. > I deleted this branch by mistake, now restored. > > > I'm not sure this is correct. Your new non-TRAPS Metaspace::allocate() would fail every time the GC threshold is touched. Where the old TRAPS version would break through the threshold and allocate successfully. > > I realize this. It's just an attempt to allocate and it's designed to be used during a safepoint for only this allocation. I could change this to only call the non-TRAPS version of MethodCounters if we're at a safepoint? Would that help? Then the only time we'll miss out on metaspace counters periodically is if they were created to set breakpoints in a safepoint. I think that would be better. I am unclear on what happened in this case before; did we also miss out on allocating the Counters? > > I'd hate for this special case to know more about metaspace, ala calling ClassLoaderMetaspace::expand_and_allocate. Even within Metaspace::allocate(no TRAPS)? 
It's in metaspace land, so surely it would be fine to call expand there like this: MetaWord* Metaspace::allocate(ClassLoaderData* loader_data, size_t word_size, MetaspaceObj::Type type) { MetaWord* result = loader_data->metaspace_non_null()->allocate(...); if (!result) { result = loader_data->metaspace_non_null()->expand_and_allocate(...); } return result; } (Note that I will be going on vacation shortly and I'm a bit short on time; I'm not sure I can finish this review. If you go with your approach, my only request would be to comment the prototypes for the two allocate functions a bit more clearly and/or maybe rename one as allocate_no_exception or the other as allocate_with_exception) ------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From never at openjdk.java.net Tue Mar 30 04:30:44 2021 From: never at openjdk.java.net (Tom Rodriguez) Date: Tue, 30 Mar 2021 04:30:44 GMT Subject: RFR: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 00:35:30 GMT, Coleen Phillimore wrote: >> 8264016: [JVMCI] add some thread local fields for use by JVMCI > > Marked as reviewed by coleenp (Reviewer). Thanks for the reviews. I'll defer any loom related issues to some future date. ------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From never at openjdk.java.net Tue Mar 30 04:30:45 2021 From: never at openjdk.java.net (Tom Rodriguez) Date: Tue, 30 Mar 2021 04:30:45 GMT Subject: Integrated: 8264016: [JVMCI] add some thread local fields for use by JVMCI In-Reply-To: References: Message-ID: On Tue, 23 Mar 2021 06:11:44 GMT, Tom Rodriguez wrote: > 8264016: [JVMCI] add some thread local fields for use by JVMCI This pull request has now been integrated.
Changeset: 182b11c3 Author: Tom Rodriguez URL: https://git.openjdk.java.net/jdk/commit/182b11c3 Stats: 14 lines in 3 files changed: 14 ins; 0 del; 0 mod 8264016: [JVMCI] add some thread local fields for use by JVMCI Reviewed-by: dholmes, iklam, coleenp ------------- PR: https://git.openjdk.java.net/jdk/pull/3147 From dholmes at openjdk.java.net Tue Mar 30 05:09:45 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 30 Mar 2021 05:09:45 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp [v2] In-Reply-To: <5W8i9Wro1OWbzlUbEyeTy4TBLQmhWysLSjDcjadMygc=.a8348509-517b-48c4-be70-68b3ddb1088b@github.com> References: <5W8i9Wro1OWbzlUbEyeTy4TBLQmhWysLSjDcjadMygc=.a8348509-517b-48c4-be70-68b3ddb1088b@github.com> Message-ID: On Mon, 29 Mar 2021 08:03:58 GMT, Yasumasa Suenaga wrote: >> I tried to build OpenJDK with g++-10.2.1_pre1-r3 on Alpine Linux 3.13.2, but I saw following warning: >> >> >> 668 | alloca(((pid ^ counter++) & 7) * 128); >> | ^ >> cc1plus: all warnings being treated as errors > > Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: > > Remove alloca() from some platforms I think I am comfortable with the changes as proposed. A summary of what we found for each platform should be put in the bug report. Thanks, David src/hotspot/os/linux/os_linux.cpp line 671: > 669: static int counter = 0; > 670: int pid = os::current_process_id(); > 671: void *stackmem = alloca(((pid ^ counter++) & 7) * 128); Please add a comment: // Ensure the alloca result is used in a way that prevents the compiler from eliding it. ------------- Marked as reviewed by dholmes (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3042 From ysuenaga at openjdk.java.net Tue Mar 30 05:58:17 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Tue, 30 Mar 2021 05:58:17 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp [v3] In-Reply-To: References: Message-ID: > I tried to build OpenJDK with g++-10.2.1_pre1-r3 on Alpine Linux 3.13.2, but I saw following warning: > > > 668 | alloca(((pid ^ counter++) & 7) * 128); > | ^ > cc1plus: all warnings being treated as errors Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: Add comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3042/files - new: https://git.openjdk.java.net/jdk/pull/3042/files/4485a021..7773498f Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3042&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3042&range=01-02 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3042.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3042/head:pull/3042 PR: https://git.openjdk.java.net/jdk/pull/3042 From ysuenaga at openjdk.java.net Tue Mar 30 05:58:20 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Tue, 30 Mar 2021 05:58:20 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp [v2] In-Reply-To: References: <5W8i9Wro1OWbzlUbEyeTy4TBLQmhWysLSjDcjadMygc=.a8348509-517b-48c4-be70-68b3ddb1088b@github.com> Message-ID: On Tue, 30 Mar 2021 05:06:05 GMT, David Holmes wrote: >> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove alloca() from some platforms > > src/hotspot/os/linux/os_linux.cpp line 671: > >> 669: static int counter = 0; >> 670: int pid = os::current_process_id(); >> 671: void *stackmem = alloca(((pid ^ counter++) & 7) * 128); > > Please add a comment: > > // Ensure the alloca result is used in a 
way that prevents the compiler from eliding it. Added it in new commit. ------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From ysuenaga at openjdk.java.net Tue Mar 30 06:00:59 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Tue, 30 Mar 2021 06:00:59 GMT Subject: RFR: 8263718: unused-result warning happens at os_linux.cpp [v2] In-Reply-To: References: <5W8i9Wro1OWbzlUbEyeTy4TBLQmhWysLSjDcjadMygc=.a8348509-517b-48c4-be70-68b3ddb1088b@github.com> Message-ID: <_fwXlba2zOGlxvF4NDbUKm7psfPRid2kZaGHTpRMlE0=.75ce10c1-f848-44fe-8d5e-70044d2fe4d9@github.com> On Tue, 30 Mar 2021 05:06:21 GMT, David Holmes wrote: >> Yasumasa Suenaga has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove alloca() from some platforms > > I think I am comfortable with the changes as proposed. A summary of what we found for each platform should be put in the bug report. > > Thanks, > David Thanks @dholmes-ora for your review! > A summary of what we found for each platform should be put in the bug report. I left it to the [comment](https://bugs.openjdk.java.net/browse/JDK-8263718?focusedCommentId=14410190&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-14410190). 
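For readers following the `alloca()` discussion in this thread, here is a minimal sketch of the pattern the requested comment describes: keep the `alloca()` result and write through it, so the compiler can neither warn about an unused result nor drop the allocation. This is not the code from the PR. The names `stack_offset_bytes` and `shift_stack` are hypothetical, and the extra byte added to the allocation is an assumption of this sketch so that the write stays in bounds when the computed size is zero.

```cpp
#include <alloca.h>   // available on Linux (glibc, musl); other platforms may differ
#include <cassert>
#include <cstddef>

// Hypothetical helper: the size expression quoted in the review,
// factored out so it can be checked on its own.
static size_t stack_offset_bytes(int pid, int counter) {
  return static_cast<size_t>((pid ^ counter) & 7) * 128;
}

// Hypothetical stand-in for the thread start routine. It returns the
// offset it chose purely so that the sketch can be tested.
static size_t shift_stack(int pid) {
  static int counter = 0;  // not thread-safe; acceptable for a sketch
  size_t size = stack_offset_bytes(pid, counter++);
  // Assumption of this sketch: allocate one extra byte so the write
  // below is always inside the allocation, even when size == 0.
  void* stackmem = alloca(size + 1);
  assert(stackmem != nullptr);
  // Ensure the alloca result is used in a way that prevents the
  // compiler from eliding it.
  *static_cast<volatile char*>(stackmem) = 1;
  return size;
}
```

Note that memory obtained from `alloca()` is released when the calling function returns, so an offset like this only affects frames called from within that function.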
------------- PR: https://git.openjdk.java.net/jdk/pull/3042 From xliu at openjdk.java.net Tue Mar 30 06:18:59 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Tue, 30 Mar 2021 06:18:59 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging [v2] In-Reply-To: <0qKPSmxPH02xshv9gpcarh-LIc6xiZnlYVCDrRPtCP0=.94eaa31a-501b-4d48-ade0-ae1abf6acddf@github.com> References: <5lZuUwWVD5NwXo_gUOnUDUD4tdYUvils5Cx5X5r8elo=.5d1f1074-730a-4f88-ba67-67977ffe58d0@github.com> <37Cm5ItlFJ8nzQW_HI6-oyO_TuTK3f09BqiZ2-0l-iE=.fdde37ed-aa6b-4c28-bc30-0403542a518b@github.com> <0qKPSmxPH02xshv9gpcarh-LIc6xiZnlYVCDrRPtCP0=.94eaa31a-501b-4d48-ade0-ae1abf6acddf@github.com> Message-ID: On Mon, 29 Mar 2021 12:25:13 GMT, Yasumasa Suenaga wrote: > > But I agree that talking about the design first would be helpful. Maybe have a little mailing list thread to stop polluting this PR? > > I posted a similar discussion to hotspot-runtime-dev last November. It aims to implement sending UL via a network socket. I believe this PR helps it. > > https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-November/043427.html Interesting. This design diagram is similar to this PR, but I don't think it is a good idea to have a blocking message buffer. As mentioned in the prior thread, it makes HotSpot more subject to external factors. TCP/UDP is an even more representative example of blocking IO than a hard drive, isn't it? ### Design and its Rationale For the async-logging feature, we proposed a lossy non-blocking design here. A [bounded deque](https://github.com/openjdk/jdk/pull/3135/files#diff-5a3c326d548886f56ef0c46f4a63f7c58f76e1c51fada9a874d40d12a43f15b0R40) or ringbuffer gives a strong guarantee that log sites won't block Java threads and the critical internal threads. This is the very problem we are meant to solve. It can be proven that we cannot have all three guarantees at the same time: **non-blocking**, **bounded memory** and **log fidelity**.
To overcome blocking I/O, which sometimes is not under our control, we think it's fair to trade log fidelity for non-blocking. If we kept fidelity and chose an unbounded buffer, we could end up with some spooky out-of-memory errors on some resource-constrained hardware. We understand that the platforms HotSpot runs on range from powerful servers to embedded devices. By leaving the buffer size adjustable, we can fit more scenarios. Nevertheless, with a bounded buffer, we believe developers can still capture important logging traits as long as the window is big enough and log messages are consecutive. The current implementation does provide those two attributes. ### A new proposal based on current implementation I agree with the reviewers' comments above. It's questionable to use the singleton `WatcherThread` to do an IO-intensive job here. It may hinder other tasks. David's guess is right. I was not familiar with HotSpot threads and was quite frustrated dealing with a special-task thread's lifecycle. That's why I used PeriodicTask. I feel more confident about taking on that challenge again. Just like Yasumasa [depicted](https://gist.github.com/YaSuenag/dacb6d94d8684915422232c7a08d5b5d), I can create a dedicated NonJavaThread to flush logs instead. Yesterday, I found `WatcherThread::unpark()` uses its monitor to wake up other pending tasks. I think we can implement it this way. Once a log site observes that the buffer is half-full, it uses `monitor::notify()` to wake up the flusher thread. I think logging events are frequent but not very contentious. Waking it up for each log message is not so economical. I have a lossy buffer anyway, so I suggest having only two checkpoints: 1) half-full. 2) full. ### Wrap it up We would like to propose a lossy design of async-logging in this PR. It is a trade-off, so I don't think it's a good idea to handle all logs in async mode. In practice, we hope people only choose `async-logging` for those logs which may really happen at safepoints.
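The lossy, bounded design described above can be sketched roughly as follows. This is a hedged illustration, not the code in the PR: the class name `LossyLogBuffer`, the choice to drop the oldest message when full, and the use of `std::mutex` and `std::deque` in place of HotSpot's own primitives are all assumptions of the sketch. The point it tries to show is that a log site never waits on I/O, memory stays bounded, and fidelity is what gets traded away. The return value of `enqueue` models the two proposed wake-up checkpoints (half-full and full).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical lossy, bounded log buffer (names are illustrative only).
class LossyLogBuffer {
 public:
  explicit LossyLogBuffer(size_t capacity) : _capacity(capacity), _dropped(0) {
    assert(capacity > 0);
  }

  // Called by log sites. Holds the lock only briefly and never performs I/O.
  // Returns true when the caller should notify the flusher thread, i.e. when
  // the buffer has just reached the half-full or the full checkpoint.
  bool enqueue(std::string msg) {
    std::lock_guard<std::mutex> guard(_lock);
    if (_messages.size() == _capacity) {
      _messages.pop_front();  // assumption: sacrifice the oldest message
      ++_dropped;
    }
    _messages.push_back(std::move(msg));
    size_t n = _messages.size();
    return n == (_capacity + 1) / 2 || n == _capacity;
  }

  // Called by the flusher thread. Takes everything out under the lock so
  // that the (possibly blocking) write can happen after the lock is released.
  std::vector<std::string> drain() {
    std::deque<std::string> taken;
    {
      std::lock_guard<std::mutex> guard(_lock);
      taken.swap(_messages);
    }
    return std::vector<std::string>(taken.begin(), taken.end());
  }

  size_t dropped() const {
    std::lock_guard<std::mutex> guard(_lock);
    return _dropped;
  }

 private:
  const size_t _capacity;
  size_t _dropped;
  std::deque<std::string> _messages;
  mutable std::mutex _lock;
};
```

A flusher thread would call `drain()` and write the returned messages outside the lock, which is what keeps slow output from stalling the log sites. Reporting the `dropped()` count in the log itself is one way to make the loss visible.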
I understand Yasumasa's problem. If you would like to consider netcat or nfs/sshfs, I think your problem can still be solved in the existing file-based output. In this way, you can also utilize this feature by setting your "file" output to async mode, which makes your HotSpot non-blocking over TCP as well. ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From xliu at openjdk.java.net Tue Mar 30 07:01:52 2021 From: xliu at openjdk.java.net (Xin Liu) Date: Tue, 30 Mar 2021 07:01:52 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging [v2] In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 20:55:41 GMT, Volker Simonis wrote: >> Xin Liu has updated the pull request incrementally with two additional commits since the last revision: >> >> - 8229517: Support for optional asynchronous/buffered logging >> >> add a constraint for the option LogAsyncInterval. >> - 8229517: Support for optional asynchronous/buffered logging >> >> LogMessage supports async_mode. >> remove the option AsyncLogging >> rename the option GCLogBufferSize to AsyncLogBufferSize >> move drop_log() to LogAsyncFlusher. > > src/hotspot/share/logging/logAsyncFlusher.hpp line 116: > >> 114: bool equals(const AsyncLogMessage& o) const { >> 115: return (&_output == &o._output) && (_message == o._message || !strcmp(_message, o._message)); >> 116: } > > [`strcmp()` is not defined for `NULL`](https://en.cppreference.com/w/cpp/string/byte/strcmp) but you can have `_message == NULL` if you've transferred ownership in the copy constructor. Yes, this is a subtle bug! Thanks! I thought that if _message is NULL, then o._message must be NULL too, so _message == o._message would be true. I was wrong. > src/hotspot/share/logging/logAsyncFlusher.hpp line 124: > >> 122: >> 123: class LogAsyncFlusher : public PeriodicTask { >> 124: private: > > As far as I know, `PeriodicTask` is designed for short-running tasks.
But `LogAsyncFlusher::task()` will now call `AsyncLogMessage::writeback()` which does blocking I/O and can block for quite some time (that's why we have this change in the first place :). How does this affect the other periodic tasks and the `WatcherThread`. What's the worst case scenario if the `WatcherThread` is blocked? Is this any better than before? ack. other reviewers also raise this question. I propose a dedicated nonjavathread to flush log. ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From akozlov at openjdk.java.net Tue Mar 30 07:14:40 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 30 Mar 2021 07:14:40 GMT Subject: RFR: 8262894: [macos_aarch64] SIGBUS in Assembler::ld_st2 In-Reply-To: <03NlfuqqZVMecpcrcJ6-GqRCohFtUzRpZl2ZvlZPk7o=.226e91f5-1b82-4215-a500-0d7925fad3a9@github.com> References: <8z-yqACOKf8qU8N_NQbctwtwxojByYis8FJRfdBXxWE=.f9cb3eec-8219-4a8b-9791-ba6596667ca7@github.com> <03NlfuqqZVMecpcrcJ6-GqRCohFtUzRpZl2ZvlZPk7o=.226e91f5-1b82-4215-a500-0d7925fad3a9@github.com> Message-ID: On Mon, 29 Mar 2021 19:15:20 GMT, Anton Kozlov wrote: >> Hi Anton, >> >> In so much as I understand the fact transitions are missing the introduction of those transitions seems fine - but can be simplified due to an existing code quirk (see comments below). >> >> My main concern with this W^X stuff is that I don't see a clear way to know exactly where a transition needs to be placed. The missing cases here suggest it should be handled in the thread-state transition code, but you've previously written: >> >> " when we execute JVM code (owned by libjvm.so, starting from JVM entry function), we switch to Write state. When we leave JVM to execute generated or JNI code, we switch to Executable state. I would like to highlight that JVM code does not mean the VM state of the java thread" >> >> so I'm unclear exactly how we identify the points where these transitions must occur? What kind of "VM code" must be guarded this way? 
I don't see this documented in the code anywhere. >> >> Thanks, >> David > > Hi David, > > Thank you for the review. > >> so I'm unclear exactly how we identify the points where these transitions must occur? What kind of "VM code" must be guarded this way? I don't see this documented in the code anywhere. > > For usual JNI function implementation we switch to WXWrite in JNI_ENTRY (JNI_ENTRY_NO_PRESERVE to be precise), so xxx_ENTRY style macro defines a border between JVM and the rest of the code. JNI Invocation functions are also called directly from native code, so they should have W^X transition. But since they are implementing something rather special, they don't use any special ENTRY macro, which makes them less evident to be JVM entry points. Here and in other cases, when we do W^X in an apparently random function, it is because the function is called directly from the interpreter or native or generated code, but the function is not defined with ENTRY macro. Hope it clarifies the logic a bit. > > I've checked `Threads::destroy_vm`, you're right! But I'd consider missing handling after possible `false` as a bug. If you don't mind, I prefer the extra code in the existing else clause. But I agree the whole clause can be cleaned up. Thanks for pointing! > what kind of > VM code actually requires these transitions. For example a LEAF function > doesn't have the transition so the code called from there can't require > it - but what is the characteristic of code that can require it? I assume any VM entry code should have W^X transitions. Now VM_LEAF_BASE transits to WXWrite, and the macro is used in various kinds of LEAF's. An initial W^X implementation had no transitions in LEAF functions, but now the code and W^X policy should be much more straightforward. If we still don't have the transition somewhere, there should be a reason and a comment in the code. 
Otherwise, it's a bug :) ------------- PR: https://git.openjdk.java.net/jdk/pull/3241 From ysuenaga at openjdk.java.net Tue Mar 30 07:21:53 2021 From: ysuenaga at openjdk.java.net (Yasumasa Suenaga) Date: Tue, 30 Mar 2021 07:21:53 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging [v2] In-Reply-To: References: <5lZuUwWVD5NwXo_gUOnUDUD4tdYUvils5Cx5X5r8elo=.5d1f1074-730a-4f88-ba67-67977ffe58d0@github.com> <37Cm5ItlFJ8nzQW_HI6-oyO_TuTK3f09BqiZ2-0l-iE=.fdde37ed-aa6b-4c28-bc30-0403542a518b@github.com> <0qKPSmxPH02xshv9gpcarh-LIc6xiZnlYVCDrRPtCP0=.94eaa31a-501b-4d48-ade0-ae1abf6acddf@github.com> Message-ID: On Tue, 30 Mar 2021 06:15:06 GMT, Xin Liu wrote: >>> But I agree that talking about the design first would be helpful. Maybe have a little mailing list thread to stop polluting this PR? >> >> I posted a similar discussion to hotspot-runtime-dev last November. It aims to implement sending UL via a network socket. I believe this PR helps it. >> >> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-November/043427.html > >> > But I agree that talking about the design first would be helpful. Maybe have a little mailing list thread to stop polluting this PR? >> >> I posted a similar discussion to hotspot-runtime-dev last November. It aims to implement sending UL via a network socket. I believe this PR helps it. >> >> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-November/043427.html > > Interesting. This design diagram is similar to this PR, but I don't think it is a good idea to have a blocking message buffer. > As mentioned in the prior thread, it makes HotSpot more subject to external factors. TCP/UDP is an even more representative example of blocking IO than a hard drive, isn't it? > > ### Design and its Rationale > For the async-logging feature, we proposed a lossy non-blocking design here.
A [bounded deque](https://github.com/openjdk/jdk/pull/3135/files#diff-5a3c326d548886f56ef0c46f4a63f7c58f76e1c51fada9a874d40d12a43f15b0R40) or ringbuffer gives a strong guarantee that log sites won't block Java threads and the critical internal threads. This is the very problem we are meant to solve. > > > It can be proven that we cannot have all three guarantees at the same time: **non-blocking**, **bounded memory** and **log fidelity**. To overcome blocking I/O, which sometimes is not under our control, we think it's fair to trade log fidelity for non-blocking. If we kept fidelity and chose an unbounded buffer, we could end up with some spooky out-of-memory errors on some resource-constrained hardware. We understand that the platforms HotSpot runs on range from powerful servers to embedded devices. By leaving the buffer size adjustable, we can fit more scenarios. Nevertheless, with a bounded buffer, we believe developers can still capture important logging traits as long as the window is big enough and log messages are consecutive. The current implementation does provide those two merits. > > ### A new proposal based on current implementation > I agree with the reviewers' comments above. It's questionable to use the singleton `WatcherThread` to do an IO-intensive job here. It may hinder other tasks. David's guess is right. I was not familiar with HotSpot threads and was quite frustrated dealing with a special-task thread's lifecycle. That's why I used PeriodicTask. I feel more confident about taking on that challenge again. > > Just like Yasumasa [depicted](https://gist.github.com/YaSuenag/dacb6d94d8684915422232c7a08d5b5d), I can create a dedicated NonJavaThread to flush logs instead. Yesterday, I found `WatcherThread::unpark()` uses its monitor to wake up other pending tasks. I think we can implement it this way. Once a log site observes that the buffer is half-full, it uses `monitor::notify()` to resume the flusher thread. I think logging events are frequent but not very contentious.
Waking it up for each log message is not so economical. I have a lossy buffer anyway, so I suggest having only two checkpoints: 1) half-full. 2) full. > > ### Wrap it up > We would like to propose a lossy design of async-logging in this PR. It is a trade-off, so I don't think it's a good idea to handle all logs in async mode. In practice, we hope people only choose `async-logging` for those logs which may happen at safepoints. > > I understand Yasumasa's problem. If you would like to consider netcat or nfs/sshfs, I think your problem can still be solved in the existing file-based output. In this way, you can also utilize this feature by setting your "file" output to async mode, which makes your HotSpot non-blocking over TCP as well. This proposal mostly looks good to me, but it would be better if async support were implemented in a higher-level class. The LogFileOutput class inherits as follows, and you have modified LogFileOutput for now (you might change LogFileStreamOutput later): * LogOutput * LogFileStreamOutput * LogFileOutput I want to add async support to LogOutput or LogFileStreamOutput because it would help us if we add other UL outputs (e.g. a network socket) in the future. ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From stefank at openjdk.java.net Tue Mar 30 07:23:36 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 30 Mar 2021 07:23:36 GMT Subject: RFR: 8264271: Avoid creating non_oop_word oops [v3] In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 09:35:27 GMT, Stefan Karlsson wrote: >> Looks good. > > Thanks, Kim. Now let's see what happens now that #3214 has been updated ... The bots accidentally closed this. The PR will be reopened today or tomorrow.
------------- PR: https://git.openjdk.java.net/jdk/pull/3215 From pliden at openjdk.java.net Tue Mar 30 07:35:43 2021 From: pliden at openjdk.java.net (Per Liden) Date: Tue, 30 Mar 2021 07:35:43 GMT Subject: RFR: 8264271: Avoid creating non_oop_word oops [v2] In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 09:46:57 GMT, Stefan Karlsson wrote: >> Some parts of the JVM put a marker to show that a location does not contain a valid oop. The code that handles this typically looks like this: >> >> oop* p = ... >> if (*p != Universe::non_oop_word()) >> >> This means that sometimes the *p will create an oop that contains the non_oop_word. This makes it problematic to add stricter oop verification. I propose that we add a new function that checks the value of locations without converting it to an oop. >> >> (Note: I'm testing the new dependent pull Skara feature with this PR. It depends on the pr/3214 branch) > > Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: > > - Merge branch '8264268_dervied_pointer_types' into 8264271_avoid_creating_non_oop_word_oops > - 8264271: Avoid creating non_oop_word oops Looks good! ------------- Marked as reviewed by pliden (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3215 From stefank at openjdk.java.net Tue Mar 30 07:44:40 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 30 Mar 2021 07:44:40 GMT Subject: RFR: 8264271: Avoid creating non_oop_word oops [v3] In-Reply-To: References: Message-ID: <0kGPMCkrKaKDOYWtKuhOvdeEjMdngKSm6Idn3R57aPc=.56e0c28e-527e-426c-94bf-06865ce6733a@github.com> > Some parts of the JVM put a marker to show that a location does not contain a valid oop. The code that handles this typically looks like this: > > oop* p = ...
> if (*p != Universe::non_oop_word()) > > This means that sometimes the *p will create an oop that contains the non_oop_word. This makes it problematic to add stricter oop verification. I propose that we add a new function that checks the value of locations without converting it to an oop. > > (Note: I'm testing the new dependent pull Skara feature with this PR. It builds depends on the pr/3214 branch) Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: - Merge remote-tracking branch 'origin/master' into 8264271_avoid_creating_non_oop_word_oops - Merge branch '8264268_dervied_pointer_types' into 8264271_avoid_creating_non_oop_word_oops - Remove unused value_fn parameter - star alignment cleanup - Merge remote-tracking branch 'origin/master' into 8264268_dervied_pointer_types - Add static assert - Cleanups - derived_pointer enum class - 8264271: Avoid creating non_oop_word oops - 8264268: Don't use oop types for derived pointers ------------- Changes: https://git.openjdk.java.net/jdk/pull/3215/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3215&range=02 Stats: 58 lines in 7 files changed: 38 ins; 10 del; 10 mod Patch: https://git.openjdk.java.net/jdk/pull/3215.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3215/head:pull/3215 PR: https://git.openjdk.java.net/jdk/pull/3215 From stefank at openjdk.java.net Tue Mar 30 07:44:41 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 30 Mar 2021 07:44:41 GMT Subject: RFR: 8264271: Avoid creating non_oop_word oops [v2] In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 07:32:25 GMT, Per Liden wrote: >> Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: >> >> - Merge branch '8264268_dervied_pointer_types' into 8264271_avoid_creating_non_oop_word_oops >> - 8264271: Avoid creating non_oop_word oops > > Looks good! Thanks, Per. I'm working with Robin to fix this branch. Currently it shows changes from the dependent branch, but after a merge with openjdk/master, this should hopefully be solved. ------------- PR: https://git.openjdk.java.net/jdk/pull/3215 From manc at openjdk.java.net Tue Mar 30 07:49:00 2021 From: manc at openjdk.java.net (Man Cao) Date: Tue, 30 Mar 2021 07:49:00 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v3] In-Reply-To: References: Message-ID: > Hi all, > > Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. > > The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. > > -Man Man Cao has updated the pull request incrementally with one additional commit since the last revision: Changed to try_pop() and eliminated conditional critical section. 
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2986/files - new: https://git.openjdk.java.net/jdk/pull/2986/files/91f22bbd..7035ff56 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2986&range=02 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2986&range=01-02 Stats: 247 lines in 7 files changed: 100 ins; 88 del; 59 mod Patch: https://git.openjdk.java.net/jdk/pull/2986.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2986/head:pull/2986 PR: https://git.openjdk.java.net/jdk/pull/2986 From manc at openjdk.java.net Tue Mar 30 07:52:41 2021 From: manc at openjdk.java.net (Man Cao) Date: Tue, 30 Mar 2021 07:52:41 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v2] In-Reply-To: References: Message-ID: On Fri, 26 Mar 2021 10:43:08 GMT, Ivan Walulya wrote: >> Man Cao has updated the pull request incrementally with one additional commit since the last revision: >> >> Address comment and add a gtest. > > src/hotspot/share/utilities/lockFreeQueue.hpp line 33: > >> 31: >> 32: // The LockFreeQueue template provides a lock-free FIFO. Its structure >> 33: // and usage is similar to LockFreeStack. It has inner paddings, and > > probably need to add the conditional critical sections to the LockFreeStack for this description to be correct. But that can be done in a separate PR. Superseded by not adding the ConditionalCriticalSection. > src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 33: > >> 31: #include "utilities/lockFreeQueue.hpp" >> 32: #include "logging/log.hpp" >> 33: > > Don't we need inline specifiers for the functions below? Added. > src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 108: > >> 106: // returns released objects to a free list for reuse, it could cause >> 107: // excessive allocations. >> 108: GlobalCounter::ConditionalCriticalSection cs(use_rcu ?
> > ` GlobalCounter::ConditionalCriticalSection cs(Thread::current());` should be fine, not sure how much is gained by skipping the `Thread::current()` call. Superceded by not adding the ConditionalCriticalSection. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From manc at openjdk.java.net Tue Mar 30 08:27:47 2021 From: manc at openjdk.java.net (Man Cao) Date: Tue, 30 Mar 2021 08:27:47 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v2] In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 07:56:19 GMT, Kim Barrett wrote: >> Man Cao has updated the pull request incrementally with one additional commit since the last revision: >> >> Address comment and add a gtest. > > src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 100: > >> 98: template >> 99: template >> 100: T* LockFreeQueue::pop() { > > On further consideration I don't think this `use_rcu` conditionalized `pop` is the right path. The current behavior (with the embedded critical section) was for the specific use case in `G1DirtyCardQueueSet`. But for a general tool, I think a different approach is needed. I think better would be to not provide `pop()` at all, and instead provide `try_pop()`, which has a tri-status result: success, lost a race, or lost to an in-progress operation. So something like: > enum class LockFreeQueuePopStatus { > success, > lost_race, > operation_in_progress > }; > > // Member of LockFreeQueue > // Executes the body of the old pop loop once, with appropriate > // adjustments to return value and returning rather than retrying. > Pair try_pop(); > Then let the specific use-case determine the context in which try_pop should be called and how to handle the various possible results. > > This eliminates `ConditionalCriticalSection` (which seems strange). This also eliminates the `G1DirtyCardQueueSet`-specific subclass of `LockFreeQueue`. 
Instead we have (private) `G1DirtyCardQueueSet::pop_queue()`: > BufferNode* G1DirtyCardQueueSet::pop_queue() { > using Status = LockFreeQueuePopStatus; > Thread* current_thread = Thread::current(); > while (true) { > GlobalCounter::CriticalSection cs(current_thread); > Pair pop_result = _completed.try_pop(); > switch (pop_result.first) { > case Status::success: return pop_result.second; > case Status::operation_in_progress: return nullptr; > case Status::lost_race: break; // Try again. > } > } > } > I'm also not sure whether the G1 case actually needs the critical section inside the loop anymore. That might be a holdover from an earlier version where the operation-in-progress case did just loop to try again. It definitely does need a critical section though; the life cycle management for the BufferNodes depends on it. Thanks for the suggestion, and yes this makes sense. Done. Could you double-check if the updated comments are appropriate? My only concern is that the difference between operation_in_progress and lost_race may be too subtle for most client code. I suppose most client code can just do "retry until succeeded" like in the test. I don't expect these two cases to differ much in run-time. Is there any performance data to show that the operation_in_progress case indeed takes much longer for retrying? I checked review threads for JDK-8237143 and JDK-8238867 but didn't find any. Anyway, this CR should use the tri-status approach (and the intra-loop critical section approach), in order to avoid behavior change. 
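To make the difference between the two caller policies discussed above concrete, here is a small sketch of the calling side of a tri-status `try_pop()`. It is only an illustration under stated assumptions: `PopStatus`, `Node`, `pop_or_give_up` and `pop_until_success` are hypothetical names, and `ScriptedQueue` is a deliberately trivial stand-in that replays a fixed list of results. It is not a lock-free queue and does not model the `GlobalCounter::CriticalSection` that the G1 use case needs around each attempt.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical mirror of the tri-status result suggested in the review.
enum class PopStatus { success, lost_race, operation_in_progress };

struct Node { int value; };

// Scripted stand-in for a queue: replays pre-recorded try_pop() results so
// the caller-side loops can be exercised deterministically. NOT lock-free.
class ScriptedQueue {
 public:
  explicit ScriptedQueue(std::vector<std::pair<PopStatus, Node*>> script)
      : _script(std::move(script)), _next(0) {}

  std::pair<PopStatus, Node*> try_pop() {
    assert(_next < _script.size());
    return _script[_next++];
  }

  size_t attempts() const { return _next; }

 private:
  std::vector<std::pair<PopStatus, Node*>> _script;
  size_t _next;
};

// Policy 1 (in the spirit of the pop_queue() example in the review): retry
// after losing a race, but give up if another operation is in progress.
template <typename Queue>
Node* pop_or_give_up(Queue& q) {
  while (true) {
    std::pair<PopStatus, Node*> r = q.try_pop();
    switch (r.first) {
      case PopStatus::success:               return r.second;
      case PopStatus::operation_in_progress: return nullptr;
      case PopStatus::lost_race:             break;  // try again
    }
  }
}

// Policy 2: treat both failure statuses alike and retry until success.
template <typename Queue>
Node* pop_until_success(Queue& q) {
  while (true) {
    std::pair<PopStatus, Node*> r = q.try_pop();
    if (r.first == PopStatus::success) {
      return r.second;
    }
  }
}
```

With the first policy a caller can return early and do other work while a concurrent operation completes; with the second it simply spins. Which one is appropriate depends on the client, which is the reason for leaving the decision outside the queue.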
------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From aph at openjdk.java.net Tue Mar 30 08:50:47 2021 From: aph at openjdk.java.net (Andrew Haley) Date: Tue, 30 Mar 2021 08:50:47 GMT Subject: RFR: 8262894: [macos_aarch64] SIGBUS in Assembler::ld_st2 In-Reply-To: References: <8z-yqACOKf8qU8N_NQbctwtwxojByYis8FJRfdBXxWE=.f9cb3eec-8219-4a8b-9791-ba6596667ca7@github.com> <03NlfuqqZVMecpcrcJ6-GqRCohFtUzRpZl2ZvlZPk7o=.226e91f5-1b82-4215-a500-0d7925fad3a9@github.com> Message-ID: On Tue, 30 Mar 2021 07:11:49 GMT, Anton Kozlov wrote: > I assume any VM entry code should have W^X transitions. That sounds very extreme to me. I guess it all depends on the cost of doing the transition. I'm curious why the W^X transition isn't done only when accessing the code cache. ------------- PR: https://git.openjdk.java.net/jdk/pull/3241 From adinn at redhat.com Tue Mar 30 11:11:21 2021 From: adinn at redhat.com (Andrew Dinn) Date: Tue, 30 Mar 2021 12:11:21 +0100 Subject: RFC: JEP drafts PAC for Linux/AArch64 (JDK-8264130) and Arm64 for MacOS/AArch64 (JDK-8264131) In-Reply-To: <55F74A14-5AB8-44F7-8903-BE03AE087484@arm.com> References: <55F74A14-5AB8-44F7-8903-BE03AE087484@arm.com> Message-ID: Hi Alan, I'm fairly strongly aligned with Andrew Haley as regards this proposal, i.e. (1) PAC appears to be of far less value in a managed runtime like Java than it is for other app deployment models; (2) the thing PAC is intended to stop, i.e. updating of stacked return addresses from within user space, is actually something the JVM does legitimately and (relatively) safely. Given those two positions I think you need to come up with a strong argument to motivate employing PAC on AArch64 -- and living with whatever performance overheads it imposes -- before we proceed further. I'd also like to see a plan for how we might allow the JVM to continue safely to update return addresses.
regards, Andrew Dinn ----------- On 29/03/2021 10:01, Alan Hayward wrote: > Hi all, > > I've been investigating PAC for the AArch64 ports - figuring out what should be supported and trying it out in code. PAC is an AArch64 extension that provides instructions for signing and authenticating values and addresses; it can be used to bring protection against various types of attacks, for a small performance cost. If OpenJDK is running on a system with PAC protection enabled in the kernel, then it should use the feature. > > I've started by implementing the same support as GCC/LLVM - namely signing return addresses. So far I have this seemingly fully working in interpreter only; and C1/C2 crashing in deoptimization. > I've also got an early attempt at MacOS arm64e. In addition to signing return addresses, arm64e requires signing function pointers. > The upcoming PAuth ABI for Linux includes all of the above plus additional features. I've not made any attempt at this yet. > > All of this comes at a cost. Current estimate is 3% on average for signing return addresses. This almost vanishes on non PAC hardware, or when the feature is disabled, as in both cases the PAC instructions are treated as NOPs. Arm64e has the advantage that it is compiled twice within the same fat binary meaning the arm64e version will not have the extra NOPs. > > I've opened JEPs for both the Linux and Arm64e work. These are my first attempts at writing a JEP, so any comments would be greatly appreciated. > > PAC-RET protection for Linux/AArch64: > https://bugs.openjdk.java.net/browse/JDK-8264130 > > Arm64e support for MacOS/AArch64: > https://bugs.openjdk.java.net/browse/JDK-8264131 > > > Thanks, > Alan. > > IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged.
If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. > -- regards, Andrew Dinn ----------- Red Hat Distinguished Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill From akozlov at openjdk.java.net Tue Mar 30 11:12:41 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Tue, 30 Mar 2021 11:12:41 GMT Subject: RFR: 8262894: [macos_aarch64] SIGBUS in Assembler::ld_st2 In-Reply-To: References: <8z-yqACOKf8qU8N_NQbctwtwxojByYis8FJRfdBXxWE=.f9cb3eec-8219-4a8b-9791-ba6596667ca7@github.com> <03NlfuqqZVMecpcrcJ6-GqRCohFtUzRpZl2ZvlZPk7o=.226e91f5-1b82-4215-a500-0d7925fad3a9@github.com> Message-ID: <2Jbigp6NqZdI47olHz7xS_M6RYaZ9M7bNCyTiOKz6hs=.96efe010-6ee9-4894-8fd7-9709ea8df343@github.com> On Tue, 30 Mar 2021 08:47:40 GMT, Andrew Haley wrote: > > I assume any VM entry code should have W^X transitions. > > That sounds very extreme to me. I guess it all depends on the cost of doing the transition. It is made for simplicity and robustness. I did a few iterations fixing various issues until this scheme came up, which I pretty confident now in correctness. Yes, it depends on the transition to be fast enough. To get the sense of "fast" I did https://github.com/openjdk/jdk/pull/2200#issuecomment-773382787. But this approach can be and probably should be optimized. > I'm curious why the W^X transition isn't done only when accessing the code cache. There are a lot of places where we need or potentially need the transition. Every deoptimization may want to write to codecache. Adding explicit transitions is rather tedious and error-prone. I've tried this and it takes significant time to get to `java -version` working, without guarantees it will work to something bigger. 
Another approach is to do a lazy transition to WXWrite on segfault after we try to write to codecache in VM and to ensure WXExec during exit from VM. Whether it is beneficial depends on the ratio between entries into the runtime and the ones that actually need WXWrite. This raises a question about a realistic workload to measure effects after various W^X approaches. I used the first iteration of macro benchmarks like Renaissance and SPECjvm2008 with zero warm-up, on the assumption that the largest number of runtime calls and deoptimizations happens there. Is there a better workload for this purpose? ------------- PR: https://git.openjdk.java.net/jdk/pull/3241 From Alan.Hayward at arm.com Tue Mar 30 11:54:21 2021 From: Alan.Hayward at arm.com (Alan Hayward) Date: Tue, 30 Mar 2021 11:54:21 +0000 Subject: RFC: JEP drafts PAC for Linux/AArch64 (JDK-8264130) and Arm64 for MacOS/AArch64 (JDK-8264131) In-Reply-To: <077be8df-e810-ba35-c465-54b008f69ad7@redhat.com> References: <55F74A14-5AB8-44F7-8903-BE03AE087484@arm.com> <077be8df-e810-ba35-c465-54b008f69ad7@redhat.com> Message-ID: <9CFED213-8655-4F76-BDA7-59CE97762A65@arm.com> > On 29 Mar 2021, at 16:03, Andrew Haley wrote: > > On 3/29/21 10:01 AM, Alan Hayward wrote: > >> I've been investigating PAC for the AArch64 ports - figuring out >> what should be supported and trying it out in code. PAC is an >> AArch64 extension that provides instructions for signing and >> authenticating values and addresses; it can be used to bring >> protection against various types of attacks, for a small performance >> cost. If OpenJDK is running on a system with PAC protection enabled >> in the kernel, then it should use the feature. > > I question this "should", and would like to see a "because." I > understand why this stuff is attractive, but as I understand it PAC is > mostly a band-aid for unsafe programming languages with nasty features > like stack-allocated buffer overflows.
In a sense, what PAC is trying > to do is bring C and C++ closer to languages such as Java by > strengthening pointer checks, which is no bad thing. I am aware, of > course, that mistakes can be made in HotSpot itself, which is written > in C++, so there may be some point to it in the JVM. Agreed with these points. Part of me would like to say that at a minimum we should compile the JVM with the branch protection GCC/LLVM flags (with no other changes required). But, the jvm will still be generating lots of code which is exploitable, which is all the attacker needs. The flow of attack then would be exploiting a bug in the JVM to gain write access and then using that to chain together gadgets in the generated code. In the JEP I mentioned 100,000 gadgets in a simple helloworld run. I need to do more investigation to figure out what percentage comes from generated code, but even a small percentage is quite a lot. > > I'd like to see how complex the implementation is likely to be, > especially given that we deliberately, as a matter of design, redirect > return address pointers in several places. (That's not a bug, it's a > feature.) PAC support potentially makes the AArch64 port significantly > more complex, and therefore potentially more buggy. I know that Arm > wants this feature to be used where possible, but it has to be > justified by the benefit of users. In the general case, it's just a matter of adding a small amount of code into enter() and replacing calls to ret(lr) with a new function that does the right thing. The non-simple cases are a bit more subtle. Given that I'm still working through issues, I'm not certain where the code will end up. I've already seen some of the return address redirection, and I've had some situations where I need to sign using the esp instead of the sp. I really don't want to end up in a situation where developers need to take extra special care because of PAC every time they add new functionality into the AArch64 backend.
I'm hoping the right set of macro assembler functions should hide everything. But, I'm very conscious of the danger of having code that's potentially fragile on hardware that isn't (yet) widely available. I'll be happy to share some code once I'm a little further through the issues. All the above is just assuming Linux. For MacOS everything is amplified because of the additional protection. If Arm64e becomes the default on MacOS, then I wonder if the JVM will be "required" to produce arm64e compliant code. (Thanks for the reply) Alan. > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From magnus.ihse.bursie at oracle.com Tue Mar 30 11:57:02 2021 From: magnus.ihse.bursie at oracle.com (Magnus Ihse Bursie) Date: Tue, 30 Mar 2021 13:57:02 +0200 Subject: RFR: 8263028: Windows build fails due to several treat-warning-as-errors In-Reply-To: References: <6cv_HeWJ9HsBrB7NSFU-TGl4PP0Tp820KzwJ-FRn_so=.e4d8ab11-4203-4278-a829-43b6f1626465@github.com> <_4snjvydeDKDu6aZgC1Fqr_kc6sHII7F7Ywinh3H9Rw=.61007752-81c3-44a0-911f-4dd184259966@github.com> Message-ID: On 2021-03-29 16:44, Yi Yang wrote: > On Mon, 29 Mar 2021 09:56:09 GMT, Magnus Ihse Bursie wrote: > >> This could be done by: %comspec% runas /profile /user:yourotheruser "the_application_you_want_to_run_in_english" or using the GUI (shift+right click on the icon, select Run as different user). > Thanks for your investigations and kind suggestions. It is more troublesome to add a new user to the Chinese system and set its system locale to English.
Instead of doing this, I prefer to directly change the system locale to English. I understand that you find it easier to just change your local settings. Could I nevertheless trouble you to try out the method described above, i.e. to create an English locale user, and run the build from a shell started by %comspec% runas /profile /user:yourotheruser c:\cygwin64\bin\bash.exe ? If that works, I'd like to document it in the build manual as an alternative for those who do not want to change the locale of their main user account. > You convinced me, I agree with you that stating these has far-reaching consequences and your internal test matrix will become incredibly heavy. However, I think we can add a section in the FAQ or other places in the building document to give a solution for such problems as much as possible, e.g. I have opened https://bugs.openjdk.java.net/browse/JDK-8264425 about updating the build documentation. I suggest that this PR/issue can now be closed. /Magnus From neliasso at openjdk.java.net Tue Mar 30 12:13:22 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Tue, 30 Mar 2021 12:13:22 GMT Subject: RFR: 8263582: WB_IsMethodCompilable ignores compiler directives [v2] In-Reply-To: References: <14AlrpsyzREqo4XE-uDRzIsIeLnLcRzB5SIC9UU66Oc=.9d544536-3d41-40e7-a888-1bfd3484f8f5@github.com> Message-ID: On Fri, 26 Mar 2021 22:29:55 GMT, Vladimir Kozlov wrote: > I thought `exclude` command does not specify which compilation level (and corresponding compiler) is disabled - it disables all compilations. > But maybe it is not true for directives. @neliasso, please, correct me if I am wrong. You can set Exclude differently for c1 and c2. That's sometimes very handy when creating more complex combinations of directives.
------------- PR: https://git.openjdk.java.net/jdk/pull/3195 From neliasso at openjdk.java.net Tue Mar 30 12:13:14 2021 From: neliasso at openjdk.java.net (Nils Eliasson) Date: Tue, 30 Mar 2021 12:13:14 GMT Subject: RFR: 8263582: WB_IsMethodCompilable ignores compiler directives [v2] In-Reply-To: References: Message-ID: <2_9uxN3Z44gvJa4lJsakbaBTaccWCnQkvBIeceftIYI=.7485a181-7314-4b31-9790-cfa9bdb38f12@github.com> On Thu, 25 Mar 2021 15:41:45 GMT, Christian Hagedorn wrote: >> While playing around with `WB_IsMethodCompilable` together with `compileonly` I ran into some surprising results for methods that should never be compiled (not part of `compileonly`): `isMethodCompilable` returns true instead of false when such an excluded method was not yet tried to be compiled. >> >> The reason for it is that `WB_IsMethodCompilable` directly checks `CompilationPolicy::can_be_compiled()` which calls `Method::is_not_compilable()`. However, the `ExcludeOption` compiler directive is only evaluated lazily upon a compilation attempt. Therefore, if a method was not tried to be compiled, yet, `Method::is_not_compilable()` always returns false, regardless of any set compiler directive. >> >> I therefore suggest to additionally check the `ExcludeOption` in `WB_IsMethodCompilable`. I also cleaned up some wrong use of `CompLevel_any` and `CompLevel_all` as suggested by @veresov: `CompLevel_any` should only be used to query the state as in `is_*()` methods and `CompLevel_all` when changing the state is in `set_*()` methods. >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: > > fix typo Looks good. ------------- Marked as reviewed by neliasso (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/3195 From hseigel at openjdk.java.net Tue Mar 30 12:16:31 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 30 Mar 2021 12:16:31 GMT Subject: RFR: 8264193: Remove TRAPS parameters for modules and defaultmethods In-Reply-To: References: <6WUbVyj36re-dOR20BOerxSJckUtyoHjwo6z9r1g-as=.fddf28f1-8cdd-4954-b734-65b36f5b8de4@github.com> Message-ID: <3SIedhLpUax0_zUv_0UU6Qokqi8YiatPxwgCvGKbJYM=.c89d8306-e980-4268-a51c-54e76b1bc16d@github.com> On Tue, 30 Mar 2021 01:40:02 GMT, David Holmes wrote: >> Please review this change for JDK-8264193 to remove unneeded TRAPS parameters from modules and default methods files. Besides removing TRAPS, Modules::get_named_module() was changed to return an oop instead of a jobject, removing its need for a TRAPS parameter. >> >> This change was tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. >> >> Thanks, Harold > > Great cleanup Harold! > > Thanks, > David Thanks Lois, Calvin, Coleen, and David for the reviews! ------------- PR: https://git.openjdk.java.net/jdk/pull/3247 From hseigel at openjdk.java.net Tue Mar 30 12:23:23 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Tue, 30 Mar 2021 12:23:23 GMT Subject: Integrated: 8264193: Remove TRAPS parameters for modules and defaultmethods In-Reply-To: <6WUbVyj36re-dOR20BOerxSJckUtyoHjwo6z9r1g-as=.fddf28f1-8cdd-4954-b734-65b36f5b8de4@github.com> References: <6WUbVyj36re-dOR20BOerxSJckUtyoHjwo6z9r1g-as=.fddf28f1-8cdd-4954-b734-65b36f5b8de4@github.com> Message-ID: On Mon, 29 Mar 2021 17:40:09 GMT, Harold Seigel wrote: > Please review this change for JDK-8264193 to remove unneeded TRAPS parameters from modules and default methods files. Besides removing TRAPS, Modules::get_named_module() was changed to return an oop instead of a jobject, removing its need for a TRAPS parameter. 
> > This change was tested with Mach5 tiers 1 and 2 on Linux, Mac OS, and Windows, and Mach5 tiers 3-5 on Linux x64. > > Thanks, Harold This pull request has now been integrated. Changeset: 6e74c3ab Author: Harold Seigel URL: https://git.openjdk.java.net/jdk/commit/6e74c3ab Stats: 61 lines in 8 files changed: 1 ins; 13 del; 47 mod 8264193: Remove TRAPS parameters for modules and defaultmethods Reviewed-by: lfoltan, ccheung, coleenp, dholmes ------------- PR: https://git.openjdk.java.net/jdk/pull/3247 From rehn at openjdk.java.net Tue Mar 30 13:20:10 2021 From: rehn at openjdk.java.net (Robbin Ehn) Date: Tue, 30 Mar 2021 13:20:10 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging [v2] In-Reply-To: References: <5lZuUwWVD5NwXo_gUOnUDUD4tdYUvils5Cx5X5r8elo=.5d1f1074-730a-4f88-ba67-67977ffe58d0@github.com> <37Cm5ItlFJ8nzQW_HI6-oyO_TuTK3f09BqiZ2-0l-iE=.fdde37ed-aa6b-4c28-bc30-0403542a518b@github.com> <0qKPSmxPH02xshv9gpcarh-LIc6xiZnlYVCDrRPtCP0=.94eaa31a-501b-4d48-ade0-ae1abf6acddf@github.com> Message-ID: <38hXykjHJTEhOD0CAggi-VnbcQra-I9Js8BWeM86s88=.ef06f49b-ab26-4b3a-827e-da79ba242302@github.com> On Tue, 30 Mar 2021 07:19:08 GMT, Yasumasa Suenaga wrote: >>> > But I agree that talking about the design first would be helpful. Maybe have a little mailing list thread to stop polluting this PR? >>> >>> I posted similar diacussion to hotspot-runtime-dev last November. It aims to implement to send UL via network socket. I believe this PR helps it. >>> >>> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-November/043427.html >> >> Interesting. This design diagram is similar to this PR, but I don't think it is a good idea to have a blocking message buffer. >> As mentioned in prior thread, it makes hotspot be more subject to external factors. TCP/UPD is an even more representative example of blocking IO than harddrive, isn't it? 
>> >> ### Design and its Rationale >> For async-logging feature, we proposed a lossy non-blocking design here. A [bounded deque](https://github.com/openjdk/jdk/pull/3135/files#diff-5a3c326d548886f56ef0c46f4a63f7c58f76e1c51fada9a874d40d12a43f15b0R40) or ringbuffer gives a strong guarantee that log sites won't block java threads and the critical internal threads. This is the very problem we are meant to solve. >> >> >> It can be proven that we cannot have all three guarantees at the same time: **non-blocking**, **bounded memory** and **log fidelity**. To overcome blocking I/O, which sometimes is not under our control, we think it's fair to trade log fidelity for non-blocking. If we kept fidelity and chose unbound buffer, we could end up with some spooky out-of-memory errors on some resource-constrained hardwares. We understand that the platforms hotspot running range from powerful servers to embedded devices. By leaving the buffer size adjustable, we can fit more scenarios. Nevertheless, with a bounded buffer, we believe developers can still capture important logging traits as long as the window is big enough and log messages are consecutive. The current implementation does provide those two merits. >> >> ### A new proposal based on current implementation >> I agree with reviewers' comments above. It's questionable to use the singleton `WatcherThread` to do IO-intensive job here. It may hinder other tasks. David's guess is right. I was not familiar with hotspot thread and quite frustrated to deal with a special-task thread's lifecycle. That why I used PeriodicTask. I feel more confident to take that challenge again. >> >> Just like Yasumasa [depicted](https://gist.github.com/YaSuenag/dacb6d94d8684915422232c7a08d5b5d), I can create a dedicated NonJavaThread to flush logs instead. Yesterday, I found `WatcherThread::unpark()` uses its monitor to wake up other pending tasks. I think we can implement in this way. 
Once log sites observe the buffer is half-full, it uses `monitor::notify()` to resume flusher thread. I think logging event is high-frequent but less contentious. Waking it up for each log message is not so economical. I have a lossy buffer anyway, so I suggest to have two checkpoints only: 1) half-full. 2) full. >> >> ### Wrap it up >> We would like to propose a lossy design of async-logging in this PR. It is a trade off, so I don't think it's a good idea to handle all logs in async mode. In practice, we hope people only choose `async-logging` for those logs which may happen at safepoints. >> >> I understand Yasumasa's problem. If you would like to consider netcat or nfs/sshfs, I think your problem can still be solved in the existing file-based output. In this way, you can also utilize this feature by setting your "file" output async mode, then it makes your hotspot non-blocking over TCP as well. > > This proposal mostly looks good to me, but it is better if async support is implement in higher level class. > LogFileOutput class inherited as following, and you modified LogFileOutput now (you might change LogFileStreamOutput later) > > * LogOutput > * LogFileStreamOutput > * LogFileOutput > > I want to add async support to LogFileStreamOutput or LogFileStreamOutput because it helps us if we add other UL output (e.g. network socket) in future. Hi Xin, regrading the VM thread blocking on logs. If you instead use two arrays, one active and one for flushing, you can swap them with atomic stores from the flushing thread. And use GlobalCounter::write_synchronize(); to make sure no writer is still using the swapped out array for logging. The logging thread would use GlobalCounter::critical_section_begin(), atomic inc position to get the spot in the array for the log, store the log and then GlobalCounter::critical_section_end(). That way you will never block a logging thread with the flushing and run enqueues in parallel. 
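The two-array scheme described above can be sketched in standalone C++. This is a hedged illustration only: `DoubleBufferLog`, `CAPACITY` and the per-buffer in-flight counter are hypothetical, and the counter is a plain stand-in for `GlobalCounter` critical sections and `write_synchronize()`, not how HotSpot implements them. Unlike the constant-time property mentioned in the thread, this stand-in lets a writer retry if the buffers are swapped underneath it, and it stores `std::string` values, so it still allocates.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch: two fixed-size arrays, one active for writers and one
// being flushed. Messages that do not fit are dropped (lossy by design).
class DoubleBufferLog {
  static constexpr std::size_t CAPACITY = 8;  // slots per buffer (arbitrary)
  std::string _slots[2][CAPACITY];
  std::atomic<std::size_t> _pos[2];           // next free slot per buffer
  std::atomic<int> _inflight[2];              // writers currently inside
  std::atomic<int> _active;                   // index of the active buffer

public:
  DoubleBufferLog() : _active(0) {
    for (int i = 0; i < 2; i++) {
      _pos[i].store(0);
      _inflight[i].store(0);
    }
  }

  // Called by logging threads. Returns false if the message was dropped
  // because the active buffer was already full.
  bool log(const std::string& msg) {
    while (true) {
      int idx = _active.load();
      _inflight[idx].fetch_add(1);            // stand-in for critical_section_begin()
      if (_active.load() != idx) {            // flusher swapped buffers under us
        _inflight[idx].fetch_sub(1);
        continue;                             // retry on the new active buffer
      }
      std::size_t slot = _pos[idx].fetch_add(1);  // claim a slot
      bool stored = slot < CAPACITY;
      if (stored) {
        _slots[idx][slot] = msg;
      }
      _inflight[idx].fetch_sub(1);            // stand-in for critical_section_end()
      return stored;
    }
  }

  // Called by the single flusher thread; returns the drained messages.
  std::vector<std::string> flush() {
    int old = _active.load();
    assert(old == 0 || old == 1);
    _active.store(1 - old);                   // new writers use the other array
    while (_inflight[old].load() != 0) {
      // stand-in for write_synchronize(): wait for writers still in 'old'
    }
    std::size_t n = _pos[old].load();
    if (n > CAPACITY) {
      n = CAPACITY;                           // claims past the end were dropped
    }
    std::vector<std::string> out(_slots[old], _slots[old] + n);
    _pos[old].store(0);                       // reusable after the next swap
    return out;
  }
};
```

In this sketch writers never wait for I/O; only the flusher waits, and only for writers that entered the swapped-out array before the swap.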
If you want really want smooth logging you could also remove the strdup, since it may cause a syscall to "break in" more memory. To solve that you could use the arrays as memory instead, and do bump the pointer allocation with an atomic increment to size needed instead of position. I tested a bit locally generally I don't think there is an issue with blocking the VM thread on flushing. So I'm not really that concern about this, but it's always nice to have an algorithm which is constant time instead. (Neither CS begin()/end() or atomic inc can fail or loop on x86) ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From lucy at openjdk.java.net Tue Mar 30 13:20:20 2021 From: lucy at openjdk.java.net (Lutz Schmidt) Date: Tue, 30 Mar 2021 13:20:20 GMT Subject: RFR: 8261957: [PPC64] Support for Concurrent Thread-Stack Processing In-Reply-To: References: Message-ID: On Fri, 5 Mar 2021 09:58:35 GMT, Martin Doerr wrote: > I'd like to support Concurrent Thread-Stack Processing on PPC64. This will be needed by ShenandoahGC and zGC when implemented. Maybe for other purposes in the future, too. > I'm using conditional trap instructions by default, so we don't need the extra stubs unless -XX:-UseSIGTRAP is used. > > Original change: https://github.com/openjdk/jdk/commit/b9873e18 The changes look good to me. I had only one minor suggestion. It's your choice to accept or disregard it. Thank you for contributing this enhancement! src/hotspot/cpu/ppc/macroAssembler_ppc.cpp line 3069: > 3067: fp = R0; > 3068: ld(fp, _abi0(callers_sp), R1_SP); > 3069: } You could move this block down into the else path, preceding the cmpld() ------------- Marked as reviewed by lucy (Reviewer). 
PR: https://git.openjdk.java.net/jdk/pull/2841 From stefank at openjdk.java.net Tue Mar 30 13:32:18 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 30 Mar 2021 13:32:18 GMT Subject: Integrated: 8264271: Avoid creating non_oop_word oops In-Reply-To: References: Message-ID: On Fri, 26 Mar 2021 12:01:35 GMT, Stefan Karlsson wrote: > Some parts of the JVM puts an marker to show that a location does not contain a valid oop. The code that handles this typically look like this: > > oop* p = ... > if (*p != Universe::non_oop_word()) > > This means that sometimes the *p will create an oop that contains the non_oop_word. This makes it problematic to add stricter oop verification. I propose that we add a new function that checks the value of locations without converting it to an oop. > > (Note: I'm testing the new dependent pull Skara feature with this PR. It builds depends on the pr/3214 branch) This pull request has now been integrated. Changeset: 2c9365d7 Author: Stefan Karlsson URL: https://git.openjdk.java.net/jdk/commit/2c9365d7 Stats: 58 lines in 7 files changed: 38 ins; 10 del; 10 mod 8264271: Avoid creating non_oop_word oops Reviewed-by: kbarrett, pliden ------------- PR: https://git.openjdk.java.net/jdk/pull/3215 From stuefe at openjdk.java.net Tue Mar 30 13:36:09 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Tue, 30 Mar 2021 13:36:09 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging [v2] In-Reply-To: <38hXykjHJTEhOD0CAggi-VnbcQra-I9Js8BWeM86s88=.ef06f49b-ab26-4b3a-827e-da79ba242302@github.com> References: <5lZuUwWVD5NwXo_gUOnUDUD4tdYUvils5Cx5X5r8elo=.5d1f1074-730a-4f88-ba67-67977ffe58d0@github.com> <37Cm5ItlFJ8nzQW_HI6-oyO_TuTK3f09BqiZ2-0l-iE=.fdde37ed-aa6b-4c28-bc30-0403542a518b@github.com> <0qKPSmxPH02xshv9gpcarh-LIc6xiZnlYVCDrRPtCP0=.94eaa31a-501b-4d48-ade0-ae1abf6acddf@github.com> <38hXykjHJTEhOD0CAggi-VnbcQra-I9Js8BWeM86s88=.ef06f49b-ab26-4b3a-827e-da79ba242302@github.com> Message-ID: On Tue, 30 Mar 
2021 13:17:05 GMT, Robbin Ehn wrote: > Hi Xin, regrading the VM thread blocking on logs. > > If you instead use two arrays, one active and one for flushing, you can swap them with atomic stores from the flushing thread. > And use GlobalCounter::write_synchronize(); to make sure no writer is still using the swapped out array for logging. > > The logging thread would use GlobalCounter::critical_section_begin(), atomic inc position to get the spot in the array for the log, store the log and then GlobalCounter::critical_section_end(). > > That way you will never block a logging thread with the flushing and run enqueues in parallel. > > If you want really want smooth logging you could also remove the strdup, since it may cause a syscall to "break in" more memory. > To solve that you could use the arrays as memory instead, and do bump the pointer allocation with an atomic increment to size needed instead of position. +1. This is what I meant with my strdup() critique. Does the Deque does not also allocate memory for its entries dynamically? If yes, we'd have at least two allocations per log, which I would avoid. I'd really prefer a simple stupid fixed-sized array here (or two, the double buffering Robbin proposed is a nice touch). As I wrote before, this would make UL also more robust in case we ever want to log low level VM stuff without running into circularities. Ideally, UL should never have relied on VM infrastructure to begin with. That is a design flaw IMHO. UL calling - while logging - into os::malloc makes me deeply uneasy. > > I tested a bit locally generally I don't think there is an issue with blocking the VM thread on flushing. > So I'm not really that concern about this, but it's always nice to have an algorithm which is constant time instead. 
(Neither CS begin()/end() or atomic inc can fail or loop on x86) ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From mdoerr at openjdk.java.net Tue Mar 30 14:17:53 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Tue, 30 Mar 2021 14:17:53 GMT Subject: RFR: 8261957: [PPC64] Support for Concurrent Thread-Stack Processing [v2] In-Reply-To: References: Message-ID: > I'd like to support Concurrent Thread-Stack Processing on PPC64. This will be needed by ShenandoahGC and zGC when implemented. Maybe for other purposes in the future, too. > I'm using conditional trap instructions by default, so we don't need the extra stubs unless -XX:-UseSIGTRAP is used. > > Original change: https://github.com/openjdk/jdk/commit/b9873e18 Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: cleanup MacroAssembler::safepoint_poll ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2841/files - new: https://git.openjdk.java.net/jdk/pull/2841/files/68326ddc..66a90b69 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2841&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2841&range=00-01 Stats: 26 lines in 1 file changed: 7 ins; 10 del; 9 mod Patch: https://git.openjdk.java.net/jdk/pull/2841.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2841/head:pull/2841 PR: https://git.openjdk.java.net/jdk/pull/2841 From mdoerr at openjdk.java.net Tue Mar 30 14:21:16 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Tue, 30 Mar 2021 14:21:16 GMT Subject: RFR: 8261957: [PPC64] Support for Concurrent Thread-Stack Processing [v2] In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 13:16:35 GMT, Lutz Schmidt wrote: >> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: >> >> cleanup MacroAssembler::safepoint_poll > > The changes look good to me. > I had only one minor suggestion. 
It's your choice to accept or disregard it. > Thank you for contributing this enhancement! Thanks for the review and the suggestion. I've cleaned up MacroAssembler::safepoint_poll. ------------- PR: https://git.openjdk.java.net/jdk/pull/2841 From kbarrett at openjdk.java.net Tue Mar 30 14:47:17 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 30 Mar 2021 14:47:17 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v2] In-Reply-To: References: Message-ID: <1dNwCuM6w1b1ZOKz2vIsKYoH8Y70TyRggc9QZUSazoY=.8cc0379f-0cf6-4535-b5f0-a1e43379ed40@github.com> On Tue, 30 Mar 2021 08:24:16 GMT, Man Cao wrote: >> src/hotspot/share/utilities/lockFreeQueue.inline.hpp line 100: >> >>> 98: template >>> 99: template >>> 100: T* LockFreeQueue::pop() { >> >> On further consideration I don't think this `use_rcu` conditionalized `pop` is the right path. The current behavior (with the embedded critical section) was for the specific use case in `G1DirtyCardQueueSet`. But for a general tool, I think a different approach is needed. I think better would be to not provide `pop()` at all, and instead provide `try_pop()`, which has a tri-status result: success, lost a race, or lost to an in-progress operation. So something like: >> enum class LockFreeQueuePopStatus { >> success, >> lost_race, >> operation_in_progress >> }; >> >> // Member of LockFreeQueue >> // Executes the body of the old pop loop once, with appropriate >> // adjustments to return value and returning rather than retrying. >> Pair try_pop(); >> Then let the specific use-case determine the context in which try_pop should be called and how to handle the various possible results. >> >> This eliminates `ConditionalCriticalSection` (which seems strange). This also eliminates the `G1DirtyCardQueueSet`-specific subclass of `LockFreeQueue`. 
Instead we have (private) `G1DirtyCardQueueSet::pop_queue()`: >> BufferNode* G1DirtyCardQueueSet::pop_queue() { >> using Status = LockFreeQueuePopStatus; >> Thread* current_thread = Thread::current(); >> while (true) { >> GlobalCounter::CriticalSection cs(current_thread); >> Pair pop_result = _completed.try_pop(); >> switch (pop_result.first) { >> case Status::success: return pop_result.second; >> case Status::operation_in_progress: return nullptr; >> case Status::lost_race: break; // Try again. >> } >> } >> } >> I'm also not sure whether the G1 case actually needs the critical section inside the loop anymore. That might be a holdover from an earlier version where the operation-in-progress case did just loop to try again. It definitely does need a critical section though; the life cycle management for the BufferNodes depends on it. > > Thanks for the suggestion, and yes this makes sense. Done. Could you double-check if the updated comments are appropriate? > > My only concern is that the difference between operation_in_progress and lost_race may be too subtle for most client code. I suppose most client code can just do "retry until succeeded" like in the test. > I don't expect these two cases to differ much in run-time. Is there any performance data to show that the operation_in_progress case indeed takes much longer for retrying? I checked review threads for JDK-8237143 and JDK-8238867 but didn't find any. > > Anyway, this CR should use the tri-status approach (and the intra-loop critical section approach), in order to avoid behavior change. In the "lost-race" cases, there was cmpxchg contention that the current thread lost. Retry is a reasonable thing to do, though that's up to the application. Success on retry is of course not guaranteed, because there could again be contention with some other thread, but at least somebody is making progress. 
In the "operation-in-progress" cases the queue is in a state where the current operation cannot make progress until a different thread changes the state. If that other thread happens to have been descheduled or something like that, it could be a (comparatively) very long time before that happens, with lots of expensive spinning by a retrying try_pop. So I think the comment for that state needs some more work. Maybe something like this? "An in-progress concurrent operation interfered with taking what had been the only remaining element in the queue. A concurrent try_pop may have already claimed it, but not completely updated the queue. Alternatively, a concurrent push/append may have not yet linked the new entry(s) to the former sole entry. Retrying the try_pop will continue to fail in this way until that other thread has updated the queue's internal structure." That was sufficient for the G1DCQS usage, so I stopped trying to do better. (I don't even know if it's possible to do better, though I have some old notes on an idea.) But it's pretty ugly for a general utility, which is why I left this queue class buried inside G1DCQS rather than hoisting it out and generalizing it. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From stefank at openjdk.java.net Tue Mar 30 14:52:20 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Tue, 30 Mar 2021 14:52:20 GMT Subject: RFR: 8264346: nullptr_t undefined in global namespace for clang+libstdc++ Message-ID: There's a mismatch in some toolchains about what part should provide the nullptr_t definition. This patch takes the easy way out: it includes cstddef and changes the two usages of nullptr_t to std::nullptr_t. See the bug report for more details. We could have redefined nullptr_t to resolve this, but that would have required more extensive testing, so I left that as a potential future cleanup. I've tested this by compiling with clang on Linux.
I'm going to let GHA testing run, and will also run our tier1 testing. ------------- Commit messages: - 8264346: nullptr_t undefined in global namespace for clang+libstdc++ Changes: https://git.openjdk.java.net/jdk/pull/3269/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3269&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264346 Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3269.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3269/head:pull/3269 PR: https://git.openjdk.java.net/jdk/pull/3269 From kbarrett at openjdk.java.net Tue Mar 30 14:53:23 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 30 Mar 2021 14:53:23 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v3] In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 07:49:00 GMT, Man Cao wrote: >> Hi all, >> >> Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. >> >> The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. >> >> -Man > > Man Cao has updated the pull request incrementally with one additional commit since the last revision: > > Changed to try_pop() and eliminated conditional critical section. Changes requested by kbarrett (Reviewer).
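
The caller-side handling of a tri-status `try_pop()` result, as discussed in this review, can be sketched roughly as follows. This is only an illustration under stated assumptions: `PopStatus`, `ScriptedQueue` and `pop_or_give_up` are hypothetical names, and `ScriptedQueue` is a single-threaded stub that replays scripted results. It is not a lock-free queue and not the HotSpot `LockFreeQueue`; it exists only so the retry policy can be shown in isolation.

```cpp
#include <cassert>
#include <deque>
#include <utility>

// Hypothetical stand-in for the status enum proposed in the review.
enum class PopStatus { success, lost_race, operation_in_progress };

// Scripted single-threaded stub, NOT a lock-free queue: it replays a fixed
// sequence of results so the caller-side handling can be exercised.
template <typename T>
class ScriptedQueue {
  std::deque<std::pair<PopStatus, T*>> _script;

public:
  void expect(PopStatus status, T* node) { _script.push_back({status, node}); }

  std::pair<PopStatus, T*> try_pop() {
    assert(!_script.empty() && "script exhausted");
    std::pair<PopStatus, T*> result = _script.front();
    _script.pop_front();
    return result;
  }
};

// Caller policy modelled on the pop_queue() loop quoted earlier in the thread:
// retry after a lost race, give up (nullptr) when another operation is in
// progress, because retrying could spin for a long time.
template <typename T, typename Queue>
T* pop_or_give_up(Queue& queue) {
  while (true) {
    std::pair<PopStatus, T*> result = queue.try_pop();
    switch (result.first) {
      case PopStatus::success:               return result.second;
      case PopStatus::operation_in_progress: return nullptr;
      case PopStatus::lost_race:             break;  // Try again.
    }
  }
}
```

A different client could choose another policy for `operation_in_progress`, for example yielding before it retries; the point of the tri-status result is that this choice stays with the caller.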
------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From gziemski at openjdk.java.net Tue Mar 30 16:13:06 2021 From: gziemski at openjdk.java.net (Gerard Ziemski) Date: Tue, 30 Mar 2021 16:13:06 GMT Subject: RFR: 8262894: [macos_aarch64] SIGBUS in Assembler::ld_st2 In-Reply-To: <8z-yqACOKf8qU8N_NQbctwtwxojByYis8FJRfdBXxWE=.f9cb3eec-8219-4a8b-9791-ba6596667ca7@github.com> References: <8z-yqACOKf8qU8N_NQbctwtwxojByYis8FJRfdBXxWE=.f9cb3eec-8219-4a8b-9791-ba6596667ca7@github.com> Message-ID: On Mon, 29 Mar 2021 11:39:31 GMT, Anton Kozlov wrote: > Please review a fix for compiler/debug/VerifyAdapterSharing.java failure on macos/aarch64 platform. The root cause is in missing W^X switch in JNI DestroyJavaVM. > > I reviewed the rest of the JNI Invoke Interface functions. DetachCurrentThread needs a similar change, although nothing fails immediately. So DetachCurrentThread is changed as a precaution. Looks good to me, thank you for fixing it. ------------- Marked as reviewed by gziemski (Committer). PR: https://git.openjdk.java.net/jdk/pull/3241 From kvn at openjdk.java.net Tue Mar 30 16:17:10 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 30 Mar 2021 16:17:10 GMT Subject: RFR: 8263582: WB_IsMethodCompilable ignores compiler directives [v2] In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 15:41:45 GMT, Christian Hagedorn wrote: >> While playing around with `WB_IsMethodCompilable` together with `compileonly` I ran into some surprising results for methods that should never be compiled (not part of `compileonly`): `isMethodCompilable` returns true instead of false when such an excluded method was not yet tried to be compiled. >> >> The reason for it is that `WB_IsMethodCompilable` directly checks `CompilationPolicy::can_be_compiled()` which calls `Method::is_not_compilable()`. However, the `ExcludeOption` compiler directive is only evaluated lazily upon a compilation attempt. 
Therefore, if no compilation of a method has been attempted yet, `Method::is_not_compilable()` always returns false, regardless of any set compiler directive. >> >> I therefore suggest to additionally check the `ExcludeOption` in `WB_IsMethodCompilable`. I also cleaned up some wrong use of `CompLevel_any` and `CompLevel_all` as suggested by @veresov: `CompLevel_any` should only be used to query the state as in `is_*()` methods and `CompLevel_all` when changing the state as in `set_*()` methods. >> >> Thanks, >> Christian > > Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: > > fix typo Marked as reviewed by kvn (Reviewer). ------------- PR: https://git.openjdk.java.net/jdk/pull/3195 From kvn at openjdk.java.net Tue Mar 30 16:17:10 2021 From: kvn at openjdk.java.net (Vladimir Kozlov) Date: Tue, 30 Mar 2021 16:17:10 GMT Subject: RFR: 8263582: WB_IsMethodCompilable ignores compiler directives [v2] In-Reply-To: References: <14AlrpsyzREqo4XE-uDRzIsIeLnLcRzB5SIC9UU66Oc=.9d544536-3d41-40e7-a888-1bfd3484f8f5@github.com> Message-ID: On Tue, 30 Mar 2021 12:06:36 GMT, Nils Eliasson wrote: >> I thought `exclude` command does not specify which compilation level (and corresponding compiler) is disabled - it disables all compilations. >> But maybe it is not true for directives. @neliasso, please, correct me if I am wrong. > >> I thought `exclude` command does not specify which compilation level (and corresponding compiler) is disabled - it disables all compilations. >> But maybe it is not true for directives. @neliasso, please, correct me if I am wrong. > > You can set Exclude differently for c1 and c2. That's sometimes very handy when creating more complex combinations of directives. Changes good then.
------------- PR: https://git.openjdk.java.net/jdk/pull/3195 From volker.simonis at gmail.com Tue Mar 30 18:18:33 2021 From: volker.simonis at gmail.com (Volker Simonis) Date: Tue, 30 Mar 2021 20:18:33 +0200 Subject: Request For Comment: Asynchronous Logging Message-ID: Hi, I'd like to (re)start a discussion on asynchronous logging [1,2,3,4]. We are successfully using this feature productively at Amazon both in jdk8 and jdk11 to reduce the tail latency of services which use logging. We think that async logging is a useful addition to the current logging framework which might be beneficial to a wider range of users. The following write-up tries to capture the comments and suggestions from the previous discussions we are aware of. Current state: - HotSpot uses the so called "Unified Logging" (UL) framework which was introduced by JEP 158 [5] in JDK 9. Most logs have been retrofitted to use UL since then (e.g. "JEP 271: Unified GC Logging" [6]). - The current UL implementation is based on the standard C buffered stream I/O interface [7]. The LogFileStreamOutput class which writes logs to abstract FILE streams is the only child of the abstract base class LogOutput. LogFileStreamOutput has three child classes LogStdoutOutput, LogStderrOutput and LogFileOutput which write to stdout, stderr or an arbitrary file respectively. The initial UL JEP 158 [5] envisioned logging to a socket but has not implemented it. At least one such extension has been proposed since then [8]. - UL synchronizes logs from different threads with the help of the standard C flockfile()/funlockfile() [9] functions. But even without this explicit locking, all the "stdio functions are thread-safe. This is achieved by assigning to each FILE object a lockcount and (if the lockcount is nonzero) an owning thread. For each library call, these functions wait until the FILE object is no longer locked by a different thread, then lock it, do the requested I/O, and unlock the object again" [9]. 
A quick look at the glibc sources reveals that FILE locking is implemented with the help of futex() [10] which breaks down to a simple atomic compare and swap (CAS) on the fast path. - Notice that UL "synchronizes" logs from different threads to avoid log interleaving. But it does not "serialize" logs according to the time at which they occurred. This is because the underlying stdio functions do not guarantee a specific order for different threads waiting on a locked FILE stream. E.g. if three log events A, B, C occur in that order, the first will lock the output stream. If the log events B and C both arrive while the stream is locked, it is unspecified which of B and C will be logged first after A releases the lock. Problem statement: - The amount of time a log event will block its FILE stream depends on the underlying file system. This can range from a few nanoseconds for in-memory file systems or milliseconds for physical discs under heavy load up to several seconds in the worst case scenario for e.g. network file systems. A blocked log output stream will block all concurrent threads which try to write log messages at the same time. If logging is done during a safepoint, this can significantly increase the safepoint time (e.g. several parallel GC threads trying to log at the same time). We can treat stdout/stderr as special files here without loss of generality. Proposed solution: Extend UL with an asynchronous logging facility. Asynchronous logging will be optional and disabled by default. It should have the following properties: - If disabled (the default) asynchronous logging should have no observable impact on logging. - If enabled, log messages will be stored in an intermediate data structure (e.g. a double ended queue).
- A service thread will concurrently read and remove log messages from that data structure in a FIFO style and write them to the output stream - Storing log messages in the intermediate data structure should take constant time and not longer than logging a message takes in the traditional UL system (in general the time should be much shorter because the actual I/O is deferred). - Asynchronous logging trades memory overhead (i.e. the size of the intermediate data structure) for log accuracy. This means that in the unlikely case where the service thread which does the asynchronous logging falls behind the log producing threads, some logs might be lost. However, the probability for this to happen can be minimized by increasing the configurable size of the intermediate data structure. - The final output produced by asynchronous logging should be equal to the output from normal logging if no messages had to be dropped. Notice that in contrast to the traditional unified logging, asynchronous logging will give us the possibility to not only synchronize log events, but to optionally also serialize them based on their generation time if that's desired. This is because we are in full control of the synchronization primitives for the intermediate data structure which stores the logs. - If log messages had to be dropped, this should be logged in the log output (e.g. "[..] 42 messages dropped due to async logging") - Asynchronous logging should ideally be implemented in such a way that it can be easily adapted by alternative log targets like for example sockets in the future. Alternative solutions: - It has repeatedly been suggested to place the log files into a memory file system but we don't think this is an optimal solution. Main memory is often a constrained resource and we don't want log files to compete with the JVM for it in such cases. 
- It has also been argued to place the log files on a fast file system which is only used for logging but in containerized environments file systems are often virtualized and the properties of the underlying physical devices are not obvious. - The load on the file system might be unpredictable due to other applications on the same host. - All these workarounds won't work if we eventually implement direct logging to a network socket as suggested in [8]. Implementation details / POC: - A recent pull request [2] for JDK-8229517 [3] proposed to use a simple deque implementation derived from HotSpot's LinkedListImpl class for the intermediate data structure. It synchronizes access to the queue with a MutexLocker which is internally implemented with pthread_mutex_lock() and results in an atomic CAS on the fast path. So performance-wise the locking itself is not different from the flockfile()/funlockfile() functionality currently used by UL but adding a log message to the deque should be constant as it basically only requires a strdup(). And we could even eliminate the strdup() if we'd pre-allocate a big enough array for holding the log messages as proposed in the pull request [2]. - The pull request [2] for JDK-8229517 [3] proposed to set the async flag as an attribute of the Xlog option which feels more natural because UL configuration is traditionally done within the Xlog option. But we could just as well use a global -XX flag to enable async logging? What are your preferences here? - The pull request [2] for JDK-8229517 [3] (mis)uses the WatcherThread as service thread to concurrently process the intermediate data structure and write the log messages out to the log stream. That should definitely be changed to an independent service thread. - The pull request [2] for JDK-8229517 [3] initially proposed that the "service thread" runs at a fixed interval to dump log messages to the log streams.
But reviewers commented that this should better happen either continuously or based on the filling level of the intermediate data structure. What are your preferences here? - What are your preferences on the configuration of the intermediate data structure? Should it be configured based on the maximum number of log messages it can store or rather on the total size of the stored log messages? I think that for normal runs this distinction probably won't make a big difference because the size of log messages will probably be the same on average so "number of log messages" should always be proportional to "total size of log messages". 1. Before diving into more implementation details, I'd first like to reach a general consensus that asynchronous logging is a useful feature that's worthwhile adding to HotSpot. 2. Once we agree on that, we should agree on the desired properties of asynchronous logging. I've tried to collect a basic set in the "Proposed solution" section. 3. If that's done as well, we can discuss various implementation details and finally prepare new pull requests.
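
To make the properties from the "Proposed solution" section more concrete, here is a minimal sketch of a bounded intermediate buffer that drops messages when full and reports the number of dropped messages on the next flush. It is an assumption-laden illustration, not the code from the pull request: `AsyncLogBuffer`, `enqueue` and `drain` are hypothetical names, standard C++ containers and `std::mutex` stand in for HotSpot's own data structures and locks, and the capacity is counted in messages rather than bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration only; none of these names exist in HotSpot.
class AsyncLogBuffer {
  std::mutex _lock;
  std::deque<std::string> _messages;
  const std::size_t _capacity;  // Maximum number of buffered messages.
  std::size_t _dropped = 0;     // Messages discarded since the last drain.

public:
  explicit AsyncLogBuffer(std::size_t capacity) : _capacity(capacity) {
    assert(capacity > 0);
  }

  // Called by logging threads: no I/O, only a short critical section.
  // Returns false if the buffer was full and the message was dropped.
  bool enqueue(std::string message) {
    std::lock_guard<std::mutex> guard(_lock);
    if (_messages.size() >= _capacity) {
      ++_dropped;
      return false;
    }
    _messages.push_back(std::move(message));
    return true;
  }

  // Called by the service thread: takes everything buffered so far in FIFO
  // order and appends a note if messages were dropped in the meantime.
  // The caller writes the result to the output stream outside the lock.
  std::vector<std::string> drain() {
    std::deque<std::string> taken;
    std::size_t dropped = 0;
    {
      std::lock_guard<std::mutex> guard(_lock);
      taken.swap(_messages);
      std::swap(dropped, _dropped);
    }
    std::vector<std::string> out(taken.begin(), taken.end());
    if (dropped > 0) {
      out.push_back(std::to_string(dropped) +
                    " messages dropped due to async logging");
    }
    return out;
  }
};
```

In this sketch the slow file I/O happens only in whoever calls `drain()`, so a stalled file system delays the service thread but not the threads that call `enqueue()`. Whether the capacity should be expressed in messages or in bytes is exactly the open question raised above.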
Thank you and best regards, Volker [1] https://bugs.openjdk.java.net/browse/JDK-8229517 [2] https://github.com/openjdk/jdk/pull/3135 [3] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-November/043427.html [4] https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-August/039130.html [5] https://openjdk.java.net/jeps/158 [6] https://openjdk.java.net/jeps/271 [7] https://man7.org/linux/man-pages/man3/stdio.3.html [8] https://gist.github.com/YaSuenag/dacb6d94d8684915422232c7a08d5b5d [9] https://man7.org/linux/man-pages/man3/flockfile.3.html [10] https://man7.org/linux/man-pages/man2/futex.2.html From simonis at openjdk.java.net Tue Mar 30 18:31:09 2021 From: simonis at openjdk.java.net (Volker Simonis) Date: Tue, 30 Mar 2021 18:31:09 GMT Subject: RFR: 8229517: Support for optional asynchronous/buffered logging [v2] In-Reply-To: References: <5lZuUwWVD5NwXo_gUOnUDUD4tdYUvils5Cx5X5r8elo=.5d1f1074-730a-4f88-ba67-67977ffe58d0@github.com> <37Cm5ItlFJ8nzQW_HI6-oyO_TuTK3f09BqiZ2-0l-iE=.fdde37ed-aa6b-4c28-bc30-0403542a518b@github.com> <0qKPSmxPH02xshv9gpcarh-LIc6xiZnlYVCDrRPtCP0=.94eaa31a-501b-4d48-ade0-ae1abf6acddf@github.com> <38hXykjHJTEhOD0CAggi-VnbcQra-I9Js8BWeM86s88=.ef06f49b-ab26-4b3a-827e-da79ba242302@github.com> Message-ID: On Tue, 30 Mar 2021 13:33:27 GMT, Thomas Stuefe wrote: >> Hi Xin, regarding the VM thread blocking on logs. >> >> If you instead use two arrays, one active and one for flushing, you can swap them with atomic stores from the flushing thread. >> And use GlobalCounter::write_synchronize(); to make sure no writer is still using the swapped out array for logging. >> >> The logging thread would use GlobalCounter::critical_section_begin(), atomic inc position to get the spot in the array for the log, store the log and then GlobalCounter::critical_section_end(). >> >> That way you will never block a logging thread with the flushing and run enqueues in parallel.
>> >> If you really want smooth logging you could also remove the strdup, since it may cause a syscall to "break in" more memory. >> To solve that you could use the arrays as memory instead, and do bump the pointer allocation with an atomic increment to size needed instead of position. >> >> I tested a bit locally; generally I don't think there is an issue with blocking the VM thread on flushing. >> So I'm not really that concerned about this, but it's always nice to have an algorithm which is constant time instead. (Neither CS begin()/end() or atomic inc can fail or loop on x86) > >> Hi Xin, regarding the VM thread blocking on logs. >> >> If you instead use two arrays, one active and one for flushing, you can swap them with atomic stores from the flushing thread. >> And use GlobalCounter::write_synchronize(); to make sure no writer is still using the swapped out array for logging. >> >> The logging thread would use GlobalCounter::critical_section_begin(), atomic inc position to get the spot in the array for the log, store the log and then GlobalCounter::critical_section_end(). >> >> That way you will never block a logging thread with the flushing and run enqueues in parallel. >> >> If you really want smooth logging you could also remove the strdup, since it may cause a syscall to "break in" more memory. >> To solve that you could use the arrays as memory instead, and do bump the pointer allocation with an atomic increment to size needed instead of position. > > +1. This is what I meant with my strdup() critique. Does the Deque not also allocate memory for its entries dynamically? If yes, we'd have at least two allocations per log, which I would avoid. I'd really prefer a simple stupid fixed-sized array here (or two, the double buffering Robbin proposed is a nice touch). > > As I wrote before, this would make UL also more robust in case we ever want to log low level VM stuff without running into circularities.
Ideally, UL should never have relied on VM infrastructure to begin with. That is a design flaw IMHO. UL calling - while logging - into os::malloc makes me deeply uneasy. > >> >> I tested a bit locally; generally I don't think there is an issue with blocking the VM thread on flushing. >> So I'm not really that concerned about this, but it's always nice to have an algorithm which is constant time instead. (Neither CS begin()/end() or atomic inc can fail or loop on x86) Thanks everybody for your valuable comments. As requested in the PR, I've just started a [new discussion thread on hotspot-dev](https://mail.openjdk.java.net/pipermail/hotspot-dev/2021-March/050491.html) (with all current reviewers on CC). Before diving into more discussions about implementation details, I'd first like to: 1. Reach general consensus that asynchronous logging is a useful feature that's worthwhile adding to HotSpot. 2. Agree on the desired properties of asynchronous logging. I've tried to collect a basic set of desired properties in the "Proposed solution" section of that mail. 3. Discuss various implementation details and finally prepare new pull requests based on those discussions. Your comments, suggestions and contributions are highly appreciated. Thank you and best regards, Volker ------------- PR: https://git.openjdk.java.net/jdk/pull/3135 From kbarrett at openjdk.java.net Tue Mar 30 19:11:41 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Tue, 30 Mar 2021 19:11:41 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v3] In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 07:49:00 GMT, Man Cao wrote: >> Hi all, >> >> Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. >> >> The shared LockFreeQueue is similar to the existing LockFreeStack.
The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. >> >> -Man > > Man Cao has updated the pull request incrementally with one additional commit since the last revision: > > Changed to try_pop() and eliminated conditional critical section. src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp line 155: > 153: return pop_result.second; > 154: case Status::operation_in_progress: > 155: // This could happen when a concurrent operation interferes with I think most of this comment doesn't belong here. What's here now is largely restating what should be the documentation for `operation_in_progress`. src/hotspot/share/utilities/lockFreeQueue.hpp line 56: > 54: DEFINE_PAD_MINUS_SIZE(1, DEFAULT_CACHE_LINE_SIZE, sizeof(T*)); > 55: T* volatile _tail; > 56: DEFINE_PAD_MINUS_SIZE(2, DEFAULT_CACHE_LINE_SIZE, sizeof(T*)); LockFreeQueue should probably document its padding. And maybe should not have the second (trailing) padding, but leave that to the specific usage. The trailing padding in the original was consistent with the usage there. Other uses might not need any trailing padding, or might need less. src/hotspot/share/utilities/lockFreeQueue.hpp line 98: > 96: // The operation succeeded. If pair.second is NULL, the queue is empty; > 97: // otherwise caller can assume ownership of the object pointed by > 98: // pair.second.
Note that this case still subjects to ABA behavior; s/case still subjects/case is still subject/ ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From manc at openjdk.java.net Tue Mar 30 19:11:39 2021 From: manc at openjdk.java.net (Man Cao) Date: Tue, 30 Mar 2021 19:11:39 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v4] In-Reply-To: References: Message-ID: <34dwf1No_CG6jWti-W-fYMp63qgBVrYPgZrNKNLp-vs=.b5655d16-d4cf-4d71-b8d5-1175f7e58953@github.com> > Hi all, > > Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. > > The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. > > -Man Man Cao has updated the pull request incrementally with one additional commit since the last revision: Revise comments and move trailing padding. 
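
The padding point raised in the review (keep the head and tail pointers on separate cache lines, but leave any trailing padding to the embedding class) can be illustrated with a small standalone sketch. This is not the HotSpot code: `QueueEnds` and `kCacheLine` are hypothetical names, the 64-byte line size is an assumption, and a plain `char` array stands in for the `DEFINE_PAD_MINUS_SIZE` macro.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Assumed cache line size for illustration; HotSpot uses
// DEFAULT_CACHE_LINE_SIZE, whose value is platform dependent.
constexpr std::size_t kCacheLine = 64;

template <typename T>
struct QueueEnds {
  std::atomic<T*> head{nullptr};
  // Fill the rest of head's cache line so that tail starts on the next one
  // (assuming the struct itself is placed on a cache line boundary).
  char pad_after_head[kCacheLine - sizeof(std::atomic<T*>)];
  std::atomic<T*> tail{nullptr};
  // Deliberately no trailing padding: a user that needs tail isolated from
  // whatever follows can add its own padding after this struct.
};

static_assert(offsetof(QueueEnds<int>, tail) == kCacheLine,
              "tail should start one cache line after head");
```

With this layout a thread that updates `tail` does not invalidate the cache line holding `head`, which is the false-sharing concern the padding addresses; what follows `tail` in memory is left for the specific usage to decide.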
------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/2986/files - new: https://git.openjdk.java.net/jdk/pull/2986/files/7035ff56..844b3713 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=2986&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=2986&range=02-03 Stats: 26 lines in 3 files changed: 6 ins; 6 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/2986.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2986/head:pull/2986 PR: https://git.openjdk.java.net/jdk/pull/2986 From manc at openjdk.java.net Tue Mar 30 19:11:42 2021 From: manc at openjdk.java.net (Man Cao) Date: Tue, 30 Mar 2021 19:11:42 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v2] In-Reply-To: <1dNwCuM6w1b1ZOKz2vIsKYoH8Y70TyRggc9QZUSazoY=.8cc0379f-0cf6-4535-b5f0-a1e43379ed40@github.com> References: <1dNwCuM6w1b1ZOKz2vIsKYoH8Y70TyRggc9QZUSazoY=.8cc0379f-0cf6-4535-b5f0-a1e43379ed40@github.com> Message-ID: On Tue, 30 Mar 2021 14:44:11 GMT, Kim Barrett wrote: >> Thanks for the suggestion, and yes this makes sense. Done. Could you double-check if the updated comments are appropriate? >> >> My only concern is that the difference between operation_in_progress and lost_race may be too subtle for most client code. I suppose most client code can just do "retry until succeeded" like in the test. >> I don't expect these two cases to differ much in run-time. Is there any performance data to show that the operation_in_progress case indeed takes much longer for retrying? I checked review threads for JDK-8237143 and JDK-8238867 but didn't find any. >> >> Anyway, this CR should use the tri-status approach (and the intra-loop critical section approach), in order to avoid behavior change. > > In the "lost-race" cases, there was cmpxchg contention that the current thread lost. Retry is a reasonable thing to do, though that's up to the application. 
Success on retry is of course not guaranteed, because there could again be contention with some other thread, but at least somebody is making progress. > > In the "operation-in-progress" cases the queue is in a state where the current operation cannot make progress until a different thread changes the state. If that other thread happens to have been descheduled or something like that, it could be a (comparatively) very long time before that happens, with lots of expensive spinning by a retrying try_pop. > > So I think the comment for that state needs some more work. Maybe something like this? > > "An in-progress concurrent operation interfered with taking what had been the only remaining element in the queue. A concurrent try_pop may have already claimed it, but not completely updated the queue. Alternatively, a concurrent push/append may have not yet linked the new entry(s) to the former sole entry. Retrying the try_pop will continue to fail in this way until that other thread has updated the queue's internal structure." > > That was sufficient for the G1DCQS usage, so I stopped trying to do better. (I don't even know if it's possible to do better, though I have some old notes on an idea.) But it's pretty ugly for a general utility, which is why I left this queue class buried inside G1DCQS rather than hoisting it out and generalizing it. Thanks for the explanation. Updated as suggested. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From manc at openjdk.java.net Tue Mar 30 19:11:42 2021 From: manc at openjdk.java.net (Man Cao) Date: Tue, 30 Mar 2021 19:11:42 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v3] In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 14:47:33 GMT, Kim Barrett wrote: >> Man Cao has updated the pull request incrementally with one additional commit since the last revision: >> >> Changed to try_pop() and eliminated conditional critical section. 
> > src/hotspot/share/utilities/lockFreeQueue.hpp line 56: > >> 54: DEFINE_PAD_MINUS_SIZE(1, DEFAULT_CACHE_LINE_SIZE, sizeof(T*)); >> 55: T* volatile _tail; >> 56: DEFINE_PAD_MINUS_SIZE(2, DEFAULT_CACHE_LINE_SIZE, sizeof(T*)); > > LockFreeQueue should probably document its padding. And maybe should not have the second (trailing) padding, but leave that to the specific usage. The trailing padding in the original was consistent with the usage there. Other uses might not need any trailing padding, or might need less. Moved the padding. ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From xxinliu at amazon.com Tue Mar 30 21:21:15 2021 From: xxinliu at amazon.com (Liu, Xin) Date: Tue, 30 Mar 2021 21:21:15 +0000 Subject: Request For Comment: Asynchronous Logging In-Reply-To: References: Message-ID: <7FEF3035-8926-467C-AD7B-A001A9C8FD5B@amazon.com> Thanks Volker for this. I would like to append some additional materials. I forgot to mention them when I wrote the Rationale[5] yesterday. We identified and root-caused the tail-latency on a Linux system with software RAID in 2018. We have different implementations on jdk8u and jdk11u. We are seeking to merge this feature to tip. Nonetheless, it doesn't mean "async-logging facility" only solves Amazon's peculiar problem. When we studied this, we found many interesting references. E.g. LinkedIn reported and analyzed it well[1]. In particular, they mentioned that one reason was Linux cache writeback [2]. IMHO, that could impact almost all mass-storage Linux filesystems. Twitter also expressed that "I would love to hear if this can happen with OpenJDK!"[3]. This is also reported by other companies[4].
Thanks, --lx [1] https://engineering.linkedin.com/blog/2016/02/eliminating-large-jvm-gc-pauses-caused-by-background-io-traffic [2] https://yoshinorimatsunobu.blogspot.com/2014/03/why-buffered-writes-are-sometimes.html [3] https://www.evanjones.ca/jvm-mmap-pause-finding.html [4] https://mail.openjdk.java.net/pipermail/hotspot-dev/2020-June/042301.html [5] https://github.com/openjdk/jdk/pull/3135#issuecomment-809942487 On 3/30/21, 11:20 AM, "Volker Simonis" wrote: Hi, I'd like to (re)start a discussion on asynchronous logging [1,2,3,4]. We are successfully using this feature productively at Amazon both in jdk8 and jdk11 to reduce the tail latency of services which use logging. We think that async logging is a useful addition to the current logging framework which might be beneficial to a wider range of users. The following write-up tries to capture the comments and suggestions from the previous discussions we are aware of. Current state: - HotSpot uses the so called "Unified Logging" (UL) framework which was introduced by JEP 158 [5] in JDK 9. Most logs have been retrofitted to use UL since then (e.g. "JEP 271: Unified GC Logging" [6]). - The current UL implementation is based on the standard C buffered stream I/O interface [7]. The LogFileStreamOutput class which writes logs to abstract FILE streams is the only child of the abstract base class LogOutput. LogFileStreamOutput has three child classes LogStdoutOutput, LogStderrOutput and LogFileOutput which write to stdout, stderr or an arbitrary file respectively. The initial UL JEP 158 [5] envisioned logging to a socket but has not implemented it. At least one such extension has been proposed since then [8]. - UL synchronizes logs from different threads with the help of the standard C flockfile()/funlockfile() [9] functions.
But even without this explicit locking, all the "stdio functions are thread-safe. This is achieved by assigning to each FILE object a lockcount and (if the lockcount is nonzero) an owning thread. For each library call, these functions wait until the FILE object is no longer locked by a different thread, then lock it, do the requested I/O, and unlock the object again" [9]. A quick look at the glibc sources reveals that FILE locking is implemented with the help of futex() [10] which breaks down to a simple atomic compare and swap (CAS) on the fast path. - Notice that UL "synchronizes" logs from different threads to avoid log interleaving. But it does not "serialize" logs according to the time at which they occurred. This is because the underlying stdio functions do not guarantee a specific order for different threads waiting on a locked FILE stream. E.g. if three log events A, B, C occur in that order, the first will lock the output stream. If the log events B and C both arrive while the stream is locked, it is unspecified which of B and C will be logged first after A releases the lock. Problem statement: - The amount of time a log event will block its FILE stream depends on the underlying file system. This can range from a few nanoseconds for in-memory file systems or milliseconds for physical discs under heavy load up to several seconds in the worst case scenario for e.g. network file systems. A blocked log output stream will block all concurrent threads which try to write log messages at the same time. If logging is done during a safepoint, this can significantly increase the safepoint time (e.g. several parallel GC threads trying to log at the same time). We can treat stdout/stderr as special files here without loss of generality. Proposed solution: Extend UL with an asynchronous logging facility. Asynchronous logging will be optional and disabled by default.
It should have the following properties: - If disabled (the default) asynchronous logging should have no observable impact on logging. - If enabled, log messages will be stored in an intermediate data structure (e.g. a double ended queue). - A service thread will concurrently read and remove log messages from that data structure in a FIFO style and write them to the output stream - Storing log messages in the intermediate data structure should take constant time and not longer than logging a message takes in the traditional UL system (in general the time should be much shorter because the actual I/O is deferred). - Asynchronous logging trades memory overhead (i.e. the size of the intermediate data structure) for log accuracy. This means that in the unlikely case where the service thread which does the asynchronous logging falls behind the log producing threads, some logs might be lost. However, the probability for this to happen can be minimized by increasing the configurable size of the intermediate data structure. - The final output produced by asynchronous logging should be equal to the output from normal logging if no messages had to be dropped. Notice that in contrast to the traditional unified logging, asynchronous logging will give us the possibility to not only synchronize log events, but to optionally also serialize them based on their generation time if that's desired. This is because we are in full control of the synchronization primitives for the intermediate data structure which stores the logs. - If log messages had to be dropped, this should be logged in the log output (e.g. "[..] 42 messages dropped due to async logging") - Asynchronous logging should ideally be implemented in such a way that it can be easily adapted by alternative log targets like for example sockets in the future. Alternative solutions: - It has repeatedly been suggested to place the log files into a memory file system but we don't think this is an optimal solution. 
Main memory is often a constrained resource and we don't want log files to compete with the JVM for it in such cases. - It has also been argued to place the log files on a fast file system which is only used for logging but in containerized environments file systems are often virtualized and the properties of the underlying physical devices are not obvious. - The load on the file system might be unpredictable due to other applications on the same host. - All these workarounds won't work if we eventually implement direct logging to a network socket as suggested in [8]. Implementation details / POC: - A recent pull request [2] for JDK-8229517 [1] proposed to use a simple deque implementation derived from HotSpot's LinkedListImpl class for the intermediate data structure. It synchronizes access to the queue with a MutexLocker which is internally implemented with pthread_mutex_lock() and results in an atomic CAS on the fast path. So performance-wise the locking itself is not different from the flockfile()/funlockfile() functionality currently used by UL but adding a log message to the deque should take constant time as it basically only requires a strdup(). And we could even eliminate the strdup() if we'd pre-allocate a big enough array for holding the log messages as proposed in the pull request [2]. - The pull request [2] for JDK-8229517 [1] proposed to set the async flag as an attribute of the Xlog option which feels more natural because UL configuration is traditionally done within the Xlog option. But we could just as well use a global -XX flag to enable async logging. What are your preferences here? - The pull request [2] for JDK-8229517 [1] (mis)uses the WatcherThread as service thread to concurrently process the intermediate data structure and write the log messages out to the log stream. That should definitely be changed to an independent service thread. 
- The pull request [2] for JDK-8229517 [1] initially proposed that the "service thread" runs at a fixed interval to dump log messages to the log streams. But reviewers commented that this should better happen either continuously or based on the filling level of the intermediate data structure. What are your preferences here? - What are your preferences on the configuration of the intermediate data structure? Should it be configured based on the maximum number of log messages it can store or rather on the total size of the stored log messages? I think that for normal runs this distinction probably won't make a big difference because the size of log messages will probably be the same on average so "number of log messages" should always be proportional to "total size of log messages". 1. Before diving into more implementation details, I'd first like to reach a general consensus that asynchronous logging is a useful feature that's worthwhile adding to HotSpot. 2. Once we agree on that, we should agree on the desired properties of asynchronous logging. I've tried to collect a basic set in the "Proposed solution" section. 3. If that's done as well, we can discuss various implementation details and finally prepare new pull requests. 
Thank you and best regards, Volker [1] https://bugs.openjdk.java.net/browse/JDK-8229517 [2] https://github.com/openjdk/jdk/pull/3135 [3] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-November/043427.html [4] https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-August/039130.html [5] https://openjdk.java.net/jeps/158 [6] https://openjdk.java.net/jeps/271 [7] https://man7.org/linux/man-pages/man3/stdio.3.html [8] https://gist.github.com/YaSuenag/dacb6d94d8684915422232c7a08d5b5d [9] https://man7.org/linux/man-pages/man3/flockfile.3.html [10] https://man7.org/linux/man-pages/man2/futex.2.html From nradomski at openjdk.java.net Tue Mar 30 22:34:22 2021 From: nradomski at openjdk.java.net (Niklas Radomski) Date: Tue, 30 Mar 2021 22:34:22 GMT Subject: RFR: 8261957: [PPC64] Support for Concurrent Thread-Stack Processing [v2] In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 14:17:53 GMT, Martin Doerr wrote: >> I'd like to support Concurrent Thread-Stack Processing on PPC64. This will be needed by ShenandoahGC and zGC when implemented. Maybe for other purposes in the future, too. >> I'm using conditional trap instructions by default, so we don't need the extra stubs unless -XX:-UseSIGTRAP is used. >> >> Original change: https://github.com/openjdk/jdk/commit/b9873e18 > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > cleanup MacroAssembler::safepoint_poll Marked as reviewed by nradomski (Author). 
------------- PR: https://git.openjdk.java.net/jdk/pull/2841 From coleenp at openjdk.java.net Tue Mar 30 22:37:23 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 30 Mar 2021 22:37:23 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v3] In-Reply-To: References: Message-ID: <5RTXwHPyhhb0kjYv6nejzYMuVdLup3eV4ID3GqixwDs=.5779e800-41a1-4202-8590-92404b1e7e62@github.com> On Tue, 30 Mar 2021 03:53:54 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Make which version of MethodCounters::allocate() is called clearer. > > src/hotspot/share/oops/method.cpp line 570: > >> 568: if (current->is_Java_thread()) { >> 569: Thread* THREAD = current; >> 570: counters = MethodCounters::allocate(mh, THREAD); > > Can you add a comment before this line: > > // Use the TRAPS version for a JavaThread so it will adjust the GC threshold if needed. > > Thanks. ok, yes that says why. > src/hotspot/share/oops/method.cpp line 572: > >> 570: counters = MethodCounters::allocate(mh, THREAD); >> 571: if (HAS_PENDING_EXCEPTION) { >> 572: CLEAR_PENDING_EXCEPTION; // MethodData above doesn't clear exception > > I don't understand the comment. It's sort of a question, because the above code that does the same thing for MethodData doesn't clear the exception. In this case we should clear the exception I believe. I'll remove the comment. > src/hotspot/share/oops/method.cpp line 575: > >> 573: CompileBroker::log_metaspace_failure(); >> 574: ClassLoaderDataGraph::set_metaspace_oom(true); >> 575: return NULL; > > You could factor this out for both cases by testing "counters == NULL". Yes, this is better. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From coleenp at openjdk.java.net Tue Mar 30 22:49:15 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 30 Mar 2021 22:49:15 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v2] In-Reply-To: References: <8lcINJaJDOg62ESQch_n30qQTKXSauOH5qGJuD98T4I=.97409776-53ef-4308-bc1f-dffb6f2e907d@github.com> Message-ID: On Tue, 30 Mar 2021 04:01:43 GMT, Thomas Stuefe wrote: > I think that would be better. I am unclear on what happened in this case before; did we also miss out on allocating the Counters? Before this change, it was very unlikely that allocating metaspace counters in a breakpoint safepoint ran out of memory, so never threw the exception. Or else they did and returned NULL and all of the code around their allocation has handling for a null return. We're working on enforcing what should be a rule that only JavaThreads can throw exceptions and this was an exception to that. :) > Re: expand_and_allocate() I didn't want to expose internal metaspace functions or more handling for this special case, and that would prevent the nice sharing of most of the Metaspace::allocate code. It's allowed for method counters to return NULL here. If it's not long term, we should move the breakpoint counters to Method (but it would increase Method by a pointer size, which isn't good). I will rename the functions as requested. Have a nice vacation and thank you for your comments. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From coleenp at openjdk.java.net Tue Mar 30 23:04:41 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 30 Mar 2021 23:04:41 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v4] In-Reply-To: References: Message-ID: <3goJejfzIosKifkn8_QwTB3YaoB5na1nbHfDARhNwvA=.6e3b2bcc-2359-4393-a9b6-8649687ea92a@github.com> > This change creates a Metaspace::allocate function that doesn't pass TRAPS to be used by MethodCounters. TRAPS and exceptions shouldn't be thrown from non-JavaThreads. > > Tested with tier1-7. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Rename MethodCounter allocate functions, refactor null checking. ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3207/files - new: https://git.openjdk.java.net/jdk/pull/3207/files/459f63bf..807c84ba Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3207&range=03 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3207&range=02-03 Stats: 24 lines in 3 files changed: 9 ins; 8 del; 7 mod Patch: https://git.openjdk.java.net/jdk/pull/3207.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3207/head:pull/3207 PR: https://git.openjdk.java.net/jdk/pull/3207 From david.holmes at oracle.com Tue Mar 30 23:41:11 2021 From: david.holmes at oracle.com (David Holmes) Date: Wed, 31 Mar 2021 09:41:11 +1000 Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v3] In-Reply-To: <5RTXwHPyhhb0kjYv6nejzYMuVdLup3eV4ID3GqixwDs=.5779e800-41a1-4202-8590-92404b1e7e62@github.com> References: <5RTXwHPyhhb0kjYv6nejzYMuVdLup3eV4ID3GqixwDs=.5779e800-41a1-4202-8590-92404b1e7e62@github.com> Message-ID: On 31/03/2021 8:37 am, Coleen Phillimore wrote: > On Tue, 30 Mar 2021 03:53:54 GMT, David Holmes wrote: > >>> Coleen Phillimore has updated the pull request incrementally with one additional commit since 
the last revision: >>> >>> Make which version of MethodCounters::allocate() is called clearer. >> >> src/hotspot/share/oops/method.cpp line 570: >> >>> 568: if (current->is_Java_thread()) { >>> 569: Thread* THREAD = current; >>> 570: counters = MethodCounters::allocate(mh, THREAD); >> >> Can you add a comment before this line: >> >> // Use the TRAPS version for a JavaThread so it will adjust the GC threshold if needed. >> >> Thanks. > > ok, yes that says why. > >> src/hotspot/share/oops/method.cpp line 572: >> >>> 570: counters = MethodCounters::allocate(mh, THREAD); >>> 571: if (HAS_PENDING_EXCEPTION) { >>> 572: CLEAR_PENDING_EXCEPTION; // MethodData above doesn't clear exception >> >> I don't understand the comment. > > It's sort of a question, because the above code that does the same thing for MethodData doesn't clear the exception. In this case we should clear the exception I believe. I'll remove the comment. Ah I see. All the callers of build_interpreter_method_data clear the exception themselves, rather than it being cleared internally. That seems odd. >> src/hotspot/share/oops/method.cpp line 575: >> >>> 573: CompileBroker::log_metaspace_failure(); >>> 574: ClassLoaderDataGraph::set_metaspace_oom(true); >>> 575: return NULL; >> >> You could factor this out for both cases by testing "counters == NULL". > > Yes, this is better. Refactoring looks good! 
Thanks, David > ------------- > > PR: https://git.openjdk.java.net/jdk/pull/3207 > From dholmes at openjdk.java.net Tue Mar 30 23:44:18 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Tue, 30 Mar 2021 23:44:18 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v4] In-Reply-To: <3goJejfzIosKifkn8_QwTB3YaoB5na1nbHfDARhNwvA=.6e3b2bcc-2359-4393-a9b6-8649687ea92a@github.com> References: <3goJejfzIosKifkn8_QwTB3YaoB5na1nbHfDARhNwvA=.6e3b2bcc-2359-4393-a9b6-8649687ea92a@github.com> Message-ID: On Tue, 30 Mar 2021 23:04:41 GMT, Coleen Phillimore wrote: >> This change creates a Metaspace::allocate function that doesn't pass TRAPS to be used by MethodCounters. TRAPS and exceptions shouldn't be thrown from non-JavaThreads. >> >> Tested with tier1-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Rename MethodCounter allocate functions, refactor null checking. Looks good. ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3207 From coleenp at openjdk.java.net Tue Mar 30 23:57:16 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Tue, 30 Mar 2021 23:57:16 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v4] In-Reply-To: References: <3goJejfzIosKifkn8_QwTB3YaoB5na1nbHfDARhNwvA=.6e3b2bcc-2359-4393-a9b6-8649687ea92a@github.com> Message-ID: On Tue, 30 Mar 2021 23:41:22 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename MethodCounter allocate functions, refactor null checking. > > Looks good. Thanks, David! 
------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From dholmes at openjdk.java.net Wed Mar 31 00:08:23 2021 From: dholmes at openjdk.java.net (David Holmes) Date: Wed, 31 Mar 2021 00:08:23 GMT Subject: RFR: 8264346: nullptr_t undefined in global namespace for clang+libstdc++ In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 14:46:43 GMT, Stefan Karlsson wrote: > There's a mismatch in some toolchains about what part should provide the nullptr_t definition. This patch takes the easy way out and includes cstddef and changes the two usages of nullptr_t to std::nullptr_t. > > See the bug report for more details. > > We could have redefined nullptr_t to resolve this, but that would have required more extensive testing, so I left that as a potential future cleanup. > > I've tested this by compiling with clang on linux. I'm going to let GHA testing run, and will also run our tier1 testing. LGTM! Thanks, David ------------- Marked as reviewed by dholmes (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3269 From kbarrett at openjdk.java.net Wed Mar 31 01:08:29 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 31 Mar 2021 01:08:29 GMT Subject: RFR: 8264346: nullptr_t undefined in global namespace for clang+libstdc++ In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 14:46:43 GMT, Stefan Karlsson wrote: > There's a mismatch in some toolchains about what part should provide the nullptr_t definition. This patch takes the easy way out and includes cstddef and changes the two usages of nullptr_t to std::nullptr_t. > > See the bug report for more details. > > We could have redefined nullptr_t to resolve this, but that would have required more extensive testing, so I left that as a potential future cleanup. > > I've tested this by compiling with clang on linux. I'm going to let GHA testing run, and will also run our tier1 testing. Changes requested by kbarrett (Reviewer). 
src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 34: > 32: // declarations and a few frequently used utility functions. > 33: > 34: #include This should be in globalDefinitions.hpp, not globalDefinitions_gcc.hpp. There's no guarantee that the inclusion of on other platforms is sufficient to provide the std-qualified name. In fact, that's explicitly unspecified and deprecated behavior. (Unqualified nullptr_t being provided by is deprecated, BTW.) ------------- PR: https://git.openjdk.java.net/jdk/pull/3269 From iklam at openjdk.java.net Wed Mar 31 03:44:22 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 31 Mar 2021 03:44:22 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v4] In-Reply-To: <3goJejfzIosKifkn8_QwTB3YaoB5na1nbHfDARhNwvA=.6e3b2bcc-2359-4393-a9b6-8649687ea92a@github.com> References: <3goJejfzIosKifkn8_QwTB3YaoB5na1nbHfDARhNwvA=.6e3b2bcc-2359-4393-a9b6-8649687ea92a@github.com> Message-ID: On Tue, 30 Mar 2021 23:04:41 GMT, Coleen Phillimore wrote: >> This change creates a Metaspace::allocate function that doesn't pass TRAPS to be used by MethodCounters. TRAPS and exceptions shouldn't be thrown from non-JavaThreads. >> >> Tested with tier1-7. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Rename MethodCounter allocate functions, refactor null checking. LGTM. ------------- Marked as reviewed by iklam (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3207 From iklam at openjdk.java.net Wed Mar 31 04:08:13 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 31 Mar 2021 04:08:13 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v7] In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 03:48:08 GMT, Yumin Qi wrote: >> Hi, Please review >> >> Added jcmd option for dumping CDS archive during application runtime. 
Before this change, a user has to dump the shared archive in two steps: first run the application with >> `java -XX:DumpLoadedClassList= .... ` >> to collect shareable class names and save them in file `` , then >> `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` >> With this change, a user can use jcmd to dump CDS without going through the above steps. The user can also choose a moment during the app runtime to dump an archive. >> The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. >> Newly added jcmd option: >> `jcmd VM.cds static_dump ` >> or >> `jcmd VM.cds dynamic_dump ` >> Dumping a dynamic archive requires starting the app with the newly added flag `-XX:+RecordDynamicDumpInfo`; with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changes some object memory locations, so a dynamic archive can only be dumped once for a running app. For static dump, the user can dump multiple times against the same process. >> The file name is optional; if the file name is not supplied, the file name will take the format of `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. The `` is the application process ID. >> >> Tests: tier1,tier2,tier3,tier4 >> >> Thanks >> Yumin > > Yumin Qi has updated the pull request incrementally with one additional commit since the last revision: > > Remove CDS.getVMArguments, changed to use VM.getRuntimeVMArguments. Removed unused function from ClassLoader. Improved InstanceKlass::is_shareable() and related test. Added more test scenarios. src/java.base/share/classes/jdk/internal/misc/CDS.java line 311: > 309: // done, delete classlist file. > 310: if (fileList.exists()) { > 311: // fileList.delete(); Is the comment unintentional? 
------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From minqi at openjdk.java.net Wed Mar 31 04:39:10 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Wed, 31 Mar 2021 04:39:10 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v4] In-Reply-To: References: <9EU_DwWh3XcyBxJkxgPH1qzvbaa2hvWQYuccdRXWKj0=.c6816df0-6e73-45bc-9e52-caa70b0611fd@github.com> Message-ID: <_E_YlWz1Bg93lHfN9I50a1p7lm6Wtf9NjqSHSzqWdgM=.ddc13b21-69a7-40d3-9c35-918792584a29@github.com> On Fri, 19 Mar 2021 05:39:25 GMT, Ioi Lam wrote: >> Yumin Qi has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Fix filter more flags to exclude in static dump, add more test cases >> - Merge branch 'master' into jdk-8259070 >> - Fix white space in CDS.java >> - Add function CDS.dumpSharedArchive in java to dump shared archive >> - 8259070: Add jcmd option to dump CDS > > Changes requested by iklam (Reviewer). @iklam Thanks. Yes, the comment out for the deletion is unintentional for test only and forgot to revert. I will revert it. Also I will merge with upstream. Since this repo has no merge from upstream for long time, it may cause conflicts. If too many conflicts to resolve I would like to withdraw this one and resubmit as a new PR. Thanks. ------------- PR: https://git.openjdk.java.net/jdk/pull/2737 From stuefe at openjdk.java.net Wed Mar 31 04:54:26 2021 From: stuefe at openjdk.java.net (Thomas Stuefe) Date: Wed, 31 Mar 2021 04:54:26 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v2] In-Reply-To: References: <8lcINJaJDOg62ESQch_n30qQTKXSauOH5qGJuD98T4I=.97409776-53ef-4308-bc1f-dffb6f2e907d@github.com> Message-ID: On Tue, 30 Mar 2021 22:46:30 GMT, Coleen Phillimore wrote: >>> I deleted this branch by mistake, now restored. 
>>> >>> > I'm not sure this is correct. Your new non-TRAPS Metaspace::allocate() would fail every time the GC threshold is touched. Where the old TRAPS version would break through the threshold and allocate successfully. >>> >>> I realize this. It's just an attempt to allocate and it's designed to be used during a safepoint for only this allocation. I could change this to only call the non-TRAPS version of MethodCounters if we're at a safepoint? Would that help? Then the only time we'll miss out on metaspace counters periodically is if they were created to set breakpoints in a safepoint. >> >> I think that would be better. I am unclear on what happened in this case before; did we also miss out on allocating the Counters? >> >>> >>> I'd hate for this special case to know more about metaspace, ala calling ClassLoaderMetaspace::expand_and_allocate. >> >> Even within Metaspace::allocate(no TRAPS)? Its in metaspace land, surely it would be fine to call expand there like this: >> >> MetaWord* Metaspace::allocate(ClassLoaderData* loader_data, size_t word_size, MetaspaceObj::Type type) { >> MetaWord* result = loader_data->metaspace_non_null()->allocate(...); >> if (!result) { >> MetaWord* result = loader_data->metaspace_non_null()->expand_and_allocate(...); >> } >> >> (Note that I will be gone into vacation shortly and I'm a bit short on time; I'm not sure I can finish this review. If you go with your approach, my only request would be to comment the prototypes for the two allocate functions a bit clearer and/or maybe rename one as allocate_no_exception or the other as allocate_with_exception) > >> I think that would be better. I am unclear on what happened in this case before; did we also miss out on allocating the Counters? > > Before this change, it was very unlikely that allocating metaspace counters in a breakpoint safepoint ran out of memory, so never threw the exception. 
Or else they did and returned NULL and all of the code around their allocation has handling for a null return. We're working on enforcing what should be a rule that only JavaThreads can throw exceptions and this was an exception to that. :) > >> Re: expand_and_allocate() > > I didn't want to expose internal metaspace functions or more handling for this special case, and that would prevent the nice sharing of most of the Metaspace::allocate code. It's allowed for method counters to return NULL here. If it's not long term, we should move the breakpoint counters to Method (but it would increase Method by a pointer size, which isn't good). > > I will rename the functions as requested. Have a nice vacation and thank you for your comments. Thanks for explaining! I am fine with the change in this case. ------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From chagedorn at openjdk.java.net Wed Mar 31 06:37:18 2021 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Wed, 31 Mar 2021 06:37:18 GMT Subject: RFR: 8263582: WB_IsMethodCompilable ignores compiler directives [v2] In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 16:14:05 GMT, Vladimir Kozlov wrote: >> Christian Hagedorn has updated the pull request incrementally with one additional commit since the last revision: >> >> fix typo > > Marked as reviewed by kvn (Reviewer). Thanks @vnkozlov and @neliasso for your reviews! 
------------- PR: https://git.openjdk.java.net/jdk/pull/3195 From chagedorn at openjdk.java.net Wed Mar 31 06:37:20 2021 From: chagedorn at openjdk.java.net (Christian Hagedorn) Date: Wed, 31 Mar 2021 06:37:20 GMT Subject: Integrated: 8263582: WB_IsMethodCompilable ignores compiler directives In-Reply-To: References: Message-ID: <9i4F0Rh6cNhPdKc1ogTnAnensKAv0gGGJKzgysDanuk=.c6bbd27e-0ee7-4ab9-b07e-0bc877267d0c@github.com> On Thu, 25 Mar 2021 15:00:39 GMT, Christian Hagedorn wrote: > While playing around with `WB_IsMethodCompilable` together with `compileonly` I ran into some surprising results for methods that should never be compiled (not part of `compileonly`): `isMethodCompilable` returns true instead of false when such an excluded method was not yet tried to be compiled. > > The reason for it is that `WB_IsMethodCompilable` directly checks `CompilationPolicy::can_be_compiled()` which calls `Method::is_not_compilable()`. However, the `ExcludeOption` compiler directive is only evaluated lazily upon a compilation attempt. Therefore, if a method was not tried to be compiled, yet, `Method::is_not_compilable()` always returns false, regardless of any set compiler directive. > > I therefore suggest to additionally check the `ExcludeOption` in `WB_IsMethodCompilable`. I also cleaned up some wrong use of `CompLevel_any` and `CompLevel_all` as suggested by @veresov: `CompLevel_any` should only be used to query the state as in `is_*()` methods and `CompLevel_all` when changing the state is in `set_*()` methods. > > Thanks, > Christian This pull request has now been integrated. 
Changeset: ab6faa60 Author: Christian Hagedorn URL: https://git.openjdk.java.net/jdk/commit/ab6faa60 Stats: 141 lines in 5 files changed: 128 ins; 0 del; 13 mod 8263582: WB_IsMethodCompilable ignores compiler directives Reviewed-by: iveresov, kvn, neliasso ------------- PR: https://git.openjdk.java.net/jdk/pull/3195 From stefank at openjdk.java.net Wed Mar 31 07:11:52 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 31 Mar 2021 07:11:52 GMT Subject: RFR: 8264346: nullptr_t undefined in global namespace for clang+libstdc++ [v2] In-Reply-To: References: Message-ID: > There's a mismatch in some toolchains about what part should provide the nullptr_t definition. This patch takes the easy way out and includes cstddef and changes the two usages of nullptr_t to std::nullptr_t. > > See the bug report for more details. > > We could have redefined nullptr_t to resolve this, but that would have required more extensive testing, so I left that as a potential future cleanup. > > I've tested this by compiling with clang on linux. I'm going to let GHA testing run, and will also run our tier1 testing. 
Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Review Kim ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3269/files - new: https://git.openjdk.java.net/jdk/pull/3269/files/5a562c14..600c8b71 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3269&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3269&range=00-01 Stats: 4 lines in 2 files changed: 2 ins; 2 del; 0 mod Patch: https://git.openjdk.java.net/jdk/pull/3269.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3269/head:pull/3269 PR: https://git.openjdk.java.net/jdk/pull/3269 From stefank at openjdk.java.net Wed Mar 31 07:11:55 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 31 Mar 2021 07:11:55 GMT Subject: RFR: 8264346: nullptr_t undefined in global namespace for clang+libstdc++ [v2] In-Reply-To: References: Message-ID: On Wed, 31 Mar 2021 01:05:21 GMT, Kim Barrett wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> Review Kim > > src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 34: > >> 32: // declarations and a few frequently used utility functions. >> 33: >> 34: #include > > This should be in globalDefinitions.hpp, not globalDefinitions_gcc.hpp. There's no guarantee that the inclusion of on other platforms is sufficient to provide the std-qualified name. In fact, that's explicitly unspecified and deprecated behavior. (Unqualified nullptr_t being provided by is deprecated, BTW.) You're right. I was blindsided by the fact that this was supposed to be a problem only with clang, and started out by trying to keep using nullptr_t but fix the includes. I'm updating the patch. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3269 From akozlov at openjdk.java.net Wed Mar 31 09:37:20 2021 From: akozlov at openjdk.java.net (Anton Kozlov) Date: Wed, 31 Mar 2021 09:37:20 GMT Subject: Integrated: 8262894: [macos_aarch64] SIGBUS in Assembler::ld_st2 In-Reply-To: <8z-yqACOKf8qU8N_NQbctwtwxojByYis8FJRfdBXxWE=.f9cb3eec-8219-4a8b-9791-ba6596667ca7@github.com> References: <8z-yqACOKf8qU8N_NQbctwtwxojByYis8FJRfdBXxWE=.f9cb3eec-8219-4a8b-9791-ba6596667ca7@github.com> Message-ID: On Mon, 29 Mar 2021 11:39:31 GMT, Anton Kozlov wrote: > Please review a fix for compiler/debug/VerifyAdapterSharing.java failure on macos/aarch64 platform. The root cause is in missing W^X switch in JNI DestroyJavaVM. > > I reviewed the rest of the JNI Invoke Interface functions. DetachCurrentThread needs a similar change, although nothing fails immediately. So DetachCurrentThread is changed as a precaution. This pull request has now been integrated. Changeset: 8a4a9117 Author: Anton Kozlov Committer: Vladimir Kempik URL: https://git.openjdk.java.net/jdk/commit/8a4a9117 Stats: 8 lines in 1 file changed: 8 ins; 0 del; 0 mod 8262894: [macos_aarch64] SIGBUS in Assembler::ld_st2 Co-authored-by: Mikael Vidstedt Reviewed-by: dholmes, gziemski ------------- PR: https://git.openjdk.java.net/jdk/pull/3241 From mdoerr at openjdk.java.net Wed Mar 31 09:40:22 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 31 Mar 2021 09:40:22 GMT Subject: RFR: 8261957: [PPC64] Support for Concurrent Thread-Stack Processing [v2] In-Reply-To: References: Message-ID: <5uYD6U2K8k9IsnnOG20SlnsmYIcS6LKaZb9AWG4ws2c=.79da41ea-afa6-4b4e-aff8-caac2ccb845f@github.com> On Tue, 30 Mar 2021 22:30:53 GMT, Niklas Radomski wrote: >> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: >> >> cleanup MacroAssembler::safepoint_poll > > Marked as reviewed by nradomski (Author). Thanks for the reviews! 
------------- PR: https://git.openjdk.java.net/jdk/pull/2841 From mdoerr at openjdk.java.net Wed Mar 31 09:40:24 2021 From: mdoerr at openjdk.java.net (Martin Doerr) Date: Wed, 31 Mar 2021 09:40:24 GMT Subject: Integrated: 8261957: [PPC64] Support for Concurrent Thread-Stack Processing In-Reply-To: References: Message-ID: On Fri, 5 Mar 2021 09:58:35 GMT, Martin Doerr wrote: > I'd like to support Concurrent Thread-Stack Processing on PPC64. This will be needed by ShenandoahGC and zGC when implemented. Maybe for other purposes in the future, too. > I'm using conditional trap instructions by default, so we don't need the extra stubs unless -XX:-UseSIGTRAP is used. > > Original change: https://github.com/openjdk/jdk/commit/b9873e18 This pull request has now been integrated. Changeset: 9061271b Author: Martin Doerr URL: https://git.openjdk.java.net/jdk/commit/9061271b Stats: 225 lines in 16 files changed: 174 ins; 10 del; 41 mod 8261957: [PPC64] Support for Concurrent Thread-Stack Processing Reviewed-by: lucy, nradomski ------------- PR: https://git.openjdk.java.net/jdk/pull/2841 From kbarrett at openjdk.java.net Wed Mar 31 10:05:09 2021 From: kbarrett at openjdk.java.net (Kim Barrett) Date: Wed, 31 Mar 2021 10:05:09 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v4] In-Reply-To: <34dwf1No_CG6jWti-W-fYMp63qgBVrYPgZrNKNLp-vs=.b5655d16-d4cf-4d71-b8d5-1175f7e58953@github.com> References: <34dwf1No_CG6jWti-W-fYMp63qgBVrYPgZrNKNLp-vs=.b5655d16-d4cf-4d71-b8d5-1175f7e58953@github.com> Message-ID: On Tue, 30 Mar 2021 19:11:39 GMT, Man Cao wrote: >> Hi all, >> >> Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. >> >> The shared LockFreeQueue is similar to the existing LockFreeStack. 
The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. >> >> -Man > > Man Cao has updated the pull request incrementally with one additional commit since the last revision: > > Revise comments and move trailing padding. Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/2986 From iwalulya at openjdk.java.net Wed Mar 31 10:39:41 2021 From: iwalulya at openjdk.java.net (Ivan Walulya) Date: Wed, 31 Mar 2021 10:39:41 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v4] In-Reply-To: <34dwf1No_CG6jWti-W-fYMp63qgBVrYPgZrNKNLp-vs=.b5655d16-d4cf-4d71-b8d5-1175f7e58953@github.com> References: <34dwf1No_CG6jWti-W-fYMp63qgBVrYPgZrNKNLp-vs=.b5655d16-d4cf-4d71-b8d5-1175f7e58953@github.com> Message-ID: On Tue, 30 Mar 2021 19:11:39 GMT, Man Cao wrote: >> Hi all, >> >> Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future. >> >> The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid ABA problem. >> >> -Man > > Man Cao has updated the pull request incrementally with one additional commit since the last revision: > > Revise comments and move trailing padding. Marked as reviewed by iwalulya (Committer). 
------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From coleenp at openjdk.java.net Wed Mar 31 12:46:22 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 31 Mar 2021 12:46:22 GMT Subject: RFR: 8264149 BreakpointInfo::set allocates metaspace object in VM thread [v4] In-Reply-To: References: <3goJejfzIosKifkn8_QwTB3YaoB5na1nbHfDARhNwvA=.6e3b2bcc-2359-4393-a9b6-8649687ea92a@github.com> Message-ID: On Wed, 31 Mar 2021 03:41:01 GMT, Ioi Lam wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename MethodCounter allocate functions, refactor null checking. > > LGTM. Thanks Ioi! ------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From coleenp at openjdk.java.net Wed Mar 31 12:46:23 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 31 Mar 2021 12:46:23 GMT Subject: Integrated: 8264149 BreakpointInfo::set allocates metaspace object in VM thread In-Reply-To: References: Message-ID: On Thu, 25 Mar 2021 21:47:46 GMT, Coleen Phillimore wrote: > This change creates a Metaspace::allocate function that doesn't pass TRAPS to be used by MethodCounters. TRAPS and exceptions shouldn't be thrown from non-JavaThreads. > > Tested with tier1-7. This pull request has now been integrated. 
Changeset: 40c32491 Author: Coleen Phillimore URL: https://git.openjdk.java.net/jdk/commit/40c32491 Stats: 110 lines in 10 files changed: 63 ins; 13 del; 34 mod 8264149: BreakpointInfo::set allocates metaspace object in VM thread Reviewed-by: dholmes, iklam ------------- PR: https://git.openjdk.java.net/jdk/pull/3207 From coleenp at openjdk.java.net Wed Mar 31 13:02:23 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 31 Mar 2021 13:02:23 GMT Subject: RFR: 8264285: Clean the modification of ccstr JVM flags In-Reply-To: References: Message-ID: On Mon, 29 Mar 2021 21:35:52 GMT, Ioi Lam wrote: > There are two versions of JVMFlagAccess::ccstrAtPut() for modifying JVM flags of the ccstr type (i.e., strings). > > - One version requires the caller to free the old value, but some callers don't do that (writeableFlags.cpp). > - The other version frees the old value on behalf of the caller. However, this version is accessible only via FLAG_SET_XXX macros and is currently unused. So it's unclear whether it actually works. > > We should combine these two versions into a single function, fix problems in the callers, and add test cases. The old value should be freed automatically, because typically the caller isn't interested in the old value. > > Note that the FLAG_SET_XXX macros do not return the old value. Requiring the caller of FLAG_SET_XXX to free the old value would be tedious and error prone. I had a question but overall nice cleanup! src/hotspot/share/runtime/flags/debug_globals.hpp line 38: > 36: // have any MANAGEABLE flags of the ccstr type, but we really need to > 37: // make sure the implementation is correct (in terms of memory allocation) > 38: // just in case someone may add such a flag in the future. Could you have just added a develop flag to the manageable flags instead? src/hotspot/share/runtime/flags/jvmFlagAccess.cpp line 327: > 325: // The callers typically don't care what the old value is. 
> 326: // If the caller really wants to know the old value, read it (and make a copy if necessary)
> 327: // before calling this API.

good comment!

-------------

Marked as reviewed by coleenp (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3254

From shade at openjdk.java.net  Wed Mar 31 13:54:35 2021
From: shade at openjdk.java.net (Aleksey Shipilev)
Date: Wed, 31 Mar 2021 13:54:35 GMT
Subject: RFR: 8264513: Cleanup CardTableBarrierSetC2::post_barrier
Message-ID:

There are a few stale comments after CMS removal. We can also clean up some coding, while we are at it.

-------------

Commit messages:
 - 8264513: Cleanup CardTableBarrierSetC2::post_barrier

Changes: https://git.openjdk.java.net/jdk/pull/3285/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3285&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8264513
  Stats: 20 lines in 1 file changed: 5 ins; 5 del; 10 mod
  Patch: https://git.openjdk.java.net/jdk/pull/3285.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/3285/head:pull/3285

PR: https://git.openjdk.java.net/jdk/pull/3285

From kbarrett at openjdk.java.net  Wed Mar 31 14:23:27 2021
From: kbarrett at openjdk.java.net (Kim Barrett)
Date: Wed, 31 Mar 2021 14:23:27 GMT
Subject: RFR: 8264346: nullptr_t undefined in global namespace for clang+libstdc++ [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 31 Mar 2021 07:11:52 GMT, Stefan Karlsson wrote:

>> There's a mismatch in some toolchains about what part should provide the nullptr_t definition. This patch takes the easy way out and includes cstddef and changes the two usages of nullptr_t to std::nullptr_t.
>>
>> See the bug report for more details.
>>
>> We could have redefined nullptr_t to resolve this, but that would have required more extensive testing, so I left that as a potential future cleanup.
>>
>> I've tested this by compiling with clang on linux. I'm going to let GHA testing run, and will also run our tier1 testing.
> > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Review Kim Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR: https://git.openjdk.java.net/jdk/pull/3269 From shade at openjdk.java.net Wed Mar 31 15:47:24 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 31 Mar 2021 15:47:24 GMT Subject: RFR: 8259316: [REDO] C1/C2 compiler support for blackholes Message-ID: This reworks the compiler support for blackholes. The key difference against the last version (#1203) is that blackholes are only acceptable as empty static methods, which both simplifies the implementation and eliminates a few compatibility questions. ------------- Commit messages: - Redo BlackholeIntrinsicTest to see if target blackhole methods were indeed intrinsified - Rename BlackholeStaticTest to BlackholeIntrinsicTest - BlackholeStaticTest should unlock blackholes - Do not print double-warning on blackhole already set - Add more checks for C2 intrinsic - Simplify intrinsic test and add warning test - Common the blackhole checks - Binding JVMCI through get_jvmci_method - Merge branch 'master' into JDK-8259316-blackholes-redo - Extend JVMCI with shouldBlackholeMethod - ... 
and 10 more: https://git.openjdk.java.net/jdk/compare/aefc1560...2fa49418 Changes: https://git.openjdk.java.net/jdk/pull/2024/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2024&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8259316 Stats: 1164 lines in 32 files changed: 1161 ins; 0 del; 3 mod Patch: https://git.openjdk.java.net/jdk/pull/2024.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2024/head:pull/2024 PR: https://git.openjdk.java.net/jdk/pull/2024 From shade at openjdk.java.net Wed Mar 31 15:47:25 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 31 Mar 2021 15:47:25 GMT Subject: RFR: 8259316: [REDO] C1/C2 compiler support for blackholes In-Reply-To: References: Message-ID: On Mon, 11 Jan 2021 10:18:15 GMT, Aleksey Shipilev wrote: > This reworks the compiler support for blackholes. The key difference against the last version (#1203) is that blackholes are only acceptable as empty static methods, which both simplifies the implementation and eliminates a few compatibility questions. Not yet, bot. ------------- PR: https://git.openjdk.java.net/jdk/pull/2024 From dnsimon at openjdk.java.net Wed Mar 31 15:47:26 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Wed, 31 Mar 2021 15:47:26 GMT Subject: RFR: 8259316: [REDO] C1/C2 compiler support for blackholes In-Reply-To: References: Message-ID: On Mon, 11 Jan 2021 10:18:15 GMT, Aleksey Shipilev wrote: > This reworks the compiler support for blackholes. The key difference against the last version (#1203) is that blackholes are only acceptable as empty static methods, which both simplifies the implementation and eliminates a few compatibility questions. src/hotspot/share/ci/ciMethod.cpp line 160: > 158: > 159: if (CompilerOracle::should_blackhole(h_m)) { > 160: h_m->set_intrinsic_id(vmIntrinsics::_blackhole); Wouldn't it be better to do this in `Method::init_intrinsic_id` so that JVMCI will also see the method as an intrinsic? 
Either that or a similar bit of logic should be added to `JVMCIRuntime::compile_method`. ------------- PR: https://git.openjdk.java.net/jdk/pull/2024 From shade at openjdk.java.net Wed Mar 31 15:47:27 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 31 Mar 2021 15:47:27 GMT Subject: RFR: 8259316: [REDO] C1/C2 compiler support for blackholes In-Reply-To: References: Message-ID: <484lNLQcHF4940FK-UQUMUjdBHposqrNJ1wPa4Jya3c=.8a6d86ba-a768-408b-b814-756fb8940a2e@github.com> On Fri, 19 Mar 2021 09:47:15 GMT, Doug Simon wrote: >> This reworks the compiler support for blackholes. The key difference against the last version (#1203) is that blackholes are only acceptable as empty static methods, which both simplifies the implementation and eliminates a few compatibility questions. > > src/hotspot/share/ci/ciMethod.cpp line 160: > >> 158: >> 159: if (CompilerOracle::should_blackhole(h_m)) { >> 160: h_m->set_intrinsic_id(vmIntrinsics::_blackhole); > > Wouldn't it be better to do this in `Method::init_intrinsic_id` so that JVMCI will also see the method as an intrinsic? Either that or a similar bit of logic should be added to `JVMCIRuntime::compile_method`. The way C1 and C2 see this as synthetic intrinsic is modeled after `vmIntrinsics::_compiledLambdaForm`. Unfortunately, `Method::init_intrinsic_id` does not have access to `CompilerOracle` (which is solvable), and current placement in `ciMethod` constructor guarantees the method is resolved. I wonder if JVMCI would benefit from the direct `shouldBlackholeMethod`, like it already has `shouldInlineMethod`? 
------------- PR: https://git.openjdk.java.net/jdk/pull/2024 From dnsimon at openjdk.java.net Wed Mar 31 15:47:27 2021 From: dnsimon at openjdk.java.net (Doug Simon) Date: Wed, 31 Mar 2021 15:47:27 GMT Subject: RFR: 8259316: [REDO] C1/C2 compiler support for blackholes In-Reply-To: <484lNLQcHF4940FK-UQUMUjdBHposqrNJ1wPa4Jya3c=.8a6d86ba-a768-408b-b814-756fb8940a2e@github.com> References: <484lNLQcHF4940FK-UQUMUjdBHposqrNJ1wPa4Jya3c=.8a6d86ba-a768-408b-b814-756fb8940a2e@github.com> Message-ID: On Thu, 25 Mar 2021 15:06:40 GMT, Aleksey Shipilev wrote: >> src/hotspot/share/ci/ciMethod.cpp line 160: >> >>> 158: >>> 159: if (CompilerOracle::should_blackhole(h_m)) { >>> 160: h_m->set_intrinsic_id(vmIntrinsics::_blackhole); >> >> Wouldn't it be better to do this in `Method::init_intrinsic_id` so that JVMCI will also see the method as an intrinsic? Either that or a similar bit of logic should be added to `JVMCIRuntime::compile_method`. > > The way C1 and C2 see this as synthetic intrinsic is modeled after `vmIntrinsics::_compiledLambdaForm`. Unfortunately, `Method::init_intrinsic_id` does not have access to `CompilerOracle` (which is solvable), and current placement in `ciMethod` constructor guarantees the method is resolved. I wonder if JVMCI would benefit from the direct `shouldBlackholeMethod`, like it already has `shouldInlineMethod`? I think it would be better in `JVMCIEnv::get_jvmci_method` which is the JVMCI equivalent of `ciMethod::ciMethod`. 
------------- PR: https://git.openjdk.java.net/jdk/pull/2024 From never at openjdk.java.net Wed Mar 31 15:47:27 2021 From: never at openjdk.java.net (Tom Rodriguez) Date: Wed, 31 Mar 2021 15:47:27 GMT Subject: RFR: 8259316: [REDO] C1/C2 compiler support for blackholes In-Reply-To: References: <484lNLQcHF4940FK-UQUMUjdBHposqrNJ1wPa4Jya3c=.8a6d86ba-a768-408b-b814-756fb8940a2e@github.com> Message-ID: On Thu, 25 Mar 2021 15:40:59 GMT, Doug Simon wrote: >> The way C1 and C2 see this as synthetic intrinsic is modeled after `vmIntrinsics::_compiledLambdaForm`. Unfortunately, `Method::init_intrinsic_id` does not have access to `CompilerOracle` (which is solvable), and current placement in `ciMethod` constructor guarantees the method is resolved. I wonder if JVMCI would benefit from the direct `shouldBlackholeMethod`, like it already has `shouldInlineMethod`? > > I think it would be better in `JVMCIEnv::get_jvmci_method` which is the JVMCI equivalent of `ciMethod::ciMethod`. It would be best if it was done in ``Method::init_intrinsic_id`` since that's where that field is supposed to be initialized. ------------- PR: https://git.openjdk.java.net/jdk/pull/2024 From shade at openjdk.java.net Wed Mar 31 15:47:27 2021 From: shade at openjdk.java.net (Aleksey Shipilev) Date: Wed, 31 Mar 2021 15:47:27 GMT Subject: RFR: 8259316: [REDO] C1/C2 compiler support for blackholes In-Reply-To: References: <484lNLQcHF4940FK-UQUMUjdBHposqrNJ1wPa4Jya3c=.8a6d86ba-a768-408b-b814-756fb8940a2e@github.com> Message-ID: <_zMC-g7IVIpNV2nv5BSOMaFes-3GTrxCSiYdG4fh7Is=.0bb11638-a40d-41a7-94b9-3489e00701ba@github.com> On Thu, 25 Mar 2021 15:40:59 GMT, Doug Simon wrote: >> The way C1 and C2 see this as synthetic intrinsic is modeled after `vmIntrinsics::_compiledLambdaForm`. Unfortunately, `Method::init_intrinsic_id` does not have access to `CompilerOracle` (which is solvable), and current placement in `ciMethod` constructor guarantees the method is resolved. 
I wonder if JVMCI would benefit from the direct `shouldBlackholeMethod`, like it already has `shouldInlineMethod`? > > I think it would be better in `JVMCIEnv::get_jvmci_method` which is the JVMCI equivalent of `ciMethod::ciMethod`. `Method::init_intrinsic_id` does not have access to `CompilerOracle` and it also does not check for method resolution (I think). I added the similar block to `JVMCIEnv::get_jvmci_method` as @dougxc suggested. ------------- PR: https://git.openjdk.java.net/jdk/pull/2024 From stefank at openjdk.java.net Wed Mar 31 16:47:17 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 31 Mar 2021 16:47:17 GMT Subject: RFR: 8264346: nullptr_t undefined in global namespace for clang+libstdc++ [v2] In-Reply-To: References: Message-ID: <9Dw-bOub5HZu2bTCmUHwvXwkWXwi9oG_iSoiCNtbgGE=.1d0274c8-31e8-4918-bf86-d5e958b29428@github.com> On Wed, 31 Mar 2021 14:20:33 GMT, Kim Barrett wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> Review Kim > > Looks good. Thanks for reviewing! ------------- PR: https://git.openjdk.java.net/jdk/pull/3269 From stefank at openjdk.java.net Wed Mar 31 16:47:18 2021 From: stefank at openjdk.java.net (Stefan Karlsson) Date: Wed, 31 Mar 2021 16:47:18 GMT Subject: Integrated: 8264346: nullptr_t undefined in global namespace for clang+libstdc++ In-Reply-To: References: Message-ID: On Tue, 30 Mar 2021 14:46:43 GMT, Stefan Karlsson wrote: > There's a mismatch in some toolchains about what part should provide the nullptr_t definition. This patch takes the easy way out and include cstddef and changes the two usages nullptr_t to std::nullptr_t. > > See the bug report for more details. > > We could have redefined nullptr_t to resolve this, but that would have required more extensive testing, so I left that as a potential future cleanup. > > I've tested this by compiling with clang on linux. 
> I'm going to let GHA testing run, and will also run our tier1 testing.

This pull request has now been integrated.

Changeset: dec34470
Author:    Stefan Karlsson
URL:       https://git.openjdk.java.net/jdk/commit/dec34470
Stats:     4 lines in 2 files changed: 2 ins; 0 del; 2 mod

8264346: nullptr_t undefined in global namespace for clang+libstdc++

Reviewed-by: dholmes, kbarrett

-------------

PR: https://git.openjdk.java.net/jdk/pull/3269

From manc at openjdk.java.net  Wed Mar 31 18:32:28 2021
From: manc at openjdk.java.net (Man Cao)
Date: Wed, 31 Mar 2021 18:32:28 GMT
Subject: Integrated: 8263551: Provide shared lock-free FIFO queue implementation
In-Reply-To:
References:
Message-ID:

On Sat, 13 Mar 2021 10:41:44 GMT, Man Cao wrote:

> Hi all,
>
> Could anyone review this change that is mainly code motion? It creates a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue, which will be used by JDK-8236485 in the future.
>
> The shared LockFreeQueue is similar to the existing LockFreeStack. The notable difference is that the LockFreeQueue has an additional template parameter for whether to use GlobalCounter::CriticalSection to avoid the ABA problem.
>
> -Man

This pull request has now been integrated.

Changeset: e2ec997b
Author:    Man Cao
URL:       https://git.openjdk.java.net/jdk/commit/e2ec997b
Stats:     783 lines in 5 files changed: 631 ins; 146 del; 6 mod

8263551: Provide shared lock-free FIFO queue implementation

Create a generalized lock-free queue implementation based on G1DirtyCardQueueSet::Queue.
Reviewed-by: kbarrett, iwalulya ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From manc at openjdk.java.net Wed Mar 31 18:32:27 2021 From: manc at openjdk.java.net (Man Cao) Date: Wed, 31 Mar 2021 18:32:27 GMT Subject: RFR: 8263551: Provide shared lock-free FIFO queue implementation [v4] In-Reply-To: References: <34dwf1No_CG6jWti-W-fYMp63qgBVrYPgZrNKNLp-vs=.b5655d16-d4cf-4d71-b8d5-1175f7e58953@github.com> Message-ID: <3KxN9XPg6g9yTrORdaowiF4MLdnBsMI13GWMJix-CuU=.23c97476-1319-460e-9044-af3fbe2a27ea@github.com> On Wed, 31 Mar 2021 10:35:59 GMT, Ivan Walulya wrote: >> Man Cao has updated the pull request incrementally with one additional commit since the last revision: >> >> Revise comments and move trailing padding. > > Marked as reviewed by iwalulya (Committer). Thank you for the reviews! ------------- PR: https://git.openjdk.java.net/jdk/pull/2986 From iklam at openjdk.java.net Wed Mar 31 19:01:48 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 31 Mar 2021 19:01:48 GMT Subject: RFR: 8264285: Clean the modification of ccstr JVM flags [v2] In-Reply-To: References: Message-ID: > There are two versions of JVMFlagAccess::ccstrAtPut() for modifying JVM flags of the ccstr type (i.e., strings). > > - One version requires the caller to free the old value, but some callers don't do that (writeableFlags.cpp). > - The other version frees the old value on behalf of the caller. However, this version is accessible only via FLAG_SET_XXX macros and is currently unused. So it's unclear whether it actually works. > > We should combine these two versions into a single function, fix problems in the callers, and add test cases. The old value should be freed automatically, because typically the caller isn't interested in the old value. > > Note that the FLAG_SET_XXX macros do not return the old value. Requiring the caller of FLAG_SET_XXX to free the old value would be tedious and error prone. 
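
The ownership rule in the description above (the setter frees the old value, so callers never have to) can be sketched in a few lines. This is only an illustration under stated assumptions: `StringFlag` and `string_flag_put` are hypothetical names, plain `malloc`/`free` stand in for whatever allocator the VM uses, and nothing here is the actual `JVMFlagAccess::ccstrAtPut()` code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a string-valued flag; not the HotSpot JVMFlag class.
struct StringFlag {
  char* value = nullptr;  // owned heap copy, or nullptr if the flag is unset
};

// Store a private copy of new_value in the flag and free the previous value.
// The copy is made before the old value is released, so passing the flag's
// own current value as new_value is safe.
static void string_flag_put(StringFlag& flag, const char* new_value) {
  char* copy = nullptr;
  if (new_value != nullptr) {
    const std::size_t len = std::strlen(new_value) + 1;
    copy = static_cast<char*>(std::malloc(len));
    assert(copy != nullptr && "allocation failed");
    std::memcpy(copy, new_value, len);
  }
  char* old = flag.value;
  flag.value = copy;
  std::free(old);  // free(nullptr) is a no-op, so an unset flag needs no special case
}
```

A caller that really needs the previous value would read and copy it before calling the setter, which matches the code comment quoted in the review.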
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: relax flag attributions (ala JDK-7123237) ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3254/files - new: https://git.openjdk.java.net/jdk/pull/3254/files/7eca2343..673aaafc Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3254&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3254&range=00-01 Stats: 37 lines in 4 files changed: 0 ins; 36 del; 1 mod Patch: https://git.openjdk.java.net/jdk/pull/3254.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3254/head:pull/3254 PR: https://git.openjdk.java.net/jdk/pull/3254 From iklam at openjdk.java.net Wed Mar 31 19:05:14 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 31 Mar 2021 19:05:14 GMT Subject: RFR: 8264285: Clean the modification of ccstr JVM flags [v2] In-Reply-To: References: Message-ID: On Wed, 31 Mar 2021 12:58:50 GMT, Coleen Phillimore wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> relax flag attributions (ala JDK-7123237) > > src/hotspot/share/runtime/flags/debug_globals.hpp line 38: > >> 36: // have any MANAGEABLE flags of the ccstr type, but we really need to >> 37: // make sure the implementation is correct (in terms of memory allocation) >> 38: // just in case someone may add such a flag in the future. > > Could you have just added a develop flag to the manageable flags instead? I had to use a `product` flag due to the following code, which should have been removed as part of [JDK-8243208](https://bugs.openjdk.java.net/browse/JDK-8243208), but I was afraid to do so because I didn't have a test case. I.e., all of our diagnostic/manageable/experimental flags were `product` flags. With this PR, now I have a test case -- I changed `DummyManageableStringFlag` to a `notproduct` flag, and removed the following code. I am re-running tiers1-4 now. 
    void JVMFlag::check_all_flag_declarations() {
      for (JVMFlag* current = &flagTable[0]; current->_name != NULL; current++) {
        int flags = static_cast<int>(current->_flags);
        // Backwards compatibility. This will be relaxed/removed in JDK-7123237.
        int mask = JVMFlag::KIND_DIAGNOSTIC | JVMFlag::KIND_MANAGEABLE | JVMFlag::KIND_EXPERIMENTAL;
        if ((flags & mask) != 0) {
          assert((flags & mask) == JVMFlag::KIND_DIAGNOSTIC ||
                 (flags & mask) == JVMFlag::KIND_MANAGEABLE ||
                 (flags & mask) == JVMFlag::KIND_EXPERIMENTAL,
                 "%s can be declared with at most one of "
                 "DIAGNOSTIC, MANAGEABLE or EXPERIMENTAL", current->_name);
          assert((flags & KIND_NOT_PRODUCT) == 0 &&
                 (flags & KIND_DEVELOP) == 0,
                 "%s has an optional DIAGNOSTIC, MANAGEABLE or EXPERIMENTAL "
                 "attribute; it must be declared as a product flag", current->_name);
        }
      }
    }

-------------

PR: https://git.openjdk.java.net/jdk/pull/3254

From thomas.stuefe at gmail.com  Wed Mar 31 19:10:27 2021
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Wed, 31 Mar 2021 21:10:27 +0200
Subject: Request For Comment: Asynchronous Logging
In-Reply-To:
References:
Message-ID:

Hi Volker,

Excellent summary. Thank you for starting the design discussion away from the PR thread at GH. I think this is a better place for this discussion.

I think UL had been missing a feature like this. I think we should provide it. Different people have come forward with the same idea in the past, so I believe there is a real need.

You captured all points succinctly and prepared the discussion well. My remarks are inline (where I don't write one I agree).

On Tue, Mar 30, 2021 at 8:19 PM Volker Simonis wrote:

> Hi,
>
> I'd like to (re)start a discussion on asynchronous logging [1,2,3,4].
> We are successfully using this feature productively at Amazon both in
> jdk8 and jdk11 to reduce the tail latency of services which use
> logging.
> We think that async logging is a useful addition to the
> current logging framework which might be beneficial to a wider range
> of users. The following write-up tries to capture the comments and
> suggestions from the previous discussions we are aware of.
>
> Current state:
>
> - HotSpot uses the so-called "Unified Logging" (UL) framework which
> was introduced by JEP 158 [5] in JDK 9. Most logs have been
> retrofitted to use UL since then (e.g. "JEP 271: Unified GC Logging"
> [6]).
> - The current UL implementation is based on the standard C buffered
> stream I/O interface [7]. The LogFileStreamOutput class which writes
> logs to abstract FILE streams is the only child of the abstract base
> class LogOutput. LogFileStreamOutput has three child classes
> LogStdoutOutput, LogStderrOutput and LogFileOutput which write to
> stdout, stderr or an arbitrary file respectively. The initial UL JEP
> 158 [5] envisioned logging to a socket but has not implemented it. At
> least one such extension has been proposed since then [8].
> - UL synchronizes logs from different threads with the help of the
> standard C flockfile()/funlockfile() [9] functions. But even without
> this explicit locking, all the "stdio functions are thread-safe. This
> is achieved by assigning to each FILE object a lockcount and (if the
> lockcount is nonzero) an owning thread. For each library call, these
> functions wait until the FILE object is no longer locked by a
> different thread, then lock it, do the requested I/O, and unlock the
> object again" [9]. A quick look at the glibc sources reveals that FILE
> locking is implemented with the help of futex() [10] which breaks down
> to a simple atomic compare and swap (CAS) on the fast path.
> - Notice that UL "synchronizes" logs from different threads to avoid
> log interleaving. But it does not "serialize" logs according to the
> time at which they occurred.
> This is because the underlying stdio
> functions do not guarantee a specific order for different threads
> waiting on a locked FILE stream. E.g. if three log events A, B, C
> occur in that order, the first will lock the output stream. If the log
> events B and C both arrive while the stream is locked, it is
> unspecified which of B and C will be logged first after A releases the
> lock.
>
> Problem statement:
>
> - The amount of time a log event will block its FILE stream depends on
> the underlying file system. This can range from a few nanoseconds for
> in-memory file systems or milliseconds for physical discs under heavy
> load up to several seconds in the worst case scenario for e.g. network
> file systems. A blocked log output stream will block all concurrent
> threads which try to write log messages at the same time. If logging
> is done during a safepoint, this can significantly increase the
> safepoint time (e.g. several parallel GC threads trying to log at the
> same time). We can treat stdout/stderr as special files here without
> loss of generality.
>
> Proposed solution:
>
> Extend UL with an asynchronous logging facility. Asynchronous logging
> will be optional and disabled by default. It should have the following
> properties:
> - If disabled (the default) asynchronous logging should have no
> observable impact on logging.
>

Additionally, if disabled, it should not cost anything. If enabled, it should cost as little as possible (e.g. if logging is enabled but nobody logs).

> - If enabled, log messages will be stored in an intermediate data
> structure (e.g. a double ended queue).
> - A service thread will concurrently read and remove log messages from
> that data structure in a FIFO style and write them to the output
> stream
> - Storing log messages in the intermediate data structure should take
> constant time and not longer than logging a message takes in the
> traditional UL system (in general the time should be much shorter
> because the actual I/O is deferred).
> - Asynchronous logging trades memory overhead (i.e. the size of the
> intermediate data structure) for log accuracy. This means that in the
> unlikely case where the service thread which does the asynchronous
> logging falls behind the log producing threads, some logs might be
> lost. However, the probability for this to happen can be minimized by
> increasing the configurable size of the intermediate data structure.
> - The final output produced by asynchronous logging should be equal to
> the output from normal logging if no messages had to be dropped.
>

+1. This means decorators have to be resolved at the log site, not in the flusher.

> Notice that in contrast to the traditional unified logging,
> asynchronous logging will give us the possibility to not only
> synchronize log events, but to optionally also serialize them based on
> their generation time if that's desired. This is because we are in
> full control of the synchronization primitives for the intermediate
> data structure which stores the logs.
> - If log messages had to be dropped, this should be logged in the log
> output (e.g. "[..] 42 messages dropped due to async logging")
> - Asynchronous logging should ideally be implemented in such a way
> that it can be easily adapted by alternative log targets like for
> example sockets in the future.
>

Additional requests:

- no log output should be withheld in case of vm exit or crash
- no log output should be unreasonably delayed
- The logger side should use as little VM infrastructure as possible to prevent circularity.
>
> Alternative solutions:
> - It has repeatedly been suggested to place the log files into a
> memory file system but we don't think this is an optimal solution.
> Main memory is often a constrained resource and we don't want log
> files to compete with the JVM for it in such cases.
> - It has also been argued to place the log files on a fast file system
> which is only used for logging but in containerized environments file
> systems are often virtualized and the properties of the underlying
> physical devices are not obvious.
> - The load on the file system might be unpredictable due to other
> applications on the same host.
> - All these workarounds won't work if we eventually implement direct
> logging to a network socket as suggested in [8].
>
> Implementation details / POC:
>
> - A recent pull request [2] for JDK-8229517 [3] proposed to use a
> simple deque implementation derived from HotSpot's LinkedListImpl
> class for the intermediate data structure. It synchronizes access to
> the queue with a MutexLocker which is internally implemented with
> pthread_mutex_lock() and results in an atomic CAS on the fast path. So
> performance-wise the locking itself is not different from the
> flockfile()/funlockfile() functionality currently used by UL but
> adding a log message to the deque should be constant as it basically
> only requires a strdup(). And we could even eliminate the strdup() if
> we'd pre-allocate a big enough array for holding the log messages as
> proposed in the pull request [2].
>

IIUC to be equivalent to flockfile the implementation would have to use one queue and one mutex *per file sink*, not a global queue/mutex as the patch proposed. Because otherwise you now introduce synchronization between log sites logging into different files, which before did not affect each other.
> - The pull request [2] for JDK-8229517 [3] proposed to set the
> async flag as an attribute of the Xlog option which feels more natural
> because UL configuration is traditionally done within the Xlog option.
> But we could just as well use a global -XX flag to enable async
> logging? What are your preferences here?
>

I prefer to keep "async" a global option. I think we should expose as little freedom to the user as possible. I do not think there is a sensible scenario where one would wish to write to one file with async and to another file without async. Nevertheless, if we make this option target-specific, this has to work, and perform, and be regression-tested in all its variations.

Every option we roll out is a contract we have to fulfill. They find their way into all kinds of environments and user scripts, and once out there it is difficult to roll out incompatible changes. For instance, there is no mechanism to deprecate a part of an option. We have a mechanism for deprecating normal VM options, but Xlog is not a normal option.

I am concerned with keeping UL maintainable, and that means keeping the implementation malleable. The more implementation details we expose in the form of options and functionality, the more our hands are tied if we want to change the implementation later. E.g. the implementation of a target-specific async option has to be aware of the existence of targets, and would prevent implementation of this feature in a layer which does not know about targets (e.g. deep down in File IO code).

> - The pull request [2] for JDK-8229517 [3] (mis)uses the
> WatcherThread as service thread to concurrently process the
> intermediate data structure and write the log messages out to the log
> stream. That should definitely be changed to an independent service
> thread.
>

Yes.

> - The pull request [2] for JDK-8229517 [3] initially proposed
> that the "service thread" runs at a fixed interval to dump log
> messages to the log streams.
But reviewers commented that this should
> better happen either continuously or based on the filling level of the
> intermediate data structure. What are your preferences here?
>

I'd say dump continuously. I like the "flush on filling level" idea even less than the idea of periodic flushes. I do not like trace systems which arbitrarily keep output from me and need a flush to spit everything out, or which omit the last n lines if the VM crashes.

> - What are your preferences on the configuration of the intermediate
> data structure? Should it be configured based on the maximum number of
> log messages it can store or rather on the total size of the stored
> log messages? I think that for normal runs this distinction probably
> won't make a big difference because the size of log messages will
> probably be the same on average so "number of log messages" should
> always be proportional to "total size of log messages".
>
>

I prefer the configuration of the intermediate buffer to be a memory size, not a "number of entries". The latter does not carry any information (what entries? how large are they?). It also, again, exposes implementation details - in this case that there is a vector of entries.

A memory size could initially, in a first implementation, be translated roughly to a deque size by just assuming an average log line length. Future implementations then have the freedom to change this, e.g. to use a pre-allocated fixed-size buffer of the given length, or a more involved scheme.

> 1. Before diving into more implementation details, I'd first like to
> reach a general consensus that asynchronous logging is a useful
> feature that's worthwhile adding to HotSpot.
>
> 2. Once we agree on that, we should agree on the desired properties of
> asynchronous logging. I've tried to collect a basic set in the
> "Proposed solution" section.
>
> 3. If that's done as well, we can discuss various implementation
> details and finally prepare new pull requests.
>
> Thank you and best regards,
> Volker
>
> [1] https://bugs.openjdk.java.net/browse/JDK-8229517
> [2] https://github.com/openjdk/jdk/pull/3135
> [3] https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-November/043427.html
> [4] https://mail.openjdk.java.net/pipermail/hotspot-dev/2019-August/039130.html
> [5] https://openjdk.java.net/jeps/158
> [6] https://openjdk.java.net/jeps/271
> [7] https://man7.org/linux/man-pages/man3/stdio.3.html
> [8] https://gist.github.com/YaSuenag/dacb6d94d8684915422232c7a08d5b5d
> [9] https://man7.org/linux/man-pages/man3/flockfile.3.html
> [10] https://man7.org/linux/man-pages/man2/futex.2.html

Thanks, Thomas

From coleenp at openjdk.java.net Wed Mar 31 19:52:35 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Wed, 31 Mar 2021 19:52:35 GMT
Subject: RFR: 8264538: Rename SystemDictionary::parse_stream
Message-ID:

This function is used to call the classfile parser for hidden or anonymous classes, and for use with jvmti RedefineClasses. The latter only calls KlassFactory::create_from_stream and skips the rest of the code in SystemDictionary::parse_stream.

Renamed SystemDictionary::parse_stream -> resolve_hidden_class_from_stream
resolve_from_stream -> resolve_class_from_stream
and have SystemDictionary::resolve_from_stream() call the right version depending on ClassLoadInfo flags. Callers of resolve_from_stream now pass protection domain via. ClassLoadInfo.

So the external API is resolve_from_stream.

Tested with tier1 on 4 Oracle supported platforms.
------------- Commit messages: - Rename parse_stream Changes: https://git.openjdk.java.net/jdk/pull/3289/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=3289&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8264538 Stats: 132 lines in 7 files changed: 37 ins; 23 del; 72 mod Patch: https://git.openjdk.java.net/jdk/pull/3289.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3289/head:pull/3289 PR: https://git.openjdk.java.net/jdk/pull/3289 From coleenp at openjdk.java.net Wed Mar 31 19:57:57 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 31 Mar 2021 19:57:57 GMT Subject: RFR: 8264538: Rename SystemDictionary::parse_stream [v2] In-Reply-To: References: Message-ID: > This function is used to call the classfile parser for hidden or anonymous classes, and for use with jvmti RedefineClasses. The latter only calls KlassFactory::create_from_stream and skips the rest of the code in SystemDictionary::parse_stream. > > Renamed SystemDictionary::parse_stream -> resolve_hidden_class_from_stream > resolve_from_stream -> resolve_class_from_stream > and have SystemDictionary::resolve_from_stream() call the right version depending on ClassLoadInfo flags. Callers of resolve_from_stream now pass protection domain via. ClassLoadInfo. > > So the external API is resolve_from_stream. > > Tested with tier1 on 4 Oracle supported platforms. 
Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: fifix comment ------------- Changes: - all: https://git.openjdk.java.net/jdk/pull/3289/files - new: https://git.openjdk.java.net/jdk/pull/3289/files/ba7532bd..8dfcb093 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3289&range=01 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3289&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.java.net/jdk/pull/3289.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/3289/head:pull/3289 PR: https://git.openjdk.java.net/jdk/pull/3289 From lfoltan at openjdk.java.net Wed Mar 31 20:11:24 2021 From: lfoltan at openjdk.java.net (Lois Foltan) Date: Wed, 31 Mar 2021 20:11:24 GMT Subject: RFR: 8264538: Rename SystemDictionary::parse_stream [v2] In-Reply-To: References: Message-ID: <5XkcECxtI-f304vVc6TXjJNXHA5kpzcHORV8q8mPvzk=.da25a174-4c29-4f5a-a639-695043a06067@github.com> On Wed, 31 Mar 2021 19:57:57 GMT, Coleen Phillimore wrote: >> This function is used to call the classfile parser for hidden or anonymous classes, and for use with jvmti RedefineClasses. The latter only calls KlassFactory::create_from_stream and skips the rest of the code in SystemDictionary::parse_stream. >> >> Renamed SystemDictionary::parse_stream -> resolve_hidden_class_from_stream >> resolve_from_stream -> resolve_class_from_stream >> and have SystemDictionary::resolve_from_stream() call the right version depending on ClassLoadInfo flags. Callers of resolve_from_stream now pass protection domain via. ClassLoadInfo. >> >> So the external API is resolve_from_stream. >> >> Tested with tier1 on 4 Oracle supported platforms. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > fifix comment Nice clean up Coleen. One minor comment. 
Thanks, Lois

src/hotspot/share/prims/jvm.cpp line 950:

> 948: InstanceKlass* ik = NULL;
> 949: if (!is_hidden) {
> 950: ClassLoadInfo cl_info(protection_domain);

Minor comment, you could pull the creation of ClassLoadInfo out of this if statement since both the if and the else sections create a ClassLoadInfo with pretty much the same information.

-------------

Marked as reviewed by lfoltan (Reviewer).

PR: https://git.openjdk.java.net/jdk/pull/3289

From hseigel at openjdk.java.net Wed Mar 31 20:25:29 2021
From: hseigel at openjdk.java.net (Harold Seigel)
Date: Wed, 31 Mar 2021 20:25:29 GMT
Subject: RFR: 8264538: Rename SystemDictionary::parse_stream [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 31 Mar 2021 19:57:57 GMT, Coleen Phillimore wrote:

>> This function is used to call the classfile parser for hidden or anonymous classes, and for use with jvmti RedefineClasses. The latter only calls KlassFactory::create_from_stream and skips the rest of the code in SystemDictionary::parse_stream.
>>
>> Renamed SystemDictionary::parse_stream -> resolve_hidden_class_from_stream
>> resolve_from_stream -> resolve_class_from_stream
>> and have SystemDictionary::resolve_from_stream() call the right version depending on ClassLoadInfo flags. Callers of resolve_from_stream now pass protection domain via. ClassLoadInfo.
>>
>> So the external API is resolve_from_stream.
>>
>> Tested with tier1 on 4 Oracle supported platforms.
>
> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>
> fifix comment

src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 1395:

> 1393: cl_info,
> 1394: THREAD);
> 1395:

Could you add a comment above line 1390 saying you can't call resolve_class_from_stream() here because the resulting class should not go in the system dictionary?
------------- PR: https://git.openjdk.java.net/jdk/pull/3289 From hseigel at openjdk.java.net Wed Mar 31 20:30:20 2021 From: hseigel at openjdk.java.net (Harold Seigel) Date: Wed, 31 Mar 2021 20:30:20 GMT Subject: RFR: 8264538: Rename SystemDictionary::parse_stream [v2] In-Reply-To: References: Message-ID: On Wed, 31 Mar 2021 19:57:57 GMT, Coleen Phillimore wrote: >> This function is used to call the classfile parser for hidden or anonymous classes, and for use with jvmti RedefineClasses. The latter only calls KlassFactory::create_from_stream and skips the rest of the code in SystemDictionary::parse_stream. >> >> Renamed SystemDictionary::parse_stream -> resolve_hidden_class_from_stream >> resolve_from_stream -> resolve_class_from_stream >> and have SystemDictionary::resolve_from_stream() call the right version depending on ClassLoadInfo flags. Callers of resolve_from_stream now pass protection domain via. ClassLoadInfo. >> >> So the external API is resolve_from_stream. >> >> Tested with tier1 on 4 Oracle supported platforms. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > fifix comment src/hotspot/share/prims/jvmtiRedefineClasses.hpp line 305: > 303: // - How do we serialize the RedefineClasses() API without deadlocking? > 304: // > 305: // - KlassFactory::create_from_stream() was called with a NULL protection Maybe delete the comment that goes from lines 305 - 309 ? ------------- PR: https://git.openjdk.java.net/jdk/pull/3289 From minqi at openjdk.java.net Wed Mar 31 20:58:46 2021 From: minqi at openjdk.java.net (Yumin Qi) Date: Wed, 31 Mar 2021 20:58:46 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v8] In-Reply-To: References: Message-ID: > Hi, Please review > > Added jcmd option for dumping CDS archive during application runtime. Before this change, user has to dump shared archive in two steps: first run application with > `java -XX:DumpLoadedClassList= .... 
` > to collect shareable class names and saved in file `` , then > `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` > With this change, user can use jcmd to dump CDS without going through above steps. Also user can choose a moment during the app runtime to dump an archive. > The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. > New added jcmd option: > `jcmd VM.cds static_dump ` > or > `jcmd VM.cds dynamic_dump ` > To dump dynamic archive, requires start app with newly added flag `-XX:+RecordDynamicDumpInfo`, with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changed some object memory locations so for dumping dynamic archive, can only done once for a running app. For static dump, user can dump multiple times against same process. > The file name is optional, if the file name is not supplied, the file name will take format of `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. The `` is the application process ID. > > Tests: tier1,tier2,tier3,tier4 > > Thanks > Yumin Yumin Qi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: - Fix revert unintentionally comment, merge master. - Merge branch 'master' of ssh://github.com/yminqi/jdk into jdk-8259070 - Remove CDS.getVMArguments, changed to use VM.getRuntimeVMArguments. Removed unused function from ClassLoader. Improved InstanceKlass::is_shareable() and related test. Added more test scenarios. 
- Remove redundant check for if a class is shareable - Fix according to review comment and add more tests - Fix filter more flags to exclude in static dump, add more test cases - Merge branch 'master' into jdk-8259070 - Fix white space in CDS.java - Add function CDS.dumpSharedArchive in java to dump shared archive - 8259070: Add jcmd option to dump CDS ------------- Changes: https://git.openjdk.java.net/jdk/pull/2737/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=2737&range=07 Stats: 830 lines in 21 files changed: 758 ins; 58 del; 14 mod Patch: https://git.openjdk.java.net/jdk/pull/2737.diff Fetch: git fetch https://git.openjdk.java.net/jdk pull/2737/head:pull/2737 PR: https://git.openjdk.java.net/jdk/pull/2737 From coleenp at openjdk.java.net Wed Mar 31 21:41:40 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 31 Mar 2021 21:41:40 GMT Subject: RFR: 8264538: Rename SystemDictionary::parse_stream [v2] In-Reply-To: References: Message-ID: On Wed, 31 Mar 2021 19:57:57 GMT, Coleen Phillimore wrote: >> This function is used to call the classfile parser for hidden or anonymous classes, and for use with jvmti RedefineClasses. The latter only calls KlassFactory::create_from_stream and skips the rest of the code in SystemDictionary::parse_stream. >> >> Renamed SystemDictionary::parse_stream -> resolve_hidden_class_from_stream >> resolve_from_stream -> resolve_class_from_stream >> and have SystemDictionary::resolve_from_stream() call the right version depending on ClassLoadInfo flags. Callers of resolve_from_stream now pass protection domain via. ClassLoadInfo. >> >> So the external API is resolve_from_stream. >> >> Tested with tier1 on 4 Oracle supported platforms. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > fifix comment Thank you for reviewing this Lois and Harold. Some replies attached. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3289 From coleenp at openjdk.java.net Wed Mar 31 21:41:39 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 31 Mar 2021 21:41:39 GMT Subject: RFR: 8264538: Rename SystemDictionary::parse_stream [v3] In-Reply-To: References: Message-ID: > This function is used to call the classfile parser for hidden or anonymous classes, and for use with jvmti RedefineClasses. The latter only calls KlassFactory::create_from_stream and skips the rest of the code in SystemDictionary::parse_stream. > > Renamed SystemDictionary::parse_stream -> resolve_hidden_class_from_stream > resolve_from_stream -> resolve_class_from_stream > and have SystemDictionary::resolve_from_stream() call the right version depending on ClassLoadInfo flags. Callers of resolve_from_stream now pass protection domain via. ClassLoadInfo. > > So the external API is resolve_from_stream. > > Tested with tier1 on 4 Oracle supported platforms. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add and remove comments. 
-------------

Changes:
- all: https://git.openjdk.java.net/jdk/pull/3289/files
- new: https://git.openjdk.java.net/jdk/pull/3289/files/8dfcb093..cd49552a

Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=3289&range=02
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=3289&range=01-02

Stats: 9 lines in 2 files changed: 2 ins; 7 del; 0 mod
Patch: https://git.openjdk.java.net/jdk/pull/3289.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/3289/head:pull/3289

PR: https://git.openjdk.java.net/jdk/pull/3289

From coleenp at openjdk.java.net Wed Mar 31 21:41:41 2021
From: coleenp at openjdk.java.net (Coleen Phillimore)
Date: Wed, 31 Mar 2021 21:41:41 GMT
Subject: RFR: 8264538: Rename SystemDictionary::parse_stream [v2]
In-Reply-To: <5XkcECxtI-f304vVc6TXjJNXHA5kpzcHORV8q8mPvzk=.da25a174-4c29-4f5a-a639-695043a06067@github.com>
References: <5XkcECxtI-f304vVc6TXjJNXHA5kpzcHORV8q8mPvzk=.da25a174-4c29-4f5a-a639-695043a06067@github.com>
Message-ID:

On Wed, 31 Mar 2021 20:07:50 GMT, Lois Foltan wrote:

>> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision:
>>
>> fifix comment
>
> src/hotspot/share/prims/jvm.cpp line 950:
>
>> 948: InstanceKlass* ik = NULL;
>> 949: if (!is_hidden) {
>> 950: ClassLoadInfo cl_info(protection_domain);
>
> Minor comment, you could pull the creation of ClassLoadInfo out of this if statement since both the if and the else sections create a ClassLoadInfo with pretty much the same information.

That other ClassLoadInfo cl_info(protection_domain) you see is from another function, so I can't pull it out. The other side of the 'if' statement creates a ClassLoadInfo with all the hidden class goodies.
------------- PR: https://git.openjdk.java.net/jdk/pull/3289 From coleenp at openjdk.java.net Wed Mar 31 21:41:42 2021 From: coleenp at openjdk.java.net (Coleen Phillimore) Date: Wed, 31 Mar 2021 21:41:42 GMT Subject: RFR: 8264538: Rename SystemDictionary::parse_stream [v2] In-Reply-To: References: Message-ID: On Wed, 31 Mar 2021 20:22:08 GMT, Harold Seigel wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> fifix comment > > src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 1395: > >> 1393: cl_info, >> 1394: THREAD); >> 1395: > > Could you add a comment above line 1390 saying you can't call resolve_class_from_stream() here because the resulting class should not go in the system dictionary? // Parse and create a class from the bytes, but this class isn't added // to the dictionary, so do not call resolve_from_stream. > src/hotspot/share/prims/jvmtiRedefineClasses.hpp line 305: > >> 303: // - How do we serialize the RedefineClasses() API without deadlocking? >> 304: // >> 305: // - KlassFactory::create_from_stream() was called with a NULL protection > > Maybe delete the comment that goes from lines 305 - 309 ? Good idea. The comment is really old and no longer relevant. 
------------- PR: https://git.openjdk.java.net/jdk/pull/3289 From iklam at openjdk.java.net Wed Mar 31 22:53:29 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 31 Mar 2021 22:53:29 GMT Subject: RFR: 8264285: Clean the modification of ccstr JVM flags [v2] In-Reply-To: <71SWS17lpVrTS_4--6mimeyjCYjYzP_VO_lJ-rImnxg=.a75340ae-d17e-4f0d-8868-21d4449d64f6@github.com> References: <71SWS17lpVrTS_4--6mimeyjCYjYzP_VO_lJ-rImnxg=.a75340ae-d17e-4f0d-8868-21d4449d64f6@github.com> Message-ID: On Tue, 30 Mar 2021 03:44:26 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> relax flag attributions (ala JDK-7123237) > > src/hotspot/share/services/writeableFlags.cpp line 250: > >> 248: if (err == JVMFlag::SUCCESS) { >> 249: assert(value == NULL, "old value is freed automatically and not returned"); >> 250: } > > The whole block should be ifdef DEBUG. Since this whole block can be optimized out by the C compiler in product builds, I'd rather leave out the `#ifdef` to avoid clutter. ------------- PR: https://git.openjdk.java.net/jdk/pull/3254 From iklam at openjdk.java.net Wed Mar 31 23:30:42 2021 From: iklam at openjdk.java.net (Ioi Lam) Date: Wed, 31 Mar 2021 23:30:42 GMT Subject: RFR: 8259070: Add jcmd option to dump CDS [v8] In-Reply-To: References: Message-ID: <3yTFtGkXVbbpeMoqSZjS8Gu_iYUmDVFzUjqX6hCI0Ro=.942e390b-672f-4090-bf0a-59b937b23e60@github.com> On Wed, 31 Mar 2021 20:58:46 GMT, Yumin Qi wrote: >> Hi, Please review >> >> Added jcmd option for dumping CDS archive during application runtime. Before this change, user has to dump shared archive in two steps: first run application with >> `java -XX:DumpLoadedClassList= .... ` >> to collect shareable class names and saved in file `` , then >> `java -Xshare:dump -XX:SharedClassListFile= -XX:SharedArchiveFile= ...` >> With this change, user can use jcmd to dump CDS without going through above steps. 
Also user can choose a moment during the app runtime to dump an archive. >> The bug is associated with the CSR: https://bugs.openjdk.java.net/browse/JDK-8259798 which has been approved. >> New added jcmd option: >> `jcmd VM.cds static_dump ` >> or >> `jcmd VM.cds dynamic_dump ` >> To dump dynamic archive, requires start app with newly added flag `-XX:+RecordDynamicDumpInfo`, with this flag, some information related to dynamic dump like loader constraints will be recorded. Note the dumping process changed some object memory locations so for dumping dynamic archive, can only done once for a running app. For static dump, user can dump multiple times against same process. >> The file name is optional, if the file name is not supplied, the file name will take format of `java_pid_static.jsa` or `java_pid_dynamic.jsa` for static and dynamic respectively. The `` is the application process ID. >> >> Tests: tier1,tier2,tier3,tier4 >> >> Thanks >> Yumin > > Yumin Qi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: > > - Fix revert unintentionally comment, merge master. > - Merge branch 'master' of ssh://github.com/yminqi/jdk into jdk-8259070 > - Remove CDS.getVMArguments, changed to use VM.getRuntimeVMArguments. Removed unused function from ClassLoader. Improved InstanceKlass::is_shareable() and related test. Added more test scenarios. > - Remove redundant check for if a class is shareable > - Fix according to review comment and add more tests > - Fix filter more flags to exclude in static dump, add more test cases > - Merge branch 'master' into jdk-8259070 > - Fix white space in CDS.java > - Add function CDS.dumpSharedArchive in java to dump shared archive > - 8259070: Add jcmd option to dump CDS Changes requested by iklam (Reviewer). 
src/hotspot/share/classfile/vmSymbols.hpp line 304: > 302: template(generateLambdaFormHolderClasses_signature, "([Ljava/lang/String;)[Ljava/lang/Object;") \ > 303: template(dumpSharedArchive, "dumpSharedArchive") \ > 304: template(dumpSharedArchive_signature, "(ZLjava/lang/String;)V") \ Need to align the "dumpSharedArchive" part with the previous line. src/hotspot/share/prims/jvm.cpp line 3745: > 3743: #if INCLUDE_CDS > 3744: assert(UseSharedSpaces && RecordDynamicDumpInfo, "already checked in arguments.cpp?"); > 3745: if (DynamicArchive::has_been_dumped_once()) { Maybe add a comment like this:? // During dynamic archive dumping, some of the data structures are overwritten so // we cannot dump the dynamic archive again. TODO: this should be fixed. src/hotspot/share/prims/jvm.cpp line 3754: > 3752: assert(ArchiveClassesAtExit == nullptr, "already checked in arguments.cpp?"); > 3753: Handle file_handle(THREAD, JNIHandles::resolve_non_null(archiveName)); > 3754: char* archive_name = java_lang_String::as_utf8_string(file_handle()); A ResourceMark is needed before calling java_lang_String::as_utf8_string(). In general, I think the code in jvm.cpp should only marshall the jobject argument (e.g., convert `jstring` to `char*`.). The main functionality of JVM_DumpDynamicArchive should be moved to dynamicArchive.cpp. Similarly, most of the work in JVM_DumpClassListToFile should be moved to metaspaceShared.cpp. src/hotspot/share/prims/jvm.cpp line 3759: > 3757: DynamicArchive::dump(); > 3758: } else { > 3759: THROW_MSG(vmSymbols::java_lang_RuntimeException(), Need to set ArchiveClassesAtExit to NULL before throwing the exception, since dynamic dump may not work anymore after the failure. test/hotspot/jtreg/runtime/cds/appcds/jcmd/LingeredTestApp.java line 28: > 26: public class LingeredTestApp extends LingeredApp { > 27: // Do not use default test.class.path in class path. 
> 28: public boolean useDefaultClasspath() { return false; } It's not obvious that you're changing the behavior of the base class by overriding a member function. It's better to have public LingeredTestApp() { setUseDefaultClasspath(false); } Also, the name of LingeredTestApp is kind of generic. How about renaming it to JCmdTestLingeredApp? ------------- PR: https://git.openjdk.java.net/jdk/pull/2737