From fyang at openjdk.org Sat Jul 1 00:32:58 2023 From: fyang at openjdk.org (Fei Yang) Date: Sat, 1 Jul 2023 00:32:58 GMT Subject: RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. [v2] In-Reply-To: References: <6JeLSyWD6twDLD83OPiG-_5lTgGVHn8dh-rKkc7scmM=.9b7be0e7-cb20-44c6-8d28-d72c00d41edf@github.com> Message-ID: On Fri, 30 Jun 2023 08:56:02 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> We recently had a bug where user were missing permissions to use this syscall. >> Which caused crashing on, according to hs_err on things like "addi x11, x24, 0" with SIGILL. >> If it fails it is even possible to execute valid but 'old' instruction which may not lead to a crash, instead the program misbehaves. >> >> To avoid this mess I suggest that we first test the syscall during vm init and we use it directly. >> This way we can make sure it never fails. >> >> Tested failing syscall with qemu, tested t1 in qemu, t1 on jh7110 in-progress. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > added back data barrier > > Signed-off-by: Robbin Ehn Overall LGTM. Two nits remain. src/hotspot/cpu/riscv/icache_riscv.cpp line 49: > 47: > 48: void ICacheStubGenerator::generate_icache_flush(ICache::flush_icache_stub_t* flush_icache_stub) { > 49: // Only riscv_flush_icache is supported as I-cache synhronization. Typo: s/synhronization/synchronization/ src/hotspot/os_cpu/linux_riscv/riscv_flush_icache.cpp line 62: > 60: bool RiscvFlushIcache::test() { > 61: long ret; > 62: intptr_t end = ((intptr_t)&ret) + 64; It looks a little bit odd to me here as `ret` is only 8 bytes. Maybe define a local array of 64 bytes and operate on that? ------------- Marked as reviewed by fyang (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14670#pullrequestreview-1507993451 PR Review Comment: https://git.openjdk.org/jdk/pull/14670#discussion_r1248353481 PR Review Comment: https://git.openjdk.org/jdk/pull/14670#discussion_r1248354948 From stuefe at openjdk.org Sat Jul 1 08:04:55 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 1 Jul 2023 08:04:55 GMT Subject: RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. In-Reply-To: References: <6JeLSyWD6twDLD83OPiG-_5lTgGVHn8dh-rKkc7scmM=.9b7be0e7-cb20-44c6-8d28-d72c00d41edf@github.com> Message-ID: On Fri, 30 Jun 2023 08:51:16 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> We recently had a bug where user were missing permissions to use this syscall. >> Which caused crashing on, according to hs_err on things like "addi x11, x24, 0" with SIGILL. >> If it fails it is even possible to execute valid but 'old' instruction which may not lead to a crash, instead the program misbehaves. >> >> To avoid this mess I suggest that we first test the syscall during vm init and we use it directly. >> This way we can make sure it never fails. >> >> Tested failing syscall with qemu, tested t1 in qemu, t1 on jh7110 in-progress. > > Pushed small update. @robehn Thanks for the thorough explanation! > So I feel more comfortable with the VM explicit emitting this syscall, so everyone can see exactly what we are doing. I agree, rather do the real thing explicitly than use - and then have to second-guess - a wrapper. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14670#issuecomment-1615627458 From rehn at openjdk.org Sat Jul 1 10:38:01 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Sat, 1 Jul 2023 10:38:01 GMT Subject: RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. 
[v2] In-Reply-To: References: <6JeLSyWD6twDLD83OPiG-_5lTgGVHn8dh-rKkc7scmM=.9b7be0e7-cb20-44c6-8d28-d72c00d41edf@github.com> Message-ID: On Sat, 1 Jul 2023 00:28:54 GMT, Fei Yang wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> added back data barrier >> >> Signed-off-by: Robbin Ehn > > src/hotspot/os_cpu/linux_riscv/riscv_flush_icache.cpp line 62: > >> 60: bool RiscvFlushIcache::test() { >> 61: long ret; >> 62: intptr_t end = ((intptr_t)&ret) + 64; > > It looks a little bit odd to me here as `ret` is only 8 bytes. Maybe define a local array of 64 bytes and operate on that? We are not flushing ret, but the one/two cachelines around ret, since we know they are valid. (we just created the current frame(s) on to it) Ret is only used to get an stack address in current frame, i.e.. it could have been char just as well. But I can change since this was obviously not clear. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14670#discussion_r1248763336 From stuefe at openjdk.org Sat Jul 1 10:57:56 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 1 Jul 2023 10:57:56 GMT Subject: RFR: 8311180: Remove unused unneeded definitions from globalDefinitions [v2] In-Reply-To: References: Message-ID: On Fri, 30 Jun 2023 19:24:01 GMT, Coleen Phillimore wrote: >> I noticed this cleanup in a patch that Axel shared with me that I thought should be pushed on its own as trivial. >> Tested with tier1 on Oracle supported platforms and looked for these on the others. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix ppc src/hotspot/cpu/ppc/icache_ppc.hpp line 47: > 45: // Align start address to an icache line boundary and transform > 46: // nbytes to an icache line count. > 47: const uint line_offset = (intptr_t)start & (line_size - 1); Drive by comment: This used to be a `uintptr_t` cast, now its `intptr_t`- why signed? 
Was that a deliberate decision? Side question, I always wondered about the widespread use of `intptr_t` in hotspot when casting pointers to integrals, e.g. in `p2i()`. Why a signed integer? E.g. if you then use it for bitwise ops (the major reason why one converts a pointer to an int) it seems more hassle to have to remember the rules for using bitwise operators on signed integers. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14737#discussion_r1248769957 From rehn at openjdk.org Sat Jul 1 11:11:15 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Sat, 1 Jul 2023 11:11:15 GMT Subject: RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. [v3] In-Reply-To: <6JeLSyWD6twDLD83OPiG-_5lTgGVHn8dh-rKkc7scmM=.9b7be0e7-cb20-44c6-8d28-d72c00d41edf@github.com> References: <6JeLSyWD6twDLD83OPiG-_5lTgGVHn8dh-rKkc7scmM=.9b7be0e7-cb20-44c6-8d28-d72c00d41edf@github.com> Message-ID: <4eyogulkaSvi1d-xVbPCAp_mwRSD5sHyfysJj2Gat2A=.abfd20ed-c21d-4150-b25c-e4f9a5b71868@github.com> > Hi, please consider. > > We recently had a bug where user were missing permissions to use this syscall. > Which caused crashing on, according to hs_err on things like "addi x11, x24, 0" with SIGILL. > If it fails it is even possible to execute valid but 'old' instruction which may not lead to a crash, instead the program misbehaves. > > To avoid this mess I suggest that we first test the syscall during vm init and we use it directly. > This way we can make sure it never fails. > > Tested failing syscall with qemu, tested t1 in qemu, t1 on jh7110 in-progress. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - merge update and nits - Merge branch 'master' into 8310656 - added back data barrier Signed-off-by: Robbin Ehn - 8310656: RISC-V: __builtin___clear_cache can fail silently. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/14670/files - new: https://git.openjdk.org/jdk/pull/14670/files/6b6e2dd9..46b59c60 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14670&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14670&range=01-02 Stats: 5912 lines in 342 files changed: 3718 ins; 753 del; 1441 mod Patch: https://git.openjdk.org/jdk/pull/14670.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14670/head:pull/14670 PR: https://git.openjdk.org/jdk/pull/14670 From rehn at openjdk.org Sat Jul 1 11:11:17 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Sat, 1 Jul 2023 11:11:17 GMT Subject: RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. [v2] In-Reply-To: References: <6JeLSyWD6twDLD83OPiG-_5lTgGVHn8dh-rKkc7scmM=.9b7be0e7-cb20-44c6-8d28-d72c00d41edf@github.com> Message-ID: On Fri, 30 Jun 2023 08:56:02 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> We recently had a bug where user were missing permissions to use this syscall. >> Which caused crashing on, according to hs_err on things like "addi x11, x24, 0" with SIGILL. >> If it fails it is even possible to execute valid but 'old' instruction which may not lead to a crash, instead the program misbehaves. >> >> To avoid this mess I suggest that we first test the syscall during vm init and we use it directly. >> This way we can make sure it never fails. >> >> Tested failing syscall with qemu, tested t1 in qemu, t1 on jh7110 in-progress. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > added back data barrier > > Signed-off-by: Robbin Ehn Merge and updated for: "[8311145](https://bugs.openjdk.org/browse/JDK-8311145) Remove check_with_errno duplicates". and those nits. Thanks @RealFYang @tstuefe ! @RealFYang Let me know if that flush address update is better! 
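For readers following along, here is a rough sketch of the idea discussed in this thread: probing the flush syscall once against a small stack buffer. It is an illustration only, not the code in the PR. The function name `probe_icache_flush` and the 64-byte `scratch` buffer are assumptions made for the example. On anything other than Linux/RISC-V the sketch has nothing to probe and simply reports success.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__linux__) && defined(__riscv)
#include <sys/syscall.h>
#include <unistd.h>
#endif

// Hypothetical probe (not the PR's implementation): ask the kernel to flush
// the I-cache for a small range on the current stack, which is known to be
// mapped. Returns true if the syscall succeeded.
bool probe_icache_flush() {
  char scratch[64];   // a local buffer; any valid stack address would do
  scratch[0] = 0;     // touch it so the page is certainly present
  uintptr_t start = reinterpret_cast<uintptr_t>(scratch);
  uintptr_t end   = start + sizeof(scratch);
#if defined(__linux__) && defined(__riscv) && defined(__NR_riscv_flush_icache)
  // flags == 0 asks for a flush that covers all threads of the process.
  long ret = syscall(__NR_riscv_flush_icache, start, end, 0UL);
  return ret == 0;
#else
  // Assumption for this sketch: nothing to test on other platforms.
  (void)start;
  (void)end;
  return true;
#endif
}
```

A caller would presumably run such a probe once during startup and refuse to continue if it returns false, rather than discover the problem later as a SIGILL.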
------------- PR Comment: https://git.openjdk.org/jdk/pull/14670#issuecomment-1615851309 From rehn at openjdk.org Sat Jul 1 11:11:18 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Sat, 1 Jul 2023 11:11:18 GMT Subject: RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. [v2] In-Reply-To: References: <6JeLSyWD6twDLD83OPiG-_5lTgGVHn8dh-rKkc7scmM=.9b7be0e7-cb20-44c6-8d28-d72c00d41edf@github.com> Message-ID: On Sat, 1 Jul 2023 00:22:19 GMT, Fei Yang wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> added back data barrier >> >> Signed-off-by: Robbin Ehn > > src/hotspot/cpu/riscv/icache_riscv.cpp line 49: > >> 47: >> 48: void ICacheStubGenerator::generate_icache_flush(ICache::flush_icache_stub_t* flush_icache_stub) { >> 49: // Only riscv_flush_icache is supported as I-cache synhronization. > > Typo: s/synhronization/synchronization/ Fixed, thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14670#discussion_r1248772314 From rehn at openjdk.org Sat Jul 1 11:11:19 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Sat, 1 Jul 2023 11:11:19 GMT Subject: RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. [v2] In-Reply-To: References: <6JeLSyWD6twDLD83OPiG-_5lTgGVHn8dh-rKkc7scmM=.9b7be0e7-cb20-44c6-8d28-d72c00d41edf@github.com> Message-ID: On Sat, 1 Jul 2023 10:34:48 GMT, Robbin Ehn wrote: >> src/hotspot/os_cpu/linux_riscv/riscv_flush_icache.cpp line 62: >> >>> 60: bool RiscvFlushIcache::test() { >>> 61: long ret; >>> 62: intptr_t end = ((intptr_t)&ret) + 64; >> >> It looks a little bit odd to me here as `ret` is only 8 bytes. Maybe define a local array of 64 bytes and operate on that? > > We are not flushing ret, but the one/two cachelines around ret, since we know they are valid. (we just created the current frame(s) on to it) > Ret is only used to get an stack address in current frame, i.e.. it could have been char just as well. 
> > But I can change since this was obviously not clear. Fixed, thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14670#discussion_r1248772332 From fyang at openjdk.org Sat Jul 1 12:11:55 2023 From: fyang at openjdk.org (Fei Yang) Date: Sat, 1 Jul 2023 12:11:55 GMT Subject: RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. [v2] In-Reply-To: References: <6JeLSyWD6twDLD83OPiG-_5lTgGVHn8dh-rKkc7scmM=.9b7be0e7-cb20-44c6-8d28-d72c00d41edf@github.com> Message-ID: <974x2g5kDQjGbYMDCG4gXCNdckwgKvsXR3DAxYVGzOU=.01e5bd6b-d10d-471b-be16-66234a2a2eeb@github.com> On Sat, 1 Jul 2023 11:05:52 GMT, Robbin Ehn wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> added back data barrier >> >> Signed-off-by: Robbin Ehn > > Merge and updated for: > "[8311145](https://bugs.openjdk.org/browse/JDK-8311145) Remove check_with_errno duplicates". > and those nits. > > Thanks @RealFYang @tstuefe ! > > @RealFYang Let me know if that flush address update is better! @robehn : Yes, looks better to me now. Thank you for the update. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14670#issuecomment-1615875696 From coleenp at openjdk.org Sat Jul 1 15:05:13 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Sat, 1 Jul 2023 15:05:13 GMT Subject: RFR: 8311180: Remove unused unneeded definitions from globalDefinitions [v3] In-Reply-To: References: Message-ID: > I noticed this cleanup in a patch that Axel shared with me that I thought should be pushed on its own as trivial. > Tested with tier1 on Oracle supported platforms and looked for these on the others. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: This should be uintptr_t. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/14737/files - new: https://git.openjdk.org/jdk/pull/14737/files/6725b46f..d28d9ead Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14737&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14737&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14737.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14737/head:pull/14737 PR: https://git.openjdk.org/jdk/pull/14737 From coleenp at openjdk.org Sat Jul 1 15:05:15 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Sat, 1 Jul 2023 15:05:15 GMT Subject: RFR: 8311180: Remove unused unneeded definitions from globalDefinitions [v2] In-Reply-To: References: Message-ID: On Sat, 1 Jul 2023 10:55:35 GMT, Thomas Stuefe wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix ppc > > src/hotspot/cpu/ppc/icache_ppc.hpp line 47: > >> 45: // Align start address to an icache line boundary and transform >> 46: // nbytes to an icache line count. >> 47: const uint line_offset = (intptr_t)start & (line_size - 1); > > Drive by comment: This used to be a `uintptr_t` cast, now its `intptr_t`- why signed? Was that a deliberate decision? > > Side question, I always wondered about the widespread use of `intptr_t` in hotspot when casting pointers to integrals, e.g. in `p2i()`. Why a signed integer? E.g. if you then use it for bitwise ops (the major reason why one converts a pointer to an int) it seems more hassle to have to remember the rules for using bitwise operators on signed integers. Hi Thomas, thanks for noticing. I messed this one up. Are you arguing for address_word to avoid this confusion? That's a lot of reeducation in the code. 
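As an illustration of the signed-versus-unsigned point above, a minimal sketch of the line-offset computation with an unsigned cast follows. The helper names `line_offset` and `line_count` are hypothetical and are not the code in icache_ppc.hpp. The sketch assumes `line_size` is a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: offset of p within its cache line. Casting to
// uintptr_t keeps the bitwise AND on an unsigned type, so there are no
// signed-integer rules to remember. Assumes line_size is a power of two.
inline unsigned line_offset(const void* p, unsigned line_size) {
  assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
  return static_cast<unsigned>(reinterpret_cast<uintptr_t>(p) & (line_size - 1));
}

// Hypothetical helper: number of cache lines touched by [p, p + nbytes).
inline size_t line_count(const void* p, size_t nbytes, unsigned line_size) {
  return (line_offset(p, line_size) + nbytes + line_size - 1) / line_size;
}
```

With a 128-byte line, an address ending in 0x78 has offset 120, so a 16-byte range starting there spans two lines.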
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14737#discussion_r1248865829

From stuefe at openjdk.org Sat Jul 1 15:06:55 2023
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Sat, 1 Jul 2023 15:06:55 GMT
Subject: RFR: 8311180: Remove unused unneeded definitions from globalDefinitions [v2]
In-Reply-To: 
References: 
Message-ID: 

On Sat, 1 Jul 2023 14:59:48 GMT, Coleen Phillimore wrote:

>> src/hotspot/cpu/ppc/icache_ppc.hpp line 47:
>> 
>>> 45: // Align start address to an icache line boundary and transform
>>> 46: // nbytes to an icache line count.
>>> 47: const uint line_offset = (intptr_t)start & (line_size - 1);
>> 
>> Drive by comment: This used to be a `uintptr_t` cast, now its `intptr_t`- why signed? Was that a deliberate decision?
>> 
>> Side question, I always wondered about the widespread use of `intptr_t` in hotspot when casting pointers to integrals, e.g. in `p2i()`. Why a signed integer? E.g. if you then use it for bitwise ops (the major reason why one converts a pointer to an int) it seems more hassle to have to remember the rules for using bitwise operators on signed integers.
> 
> Hi Thomas, thanks for noticing. I messed this one up.
> 
> Are you arguing for address_word to avoid this confusion? That's a lot of reeducation in the code.

Hi Coleen, no, I was just wondering if the decision was deliberate. I may have overlooked a reason for this.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14737#discussion_r1248866457

From stuefe at openjdk.org Sat Jul 1 19:30:06 2023
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Sat, 1 Jul 2023 19:30:06 GMT
Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect
Message-ID: 

Today, if we use UseTransparentHugePages, we assume that the static hugepage detection we do is valid for THPs:
- that THPs use the page size (in hotspot used as "default large page size") found in /proc/meminfo (`Hugepagesize`)
- that THPs are enabled if that page size is >0.
Both assumptions are incorrect. THPs are enabled depending on the mode in `/sys/kernel/mm/transparent_hugepage/enabled` (tri-state). And the pagesize used by khugepaged is the one set in `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size`. The latter can differ from the default large page size on the system (e.g. static hugepage default size could be 1g, whereas THP hugepage size is 2m).

------

About the patch:

This is a limited, minimally invasive patch to fix THP detection. The patch aims to be easy to downport. There is more work to do, which I will do in subsequent RFEs.

The patch cleanly splits off the OS-side feature detection (see hugepages.hpp) from the layer that bases decisions on that detection (os::large_page_init()).

Functionally, for *static* (non-THP) pages nothing changes. THP-mode now correctly detects THP support in the OS, and uses the correct page size (see examples below).

-------------

Example 1:

System has THPs disabled, but static hugepages (1g, 2m) configured:

thomas@starfish $ cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]
thomas@starfish $ cat /proc/meminfo | grep Hugepage
Hugepagesize: 1048576 kB

Without patch, we incorrectly assume THPs are enabled, and that THP page size is 1G (!), which we then proceed and use as heap page size, causing the heap size to be rounded up from 512m -> 1G:

thomas@starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version
[0.001s][info][pagesize] Using the default large page size: 1G
[0.001s][info][pagesize] Usable page sizes: 4k, 2M, 1G
...
[0.016s][info][pagesize] Heap: min=1G max=1G base=0x00000000c0000000 size=1G page_size=1G

With patch, we correctly refuse to use large pages (and we log more info):

thomas@starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version
[0.001s][info][pagesize] Static hugepage support: 2M, 1G (default)
[0.001s][info][pagesize] default pagesize: 1G
[0.001s][info][pagesize] Transparent hugepage (THP) support:
[0.001s][info][pagesize] mode: never
[0.001s][warning][pagesize] UseLargePages disabled, no large pages configured and available on the system.

----------

Example 2:

System has THPs enabled, but THP page size is just *2M*, whereas the system uses a static default hugepage size of *1G*:

thomas@starfish $ cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
thomas@starfish $ cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size
2097152
thomas@starfish $ cat /proc/meminfo | grep Hugepage
Hugepagesize: 1048576 kB

Without patch, THP page size is not correctly recognized as 2M. Instead, we again use 1G as page size for the heap:

thomas@starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version
[0.001s][info][pagesize] Using the default large page size: 1G
[0.001s][info][pagesize] Usable page sizes: 4k, 2M, 1G
...
[0.010s][info][pagesize] Heap: min=1G max=1G base=0x00000000c0000000 size=1G page_size=1G

With patch, we correctly identify the THP page size as 2M, and use that for the heap:

thomas@starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version
[0.001s][info][pagesize] Static hugepage support: 2M, 1G (default)
[0.001s][info][pagesize] default pagesize: 1G
[0.001s][info][pagesize] Transparent hugepage (THP) support:
[0.001s][info][pagesize] mode: madvise
[0.001s][info][pagesize] pagesize: 2M
[0.001s][info][pagesize] Large page support enabled. Usable page sizes: 4k, 2M
[0.001s][info][pagesize] Default: 2M
...
[0.010s][info][pagesize] Heap: min=8M max=512M base=0x00000000e0000000 size=512M page_size=2M

------------

Tests: GHAs all green. Local experiments on x64 Linux on machines with 1G pages succeeded.

-------------

Commit messages:
 - remove whitespaces
 - Improve comments
 - JDK-8310233-Linux-THP-initialization-incorrect

Changes: https://git.openjdk.org/jdk/pull/14739/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14739&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8310233
  Stats: 518 lines in 5 files changed: 387 ins; 100 del; 31 mod
  Patch: https://git.openjdk.org/jdk/pull/14739.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14739/head:pull/14739

PR: https://git.openjdk.org/jdk/pull/14739

From vkempik at openjdk.org Sun Jul 2 13:42:11 2023
From: vkempik at openjdk.org (Vladimir Kempik)
Date: Sun, 2 Jul 2023 13:42:11 GMT
Subject: RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. [v3]
In-Reply-To: <4eyogulkaSvi1d-xVbPCAp_mwRSD5sHyfysJj2Gat2A=.abfd20ed-c21d-4150-b25c-e4f9a5b71868@github.com>
References: <6JeLSyWD6twDLD83OPiG-_5lTgGVHn8dh-rKkc7scmM=.9b7be0e7-cb20-44c6-8d28-d72c00d41edf@github.com> <4eyogulkaSvi1d-xVbPCAp_mwRSD5sHyfysJj2Gat2A=.abfd20ed-c21d-4150-b25c-e4f9a5b71868@github.com>
Message-ID: <2S-r7TWONVjw1Gs91lnD2YE-rJhxxJm_K7v96vwBlvI=.ddf06e63-ac6b-4a05-9c65-52d002eb1396@github.com>

On Sat, 1 Jul 2023 11:11:15 GMT, Robbin Ehn wrote:

>> Hi, please consider.
>> 
>> We recently had a bug where user were missing permissions to use this syscall.
>> Which caused crashing on, according to hs_err on things like "addi x11, x24, 0" with SIGILL.
>> If it fails it is even possible to execute valid but 'old' instruction which may not lead to a crash, instead the program misbehaves.
>> 
>> To avoid this mess I suggest that we first test the syscall during vm init and we use it directly.
>> This way we can make sure it never fails.
>> >> Tested failing syscall with qemu, tested t1 in qemu, t1 on jh7110 in-progress. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - merge update and nits > - Merge branch 'master' into 8310656 > - added back data barrier > > Signed-off-by: Robbin Ehn > - 8310656: RISC-V: __builtin___clear_cache can fail silently. We had an issue with __builtin___clear_cache and clang-compiled jvm linked with libstdc++ ( instead of libc++) where calling __builtin___clear_cache was resulting in the NOP. This effectively fixes that case as well. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14670#issuecomment-1616664877 From rehn at openjdk.org Sun Jul 2 16:18:09 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Sun, 2 Jul 2023 16:18:09 GMT Subject: Integrated: 8310656: RISC-V: __builtin___clear_cache can fail silently. In-Reply-To: <6JeLSyWD6twDLD83OPiG-_5lTgGVHn8dh-rKkc7scmM=.9b7be0e7-cb20-44c6-8d28-d72c00d41edf@github.com> References: <6JeLSyWD6twDLD83OPiG-_5lTgGVHn8dh-rKkc7scmM=.9b7be0e7-cb20-44c6-8d28-d72c00d41edf@github.com> Message-ID: On Tue, 27 Jun 2023 08:19:47 GMT, Robbin Ehn wrote: > Hi, please consider. > > We recently had a bug where user were missing permissions to use this syscall. > Which caused crashing on, according to hs_err on things like "addi x11, x24, 0" with SIGILL. > If it fails it is even possible to execute valid but 'old' instruction which may not lead to a crash, instead the program misbehaves. > > To avoid this mess I suggest that we first test the syscall during vm init and we use it directly. > This way we can make sure it never fails. > > Tested failing syscall with qemu, tested t1 in qemu, t1 on jh7110 in-progress. This pull request has now been integrated. 
Changeset: faf1b822 Author: Robbin Ehn URL: https://git.openjdk.org/jdk/commit/faf1b822d03b726413d77a2b247dfbbf4db7d57e Stats: 121 lines in 3 files changed: 116 ins; 1 del; 4 mod 8310656: RISC-V: __builtin___clear_cache can fail silently. Reviewed-by: luhenry, stuefe, fyang ------------- PR: https://git.openjdk.org/jdk/pull/14670 From rehn at openjdk.org Sun Jul 2 18:43:21 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Sun, 2 Jul 2023 18:43:21 GMT Subject: [jdk21] RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. Message-ID: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> Hi all, This pull request contains a backport of commit [faf1b822](https://github.com/openjdk/jdk/commit/faf1b822d03b726413d77a2b247dfbbf4db7d57e) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Robbin Ehn on 2 Jul 2023 and was reviewed by Ludovic Henry, Thomas Stuefe and Fei Yang. Thanks! ------------- Commit messages: - Backport faf1b822d03b726413d77a2b247dfbbf4db7d57e Changes: https://git.openjdk.org/jdk21/pull/90/files Webrev: https://webrevs.openjdk.org/?repo=jdk21&pr=90&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8310656 Stats: 121 lines in 3 files changed: 116 ins; 1 del; 4 mod Patch: https://git.openjdk.org/jdk21/pull/90.diff Fetch: git fetch https://git.openjdk.org/jdk21.git pull/90/head:pull/90 PR: https://git.openjdk.org/jdk21/pull/90 From dnsimon at openjdk.org Sun Jul 2 21:51:59 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Sun, 2 Jul 2023 21:51:59 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v4] In-Reply-To: References: Message-ID: > In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. 
The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. > > This PR proposes a `TraceClassLoadingCause` VM flag: > > > product(ccstr, TraceClassLoadingCause, nullptr, DIAGNOSTIC, \ > "Print a stack trace when loading a class whose fully" \ > "qualified name contains this string ("*" matches " \ > "any class).") \ > > > I would have liked to implement this using Unified Logging but UL has no support for filtering on the class names. > > Example usage: > > java -XX:+UnlockDiagnosticVMOptions -XX:TraceClassLoadingCause=Thread --version > Loading java.lang.Thread > Loading java.lang.Thread$FieldHolder > Loading java.lang.Thread$Constants > Loading java.lang.Thread$UncaughtExceptionHandler > Loading java.lang.ThreadGroup > Loading java.lang.BaseVirtualThread > Loading java.lang.VirtualThread > Loading java.lang.ThreadBuilders$BoundVirtualThread > Loading java.lang.Thread$State > at jdk.internal.misc.VM.toThreadState(java.base/VM.java:336) > at java.lang.Thread.threadState(java.base/Thread.java:2731) > at java.lang.Thread.isTerminated(java.base/Thread.java:2738) > at java.lang.Thread.getThreadGroup(java.base/Thread.java:1957) > at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:314) > at java.lang.System.initPhase1(java.base/System.java:2206) > Loading java.lang.Thread$ThreadIdentifiers > at java.lang.Thread.(java.base/Thread.java:734) > at java.lang.Thread.(java.base/Thread.java:1477) > at java.lang.ref.Reference$ReferenceHandler.(java.base/Reference.java:198) > at java.lang.ref.Reference.startReferenceHandlerThread(java.base/Reference.java:300) > at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:318) > at java.lang.System.initPhase1(java.base/System.java:2206) > Loading java.lang.ref.Finalizer$FinalizerThread > at 
java.lang.ref.Finalizer.startFinalizerThread(java.base/Finalizer.java:187) > at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:319) > at java.lang.System.initPhase1(java.base/System.java:2206) > ... Doug Simon has updated the pull request incrementally with one additional commit since the last revision: add class+load+cause and make its output independent of class+load ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14553/files - new: https://git.openjdk.org/jdk/pull/14553/files/535ba7ea..6947f359 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14553&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14553&range=02-03 Stats: 74 lines in 4 files changed: 59 ins; 12 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14553.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14553/head:pull/14553 PR: https://git.openjdk.org/jdk/pull/14553 From dnsimon at openjdk.org Sun Jul 2 22:01:05 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Sun, 2 Jul 2023 22:01:05 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v4] In-Reply-To: References: Message-ID: <2bq5KM_Xp0LgHViKC5wlMKTFCboT_4Isw19tjfVZKKY=.8243f6f9-c615-4c29-821b-48420f472b22@github.com> On Sun, 2 Jul 2023 21:51:59 GMT, Doug Simon wrote: >> In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. 
>> >> This PR proposes a `TraceClassLoadingCause` VM flag: >> >> >> product(ccstr, TraceClassLoadingCause, nullptr, DIAGNOSTIC, \ >> "Print a stack trace when loading a class whose fully" \ >> "qualified name contains this string ("*" matches " \ >> "any class).") \ >> >> >> I would have liked to implement this using Unified Logging but UL has no support for filtering on the class names. >> >> Example usage: >> >> java -XX:+UnlockDiagnosticVMOptions -XX:TraceClassLoadingCause=Thread --version >> Loading java.lang.Thread >> Loading java.lang.Thread$FieldHolder >> Loading java.lang.Thread$Constants >> Loading java.lang.Thread$UncaughtExceptionHandler >> Loading java.lang.ThreadGroup >> Loading java.lang.BaseVirtualThread >> Loading java.lang.VirtualThread >> Loading java.lang.ThreadBuilders$BoundVirtualThread >> Loading java.lang.Thread$State >> at jdk.internal.misc.VM.toThreadState(java.base/VM.java:336) >> at java.lang.Thread.threadState(java.base/Thread.java:2731) >> at java.lang.Thread.isTerminated(java.base/Thread.java:2738) >> at java.lang.Thread.getThreadGroup(java.base/Thread.java:1957) >> at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:314) >> at java.lang.System.initPhase1(java.base/System.java:2206) >> Loading java.lang.Thread$ThreadIdentifiers >> at java.lang.Thread.(java.base/Thread.java:734) >> at java.lang.Thread.(java.base/Thread.java:1477) >> at java.lang.ref.Reference$ReferenceHandler.(java.base/Reference.java:198) >> at java.lang.ref.Reference.startReferenceHandlerThread(java.base/Reference.java:300) >> at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:318) >> at java.lang.System.initPhase1(java.base/System.java:2206) >> Loading java.lang.ref.Finalizer$FinalizerThread >> at java.lang.ref.Finalizer.startFinalizerThread(java.base/Finalizer.java:187) >> at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:319) >> at java.lang.Sy... 
> > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > add class+load+cause and make its output independent of class+load I've now put this feature under UL, activated by `-Xlog:class+load+cause` or `-Xlog:class+load+cause+native`. I've made its output independent of `-Xlog:class+load`. `java -XX:LogClassLoadingCauseFor=java.util.concurrent -Xlog:class+load+cause --version` [class_load_cause.txt](https://github.com/openjdk/jdk/files/11931534/class_load_cause.txt) `java -XX:LogClassLoadingCauseFor=java.util.concurrent -Xlog:class+load+cause+native --version` [class_load_cause_native.txt](https://github.com/openjdk/jdk/files/11931539/class_load_cause_native.txt) `java -XX:LogClassLoadingCauseFor=java.util.concurrent "-Xlog:class+load+cause*" --version` [class_load_cause*.txt](https://github.com/openjdk/jdk/files/11931543/class_load_cause.txt) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14553#issuecomment-1616845720 From dholmes at openjdk.org Sun Jul 2 22:31:53 2023 From: dholmes at openjdk.org (David Holmes) Date: Sun, 2 Jul 2023 22:31:53 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v2] In-Reply-To: References: <_udPW4ZRrZI7JzWpd0452TFz7IIWOp_DyT4 2a391nto=.f8e9110f-3829-43d8-946b-02eed92c792d@github.com> <8z5FwQArdjT4W8mUDqBx5_GVIbvrSdAl2rl-cY_XibY=.91bfb772-40a8-4bf0-9bb7-977f243d5d22@github.com> Message-ID: On Fri, 30 Jun 2023 20:35:44 GMT, Doug Simon wrote: >> Usually, the user would need to specify both. 
Otherwise just having the cause without the `class+load` would be rather useless >> >> >> java -Xlog:class+load -Xlog:class+load+cause >> >> >> Which can be combined like: >> >> >> java -Xlog:class+load,class+load+cause >> >> >> or >> >> >> java -Xlog:class+load*=info >> >> >> (To see usages of `-Xlog`, you can grep for `[-]Xlog:` in the *.java files under test/hotspot/jtreg) >> >> >> In terms of implementing this PR, I would suggest handling the class+load and class+load+cause independently. Something like >> >> >> >> void InstanceKlass::print_class_load_logging(ClassLoaderData* loader_data, >> const ModuleEntry* module_entry, >> const ClassFileStream* cfs) const { >> if (ClassListWriter::is_enabled()) { >> ClassListWriter::write(this, cfs); >> } >> >> log_class_load(...); // handle -Xlog:class+load >> // (the rest of the old print_class_load_logging() function) >> log_class_load_cause(...); // handle -Xlog:class+load+cause >> } >> >> >> The output would be something like >> >> >> [0.014s][info][class,load] java.lang.Thread$State source: shared objects file >> [0.014s][info][class,load,cause] at jdk.internal.misc.VM.toThreadState(java.base/VM.java:336) >> [0.014s][info][class,load,cause] at java.lang.Thread.threadState(java.base/Thread.java:2731) >> [0.014s][info][class,load,cause] at java.lang.Thread.isTerminated(java.base/Thread.java:2738) >> [0.014s][info][class,load,cause] at java.lang.Thread.getThreadGroup(java.base/Thread.java:1957) >> ... >> >> >> This way, you don't need to repeat the class name, which is already printed by `-Xlog:class+load` > >> Also, this is always JavaThread::current() because all the callers have the current JavaThread. No need for the null check or the cast. > > Does this mean this existing type check and cast is unnecessary: https://github.com/openjdk/jdk/blob/e8ff74c7e84ec2440a51fee1b4c45e87332807a0/src/hotspot/share/oops/instanceKlass.cpp#L3821-L3823 Yes.
I originally raised the possibility of the current thread not being set yet, but then when I checked, it would always be set. But you chose to add in the check "just in case". ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1249844732 From dholmes at openjdk.org Sun Jul 2 22:36:54 2023 From: dholmes at openjdk.org (David Holmes) Date: Sun, 2 Jul 2023 22:36:54 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v2] In-Reply-To: References: <_udPW4ZRrZI7JzWpd0452TFz7IIWOp_DyT4 2a391nto=.f8e9110f-3829-43d8-946b-02eed92c792d@github.com> <8z5FwQArdjT4W8mUDqBx5_GVIbvrSdAl2rl-cY_XibY=.91bfb772-40a8-4bf0-9bb7-977f243d5d22@github.com> Message-ID: On Sun, 2 Jul 2023 22:29:16 GMT, David Holmes wrote: >>> Also, this is always JavaThread::current() because all the callers have the current JavaThread. No need for the null check or the cast. >> >> Does this mean this existing type check and cast is unnecessary: https://github.com/openjdk/jdk/blob/e8ff74c7e84ec2440a51fee1b4c45e87332807a0/src/hotspot/share/oops/instanceKlass.cpp#L3821-L3823 > > Yes. I originally raised the possibility of the current thread not being set yet, but then when I checked, it would always be set. But you chose to add in the check "just in case". > This way, you don't need to repeat the class name, which is already printed by -Xlog:class+load No, this is bad form IMO. It only works if the person enabling logging knows that these two have to be specified together - and there is nothing that actually documents what the different log tags do. Using multi-level tags is tricky to get right. If you assume they are dependent then specifying one without the other leads to strange-looking output because bits are missing. But if you specify both, or use wildcards, you get repetition. This problem arises when you try to use a tag as a "level of detail" rather than as a classification.
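The interaction between exact tag sets and wildcards described above can be sketched in a few lines. This is only a minimal model of how a `-Xlog` selection picks messages, assuming the documented rule that a selection without `*` matches only the identical tag set while a trailing `*` matches any tag set containing the listed tags. `TagSelection` and `selects` are hypothetical names for illustration and are not HotSpot code.

```cpp
#include <algorithm>
#include <set>
#include <string>

// Hypothetical model of one -Xlog selection, e.g. "class+load" or "class+load*".
struct TagSelection {
    std::set<std::string> tags;  // the tags joined with '+' on the command line
    bool wildcard;               // true when the selection ends with '*'
};

// Returns true if a message logged with `message_tags` is picked by `sel`.
// Without a wildcard the two tag sets must be identical; with a wildcard the
// message only has to carry every tag named in the selection.
bool selects(const TagSelection& sel, const std::set<std::string>& message_tags) {
    if (!sel.wildcard) {
        return sel.tags == message_tags;
    }
    // std::set iterates in sorted order, which std::includes requires.
    return std::includes(message_tags.begin(), message_tags.end(),
                         sel.tags.begin(), sel.tags.end());
}
```

Under this model `class+load` alone never picks up `class+load+cause` lines, whereas `class+load*` picks up both, which is why output that is split across the two tag sets either looks incomplete or repeats itself depending on how the user writes the selection.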
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1249847278 From dholmes at openjdk.org Sun Jul 2 22:43:05 2023 From: dholmes at openjdk.org (David Holmes) Date: Sun, 2 Jul 2023 22:43:05 GMT Subject: RFR: 8303086: SIGSEGV in JavaThread::is_interp_only_mode() In-Reply-To: <5PhlR3J_Cg-n5DSnX258xNjol34M6eqgTd3AZaoDNsc=.4c7d9382-9f5d-4bdd-b985-7ffc4149da67@github.com> References: <5PhlR3J_Cg-n5DSnX258xNjol34M6eqgTd3AZaoDNsc=.4c7d9382-9f5d-4bdd-b985-7ffc4149da67@github.com> Message-ID: On Fri, 30 Jun 2023 11:27:58 GMT, Serguei Spitsyn wrote: > The JVMTI function `SetEventNotificationMode` can set notification mode globally (`event_thread == nullptr`) for all threads or for a specific thread (`event_thread != nullptr`). To get a stable mount/unmount view of virtual threads a JvmtiVTMSTransitionDisabler helper object is created: > `JvmtiVTMSTransitionDisabler disabler(event_thread);` > > If `event_thread == nullptr`, the VTMS transitions are disabled for all virtual threads, > otherwise they are disabled for a specific thread if it is virtual. > The call to `JvmtiEventController::set_user_enabled()` makes a call to `recompute_enabled()` at the end of its work to do the required bookkeeping. As part of this work, the `recompute_thread_enabled(state)` is called for each thread from the `ThreadsListHandle`, not only for the given `event_thread`: > > ThreadsListHandle tlh; > for (; state != nullptr; state = state->next()) { > any_env_thread_enabled |= recompute_thread_enabled(state); > } > > This can cause crashes as VTMS transitions for other virtual threads are allowed. > Crashes are observed in this small function: > > bool is_interp_only_mode() { > return _thread == nullptr ? _saved_interp_only_mode != 0 : _thread->is_interp_only_mode(); > } > > If `_thread != nullptr`, the call `_thread->is_interp_only_mode()` needs to be executed. > But the field `_thread` may already have been changed to `nullptr` by a VTMS transition.
> > The fix is to always disable all transitions. > Thanks to Dan and Patricio for great analysis of this crash! > > Testing: > - In progress: mach5 tiers 1-6 src/hotspot/share/prims/jvmtiEnv.cpp line 578: > 576: record_class_file_load_hook_enabled(); > 577: } > 578: JvmtiVTMSTransitionDisabler disabler; An explanatory comment would have been good. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14728#discussion_r1249852707 From dholmes at openjdk.org Mon Jul 3 00:32:56 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 3 Jul 2023 00:32:56 GMT Subject: RFR: 8310948: Fix ignored-qualifiers warning in Hotspot In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 12:22:43 GMT, Daniel Jeliński wrote: > Please review this attempt to fix ignored-qualifiers warning. > > Example warnings: > > src/hotspot/share/oops/method.hpp:413:19: warning: 'volatile' type qualifier on return type has no effect [-Wignored-qualifiers] > CompiledMethod* volatile code() const; > ^~~~~~~~~ > > > src/hotspot/share/jfr/periodic/jfrModuleEvent.cpp:65:20: warning: type qualifiers ignored on cast result type [-Wignored-qualifiers] > 65 | event.set_source((const ModuleEntry* const)from_module); > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > The proposed fix removes the ignored qualifiers. > In a few AD files I replaced `const` with `constexpr` where I noticed that the method is returning a compile-time constant, and other platforms use `constexpr` on the same method. > > Release, debug and slowdebug builds on Aarch64 / x64 and Mac / Linux complete without errors. Cross-compile GHA builds also pass. I will approve this as-is but have to wonder whether many of these cases of const return types were intending to declare const functions? P.S. Forgot to say thanks for dealing with this!
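The difference between a `const` return type and a `const` member function can be shown with a small, self-contained sketch. `DemoMatcher` and its members are hypothetical stand-ins, not HotSpot's `Matcher`, and the pragma is there only so the sketch still compiles when the warning under discussion is promoted to an error.

```cpp
#include <type_traits>

// Silence the very warning being discussed so this sketch builds even with
// -Wextra -Werror. GCC and Clang both understand this pragma.
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wignored-qualifiers"

// Hypothetical stand-in for illustration only.
struct DemoMatcher {
    int limit = 10;

    // 'const' applied to a by-value return type: the call expression still
    // has type plain 'bool', so the qualifier has no effect.
    static const bool rule_supported(int opcode) { return opcode >= 0; }

    // 'const' after the parameter list qualifies the implicit 'this', so the
    // function may be called on a const object. A static member function has
    // no 'this', which is why it cannot be written this way:
    //   static bool rule_supported(int opcode) const;  // ill-formed
    bool below_limit(int opcode) const { return opcode < limit; }
};

#pragma GCC diagnostic pop

// The qualifier on the return type really is dropped.
static_assert(std::is_same<decltype(DemoMatcher::rule_supported(0)), bool>::value,
              "const on a scalar return type is ignored");
```

Removing the leading `const` therefore changes nothing about how such a function can be used, while the trailing `const` is a separate property that applies only to non-static member functions.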
src/hotspot/cpu/aarch64/aarch64.ad line 2288: > 2286: //============================================================================= > 2287: > 2288: const bool Matcher::match_rule_supported(int opcode) { Have to wonder if these were all meant to be `bool Match:xxx() const {`? ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14674#pullrequestreview-1510051989 PR Comment: https://git.openjdk.org/jdk/pull/14674#issuecomment-1617042549 PR Review Comment: https://git.openjdk.org/jdk/pull/14674#discussion_r1249926982 From dholmes at openjdk.org Mon Jul 3 00:55:58 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 3 Jul 2023 00:55:58 GMT Subject: RFR: 8311077: Fix -Wconversion warnings in jvmti code [v4] In-Reply-To: References: <01jxrp1E-_5hGZ91QI0Og2XgbksXszXHWSzRiiuX9OM=.8d132e37-3821-4b48-a8e4-c3f2efb7b3ea@github.com> Message-ID: On Fri, 30 Jun 2023 13:10:06 GMT, Coleen Phillimore wrote: >> Please review change for mostly fixing return types in the constant pool and metadata to fix -Wconversion warnings in JVMTI code. The order of preference for changes are: 1. change the types to more distinct types (fields in the constant pool are u2 because that's their size in the classfile), 2. add direct int casts if the value has been checked in asserts above, and 3. checked_cast<> if not verified, and 4. added some pointer_delta_as_ints where needed. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > David's suggestions. src/hotspot/share/runtime/jfieldIDWorkaround.hpp line 92: > 90: result &= small_offset_mask; // cut off the hash bits > 91: } > 92: return result; Doesn't this trigger a warning as we go from unsigned to signed? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14710#discussion_r1249960932 From dholmes at openjdk.org Mon Jul 3 01:10:56 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 3 Jul 2023 01:10:56 GMT Subject: RFR: 8311077: Fix -Wconversion warnings in jvmti code [v4] In-Reply-To: References: <01jxrp1E-_5hGZ91QI0Og2XgbksXszXHWSzRiiuX9OM=.8d132e37-3821-4b48-a8e4-c3f2efb7b3ea@github.com> Message-ID: On Fri, 30 Jun 2023 13:10:06 GMT, Coleen Phillimore wrote: >> Please review change for mostly fixing return types in the constant pool and metadata to fix -Wconversion warnings in JVMTI code. The order of preference for changes are: 1. change the types to more distinct types (fields in the constant pool are u2 because that's their size in the classfile), 2. add direct int casts if the value has been checked in asserts above, and 3. checked_cast<> if not verified, and 4. added some pointer_delta_as_ints where needed. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > David's suggestions. As a general rule I'd prefer to see the lowest-level functions use the types of the actual thing they are exposing - so for anything CP related we return u1/u2/u4 - and then have higher-level code do any convenience conversions. But I realize this is tricky to deal with and there are alternatives and trade-offs in where these warnings get silenced. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14710#pullrequestreview-1510139930 From dholmes at openjdk.org Mon Jul 3 01:23:10 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 3 Jul 2023 01:23:10 GMT Subject: RFR: 8311077: Fix -Wconversion warnings in jvmti code [v3] In-Reply-To: References: <01jxrp1E-_5hGZ91QI0Og2XgbksXszXHWSzRiiuX9OM=.8d132e37-3821-4b48-a8e4-c3f2efb7b3ea@github.com> Message-ID: On Fri, 30 Jun 2023 12:49:50 GMT, Coleen Phillimore wrote: >> src/hotspot/share/prims/jvmtiRawMonitor.cpp line 385: >> >>> 383: OrderAccess::fence(); >>> 384: >>> 385: int save = _recursions; >> >> `_recursions` is `intx` > > JvmtiRawMonitor _recursions is an int. Maybe it shouldn't be. You could file an RFE to change that if it's wrong. > > > volatile int _recursions; // recursion count, 0 for first entry Sorry, yes was looking at the wrong `_recursions`. `int` is fine here, `intx` is odd as the max expected recursions should not depend on 32-bit versus 64-bit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14710#discussion_r1249983979 From david.holmes at oracle.com Mon Jul 3 01:23:51 2023 From: david.holmes at oracle.com (David Holmes) Date: Mon, 3 Jul 2023 11:23:51 +1000 Subject: RFR: 8311077: Fix -Wconversion warnings in jvmti code [v4] In-Reply-To: References: <01jxrp1E-_5hGZ91QI0Og2XgbksXszXHWSzRiiuX9OM=.8d132e37-3821-4b48-a8e4-c3f2efb7b3ea@github.com> Message-ID: <655de4d5-a299-30e5-8633-c572fcad8cf0@oracle.com> On 3/07/2023 10:55 am, David Holmes wrote: > On Fri, 30 Jun 2023 13:10:06 GMT, Coleen Phillimore wrote: > >>> Please review change for mostly fixing return types in the constant pool and metadata to fix -Wconversion warnings in JVMTI code. The order of preference for changes are: 1. change the types to more distinct types (fields in the constant pool are u2 because that's their size in the classfile), 2. add direct int casts if the value has been checked in asserts above, and 3. checked_cast<> if not verified, and 4. 
added some pointer_delta_as_ints where needed. >>> Tested with tier1-4. >> >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> David's suggestions. > > src/hotspot/share/runtime/jfieldIDWorkaround.hpp line 92: > >> 90: result &= small_offset_mask; // cut off the hash bits >> 91: } >> 92: return result; > > Doesn't this trigger a warning as we go from unsigned to signed? Weird: I was sure I deleted this comment. Please ignore. I can't even find it in the PR so responding via email. David > ------------- > > PR Review Comment: https://git.openjdk.org/jdk/pull/14710#discussion_r1249960932 From dholmes at openjdk.org Mon Jul 3 01:49:53 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 3 Jul 2023 01:49:53 GMT Subject: RFR: 8311180: Remove unused unneeded definitions from globalDefinitions [v3] In-Reply-To: References: Message-ID: On Sat, 1 Jul 2023 15:05:13 GMT, Coleen Phillimore wrote: >> I noticed this cleanup in a patch that Axel shared with me that I thought should be pushed on its own as trivial. >> Tested with tier1 on Oracle supported platforms and looked for these on the others. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > This should be uintptr_t. Seems okay. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14737#pullrequestreview-1510192346 From dholmes at openjdk.org Mon Jul 3 01:49:55 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 3 Jul 2023 01:49:55 GMT Subject: RFR: 8311180: Remove unused unneeded definitions from globalDefinitions [v2] In-Reply-To: References: Message-ID: On Sat, 1 Jul 2023 15:03:48 GMT, Thomas Stuefe wrote: >> Hi Thomas, thanks for noticing. I messed this one up. >> >> Are you arguing for address_word to avoid this confusion? That's a lot of reeducation in the code. 
> > Hi Coleen, no, I was just wondering if the decision was deliberate. I may have overlooked a reason for this. @tstuefe `p2i()` is mainly used for printing, and the format uses hex so the signed-ness is immaterial. But in general, yes this seems an odd choice. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14737#discussion_r1250018263 From yyang at openjdk.org Mon Jul 3 02:02:05 2023 From: yyang at openjdk.org (Yi Yang) Date: Mon, 3 Jul 2023 02:02:05 GMT Subject: RFR: 8284877: Check type compatibility before looking up method from receiver's vtable [v2] In-Reply-To: References: Message-ID: On Mon, 19 Jun 2023 02:41:58 GMT, David Holmes wrote: >>> validate_call >> >> This check is exercised when jni_CallStaticVoidMethod is called, while aforementioned case is that JNI wrongly constructs an object and does a normal virtual call and gets correct result. > > @y1yang0 as far as I can see it is already being called via the following macro/wrapper: > > #define WRAPPER_CallMethod(ResultType, Result) \ > JNI_ENTRY_CHECKED(ResultType, \ > checked_jni_Call##Result##Method(JNIEnv *env, \ > jobject obj, \ > jmethodID methodID, \ > ...)) \ > functionEnter(thr); \ > va_list args; \ > IN_VM( \ > jniCheck::validate_call(thr, nullptr, methodID, obj); \ > ) \ > va_start(args,methodID); \ > ResultType result =UNCHECKED()->Call##Result##MethodV(env, obj, methodID, \ > args); \ > va_end(args); \ > thr->set_pending_jni_exception_check("Call"#Result"Method"); \ > functionExit(thr); \ > return result; \ > JNI_END \ @dholmes-ora > as far as I can see it is already being called via the following macro/wrapper: I mean, users constructs object by JNI and crash when interpreting `odpsFileStatus.getPath().toString()`, `odpsFileStatus.getPath().toString()` is not a JNI call. 
The difference between this JNI example and the previous Unsafe example is that in the latter case, users knew that there might be errors in their code because the JVM crashed, while in the former case, users didn't know their code was erroneous and the JVM also didn't throw any exceptions (prior to this patch). @dcubed-ojdk > If the VM changes are backed out, what does the new test do? If this patch is reverted, the attached test causes the JVM to crash somewhat randomly. ------------- PR Comment: https://git.openjdk.org/jdk/pull/8241#issuecomment-1617109332 From dholmes at openjdk.org Mon Jul 3 02:15:01 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 3 Jul 2023 02:15:01 GMT Subject: RFR: 8284877: Check type compatibility before looking up method from receiver's vtable [v2] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 01:59:33 GMT, Yi Yang wrote: > I mean, users constructs object by JNI and crash when interpreting odpsFileStatus.getPath().toString(), odpsFileStatus.getPath().toString() is not a JNI call. Sorry I don't understand what you are referring to. Java code is checked by bytecode verification. If you are using Unsafe then it is up to you to use it correctly just like JNI. ------------- PR Comment: https://git.openjdk.org/jdk/pull/8241#issuecomment-1617117955 From yyang at openjdk.org Mon Jul 3 02:27:07 2023 From: yyang at openjdk.org (Yi Yang) Date: Mon, 3 Jul 2023 02:27:07 GMT Subject: RFR: 8284877: Check type compatibility before looking up method from receiver's vtable [v2] In-Reply-To: References: Message-ID: <5cRhxbprtjXAVknAdi8TFul85LNQ7mFI6mFFQsJi_2c=.085ea60e-8af5-497a-a5b1-3c62df90ea79@github.com> On Mon, 3 Jul 2023 02:12:36 GMT, David Holmes wrote: > > I mean, users constructs object by JNI and crash when interpreting odpsFileStatus.getPath().toString(), odpsFileStatus.getPath().toString() is not a JNI call. > > Sorry I don't understand what you are referring to. Java code is checked by bytecode verification.
If you are using Unsafe then it is up to you to use it correctly just like JNI. Yes, the problem is that the Java code `odpsFileStatus.getPath().toString()` works even if `odpsFileStatus.getPath()` is a String object (it was declared as xxx.Path) ------------- PR Comment: https://git.openjdk.org/jdk/pull/8241#issuecomment-1617125429 From dholmes at openjdk.org Mon Jul 3 02:36:08 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 3 Jul 2023 02:36:08 GMT Subject: RFR: 8284877: Check type compatibility before looking up method from receiver's vtable [v3] In-Reply-To: <1B37d_ZmC67RJpxgo-BpisYKwtymIrBvEK_BASxVWaM=.f1253c77-5746-412e-8ef7-65380e48bb6b@github.com> References: <1B37d_ZmC67RJpxgo-BpisYKwtymIrBvEK_BASxVWaM=.f1253c77-5746-412e-8ef7-65380e48bb6b@github.com> Message-ID: On Fri, 16 Jun 2023 02:26:23 GMT, Yi Yang wrote: >> Hi, can I have a review for this enhancement? This patch adds type compatibility check before method lookup for robustness. In some internal cases, serialization framework may improperly generate an object of wrong type, which leads JVM randomly crashes during method resolution. >> >> For example: >> >> invokevirtual selected method: receiver-class:java.util.ArrayList, resolved-class:com.taobao.forest.domain.util.LongMapSupportArrayList, resolved_method:com.taobao.forest.domain.util.LongMapSupportArrayList.toMap()Ljava/util/Map;, selected_method:0x458, vtable_index:56# >> >> The real type of receiver is ArrayList, while the resolved method is LongMapSupportArrayList.toMap. VM attempts to select method as if looking up from receiver's vtable via vtable index of resolved method(i.e. attempts to lookup `toMap()` from >> ArrayList), an invalid method or incorrect method would be selected, thus causing some strange crashes. >> >> I think it's reasonable to add a type compatibility check before method lookup. If such an incompatible call is found, JVM could throw an exception instead.
> > Yi Yang has updated the pull request incrementally with one additional commit since the last revision: > > additional type check when -Xcheck:jni If you are saying you have plain Java code (no use of Unsafe or JNI) that is crashing then please provide a reproducer for that crash. ------------- PR Comment: https://git.openjdk.org/jdk/pull/8241#issuecomment-1617130547 From dholmes at openjdk.org Mon Jul 3 06:40:56 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 3 Jul 2023 06:40:56 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v4] In-Reply-To: References: Message-ID: On Sun, 2 Jul 2023 21:51:59 GMT, Doug Simon wrote: >> In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. >> >> This PR proposes a `TraceClassLoadingCause` VM flag: >> >> >> product(ccstr, TraceClassLoadingCause, nullptr, DIAGNOSTIC, \ >> "Print a stack trace when loading a class whose fully" \ >> "qualified name contains this string ("*" matches " \ >> "any class).") \ >> >> >> I would have liked to implement this using Unified Logging but UL has no support for filtering on the class names. 
>> >> Example usage: >> >> java -XX:+UnlockDiagnosticVMOptions -XX:TraceClassLoadingCause=Thread --version >> Loading java.lang.Thread >> Loading java.lang.Thread$FieldHolder >> Loading java.lang.Thread$Constants >> Loading java.lang.Thread$UncaughtExceptionHandler >> Loading java.lang.ThreadGroup >> Loading java.lang.BaseVirtualThread >> Loading java.lang.VirtualThread >> Loading java.lang.ThreadBuilders$BoundVirtualThread >> Loading java.lang.Thread$State >> at jdk.internal.misc.VM.toThreadState(java.base/VM.java:336) >> at java.lang.Thread.threadState(java.base/Thread.java:2731) >> at java.lang.Thread.isTerminated(java.base/Thread.java:2738) >> at java.lang.Thread.getThreadGroup(java.base/Thread.java:1957) >> at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:314) >> at java.lang.System.initPhase1(java.base/System.java:2206) >> Loading java.lang.Thread$ThreadIdentifiers >> at java.lang.Thread.(java.base/Thread.java:734) >> at java.lang.Thread.(java.base/Thread.java:1477) >> at java.lang.ref.Reference$ReferenceHandler.(java.base/Reference.java:198) >> at java.lang.ref.Reference.startReferenceHandlerThread(java.base/Reference.java:300) >> at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:318) >> at java.lang.System.initPhase1(java.base/System.java:2206) >> Loading java.lang.ref.Finalizer$FinalizerThread >> at java.lang.ref.Finalizer.startFinalizerThread(java.base/Finalizer.java:187) >> at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:319) >> at java.lang.Sy... > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > add class+load+cause and make its output independent of class+load One nit but otherwise I think this looks okay. Isolating from class+load seems reasonable. Thanks for the sample output. 
src/hotspot/share/oops/instanceKlass.cpp line 3839: > 3837: if (log_cause_native || log_is_enabled(Info, class, load, cause)) { > 3838: JavaThread* current = JavaThread::current(); > 3839: ResourceMark rm; Nit: use `ResourceMark rm(current);` please src/hotspot/share/oops/instanceKlass.cpp line 3869: > 3867: info_stream.print_cr("Native stack when loading %s:", name); > 3868: > 3869: // Print each native stack line to the log `LogStream` extends `OutputStream` which has some support for indentation. I thought there should be a way to simplify this, but given you have to break into lines anyway the `\t` seems easier than using any of the built-in indentation functionality. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14553#pullrequestreview-1510357762 PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1250228466 PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1250316925 From djelinski at openjdk.org Mon Jul 3 06:49:58 2023 From: djelinski at openjdk.org (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Mon, 3 Jul 2023 06:49:58 GMT Subject: RFR: 8310948: Fix ignored-qualifiers warning in Hotspot In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 00:19:56 GMT, David Holmes wrote: >> Please review this attempt to fix ignored-qualifiers warning. >> >> Example warnings: >> >> src/hotspot/share/oops/method.hpp:413:19: warning: 'volatile' type qualifier on return type has no effect [-Wignored-qualifiers] >> CompiledMethod* volatile code() const; >> ^~~~~~~~~ >> >> >> src/hotspot/share/jfr/periodic/jfrModuleEvent.cpp:65:20: warning: type qualifiers ignored on cast result type [-Wignored-qualifiers] >> 65 | event.set_source((const ModuleEntry* const)from_module); >> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >> >> >> The proposed fix removes the ignored qualifiers. 
>> In a few AD files I replaced `const` with `constexpr` where I noticed that the method is returning a compile-time constant, and other platforms use `constexpr` on the same method. >> >> Release, debug and slowdebug builds on Aarch64 / x64 and Mac / Linux complete without errors. Cross-compile GHA builds also pass. > > src/hotspot/cpu/aarch64/aarch64.ad line 2288: > >> 2286: //============================================================================= >> 2287: >> 2288: const bool Matcher::match_rule_supported(int opcode) { > > Have to wonder if these were all meant to be `bool Match:xxx() const {`? Yes, I think that may have been the original intent. I'll add const on these functions. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14674#discussion_r1250340625 From aboldtch at openjdk.org Mon Jul 3 07:06:54 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 3 Jul 2023 07:06:54 GMT Subject: RFR: 8311180: Remove unused unneeded definitions from globalDefinitions [v3] In-Reply-To: References: Message-ID: On Sat, 1 Jul 2023 15:05:13 GMT, Coleen Phillimore wrote: >> I noticed this cleanup in a patch that Axel shared with me that I thought should be pushed on its own as trivial. >> Tested with tier1 on Oracle supported platforms and looked for these on the others. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > This should be uintptr_t. Marked as reviewed by aboldtch (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14737#pullrequestreview-1510508553 From thartmann at openjdk.org Mon Jul 3 07:21:12 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 3 Jul 2023 07:21:12 GMT Subject: [jdk21] RFR: 8310829: guarantee(!HAS_PENDING_EXCEPTION) failed in ExceptionTranslation::doit Message-ID: Backport of [JDK-8310829](https://bugs.openjdk.java.net/browse/JDK-8310829). Applies cleanly. 
Thanks, Tobias ------------- Commit messages: - 8310829: guarantee(!HAS_PENDING_EXCEPTION) failed in ExceptionTranslation::doit Changes: https://git.openjdk.org/jdk21/pull/91/files Webrev: https://webrevs.openjdk.org/?repo=jdk21&pr=91&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8310829 Stats: 176 lines in 6 files changed: 123 ins; 21 del; 32 mod Patch: https://git.openjdk.org/jdk21/pull/91.diff Fetch: git fetch https://git.openjdk.org/jdk21.git pull/91/head:pull/91 PR: https://git.openjdk.org/jdk21/pull/91 From djelinski at openjdk.org Mon Jul 3 07:27:58 2023 From: djelinski at openjdk.org (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Mon, 3 Jul 2023 07:27:58 GMT Subject: RFR: 8310948: Fix ignored-qualifiers warning in Hotspot In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 06:47:04 GMT, Daniel Jeliński wrote: >> src/hotspot/cpu/aarch64/aarch64.ad line 2288: >> >>> 2286: //============================================================================= >>> 2287: >>> 2288: const bool Matcher::match_rule_supported(int opcode) { >> >> Have to wonder if these were all meant to be `bool Match:xxx() const {`? > > Yes, I think that may have been the original intent. I'll add const on these functions. ...actually these methods are static, and static functions can't be const-qualified. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14674#discussion_r1250385599 From dholmes at openjdk.org Mon Jul 3 07:33:57 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 3 Jul 2023 07:33:57 GMT Subject: RFR: 8310948: Fix ignored-qualifiers warning in Hotspot In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 07:24:56 GMT, Daniel Jeliński wrote: >> Yes, I think that may have been the original intent. I'll add const on these functions. > > ...actually these methods are static, and static functions can't be const-qualified.
Ah okay :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14674#discussion_r1250393159 From pli at openjdk.org Mon Jul 3 07:37:22 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 07:37:22 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: > ## TL;DR > > This patch completely re-implements C2's experimental post loop vectorization for better stability, maintainability and performance. Compared with the original implementation, this new implementation adds a standalone loop phase in C2's ideal loop phases and can vectorize more post loops. The original implementation and all code related to multi-versioned post loops are deleted in this patch. More details about this patch can be found in the document replied in this pull request. Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: Address part of comments from Emanuel ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14581/files - new: https://git.openjdk.org/jdk/pull/14581/files/11fe4cd6..a58e04e9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14581&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14581&range=00-01 Stats: 172 lines in 8 files changed: 63 ins; 20 del; 89 mod Patch: https://git.openjdk.org/jdk/pull/14581.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14581/head:pull/14581 PR: https://git.openjdk.org/jdk/pull/14581 From pli at openjdk.org Mon Jul 3 07:46:15 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 07:46:15 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Fri, 23 Jun 2023 09:48:27 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/loopnode.cpp line 
2280: > >> 2278: if (!stride_is_con()) { >> 2279: // Stride could be non-constant if a loop is vector masked >> 2280: return 0; > > Could this break the assumption anywhere else that `stride_con != 0`? > I fear that it may just silently succeed everywhere, or do checks like: > > if (stride_con() > 0) { > // assume positive > } else { > // assume negative (now wrong!) > } > > Might it be better to have an assert here, and do the `stride_is_con` checks at the call sites of `stride_con`? I have reverted this change and instead updated `CountedLoopNode::stride_con()` (adding asserts there) to mitigate this potential issue. That function is a call site of this one, and the "int" counted loop transformations call it directly. Before my patch, that function could also return 0. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250410252 From pli at openjdk.org Mon Jul 3 07:46:17 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 07:46:17 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Fri, 23 Jun 2023 10:36:49 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/loopnode.cpp line 4688: >> >>> 4686: for (LoopTreeIterator iter(_ltree_root); !iter.done(); iter.next()) { >>> 4687: IdealLoopTree* lpt = iter.current(); >>> 4688: if (lpt->is_counted() && lpt->is_innermost()) { >> >> Is this applied to all innermost counted loops? Or only post-loops? > > Ah, you do the check inside. Why not lift it out and assert inside? I have lifted the checks out in commit 2 and added assertions inside.
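The "filter at the call site, assert inside" arrangement discussed here can be sketched in isolation. This is a toy model under stated assumptions, not the code from commit 2: `ToyLoop`, `vectorize_post_loop` and `vectorize_all` are hypothetical names, and the four flags only mirror the conditions mentioned in this thread (counted, innermost, post loop, not yet vector masked).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a loop descriptor; not HotSpot's IdealLoopTree.
struct ToyLoop {
  bool counted;
  bool innermost;
  bool post_loop;
  bool vector_masked;
};

// The transformation no longer filters; it only asserts its preconditions.
bool vectorize_post_loop(const ToyLoop& l) {
  assert(l.counted && l.innermost && "caller must filter loops");
  assert(l.post_loop && !l.vector_masked && "caller must filter loops");
  return true;  // placeholder for the real transformation
}

// The caller owns the filtering, so every skip condition is visible in one place.
int vectorize_all(const std::vector<ToyLoop>& loops) {
  int transformed = 0;
  for (const ToyLoop& l : loops) {
    if (!l.counted || !l.innermost) continue;       // skip non-candidates
    if (!l.post_loop || l.vector_masked) continue;  // skip non-post or already masked
    if (vectorize_post_loop(l)) transformed++;
  }
  return transformed;
}
```

With this split, a caller that forgets to filter fails loudly in debug builds instead of silently returning early.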
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250411661 From djelinski at openjdk.org Mon Jul 3 07:55:10 2023 From: djelinski at openjdk.org (Daniel Jeliński) Date: Mon, 3 Jul 2023 07:55:10 GMT Subject: RFR: 8310948: Fix ignored-qualifiers warning in Hotspot In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 12:22:43 GMT, Daniel Jeliński wrote: > Please review this attempt to fix ignored-qualifiers warning. > > Example warnings: > > src/hotspot/share/oops/method.hpp:413:19: warning: 'volatile' type qualifier on return type has no effect [-Wignored-qualifiers] > CompiledMethod* volatile code() const; > ^~~~~~~~~ > > > src/hotspot/share/jfr/periodic/jfrModuleEvent.cpp:65:20: warning: type qualifiers ignored on cast result type [-Wignored-qualifiers] > 65 | event.set_source((const ModuleEntry* const)from_module); > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > The proposed fix removes the ignored qualifiers. > In a few AD files I replaced `const` with `constexpr` where I noticed that the method is returning a compile-time constant, and other platforms use `constexpr` on the same method. > > Release, debug and slowdebug builds on Aarch64 / x64 and Mac / Linux complete without errors. Cross-compile GHA builds also pass. Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14674#issuecomment-1617559670 From djelinski at openjdk.org Mon Jul 3 07:55:11 2023 From: djelinski at openjdk.org (Daniel Jeliński) Date: Mon, 3 Jul 2023 07:55:11 GMT Subject: Integrated: 8310948: Fix ignored-qualifiers warning in Hotspot In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 12:22:43 GMT, Daniel Jeliński wrote: > Please review this attempt to fix ignored-qualifiers warning.
> > Example warnings: > > src/hotspot/share/oops/method.hpp:413:19: warning: 'volatile' type qualifier on return type has no effect [-Wignored-qualifiers] > CompiledMethod* volatile code() const; > ^~~~~~~~~ > > > src/hotspot/share/jfr/periodic/jfrModuleEvent.cpp:65:20: warning: type qualifiers ignored on cast result type [-Wignored-qualifiers] > 65 | event.set_source((const ModuleEntry* const)from_module); > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > > The proposed fix removes the ignored qualifiers. > In a few AD files I replaced `const` with `constexpr` where I noticed that the method is returning a compile-time constant, and other platforms use `constexpr` on the same method. > > Release, debug and slowdebug builds on Aarch64 / x64 and Mac / Linux complete without errors. Cross-compile GHA builds also pass. This pull request has now been integrated. Changeset: 055b4b42 Author: Daniel Jeliński URL: https://git.openjdk.org/jdk/commit/055b4b426cbc56d97e82219f3dd3aba1ebf977e4 Stats: 223 lines in 74 files changed: 0 ins; 0 del; 223 mod 8310948: Fix ignored-qualifiers warning in Hotspot Reviewed-by: kbarrett, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/14674 From pli at openjdk.org Mon Jul 3 07:59:10 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 07:59:10 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Fri, 23 Jun 2023 10:43:34 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/loopnode.hpp line 143: > >> 141: if (is_vector_masked()) { >> 142: return false; >> 143: } > > Does this mean that the post-loop has a `CountedLoop` node, but it does not adhere to the counted-loop assumptions, such as having an `incr`, `limit`, `phi` etc?
With the old post-loop-vectorization, the LoopNode would always fold away, so it would disappear after IGVN. But now it would stick around, right? Could that turn out to be a problem? After being vectorized, the post loop still has `phi`, `incr` and `limit` as before. In other words, the post loop is still a loop. I think the only difference is that the loop stride value is no longer a constant (as we introduce the `VectorMaskTrueCountNode` for the new stride). The old implementation of post loop vectorization makes the vector-masked post loop run only once, so it can optimize the `LoopNode` away. But we cannot do this now without multi-versioning. (Without the scalar post loop, the loop may run too few iterations when the "atomic" post loop is not entered.) > src/hotspot/share/opto/loopnode.hpp line 775: > >> 773: >> 774: void collect_loop_core_nodes(PhaseIdealLoop* phase, Unique_Node_List& wq) const; >> 775: > > nit: why move it? This function was private before. I need to make it public so I can use it in `vmaskloop.cpp`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250428341 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250430133 From epeter at openjdk.org Mon Jul 3 07:59:12 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 3 Jul 2023 07:59:12 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Fri, 23 Jun 2023 11:24:59 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/vectornode.hpp line 1826: > >> 1824: class LoopVectorMaskNode : public TypeNode { >> 1825: private: >> 1826: int _max_trips; > > Add comment: what is this for exactly? Maybe consider adding more elaborate specification/description above the 3 node classes.
> > General code style: I think we are trying to get away from the `//--------------NodeName/FunctionName-------` tags, so no need to add them anymore. That is already much better. Could you please also explain what the inputs mean and do? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250426626 From pli at openjdk.org Mon Jul 3 08:17:10 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 08:17:10 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: <6IkvVTm9e60qXwaID0EihRXlUielrryBWoTmYAp3PuU=.c624b13d-bc6d-4c79-86a6-72bda016b50f@github.com> References: <6IkvVTm9e60qXwaID0EihRXlUielrryBWoTmYAp3PuU=.c624b13d-bc6d-4c79-86a6-72bda016b50f@github.com> Message-ID: On Fri, 23 Jun 2023 10:49:50 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/superword.cpp line 179: > >> 177: assert(_packset.length() == 0, "packset must be empty"); >> 178: success = SLP_extract(); >> 179: if (PostLoopMultiversioning) { > > Could we now have an assert for `cl->is_main_loop()` at the beginning of `SuperWord::transform_loop`, and remove all checks for it in SuperWord? Unfortunately, I just tried updating this but found assertion failures. I see `SuperWord::transform_loop()` is also called in `IdealLoopTree::policy_unroll_slp_analysis()` which can pass a normal loop (the loop before iteration-split). I assume only main loops require unrolling analysis and don't understand why it could be a normal loop. Maybe that's bad code and we need refactor C2's unrolling analysis first. 
> src/hotspot/share/opto/superword.cpp line 632: > >> 630: cl->set_slp_pack_count(_packset.length()); >> 631: } >> 632: } else { > > Again: Could we now have an assert for `cl->is_main_loop()` at the beginning of `SuperWord::SLP_extract`, and remove all checks for it in SuperWord? Ok to do it here as `do_optimization` is false in the unrolling analysis phase. I've updated the code in commit 2. > src/hotspot/share/opto/superword.hpp line 251: > >> 249: int count_size(int size) { >> 250: return _stats[exact_log2(size)]; >> 251: } > > Add assert from `record_size`? Done, thanks! > src/hotspot/share/opto/superword.hpp line 666: > >> 664: IdealLoopTree* lpt() const { return _lpt; } >> 665: PhiNode* iv() const { >> 666: return _slp ? _slp->iv() : _lpt->_head->as_CountedLoop()->phi()->as_Phi(); > > I'd suggest either cache it directly from `_lpt->_head->as_CountedLoop()->phi()->as_Phi()`, or just query it directly. Reduce dependence on `_slp`. Good catch! What do you think of getting rid of `_slp` completely in `SWPointer` refactoring? > src/hotspot/share/opto/superword.hpp line 669: > >> 667: } >> 668: >> 669: void init(); > > This is just a helper function for the constructors, right? Maybe move it closer to them? 
Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250443717 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250447018 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250450494 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250452509 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250452658 From pli at openjdk.org Mon Jul 3 08:17:11 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 08:17:11 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: <7iru0xDm4lckuwyHvqGSld0_kWUVYSTg5BT3-rqP3Vw=.cab94902-98aa-40ed-ae13-d238380b6267@github.com> On Fri, 23 Jun 2023 11:02:39 GMT, Emanuel Peter wrote: >> If you are going to do that, I'd suggest doing this refactoring in a separate RFE. It would help in general with any future extension to auto-vectorization. > > Can we untangle it completely from SuperWord? it seems you have made it optional, so yes. And maybe we can also make the trace flags like `_slp->is_trace_alignment()` independent? It would be nice to also be able to trace this for non SuperWord-contexts like post-loop masked vectoriaztion, right? I will try to do this in another JBS and come back here later. >> After all, should the `VectorizeDebug` flag not apply everywhere? See `phase->C->directive()->VectorizeDebugOption`. > > I'd also move this to some static functions in a potential "autovectorization.hpp", and move `_vector_loop_debug` there, together with all its `is_trace...` accessors. I agree current code here is a bit ugly. I will try to make it better in `SWPointer` refactoring. >> Oh dear, I just saw the same pattern in: >> >> bool TypeNode::cmp(const Node& n) const { >> return !Type::cmp(_type, ((TypeNode&)n)._type); >> } >> >> We should try to avoid doing that. 
> > Even if all callers currently ensure that `n` has the correct type, I'd say it is still not a great idea to cast without checking, at least in debug. I searched all C2 code and saw a lot of such patterns. Perhaps doing this in another RFE? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250448780 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250450253 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250456476 From pli at openjdk.org Mon Jul 3 08:17:13 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 08:17:13 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 07:53:34 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/vectornode.hpp line 1826: >> >>> 1824: class LoopVectorMaskNode : public TypeNode { >>> 1825: private: >>> 1826: int _max_trips; >> >> Add comment: what is this for exactly? Maybe consider adding more elaborate specification/description above the 3 node classes. >> >> General code style: I think we are trying to get away from the `//--------------NodeName/FunctionName-------` tags, so no need to add them anymore. > > That is already much better. Could you please also explain what the inputs mean and do? 
Ok, will do that later ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250453061 From pli at openjdk.org Mon Jul 3 08:21:08 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 08:21:08 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: <9J-XGP_2qSJT-EefUtvLMt1HzWHWgtvN3RmanPRDt0I=.71bc5695-7eab-4d29-8ff4-b20f28721247@github.com> References: <9J-XGP_2qSJT-EefUtvLMt1HzWHWgtvN3RmanPRDt0I=.71bc5695-7eab-4d29-8ff4-b20f28721247@github.com> Message-ID: On Fri, 23 Jun 2023 11:58:16 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/vmaskloop.hpp line 85: > >> 83: >> 84: // Some node check utilities >> 85: bool is_loop_iv(Node* n) { return n == _iv; } > > General code style comment, applies everywhere: add more `const` everywhere. To arguments, and the functions themselves, wherever possible. Thanks for pointing out. I added some in commit 2. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250463493 From pli at openjdk.org Mon Jul 3 08:34:10 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 08:34:10 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Fri, 23 Jun 2023 12:22:14 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/vmaskloop.cpp line 63: > >> 61: if (cl->is_vector_masked()) return; >> 62: // Skip non-post loop >> 63: if (!cl->is_post_loop()) return; > > Check before entering, and assert here. 
Done > src/hotspot/share/opto/vmaskloop.hpp line 95: > >> 93: } >> 94: return false; >> 95: } > > Do you not want to do this sort of implementation in `SWPointer` instead? There are already methods like `scaled_iv_plus_offset`, so it would fit in next to that, right? It doesn't fit well as functions in `SWPointer` can only be used for checking the pattern in indices. But this function may be used for checking the loop increment pattern which is not in array indices, perhaps `a[i] = b[i] * (i + 1)`. We don't have `SWPointer` constructed for this. I have rename the function to make the purpose clear. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250480876 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250480274 From pli at openjdk.org Mon Jul 3 08:34:13 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 08:34:13 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: <-Rjm33TqNYRuAQYLf6FL4rnNZOgLQvuDukt5Te-oXNM=.14d0efe5-c5f7-4a2e-8c51-bb18a6f19937@github.com> On Fri, 23 Jun 2023 12:08:34 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/vmaskloop.hpp line 97: >> >>> 95: } >>> 96: >>> 97: bool is_memory_phi(Node* n) { >> >> Looks like a helper method that could live in `node.hpp` or `cfgnode.hpp`. > > SuperWord also makes similar checks, you could refactor those too. Good suggestion. Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250480720 From pli at openjdk.org Mon Jul 3 08:44:09 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 08:44:09 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization In-Reply-To: References: Message-ID: On Fri, 23 Jun 2023 14:06:33 GMT, Emanuel Peter wrote: > Tests are building... 
> > I already am getting this, from our build system: `Toolchain: clang (clang/LLVM from Xcode 12.4)`, for the `macosx-aarch64-...` builds. > > ``` > .../src/hotspot/share/opto/vmaskloop.cpp:970:20: error: format string is not a string literal [-Werror,-Wformat-nonliteral] > tty->vprint_cr(format, ap); > ``` > > That means we won't get any test coverage on those platforms from this test run. Build issues are fixed in commit 2 by removing the `va_list` which is not actually used in current code. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14581#issuecomment-1617635167 From pli at openjdk.org Mon Jul 3 08:44:16 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 08:44:16 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: <63xntDgcTJN-51cfPjP1XsWdNLkeURQuWmE8hluHbIM=.84e6a8b9-66b1-4427-ab2c-355c0c621871@github.com> On Fri, 23 Jun 2023 12:24:57 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/vmaskloop.cpp line 71: > >> 69: if (cl->loopexit()->in(0) != cl) return; >> 70: // Skip if some loop operations are pinned to the backedge >> 71: if (cl->back_control()->outcnt() != 1) return; > > It would be interesting to have some trace flag that tells us why we bailed out here and did not do the post-loop vectorization. Unless of course it becomes too noisy. Great suggestion! Done. > src/hotspot/share/opto/vmaskloop.cpp line 104: > >> 102: _core_set.clear(); >> 103: _body_set.clear(); >> 104: _body_nodes.clear(); > > Would it make sense to somehow reserve the space, so that we do not allocate multiple times when growing these data structures later? Could you elaborate how to do such reservation in C2? Just allocation with some larger sizes at the beginning? Or any other examples to refer? 
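For illustration only, the general idea behind the reservation question can be shown with `std::vector`. This is an analogy, not C2 code: the containers cleared in `vmaskloop.cpp` are HotSpot-specific, and whether they offer an equivalent call is not confirmed here. `count_reallocations` is a hypothetical helper that counts capacity changes while elements are appended.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: count how often a vector's capacity changes while
// `n` elements are appended, with or without reserving capacity first.
std::size_t count_reallocations(std::size_t n, bool reserve_first) {
  std::vector<int> v;
  if (reserve_first) v.reserve(n);  // one allocation up front
  std::size_t reallocations = 0;
  std::size_t last_capacity = v.capacity();
  for (std::size_t i = 0; i < n; i++) {
    v.push_back(static_cast<int>(i));
    if (v.capacity() != last_capacity) {  // the buffer was regrown
      reallocations++;
      last_capacity = v.capacity();
    }
  }
  return reallocations;
}
```

When an upper bound on the element count is known (for example the number of nodes in the loop body), reserving once avoids the repeated grow-and-copy steps.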
> src/hotspot/share/opto/vmaskloop.cpp line 172: > >> 170: if (idx != -1) { >> 171: trace_msg(nullptr, "Loop has unreachable node while traversing from head"); >> 172: return false; > > Can this ever happen? Or could you add an assert here? Yes, it happened before. I will try to find a case. > src/hotspot/share/opto/vmaskloop.cpp line 214: > >> 212: } >> 213: } else if (in->is_Phi()) { >> 214: // 2) We don't support phi nodes except the iv phi of the loop > > Add: and memory phi's cannot be reached. Done > src/hotspot/share/opto/vmaskloop.cpp line 223: > >> 221: return true; >> 222: } else { >> 223: trace_msg(in, "Found unsupported memory load input"); > > This is a bit generic. Would be nice to have more specific info why it is "unsupported". See my example that hit it. Good suggestion! I have added more `trace_msg()` calls in `VectorMaskedLoop::mem_access_to_swpointer`. > src/hotspot/share/opto/vmaskloop.cpp line 269: > >> 267: Node_List* worklist = new Node_List(_arena); >> 268: if (!collect_statements_helper(store, MemNode::ValueIn, stmt, worklist)) { >> 269: return false; > > Why does the `store` need special handling here? Can you not just throw it on the `worklist`? Would be nice to have the code be shorter ;) Good catch! Fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250481354 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250488640 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250483771 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250492279 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250494389 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250492173 From pli at openjdk.org Mon Jul 3 09:07:12 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 09:07:12 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: <0s9ixJcCQIRzJ3h4tpPwVeC7HmYbdDqhd3V6BWZDUTg=.f2b9dad5-73f2-4a34-b24d-639f4fe3de9e@github.com> Message-ID: On Tue, 27 Jun 2023 16:58:52 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/vmaskloop.cpp line 550: >> >>> 548: // 2) Address is growing down (index scale * loop stride < 0) >>> 549: // 3) Memory access scale is different from data size >>> 550: // 4) The loop increment node is on the SWPointer's node stack >> >> Why should the `incr` not be on the node stack? > > Does that not prevent `a[i+1]` from being accepted? That's a really corner case. In C2's ideal graph, most loop statements eventually uses the loop induction variable `phi` node as a input. That's good. But, there is one exception that a loop statement has a sub-expression of `iv + stride`. In this kind of cases, IGVN may do common sub-expression elimination and the inputs may come from the loop increment node thereafter. As the final step of vector masked transformation replaces the loop increment node, the calculation for `iv + stride` will also be replaced as well and it causes mis-compilation. 
In current patch, I duplicate the loop increment pattern for update (that's why we have `is_loop_incr_pattern()`, see commit 2) to avoid this issue, but currently it only applies to the expression not in array indices, such as `a[i] = i + 1`. For the patterns like `a[i+1] = i`, I'm still looking for a better approach to handle. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250523973 From chagedorn at openjdk.org Mon Jul 3 09:13:08 2023 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Mon, 3 Jul 2023 09:13:08 GMT Subject: [jdk21] RFR: 8310829: guarantee(!HAS_PENDING_EXCEPTION) failed in ExceptionTranslation::doit In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 07:13:14 GMT, Tobias Hartmann wrote: > Backport of [JDK-8310829](https://bugs.openjdk.java.net/browse/JDK-8310829). Applies cleanly. > > Thanks, > Tobias Looks good. ------------- Marked as reviewed by chagedorn (Reviewer). PR Review: https://git.openjdk.org/jdk21/pull/91#pullrequestreview-1510737726 From pli at openjdk.org Mon Jul 3 09:15:20 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 09:15:20 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: <0s9ixJcCQIRzJ3h4tpPwVeC7HmYbdDqhd3V6BWZDUTg=.f2b9dad5-73f2-4a34-b24d-639f4fe3de9e@github.com> References: <0s9ixJcCQIRzJ3h4tpPwVeC7HmYbdDqhd3V6BWZDUTg=.f2b9dad5-73f2-4a34-b24d-639f4fe3de9e@github.com> Message-ID: On Fri, 23 Jun 2023 14:44:43 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/vmaskloop.cpp line 548: > >> 546: // Check supported memory access via SWPointer. It's not supported if >> 547: // 1) The constructed SWPointer is invalid >> 548: // 2) Address is growing down (index scale * loop stride < 0) > > Is that a limitation that could be removed in the future? 
Yes, at least on SVE2. For growing up memory accesses, we generate vector masks that indicate active lanes at lower parts of a vector. But it's opposite for growing down memory accesses where active lanes are at higher parts of a vector. Only SVE2 of AArch64 can generate vector masks in this way, current SVE(1) can not. I'm not sure whether x86 AVX-512 has the similar ability. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250534252 From pli at openjdk.org Mon Jul 3 09:23:15 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 09:23:15 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: <0s9ixJcCQIRzJ3h4tpPwVeC7HmYbdDqhd3V6BWZDUTg=.f2b9dad5-73f2-4a34-b24d-639f4fe3de9e@github.com> References: <0s9ixJcCQIRzJ3h4tpPwVeC7HmYbdDqhd3V6BWZDUTg=.f2b9dad5-73f2-4a34-b24d-639f4fe3de9e@github.com> Message-ID: On Fri, 23 Jun 2023 14:45:20 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/vmaskloop.cpp line 549: > >> 547: // 1) The constructed SWPointer is invalid >> 548: // 2) Address is growing down (index scale * loop stride < 0) >> 549: // 3) Memory access scale is different from data size > > I guess this could also be relaxed for strided accesses in the future? Exactly! I have tried supporting some basic strided accesses. The code is not included in this patch as it's not that beneficial on some CPUs and requires more C2 refactorings. 
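As a rough sketch of the two mask layouts mentioned for growing-up and growing-down accesses: the functions below are hypothetical and model a mask as a plain `std::vector<bool>`, with lane 0 assumed to be the lowest lane. They say nothing about how SVE or AVX-512 actually produce such masks.

```cpp
#include <cstddef>
#include <vector>

// Growing-up access: the first `active` lanes (the low end) are enabled.
std::vector<bool> low_lanes_mask(std::size_t lanes, std::size_t active) {
  std::vector<bool> mask(lanes, false);
  for (std::size_t i = 0; i < active && i < lanes; i++) mask[i] = true;
  return mask;
}

// Growing-down access: the last `active` lanes (the high end) are enabled.
std::vector<bool> high_lanes_mask(std::size_t lanes, std::size_t active) {
  std::vector<bool> mask(lanes, false);
  for (std::size_t i = 0; i < active && i < lanes; i++) mask[lanes - 1 - i] = true;
  return mask;
}
```

For 8 lanes and 3 remaining iterations, the first function enables lanes 0..2 and the second enables lanes 5..7.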
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250541657 From pli at openjdk.org Mon Jul 3 09:23:12 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 09:23:12 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Fri, 23 Jun 2023 14:54:27 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/vmaskloop.cpp line 317: >> >>> 315: >>> 316: // Find element basic type for each vectorization candidate node >>> 317: bool VectorMaskedLoop::find_vector_element_types() { >> >> This is very similar to `SuperWord::compute_vector_element_type`. It would be nice to extract it from both and have some shared utility, right? > > Or is there a clear reason why the two are too different? We need more investigation and discussions about this. Will discuss with you later. >> src/hotspot/share/opto/vmaskloop.cpp line 337: >> >>> 335: // For load node, check if it has the same vector element size with >>> 336: // the bottom type of the statement >>> 337: if (!same_element_size(mem_type, stmt_bottom_type)) { >> >> Can this limitation be removed in the future? > > Write: > Vector element size does not match of the store in the statement. Yes, we have tried supporting type conversions (between different type sizes) but current solution is not mature and not included in this patch. So this limitation is added here. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250546161 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250545053 From pli at openjdk.org Mon Jul 3 09:36:10 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 09:36:10 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Fri, 23 Jun 2023 14:56:25 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/vmaskloop.cpp line 363: > >> 361: // Otherwise, use signed subword type or the statement's bottom type >> 362: if (subword_stmt) { >> 363: set_elem_bt(node, get_signed_subword_bt(stmt_bottom_type)); > > Why are you taking only the signed subword type, and not unsigned (eg for char you take short)? Current SuperWord also does it this way (see `SuperWord::container_type()`). A main reason is that some matching rules on some backends (like x86) only match signed subword types. AFAICR, it would be fine to remove this for AArch64. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250564508 From pli at openjdk.org Mon Jul 3 09:48:11 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 3 Jul 2023 09:48:11 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Fri, 23 Jun 2023 15:02:15 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/vmaskloop.cpp line 357: > >> 355: set_elem_bt(node, mem_type); >> 356: } else { >> 357: trace_msg(node, "Subword operand does not have precise type"); > > Not clear to me what this means.
Precise type info about signedness means that we know exactly whether the data is signed or unsigned. For some operations, such as right shift, results are different for signed and unsigned operands, so C2 has to know the signedness. However, in any Java arithmetic operation, operands of Java subword types are promoted to int first. Sometimes, for example if an intermediate result comes from a binary operation on both signed and unsigned operands, we don't have the precise type info, so we don't know how to vectorize it. See the example below, where the signedness info is lost after a short and a char are added:

    for (int i = 0; i < SIZE; i++) {
      shorts[i] = (short) ((shorts[i] + chars[i]) >> 10);
    }

> src/hotspot/share/opto/vmaskloop.cpp line 367: > >> 365: BasicType self_type = node->bottom_type()->array_element_basic_type(); >> 366: if (!same_element_size(self_type, stmt_bottom_type)) { >> 367: trace_msg(node, "Vector element size does not match"); > > does not match with what? size of store of statement? The message is updated. Thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250582207 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250582846 From jsjolen at openjdk.org Mon Jul 3 10:18:59 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Mon, 3 Jul 2023 10:18:59 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect In-Reply-To: References: Message-ID: On Fri, 30 Jun 2023 16:26:43 GMT, Thomas Stuefe wrote: > Today, if we use UseTransparentHugePages, we assume that the static hugepage detection we do is valid for THPs: > - that THPs use the page size (in hotspot used as "default large page size") found in /proc/meminfo (Hugepagesize) > - that THPs are enabled if that page size is >0. > > Both assumptions are incorrect. THPs are enabled depending on the mode in `/sys/kernel/mm/transparent_hugepage/enabled` (tri-state).
And the pagesize used by khugepaged is the one set in `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size`. The latter can differ from the default large page size on the system (e.g. static hugepage default size could be 1g, whereas THP hugepage size is 2m). > > ------ > > About the patch: > > This is a limited, minimally invasive patch to fix THP detection. The patch aims to be easy to downport. There is more work to do, which I will do in subsequent RFEs. > > Functionally, for *static* (non-THP) pages nothing changes. THP-mode now correctly detects THP support in the OS, and uses the correct page size (see examples below). > > ------------- > > Example 1: System has THPs disabled, but static hugepages (1g, 2m) configured: > > > thomas at starfish $ cat /sys/kernel/mm/transparent_hugepage/enabled > always madvise [never] > thomas at starfish $ cat /proc/meminfo | grep Hugepage > Hugepagesize: 1048576 kB > > > Without patch, we incorrectly assume THPs are enabled, and that THP page size is 1G (!), which we then proceed and use as heap page size, causing the heap size to be rounded up from 512m -> 1G. But - even though it is printed as "1G page backed" in log output - in reality it will still 4K-page-backed: the `madvise(2)` we use to set the THP page size will be ignored by the system, since THPs are disabled. > > > thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version > [0.001s][info][pagesize] Using the default large page size: 1G > [0.001s][info][pagesize] Usable page sizes: 4k, 2M, 1G > ... 
> [0.016s][info][pagesize] Heap: min=1G max=1G base=0x00000000c0000000 size=1G page_size=1G > > > With patch, we correctly refuse to use large pages (and we log more info): > > > thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version > [0.001s][info][pagesize] Static hugepage support: 2M, 1G (default) > [0.001s][info][pagesize] default pagesize: 1G > [0.001s][info][pagesize] Transparent hugepage (THP) support: > [0.001s][info][pagesize] mode: never > [0.001s][warning][pagesize] UseLargePages ... Hi Thomas, Thanks for these changes. I have some suggestions for the code, but the conceptual change is good. src/hotspot/os/linux/hugepages.cpp line 65: > 63: // > 64: // If we can't determine the value (e.g. /proc is not mounted, or the text > 65: // format has been changed), we'll set largest page size to 0 This is more of an API contract, move it up to `// Scan /proc/meminfo and return value of Hugepagesize` src/hotspot/os/linux/hugepages.cpp line 68: > 66: > 67: FILE *fp = os::fopen("/proc/meminfo", "r"); > 68: if (fp) { Nit: you could always switch to `if (fp == nullptr) { return 0; }`, reducing indentation for the body that will do actual work. src/hotspot/os/linux/hugepages.cpp line 84: > 82: } > 83: } > 84: } Perhaps this pattern is a bit simpler? ```c++ char* line = nullptr; size_t bufsize = 0; ssize_t nread = 0; /* getline returns ssize_t, -1 on EOF or error */ int x = 0; while ((nread = getline(&line, &bufsize, fp)) != -1) { if (sscanf(line, "Hugepagesize: %d kB", &x) == 1) { /* ... */ } if (sscanf(line, "Hugepagesize: %d", &x) == 1) { /* ... */ } } free(line); // We only need to free once. ``` src/hotspot/os/linux/hugepages.cpp line 93: > 91: // Given a file that contains a single (integral) number, return that number, 0 and false if failed.
> 92: static bool read_number_file(const char* file, size_t* out) { > 93: (*out) = 0; Unnecessary parens src/hotspot/os/linux/hugepages.cpp line 160: > 158: LogStream ls(lt); > 159: print_on(&ls); > 160: } As `print_on` prints several lines you want to use `NonInterleavingLogStream`. Do this: ```c++ LogMessage(pagesize) lm; NonInterleavingLogStream ls{LogLevelType::Info, lm}; if (ls.is_enabled()) { print_on(&ls); } ``` ------------- PR Review: https://git.openjdk.org/jdk/pull/14739#pullrequestreview-1510818174 PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1250593887 PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1250598663 PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1250590707 PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1250596689 PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1250614675 From jsjolen at openjdk.org Mon Jul 3 10:19:00 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Mon, 3 Jul 2023 10:19:00 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 10:07:17 GMT, Johan Sjölen wrote:
> > src/hotspot/os/linux/hugepages.cpp line 160: > >> 158: LogStream ls(lt); >> 159: print_on(&ls); >> 160: } > > As `print_on` prints several lines you want to use `NonInterleavingLogStream`. > > Do this: > > ```c++ > LogMessage(pagesize) lm; > NonInterleavingLogStream ls{LogLevelType::Info, lm}; > if (ls.is_enabled()) { > print_on(&ls); > } I'm just using `{}`-init here, it's the same as using `()` so your choice on which to use. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1250615279 From thartmann at openjdk.org Mon Jul 3 10:40:59 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 3 Jul 2023 10:40:59 GMT Subject: [jdk21] RFR: 8310829: guarantee(!HAS_PENDING_EXCEPTION) failed in ExceptionTranslation::doit In-Reply-To: References: Message-ID: <53DPSGZSvBq_pXpb-yMa-jcIA16QJQPhFa_4CiCop5E=.d489538a-7236-4073-8d84-8dac91f9f89a@github.com> On Mon, 3 Jul 2023 07:13:14 GMT, Tobias Hartmann wrote: > Backport of [JDK-8310829](https://bugs.openjdk.java.net/browse/JDK-8310829). Applies cleanly. > > Thanks, > Tobias Thanks, Christian! ------------- PR Comment: https://git.openjdk.org/jdk21/pull/91#issuecomment-1617875850 From thartmann at openjdk.org Mon Jul 3 10:41:00 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Mon, 3 Jul 2023 10:41:00 GMT Subject: [jdk21] Integrated: 8310829: guarantee(!HAS_PENDING_EXCEPTION) failed in ExceptionTranslation::doit In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 07:13:14 GMT, Tobias Hartmann wrote: > Backport of [JDK-8310829](https://bugs.openjdk.java.net/browse/JDK-8310829). Applies cleanly. > > Thanks, > Tobias This pull request has now been integrated. 
Changeset: 6de4e8f6 Author: Tobias Hartmann URL: https://git.openjdk.org/jdk21/commit/6de4e8f601852f9f4b96974dd210ccaf1f655145 Stats: 176 lines in 6 files changed: 123 ins; 21 del; 32 mod 8310829: guarantee(!HAS_PENDING_EXCEPTION) failed in ExceptionTranslation::doit Reviewed-by: chagedorn Backport-of: f6bdccb45caca0f69918a773a9ad9b2ad91b702f ------------- PR: https://git.openjdk.org/jdk21/pull/91 From lkorinth at openjdk.org Mon Jul 3 11:14:02 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 3 Jul 2023 11:14:02 GMT Subject: RFR: 8311086: Remove jtreg/gc/startup_warnings In-Reply-To: References: Message-ID: On Thu, 29 Jun 2023 12:14:50 GMT, Leo Korinth wrote: > It does not seem important to check that certain GCs are not deprecated. > > The test originally tested that we created deprecation warnings on features that are now removed. What is left is not worth having IMO. Thanks for your reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14709#issuecomment-1617962808 From lkorinth at openjdk.org Mon Jul 3 11:14:03 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 3 Jul 2023 11:14:03 GMT Subject: Integrated: 8311086: Remove jtreg/gc/startup_warnings In-Reply-To: References: Message-ID: On Thu, 29 Jun 2023 12:14:50 GMT, Leo Korinth wrote: > It does not seem important to check that certain GCs are not deprecated. > > The test originally tested that we created deprecation warnings on features that are now removed. What is left is not worth having IMO. This pull request has now been integrated. 
Changeset: 496f94b4 Author: Leo Korinth URL: https://git.openjdk.org/jdk/commit/496f94b48801dbaec24f1f107ebf8ee71780f522 Stats: 202 lines in 5 files changed: 0 ins; 202 del; 0 mod 8311086: Remove jtreg/gc/startup_warnings Reviewed-by: ayang, mli, kbarrett, tschatzl ------------- PR: https://git.openjdk.org/jdk/pull/14709 From rkennke at openjdk.org Mon Jul 3 12:37:42 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 3 Jul 2023 12:37:42 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v37] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. > > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 57 commits: - Merge branch 'master' into JDK-8139457 - Correctly handle oop array element aligment in 32bit builds; move method from Universe to Array - Require uncompressed oops to be 8-byte-aligned - Corresponding XGC fixes - Merge branch 'master' into JDK-8139457 - Fix calls to removed instanceOopDesc::header_size() - Add cast - Simplify aarch64 code - Simplify by moving gap clearing into initialize_header() - Merge branch 'master' into JDK-8139457 - ... 
and 47 more: https://git.openjdk.org/jdk/compare/ba974d5c...1d3db84e ------------- Changes: https://git.openjdk.org/jdk/pull/11044/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=36 Stats: 806 lines in 50 files changed: 526 ins; 164 del; 116 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From stuefe at openjdk.org Mon Jul 3 13:16:54 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 3 Jul 2023 13:16:54 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 09:52:14 GMT, Johan Sjölen wrote: > > src/hotspot/os/linux/hugepages.cpp line 65: > >> 63: // >> 64: // If we can't determine the value (e.g. /proc is not mounted, or the text >> 65: // format has been changed), we'll set largest page size to 0 > > This is more of an API contract, move it up to `// Scan /proc/meminfo and return value of Hugepagesize` Again, unchanged code from before I'd like to keep unchanged to keep this patch minimally invasive. Mentally earmarked for future cleanups.
> src/hotspot/os/linux/hugepages.cpp line 84: > >> 82: } >> 83: } >> 84: } > > Perhaps this pattern is a bit simpler? > > ```c++ > char* line = nullptr; > size_t bufsize = 0; > ssize_t nread = 0; /* getline returns ssize_t, -1 on EOF or error */ > int x = 0; > while ((nread = getline(&line, &bufsize, fp)) != -1) { > if (sscanf(line, "Hugepagesize: %d kB", &x) == 1) { /* ... */ } > if (sscanf(line, "Hugepagesize: %d", &x) == 1) { /* ... */ } > } > free(line); // We only need to free once. > ``` Definitely simpler, but this code moved (almost) verbatim from the old `scan_default_large_page_size` in os_linux.cpp, and I would like to keep code changes to a minimum. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1250877628 PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1250876290 From stuefe at openjdk.org Mon Jul 3 13:22:55 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 3 Jul 2023 13:22:55 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 09:55:09 GMT, Johan Sjölen wrote: > > src/hotspot/os/linux/hugepages.cpp line 68: > >> 66: >> 67: FILE *fp = os::fopen("/proc/meminfo", "r"); >> 68: if (fp) { > > Nit: you could always switch to `if (fp == nullptr) { return 0; }`, reducing indentation for the body that will do actual work. Absolutely agree, for a future cleanup, see above.
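The early-return shape discussed above can be sketched roughly as follows. This is a minimal sketch, not the code in hugepages.cpp: the names `parse_hugepagesize_line` and `scan_hugepagesize` are hypothetical, plain `fgets` stands in for whatever reading loop the real code uses, and 0 is used as the "could not determine" value as described in the quoted comment.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: parse one /proc/meminfo-style line.
// If the line starts with "Hugepagesize:" followed by a number (taken to be
// in kB), store the size in bytes in *out_bytes and return true.
static bool parse_hugepagesize_line(const char* line, std::size_t* out_bytes) {
  assert(line != nullptr && out_bytes != nullptr);
  unsigned long kb = 0;
  if (std::sscanf(line, "Hugepagesize: %lu kB", &kb) == 1) {
    *out_bytes = static_cast<std::size_t>(kb) * 1024;
    return true;
  }
  return false;
}

// Hypothetical scanner using the early return: the failure case is handled
// first, so the loop that does the actual work is not nested in `if (fp)`.
static std::size_t scan_hugepagesize(std::FILE* fp) {
  if (fp == nullptr) {
    return 0;  // could not open the file: report "unknown" as 0
  }
  char buf[256];
  std::size_t result = 0;
  while (std::fgets(buf, sizeof(buf), fp) != nullptr) {
    if (parse_hugepagesize_line(buf, &result)) {
      break;  // first matching line wins
    }
  }
  return result;
}
```

Splitting the line parser from the file loop is only a choice made for this sketch; it keeps the format string testable without touching /proc.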
> src/hotspot/os/linux/hugepages.cpp line 93: > >> 91: // Given a file that contains a single (integral) number, return that number, 0 and false if failed. >> 92: static bool read_number_file(const char* file, size_t* out) { >> 93: (*out) = 0; > > Unnecessary parens I'd like to leave it. While I agree with you, Hotspot style guide says something like "write parentheses if order of ops is unclear". Most of the code in hotspot uses parentheses in assignments like this. I'd like to avoid mixing styles. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1250886392 PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1250885629 From rkennke at openjdk.org Mon Jul 3 13:24:39 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 3 Jul 2023 13:24:39 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v38] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. 
> > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Build fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/1d3db84e..4ee4ca78 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=37 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=36-37 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From stuefe at openjdk.org Mon Jul 3 13:38:55 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 3 Jul 2023 13:38:55 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect In-Reply-To: References: Message-ID: <7QVoA3Ew5bOqflalrTQdjFoY-3E--4BI5y9rI14UecQ=.40ca559b-af0a-4ef3-bc9f-7dac0a52e77d@github.com> On Mon, 3 Jul 2023 10:07:48 GMT, Johan Sjölen wrote: >> src/hotspot/os/linux/hugepages.cpp line 160: >> >>> 158: LogStream ls(lt); >>> 159: print_on(&ls); >>> 160: } >> >> As `print_on` prints several lines you want to use `NonInterleavingLogStream`. >> >> Do this: >> >> ```c++ >> LogMessage(pagesize) lm; >> NonInterleavingLogStream ls{LogLevelType::Info, lm}; >> if (ls.is_enabled()) { >> print_on(&ls); >> } > > I'm just using `{}`-init here, it's the same as using `()` so your choice on which to use. Okay. We are at initialization time, though; everything should still be single-threaded here. I dislike providing the log level twice, but LogTarget does not expose it.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1250907436 From stuefe at openjdk.org Mon Jul 3 13:41:54 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 3 Jul 2023 13:41:54 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect In-Reply-To: <7QVoA3Ew5bOqflalrTQdjFoY-3E--4BI5y9rI14UecQ=.40ca559b-af0a-4ef3-bc9f-7dac0a52e77d@github.com> References: <7QVoA3Ew5bOqflalrTQdjFoY-3E--4BI5y9rI14UecQ=.40ca559b-af0a-4ef3-bc9f-7dac0a52e77d@github.com> Message-ID: On Mon, 3 Jul 2023 13:35:47 GMT, Thomas Stuefe wrote: >> I'm just using `{}`-init here, it's the same as using `()` so your choice on which to use. > > Okay. We are at initialization time, though; everything should still be single-threaded here. I dislike providing the log level twice, but LogTarget does not expose it. Wait, if I do this, I need to backport [8286180](https://bugs.openjdk.org/browse/JDK-8286180) before backporting this fix, right? :( Oh well, chances are this does not integrate cleanly anyway. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1250911655 From stuefe at openjdk.org Mon Jul 3 13:56:09 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 3 Jul 2023 13:56:09 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect [v2] In-Reply-To: References: Message-ID: > Today, if we use UseTransparentHugePages, we assume that the static hugepage detection we do is valid for THPs: > - that THPs use the page size (in hotspot used as "default large page size") found in /proc/meminfo (Hugepagesize) > - that THPs are enabled if that page size is >0. > > Both assumptions are incorrect. THPs are enabled depending on the mode in `/sys/kernel/mm/transparent_hugepage/enabled` (tri-state). And the pagesize used by khugepaged is the one set in `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size`. The latter can differ from the default large page size on the system (e.g.
static hugepage default size could be 1g, whereas THP hugepage size is 2m). > > ------ > > About the patch: > > This is a limited, minimally invasive patch to fix THP detection. The patch aims to be easy to downport. There is more work to do, which I will do in subsequent RFEs. > > Functionally, for *static* (non-THP) pages nothing changes. THP-mode now correctly detects THP support in the OS, and uses the correct page size (see examples below). > > ------------- > > Example 1: System has THPs disabled, but static hugepages (1g, 2m) configured: > > > thomas at starfish $ cat /sys/kernel/mm/transparent_hugepage/enabled > always madvise [never] > thomas at starfish $ cat /proc/meminfo | grep Hugepage > Hugepagesize: 1048576 kB > > > Without patch, we incorrectly assume THPs are enabled, and that THP page size is 1G (!), which we then proceed to use as heap page size, causing the heap size to be rounded up from 512m -> 1G. But - even though it is printed as "1G page backed" in log output - in reality it will still be 4K-page-backed: the `madvise(2)` we use to set the THP page size will be ignored by the system, since THPs are disabled. > > > thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version > [0.001s][info][pagesize] Using the default large page size: 1G > [0.001s][info][pagesize] Usable page sizes: 4k, 2M, 1G > ... > [0.016s][info][pagesize] Heap: min=1G max=1G base=0x00000000c0000000 size=1G page_size=1G > > > With patch, we correctly refuse to use large pages (and we log more info): > > > thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version > [0.001s][info][pagesize] Static hugepage support: 2M, 1G (default) > [0.001s][info][pagesize] default pagesize: 1G > [0.001s][info][pagesize] Transparent hugepage (THP) support: > [0.001s][info][pagesize] mode: never > [0.001s][warning][pagesize] UseLargePages ...
Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: - Merge branch 'JDK-8310233-Linux-THP-initialization-incorrect' of github.com:tstuefe/jdk into JDK-8310233-Linux-THP-initialization-incorrect - Add test case and modify log output slightly ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14739/files - new: https://git.openjdk.org/jdk/pull/14739/files/1e4dfcef..a65c346b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14739&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14739&range=00-01 Stats: 343 lines in 4 files changed: 317 ins; 19 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/14739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14739/head:pull/14739 PR: https://git.openjdk.org/jdk/pull/14739 From stuefe at openjdk.org Mon Jul 3 13:56:10 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 3 Jul 2023 13:56:10 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect [v2] In-Reply-To: References: <7QVoA3Ew5bOqflalrTQdjFoY-3E--4BI5y9rI14UecQ=.40ca559b-af0a-4ef3-bc9f-7dac0a52e77d@github.com> Message-ID: <_j4bapWKtvCoEfs2PyxKmDJTqfAMuRePkPJX2yx2W-8=.6cedb0a3-817e-433e-b1f2-288844d2983c@github.com> On Mon, 3 Jul 2023 13:39:12 GMT, Thomas Stuefe wrote: >> Okay. We are at initialization time, though; everything should still be single-threaded here. I dislike providing the log level twice, but LogTarget does not expose it. > > Wait, if I do this, I need to backport [8286180](https://bugs.openjdk.org/browse/JDK-8286180) before backporting this fix., right? :( > > Oh well, chances are this does not integrate cleanly anyway. The more I think about this, the more doubts I have, sorry. Logging should not cost anything if disabled. But with your proposal, I pay for the construction of both LogMessage and NonInterleavingLogStream every time. 
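The cost concern raised here (paying for object construction even when logging is disabled) can be illustrated with a toy model. This is only a sketch under loud assumptions: `ToyLogTarget` and `ToyLogStream` are hypothetical stand-ins invented for this example, not HotSpot's `LogTarget`, `LogStream`, `LogMessage` or `NonInterleavingLogStream`, and the counter only shows whether a construction happened, not what it costs in the real VM.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a log target (hypothetical, not the HotSpot class).
struct ToyLogTarget {
  bool enabled;
  std::vector<std::string>* sink;
  bool is_enabled() const { return enabled; }
};

// Toy stand-in for a log stream; counts how often one is constructed.
struct ToyLogStream {
  static int constructed;
  ToyLogTarget& target;
  explicit ToyLogStream(ToyLogTarget& t) : target(t) { ++constructed; }
  void print(const std::string& line) {
    assert(target.sink != nullptr);
    target.sink->push_back(line);
  }
};
int ToyLogStream::constructed = 0;

// Multi-line printer, analogous in spirit to a print_on(outputStream*).
void print_on(ToyLogStream& ls) {
  ls.print("mode: never");
  ls.print("pagesize: 2M");
}

// Guard first: with logging disabled, no stream object is ever built.
void log_guarded(ToyLogTarget& lt) {
  if (lt.is_enabled()) {
    ToyLogStream ls(lt);
    print_on(ls);
  }
}

// Construct first, check later: the stream is built on every call.
void log_construct_first(ToyLogTarget& lt) {
  ToyLogStream ls(lt);
  if (lt.is_enabled()) {
    print_on(ls);
  }
}
```

In this model the guarded form performs zero constructions when the target is disabled, which is the property the reply above asks for; whether the real classes are expensive to construct is not something this sketch can show.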
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1250923170 From stuefe at openjdk.org Mon Jul 3 13:56:30 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 3 Jul 2023 13:56:30 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect [v2] In-Reply-To: References: Message-ID: <7iKNyKv_p6i5EFwZO659leoKimwYrJwQ1Og1KqZoHk4=.04f94689-92cf-47a4-b2f4-5c1fa760fcc3@github.com> On Mon, 3 Jul 2023 10:16:10 GMT, Johan Sjölen wrote: >> Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: >> >> - Merge branch 'JDK-8310233-Linux-THP-initialization-incorrect' of github.com:tstuefe/jdk into JDK-8310233-Linux-THP-initialization-incorrect >> - Add test case and modify log output slightly > > Hi Thomas, > > Thanks for these changes. I've some suggestions for the code, but the conceptual change is good. @jdksjolen Thanks a lot for your review! See inline remarks. Most of your code suggestions are good, but this patch just moved the static hugepage detection parts out of os_linux.cpp, and left them (mostly) alone otherwise, to avoid adding regressions. I'll keep your input in mind for the next round of cleanups. Wrt NonInterleavingLogStream, I decided not to do that. It is not needed, since we are single-threaded, and without it I may still be lucky enough to get this integrated cleanly into at least 17. I added a new repro case. Please take a look! Thank you. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14739#issuecomment-1618333548 From stuefe at openjdk.org Mon Jul 3 14:13:10 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 3 Jul 2023 14:13:10 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v5] In-Reply-To: References: Message-ID: > I recently spent too much time trying to understand the interleaving of narrow Klass decoding with CDS. Thanks to @iklam for clarifying some details.
This patch aims to make these interleavings easier to understand and more correct. > > ---- > > CDS narrow Klass handling plays a role for archived heap objects. > > When dumping heap objects, we must recompute their narrow Klass ids since the relative positions of archived Klass instances change compared to their live counterparts in the dump time JVM. > We recompute those narrow Klass ids using the following encoding scheme: > - base = future assumed mapping start address > - shift = dump time (!) JVMs encoding shift (A) > see `ArchiveHeapWriter::update_header_for_requested_obj` https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/cds/archiveHeapWriter.cpp#L419-L425 > > At CDS runtime, we load the CDS archive, then place the class space behind it. We then initialize narrow Klass encoding for the resulting combined Klass range such that: > - encoding base is the range start address (mapping base) > - encoding shift is always zero > see `CompressedKlassPointers::initialize` : https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/oops/compressedOops.cpp#L195-L231 > > The lengthy comment there is misleading and partly wrong (regrettable since it was mine :) > > At dump time, when initializing the JVM, we also set the encoding base to klass range start and shift to zero (also `CompressedKlassPointers::initialize`). That is the shift we later use for (A); hence, that shift is zero. > > ------------------- > > There are some small things wrong with the current code. That wrongness does not lead to errors but makes understanding the code difficult. > > Mainly, the *dump time* VM's narrow Klass encoding scheme is irrelevant for the encoding employed on the future runtime archive since we recalculate the narrow Klass ids for archived heap objects. That means: > > In `CompressedKlassPointers::initialize` there is no need to fix the encoding base and shift for the *dump time* JVM. 
Dump time JVM can use whatever base+shift it pleases; it can use the same code path as when CDS is off (e.g. use zero-based encoding). > > In `ArchiveHeapWriter::update_header_for_requested_obj`, we should not use the dump time JVM shift for pre-computing the narrow Klass ids. Instead, we should use the *minimal shift needed for the maximal possible future Klass range size*. The comment in ArchiveHeapWrit... Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains six commits: - Add alternative for !INCLUDE_CDS_JAVA_HEAP path - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding - fix comment - Merge - -remove narrow_klass_xxx from FileMap - remove ArchiveHeapWriter::precomputed_narrow_klass_base_delta and replaced it with clear comments - changed runtime fail condition to asserts in FileMapInfo::can_use_heap_region() - fix-cleanup-CDS-nKlass-encoding ------------- Changes: https://git.openjdk.org/jdk/pull/14688/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14688&range=04 Stats: 147 lines in 8 files changed: 67 ins; 50 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14688.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14688/head:pull/14688 PR: https://git.openjdk.org/jdk/pull/14688 From epeter at openjdk.org Mon Jul 3 14:40:13 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 3 Jul 2023 14:40:13 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: <6EUcNQbNFGXwDM1MlNBIRqWVm7PaY4nwSA_SWq3s4bM=.a84e82df-4ef6-410e-b6b8-cda32f0bc455@github.com> On Mon, 3 Jul 2023 07:54:38 GMT, Pengfei Li wrote: >> src/hotspot/share/opto/loopnode.hpp line 143: >> >>> 141: if (is_vector_masked()) { >>> 142: return false; >>> 143: } >> >> Does this mean that the post-loop has a `CountedLoop` node, but it does not adhere to the counted-loop assumptions, such as having a `incr`, `limit`, `phi` etc? 
With the old post-loop-vectorization, the LoopNode would always fold away, so it would disappear after IGVN. But now it would stick around, right? Could that turn out to be a problem? > > After being vectorized, the post loop still has `phi`, `incr` and `limit` as before. In other words, the post loop is still a loop now. I think the only difference is that the loop stride value is not a constant any more (as we introduces the `VectorMaskTrueCountNode` for the new stride). The old implementation of post loop vectorization makes the vector-masked post loop run only once so it can optimize the `LoopNode` away. But we cannot do this now without doing multi-versioning. (Without the scalar post loop, loop may run insufficient iterations when the "atomic" post loop is not entered.) I see. Maybe it would be cleaner to separate the "outside/after" loop uses of the `incr` with what happens inside the loop? If we do take the backedge, the stride is a known constant. Only if we exit do we need to add the unknown number of iterations with `VectorMaskTrueCountNode`. See also this comment https://github.com/openjdk/jdk/pull/14581/files#r1250973547 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250974056 From epeter at openjdk.org Mon Jul 3 14:40:18 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 3 Jul 2023 14:40:18 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 07:37:22 GMT, Pengfei Li wrote: >> ## TL;DR >> >> This patch completely re-implements C2's experimental post loop vectorization for better stability, maintainability and performance. Compared with the original implementation, this new implementation adds a standalone loop phase in C2's ideal loop phases and can vectorize more post loops. The original implementation and all code related to multi-versioned post loops are deleted in this patch. 
More details about this patch can be found in the document replied in this pull request. > > Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: > > Address part of comments from Emanuel src/hotspot/share/opto/vmaskloop.cpp line 978: > 976: > 977: // Update loop increment/decrement to the vector mask true count > 978: Node* true_cnt = new VectorMaskTrueCountNode(root_vmask, TypeInt::INT); This seems expensive to have to use inside the loop. Is there a way we could move this outside the loop? Because if we do take the backedge then we know that we have to take the full `stride`, right? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250973547 From epeter at openjdk.org Mon Jul 3 14:40:19 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 3 Jul 2023 14:40:19 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 14:34:00 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/vmaskloop.cpp line 978: > >> 976: >> 977: // Update loop increment/decrement to the vector mask true count >> 978: Node* true_cnt = new VectorMaskTrueCountNode(root_vmask, TypeInt::INT); > > This seems expensive to have to use inside the loop. Is there a way we could move this outside the loop? Because if we do take the backedge then we know that we have to take the full `stride`, right? I guess you would have to separate out the loop-internal uses and the outside uses of the `incr`. The inside uses would use the `stride` (or is there an exception?) and the outside ones could use the `VectorMaskTrueCountNode`. Doing something like that could have better performance. 
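The split suggested above can be modelled outside of C2 with plain scalar code. The following is only a sketch under stated assumptions: `kLanes`, `true_count`, `run_count_every_iteration` and `run_count_on_exit_only` are hypothetical names that do not appear in the patch, a partial mask is assumed to occur only on the final iteration, and `std::min` merely stands in for what `VectorMaskTrueCountNode` would compute. It compares a loop that uses the mask true count as its increment on every iteration with a loop that uses the constant stride on the backedge and the true count only on exit.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical number of vector lanes; not taken from the patch.
constexpr int kLanes = 8;

// Models the number of active mask lanes for the iteration starting at i.
// Stand-in for what VectorMaskTrueCountNode would compute.
inline int true_count(int i, int limit) {
  return std::min(kLanes, limit - i);
}

// Variant A: the increment is the mask true count on every iteration.
int run_count_every_iteration(int start, int limit) {
  int i = start;
  while (i < limit) {
    i += true_count(i, limit);  // variable stride inside the loop
  }
  return i;
}

// Variant B: full-mask iterations use the constant stride; the true count
// is only needed for the final, possibly partial, iteration.
int run_count_on_exit_only(int start, int limit) {
  int i = start;
  while (limit - i >= kLanes) {
    i += kLanes;                // full mask, so the stride is constant
  }
  if (i < limit) {
    i += true_count(i, limit);  // value of the increment seen after the loop
  }
  return i;
}
```

Both variants end with the same induction variable value in this model; whether the same separation is legal on the real C2 graph is a question for the patch itself.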
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250976621 From epeter at openjdk.org Mon Jul 3 14:47:10 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 3 Jul 2023 14:47:10 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: <6IkvVTm9e60qXwaID0EihRXlUielrryBWoTmYAp3PuU=.c624b13d-bc6d-4c79-86a6-72bda016b50f@github.com> Message-ID: On Mon, 3 Jul 2023 08:05:33 GMT, Pengfei Li wrote: >> src/hotspot/share/opto/superword.cpp line 179: >> >>> 177: assert(_packset.length() == 0, "packset must be empty"); >>> 178: success = SLP_extract(); >>> 179: if (PostLoopMultiversioning) { >> >> Could we now have an assert for `cl->is_main_loop()` at the beginning of `SuperWord::transform_loop`, and remove all checks for it in SuperWord? > > Unfortunately, I just tried updating this but found assertion failures. I see `SuperWord::transform_loop()` is also called in `IdealLoopTree::policy_unroll_slp_analysis()` which can pass a normal loop (the loop before iteration-split). I assume only main loops require unrolling analysis and don't understand why it could be a normal loop. Maybe that's bad code and we need refactor C2's unrolling analysis first. It would be great if we could untangle that a bit. Let me know what idea you come up with. It also sounds confusing that the "analysis" only `policy_unroll_slp_analysis` should call a method that is called "transform" like `transform_loop`. >> src/hotspot/share/opto/superword.hpp line 666: >> >>> 664: IdealLoopTree* lpt() const { return _lpt; } >>> 665: PhiNode* iv() const { >>> 666: return _slp ? _slp->iv() : _lpt->_head->as_CountedLoop()->phi()->as_Phi(); >> >> I'd suggest either cache it directly from `_lpt->_head->as_CountedLoop()->phi()->as_Phi()`, or just query it directly. Reduce dependence on `_slp`. > > Good catch! What do you think of getting rid of `_slp` completely in `SWPointer` refactoring? 
I think that would be optimal, if it is possible. I would maybe call it a `CLPointer`, for counted-loop-pointer? And only have a reference to the `_lpt` / `cl`. Eventually, we may want to even allow non-counted-loops, but that is really for the future. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250981304 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250984838 From epeter at openjdk.org Mon Jul 3 14:47:10 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 3 Jul 2023 14:47:10 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: <7iru0xDm4lckuwyHvqGSld0_kWUVYSTg5BT3-rqP3Vw=.cab94902-98aa-40ed-ae13-d238380b6267@github.com> References: <7iru0xDm4lckuwyHvqGSld0_kWUVYSTg5BT3-rqP3Vw=.cab94902-98aa-40ed-ae13-d238380b6267@github.com> Message-ID: On Mon, 3 Jul 2023 08:08:29 GMT, Pengfei Li wrote: >> Can we untangle it completely from SuperWord? it seems you have made it optional, so yes. And maybe we can also make the trace flags like `_slp->is_trace_alignment()` independent? It would be nice to also be able to trace this for non SuperWord-contexts like post-loop masked vectorization, right? > > I will try to do this in another JBS and come back here later. That would be fantastic!
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250982246 From epeter at openjdk.org Mon Jul 3 14:55:11 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 3 Jul 2023 14:55:11 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: <7iru0xDm4lckuwyHvqGSld0_kWUVYSTg5BT3-rqP3Vw=.cab94902-98aa-40ed-ae13-d238380b6267@github.com> References: <7iru0xDm4lckuwyHvqGSld0_kWUVYSTg5BT3-rqP3Vw=.cab94902-98aa-40ed-ae13-d238380b6267@github.com> Message-ID: On Mon, 3 Jul 2023 08:13:41 GMT, Pengfei Li wrote: >> Even if all callers currently ensure that `n` has the correct type, I'd say it is still not a great idea to cast without checking, at least in debug. > > I searched all C2 code and saw a lot of such patterns. Perhaps doing this in another RFE? Yes, please file an RFE. You can assign it to me if you don't want to do it yourself - I may find someone else to do it or do it myself eventually. But for new code please already use `as_LoopVectorMaskNode()`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250988000 From epeter at openjdk.org Mon Jul 3 14:55:14 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 3 Jul 2023 14:55:14 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 07:37:22 GMT, Pengfei Li wrote: >> ## TL;DR >> >> This patch completely re-implements C2's experimental post loop vectorization for better stability, maintainability and performance. Compared with the original implementation, this new implementation adds a standalone loop phase in C2's ideal loop phases and can vectorize more post loops. The original implementation and all code related to multi-versioned post loops are deleted in this patch. More details about this patch can be found in the document replied in this pull request. 
> > Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: > > Address part of comments from Emanuel src/hotspot/share/opto/vmaskloop.cpp line 64: > 62: > 63: if (!cl->is_valid_counted_loop(T_INT)) { > 64: trace_msg(nullptr, "Loop is not a valid counted loop"); Would it help to dump the loop head here? Just that one knows which loop is being rejected here? src/hotspot/share/opto/vmaskloop.cpp line 68: > 66: } > 67: if (abs(cl->stride_con()) != 1) { > 68: trace_msg(nullptr, "Loop has unsupported stride value"); Dump loop head and the stride ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250994587 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250995022 From epeter at openjdk.org Mon Jul 3 14:55:17 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 3 Jul 2023 14:55:17 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: <63xntDgcTJN-51cfPjP1XsWdNLkeURQuWmE8hluHbIM=.84e6a8b9-66b1-4427-ab2c-355c0c621871@github.com> References: <63xntDgcTJN-51cfPjP1XsWdNLkeURQuWmE8hluHbIM=.84e6a8b9-66b1-4427-ab2c-355c0c621871@github.com> Message-ID: <-KAmqqEkhtq1UTcfF5xv1etrrhY0CjW7t07wVrSstJY=.1b1e8e20-f016-4cef-9c9a-157c189d1653@github.com> On Mon, 3 Jul 2023 08:31:20 GMT, Pengfei Li wrote: >> src/hotspot/share/opto/vmaskloop.cpp line 71: >> >>> 69: if (cl->loopexit()->in(0) != cl) return; >>> 70: // Skip if some loop operations are pinned to the backedge >>> 71: if (cl->back_control()->outcnt() != 1) return; >> >> It would be interesting to have some trace flag that tells us why we bailed out here and did not do the post-loop vectorization. Unless of course it becomes too noisy. > > Great suggestion! Done. Thanks :) >> src/hotspot/share/opto/vmaskloop.hpp line 95: >> >>> 93: } >>> 94: return false; >>> 95: } >> >> Do you not want to do this sort of implementation in `SWPointer` instead? 
There are already methods like `scaled_iv_plus_offset`, so it would fit in next to that, right? > > It doesn't fit well as functions in `SWPointer` can only be used for checking the pattern in indices. But this function may be used for checking the loop increment pattern which is not in array indices, perhaps `a[i] = b[i] * (i + 1)`. We don't have `SWPointer` constructed for this. I have renamed the function to make the purpose clear. Ah, you are right, that is a different use. Yes, a better function name often does the trick. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250996321 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250991219 From epeter at openjdk.org Mon Jul 3 15:01:21 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 3 Jul 2023 15:01:21 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: <63xntDgcTJN-51cfPjP1XsWdNLkeURQuWmE8hluHbIM=.84e6a8b9-66b1-4427-ab2c-355c0c621871@github.com> References: <63xntDgcTJN-51cfPjP1XsWdNLkeURQuWmE8hluHbIM=.84e6a8b9-66b1-4427-ab2c-355c0c621871@github.com> Message-ID: On Mon, 3 Jul 2023 08:36:44 GMT, Pengfei Li wrote: >> src/hotspot/share/opto/vmaskloop.cpp line 104: >> >>> 102: _core_set.clear(); >>> 103: _body_set.clear(); >>> 104: _body_nodes.clear(); >> >> Would it make sense to somehow reserve the space, so that we do not allocate multiple times when growing these data structures later? > > Could you elaborate how to do such reservation in C2? Just allocation with some larger sizes at the beginning? Or any other examples to refer to? I think what I have seen people do is just to `map` a high enough index value with `nullptr`. A bit hacky, but it ensures that the arrays underneath get grown sufficiently immediately. It would be nice to have some kind of `reserve` method... but I don't think we have that.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251002205 From epeter at openjdk.org Mon Jul 3 15:01:25 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 3 Jul 2023 15:01:25 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 07:37:22 GMT, Pengfei Li wrote: >> ## TL;DR >> >> This patch completely re-implements C2's experimental post loop vectorization for better stability, maintainability and performance. Compared with the original implementation, this new implementation adds a standalone loop phase in C2's ideal loop phases and can vectorize more post loops. The original implementation and all code related to multi-versioned post loops are deleted in this patch. More details about this patch can be found in the document replied in this pull request. > > Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: > > Address part of comments from Emanuel src/hotspot/share/opto/vmaskloop.hpp line 46: > 44: > 45: // Data structures for loop analysis > 46: Unique_Node_List _core_set; // Loop core nodes set for fast membership check If this is really only for membership test, and you never need the list of nodes, you could just use the `VectorSet`. Uses less memory. 
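As a rough illustration of why a pure membership test is cheaper with a bit set than with a unique list, the sketch below models both outside of HotSpot. `BitMembership` and `UniqueList` are hypothetical stand-ins, not the real `VectorSet` or `Unique_Node_List` classes, and node indices are assumed to be small dense integers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bit set keyed by node index: one bit per index.
class BitMembership {
 public:
  void set(uint32_t idx) {
    size_t word = idx / 64;
    if (word >= _bits.size()) _bits.resize(word + 1, 0);
    _bits[word] |= (uint64_t{1} << (idx % 64));
  }
  bool test(uint32_t idx) const {
    size_t word = idx / 64;
    return word < _bits.size() && ((_bits[word] >> (idx % 64)) & 1) != 0;
  }
  size_t bytes_used() const { return _bits.size() * sizeof(uint64_t); }

 private:
  std::vector<uint64_t> _bits;
};

// Hypothetical stand-in for a unique list: the same bit set plus the
// ordered list of members, which is only needed when iterating.
class UniqueList {
 public:
  void push(uint32_t idx) {
    if (!_in_list.test(idx)) {  // ignore duplicates
      _in_list.set(idx);
      _members.push_back(idx);
    }
  }
  bool member(uint32_t idx) const { return _in_list.test(idx); }
  size_t size() const { return _members.size(); }
  size_t bytes_used() const {
    return _in_list.bytes_used() + _members.size() * sizeof(uint32_t);
  }

 private:
  BitMembership _in_list;
  std::vector<uint32_t> _members;
};
```

In this model the list variant always pays for the member array on top of the bit set, so it only makes sense when the nodes are iterated later.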
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1250999324 From epeter at openjdk.org Mon Jul 3 15:01:22 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 3 Jul 2023 15:01:22 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: <63xntDgcTJN-51cfPjP1XsWdNLkeURQuWmE8hluHbIM=.84e6a8b9-66b1-4427-ab2c-355c0c621871@github.com> Message-ID: <4iKhNaaEAPEsAz0-1S0VJhU1rzQ0CNcJm7IepcBaUU4=.bcdb0b39-8d0c-4e99-b1f3-3d916ab8d6be@github.com> On Mon, 3 Jul 2023 14:57:50 GMT, Emanuel Peter wrote: >> Could you elaborate how to do such reservation in C2? Just allocation with some larger sizes at the beginning? Or any other examples to refer? > > I think what I have seen people do is just to `map` a high enough index value with `nullptr`. A bit hacky, but it ensures that the arrays underneath get grown sufficiently immediately. It would be nice to have some kind of `reserve` methods... but I don't think we have that. Also: Use the `VectorSet` instead of the `Unique_Node_List` if you can. It uses less space ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251002744 From jsjolen at openjdk.org Mon Jul 3 16:06:58 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Mon, 3 Jul 2023 16:06:58 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect [v2] In-Reply-To: <_j4bapWKtvCoEfs2PyxKmDJTqfAMuRePkPJX2yx2W-8=.6cedb0a3-817e-433e-b1f2-288844d2983c@github.com> References: <7QVoA3Ew5bOqflalrTQdjFoY-3E--4BI5y9rI14UecQ=.40ca559b-af0a-4ef3-bc9f-7dac0a52e77d@github.com> <_j4bapWKtvCoEfs2PyxKmDJTqfAMuRePkPJX2yx2W-8=.6cedb0a3-817e-433e-b1f2-288844d2983c@github.com> Message-ID: On Mon, 3 Jul 2023 13:49:19 GMT, Thomas Stuefe wrote: >But with your proposal, I pay for the construction of both LogMessage and NonInterleavingLogStream every time. 
Yeah, I made a mental footnote that `LogTarget` should be able to be used to initialize these inside of an if-block. UL is still a bit painful about these things. My suggestion sounds like a massive pain, I'm happy if you don't do it :). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1251074040 From shade at openjdk.org Mon Jul 3 18:00:27 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 3 Jul 2023 18:00:27 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v38] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 13:24:39 GMT, Roman Kennke wrote: >> See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. >> >> Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. >> >> Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. >> >> Testing: >> - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] tier1 (x86_64, x86_32, aarch64, riscv) >> - [x] tier2 (x86_64, aarch64, riscv) >> - [x] tier3 (x86_64, riscv) > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > Build fixes Another round of reviews: src/hotspot/cpu/arm/c1_MacroAssembler_arm.hpp line 50: > 48: > 49: void allocate_object(Register obj, Register tmp1, Register tmp2, Register tmp3, > 50: int header_size_in_bytes, int object_size, I don't see the related change in `C1_MacroAssembler::allocate_object` definition. At very least, it affects the assert that compares with `object_size` (in words?). 
src/hotspot/cpu/ppc/c1_MacroAssembler_ppc.cpp line 383: > 381: // Zero first 4 bytes, if start offset is not word aligned. > 382: if (!is_aligned(base_offset_in_bytes, BytesPerWord)) { > 383: assert(is_aligned(base_offset_in_bytes, BytesPerInt), "weird alignment"); "must be 4-byte aligned"? src/hotspot/cpu/riscv/c1_MacroAssembler_riscv.cpp line 201: > 199: int start_offset_in_bytes = hdr_size_in_bytes; > 200: if (!is_aligned(start_offset_in_bytes, BytesPerWord)) { > 201: assert(is_aligned(start_offset_in_bytes, BytesPerInt), "must be 32-bit-aligned"); "must be 4-byte aligned"? src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 315: > 313: // fields will be inconsistent. This could cause a refinement > 314: // thread to calculate the object size incorrectly. > 315: Copy::fill_to_bytes(new_obj, oopDesc::base_offset_in_bytes(), 0); Since this is a humongous allocation anyway, can we accept `fill_to_words` that would write 4 bytes past the object header, touching the object data? Keeps the code in the similar shape then. src/hotspot/share/gc/parallel/mutableNUMASpace.cpp line 27: > 25: #include "precompiled.hpp" > 26: #include "gc/parallel/mutableNUMASpace.hpp" > 27: #include "gc/shared/collectedHeap.inline.hpp" Is this still needed? There no other changes in this compilation unit. src/hotspot/share/gc/shared/collectedHeap.cpp line 412: > 410: int payload_offset = arrayOopDesc::base_offset_in_bytes(T_INT); > 411: if (!is_aligned(payload_offset, HeapWordSize)) { > 412: assert(is_aligned(payload_offset, BytesPerInt), "base offset must be 32-bit-aligned"); "must be 4-byte aligned"? src/hotspot/share/gc/shared/collectedHeap.cpp line 444: > 442: > 443: const size_t payload_size_bytes = words * HeapWordSize - arrayOopDesc::base_offset_in_bytes(T_INT); > 444: assert(is_aligned(payload_size_bytes, BytesPerInt), "must be int aligned"); "must be 4-byte aligned"? 
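The pattern these asserts guard (a payload that starts 4-byte aligned but not necessarily word aligned) can be sketched in portable C++. This is a minimal model, assuming 8-byte words; `zero_payload`, `is_aligned_to`, `kBytesPerInt` and `kBytesPerWord` are hypothetical names for illustration and not the HotSpot `Copy` or `is_aligned` APIs. A single 4-byte store brings the offset up to word alignment, after which the rest is cleared word by word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr size_t kBytesPerInt = 4;
constexpr size_t kBytesPerWord = 8;  // assumption: 64-bit words

inline bool is_aligned_to(size_t value, size_t alignment) {
  return (value & (alignment - 1)) == 0;
}

// Zeroes the bytes [base_offset, size_in_bytes) of the object at obj.
// The object size must be word aligned; the base offset must be at
// least 4-byte aligned.
void zero_payload(unsigned char* obj, size_t base_offset, size_t size_in_bytes) {
  assert(is_aligned_to(size_in_bytes, kBytesPerWord) && "object size must be word aligned");
  assert(base_offset <= size_in_bytes && "header larger than object");
  size_t offset = base_offset;
  if (!is_aligned_to(offset, kBytesPerWord)) {
    assert(is_aligned_to(offset, kBytesPerInt) && "must be 4-byte aligned");
    std::memset(obj + offset, 0, kBytesPerInt);  // single 4-byte store
    offset += kBytesPerInt;
  }
  // From here on the offset is word aligned; clear whole words.
  const uint64_t zero = 0;
  for (; offset < size_in_bytes; offset += kBytesPerWord) {
    std::memcpy(obj + offset, &zero, sizeof(zero));
  }
}
```

For example, a 12-byte base offset in a 32-byte object results in one 4-byte store at offset 12 followed by two word stores at offsets 16 and 24.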
src/hotspot/share/gc/shared/jvmFlagConstraintsGC.cpp line 343: > 341: JVMFlag::printError(verbose, > 342: "MinTLABSize (" SIZE_FORMAT " bytes) must be " > 343: "less than or equal to ergonomic TLAB maximum (" SIZE_FORMAT " words)\n", Wait, let's not mix "bytes" and "words" in the same message. Users do not readily know what is the HeapWordSize on the machine, or even how bytes and words might be related. Is there a reason to even change these? Does the original code overflow? src/hotspot/share/gc/shared/memAllocator.cpp line 378: > 376: assert(mem != nullptr, "cannot initialize null object"); > 377: assert(_word_size * HeapWordSize >= hdr_size_bytes, "unexpected object size"); > 378: Copy::fill_to_bytes((char*)mem + hdr_size_bytes, _word_size * HeapWordSize - hdr_size_bytes); Performance-sensitive path, let's keep `Copy::fill_to_aligned_words` and compensate for misalignment with a single store, as we do everywhere else. src/hotspot/share/gc/shared/memAllocator.cpp line 403: > 401: assert(_length >= 0, "length should be non-negative"); > 402: if (_do_zero) { > 403: ArrayKlass* array_klass = ArrayKlass::cast(_klass); Do you need this `array_klass`? If this is for asserts, then it should be done as assert. src/hotspot/share/gc/shared/memAllocator.hpp line 81: > 79: HeapWord* mem_allocate(Allocation& allocation) const; > 80: > 81: MemRegion obj_memory_range(oop obj) const { I don't see any usages of `obj_memory_range` anywhere in current sources. I think it was removed by [JDK-8171221](https://bugs.openjdk.org/browse/JDK-8171221). I think I'll just remove it in a separate PR: [JDK-8311249](https://bugs.openjdk.org/browse/JDK-8311249). src/hotspot/share/gc/shared/threadLocalAllocBuffer.inline.hpp line 44: > 42: // successful thread-local allocation > 43: #ifdef ASSERT > 44: Copy::fill_to_words(obj, size, badHeapWordVal); So, this is safe after CMS removal, or for some other reason? 
I think this and other two removals near `Skip mangling the space corresponding to the object header` comment should be done in a separate PR to avoid cluttering this one. src/hotspot/share/interpreter/zero/bytecodeInterpreter.cpp line 1998: > 1996: // this area, and we still need to initialize it > 1997: if (DEBUG_ONLY(true ||) !ZeroTLAB) { > 1998: size_t hdr_size = instanceOopDesc::base_offset_in_bytes(); For Zero case, I think we can just `Copy::fill_to_words` the entire object: the overhead of dealing with header math is likely comparable to the overhead of additional header writes and the benefits the word-sized clearing brings. This can probably be separated in another PR that deals with zapping better. src/hotspot/share/oops/arrayOop.cpp line 32: > 30: // overflow. We also need to make sure that this will not overflow a size_t on > 31: // 32 bit platforms when we convert it to a byte size. > 32: int32_t arrayOopDesc::max_array_length(BasicType type) { Looks like used from some hot paths. If you don't want it in header, can it be in `.inline.hpp` instead? src/hotspot/share/oops/arrayOop.hpp line 76: > 74: } > 75: > 76: // Header size computation. Any reasons why this is moved? Visibility change? Would it be uglier to do in place? src/hotspot/share/opto/memnode.cpp line 2098: > 2096: // test to (klass != Serializable && klass != Cloneable). > 2097: assert(Opcode() == Op_LoadI, "must load an int from _layout_helper"); > 2098: jint min_size = Klass::instance_layout_helper(checked_cast(heap_word_size(oopDesc::base_offset_in_bytes())), false); This is not the first time we do the whole `checked_cast(heap_word_size(oopDesc::base_offset_in_bytes())` dance. Maybe it should be a helper method straight in `oopDesc` then? 
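A helper of the kind suggested here could look roughly like the following standalone model. All names (`kHeapWordSize`, `heap_word_size_of`, `checked_narrow`, `header_size_in_words`) are hypothetical and only mimic the shape of the HotSpot functions mentioned above; the 8-byte heap word is an assumption.

```cpp
#include <cassert>
#include <cstddef>

constexpr size_t kHeapWordSize = 8;  // assumption: 64-bit heap words

// Rounds a byte size up to whole heap words.
constexpr size_t heap_word_size_of(size_t byte_size) {
  return (byte_size + kHeapWordSize - 1) / kHeapWordSize;
}

// Narrowing cast that asserts the value survives the round trip.
template <typename To, typename From>
To checked_narrow(From value) {
  To result = static_cast<To>(value);
  assert(static_cast<From>(result) == value && "value does not fit");
  return result;
}

// Hypothetical helper: header size in heap words, as an int, computed
// from the base offset in bytes. Callers no longer repeat the cast.
inline int header_size_in_words(size_t base_offset_in_bytes) {
  return checked_narrow<int>(heap_word_size_of(base_offset_in_bytes));
}
```

Note that in this model a 12-byte base offset rounds up to 2 words, so any caller relying on the exact byte offset would still need the byte-based accessor.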
src/hotspot/share/runtime/continuationFreezeThaw.cpp line 1289: > 1287: // zero out fields (but not the stack) > 1288: const size_t hs = oopDesc::base_offset_in_bytes(); > 1289: Copy::fill_to_bytes((char*)mem + hs, vmClasses::StackChunk_klass()->size_helper() * HeapWordSize - hs); Likely performance-sensitive path. Can we do the same "adjust if only aligned by 4 bytes, then do aligned words copy", as in other places? src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Array.java line 68: > 66: return !VM.getVM().isCompressedOopsEnabled(); > 67: } > 68: } I think `isCompressedOopsEnabled` already does the right thing, so: Suggestion: if (type == BasicType.T_OBJECT || type == BasicType.T_ARRAY) { return !VM.getVM().isCompressedOopsEnabled(); } test/hotspot/gtest/oops/test_objArrayOop.cpp line 42: > 40: { 256, false, true, 32 }, // 20 byte header, 4 byte oops > 41: { 256, true, false, 32 }, // 16 byte header, 8 byte oops > 42: { 256, true, true, 32 }, // 16 byte header, 4 byte oops Misses the comment ", 256-byte align"? Maybe we should at least add 16-byte alignment to cover the most frequent `ObjectAlignmentInBytes` override. test/hotspot/jtreg/gtest/ArrayTest.java line 2: > 1: /* > 2: * Copyright (c) 2022 Amazon.com Inc. or its affiliates. All rights reserved. Here and in other files, copyright header is not in our usual form. :) test/hotspot/jtreg/gtest/ArrayTest.java line 25: > 23: */ > 24: > 25: /* The file should probably be called `ArrayTests.java` to match another new `ObjArrayTests.java`? test/hotspot/jtreg/gtest/ArrayTest.java line 34: > 32: * @modules java.base/jdk.internal.misc > 33: * java.xml > 34: * @run main/native GTestWrapper --gtest_filter=arrayOop::base_offset -XX:+UseCompressedClassPointers -XX:+UseCompressedOops In all test configs here: why `arrayOop::base_offset`, and not just `arrayOop`? This would miss running new test cases. Note that `ObjArrayTests.java` does class-only filter already. 
------------- PR Review: https://git.openjdk.org/jdk/pull/11044#pullrequestreview-1511336722 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1250985281 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1250994093 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1250998877 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251003812 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251002799 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251008534 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251010657 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251142484 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251143425 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251144713 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251147524 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251130202 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251132354 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251158897 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251156823 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251137566 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251139426 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251027722 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251018122 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251014984 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251014320 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251013282 From shade at openjdk.org Mon Jul 3 18:00:28 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 3 Jul 2023 
18:00:28 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v19] In-Reply-To: References: <0Y1Bhvn-wJJdPJ83fSeICTbMQzdW4wLvEnaZhm8J6h8=.08240fe1-c05a-45f2-8975-02b1d12db63b@github.com> Message-ID: On Mon, 13 Feb 2023 18:05:03 GMT, Roman Kennke wrote: >> src/hotspot/share/gc/shared/collectedHeap.cpp line 257: >> >>> 255: const size_t elements_per_word = HeapWordSize / sizeof(jint); >>> 256: int base_offset_in_ints = arrayOopDesc::base_offset_in_ints(T_INT); >>> 257: _filler_array_max_size = align_object_size((base_offset_in_ints + max_len) / elements_per_word); >> >> Isn't this expression susceptible to overflow, like the removed comment in `CollectedHeap::max_tlab_size` (below) states? I.e. max_len is probably very close to SIZE_MAX on 32-bit platforms, and adding the base offset gets dangerously close there. Not to mention the positive side of signed `int` domain is lower than SIZE_MAX to beging with? I think you need to keep doing the division `max_len / elements_per_word` first. > > As you say, the positive side of int32_t is much smaller than SIZE_MAX and thus we are basically guaranteed to not overflow here. Also, arrayOopDesc::max_array_length() is specifically designed to prevent such overflows (and even the more likely overflowing of size_t when converting. I am going to add corresponding assert there. All right. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251006987 From shade at openjdk.org Mon Jul 3 18:00:29 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 3 Jul 2023 18:00:29 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v38] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 17:09:43 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Build fixes > > src/hotspot/share/gc/shared/threadLocalAllocBuffer.inline.hpp line 44: > >> 42: // successful thread-local allocation >> 43: #ifdef ASSERT >> 44: Copy::fill_to_words(obj, size, badHeapWordVal); > > So, this is safe after CMS removal, or for some other reason? I think this and other two removals near `Skip mangling the space corresponding to the object header` comment should be done in a separate PR to avoid cluttering this one. https://github.com/openjdk/jdk/commit/0917ad432eb3d01c104f03973c5c7ff52c6dfefe suggests skipping the header during the zapping was added for G1. So why it is safe to remove? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251156072 From rkennke at openjdk.org Mon Jul 3 18:30:12 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 3 Jul 2023 18:30:12 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v38] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 14:44:47 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Build fixes > > src/hotspot/cpu/arm/c1_MacroAssembler_arm.hpp line 50: > >> 48: >> 49: void allocate_object(Register obj, Register tmp1, Register tmp2, Register tmp3, >> 50: int header_size_in_bytes, int object_size, > > I don't see the related change in `C1_MacroAssembler::allocate_object` definition. 
At very least, it affects the assert that compares with `object_size` (in words?). Hrmpf, this is actually wrong. This method accepts the header size in *words*. We do, in-fact, jump through some hoops to get it word-sized: __ allocate_object(dst, scratch1, scratch2, scratch3, scratch4, --> checked_cast(heap_word_size(instanceOopDesc::base_offset_in_bytes())), instance_size, klass_reg, !klass->is_initialized(), slow_path); However, it is a discrepancy vs allocate_array() which (now) accepts byte-sized header-size. OTOH, none of the implementations of allocate_object() seem to actually use this information, except for the assert, which seems quite bogus. Also, for objects, the header-size (or base-offset) would always be the same, there seems no need to pass it around. Maybe this warrants a larger refactoring to get rid of that header_size argument to begin with? Maybe as separate PR? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251178920 From mgronlun at openjdk.org Mon Jul 3 18:40:22 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 3 Jul 2023 18:40:22 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress Message-ID: Greetings, please help review this fix to some problematic situations in JFR where data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes are in the JIRA issue. Testing: jdk_jfr, stress testing. 
Thanks Markus ------------- Commit messages: - 8303134 Changes: https://git.openjdk.org/jdk/pull/14761/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8303134 Stats: 814 lines in 36 files changed: 679 ins; 71 del; 64 mod Patch: https://git.openjdk.org/jdk/pull/14761.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14761/head:pull/14761 PR: https://git.openjdk.org/jdk/pull/14761 From mgronlun at openjdk.org Mon Jul 3 19:56:57 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 3 Jul 2023 19:56:57 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v2] In-Reply-To: References: Message-ID: > Greetings, > > please help review this fix to some problematic situations in JFR where data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes is in the JIRA issue. > > Testing: jdk_jfr, stress testing.
> > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: thread state ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14761/files - new: https://git.openjdk.org/jdk/pull/14761/files/641a67e8..aa498f7e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=00-01 Stats: 8 lines in 1 file changed: 3 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/14761.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14761/head:pull/14761 PR: https://git.openjdk.org/jdk/pull/14761 From dnsimon at openjdk.org Mon Jul 3 21:31:56 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 3 Jul 2023 21:31:56 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v4] In-Reply-To: References: Message-ID: On Sun, 2 Jul 2023 21:51:59 GMT, Doug Simon wrote: >> In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. >> >> This PR proposes a `TraceClassLoadingCause` VM flag: >> >> >> product(ccstr, TraceClassLoadingCause, nullptr, DIAGNOSTIC, \ >> "Print a stack trace when loading a class whose fully" \ >> "qualified name contains this string ("*" matches " \ >> "any class).") \ >> >> >> I would have liked to implement this using Unified Logging but UL has no support for filtering on the class names.
>> >> Example usage: >> >> java -XX:+UnlockDiagnosticVMOptions -XX:TraceClassLoadingCause=Thread --version >> Loading java.lang.Thread >> Loading java.lang.Thread$FieldHolder >> Loading java.lang.Thread$Constants >> Loading java.lang.Thread$UncaughtExceptionHandler >> Loading java.lang.ThreadGroup >> Loading java.lang.BaseVirtualThread >> Loading java.lang.VirtualThread >> Loading java.lang.ThreadBuilders$BoundVirtualThread >> Loading java.lang.Thread$State >> at jdk.internal.misc.VM.toThreadState(java.base/VM.java:336) >> at java.lang.Thread.threadState(java.base/Thread.java:2731) >> at java.lang.Thread.isTerminated(java.base/Thread.java:2738) >> at java.lang.Thread.getThreadGroup(java.base/Thread.java:1957) >> at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:314) >> at java.lang.System.initPhase1(java.base/System.java:2206) >> Loading java.lang.Thread$ThreadIdentifiers >> at java.lang.Thread.(java.base/Thread.java:734) >> at java.lang.Thread.(java.base/Thread.java:1477) >> at java.lang.ref.Reference$ReferenceHandler.(java.base/Reference.java:198) >> at java.lang.ref.Reference.startReferenceHandlerThread(java.base/Reference.java:300) >> at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:318) >> at java.lang.System.initPhase1(java.base/System.java:2206) >> Loading java.lang.ref.Finalizer$FinalizerThread >> at java.lang.ref.Finalizer.startFinalizerThread(java.base/Finalizer.java:187) >> at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:319) >> at java.lang.Sy... > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > add class+load+cause and make its output independent of class+load The `runtime/logging/RedefineClasses.java` test fails as a result of these changes. This seems to happen for either of the following reasons: 1. The test times out due to native stack logging being so slow. 
The test uses `-Xlog:all=trace:file=all.log` so will indiscriminately log `class+load+cause+native` for every class loaded and the test seems to load a lot of classes. 2. The test crashes the VM as `VMError::print_native_stack` is apparently not safe to call in all class loading contexts. The attached [hs_err_pid46945.log](https://github.com/openjdk/jdk/files/11941187/hs_err_pid46945.log) shows one such example. Unless someone can suggest a reliable way to work around both of these issues, I propose to drop `class+load+cause+native` logging. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14553#issuecomment-1619169099 From dholmes at openjdk.org Mon Jul 3 22:56:53 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 3 Jul 2023 22:56:53 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v4] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 21:28:43 GMT, Doug Simon wrote: > The test uses -Xlog:all=trace:file=all.log so will indiscriminately log class+load+cause+native for every class loaded and the test seems to load a lot of classes. Are you treating not setting `LogClassLoadingCauseFor` as being a wildcard for "all"? That's not what you had earlier and seems wrong - this needs to be opt-in. > The test crashes the VM as VMError::print_native_stack is apparently not safe to call in all class loading contexts. That is a distinct bug in itself. Probably print_native_stack needs additional safety/state checks. I will file a bug for that.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14553#issuecomment-1619232854 From eliu at openjdk.org Tue Jul 4 01:03:20 2023 From: eliu at openjdk.org (Eric Liu) Date: Tue, 4 Jul 2023 01:03:20 GMT Subject: RFR: 8306136: [vectorapi] Intrinsics of VectorMask.laneIsSet() [v2] In-Reply-To: <74WpJFbQAX7TMMMr-qK9nUcS_lxHHbJEmzTuWbpahfk=.97501257-dd82-43ce-b334-fb6caab35118@github.com> References: <74WpJFbQAX7TMMMr-qK9nUcS_lxHHbJEmzTuWbpahfk=.97501257-dd82-43ce-b334-fb6caab35118@github.com> Message-ID: > VectorMask.laneIsSet() [1] is implemented based on VectorMask.toLong() [2], and it's performance highly depends on the intrinsification of toLong(). However, if `toLong()` is failed to intrinsify, on some architectures or unsupported species, it's much more expensive than pure getBits(). Besides, some CPUs (e.g. with Arm Neon) may not have efficient instructions to implementation toLong(), so we propose to intrinsify VectorMask.laneIsSet separately. > > This patch optimize laneIsSet() by calling the existing intrinsic method VectorSupport.extract(), which actually does not introduce new intrinsic method. The C2 compiler intrinsification logic to support _VectorExtract has also been extended to better support laneIsSet(). It tries to extract the mask's lane value with an ExtractUB node if the hardware backend supports it. While on hardware without ExtractUB backend support , c2 will still try to generate toLong() related nodes, which behaves the same as before the patch. > > Key changes in this patch: > > 1. Reuse intrinsic `VectorSupport.extract()` in Java side. No new intrinsic method is introduced. > 2. In compiler, `ExtractUBNode` is generated if backend support is. If not, the original "toLong" pattern is generated if it's implemented. Otherwise, it uses the default Java `getBits[i]` rather than the expensive and complicated toLong() based implementation. > 3. 
Enable `ExtractUBNode` on AArch64 to extract the lane value for a vector mask in compiler, together with changing its bottom type to TypeInt::BOOL. This helps optimize the conditional selection generated by > > ``` > > public boolean laneIsSet(int i) { > return VectorSupport.extract(..., defaultImpl) == 1L; > } > > ``` > > [Test] > hotspot:compiler/vectorapi and jdk/incubator/vector passed. > > [Performance] > > Below shows the performance gain on 128-bit vector size Neon machine. For 64 and 128 SPECIES, the improvment caused by this intrinsics. For other SPECIES which can not be intrinfied, performance gain comes from the default Java implementation changes, i.e. getBits[i] vs. toLong(). > > > Benchmark Gain (after/before) > microMaskLaneIsSetByte128_con 2.47 > microMaskLaneIsSetByte128_var 1.82 > microMaskLaneIsSetByte256_con 3.01 > microMaskLaneIsSetByte256_var 3.04 > microMaskLaneIsSetByte512_con 4.83 > microMaskLaneIsSetByte512_var 4.86 > microMaskLaneIsSetByte64_con 1.57 > microMaskLaneIsSetByte64_var 1.18... 
Eric Liu has updated the pull request incrementally with one additional commit since the last revision: Bug fix Change-Id: Ib223c4048b29875a62a27d6081ad36a125dec144 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14200/files - new: https://git.openjdk.org/jdk/pull/14200/files/da5a7e72..ebb07165 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14200&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14200&range=00-01 Stats: 98 lines in 2 files changed: 0 ins; 2 del; 96 mod Patch: https://git.openjdk.org/jdk/pull/14200.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14200/head:pull/14200 PR: https://git.openjdk.org/jdk/pull/14200 From eliu at openjdk.org Tue Jul 4 01:06:53 2023 From: eliu at openjdk.org (Eric Liu) Date: Tue, 4 Jul 2023 01:06:53 GMT Subject: RFR: 8306136: [vectorapi] Intrinsics of VectorMask.laneIsSet() In-Reply-To: References: <74WpJFbQAX7TMMMr-qK9nUcS_lxHHbJEmzTuWbpahfk=.97501257-dd82-43ce-b334-fb6caab35118@github.com> Message-ID: <9LCSs6T1M7js3pnYd-fFmWnuLyywmojCbxDcG5Eqey4=.d57bee35-6501-4f0a-a924-f218f3d845ed@github.com> On Wed, 21 Jun 2023 19:04:39 GMT, Paul Sandoz wrote: >> VectorMask.laneIsSet() [1] is implemented based on VectorMask.toLong() [2], and it's performance highly depends on the intrinsification of toLong(). However, if `toLong()` is failed to intrinsify, on some architectures or unsupported species, it's much more expensive than pure getBits(). Besides, some CPUs (e.g. with Arm Neon) may not have efficient instructions to implementation toLong(), so we propose to intrinsify VectorMask.laneIsSet separately. >> >> This patch optimize laneIsSet() by calling the existing intrinsic method VectorSupport.extract(), which actually does not introduce new intrinsic method. The C2 compiler intrinsification logic to support _VectorExtract has also been extended to better support laneIsSet(). It tries to extract the mask's lane value with an ExtractUB node if the hardware backend supports it. 
While on hardware without ExtractUB backend support , c2 will still try to generate toLong() related nodes, which behaves the same as before the patch. >> >> Key changes in this patch: >> >> 1. Reuse intrinsic `VectorSupport.extract()` in Java side. No new intrinsic method is introduced. >> 2. In compiler, `ExtractUBNode` is generated if backend support is. If not, the original "toLong" pattern is generated if it's implemented. Otherwise, it uses the default Java `getBits[i]` rather than the expensive and complicated toLong() based implementation. >> 3. Enable `ExtractUBNode` on AArch64 to extract the lane value for a vector mask in compiler, together with changing its bottom type to TypeInt::BOOL. This helps optimize the conditional selection generated by >> >> ``` >> >> public boolean laneIsSet(int i) { >> return VectorSupport.extract(..., defaultImpl) == 1L; >> } >> >> ``` >> >> [Test] >> hotspot:compiler/vectorapi and jdk/incubator/vector passed. >> >> [Performance] >> >> Below shows the performance gain on 128-bit vector size Neon machine. For 64 and 128 SPECIES, the improvment caused by this intrinsics. For other SPECIES which can not be intrinfied, performance gain comes from the default Java implementation changes, i.e. getBits[i] vs. toLong(). >> >> >> Benchmark Gain (after/before) >> microMaskLaneIsSetByte128_con 2.47 >> microMaskLaneIsSetByte128_var 1.82 >> microMaskLaneIsSetByte256_con 3.01 >> microMaskLaneIsSetByte256_var 3.04 >> microMaskLaneIsSetByte512_con 4.83 >> microMaskLaneIsSetByte512_var 4.86 >> microMaskLaneIsSetByt... > > Getting crashes on linux-x64-debug when using these VM options: > > -XX:UseAVX=3 -XX:+UnlockDiagnosticVMOptions -XX:+UseKNLSetting > > For a test tag of `tier3-vector-avx512` when running tests for `open/test/jdk/:jdk_vector`. 
> > Relevant bits from the HS error log file: > > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/opt/mach5/mesos/work_dir/slaves/cd627e65-f015-4fb1-a1d2-b6c9b8127f98-S9618/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/24abb99d-ff0d-447b-9153-bb7048d6d487/runs/d1fc7a0c-634f-4df4-9d5f-6a464bb177b9/workspace/open/src/hotspot/share/opto/vectorIntrinsics.cpp:2586), pid=1090716, tid=1090731 > # assert(!Matcher::has_predicated_vectors()) failed: should be > # > ... > Current CompileTask: > C2: 9402 989 b jdk.incubator.vector.Int256Vector$Int256Mask::laneIsSet (38 bytes) > > Stack: [0x00007fa325bfc000,0x00007fa325cfd000], sp=0x00007fa325cf78f0, free space=1006k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x1815626] LibraryCallKit::inline_vector_extract()+0xc26 (vectorIntrinsics.cpp:2586) > V [libjvm.so+0x121cc94] LibraryIntrinsic::generate(JVMState*)+0x1c4 (library_call.cpp:117) > V [libjvm.so+0x853141] CallGenerator::do_late_inline_helper()+0x9b1 (callGenerator.cpp:695) > V [libjvm.so+0x9ea704] Compile::inline_incrementally_one()+0xd4 (compile.cpp:2015) > V [libjvm.so+0x9eb873] Compile::inline_incrementally(PhaseIterGVN&)+0x273 (compile.cpp:2098) > V [libjvm.so+0x9eebc7] Compile::Optimize()+0x427 (compile.cpp:2233) > V [libjvm.so+0x9f1f75] Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x1aa5 (compile.cpp:839) > V [libjvm.so+0x84bc04] C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x3c4 (c2compiler.cpp:118) > V [libjvm.so+0x9fdf10] CompileBroker::invoke_compiler_on_method(CompileTask*)+0xa00 (compileBroker.cpp:2265) > V [libjvm.so+0x9fed98] CompileBroker::compiler_thread_loop()+0x618 (compileBroker.cpp:1944) > V [libjvm.so+0xeb6dec] JavaThread::thread_main_inner()+0xcc (javaThread.cpp:719) > V [libjvm.so+0x17970aa] Thread::call_run()+0xba (thread.cpp:217) > V [libjvm.so+0x149715c] 
thread_native_entry(Thread*)+0x11c (os_linux.cpp:775) > ... > model name : Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology c... @PaulSandoz Sorry for the delay. I removed that assertion, since in some cases a vector mask cannot be used even when the predicated feature is supported. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14200#issuecomment-1619314778 From pli at openjdk.org Tue Jul 4 01:34:23 2023 From: pli at openjdk.org (Pengfei Li) Date: Tue, 4 Jul 2023 01:34:23 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: <-UengrhToQL0qKVGetApNHkjRfUPMo8pEte_gtvCK5g=.b9b70067-9a40-445a-b37b-6a4ddee35be5@github.com> References: <-UengrhToQL0qKVGetApNHkjRfUPMo8pEte_gtvCK5g=.b9b70067-9a40-445a-b37b-6a4ddee35be5@github.com> Message-ID: On Mon, 26 Jun 2023 06:45:19 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/vmaskloop.cpp line 424: > >> 422: int vopc = 0; >> 423: if (node->is_Mem()) { >> 424: vopc = node->is_Store() ? Op_StoreVectorMasked : Op_LoadVectorMasked; > > Maybe just for good measure: add an assert that it can only be a Load or a Store. Done in commit 2 > src/hotspot/share/opto/vmaskloop.cpp line 429: > >> 427: } >> 428: if (vopc == 0 || >> 429: !Matcher::match_rule_supported_vector_masked(vopc, vlen, bt)) { > > Do all nodes need to be maskable? Or is it enough if only load/store are maskable? Only load, store and reduction operations need to be masked. We previously supported reductions but that's excluded from this patch - only load and store now. I've updated the code in commit 2.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251380343 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251380835 From pli at openjdk.org Tue Jul 4 01:45:14 2023 From: pli at openjdk.org (Pengfei Li) Date: Tue, 4 Jul 2023 01:45:14 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: <4T2ExJCtPF7g-Os7mQ7cqG2cEXN-jILfitBb-kySlzY=.f57c136c-edc3-48a4-b653-3a5767dcb60a@github.com> On Tue, 27 Jun 2023 17:49:43 GMT, Emanuel Peter wrote: > General question: Do you have any tests with varying loop limit, and check that you stop exactly at the right iteration? Would be even more interesting with mixed type examples. Just to see that you do not over/under duplicate the vectors. Yes, we previously tested this with a lot of fuzzer tests. We did find issues before but they are all fixed now. (Previously we also supported reductions, and it's a bit tricky to duplicate reductions.) > src/hotspot/share/opto/vmaskloop.cpp line 403: > >> 401: int opc = node->Opcode(); >> 402: BasicType bt = elem_bt(node); >> 403: int vlen = Matcher::max_vector_size(bt); > > Theoretically, different `bt` can have different `Matcher::vector_width_in_bytes`. So `vlen` would not always correspond to `MaxVectorSize / element_size`. It just means that here you would end up checking for a shorter length than maybe expected? But maybe that is intended, it depends on how you generate the nodes later. I think it's good, at least for AArch64 SVE. Do you mean that other architecture may prefer using shorter vectors for better performances? (say, using 256-bit on AVX-512?) Does setting a smaller `MaxVectorSize` help? 
> src/hotspot/share/opto/vmaskloop.cpp line 442: > >> 440: // nodes to bail out for complex loops >> 441: bool VectorMaskedLoop::analyze_loop_body_nodes() { >> 442: VectorSet tracked(_arena); > > This is probably a good case where you could use `ResourceMark rm;` and just put the `VectorSet` on the default resource arena. Updated here, and in another place. thanks. > src/hotspot/share/opto/vmaskloop.cpp line 465: > >> 463: for (int idx = 0; idx < n_nodes; idx++) { >> 464: Node* node = _body_nodes.at(idx); >> 465: if ((node->is_Mem() && node->as_Mem()->is_Store())) { > > Suggestion: > > if ((node->is_Store())) { Done > src/hotspot/share/opto/vmaskloop.cpp line 474: > >> 472: if (!in_body(out)) { >> 473: trace_msg(node, "Node has out-of-loop user found"); >> 474: return false; > > Can this be handled in the future with a extract node? I guess you would have to extract it from a variable element, as the last iteration is not always the same. Probably, but I haven't tried it so far. Will do it in the future. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14581#issuecomment-1619339468 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251383442 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251383769 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251383913 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251384357 From pli at openjdk.org Tue Jul 4 02:17:22 2023 From: pli at openjdk.org (Pengfei Li) Date: Tue, 4 Jul 2023 02:17:22 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: <0s9ixJcCQIRzJ3h4tpPwVeC7HmYbdDqhd3V6BWZDUTg=.f2b9dad5-73f2-4a34-b24d-639f4fe3de9e@github.com> Message-ID: On Wed, 28 Jun 2023 10:24:58 GMT, Emanuel Peter wrote: >> I have an example here: >> >> public class Test { >> static int RANGE = 1024; >> >> public static void main(String[] strArr) { >> byte a[] = new byte[RANGE]; >> long b[] = new long[RANGE]; >> test0(a, b); >> } >> >> static void test0(byte[] a, long[] b) { >> for (int i = 0; i < RANGE; i++) { >> a[i]++; >> b[i]++; >> } >> } >> } >> >> `./java -Xcomp -XX:-TieredCompilation -XX:+TraceNewVectors -XX:+TraceLoopOpts -XX:+UnlockExperimentalVMOptions -XX:+UseMaskedLoop -XX:+TraceMaskedLoop -XX:CompileCommand=compileonly,Test::test0 Test.java` >> This are the masks: >> >> Generated vector masks in vmask tree >> Lane_size = 1 >> 3710 LoopVectorMask === _ 367 26 [[ 3711 3712 ]] #vectormask[64]:{byte} >> Lane_size = 2 >> 3711 ExtractLowMask === _ 3710 [[ 3713 3714 ]] #vectormask[32]:{short} >> 3712 ExtractHighMask === _ 3710 [[ 3715 3716 ]] #vectormask[32]:{short} >> Lane_size = 4 >> 3713 ExtractLowMask === _ 3711 [[ 3717 3718 ]] #vectormask[16]:{int} >> 3714 ExtractHighMask === _ 3711 [[ 3719 3720 ]] #vectormask[16]:{int} >> 3715 ExtractLowMask === _ 3712 [[ 3721 3722 ]] #vectormask[16]:{int} >> 3716 ExtractHighMask === _ 3712 [[ 3723 3724 ]] #vectormask[16]:{int} >> Lane_size = 8 >> 
3717 ExtractLowMask === _ 3713 [[ ]] #vectormask[8]:{long} >> 3718 ExtractHighMask === _ 3713 [[ ]] #vectormask[8]:{long} >> 3719 ExtractLowMask === _ 3714 [[ ]] #vectormask[8]:{long} >> 3720 ExtractHighMask === _ 3714 [[ ]] #vectormask[8]:{long} >> 3721 ExtractLowMask === _ 3715 [[ ]] #vectormask[8]:{long} >> 3722 ExtractHighMask === _ 3715 [[ ]] #vectormask[8]:{long} >> 3723 ExtractLowMask === _ 3716 [[ ]] #vectormask[8]:{long} >> 3724 ExtractHighMask === _ 3716 [[ ]] #vectormask[8]:{long} >> >> That is indeed `15` masks. Hmm. Maybe that is the best one can do. And maybe it is not all that bad. But again, would be interesting to see the benchmarks for that case. > > Aha, maybe here we could just get away with 1 vmask for `byte`, and then directly extract 8 vmasks for `long`, since we do not need the ones in the middle? You'd have to generalize your `Extract(High/Low)Mask`. We just benchmarked this "byte + long" case and saw some performance regressions after vectorization. Yes, too many mask operations are expensive. GCC does this in a better way: For adjacent data sizes (larger = 2 * smaller), it extracts two halves of the vector mask, but for non-adjacent data sizes, it re-generates vector masks without extraction. 
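Editorial aside on the mask count discussed above: the following is a minimal arithmetic sketch only, not HotSpot code. The class and method names (`VmaskTreeSize`, `fullTreeMaskCount`, `directMaskCount`) are hypothetical. It assumes element sizes are powers of two and that the full tree halves the root mask once per level, as in the trace shown in this thread. It compares that full tree with the alternative suggested in the review, where only the element sizes that actually occur in the loop get masks.

```java
public class VmaskTreeSize {
    // Number of mask nodes when every level between the smallest and the
    // largest element size is materialized (1 + 2 + 4 + ... masks).
    static int fullTreeMaskCount(int smallestSize, int largestSize) {
        int levels = Integer.numberOfTrailingZeros(largestSize / smallestSize);
        return (1 << (levels + 1)) - 1;
    }

    // Number of mask nodes when only the element sizes that occur in the
    // loop get masks (hypothetical alternative, skipping unused levels).
    static int directMaskCount(int smallestSize, int[] usedSizes) {
        int count = 0;
        for (int size : usedSizes) {
            count += size / smallestSize; // masks needed for this element size
        }
        return count;
    }

    public static void main(String[] args) {
        // byte (1 byte) + long (8 bytes), as in the example in this thread
        System.out.println(fullTreeMaskCount(1, 8));              // 15
        System.out.println(directMaskCount(1, new int[] {1, 8})); // 9
    }
}
```

For the byte + long loop this gives 15 masks for the full tree versus 9 when the short and int levels are skipped. Whether the difference explains the measured regression is not something this sketch can show.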
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251401597 From pli at openjdk.org Tue Jul 4 02:22:23 2023 From: pli at openjdk.org (Pengfei Li) Date: Tue, 4 Jul 2023 02:22:23 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 17:34:42 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/vmaskloop.cpp line 735: > >> 733: vnode = new StoreVectorMaskedNode(ctrl, mem, addr, val, at, mask); >> 734: } >> 735: } else if (VectorNode::is_convert_opcode(opc)) { > > Ok, this does work for same size conversions: > `./java -Xcomp -XX:-TieredCompilation -XX:+TraceNewVectors -XX:+TraceLoopOpts -XX:+UnlockExperimentalVMOptions -XX:+UseMaskedLoop -XX:+TraceMaskedLoop -XX:CompileCommand=compileonly,Test::test0 -XX:+TraceSuperWord Test.java` > > public class Test { > static int RANGE = 1024; > > public static void main(String[] strArr) { > double a[] = new double[RANGE]; > long b[] = new long[RANGE]; > test0(a, b); > } > > static void test0(double[] a, long[] b) { > for (int i = 0; i < RANGE; i++) { > b[i] = (long)a[i]; > } > } > } > > Good to see some conversion is possible. But if I replace double with float, I get `Vector element size does not match`. Can that limitation be lifted? We tried to do that but found some obstacles (perhaps you are aware of). > If you started implementing type conversion for different size types, you'd have to extract_lo/hi or pack the vectors. That would be an invasive change to the current implementation. As you said in your overall feedback, conversions between types of different data sizes requires vector pack/unpack which has conflict with existing semantics of current C2 type conversion nodes. We are still considering how to do it. 
It would be good if you have better suggestions or can help us with this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251403951 From pli at openjdk.org Tue Jul 4 02:29:09 2023 From: pli at openjdk.org (Pengfei Li) Date: Tue, 4 Jul 2023 02:29:09 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: <0s9ixJcCQIRzJ3h4tpPwVeC7HmYbdDqhd3V6BWZDUTg=.f2b9dad5-73f2-4a34-b24d-639f4fe3de9e@github.com> Message-ID: On Tue, 27 Jun 2023 17:47:33 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/vmaskloop.cpp line 785: >> >>> 783: } >>> 784: >>> 785: // Duplicate vectorized operations with given vector element size >> >> Got to here today. There should probably be some comment higher up that you first replace scalars with one vector each, and then duplicate them for the larger types that need multiple vectors. >> >> I'm also concerned that there may be some platforms where the max vector width in bytes is not the same for all types. But maybe all platforms that support masked register ops also all have the same vector width in bytes for all types? > > Assume we only allow `32` bit registers for `int`, but `64` bits for doubles. Now you'd be assuming that there need to be double as many `double` vectors as `int` vectors. But actually, they need the same amount of vectors, because vectors of both sizes fit exactly `8` elements. More comments are added. I can only say this way is good on current AArch64. As we don't have enough knowledge of other architectures, we may need some help if we need to change this. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251405212 From pli at openjdk.org Tue Jul 4 02:29:14 2023 From: pli at openjdk.org (Pengfei Li) Date: Tue, 4 Jul 2023 02:29:14 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Wed, 28 Jun 2023 10:20:25 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/vmaskloop.cpp line 790: > >> 788: // Compute vector duplication count and the vmask tree level >> 789: int dup_cnt = lane_size / _size_stats.smallest_size(); >> 790: int level = exact_log2(dup_cnt); > > Rename `level` to something more expressive. Maybe just `vmask_tree_level`. Also in all other methods. Otherwise it is not quite clear what it is supposed to be. Renamed > src/hotspot/share/opto/vmaskloop.cpp line 798: > >> 796: if (type2aelembytes(statement_bottom_type(stmt)) != lane_size) { >> 797: continue; >> 798: } > > You could assert here, that the max vector size for bt is as expected. It's done. > src/hotspot/share/opto/vmaskloop.cpp line 874: > >> 872: void VectorMaskedLoop::adjust_vector_node(Node* vn, Node_List* vmask_tree, >> 873: int level, int mask_off) { >> 874: Node* vmask = vmask_tree->at((1 << level) + mask_off); > > Again, rename `level`. Maybe it could be `vmask_tree_level` and `vmask_tree_level_offset`? Here I finally understood what you mean by the two variables `level` and `mask_off`. Also renamed. 
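Editorial aside on `level` and `mask_off`: the quoted snippets compute `dup_cnt = lane_size / smallest_size`, `level = exact_log2(dup_cnt)`, and then look up `vmask_tree->at((1 << level) + mask_off)`. The sketch below restates that index arithmetic in Java. The class `VmaskTreeIndex` and its methods are hypothetical, and the child relation (`2 * i` and `2 * i + 1`) is an assumption based on the usual 1-based heap layout of a binary tree, not something confirmed by the patch.

```java
public class VmaskTreeIndex {
    // Tree level for a given lane size: level 0 holds the root mask for the
    // smallest element size, and each level doubles the number of masks.
    static int level(int laneSize, int smallestSize) {
        int dupCnt = laneSize / smallestSize;
        if (Integer.bitCount(dupCnt) != 1) {
            throw new IllegalArgumentException("duplication count must be a power of two");
        }
        return Integer.numberOfTrailingZeros(dupCnt);
    }

    // 1-based position of the mask at `maskOff` within `level`.
    static int index(int level, int maskOff) {
        return (1 << level) + maskOff;
    }

    // Assumed heap-style children of a mask: low half first, then high half.
    static int lowChild(int index)  { return 2 * index; }
    static int highChild(int index) { return 2 * index + 1; }

    public static void main(String[] args) {
        int lvl = level(8, 1);              // long lanes, byte is the smallest
        System.out.println(lvl);            // 3
        System.out.println(index(lvl, 0));  // 8
        System.out.println(index(lvl, 7));  // 15
        System.out.println(lowChild(1) + " " + highChild(1)); // 2 3
    }
}
```

With this layout the root sits at index 1 and the eight long-sized masks occupy indices 8 to 15, which is why a level and an offset within that level are enough to locate a mask.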
> src/hotspot/share/opto/vmaskloop.cpp line 876: > >> 874: Node* vmask = vmask_tree->at((1 << level) + mask_off); >> 875: int lane_size = type2aelembytes(Matcher::vector_element_basic_type(vmask)); >> 876: uint vector_size_in_bytes = Matcher::max_vector_size(T_BYTE); > > Can you add an assert that this is the same as `Matcher::vector_width_in_bytes(Matcher::vector_element_basic_type(vmask))` ? Asserts added. > src/hotspot/share/opto/vmaskloop.cpp line 884: > >> 882: Node* ptr = vn->in(MemNode::Address); >> 883: Node* base = ptr->in(AddPNode::Base); >> 884: int mem_scale = Matcher::max_vector_size(T_BYTE); > > Duplicate of `vector_size_in_bytes`? Aha, it's duplicated now. Previously we did some strided access support so we added this. Removed now. > src/hotspot/share/opto/vmaskloop.cpp line 893: > >> 891: // 2) For populate index, update start index for non-zero mask offset >> 892: if (mask_off != 0) { >> 893: int v_stride = vector_size_in_bytes / lane_size * _cl->stride_con(); > > Is there any test for PopulateIndex with stride that is not `1`? For now I guess only `-1` would even be allowed. Good question, I will check it then. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251405695 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251405307 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251405747 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251405983 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251406373 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251406879 From pli at openjdk.org Tue Jul 4 02:29:15 2023 From: pli at openjdk.org (Pengfei Li) Date: Tue, 4 Jul 2023 02:29:15 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: <0s9ixJcCQIRzJ3h4tpPwVeC7HmYbdDqhd3V6BWZDUTg=.f2b9dad5-73f2-4a34-b24d-639f4fe3de9e@github.com> Message-ID: On Wed, 28 Jun 2023 10:16:28 GMT, Emanuel Peter wrote: >> I just added some shorts, so that the int and float would be duplicated ;) > > Suggested solution: track the last memory state per slice, just like I recently did in `SuperWord::schedule_reorder_memops` with `current_state_in_slice`. I'm not quite familiar with memory slice. Will do more investigation and come back later. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251405644 From pli at openjdk.org Tue Jul 4 02:40:15 2023 From: pli at openjdk.org (Pengfei Li) Date: Tue, 4 Jul 2023 02:40:15 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Wed, 28 Jun 2023 10:41:12 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > src/hotspot/share/opto/vmaskloop.cpp line 89: > >> 87: cl->mark_loop_vectorized(); >> 88: cl->mark_vector_masked(); >> 89: _phase->C->set_max_vector_size(MaxVectorSize); > > What is this for? 
On AArch64 with SVE, we pre-initialize an all-true register (p7) only in compiled methods where `MaxVectorSize` is set. So if a compiled method implicitly uses ptrue (because it's vectorized), we need to set the `MaxVectorSize` to guarantee p7 is initialized. This usage starts since the initial SVE patch (in 2020). We know this is not a perfect solution and @fg1417 is currently investigating if we have better solutions. > src/hotspot/share/opto/vmaskloop.cpp line 531: > >> 529: if (!addp->is_AddP() || !operates_on_array_of_type(addp, mem_type)) { >> 530: return nullptr; >> 531: } > > I guess this prevents you from having `Unsafe` use type mismatched loads/stores. But it also prevents vectorization in cases where one might just store shorts into an int array using `Unsafe`. This saves you a lot of headaches. You probably don't lose too much for not vectorizing those cases. Exactly. Is there any case in real applications that may store shorts into an int array? > src/hotspot/share/opto/vmaskloop.cpp line 642: > >> 640: >> 641: // Helper method for finding or creating a vector input at specified index >> 642: Node* VectorMaskedLoop::get_vector_input(Node* node, uint idx) { > > This is analogous to `SuperWord::vector_opd`. Can we not refactor things so that we can share the code? I like refactoring, but it may require big effort. Shall we discuss it later in our future conversations? > src/hotspot/share/opto/vmaskloop.cpp line 939: > >> 937: Node* root_vmask = vmask_tree->at(1); >> 938: >> 939: // Replace vectorization candidate nodes to vector nodes > > Expand explanation. Say that you are for now only generating a single vector node per scalar node. And that the duplication afterwards makes sure that all scalar nodes are "widened" to the same number of elements. The smalles type using a single vector, larger types using multiple (duplicated) vectors per scalar node. Thanks for suggestion. Done in commit 2. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251410392 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251410844 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251411540 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251411689 From pli at openjdk.org Tue Jul 4 02:44:17 2023 From: pli at openjdk.org (Pengfei Li) Date: Tue, 4 Jul 2023 02:44:17 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Wed, 28 Jun 2023 10:58:02 GMT, Emanuel Peter wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > test/hotspot/jtreg/compiler/vectorization/runner/ArrayCopyTest.java line 82: > >> 80: @IR(applyIfCPUFeature = {"sve", "true"}, >> 81: applyIf = {"UseMaskedLoop", "true"}, >> 82: counts = {IRNode.LOOP_VECTOR_MASK, ">0"}) > > We could also do this: > If the CPU features do not support the features for `UseMaskedLoop`, then just put it back to `false`. That way, we do not have to check for the required cpu features. Because when the flag it `true`, we know the platform must also support the corresponding masked instructions. Yes, thanks for this. I will clean up all the IR rules after JDK-8311130 and JDK-8309697 are done. > test/hotspot/jtreg/compiler/vectorization/runner/ArrayInvariantFillTest.java line 69: > >> 67: @Test >> 68: @IR(applyIfCPUFeatureOr = {"asimd", "true", "sse2", "true"}, >> 69: applyIf = {"OptimizeFill", "false"}, > > This seems unrelated. Why did you have to add this? Will cleanup this in JDK-8309697. 
> test/hotspot/jtreg/compiler/vectorization/runner/VectorizationTestRunner.java line 84: > >> 82: TestFramework irTest = new TestFramework(klass); >> 83: // Add extra VM options to enable more auto-vectorization chances >> 84: irTest.addFlags("-XX:-OptimizeFill"); > > Aha, you removed this too. Fair enough. But since the runner is currently requiring everything to be `flagless`, now I cannot actually force `-XX:-OptimizeFill` from the outside. And that means that potentially the tests are never actually run with `OptimizeFill` off, and we never actually can check the IR rules. We lose test coverage. That makes me a bit nervous. > > Suggestion: if tests actually require the flag off to execute the IR rule, then we should have two scenarios, one where the flag is on, and one when it is off. Again, will cleanup this in JDK-8309697. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251412833 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251413032 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251413480 From haosun at openjdk.org Tue Jul 4 04:53:11 2023 From: haosun at openjdk.org (Hao Sun) Date: Tue, 4 Jul 2023 04:53:11 GMT Subject: RFR: 8310355: Move the stub test from initialize_final_stubs() to test/hotspot/gtest Message-ID: Three groups of runtime routines, i.e. arraycopy, copy and fill, are tested inside function `initialize_final_stubs()`. The test runs every time the debug VM is started. I think it's a usual convention that it's better not to run functional tests on startup. Hence, this patch proposes to move the stub test under `test/hotspot/gtest`. It's one copy-paste patch, except the following two minor changes: 1) Remove `ASSERT` condition check, and the gtest case will be run for release build as well. 2) Use the gtest helper `ASSERT_TRUE()` to replace the `assert()`. Note that the downside is that we won't catch stub implementation errors immediately on startup. 
Test: 1) Cross compilations on arm32/s390/ppc/riscv passed. 2) tier1 test passed on Linux/AArch64, Linux/x86_64 and macOS/Apple silicon. Note that hotspot/gtest is run in tier1. 3) VM builds with several different options (e.g., zero build, release build, fastdebug build, client variant, no C1/c2) passed on Linux/AArch64 and Linux/x86_64. 4) I manually injected errors in arraycopy/copy/fill stubs and verified `make test TEST="gtest:StubRoutines"` fails or not. But this check only worked for array fill stub. For the remaining two stubs, VM build (`make images`) failed firstly before running the gtest, because the built VM with these "erroneous" stubs were executed to do something such as loading Java core library code. ------------- Commit messages: - 8310355: Move the stub test from initialize_final_stubs() to test/hotspot/gtest Changes: https://git.openjdk.org/jdk/pull/14765/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14765&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8310355 Stats: 272 lines in 2 files changed: 153 ins; 119 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14765.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14765/head:pull/14765 PR: https://git.openjdk.org/jdk/pull/14765 From haosun at openjdk.org Tue Jul 4 04:53:12 2023 From: haosun at openjdk.org (Hao Sun) Date: Tue, 4 Jul 2023 04:53:12 GMT Subject: RFR: 8310355: Move the stub test from initialize_final_stubs() to test/hotspot/gtest In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 04:45:31 GMT, Hao Sun wrote: > Three groups of runtime routines, i.e. arraycopy, copy and fill, are tested inside function `initialize_final_stubs()`. The test runs every time the debug VM is started. > > I think it's a usual convention that it's better not to run functional tests on startup. Hence, this patch proposes to move the stub test under `test/hotspot/gtest`. 
> > It's one copy-paste patch, except the following two minor changes: > > 1) Remove `ASSERT` condition check, and the gtest case will be run for release build as well. > > 2) Use the gtest helper `ASSERT_TRUE()` to replace the `assert()`. > > Note that the downside is that we won't catch stub implementation errors immediately on startup. > > Test: > > 1) Cross compilations on arm32/s390/ppc/riscv passed. > > 2) tier1 test passed on Linux/AArch64, Linux/x86_64 and macOS/Apple silicon. Note that hotspot/gtest is run in tier1. > > 3) VM builds with several different options (e.g., zero build, release build, fastdebug build, client variant, no C1/c2) passed on Linux/AArch64 and Linux/x86_64. > > 4) I manually injected errors in arraycopy/copy/fill stubs and verified `make test TEST="gtest:StubRoutines"` fails or not. But this check only worked for array fill stub. For the remaining two stubs, VM build (`make images`) failed firstly before running the gtest, because the built VM with these "erroneous" stubs were executed to do something such as loading Java core library code. test/hotspot/gtest/runtime/test_stubRoutines.cpp line 2: > 1: /* > 2: * Copyright (c) 1997, 2023, Oracle and/or its affiliates. All rights reserved. Most of this file is copied from `stubRoutines.cpp`, hence I continue to use the copyright year from `stubRoutines.cpp`. But I'm not 100% sure if it is correct. Please let me know if it is not. Thanks. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14765#discussion_r1251476506 From stuefe at openjdk.org Tue Jul 4 05:16:56 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 4 Jul 2023 05:16:56 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect [v2] In-Reply-To: References: <7QVoA3Ew5bOqflalrTQdjFoY-3E--4BI5y9rI14UecQ=.40ca559b-af0a-4ef3-bc9f-7dac0a52e77d@github.com> <_j4bapWKtvCoEfs2PyxKmDJTqfAMuRePkPJX2yx2W-8=.6cedb0a3-817e-433e-b1f2-288844d2983c@github.com> Message-ID: On Mon, 3 Jul 2023 16:03:51 GMT, Johan Sjölen wrote: > > But with your proposal, I pay for the construction of both LogMessage and NonInterleavingLogStream every time. > > Yeah, I made a mental footnote that `LogTarget` should be able to be used to initialize these inside of an if-block. UL is still a bit painful about these things. My suggestion sounds like a massive pain, I'm happy if you don't do it :). Thank you @jdksjolen ! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1251492288 From stuefe at openjdk.org Tue Jul 4 05:24:12 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 4 Jul 2023 05:24:12 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v6] In-Reply-To: References: Message-ID: > I recently spent too much time trying to understand the interleaving of narrow Klass decoding with CDS. Thanks to @iklam for clarifying some details. This patch aims to make these interleavings easier to understand and more correct. > > ---- > > CDS narrow Klass handling plays a role for archived heap objects. > > When dumping heap objects, we must recompute their narrow Klass ids since the relative positions of archived Klass instances change compared to their live counterparts in the dump time JVM. > We recompute those narrow Klass ids using the following encoding scheme: > - base = future assumed mapping start address > - shift = dump time (!)
JVMs encoding shift (A) > see `ArchiveHeapWriter::update_header_for_requested_obj` https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/cds/archiveHeapWriter.cpp#L419-L425 > > At CDS runtime, we load the CDS archive, then place the class space behind it. We then initialize narrow Klass encoding for the resulting combined Klass range such that: > - encoding base is the range start address (mapping base) > - encoding shift is always zero > see `CompressedKlassPointers::initialize` : https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/oops/compressedOops.cpp#L195-L231 > > The lengthy comment there is misleading and partly wrong (regrettable since it was mine :) > > At dump time, when initializing the JVM, we also set the encoding base to klass range start and shift to zero (also `CompressedKlassPointers::initialize`). That is the shift we later use for (A); hence, that shift is zero. > > ------------------- > > There are some small things wrong with the current code. That wrongness does not lead to errors but makes understanding the code difficult. > > Mainly, the *dump time* VM's narrow Klass encoding scheme is irrelevant for the encoding employed on the future runtime archive since we recalculate the narrow Klass ids for archived heap objects. That means: > > In `CompressedKlassPointers::initialize` there is no need to fix the encoding base and shift for the *dump time* JVM. Dump time JVM can use whatever base+shift it pleases; it can use the same code path as when CDS is off (e.g. use zero-based encoding). > > In `ArchiveHeapWriter::update_header_for_requested_obj`, we should not use the dump time JVM shift for pre-computing the narrow Klass ids. Instead, we should use the *minimal shift needed for the maximal possible future Klass range size*. The comment in ArchiveHeapWrit... 
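The base-plus-shift scheme described in the quoted text can be sketched as plain arithmetic. This is only an illustration: the class and method names are hypothetical, addresses are modelled as Java longs, and the 32-bit width of a narrow Klass id is an assumption made for the sketch rather than something stated in this thread. The last method shows one way to read "minimal shift needed for the maximal possible future Klass range size".

```java
// Illustrative sketch only. All names are hypothetical and the 32-bit narrow
// Klass id width is an assumption; this is not the HotSpot implementation.
final class NarrowKlassSketch {

    static final int NARROW_KLASS_BITS = 32; // assumed width of a narrow Klass id

    // Encode a Klass address relative to an encoding base. The offset from the
    // base is assumed to be aligned to (1 << shift).
    static long encode(long klassAddr, long base, int shift) {
        return (klassAddr - base) >>> shift;
    }

    // Decode a narrow Klass id back to an address.
    static long decode(long narrowKlass, long base, int shift) {
        return base + (narrowKlass << shift);
    }

    // Smallest shift that lets every offset in [0, maxRangeBytes) fit into
    // NARROW_KLASS_BITS bits. Requires maxRangeBytes >= 1.
    static int minimalShift(long maxRangeBytes) {
        int bitsNeeded = 64 - Long.numberOfLeadingZeros(maxRangeBytes - 1);
        return Math.max(0, bitsNeeded - NARROW_KLASS_BITS);
    }

    public static void main(String[] args) {
        long base = 0x800000000L;             // assumed future mapping start address
        int shift = minimalShift(1L << 32);   // a 4 GB range needs no shift
        long nk = encode(base + 0x1238, base, shift);
        System.out.println("shift = " + shift + ", narrow id = 0x" + Long.toHexString(nk));
        System.out.println("decoded offset = 0x" + Long.toHexString(decode(nk, base, shift) - base));
    }
}
```

The point of the sketch is that the ids written into the archive depend only on the requested base and the chosen shift, not on whatever encoding the dump-time JVM happens to use for its own live Klass instances.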
Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: Fix Windows build ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14688/files - new: https://git.openjdk.org/jdk/pull/14688/files/4de3995d..4dbfcc9e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14688&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14688&range=04-05 Stats: 16 lines in 4 files changed: 7 ins; 6 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14688.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14688/head:pull/14688 PR: https://git.openjdk.org/jdk/pull/14688 From stuefe at openjdk.org Tue Jul 4 05:31:09 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 4 Jul 2023 05:31:09 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v7] In-Reply-To: References: Message-ID: > I recently spent too much time trying to understand the interleaving of narrow Klass decoding with CDS. Thanks to @iklam for clarifying some details. This patch aims to make these interleavings easier to understand and more correct. > > ---- > > CDS narrow Klass handling plays a role for archived heap objects. > > When dumping heap objects, we must recompute their narrow Klass ids since the relative positions of archived Klass instances change compared to their live counterparts in the dump time JVM. > We recompute those narrow Klass ids using the following encoding scheme: > - base = future assumed mapping start address > - shift = dump time (!) JVMs encoding shift (A) > see `ArchiveHeapWriter::update_header_for_requested_obj` https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/cds/archiveHeapWriter.cpp#L419-L425 > > At CDS runtime, we load the CDS archive, then place the class space behind it. 
We then initialize narrow Klass encoding for the resulting combined Klass range such that: > - encoding base is the range start address (mapping base) > - encoding shift is always zero > see `CompressedKlassPointers::initialize` : https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/oops/compressedOops.cpp#L195-L231 > > The lengthy comment there is misleading and partly wrong (regrettable since it was mine :) > > At dump time, when initializing the JVM, we also set the encoding base to klass range start and shift to zero (also `CompressedKlassPointers::initialize`). That is the shift we later use for (A); hence, that shift is zero. > > ------------------- > > There are some small things wrong with the current code. That wrongness does not lead to errors but makes understanding the code difficult. > > Mainly, the *dump time* VM's narrow Klass encoding scheme is irrelevant for the encoding employed on the future runtime archive since we recalculate the narrow Klass ids for archived heap objects. That means: > > In `CompressedKlassPointers::initialize` there is no need to fix the encoding base and shift for the *dump time* JVM. Dump time JVM can use whatever base+shift it pleases; it can use the same code path as when CDS is off (e.g. use zero-based encoding). > > In `ArchiveHeapWriter::update_header_for_requested_obj`, we should not use the dump time JVM shift for pre-computing the narrow Klass ids. Instead, we should use the *minimal shift needed for the maximal possible future Klass range size*. The comment in ArchiveHeapWrit... Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains ten commits: - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding - revert accidental change - Fix Windows build - Add alternative for !INCLUDE_CDS_JAVA_HEAP path - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding - fix comment - Merge - -remove narrow_klass_xxx from FileMap - remove ArchiveHeapWriter::precomputed_narrow_klass_base_delta and replaced it with clear comments - changed runtime fail condition to asserts in FileMapInfo::can_use_heap_region() - fix-cleanup-CDS-nKlass-encoding ------------- Changes: https://git.openjdk.org/jdk/pull/14688/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14688&range=06 Stats: 149 lines in 8 files changed: 67 ins; 50 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/14688.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14688/head:pull/14688 PR: https://git.openjdk.org/jdk/pull/14688 From rkennke at openjdk.org Tue Jul 4 07:03:11 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 4 Jul 2023 07:03:11 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v38] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 15:17:29 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Build fixes > > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Array.java line 68: > >> 66: return !VM.getVM().isCompressedOopsEnabled(); >> 67: } >> 68: } > > I think `isCompressedOopsEnabled` already does the right thing, so: > > Suggestion: > > if (type == BasicType.T_OBJECT || type == BasicType.T_ARRAY) { > return !VM.getVM().isCompressedOopsEnabled(); > } On 32-bit, isCompressedOopsEnabled() would be false, so the method returns true, which means we would 8-byte align reference array elements? That doesn't seem right. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251582196 From dnsimon at openjdk.org Tue Jul 4 07:12:13 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 4 Jul 2023 07:12:13 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v5] In-Reply-To: References: Message-ID: > In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. > > This PR proposes a `TraceClassLoadingCause` VM flag: > > > product(ccstr, TraceClassLoadingCause, nullptr, DIAGNOSTIC, \ > "Print a stack trace when loading a class whose fully" \ > "qualified name contains this string ("*" matches " \ > "any class).") \ > > > I would have liked to implement this using Unified Logging but UL has no support for filtering on the class names. 
> > Example usage: > > java -XX:+UnlockDiagnosticVMOptions -XX:TraceClassLoadingCause=Thread --version > Loading java.lang.Thread > Loading java.lang.Thread$FieldHolder > Loading java.lang.Thread$Constants > Loading java.lang.Thread$UncaughtExceptionHandler > Loading java.lang.ThreadGroup > Loading java.lang.BaseVirtualThread > Loading java.lang.VirtualThread > Loading java.lang.ThreadBuilders$BoundVirtualThread > Loading java.lang.Thread$State > at jdk.internal.misc.VM.toThreadState(java.base/VM.java:336) > at java.lang.Thread.threadState(java.base/Thread.java:2731) > at java.lang.Thread.isTerminated(java.base/Thread.java:2738) > at java.lang.Thread.getThreadGroup(java.base/Thread.java:1957) > at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:314) > at java.lang.System.initPhase1(java.base/System.java:2206) > Loading java.lang.Thread$ThreadIdentifiers > at java.lang.Thread.(java.base/Thread.java:734) > at java.lang.Thread.(java.base/Thread.java:1477) > at java.lang.ref.Reference$ReferenceHandler.(java.base/Reference.java:198) > at java.lang.ref.Reference.startReferenceHandlerThread(java.base/Reference.java:300) > at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:318) > at java.lang.System.initPhase1(java.base/System.java:2206) > Loading java.lang.ref.Finalizer$FinalizerThread > at java.lang.ref.Finalizer.startFinalizerThread(java.base/Finalizer.java:187) > at java.lang.ref.Reference$1.startThreads(java.base/Reference.java:319) > at java.lang.System.initPhase1(java.base/System.java:2206) > ... 
Doug Simon has updated the pull request incrementally with one additional commit since the last revision: require LogClassLoadingCauseFor to class load cause logging to produce output ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14553/files - new: https://git.openjdk.org/jdk/pull/14553/files/6947f359..24f8539a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14553&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14553&range=03-04 Stats: 13 lines in 3 files changed: 8 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/14553.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14553/head:pull/14553 PR: https://git.openjdk.org/jdk/pull/14553 From dnsimon at openjdk.org Tue Jul 4 07:12:14 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 4 Jul 2023 07:12:14 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v4] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 22:53:58 GMT, David Holmes wrote: > Are you treating not setting LogClassLoadingCauseFor as being a wildcard for "all"? That's not what you had earlier and seems wrong - this needs to be opt-in. 
Done: 24f8539a780aa6f6284c0fd7d4eee34d416cb544 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14553#issuecomment-1619637154 From rkennke at openjdk.org Tue Jul 4 07:28:12 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 4 Jul 2023 07:28:12 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v38] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 17:29:06 GMT, Aleksey Shipilev wrote: >> Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: >> >> Build fixes > > src/hotspot/share/gc/shared/jvmFlagConstraintsGC.cpp line 343: > >> 341: JVMFlag::printError(verbose, >> 342: "MinTLABSize (" SIZE_FORMAT " bytes) must be " >> 343: "less than or equal to ergonomic TLAB maximum (" SIZE_FORMAT " words)\n", > > Wait, let's not mix "bytes" and "words" in the same message. Users do not readily know what is the HeapWordSize on the machine, or even how bytes and words might be related. > > Is there a reason to even change these? Does the original code overflow? Yeah, IIRC I changed this to deal with overflow (on 32-bit) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1251608169 From dholmes at openjdk.org Tue Jul 4 07:38:56 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 4 Jul 2023 07:38:56 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v5] In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 07:12:13 GMT, Doug Simon wrote: >> In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. 
>> >> This PR adds support for `-Xlog:class+load+cause` and `-Xlog:class+load+cause+native` that produce output according to a new `LogClassLoadingCauseFor` VM flag: >> >> >> product(ccstr, LogClassLoadingCauseFor, nullptr, \ >> "Apply -Xlog:class+load+cause* to classes whose fully " \ >> "qualified name contains this string ("*" matches " \ >> "any class).") \ >> >> >> Example usage: >> >> java "-Xlog:class+load+cause*" -XX:LogClassLoadingCauseFor=java.util.concurrent.ConcurrentHashMap$V --version >> [0.075s][info][class,load,cause] Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView: >> [0.075s][info][class,load,cause] at java.util.concurrent.ConcurrentHashMap.values(java.base/ConcurrentHashMap.java:1263) >> [0.075s][info][class,load,cause] at jdk.internal.loader.NativeLibraries.find(java.base/NativeLibraries.java:102) >> [0.075s][info][class,load,cause] at java.lang.ClassLoader.findNative(java.base/ClassLoader.java:2457) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.init(java.base/Native Method) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.(java.base/UnixNativeDispatcher.java:817) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystem.(java.base/UnixFileSystem.java:96) >> [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystem.(java.base/BsdFileSystem.java:50) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystem.(java.base/MacOSXFileSystem.java:52) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:44) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:37) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystemProvider.(java.base/UnixFileSystemProvider.java:78) >> [0.075s][info][class,load,cau... 
> > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > require LogClassLoadingCauseFor to class load cause logging to produce output Updates look good. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14553#pullrequestreview-1512258334 From pli at openjdk.org Tue Jul 4 08:50:14 2023 From: pli at openjdk.org (Pengfei Li) Date: Tue, 4 Jul 2023 08:50:14 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization In-Reply-To: <8KPkr2loby3RVIrYQBiXWv3Ph2E0saSLVDBMFHi88LQ=.b1ffb28d-54a8-4dcc-9472-e53b055a72ee@github.com> References: <8KPkr2loby3RVIrYQBiXWv3Ph2E0saSLVDBMFHi88LQ=.b1ffb28d-54a8-4dcc-9472-e53b055a72ee@github.com> Message-ID: On Thu, 29 Jun 2023 10:54:29 GMT, Emanuel Peter wrote: >> Hi @eme64, >> >> I guess you have done your first round of review. @fg1417 and I really appreciate all your constructive inputs. By reading your comments, I believe you have reviewed this patch in very detail. Thanks again! >> >> What I am doing now: >> >> - I'm trying to fix the issues which I think can be fixed immediately. >> - I'm trying to answer all your simple questions ASAP. >> >> For your request of big refactoring work, I feel like I personally may not have enough time and capability to complete it in a short time. We may need some discussion about it. But it's great to know more about your "hybrid vectorizer" plan from your feedback. It looks like a grand plan, and requires significant effort and cooperation. I strongly agree that we need some conversation to discuss where we should move forward and what we can cooperate. Could you give us a moment to digest your idea before we schedule a conversation? >> >> BTW: What's your preferred time for a conversation? We are based in Shanghai (GMT+8) > > Hi @pfustc ! > > I'm grad you appreciate my review. 
> >> For your request of big refactoring work, I feel like I personally may not have enough time and capability to complete it in a short time. > > Are you under some time constraint? No pressure from my side, take the time you need. > > I would very much love to have a conversation over a video call with you. I think that would be beneficial for all of us. The problem on our side (Oracle) is intellectual property concerns. OpenJDK emails and PRs are all under the Oracle Contributor Agreement. So there I'm free to have conversations. I'm trying to figure out if we can have a similar frame for a video call, sadly it may take a few weeks or months to get that sorted, as many people are on summer vacation. > > Please take some time to digest the feedback. This is a big change set, it will take a while to be ready for integration at any rate. And again, I would really urge you to consider some refactoring of SuperWord in a separate RFE before this change here. > > I'm looking forward to more collaboration - over PR comments, emails, and hopefully eventually video calls as well > Emanuel Hi @eme64, In commit 2, I have fixed all simple issues according to your comments and marked them "resolved". We can then spend more time on the remaining unresolved issues. Now I'd like to answer more questions in your overall feedback. > You could not mask all instructions, just loads and stores. But do you really need to mask all other instructions too? I guess not if they do not have side-effects, right? Adding avx/avx2 would unlock this feature for many more Intel machines. Besides loads and stores, vector reductions also need to be masked because they do have side effects (only active lanes should be involved in reduction operations). But yes, reduction support is already excluded from this patch because of performance. Perhaps in the future we can consider transforming reductions to non-reductions (just like what you did recently in SuperWord) to get better performance.
In commit 2, I have updated this code according to your suggestion. Thanks for it! Regarding avx/avx2 support, I'm afraid we don't have enough knowledge and test resources for x86. We may need Intel's help if we want to do this. > Indexing arrays: From my experiments, I have to conclude that you only allow simple indexing of the form a[i], no offset or scaling a[i*2 + 3]. Scaling support is excluded from this patch because such accesses are actually "strided accesses". We cannot get better performance for them with the current gather/scatter nodes in C2. Offset support, however, is included (except for the one special case of `iv + stride` that you experimented with). You may try other cases like `a[i + 2]` or `a[i - 3]` and see that they are vectorized. > Why not use this for main-loops? As I have mentioned above, we have experimented with more than what we do in this patch, including reductions, strided accesses, type conversions and normal (unsplit) loops. Indeed, at the beginning, we used this for normal loops and did vectorization before C2's loop iteration split - this is the ideal SVE-style vectorization on AArch64 as the generated code is very elegant. But unfortunately, the performance results were not as good as we expected, and it added a lot of complexity because we needed to make it compatible with C2's loop strip-mining. Later, we switched to using this for post loops. This removes a lot of complexity and shows better performance (at least on all SVE CPUs we have) now. Using this for main loops is still in our long-term plan, but not in the short term because the current SuperWord can do it well. > What about a hybrid vectorizer? While working on this patch, we were also thinking about how to make this co-exist with SuperWord. That's a big question! But unfortunately, the current SuperWord code is quite convoluted with a heavy historical burden, and very few people in the JDK community were interested in this before. And for a long time, we have been hoping someone would refactor it.
But before you, nobody seems to have the interest or ambition to do it. We now are very glad to know that you have such "hybrid" plan. But I think, how to refactor current SuperWord code also depends on what the "hybrid vectorizer" eventually looks like. We may need more discussions in the future about the direction of refactoring. Looking forward to having a video call with you. Thanks, Pengfei ------------- PR Comment: https://git.openjdk.org/jdk/pull/14581#issuecomment-1619814602 From mbaesken at openjdk.org Tue Jul 4 11:55:08 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 4 Jul 2023 11:55:08 GMT Subject: RFR: JDK-8311285: report some fontconfig related environment variables in hs_err file Message-ID: There are a number of important environment variables influencing how fontconfig works. See for example https://man.archlinux.org/man/fonts-conf.5 Some of them should be added to the list of reported environment variables in hs_err file because e.g. a bad setting for some of them can even lead sometimes to crashes. ------------- Commit messages: - JDK-8311285 Changes: https://git.openjdk.org/jdk/pull/14767/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14767&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311285 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14767.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14767/head:pull/14767 PR: https://git.openjdk.org/jdk/pull/14767 From epeter at openjdk.org Tue Jul 4 12:01:17 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 4 Jul 2023 12:01:17 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: <2Pgw8cvXHt5MbQOSiD9C_pIIyE5peaxJDbwI_w-9XJY=.1b09b1e6-e4be-48ea-b054-9d2dac2dbf30@github.com> On Mon, 3 Jul 2023 09:20:19 GMT, Pengfei Li wrote: >> Or is there a clear reason why the two are too different? 
> > We need more investigation and discussions about this. Will discuss with you later. Sounds good. >> Write: >> Vector element size does not match of the store in the statement. > > Yes, we have tried supporting type conversions (between different type sizes) but current solution is not mature and not included in this patch. So this limitation is added here. Ok, fine. Leave that for the future. >> Does that not prevent `a[i+1]` from being accepted? > > That's a really corner case. In C2's ideal graph, most loop statements eventually uses the loop induction variable `phi` node as a input. That's good. But, there is one exception that a loop statement has a sub-expression of `iv + stride`. In this kind of cases, IGVN may do common sub-expression elimination and the inputs may come from the loop increment node thereafter. As the final step of vector masked transformation replaces the loop increment node, the calculation for `iv + stride` will also be replaced as well and it causes mis-compilation. In current patch, I duplicate the loop increment pattern for update (that's why we have `is_loop_incr_pattern()`, see commit 2) to avoid this issue, but currently it only applies to the expression not in array indices, such as `a[i] = i + 1`. For the patterns like `a[i+1] = i`, I'm still looking for a better approach to handle. Ok yes. You will have to find a way to separate out all these kinds of cases: Using incr on backedge Using incr as "i + stride" inside loop Using incr outside/after loop -> final value + 1 -> count iterations BTW: what happens if you use the `phi` directly after/outside the loop? Do you properly count the final value? 
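The three kinds of increment use listed above can be written down as small Java loops, for example as shapes for a regression test. This is only a sketch: the class and method names are made up, and which ideal-graph node (the `phi` or the loop increment) each use ends up reading depends on C2's transformations, so the source shape alone does not guarantee a particular graph. What it does fix is the expected result, which any vectorized version has to reproduce.

```java
// Illustrative loop shapes only. Names are hypothetical; the mapping from these
// sources to phi/increment uses in C2's ideal graph is not guaranteed.
final class IncrementUseSketch {

    // "iv + stride" used as a stored value: a[i] = i + 1
    static void valueUse(int[] a) {
        for (int i = 0; i < a.length; i++) {
            a[i] = i + 1;
        }
    }

    // "iv + stride" used in the array index: a[i + 1] = i
    static void indexUse(int[] a) {
        for (int i = 0; i < a.length - 1; i++) {
            a[i + 1] = i;
        }
    }

    // Induction variable used after the loop: its final value must equal the
    // number of iterations executed.
    static int afterLoopUse(int[] a) {
        int i = 0;
        for (; i < a.length; i++) {
            a[i] = 0;
        }
        return i;
    }

    public static void main(String[] args) {
        int[] a = new int[4];
        valueUse(a);
        System.out.println(java.util.Arrays.toString(a));
        int[] b = new int[4];
        indexUse(b);
        System.out.println(java.util.Arrays.toString(b));
        System.out.println(afterLoopUse(new int[4]));
    }
}
```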
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251930727 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251930451 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251926315 From epeter at openjdk.org Tue Jul 4 12:01:20 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 4 Jul 2023 12:01:20 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 09:33:18 GMT, Pengfei Li wrote: >> src/hotspot/share/opto/vmaskloop.cpp line 363: >> >>> 361: // Otherwise, use signed subword type or the statement's bottom type >>> 362: if (subword_stmt) { >>> 363: set_elem_bt(node, get_signed_subword_bt(stmt_bottom_type)); >> >> Why are you taking only the signed subword type, and not unsigned (eg for char you take short)? > > Current SuperWord also does in this way (see `SuperWord::container_type()`). A main reason is that some matching rules on some backends (like x86) only matches signed subword type. AFAICR, it's good to removing this for AArch64. Ok. This sounds like we should probably refactor the backend accordingly. That would simplify things for loop vectorizer / SuperWord. >> src/hotspot/share/opto/vmaskloop.cpp line 548: >> >>> 546: // Check supported memory access via SWPointer. It's not supported if >>> 547: // 1) The constructed SWPointer is invalid >>> 548: // 2) Address is growing down (index scale * loop stride < 0) >> >> Is that a limitation that could be removed in the future? > > Yes, at least on SVE2. For growing up memory accesses, we generate vector masks that indicate active lanes at lower parts of a vector. But it's opposite for growing down memory accesses where active lanes are at higher parts of a vector. Only SVE2 of AArch64 can generate vector masks in this way, current SVE(1) can not. I'm not sure whether x86 AVX-512 has the similar ability. There must surely be some way. 
The only question is what is the cheapest way to do it, ie with the fewest number of instructions. >> src/hotspot/share/opto/vmaskloop.cpp line 549: >> >>> 547: // 1) The constructed SWPointer is invalid >>> 548: // 2) Address is growing down (index scale * loop stride < 0) >>> 549: // 3) Memory access scale is different from data size >> >> I guess this could also be relaxed for strided accesses in the future? > > Exactly! I have tried supporting some basic strided accesses. The code is not included in this patch as it's not that beneficial on some CPUs and requires more C2 refactorings. Great, you should probably leave that to a future RFE anyway. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251931841 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251929419 PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251930025 From jsjolen at openjdk.org Tue Jul 4 12:03:00 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 4 Jul 2023 12:03:00 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect [v2] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 13:56:09 GMT, Thomas Stuefe wrote: >> Today, if we use UseTransparentHugePages, we assume that the static hugepage detection we do is valid for THPs: >> - that THPs use the page size (in hotspot used as "default large page size") found in /proc/memlimit Hugepagesize) >> - that THPs are enabled if that page size is >0. >> >> Both assumptions are incorrect. THPs are enabled depending on the mode in `/sys/kernel/mm/transparent_hugepage/enabled` (tri-state). And the pagesize used by khugepaged is the one set in `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size`. The latter can differ from the default large page size on the system (e.g. static hugepage default size could be 1g, whereas THP hugepage size is 2m). 
>> >> ------ >> >> About the patch: >> >> This is a limited, minimally invasive patch to fix THP detection. The patch aims to be easy to downport. There is more work to do, which I will do in subsequent RFEs. >> >> Functionally, for *static* (non-THP) pages nothing changes. THP-mode now correctly detects THP support in the OS, and uses the correct page size (see examples below). >> >> ------------- >> >> Example 1: System has THPs disabled, but static hugepages (1g, 2m) configured: >> >> >> thomas at starfish $ cat /sys/kernel/mm/transparent_hugepage/enabled >> always madvise [never] >> thomas at starfish $ cat /proc/meminfo | grep Hugepage >> Hugepagesize: 1048576 kB >> >> >> Without patch, we incorrectly assume THPs are enabled, and that THP page size is 1G (!), which we then proceed and use as heap page size, causing the heap size to be rounded up from 512m -> 1G. But - even though it is printed as "1G page backed" in log output - in reality it will still 4K-page-backed: the `madvise(2)` we use to set the THP page size will be ignored by the system, since THPs are disabled. >> >> >> thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version >> [0.001s][info][pagesize] Using the default large page size: 1G >> [0.001s][info][pagesize] Usable page sizes: 4k, 2M, 1G >> ... >> [0.016s][info][pagesize] Heap: min=1G max=1G base=0x00000000c0000000 size=1G page_size=1G >> >> >> With patch, we correctly refuse to use large pages (and we log more info): >> >> >> thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version >> [0.001s][info][pagesize] Static hugepage support: 2M, 1G (default) >> [0.001s][info][pagesize] default pagesize: 1G >> [0.001s][info][pagesize] Transparent hugepage (THP) suppor... 
> > Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: > > - Merge branch 'JDK-8310233-Linux-THP-initialization-incorrect' of github.com:tstuefe/jdk into JDK-8310233-Linux-THP-initialization-incorrect > - Add test case and modify log output slightly Some of the methods seems to leak file descriptors, other than that tests looks good. test/hotspot/jtreg/runtime/os/HugePageConfiguration.java line 92: > 90: Scanner scanner; > 91: try { > 92: scanner = new Scanner(new File("/proc/meminfo")); Nit: You can use try-with-resources here. test/hotspot/jtreg/runtime/os/HugePageConfiguration.java line 128: > 126: try { > 127: String file = "/sys/kernel/mm/transparent_hugepage/enabled"; > 128: BufferedReader reader = new BufferedReader(new FileReader(file)); This leaks the file descriptor (never closed), use try-with-resources. test/hotspot/jtreg/runtime/os/HugePageConfiguration.java line 149: > 147: try { > 148: String file = "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size"; > 149: BufferedReader reader = new BufferedReader(new FileReader(file)); This leaks the file descriptor (never closed), use try-with-resources. 
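A minimal sketch of the try-with-resources form suggested in these comments, assuming a helper that reads only the first line of a file. The class and method names are hypothetical and are not taken from `HugePageConfiguration.java`.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical helper for illustration only.
final class FirstLineReader {
    // Returns the first line of the file, or null if the file is empty.
    // The reader (and the underlying file descriptor) is closed when the
    // try block exits, whether normally or through an exception.
    static String readFirstLine(String file) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the path to read as the first argument.
        System.out.println(readFirstLine(args[0]));
    }
}
```

Closing the `BufferedReader` also closes the wrapped `FileReader`, so a single resource declaration is enough here.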
------------- PR Review: https://git.openjdk.org/jdk/pull/14739#pullrequestreview-1512720575 PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1251919080 PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1251924342 PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1251925985 From epeter at openjdk.org Tue Jul 4 12:05:18 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 4 Jul 2023 12:05:18 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 09:44:53 GMT, Pengfei Li wrote: >> src/hotspot/share/opto/vmaskloop.cpp line 357: >> >>> 355: set_elem_bt(node, mem_type); >>> 356: } else { >>> 357: trace_msg(node, "Subword operand does not have precise type"); >> >> Not clear to me what this means. > > Precise type info about signedness means that we know exactly whether the data is signed or unsigned. For some operations, such as right shift, results are different for signed and unsigned operands, so C2 has to know the signedness. However, in any Java arithmetic operation, operands of Java subword types are promoted to int first. Sometimes, for example, if an intermediate result is a binary operation of both signed and unsigned, we don't have the precise type info, so we don't know how to vectorize it. (see below example where the signedness info is lost after a short and a char are added) > > for (int i = 0; i < SIZE; i++) { > shorts[i] = (shorts[i] + chars[i]) >> 10; > } That is annoying. Do you think we can do something about this in the future, or is this just a fundamental restriction of Java / C2? 
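To illustrate why signedness matters for the right shift mentioned above, here is a small Java sketch. It assumes nothing beyond the Java language rules for promoting `short` and `char` operands to `int`; the class and method names are made up for illustration.

```java
// Hypothetical illustration class, not part of the patch under review.
final class SubwordShift {
    // A short operand is sign-extended to int before the shift.
    static int shiftShort(short s) {
        return s >> 10;
    }

    // A char operand is zero-extended to int before the shift.
    static int shiftChar(char c) {
        return c >> 10;
    }

    public static void main(String[] args) {
        short s = (short) 0x8000; // bit pattern 1000...0, value -32768
        char c = (char) 0x8000;   // same 16 bits, value 32768
        System.out.println(shiftShort(s)); // -32
        System.out.println(shiftChar(c));  // 32
    }
}
```

The same 16-bit pattern gives different results depending on whether it is treated as signed or unsigned, so a vectorizer working on 16-bit lanes has to know which interpretation applies. Once a `short` and a `char` have been added as `int` values, that per-lane information is no longer attached to the intermediate result.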
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1251934470 From jsjolen at openjdk.org Tue Jul 4 12:24:10 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 4 Jul 2023 12:24:10 GMT Subject: RFR: 8310170: Use sp's argument to improve performance of outputStream::indent and remove SP_USE_TABS [v2] In-Reply-To: References: Message-ID: <2S1OiClWEjbIrH09T0TYODUlJ3rgbTElAHRq7DTb0f8=.5e8313b7-1c01-43a7-b075-bf89a83ca919@github.com> > Hi, > > Please consider this small enhancement. `sp` takes a count argument, this was never used by `indent`, let's just use it. Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: Remove SP_USE_TABS ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14502/files - new: https://git.openjdk.org/jdk/pull/14502/files/d9ee4c2f..329e8324 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14502&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14502&range=00-01 Stats: 10 lines in 1 file changed: 0 ins; 9 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14502.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14502/head:pull/14502 PR: https://git.openjdk.org/jdk/pull/14502 From jsjolen at openjdk.org Tue Jul 4 12:24:11 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 4 Jul 2023 12:24:11 GMT Subject: RFR: 8310170: Use sp's argument to improve performance of outputStream::indent and remove SP_USE_TABS In-Reply-To: References: Message-ID: On Thu, 15 Jun 2023 20:50:11 GMT, Johan Sjölen wrote: > Hi, > > Please consider this small enhancement. `sp` takes a count argument, this was never used by `indent`, let's just use it. Hi, Sorry for the slow response from me on this. @dholmes-ora, let's remove SP_USE_TABS in this PR. I'll change the ticket description.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14502#issuecomment-1620144229 From stuefe at openjdk.org Tue Jul 4 12:46:00 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 4 Jul 2023 12:46:00 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect [v2] In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 11:47:54 GMT, Johan Sjölen wrote: >> Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: >> >> - Merge branch 'JDK-8310233-Linux-THP-initialization-incorrect' of github.com:tstuefe/jdk into JDK-8310233-Linux-THP-initialization-incorrect >> - Add test case and modify log output slightly > > test/hotspot/jtreg/runtime/os/HugePageConfiguration.java line 92: > >> 90: Scanner scanner; >> 91: try { >> 92: scanner = new Scanner(new File("/proc/meminfo")); > > Nit: You can use try-with-resources here. Makes sense ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1251981584 From rkennke at openjdk.org Tue Jul 4 12:51:02 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 4 Jul 2023 12:51:02 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v39] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16.
> > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Address @shipilev's comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/4ee4ca78..4ea61a18 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=38 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=37-38 Stats: 227 lines in 23 files changed: 113 ins; 84 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From stuefe at openjdk.org Tue Jul 4 12:55:03 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 4 Jul 2023 12:55:03 GMT Subject: RFR: 8310170: Use sp's argument to improve performance of outputStream::indent and remove SP_USE_TABS [v2] In-Reply-To: <2S1OiClWEjbIrH09T0TYODUlJ3rgbTElAHRq7DTb0f8=.5e8313b7-1c01-43a7-b075-bf89a83ca919@github.com> References: <2S1OiClWEjbIrH09T0TYODUlJ3rgbTElAHRq7DTb0f8=.5e8313b7-1c01-43a7-b075-bf89a83ca919@github.com> Message-ID: On Tue, 4 Jul 2023 12:24:10 GMT, Johan Sj?len wrote: >> Hi, >> >> Please consider this small enhancement. `sp` takes a count argument, this was never used by `indent`, let's just use it. > > Johan Sj?len has updated the pull request incrementally with one additional commit since the last revision: > > Remove SP_USE_TABS Good. src/hotspot/share/utilities/ostream.cpp line 190: > 188: > 189: void outputStream::sp(int count) { > 190: if (count < 0) return; Is this needed? ------------- Marked as reviewed by stuefe (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14502#pullrequestreview-1512819743 PR Review Comment: https://git.openjdk.org/jdk/pull/14502#discussion_r1251984841 From jsjolen at openjdk.org Tue Jul 4 12:59:00 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 4 Jul 2023 12:59:00 GMT Subject: RFR: 8310170: Use sp's argument to improve performance of outputStream::indent and remove SP_USE_TABS [v2] In-Reply-To: References: <2S1OiClWEjbIrH09T0TYODUlJ3rgbTElAHRq7DTb0f8=.5e8313b7-1c01-43a7-b075-bf89a83ca919@github.com> Message-ID: On Tue, 4 Jul 2023 12:46:19 GMT, Thomas Stuefe wrote: >> Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove SP_USE_TABS > > src/hotspot/share/utilities/ostream.cpp line 190: > >> 188: >> 189: void outputStream::sp(int count) { >> 190: if (count < 0) return; > > Is this needed? The `count < 0` check? Yes, as Shipilev mentioned: >Looks okay. I thought what would happen if _indentation < _position, but sp() seems to handle count < 0 fine. So it's nice that we don't have to care at the call site. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14502#discussion_r1251996343 From stuefe at openjdk.org Tue Jul 4 13:02:22 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 4 Jul 2023 13:02:22 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect [v3] In-Reply-To: References: Message-ID: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> > Today, if we use UseTransparentHugePages, we assume that the static hugepage detection we do is valid for THPs: > - that THPs use the page size (in hotspot used as "default large page size") found in /proc/memlimit Hugepagesize) > - that THPs are enabled if that page size is >0. > > Both assumptions are incorrect. THPs are enabled depending on the mode in `/sys/kernel/mm/transparent_hugepage/enabled` (tri-state).
And the pagesize used by khugepaged is the one set in `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size`. The latter can differ from the default large page size on the system (e.g. static hugepage default size could be 1g, whereas THP hugepage size is 2m). > > ------ > > About the patch: > > This is a limited, minimally invasive patch to fix THP detection. The patch aims to be easy to downport. There is more work to do, which I will do in subsequent RFEs. > > Functionally, for *static* (non-THP) pages nothing changes. THP-mode now correctly detects THP support in the OS, and uses the correct page size (see examples below). > > ------------- > > Example 1: System has THPs disabled, but static hugepages (1g, 2m) configured: > > > thomas at starfish $ cat /sys/kernel/mm/transparent_hugepage/enabled > always madvise [never] > thomas at starfish $ cat /proc/meminfo | grep Hugepage > Hugepagesize: 1048576 kB > > > Without patch, we incorrectly assume THPs are enabled, and that THP page size is 1G (!), which we then proceed and use as heap page size, causing the heap size to be rounded up from 512m -> 1G. But - even though it is printed as "1G page backed" in log output - in reality it will still 4K-page-backed: the `madvise(2)` we use to set the THP page size will be ignored by the system, since THPs are disabled. > > > thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version > [0.001s][info][pagesize] Using the default large page size: 1G > [0.001s][info][pagesize] Usable page sizes: 4k, 2M, 1G > ... 
> [0.016s][info][pagesize] Heap: min=1G max=1G base=0x00000000c0000000 size=1G page_size=1G > > > With patch, we correctly refuse to use large pages (and we log more info): > > > thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version > [0.001s][info][pagesize] Static hugepage support: 2M, 1G (default) > [0.001s][info][pagesize] default pagesize: 1G > [0.001s][info][pagesize] Transparent hugepage (THP) support: > [0.001s][info][pagesize] mode: never > [0.001s][warning][pagesize] UseLargePages ... Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: Feedback johan ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14739/files - new: https://git.openjdk.org/jdk/pull/14739/files/a65c346b..69c39c8a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14739&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14739&range=01-02 Stats: 24 lines in 1 file changed: 3 ins; 4 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/14739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14739/head:pull/14739 PR: https://git.openjdk.org/jdk/pull/14739 From stuefe at openjdk.org Tue Jul 4 13:02:23 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 4 Jul 2023 13:02:23 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect [v2] In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 11:59:58 GMT, Johan Sj?len wrote: >> Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: >> >> - Merge branch 'JDK-8310233-Linux-THP-initialization-incorrect' of github.com:tstuefe/jdk into JDK-8310233-Linux-THP-initialization-incorrect >> - Add test case and modify log output slightly > > Some of the methods seems to leak file descriptors, other than that tests looks good. Thanks @jdksjolen! I changed the test accordingly. 
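As a rough illustration of the tri-state mode file quoted in this thread (`always madvise [never]`), the following Java sketch extracts the active mode, which is the token in square brackets. The class and method names are hypothetical, and treating both `always` and `madvise` as "THP usable" is an assumption made for the example, not something taken from the patch or from the test.

```java
// Hypothetical illustration class; not the parser used in the PR.
final class ThpMode {
    // Returns the bracketed token of a line such as "always madvise [never]",
    // or null if the line contains no well-formed bracket pair.
    static String activeMode(String line) {
        int open = line.indexOf('[');
        if (open < 0) {
            return null;
        }
        int close = line.indexOf(']', open + 1);
        if (close < 0) {
            return null;
        }
        return line.substring(open + 1, close);
    }

    // Assumption for this sketch: any mode other than "never" allows THP use.
    static boolean thpUsable(String mode) {
        return "always".equals(mode) || "madvise".equals(mode);
    }

    public static void main(String[] args) {
        String mode = activeMode("always madvise [never]");
        System.out.println(mode + " usable=" + thpUsable(mode));
    }
}
```

With the sample line from Example 1 the active mode is `never`, which matches the "mode: never" line in the patched log output shown above.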
------------- PR Comment: https://git.openjdk.org/jdk/pull/14739#issuecomment-1620203966 From stuefe at openjdk.org Tue Jul 4 13:14:53 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 4 Jul 2023 13:14:53 GMT Subject: RFR: 8310170: Use sp's argument to improve performance of outputStream::indent and remove SP_USE_TABS [v2] In-Reply-To: References: <2S1OiClWEjbIrH09T0TYODUlJ3rgbTElAHRq7DTb0f8=.5e8313b7-1c01-43a7-b075-bf89a83ca919@github.com> Message-ID: <2sDQfVXeaA9xhnn-Qi9qlaUgdLe3PicPBebsPiuT4GY=.ee332c6a-4e03-4520-8b75-3d3a47dcbda5@github.com> On Tue, 4 Jul 2023 12:56:32 GMT, Johan Sjölen wrote: >> src/hotspot/share/utilities/ostream.cpp line 190: >> >>> 188: >>> 189: void outputStream::sp(int count) { >>> 190: if (count < 0) return; >> >> Is this needed? > > The `count < 0` check? Yes, as Shipilev mentioned: > >>Looks okay. I thought what would happen if _indentation < _position, but sp() seems to handle count < 0 fine. > > So it's nice that we don't have to care at the call site. Ah, okay. Missed that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14502#discussion_r1252014298 From jsjolen at openjdk.org Tue Jul 4 13:14:56 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 4 Jul 2023 13:14:56 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect [v3] In-Reply-To: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> References: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> Message-ID: On Tue, 4 Jul 2023 13:02:22 GMT, Thomas Stuefe wrote: >> Today, if we use UseTransparentHugePages, we assume that the static hugepage detection we do is valid for THPs: >> - that THPs use the page size (in hotspot used as "default large page size") found in /proc/memlimit Hugepagesize) >> - that THPs are enabled if that page size is >0. >> >> Both assumptions are incorrect.
THPs are enabled depending on the mode in `/sys/kernel/mm/transparent_hugepage/enabled` (tri-state). And the pagesize used by khugepaged is the one set in `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size`. The latter can differ from the default large page size on the system (e.g. static hugepage default size could be 1g, whereas THP hugepage size is 2m). >> >> ------ >> >> About the patch: >> >> This is a limited, minimally invasive patch to fix THP detection. The patch aims to be easy to downport. There is more work to do, which I will do in subsequent RFEs. >> >> Functionally, for *static* (non-THP) pages nothing changes. THP-mode now correctly detects THP support in the OS, and uses the correct page size (see examples below). >> >> ------------- >> >> Example 1: System has THPs disabled, but static hugepages (1g, 2m) configured: >> >> >> thomas at starfish $ cat /sys/kernel/mm/transparent_hugepage/enabled >> always madvise [never] >> thomas at starfish $ cat /proc/meminfo | grep Hugepage >> Hugepagesize: 1048576 kB >> >> >> Without patch, we incorrectly assume THPs are enabled, and that THP page size is 1G (!), which we then proceed and use as heap page size, causing the heap size to be rounded up from 512m -> 1G. But - even though it is printed as "1G page backed" in log output - in reality it will still 4K-page-backed: the `madvise(2)` we use to set the THP page size will be ignored by the system, since THPs are disabled. >> >> >> thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version >> [0.001s][info][pagesize] Using the default large page size: 1G >> [0.001s][info][pagesize] Usable page sizes: 4k, 2M, 1G >> ... 
>> [0.016s][info][pagesize] Heap: min=1G max=1G base=0x00000000c0000000 size=1G page_size=1G >> >> >> With patch, we correctly refuse to use large pages (and we log more info): >> >> >> thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version >> [0.001s][info][pagesize] Static hugepage support: 2M, 1G (default) >> [0.001s][info][pagesize] default pagesize: 1G >> [0.001s][info][pagesize] Transparent hugepage (THP) suppor... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Feedback johan Thank you Thomas! These changes look good to me now, I'm approving this. ------------- Marked as reviewed by jsjolen (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14739#pullrequestreview-1512865635 From stuefe at openjdk.org Tue Jul 4 13:28:53 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 4 Jul 2023 13:28:53 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v7] In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 05:31:09 GMT, Thomas Stuefe wrote: >> I recently spent too much time trying to understand the interleaving of narrow Klass decoding with CDS. Thanks to @iklam for clarifying some details. This patch aims to make these interleavings easier to understand and more correct. >> >> ---- >> >> CDS narrow Klass handling plays a role for archived heap objects. >> >> When dumping heap objects, we must recompute their narrow Klass ids since the relative positions of archived Klass instances change compared to their live counterparts in the dump time JVM. >> We recompute those narrow Klass ids using the following encoding scheme: >> - base = future assumed mapping start address >> - shift = dump time (!) 
JVMs encoding shift (A) >> see `ArchiveHeapWriter::update_header_for_requested_obj` https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/cds/archiveHeapWriter.cpp#L419-L425 >> >> At CDS runtime, we load the CDS archive, then place the class space behind it. We then initialize narrow Klass encoding for the resulting combined Klass range such that: >> - encoding base is the range start address (mapping base) >> - encoding shift is always zero >> see `CompressedKlassPointers::initialize` : https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/oops/compressedOops.cpp#L195-L231 >> >> The lengthy comment there is misleading and partly wrong (regrettable since it was mine :) >> >> At dump time, when initializing the JVM, we also set the encoding base to klass range start and shift to zero (also `CompressedKlassPointers::initialize`). That is the shift we later use for (A); hence, that shift is zero. >> >> ------------------- >> >> There are some small things wrong with the current code. That wrongness does not lead to errors but makes understanding the code difficult. >> >> Mainly, the *dump time* VM's narrow Klass encoding scheme is irrelevant for the encoding employed on the future runtime archive since we recalculate the narrow Klass ids for archived heap objects. That means: >> >> In `CompressedKlassPointers::initialize` there is no need to fix the encoding base and shift for the *dump time* JVM. Dump time JVM can use whatever base+shift it pleases; it can use the same code path as when CDS is off (e.g. use zero-based encoding). >> >> In `ArchiveHeapWriter::update_header_for_requested_obj`, we should not use the dump time JVM shift for pre-computing the narrow Klass ids. Instead, we should use the *minimal shift needed for the maximal possi... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains ten commits: > > - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding > - revert accidental change > - Fix Windows build > - Add alternative for !INCLUDE_CDS_JAVA_HEAP path > - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding > - fix comment > - Merge > - -remove narrow_klass_xxx from FileMap > - remove ArchiveHeapWriter::precomputed_narrow_klass_base_delta and replaced it with clear comments > - changed runtime fail condition to asserts in FileMapInfo::can_use_heap_region() > - fix-cleanup-CDS-nKlass-encoding x86 test error unrelated. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14688#issuecomment-1620251142 From dnsimon at openjdk.org Tue Jul 4 22:39:05 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 4 Jul 2023 22:39:05 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v6] In-Reply-To: References: Message-ID: > In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. 
> > This PR adds support for `-Xlog:class+load+cause` and `-Xlog:class+load+cause+native` that produce output according to a new `LogClassLoadingCauseFor` VM flag: > > > product(ccstr, LogClassLoadingCauseFor, nullptr, \ > "Apply -Xlog:class+load+cause* to classes whose fully " \ > "qualified name contains this string ("*" matches " \ > "any class).") \ > > > Example usage: > > java "-Xlog:class+load+cause*" -XX:LogClassLoadingCauseFor=java.util.concurrent.ConcurrentHashMap$V --version > [0.075s][info][class,load,cause] Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView: > [0.075s][info][class,load,cause] at java.util.concurrent.ConcurrentHashMap.values(java.base/ConcurrentHashMap.java:1263) > [0.075s][info][class,load,cause] at jdk.internal.loader.NativeLibraries.find(java.base/NativeLibraries.java:102) > [0.075s][info][class,load,cause] at java.lang.ClassLoader.findNative(java.base/ClassLoader.java:2457) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.init(java.base/Native Method) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.(java.base/UnixNativeDispatcher.java:817) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystem.(java.base/UnixFileSystem.java:96) > [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystem.(java.base/BsdFileSystem.java:50) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystem.(java.base/MacOSXFileSystem.java:52) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:44) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:37) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystemProvider.(java.base/UnixFileSystemProvider.java:78) > [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystemProvider.(java.b... 
Doug Simon has updated the pull request incrementally with two additional commits since the last revision: - use OS specific native stack printing in class load cause native stack logging - add tests for class load cause logging ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14553/files - new: https://git.openjdk.org/jdk/pull/14553/files/24f8539a..c39447a7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14553&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14553&range=04-05 Stats: 26 lines in 2 files changed: 23 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14553.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14553/head:pull/14553 PR: https://git.openjdk.org/jdk/pull/14553 From dnsimon at openjdk.org Tue Jul 4 23:23:07 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 4 Jul 2023 23:23:07 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7] In-Reply-To: References: Message-ID: > In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. 
> > This PR adds support for `-Xlog:class+load+cause` and `-Xlog:class+load+cause+native` that produce output according to a new `LogClassLoadingCauseFor` VM flag: > > > product(ccstr, LogClassLoadingCauseFor, nullptr, \ > "Apply -Xlog:class+load+cause* to classes whose fully " \ > "qualified name contains this string ("*" matches " \ > "any class).") \ > > > Example usage: > > java "-Xlog:class+load+cause*" -XX:LogClassLoadingCauseFor=java.util.concurrent.ConcurrentHashMap$V --version > [0.075s][info][class,load,cause] Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView: > [0.075s][info][class,load,cause] at java.util.concurrent.ConcurrentHashMap.values(java.base/ConcurrentHashMap.java:1263) > [0.075s][info][class,load,cause] at jdk.internal.loader.NativeLibraries.find(java.base/NativeLibraries.java:102) > [0.075s][info][class,load,cause] at java.lang.ClassLoader.findNative(java.base/ClassLoader.java:2457) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.init(java.base/Native Method) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.(java.base/UnixNativeDispatcher.java:817) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystem.(java.base/UnixFileSystem.java:96) > [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystem.(java.base/BsdFileSystem.java:50) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystem.(java.base/MacOSXFileSystem.java:52) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:44) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:37) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystemProvider.(java.base/UnixFileSystemProvider.java:78) > [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystemProvider.(java.b... 
Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: use OS specific native stack printing in class load cause native stack logging ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14553/files - new: https://git.openjdk.org/jdk/pull/14553/files/c39447a7..e6ffca32 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14553&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14553&range=05-06 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14553.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14553/head:pull/14553 PR: https://git.openjdk.org/jdk/pull/14553 From dholmes at openjdk.org Wed Jul 5 01:57:08 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 5 Jul 2023 01:57:08 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v4] In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 07:06:17 GMT, Doug Simon wrote: >>> The test uses -Xlog:all=trace:file=all.log so will indiscriminately log class+load+cause+native for every class loaded and the test seems to load a lot of classes. >> >> Are you treating not setting `LogClassLoadingCauseFor` as being a wildcard for "all"? That's not what you had earlier and seems wrong - this needs to be opt-in. >> >>> The test crashes the VM as VMError::print_native_stack is apparently not safe to call in all class loading contexts. >> >> That is a distinct bug in itself. Probably print_native_stack needs additional safety/state checks. I will file a bug for that. >> >> Edit: filed [JDK-8311255](https://bugs.openjdk.org/browse/JDK-8311255) > >> Are you treating not setting LogClassLoadingCauseFor as being a wildcard for "all"? That's not what you had earlier and seems wrong - this needs to be opt-in. 
> > Done: 24f8539a780aa6f6284c0fd7d4eee34d416cb544 > @dougxc Please do not rebase or force-push to an active PR as it invalidates existing review comments. Yes @dougxc please do not do this. There is no need, skara will flatten the final commit. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14553#issuecomment-1620906870 From dholmes at openjdk.org Wed Jul 5 02:15:03 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 5 Jul 2023 02:15:03 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7] In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 23:23:07 GMT, Doug Simon wrote: >> In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. 
>> >> This PR adds support for `-Xlog:class+load+cause` and `-Xlog:class+load+cause+native` that produce output according to a new `LogClassLoadingCauseFor` VM flag: >> >> >> product(ccstr, LogClassLoadingCauseFor, nullptr, \ >> "Apply -Xlog:class+load+cause* to classes whose fully " \ >> "qualified name contains this string ("*" matches " \ >> "any class).") \ >> >> >> Example usage: >> >> java "-Xlog:class+load+cause*" -XX:LogClassLoadingCauseFor=java.util.concurrent.ConcurrentHashMap$V --version >> [0.075s][info][class,load,cause] Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView: >> [0.075s][info][class,load,cause] at java.util.concurrent.ConcurrentHashMap.values(java.base/ConcurrentHashMap.java:1263) >> [0.075s][info][class,load,cause] at jdk.internal.loader.NativeLibraries.find(java.base/NativeLibraries.java:102) >> [0.075s][info][class,load,cause] at java.lang.ClassLoader.findNative(java.base/ClassLoader.java:2457) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.init(java.base/Native Method) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.(java.base/UnixNativeDispatcher.java:817) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystem.(java.base/UnixFileSystem.java:96) >> [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystem.(java.base/BsdFileSystem.java:50) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystem.(java.base/MacOSXFileSystem.java:52) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:44) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:37) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystemProvider.(java.base/UnixFileSystemProvider.java:78) >> [0.075s][info][class,load,cau... > > Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. 
The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > use OS specific native stack printing in class load cause native stack logging Changes requested by dholmes (Reviewer). src/hotspot/share/oops/instanceKlass.cpp line 3867: > 3865: // We have printed the native stack in platform-specific code, > 3866: // so nothing else to do in this case. > 3867: } else { Why did you do this? Did you observe problems on Windows? This is too low-level for this code. If we need to do this then it needs to be pushed down to a lower-level and encapsulated more cleanly. We do not do this in `JavaThread::print_jni_stack()`. I need to revoke my approval while this is resolved. ------------- PR Review: https://git.openjdk.org/jdk/pull/14553#pullrequestreview-1513559020 PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1252466414 From dholmes at openjdk.org Wed Jul 5 02:26:59 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 5 Jul 2023 02:26:59 GMT Subject: RFR: 8310170: Use sp's argument to improve performance of outputStream::indent and remove SP_USE_TABS [v2] In-Reply-To: <2S1OiClWEjbIrH09T0TYODUlJ3rgbTElAHRq7DTb0f8=.5e8313b7-1c01-43a7-b075-bf89a83ca919@github.com> References: <2S1OiClWEjbIrH09T0TYODUlJ3rgbTElAHRq7DTb0f8=.5e8313b7-1c01-43a7-b075-bf89a83ca919@github.com> Message-ID: On Tue, 4 Jul 2023 12:24:10 GMT, Johan Sjölen wrote: >> Hi, >> >> Please consider this small enhancement. `sp` takes a count argument, this was never used by `indent`, let's just use it. > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Remove SP_USE_TABS Looks good. As an aside I was disappointed to discover that the indentation facility is far less useful than it might be. I had hoped you could set the indentation and then have all following log output indented, until you reset it.
But indentation only happens when you explicitly call `indent()` so you can't call e.g. `val->print_on(stream)` and have the caller control indentation (without `print_on` directly supporting it). ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14502#pullrequestreview-1513566060 From dholmes at openjdk.org Wed Jul 5 02:57:04 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 5 Jul 2023 02:57:04 GMT Subject: RFR: JDK-8311285: report some fontconfig related environment variables in hs_err file In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 11:47:49 GMT, Matthias Baesken wrote: > There are a number of important environment variables influencing how fontconfig works. > See for example > https://man.archlinux.org/man/fonts-conf.5 > Some of them should be added to the list of reported environment variables in hs_err file because e.g. a bad setting for some of them can even lead sometimes to crashes. src/hotspot/share/utilities/vmError.cpp line 109: > 107: "PATH", "USERNAME", > 108: > 109: // UNIX platforms, fontconfig related The XDG values are not specifically fontconfig related. src/hotspot/share/utilities/vmError.cpp line 110: > 108: > 109: // UNIX platforms, fontconfig related > 110: "XDG_CACHE_HOME", "XDG_CONFIG_HOME", "FC_LANG", This seems a very limited selection - why only these three? (And I find it hard to imagine how these particular ones might lead to crashes - FONTCONFIG_USE_MMAP seems much more interesting in that regard). 
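As a rough illustration of the mechanism under review (reporting a fixed list of environment variables in a crash report), here is a small C++ sketch. It is not the `vmError.cpp` code; the function name `collect_env_report` is hypothetical, and the sketch simply skips variables that are not set.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical sketch, not the HotSpot implementation: build "NAME=value"
// lines for every variable in `names` that is set in the current process
// environment. Unset variables are skipped.
std::vector<std::string> collect_env_report(const std::vector<std::string>& names) {
  std::vector<std::string> lines;
  for (const std::string& name : names) {
    const char* value = std::getenv(name.c_str());
    if (value != nullptr) {
      lines.push_back(name + "=" + value);
    }
  }
  return lines;
}
```

A caller could pass the names mentioned in this thread, for example `collect_env_report({"XDG_CACHE_HOME", "XDG_CONFIG_HOME", "FC_LANG", "FONTCONFIG_USE_MMAP"})`, and print each returned line.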
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14767#discussion_r1252483039 PR Review Comment: https://git.openjdk.org/jdk/pull/14767#discussion_r1252483744 From iklam at openjdk.org Wed Jul 5 05:37:17 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 5 Jul 2023 05:37:17 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v7] In-Reply-To: References: Message-ID: <2LjuTaIynj4_ApDR9KszWkYHHYPDtmlzKwdNHQcSGYA=.43bac8d2-db61-43ad-b3df-1d26e8762960@github.com> On Tue, 4 Jul 2023 05:31:09 GMT, Thomas Stuefe wrote: >> I recently spent too much time trying to understand the interleaving of narrow Klass decoding with CDS. Thanks to @iklam for clarifying some details. This patch aims to make the code easier to understand and more correct. >> >> ---- >> >> CDS narrow Klass handling plays a role for archived heap objects. >> >> When dumping heap objects, we must recompute their narrow Klass ids since the relative positions of archived Klass instances change compared to their live counterparts in the dump time JVM. >> We recompute those narrow Klass ids using the following encoding scheme: >> - base = future assumed mapping start address >> - shift = dump time (!) JVMs encoding shift (A) >> see `ArchiveHeapWriter::update_header_for_requested_obj` https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/cds/archiveHeapWriter.cpp#L419-L425 >> >> At CDS runtime, we load the CDS archive, then place the class space behind it. 
We then initialize narrow Klass encoding for the resulting combined Klass range such that: >> - encoding base is the range start address (mapping base) >> - encoding shift is always zero >> see `CompressedKlassPointers::initialize` : https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/oops/compressedOops.cpp#L195-L231 >> >> The lengthy comment there is misleading and partly wrong (regrettable since it was mine :) >> >> At dump time, when initializing the JVM, we also set the encoding base to klass range start and shift to zero (also `CompressedKlassPointers::initialize`). That is the shift we later use for (A); hence, that shift is zero. >> >> ------------------- >> >> There are minor things wrong with the current code. Mainly, the *dump time* VM's narrow Klass encoding scheme is irrelevant for the encoding employed on the future runtime archive since we recalculate the narrow Klass ids for archived heap objects. That means: >> >> - In `CompressedKlassPointers::initialize`, there is no need to fix the encoding base and shift for the *dump time* JVM. Dump time JVM can use whatever base+shift it pleases; it can use the same code path as when CDS is off (e.g. use zero-based encoding). >> >> - In `ArchiveHeapWriter::update_header_for_requested_obj`, we should not use the dump time JVM shift for pre-computing the narrow Klass ids. Instead, we should use the *minimal shift needed for the maximal possible future Klass range size*. The comment in ArchiveHeapWriter explains this in greater detail. Tha... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains ten commits: > > - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding > - revert accidental change > - Fix Windows build > - Add alternative for !INCLUDE_CDS_JAVA_HEAP path > - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding > - fix comment > - Merge > - -remove narrow_klass_xxx from FileMap > - remove ArchiveHeapWriter::precomputed_narrow_klass_base_delta and replaced it with clear comments > - changed runtime fail condition to asserts in FileMapInfo::can_use_heap_region() > - fix-cleanup-CDS-nKlass-encoding Looks reasonable to me. This is a nice clean up. Just a few nits. src/hotspot/share/cds/archiveHeapWriter.hpp line 205: > 203: // Since nKlass itself is 32 bit, our encoding range len is 4G, and since we set the base directly > 204: // at mapping start, these 4G are enough. Therefore, we don't need to shift at all (shift=0). > 205: static constexpr int precomputed_narrow_klass_shift = 0; "future" and "runtime" seem to mean the same thing. Maybe we should consistently use "runtime"? src/hotspot/share/oops/compressedOops.cpp line 192: > 190: // set this encoding scheme. Used by CDS at runtime to re-instate the scheme used to pre-compute klass ids for > 191: // archived heap objects. > 192: void CompressedKlassPointers::initialize_for_given_encoding(address addr, size_t len, address requested_base, int requested_shift) { This entire function should be inside _LP64. It should not be referenced by a 32-bit build. ------------- Marked as reviewed by iklam (Reviewer). 
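The arithmetic behind the `shift = 0` comment quoted above can be sketched in C++ as follows. This is an illustration under stated assumptions, not HotSpot code: the type `NarrowKlassEncoding` and the function `minimal_shift` are hypothetical names, and the sketch assumes the usual scheme in which a narrow Klass id decodes to `base + (id << shift)`.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only; these names are hypothetical, not HotSpot's API.
struct NarrowKlassEncoding {
  uint64_t base;   // start address of the Klass range (mapping base)
  int      shift;  // left shift applied when decoding

  // Encode an address inside the Klass range as a 32-bit narrow id.
  uint32_t encode(uint64_t klass_addr) const {
    assert(klass_addr >= base);
    const uint64_t offset = klass_addr - base;
    assert(((offset >> shift) << shift) == offset);  // aligned to 1 << shift
    assert((offset >> shift) <= UINT32_MAX);         // fits into 32 bits
    return static_cast<uint32_t>(offset >> shift);
  }

  // Decode a narrow id back into a full address.
  uint64_t decode(uint32_t narrow_id) const {
    return base + (static_cast<uint64_t>(narrow_id) << shift);
  }
};

// Smallest shift that lets a 32-bit id cover a range of `range_len` bytes
// starting at the base. Precondition: range_len > 0.
constexpr int minimal_shift(uint64_t range_len) {
  int shift = 0;
  while (((range_len - 1) >> shift) > UINT32_MAX) {
    ++shift;
  }
  return shift;
}

// A 4G range placed directly at the base needs no shift at all.
static_assert(minimal_shift(uint64_t{4} << 30) == 0, "4G fits with shift 0");
```

For a range larger than 4G the same function returns a non-zero shift, which is the "minimal shift needed for the maximal possible Klass range size" idea from the patch description.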
PR Review: https://git.openjdk.org/jdk/pull/14688#pullrequestreview-1513680142 PR Review Comment: https://git.openjdk.org/jdk/pull/14688#discussion_r1252552155 PR Review Comment: https://git.openjdk.org/jdk/pull/14688#discussion_r1252557753 From stuefe at openjdk.org Wed Jul 5 05:51:56 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 05:51:56 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v7] In-Reply-To: <2LjuTaIynj4_ApDR9KszWkYHHYPDtmlzKwdNHQcSGYA=.43bac8d2-db61-43ad-b3df-1d26e8762960@github.com> References: <2LjuTaIynj4_ApDR9KszWkYHHYPDtmlzKwdNHQcSGYA=.43bac8d2-db61-43ad-b3df-1d26e8762960@github.com> Message-ID: On Wed, 5 Jul 2023 05:32:43 GMT, Ioi Lam wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: >> >> - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding >> - revert accidental change >> - Fix Windows build >> - Add alternative for !INCLUDE_CDS_JAVA_HEAP path >> - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding >> - fix comment >> - Merge >> - -remove narrow_klass_xxx from FileMap >> - remove ArchiveHeapWriter::precomputed_narrow_klass_base_delta and replaced it with clear comments >> - changed runtime fail condition to asserts in FileMapInfo::can_use_heap_region() >> - fix-cleanup-CDS-nKlass-encoding > > src/hotspot/share/oops/compressedOops.cpp line 192: > >> 190: // set this encoding scheme. Used by CDS at runtime to re-instate the scheme used to pre-compute klass ids for >> 191: // archived heap objects. >> 192: void CompressedKlassPointers::initialize_for_given_encoding(address addr, size_t len, address requested_base, int requested_shift) { > > This entire function should be inside _LP64. It should not be referenced by a 32-bit build. I'll fix this. 32-bit would be worth a whole other cleanup. Unfortunately, a lot of that coding (e.g. 
the `decode_xxx` functions) must still be there for 32-bit since they are only guarded by constant UseCompressedClassPointers, not by ifdefs. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14688#discussion_r1252567746 From stuefe at openjdk.org Wed Jul 5 06:36:03 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 06:36:03 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v8] In-Reply-To: References: Message-ID: <_ngJ160BB12VAsvGB03wxoYvf_Pdq_veCXMkxFlCfz0=.3c0f2c57-1b03-4d78-9021-4ea612fc2b68@github.com> > I recently spent too much time trying to understand the interleaving of narrow Klass decoding with CDS. Thanks to @iklam for clarifying some details. This patch aims to make the code easier to understand and more correct. > > (*2023-07-05 Updated to fit patch description to the agreed final form*) > > ---- > > CDS narrow Klass handling plays a role for archived heap objects. > > When dumping heap objects, we must recompute their narrow Klass ids since the relative positions of archived Klass instances change compared to their live counterparts in the dump time JVM. > We recompute those narrow Klass ids using the following encoding scheme: > - base = future assumed mapping start address > - shift = dump time (!) JVMs encoding shift (A) > see `ArchiveHeapWriter::update_header_for_requested_obj` https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/cds/archiveHeapWriter.cpp#L419-L425 > > At CDS runtime, we load the CDS archive, then place the class space behind it. 
We then initialize narrow Klass encoding for the resulting combined Klass range such that: > - encoding base is the range start address (mapping base) > - encoding shift is always zero > see `CompressedKlassPointers::initialize` : https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/oops/compressedOops.cpp#L195-L231 > > The lengthy comment there is misleading and partly wrong (regrettable since it was mine :) > > At dump time, when initializing the JVM, we also set the encoding base to klass range start and shift to zero (also `CompressedKlassPointers::initialize`). That is the shift we later use for (A); hence, that shift is zero. > > ------------------- > > There are minor things wrong with the current code. Mainly, the *dump time* VM's narrow Klass encoding scheme is irrelevant for the encoding employed on the future runtime archive since we recalculate the narrow Klass ids for archived heap objects. That means: > > - In `CompressedKlassPointers::initialize`, there is no need to fix the encoding base and shift for the *dump time* JVM. Dump time JVM can use whatever base+shift it pleases; it can use the same code path as when CDS is off (e.g. use zero-based encoding). > > - In `ArchiveHeapWriter::update_header_for_requested_obj`, we should not use the dump time JVM shift for pre-computing the narrow Klass ids. Instead, we should use the *minimal shift needed for the maximal possible future Klass range size*. The comment in ArchiveHeapWriter explains this in g... Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 12 commits: - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding - only build initializer functions for 64-bit - Consistently use runtime instead of future - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding - revert accidental change - Fix Windows build - Add alternative for !INCLUDE_CDS_JAVA_HEAP path - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding - fix comment - Merge - ... and 2 more: https://git.openjdk.org/jdk/compare/d6578bff...0ea2fa10 ------------- Changes: https://git.openjdk.org/jdk/pull/14688/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14688&range=07 Stats: 156 lines in 8 files changed: 67 ins; 57 del; 32 mod Patch: https://git.openjdk.org/jdk/pull/14688.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14688/head:pull/14688 PR: https://git.openjdk.org/jdk/pull/14688 From stuefe at openjdk.org Wed Jul 5 06:36:07 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 06:36:07 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v7] In-Reply-To: <2LjuTaIynj4_ApDR9KszWkYHHYPDtmlzKwdNHQcSGYA=.43bac8d2-db61-43ad-b3df-1d26e8762960@github.com> References: <2LjuTaIynj4_ApDR9KszWkYHHYPDtmlzKwdNHQcSGYA=.43bac8d2-db61-43ad-b3df-1d26e8762960@github.com> Message-ID: On Wed, 5 Jul 2023 05:23:43 GMT, Ioi Lam wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains ten commits: >> >> - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding >> - revert accidental change >> - Fix Windows build >> - Add alternative for !INCLUDE_CDS_JAVA_HEAP path >> - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding >> - fix comment >> - Merge >> - -remove narrow_klass_xxx from FileMap >> - remove ArchiveHeapWriter::precomputed_narrow_klass_base_delta and replaced it with clear comments >> - changed runtime fail condition to asserts in FileMapInfo::can_use_heap_region() >> - fix-cleanup-CDS-nKlass-encoding > > src/hotspot/share/cds/archiveHeapWriter.hpp line 205: > >> 203: // Since nKlass itself is 32 bit, our encoding range len is 4G, and since we set the base directly >> 204: // at mapping start, these 4G are enough. Therefore, we don't need to shift at all (shift=0). >> 205: static constexpr int precomputed_narrow_klass_shift = 0; > > "future" and "runtime" seem to mean the same thing. Maybe we should consistently use "runtime"? Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14688#discussion_r1252601039 From dholmes at openjdk.org Wed Jul 5 06:50:59 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 5 Jul 2023 06:50:59 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 02:11:40 GMT, David Holmes wrote: >> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> use OS specific native stack printing in class load cause native stack logging > > src/hotspot/share/oops/instanceKlass.cpp line 3867: > >> 3865: // We have printed the native stack in platform-specific code, >> 3866: // so nothing else to do in this case. >> 3867: } else { > > Why did you do this? 
Did you observe problems on Windows? This is too low-level for this code. If we need to do this then it needs to be pushed down to a lower-level and encapsulated more cleanly. We do not do this in `JavaThread::print_jni_stack()`. > > I need to revoke my approval while this is resolved. I'm also investigating the details of the stack printing code - use of `os::platform_print_native_stack` is very inconsistent. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1252619122 From rehn at openjdk.org Wed Jul 5 06:57:00 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 5 Jul 2023 06:57:00 GMT Subject: [jdk21] RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. In-Reply-To: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> References: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> Message-ID: On Sun, 2 Jul 2023 18:35:30 GMT, Robbin Ehn wrote: > Hi all, > > This pull request contains a backport of commit [faf1b822](https://github.com/openjdk/jdk/commit/faf1b822d03b726413d77a2b247dfbbf4db7d57e) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Robbin Ehn on 2 Jul 2023 and was reviewed by Ludovic Henry, Thomas Stuefe and Fei Yang. > > Thanks! It has a dependency on an enhancement in 22; I'll fix that.
(rv64 does not build) ------------- PR Comment: https://git.openjdk.org/jdk21/pull/90#issuecomment-1621136080 From stuefe at openjdk.org Wed Jul 5 07:02:54 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 07:02:54 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v7] In-Reply-To: <2LjuTaIynj4_ApDR9KszWkYHHYPDtmlzKwdNHQcSGYA=.43bac8d2-db61-43ad-b3df-1d26e8762960@github.com> References: <2LjuTaIynj4_ApDR9KszWkYHHYPDtmlzKwdNHQcSGYA=.43bac8d2-db61-43ad-b3df-1d26e8762960@github.com> Message-ID: <9zOhcTSUGxbRhzoN26ciiSrm_8MF8GgXBUTdTzoX9Cc=.c2263a60-1f88-4eaa-86db-061cb848dec6@github.com> On Wed, 5 Jul 2023 05:33:40 GMT, Ioi Lam wrote: > Looks reasonable to me. This is a nice clean up. Just a few nits. Thanks @iklam! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14688#issuecomment-1621146684 From mbaesken at openjdk.org Wed Jul 5 07:06:01 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 5 Jul 2023 07:06:01 GMT Subject: RFR: JDK-8311285: report some fontconfig related environment variables in hs_err file In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 11:47:49 GMT, Matthias Baesken wrote: > There are a number of important environment variables influencing how fontconfig works. > See for example > https://man.archlinux.org/man/fonts-conf.5 > Some of them should be added to the list of reported environment variables in hs_err file because e.g. a bad setting for some of them can even lead sometimes to crashes. Hi David, FONTCONFIG_USE_MMAP sounds interesting, I saw this too in the list of environment variables. Should I add this one? > And I find it hard to imagine how these particular ones might lead to crashes A shared cache home directory (in a HOME shared between different hosts) was causing a lot of trouble and leading to crashes rather often in our AIX tests. So this one is interesting for sure.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14767#issuecomment-1621152185 From stuefe at openjdk.org Wed Jul 5 09:09:05 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 09:09:05 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v8] In-Reply-To: <_ngJ160BB12VAsvGB03wxoYvf_Pdq_veCXMkxFlCfz0=.3c0f2c57-1b03-4d78-9021-4ea612fc2b68@github.com> References: <_ngJ160BB12VAsvGB03wxoYvf_Pdq_veCXMkxFlCfz0=.3c0f2c57-1b03-4d78-9021-4ea612fc2b68@github.com> Message-ID: On Wed, 5 Jul 2023 06:36:03 GMT, Thomas Stuefe wrote: >> I recently spent too much time trying to understand the interleaving of narrow Klass decoding with CDS. Thanks to @iklam for clarifying some details. This patch aims to make the code easier to understand and more correct. >> >> (*2023-07-05 Updated to fit patch description to the agreed final form*) >> >> ---- >> >> CDS narrow Klass handling plays a role for archived heap objects. >> >> When dumping heap objects, we must recompute their narrow Klass ids since the relative positions of archived Klass instances change compared to their live counterparts in the dump time JVM. >> We recompute those narrow Klass ids using the following encoding scheme: >> - base = future assumed mapping start address >> - shift = dump time (!) JVMs encoding shift (A) >> see `ArchiveHeapWriter::update_header_for_requested_obj` https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/cds/archiveHeapWriter.cpp#L419-L425 >> >> At CDS runtime, we load the CDS archive, then place the class space behind it. 
We then initialize narrow Klass encoding for the resulting combined Klass range such that: >> - encoding base is the range start address (mapping base) >> - encoding shift is always zero >> see `CompressedKlassPointers::initialize` : https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/oops/compressedOops.cpp#L195-L231 >> >> The lengthy comment there is misleading and partly wrong (regrettable since it was mine :) >> >> At dump time, when initializing the JVM, we also set the encoding base to klass range start and shift to zero (also `CompressedKlassPointers::initialize`). That is the shift we later use for (A); hence, that shift is zero. >> >> ------------------- >> >> There are minor things wrong with the current code. Mainly, the *dump time* VM's narrow Klass encoding scheme is irrelevant for the encoding employed on the future runtime archive since we recalculate the narrow Klass ids for archived heap objects. That means: >> >> - In `CompressedKlassPointers::initialize`, there is no need to fix the encoding base and shift for the *dump time* JVM. Dump time JVM can use whatever base+shift it pleases; it can use the same code path as when CDS is off (e.g. use zero-based encoding). >> >> - In `ArchiveHeapWriter::update_header_for_requested_obj`, we should not use the dump time JVM shift for pre-computing the narrow Klass ids. Instead, we should use the *minimal shift needed for the maximal possible future Klass rang... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 12 commits: > > - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding > - only build initializer functions for 64-bit > - Consistently use runtime instead of future > - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding > - revert accidental change > - Fix Windows build > - Add alternative for !INCLUDE_CDS_JAVA_HEAP path > - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding > - fix comment > - Merge > - ... and 2 more: https://git.openjdk.org/jdk/compare/d6578bff...0ea2fa10 Could I have a second review, please? @ashu-mehra, maybe? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14688#issuecomment-1621339547 From mdoerr at openjdk.org Wed Jul 5 09:32:57 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 5 Jul 2023 09:32:57 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v2] In-Reply-To: References: Message-ID: On Thu, 29 Jun 2023 15:44:27 GMT, Matias Saavedra Silva wrote: >> 8301996: Move field resolution information out of the cpCache > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Coleen and Amit comments src/hotspot/cpu/aarch64/templateTable_aarch64.cpp line 2538: > 2536: load_resolved_field_entry(obj, cache, index, off, raw_flags, is_static); > 2537: // Index holds the TOS > 2538: __ mov(flags, index); This is very confusing. You call the same thing "TOS", "type" and "index". Please use consistent naming and make it more comprehensible. In addition, why do you need the extra move?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1252826981 From dnsimon at openjdk.org Wed Jul 5 10:02:58 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 5 Jul 2023 10:02:58 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 06:48:25 GMT, David Holmes wrote: >> src/hotspot/share/oops/instanceKlass.cpp line 3867: >> >>> 3865: // We have printed the native stack in platform-specific code, >>> 3866: // so nothing else to do in this case. >>> 3867: } else { >> >> Why did you do this? Did you observe problems on Windows? This is too low-level for this code. If we need to do this then it needs to be pushed down to a lower-level and encapsulated more cleanly. We do not do this in `JavaThread::print_jni_stack()`. >> >> I need to revoke my approval while this is resolved. > > I'm also investigating the details of the stack printing code - use of `os::platform_print_native_stack` is very inconsistent. I was under the impression that the only way to get native stack printing on Windows is with `os::platform_print_native_stack`. I'm trying to set up a Windows machine to test this assumption now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1252866376 From mdoerr at openjdk.org Wed Jul 5 10:21:12 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 5 Jul 2023 10:21:12 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v2] In-Reply-To: References: Message-ID: On Thu, 29 Jun 2023 15:44:27 GMT, Matias Saavedra Silva wrote: >> 8301996: Move field resolution information out of the cpCache > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Coleen and Amit comments I'll take a closer look once the aarch64 code is in good shape and see if I can provide a PPC64 implementation.
I hope to find time for it in 2 weeks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14129#issuecomment-1621465257 From shade at openjdk.org Wed Jul 5 10:23:57 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 5 Jul 2023 10:23:57 GMT Subject: RFR: JDK-8305962: update jcstress to 0.17 In-Reply-To: References: Message-ID: On Fri, 30 Jun 2023 20:09:50 GMT, Leonid Mesnik wrote: > The fix changes jcstress version and update some parameters used by the jtreg wrapper. Not actually sure if modern jcstress can be executed with this wrapper, given how it might need to drop in JNA blobs before execution. There is a (not yet released) version of jcstress that provides a `-tb` (time budget) option that might be useful here. test/hotspot/jtreg/applications/jcstress/JcstressRunner.java line 45: > 43: */ > 44: @Artifact(organization = "org.openjdk.jcstress", name = "jcstress-tests-all", > 45: revision = "0.16", extension = "jar", unpack = false) This is not `0.17`, as the issue title implies. test/hotspot/jtreg/applications/jcstress/JcstressRunner.java line 62: > 60: + JcstressRunner.class.getName(), e); > 61: } > 62: return artifacts.get("org.openjdk.jcstress.jcstress-tests-all-0.16") Not `0.17` either. test/hotspot/jtreg/applications/jcstress/JcstressRunner.java line 114: > 112: // The "default" preset might take days for some tests > 113: // so use sanity testing by default. > 114: String mode = "sanity"; `sanity` mode is incorrect for actual testing runs. Should be at least `quick`. test/hotspot/jtreg/applications/jcstress/JcstressRunner.java line 129: > 127: > 128: extraFlags.add("-sc"); > 129: extraFlags.add("false"); Is this to shorten the execution time? I'd recommend `-af GLOBAL` too then, since we are reducing the test matrix for it anyway. ------------- Changes requested by shade (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14742#pullrequestreview-1514173614 PR Review Comment: https://git.openjdk.org/jdk/pull/14742#discussion_r1252885192 PR Review Comment: https://git.openjdk.org/jdk/pull/14742#discussion_r1252885350 PR Review Comment: https://git.openjdk.org/jdk/pull/14742#discussion_r1252885661 PR Review Comment: https://git.openjdk.org/jdk/pull/14742#discussion_r1252890456 From rkennke at openjdk.org Wed Jul 5 11:09:25 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 5 Jul 2023 11:09:25 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v40] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. > > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 60 commits: - Merge branch 'master' into JDK-8139457 - Address @shipilev's comments - Build fixes - Merge branch 'master' into JDK-8139457 - Correctly handle oop array element aligment in 32bit builds; move method from Universe to Array - Require uncompressed oops to be 8-byte-aligned - Corresponding XGC fixes - Merge branch 'master' into JDK-8139457 - Fix calls to removed instanceOopDesc::header_size() - Add cast - ... and 50 more: https://git.openjdk.org/jdk/compare/d6578bff...f1dbb859 ------------- Changes: https://git.openjdk.org/jdk/pull/11044/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=39 Stats: 707 lines in 42 files changed: 486 ins; 120 del; 101 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From simonis at openjdk.org Wed Jul 5 11:52:11 2023 From: simonis at openjdk.org (Volker Simonis) Date: Wed, 5 Jul 2023 11:52:11 GMT Subject: RFR: 8311500: StackWalker.getCallerClass() can throw if invoked reflectively Message-ID: As the included jtreg test demonstrates, `StackWalker.getCallerClass()` can throw an `UnsupportedOperationException` if called reflectively. Currently this only happens if we invoke `StackWalker.getCallerClass()` recursively reflectively, but this issue will become more prominent once we fix [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447). 
------------- Commit messages: - 8311500: StackWalker.getCallerClass() can throw if invoked reflectively Changes: https://git.openjdk.org/jdk/pull/14773/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14773&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311500 Stats: 83 lines in 3 files changed: 78 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/14773.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14773/head:pull/14773 PR: https://git.openjdk.org/jdk/pull/14773 From dholmes at openjdk.org Wed Jul 5 12:04:57 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 5 Jul 2023 12:04:57 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 10:00:26 GMT, Doug Simon wrote: >> I'm also investigating the details of the stack printing code - use of `os::platform_print_native_stack` is very inconsistent. > > I was under the impression that the only way to get native stack printing on windows is with `os::platform_print_native_stack`. I'm trying to set up a Windows machine to test this assumption now. It seems it is. I was not aware of that. So I need to fix `JavaThread::print_jni_stack`, and more generally consolidate this stack printing code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1253004943 From coleenp at openjdk.org Wed Jul 5 12:34:05 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 5 Jul 2023 12:34:05 GMT Subject: RFR: 8311077: Fix -Wconversion warnings in jvmti code [v4] In-Reply-To: References: <01jxrp1E-_5hGZ91QI0Og2XgbksXszXHWSzRiiuX9OM=.8d132e37-3821-4b48-a8e4-c3f2efb7b3ea@github.com> Message-ID: On Fri, 30 Jun 2023 13:10:06 GMT, Coleen Phillimore wrote: >> Please review change for mostly fixing return types in the constant pool and metadata to fix -Wconversion warnings in JVMTI code. The order of preference for changes are: 1. 
change the types to more distinct types (fields in the constant pool are u2 because that's their size in the classfile), 2. add direct int casts if the value has been checked in asserts above, and 3. checked_cast<> if not verified, and 4. added some pointer_delta_as_ints where needed. >> Tested with tier1-4. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > David's suggestions. Thanks David for the review. I do change the types as low in the call stack as practical with these changes. I think your comment above disappeared because the last change reverted that code. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14710#issuecomment-1621658684 From coleenp at openjdk.org Wed Jul 5 12:34:07 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 5 Jul 2023 12:34:07 GMT Subject: RFR: 8311077: Fix -Wconversion warnings in jvmti code [v3] In-Reply-To: References: <01jxrp1E-_5hGZ91QI0Og2XgbksXszXHWSzRiiuX9OM=.8d132e37-3821-4b48-a8e4-c3f2efb7b3ea@github.com> Message-ID: On Mon, 3 Jul 2023 01:20:08 GMT, David Holmes wrote: >> JvmtiRawMonitor _recursions is an int. Maybe it shouldn't be. You could file an RFE to change that if it's wrong. >> >> >> volatile int _recursions; // recursion count, 0 for first entry > > Sorry, yes was looking at the wrong `_recursions`. `int` is fine here, `intx` is odd as the max expected recursions should not depend on 32-bit versus 64-bit. I can't imagine why recursions would be 64 bit in the ObjectMonitor code, that does seem excessive. 
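As a rough illustration of option 3 in the change description quoted above (a checked cast where the value has not already been verified), here is a minimal standalone sketch. It is not the HotSpot `checked_cast<>` implementation: the names `fits_in` and `checked_narrow` are hypothetical, and the check is a plain round-trip comparison, which is an assumption about what such a helper might verify.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if converting `value` to To and back
// to From yields the original value (a round-trip check only).
template <typename To, typename From>
constexpr bool fits_in(From value) {
    return static_cast<From>(static_cast<To>(value)) == value;
}

// Hypothetical checked narrowing cast: asserts in debug builds that
// the conversion does not change the value, then performs it.
template <typename To, typename From>
To checked_narrow(From value) {
    assert(fits_in<To>(value) && "value changed by narrowing conversion");
    return static_cast<To>(value);
}

// Compile-time examples for a u2-sized target such as a constant pool index.
static_assert(fits_in<std::uint16_t>(65535), "largest u2 value fits");
static_assert(!fits_in<std::uint16_t>(65536), "one past the u2 range does not fit");
static_assert(!fits_in<std::uint16_t>(-1), "negative values do not fit in u2");
```

A round-trip check of this kind does not catch a sign change between types of the same width, so it complements rather than replaces choosing the right type lower in the call stack.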
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14710#discussion_r1253031977 From coleenp at openjdk.org Wed Jul 5 12:34:09 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 5 Jul 2023 12:34:09 GMT Subject: Integrated: 8311077: Fix -Wconversion warnings in jvmti code In-Reply-To: <01jxrp1E-_5hGZ91QI0Og2XgbksXszXHWSzRiiuX9OM=.8d132e37-3821-4b48-a8e4-c3f2efb7b3ea@github.com> References: <01jxrp1E-_5hGZ91QI0Og2XgbksXszXHWSzRiiuX9OM=.8d132e37-3821-4b48-a8e4-c3f2efb7b3ea@github.com> Message-ID: On Thu, 29 Jun 2023 12:17:23 GMT, Coleen Phillimore wrote: > Please review change for mostly fixing return types in the constant pool and metadata to fix -Wconversion warnings in JVMTI code. The order of preference for changes are: 1. change the types to more distinct types (fields in the constant pool are u2 because that's their size in the classfile), 2. add direct int casts if the value has been checked in asserts above, and 3. checked_cast<> if not verified, and 4. added some pointer_delta_as_ints where needed. > Tested with tier1-4. This pull request has now been integrated. Changeset: cf82e315 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/cf82e3152bba1d7332ecdc4dd57a2db2f0dc2aa8 Stats: 100 lines in 16 files changed: 5 ins; 2 del; 93 mod 8311077: Fix -Wconversion warnings in jvmti code Reviewed-by: fparain, matsaave, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/14710 From coleenp at openjdk.org Wed Jul 5 12:41:09 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 5 Jul 2023 12:41:09 GMT Subject: Integrated: 8311180: Remove unused unneeded definitions from globalDefinitions In-Reply-To: References: Message-ID: On Fri, 30 Jun 2023 14:57:04 GMT, Coleen Phillimore wrote: > I noticed this cleanup in a patch that Axel shared with me that I thought should be pushed on its own as trivial. > Tested with tier1 on Oracle supported platforms and looked for these on the others. 
This pull request has now been integrated. Changeset: 22e17c29 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/22e17c29a2a4eb546fae4c01ae435283654e3bb3 Stats: 23 lines in 5 files changed: 0 ins; 17 del; 6 mod 8311180: Remove unused unneeded definitions from globalDefinitions Co-authored-by: Axel Boldt-Christmas Reviewed-by: dholmes, aboldtch ------------- PR: https://git.openjdk.org/jdk/pull/14737 From coleenp at openjdk.org Wed Jul 5 12:41:06 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 5 Jul 2023 12:41:06 GMT Subject: RFR: 8311180: Remove unused unneeded definitions from globalDefinitions [v3] In-Reply-To: References: Message-ID: On Sat, 1 Jul 2023 15:05:13 GMT, Coleen Phillimore wrote: >> I noticed this cleanup in a patch that Axel shared with me that I thought should be pushed on its own as trivial. >> Tested with tier1 on Oracle supported platforms and looked for these on the others. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > This should be uintptr_t. Thanks for the review David and Axel and finding a bug, Thomas. Maybe p2i should be fixed? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14737#issuecomment-1621670888 From shade at openjdk.org Wed Jul 5 13:04:21 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 5 Jul 2023 13:04:21 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v40] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 11:09:25 GMT, Roman Kennke wrote: >> See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. >> >> Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. 
>> >> Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. >> >> Testing: >> - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] tier1 (x86_64, x86_32, aarch64, riscv) >> - [x] tier2 (x86_64, aarch64, riscv) >> - [x] tier3 (x86_64, riscv) > > Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 60 commits: > > - Merge branch 'master' into JDK-8139457 > - Address @shipilev's comments > - Build fixes > - Merge branch 'master' into JDK-8139457 > - Correctly handle oop array element aligment in 32bit builds; move method from Universe to Array > - Require uncompressed oops to be 8-byte-aligned > - Corresponding XGC fixes > - Merge branch 'master' into JDK-8139457 > - Fix calls to removed instanceOopDesc::header_size() > - Add cast > - ... and 50 more: https://git.openjdk.org/jdk/compare/d6578bff...f1dbb859 More review comments (partial). src/hotspot/share/c1/c1_LIRGenerator.cpp line 654: > 652: const int instance_size = align_object_size(klass->size_helper()); > 653: __ allocate_object(dst, scratch1, scratch2, scratch3, scratch4, > 654: oopDesc::header_size(), Now that `oopDesc::header_size` is reverted, you can keep the argument list on the same line, to avoid the superfluous change. src/hotspot/share/gc/shared/memAllocator.cpp line 323: > 321: // ...and zap just allocated tlab. > 322: #ifdef ASSERT > 323: Copy::fill_to_words(mem, allocation._allocated_tlab_size, badHeapWordVal); Still not clear to me why we can zap the headers, given https://github.com/openjdk/jdk/commit/0917ad432eb3d01c104f03973c5c7ff52c6dfefe was needed for G1? src/hotspot/share/oops/arrayOop.cpp line 1: > 1: /* Do you need this file now? 
test/hotspot/gtest/oops/test_arrayOop.cpp line 2: > 1: /* > 2: * Copyright (c) 1997, 2022, Oracle and/or its affiliates. All rights reserved. Should be `2023`. test/hotspot/gtest/oops/test_objArrayOop.cpp line 2: > 1: /* > 2: * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved. New file? If not, should be `2023`, at least. test/hotspot/jtreg/gtest/ObjArrayTests.java line 2: > 1: /* > 2: * Copyright (c) 2022 Amazon.com Inc. or its affiliates. All rights reserved. Inconsistent header format: drop the year. test/hotspot/jtreg/runtime/FieldLayout/ArrayBaseOffsets.java line 2: > 1: /* > 2: * Copyright (C) 2022, Amazon.com Inc. or its affiliates. All Rights Reserved. Inconsistent header format: drop the year. ------------- PR Review: https://git.openjdk.org/jdk/pull/11044#pullrequestreview-1514308429 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1252976727 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1253072169 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1252975349 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1252982236 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1252981940 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1252981132 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1252981026 From shade at openjdk.org Wed Jul 5 13:04:23 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 5 Jul 2023 13:04:23 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v38] In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 07:00:31 GMT, Roman Kennke wrote: >> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Array.java line 68: >> >>> 66: return !VM.getVM().isCompressedOopsEnabled(); >>> 67: } >>> 68: } >> >> I think `isCompressedOopsEnabled` already does the right thing, so: >> >> Suggestion: >> >> if (type == BasicType.T_OBJECT || type == 
BasicType.T_ARRAY) { >> return !VM.getVM().isCompressedOopsEnabled(); >> } > > On 32-bit, isCompressedOopsEnabled() would be false, so the method returns true, which means we would 8-byte align reference array elements? That doesn't seem right. Right, nevermind. I keep tripping over this. `isLP64` is actually symmetric to `#ifdef _LP64`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1252979322 From dnsimon at openjdk.org Wed Jul 5 13:07:02 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 5 Jul 2023 13:07:02 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 12:02:06 GMT, David Holmes wrote: >> I was under the impression that the only way to get native stack printing on windows is with `os::platform_print_native_stack`. I'm trying to set up a Windows machine to test this assumption now. > > It seems it is. I was not aware of that. So I need to fix `JavaThread::print_jni_stack`, and more generally consolidate this stack printing code. Yep, that sounds like a good idea. I assume that can be done in a follow up issue so I can now proceed to mach5 test and then merge this PR? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1253079559 From shade at openjdk.org Wed Jul 5 14:43:56 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 5 Jul 2023 14:43:56 GMT Subject: RFR: 8311500: StackWalker.getCallerClass() can throw if invoked reflectively In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 11:45:59 GMT, Volker Simonis wrote: > As the included jtreg test demonstrates, `StackWalker.getCallerClass()` can throw an `UnsupportedOperationException` if called reflectively. Currently this only happens if we invoke `StackWalker.getCallerClass()` recursively reflectively, but this issue will become more prominent once we fix [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447). 
Neither the PR nor the bug describes what the root cause is, so it is very hard to see whether the patch makes sense. Could you please describe the problem and solution briefly? src/hotspot/share/prims/stackwalk.cpp line 168: > 166: int max_nframes, int start_index, > 167: objArrayHandle frames_array, > 168: int& end_index, bool firstBatch, TRAPS) { The style is `is_first_batch`. ------------- PR Review: https://git.openjdk.org/jdk/pull/14773#pullrequestreview-1514691062 PR Review Comment: https://git.openjdk.org/jdk/pull/14773#discussion_r1253212866 From duke at openjdk.org Wed Jul 5 14:47:03 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 5 Jul 2023 14:47:03 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v8] In-Reply-To: <_ngJ160BB12VAsvGB03wxoYvf_Pdq_veCXMkxFlCfz0=.3c0f2c57-1b03-4d78-9021-4ea612fc2b68@github.com> References: <_ngJ160BB12VAsvGB03wxoYvf_Pdq_veCXMkxFlCfz0=.3c0f2c57-1b03-4d78-9021-4ea612fc2b68@github.com> Message-ID: On Wed, 5 Jul 2023 06:36:03 GMT, Thomas Stuefe wrote: >> I recently spent too much time trying to understand the interleaving of narrow Klass decoding with CDS. Thanks to @iklam for clarifying some details. This patch aims to make the code easier to understand and more correct. >> >> (*2023-07-05 Updated to fit patch description to the agreed final form*) >> >> ---- >> >> CDS narrow Klass handling plays a role for archived heap objects. >> >> When dumping heap objects, we must recompute their narrow Klass ids since the relative positions of archived Klass instances change compared to their live counterparts in the dump time JVM. >> We recompute those narrow Klass ids using the following encoding scheme: >> - base = future assumed mapping start address >> - shift = dump time (!)
JVMs encoding shift (A) >> see `ArchiveHeapWriter::update_header_for_requested_obj` https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/cds/archiveHeapWriter.cpp#L419-L425 >> >> At CDS runtime, we load the CDS archive, then place the class space behind it. We then initialize narrow Klass encoding for the resulting combined Klass range such that: >> - encoding base is the range start address (mapping base) >> - encoding shift is always zero >> see `CompressedKlassPointers::initialize` : https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/oops/compressedOops.cpp#L195-L231 >> >> The lengthy comment there is misleading and partly wrong (regrettable since it was mine :) >> >> At dump time, when initializing the JVM, we also set the encoding base to klass range start and shift to zero (also `CompressedKlassPointers::initialize`). That is the shift we later use for (A); hence, that shift is zero. >> >> ------------------- >> >> There are minor things wrong with the current code. Mainly, the *dump time* VM's narrow Klass encoding scheme is irrelevant for the encoding employed on the future runtime archive since we recalculate the narrow Klass ids for archived heap objects. That means: >> >> - In `CompressedKlassPointers::initialize`, there is no need to fix the encoding base and shift for the *dump time* JVM. Dump time JVM can use whatever base+shift it pleases; it can use the same code path as when CDS is off (e.g. use zero-based encoding). >> >> - In `ArchiveHeapWriter::update_header_for_requested_obj`, we should not use the dump time JVM shift for pre-computing the narrow Klass ids. Instead, we should use the *minimal shift needed for the maximal possible future Klass rang... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 12 commits: > > - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding > - only build initializer functions for 64-bit > - Consistently use runtime instead of future > - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding > - revert accidental change > - Fix Windows build > - Add alternative for !INCLUDE_CDS_JAVA_HEAP path > - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding > - fix comment > - Merge > - ... and 2 more: https://git.openjdk.org/jdk/compare/d6578bff...0ea2fa10 src/hotspot/share/oops/compressedOops.cpp line 202: > 200: address encoding_range_end = requested_base + encoding_range_size; > 201: > 202: assert(requested_base <= addr && encoding_range_end >= end, "Encoding does not cover the full Klass range"); Is it possible for `requested_base` to ever be less than `addr`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14688#discussion_r1253221378 From duke at openjdk.org Wed Jul 5 14:49:09 2023 From: duke at openjdk.org (duke) Date: Wed, 5 Jul 2023 14:49:09 GMT Subject: Withdrawn: 8304149: Avoid walking the CodeCache in DeoptimizationScope::deoptimize_marked In-Reply-To: References: Message-ID: On Wed, 15 Mar 2023 08:08:18 GMT, Axel Boldt-Christmas wrote: > Change DeoptimizationScope to keep track of the marked CompiledMethods in a list to avoid having to walk the CodeCache to find them again when deoptimizing. > > This adds a linked list to DeoptimizationScope which tracks the marked CompiledMethods for the active deoptimization generation. Then when deoptimize_marked is called the committing caller claims the list and uses it to deoptimize the linked (and marked) CompileMethods instead of iterating over the CodeCache to find them again. > > Testing: Oracle platforms tier 1-7 This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/jdk/pull/13036 From stuefe at openjdk.org Wed Jul 5 15:14:58 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 15:14:58 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v8] In-Reply-To: References: <_ngJ160BB12VAsvGB03wxoYvf_Pdq_veCXMkxFlCfz0=.3c0f2c57-1b03-4d78-9021-4ea612fc2b68@github.com> Message-ID: On Wed, 5 Jul 2023 14:44:35 GMT, Ashutosh Mehra wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 12 commits: >> >> - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding >> - only build initializer functions for 64-bit >> - Consistently use runtime instead of future >> - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding >> - revert accidental change >> - Fix Windows build >> - Add alternative for !INCLUDE_CDS_JAVA_HEAP path >> - Merge branch 'master' into fix-cleanup-CDS-nKlass-encoding >> - fix comment >> - Merge >> - ... and 2 more: https://git.openjdk.org/jdk/compare/d6578bff...0ea2fa10 > > src/hotspot/share/oops/compressedOops.cpp line 202: > >> 200: address encoding_range_end = requested_base + encoding_range_size; >> 201: >> 202: assert(requested_base <= addr && encoding_range_end >= end, "Encoding does not cover the full Klass range"); > > Is it possible for `requested_base` to ever be less than `addr`? Not at the moment, no. Probably never. I separated those two mainly for cleanliness, since Klass range and encoding range are different things. 
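To make the distinction between the Klass range and the encoding range in this exchange more concrete, here is a minimal standalone sketch (not HotSpot code) of a base-plus-shift narrow Klass encoding and the coverage check expressed by the quoted assert. The struct `KlassEncoding`, its fields, the helper functions and the example addresses are all hypothetical; the sketch assumes 32-bit narrow ids that are decoded as base + (id << shift).

```cpp
#include <cstdint>

// Hypothetical description of a narrow Klass encoding scheme.
struct KlassEncoding {
    std::uint64_t base;   // encoding base address
    unsigned shift;       // encoding shift
    unsigned nk_bits;     // width of a narrow Klass id in bits (assumed 32)
};

// Size of the address range that can be reached with this encoding.
constexpr std::uint64_t encoding_range_size(const KlassEncoding& e) {
    return std::uint64_t{1} << (e.nk_bits + e.shift);
}

// Mirrors the shape of the quoted assert: the encoding range
// [base, base + size) must cover the Klass range [addr, end).
constexpr bool covers(const KlassEncoding& e, std::uint64_t addr, std::uint64_t end) {
    return e.base <= addr && e.base + encoding_range_size(e) >= end;
}

constexpr std::uint32_t encode(const KlassEncoding& e, std::uint64_t klass_addr) {
    return static_cast<std::uint32_t>((klass_addr - e.base) >> e.shift);
}

constexpr std::uint64_t decode(const KlassEncoding& e, std::uint32_t nk) {
    return e.base + (std::uint64_t{nk} << e.shift);
}

// Example values only: base at the start of the Klass range, shift zero.
constexpr KlassEncoding kExample{0x800000000ULL, 0, 32};
static_assert(covers(kExample, 0x800000000ULL, 0x840000000ULL), "range is covered");
static_assert(!covers(kExample, 0x7FFFF0000ULL, 0x840000000ULL), "addr below base is not covered");
static_assert(decode(kExample, encode(kExample, 0x800001000ULL)) == 0x800001000ULL, "round trip");
```

In this sketch a base that lies below the start of the Klass range would still pass the coverage check, which is why the question of asserting equality instead comes up in the thread.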
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14688#discussion_r1253261336 From duke at openjdk.org Wed Jul 5 15:28:56 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 5 Jul 2023 15:28:56 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v8] In-Reply-To: References: <_ngJ160BB12VAsvGB03wxoYvf_Pdq_veCXMkxFlCfz0=.3c0f2c57-1b03-4d78-9021-4ea612fc2b68@github.com> Message-ID: On Wed, 5 Jul 2023 15:11:55 GMT, Thomas Stuefe wrote: >> src/hotspot/share/oops/compressedOops.cpp line 202: >> >>> 200: address encoding_range_end = requested_base + encoding_range_size; >>> 201: >>> 202: assert(requested_base <= addr && encoding_range_end >= end, "Encoding does not cover the full Klass range"); >> >> Is it possible for `requested_base` to ever be less than `addr`? > > Not at the moment, no. Probably never. I separated those two mainly for cleanliness, since Klass range and encoding range are different things. I am fine having them separately but that assert condition forced me to think if requested_base can ever be less than addr. If not, then I guess assert for `requested_base == addr` ? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14688#discussion_r1253279213 From pchilanomate at openjdk.org Wed Jul 5 15:35:16 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 5 Jul 2023 15:35:16 GMT Subject: RFR: 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite [v3] In-Reply-To: References: Message-ID: > Please review the following fix. Runtime methods called through the SharedRuntime::generate_resolve_blob() stub always return the value stored in _from_compiled_entry as the entry point to the callee method. 
This will either be the entry point to the compiled version of callee if there is one or the c2i adapter entry point. But this doesn't consider the case where an EnterInterpOnlyModeClosure handshake catches the JavaThread in the transition back to Java on those methods. In that case we should return the c2i adapter entry point even if there is a compiled entry point. Otherwise the JavaThread will continue calling the compiled versions of methods without noticing it's in interpreted only mode until it either calls a method that hasn't been compiled yet or it returns to the caller of that resolved callee where the change to interpreter only mode happened (since the EnterInterpOnlyModeClosure handshake marked all the frames on the stack for deoptimization). > > This is a long standing bug but has been made visible with the assert added as part of 8288949 where a related issue was fixed. There are more details in the bug comments about how this specific crash happens and its relation with 8288949. I also attached a reproducer. > > These runtime methods are already using JRT_BLOCK_ENTRY/JRT_BLOCK so that the entry point to the callee is fetched only after the last possible safepoint in JRT_BLOCK_END. This guarantees that we will not return an entry point to compiled code that has already been removed. So the fix just adds a check to verify if the JavaThread entered into interpreted only mode in that transition back to Java and to return the c2i entry point instead. > > I tested the patch in mach5 tiers 1-6. I also verified it with the reproducer I attached to the bug. I didn't include it as an additional test but I can do that if desired. 
> > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: revert to just remove assert added in 8288949 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14108/files - new: https://git.openjdk.org/jdk/pull/14108/files/e2209ac1..3eddfc59 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14108&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14108&range=01-02 Stats: 71 lines in 3 files changed: 35 ins; 12 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/14108.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14108/head:pull/14108 PR: https://git.openjdk.org/jdk/pull/14108 From pchilanomate at openjdk.org Wed Jul 5 15:35:18 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 5 Jul 2023 15:35:18 GMT Subject: RFR: 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite [v2] In-Reply-To: References: Message-ID: On Thu, 29 Jun 2023 07:00:05 GMT, Serguei Spitsyn wrote: > The fix looks pretty good to me. I have one question on the comment: > > ``` > 1288 // TODO: _linkToNative doesn't have an interpreted version so we always > 1289 // return the compiled code entry point. > ``` > > I guess there is already a bug filed on this, is not it? > Thanks for the review Serguei. There are additional changes that need to be made to handle the case of resolving method handle intrinsics, since we cannot just return the c2i as I mentioned above. There are also issues with enterSpecial and doYield since those do not have interpreted versions. I have been testing a patch (https://github.com/pchilano/jdk/tree/JDK-8302351-alt) but might be too risky to backport it to 21 at this stage, so I just reverted to remove the asserts in here. 
I filed 8311516 to fix the root issue there. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14108#issuecomment-1622002445 From rkennke at openjdk.org Wed Jul 5 15:39:26 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 5 Jul 2023 15:39:26 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v41] In-Reply-To: References: Message-ID: <5hHB_HoOlMWIfxR5F-dMALDAE1-lJ1-676KFZhNyeIA=.9ac522fa-41ca-45d8-8ab8-18f0b0f3486b@github.com> > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. 
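The offsets mentioned in this description can be checked with a little arithmetic. The following is a minimal sketch, not HotSpot code: the function names are hypothetical, and it assumes a 4-byte array length field, an 8-byte word, and a length field at offset 16 when running with -XX:-UseCompressedClassPointers (8-byte mark word followed by an 8-byte Klass pointer).

```cpp
#include <cstddef>

// Round `value` up to the next multiple of `alignment`.
constexpr std::size_t align_up(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// First element offset when array bases are aligned to an 8-byte word.
// Assumes a 4-byte length field starting at `length_offset`.
constexpr std::size_t word_aligned_base(std::size_t length_offset) {
    return align_up(length_offset + 4, 8);
}

// First element offset when array bases are aligned to the element size only.
constexpr std::size_t element_aligned_base(std::size_t length_offset,
                                           std::size_t element_size) {
    return align_up(length_offset + 4, element_size);
}

// Uncompressed class pointers (assumed layout): length at 16, header ends at 20.
static_assert(word_aligned_base(16) == 24, "4-byte gap between 20 and 24");
static_assert(element_aligned_base(16, 1) == 20, "byte[] needs no gap");
static_assert(element_aligned_base(16, 8) == 24, "long[] still starts at 24");

// Length field at offset 8, as described for Lilliput: header ends at 12.
static_assert(word_aligned_base(8) == 16, "gap between offset 12 and 16");
static_assert(element_aligned_base(8, 4) == 12, "int[] can start at 12");
static_assert(element_aligned_base(8, 8) == 16, "long[] starts at 16");
```

Under these assumptions only arrays with 8-byte elements keep the padding, while arrays of smaller elements can start directly after the length field.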
> > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Some cleanups ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/f1dbb859..482c794d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=40 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=39-40 Stats: 40 lines in 3 files changed: 7 ins; 30 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From matsaave at openjdk.org Wed Jul 5 16:12:16 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 5 Jul 2023 16:12:16 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v3] In-Reply-To: References: Message-ID: > 8301996: Move field resolution information out of the cpCache Matias Saavedra Silva has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains four commits: - Fixed test and is_resolved - Merge branch 'master' into field_entry_8301996 - Coleen and Amit comments - 8301996: Move field resolution information out of the cpCache ------------- Changes: https://git.openjdk.org/jdk/pull/14129/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=02 Stats: 1092 lines in 39 files changed: 732 ins; 119 del; 241 mod Patch: https://git.openjdk.org/jdk/pull/14129.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14129/head:pull/14129 PR: https://git.openjdk.org/jdk/pull/14129 From iklam at openjdk.org Wed Jul 5 16:45:59 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 5 Jul 2023 16:45:59 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Tue, 20 Jun 2023 15:56:57 GMT, Erik Österlund wrote: >> Ashutosh Mehra has updated the pull request incrementally with three additional commits since the last revision: >> >> - Remove unnecessary assert and condition for UseG1GC >> >> Signed-off-by: Ashutosh Mehra >> - Rename CollectedHeap::reserved() to CollectedHeap::reserved_range() >> >> Signed-off-by: Ashutosh Mehra >> - Rename alloc_archive_space to allocate_archive_space >> >> Signed-off-by: Ashutosh Mehra > > It's a bit disappointing for a PR aiming to make heap archiving GC agnostic, to make assumptions about GC internal memory layout, that doesn't apply to all collectors. > We have discussed previously with @iklam an approach where materializing archived objects uses the normal object allocation APIs. That would for real make the heap archiving mechanism GC agnostic. I would rather see that being prototyped, than a not GC agnostic approach that we might throw away right after it gets integrated, in favour of the more GC agnostic approach. > Based on discussion with @fisk, I created an RFE for investigating his idea.
Please see https://bugs.openjdk.org/browse/JDK-8310823
>
> It seems quite promising to me, and will greatly reduce (or eliminate) the interface between CDS and the collectors.
>
> I would like to keep the current performance as much as possible, so I am leaning towards having an API for CDS to tell the collector about the preferred location of the archived objects, to allow mmap'ing and reduce/avoid relocation. But such a hinting API will be much smaller than the ones proposed by this PR.
>
> (We'd also need a reverse API for the collector to tell CDS what the preferred address would be, during CDS dump time).

I've done a very simple (and rough) implementation of @fisk's idea, without any optimizations. Every relocation is done via a hashtable lookup. For the default CDS archive, this makes about 48000 relocations on start-up.

https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc?expand=1

$ perf stat -r 40 java --version > /dev/null
0.015065 +- 0.000228 seconds time elapsed ( +- 1.51% )
$ perf stat -r 40 java -XX:+NewArchiveHeapLoading --version > /dev/null
0.020598 +- 0.000215 seconds time elapsed ( +- 1.04% )
$ perf stat -r 40 java -XX:+UseSerialGC --version > /dev/null
0.013929 +- 0.000229 seconds time elapsed ( +- 1.64% )
$ perf stat -r 40 java -XX:+UseSerialGC -XX:+NewArchiveHeapLoading --version > /dev/null
0.019107 +- 0.000222 seconds time elapsed ( +- 1.16% )

The cost of the individual object allocation and relocation is about 5ms. So far the slowdown doesn't seem too outrageous. My next step is to optimize the relocation code to see how much faster it can get.

I noticed that the objects in `ArchiveHeapLoader::newcode_runtime_allocate_objects()` are allocated in at least two contiguous blocks, due to TLAB overflow.
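For readers who want a picture of "every relocation is done via a hashtable lookup", here is a minimal sketch. It is not the code from the linked branch: `Address`, `RelocationTable` and `relocate_fields` are hypothetical names, and an object is modelled as nothing more than a list of pointer fields.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical model: addresses are plain integers, 0 stands for null.
using Address = std::uintptr_t;

struct RelocationTable {
    // Maps the address an object had in the archive to the address it was
    // given when it was allocated at runtime.
    std::unordered_map<Address, Address> archived_to_runtime;

    void record(Address archived, Address runtime) {
        archived_to_runtime[archived] = runtime;
    }

    // One hashtable lookup per non-null pointer; null stays null.
    // Throws std::out_of_range if the target was never recorded.
    Address relocate(Address archived) const {
        if (archived == 0) return 0;
        return archived_to_runtime.at(archived);
    }
};

// Rewrites every pointer field in place and returns how many lookups were made.
inline std::size_t relocate_fields(std::vector<Address>& fields,
                                   const RelocationTable& table) {
    std::size_t lookups = 0;
    for (Address& field : fields) {
        if (field != 0) ++lookups;
        field = table.relocate(field);
    }
    return lookups;
}
```

Because the table maps each object individually, the runtime copies do not have to be contiguous, which fits the observation about objects landing in more than one block. The price is one lookup per pointer, which is presumably where an optimized version would try to save time.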
------------- PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1622121484 From stuefe at openjdk.org Wed Jul 5 16:54:57 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 16:54:57 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v8] In-Reply-To: References: <_ngJ160BB12VAsvGB03wxoYvf_Pdq_veCXMkxFlCfz0=.3c0f2c57-1b03-4d78-9021-4ea612fc2b68@github.com> Message-ID: On Wed, 5 Jul 2023 15:25:38 GMT, Ashutosh Mehra wrote: >> Not at the moment, no. Probably never. I separated those two mainly for cleanliness, since Klass range and encoding range are different things. > > I am fine having them separately but that assert condition forced me to think if requested_base can ever be less than addr. If not, then I guess assert for `requested_base == addr` ? Sure, makes sense. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14688#discussion_r1253381689 From stuefe at openjdk.org Wed Jul 5 17:15:18 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 17:15:18 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v9] In-Reply-To: References: Message-ID: > I recently spent too much time trying to understand the interleaving of narrow Klass decoding with CDS. Thanks to @iklam for clarifying some details. This patch aims to make the code easier to understand and more correct. > > (*2023-07-05 Updated to fit patch description to the agreed final form*) > > ---- > > CDS narrow Klass handling plays a role for archived heap objects. > > When dumping heap objects, we must recompute their narrow Klass ids since the relative positions of archived Klass instances change compared to their live counterparts in the dump time JVM. > We recompute those narrow Klass ids using the following encoding scheme: > - base = future assumed mapping start address > - shift = dump time (!) 
JVMs encoding shift (A) > see `ArchiveHeapWriter::update_header_for_requested_obj` https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/cds/archiveHeapWriter.cpp#L419-L425 > > At CDS runtime, we load the CDS archive, then place the class space behind it. We then initialize narrow Klass encoding for the resulting combined Klass range such that: > - encoding base is the range start address (mapping base) > - encoding shift is always zero > see `CompressedKlassPointers::initialize` : https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/oops/compressedOops.cpp#L195-L231 > > The lengthy comment there is misleading and partly wrong (regrettable since it was mine :) > > At dump time, when initializing the JVM, we also set the encoding base to klass range start and shift to zero (also `CompressedKlassPointers::initialize`). That is the shift we later use for (A); hence, that shift is zero. > > ------------------- > > There are minor things wrong with the current code. Mainly, the *dump time* VM's narrow Klass encoding scheme is irrelevant for the encoding employed on the future runtime archive since we recalculate the narrow Klass ids for archived heap objects. That means: > > - In `CompressedKlassPointers::initialize`, there is no need to fix the encoding base and shift for the *dump time* JVM. Dump time JVM can use whatever base+shift it pleases; it can use the same code path as when CDS is off (e.g. use zero-based encoding). > > - In `ArchiveHeapWriter::update_header_for_requested_obj`, we should not use the dump time JVM shift for pre-computing the narrow Klass ids. Instead, we should use the *minimal shift needed for the maximal possible future Klass range size*. The comment in ArchiveHeapWriter explains this in g... 
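To make the base/shift scheme above concrete, here is a minimal sketch of the arithmetic. It is an illustration only, not the HotSpot code: `encode`, `decode` and `minimal_shift` are hypothetical names, narrow ids are assumed to be 32 bits wide, and `minimal_shift` is just one plausible reading of "the minimal shift needed for the maximal possible future Klass range size".

```cpp
#include <cstdint>

// Narrow id = (address - base) >> shift. Assumes addr >= base and that the
// offset is a multiple of (1 << shift).
constexpr std::uint32_t encode(std::uint64_t addr, std::uint64_t base, unsigned shift) {
    return static_cast<std::uint32_t>((addr - base) >> shift);
}

// Inverse operation: address = base + (narrow id << shift).
constexpr std::uint64_t decode(std::uint32_t narrow, std::uint64_t base, unsigned shift) {
    return base + (static_cast<std::uint64_t>(narrow) << shift);
}

// Smallest shift for which every offset in [0, range_size) still fits into
// 32 bits after shifting. Assumes range_size > 0.
constexpr unsigned minimal_shift(std::uint64_t range_size) {
    unsigned shift = 0;
    while (shift < 32 && ((range_size - 1) >> shift) > 0xFFFFFFFFull) {
        ++shift;
    }
    return shift;
}
```

The point of the sketch is that ids computed with one base/shift pair only decode correctly with the same pair, so the values written into archived object headers have to match what the runtime JVM will use, not what the dump-time JVM happened to use.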
Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: Feedback Ashu ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14688/files - new: https://git.openjdk.org/jdk/pull/14688/files/0ea2fa10..0286c510 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14688&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14688&range=07-08 Stats: 4 lines in 1 file changed: 3 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14688.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14688/head:pull/14688 PR: https://git.openjdk.org/jdk/pull/14688 From simonis at openjdk.org Wed Jul 5 17:25:24 2023 From: simonis at openjdk.org (Volker Simonis) Date: Wed, 5 Jul 2023 17:25:24 GMT Subject: RFR: 8311500: StackWalker.getCallerClass() can throw if invoked reflectively [v2] In-Reply-To: References: Message-ID: > As the included jtreg test demonstrates, `StackWalker.getCallerClass()` can throw an `UnsupportedOperationException` if called reflectively. Currently this only happens if we invoke `StackWalker.getCallerClass()` recursively reflectively, but this issue will become more prominent once we fix [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447). The gory details follow below: > > The protocol between the Java API and the JVM for `StackWalker.getCallerClass()/walk()` is as follows: > - On the Java side, `StackWalker` calls into `StackStreamFactory` for the real work. > - For `StackWalker.getCallerClass()` `StackStreamFactory` basically creates a `Class[]` which will be passed down and filled in the JVM. For `StackWalker.walk()` it will normally be a `StackFrameInfo[]` (or a `LiveStackFrameInfo[]` if the internal `ExtendedOption.LOCALS_AND_OPERANDS` option was used). > - The default size of this arrays is currently `StackStreamFactory.SMALL_BATCH` which is 8 (but see [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447)). 
> - `StackStreamFactory` then calls `AbstractStackWalker.callStackWalk()` which is natively implemented in the VM by `JVM_CallStackWalk()`. > - `JVM_CallStackWalk()` calls `StackWalk::walk()` which calls `StackWalk::fetchFirstBatch()` which calls `StackWalk::fill_in_frames()` which walks the stack and fills in the available class/stackframe slots in the passed-in array until the array is full or there are no more stack frames. > - Once `StackWalk::fill_in_frames()` returns, `StackWalk::fetchFirstBatch()` calls back to Java by invoking `AbstractStackWalker::doStackWalk()` to consume the result. > - `AbstractStackWalker::doStackWalk()` calls `consumeFrames()` (which is overridden depending on whether we initially called `getCallerClass()` or `walk()`) which consumes the frames until it either finishes (e.g. finds the caller class) or until there are no more frames. > - In the latter case `consumeFrames()` will call into the VM again by calling `AbstractStackWalker.fetchStackFrames()` to fetch additional frames from the stack. > - `AbstractStackWalker.fetchStackFrames()` is implemented by `JVM_MoreStackWalk()` which calls `StackWalk::fetchNextBatch()` which calls `StackWalk::fill_in_frames()` (the same method that already fetched the initial batch of frames). > > Following is a stacktrace of what I've explained so far: > > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x143a96a] StackWalk::fill_in_frames...
Volker Simonis has updated the pull request incrementally with one additional commit since the last revision: Rename new parameter according to the HS coding conventions ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14773/files - new: https://git.openjdk.org/jdk/pull/14773/files/7ae5b2b5..1ec6d90b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14773&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14773&range=00-01 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14773.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14773/head:pull/14773 PR: https://git.openjdk.org/jdk/pull/14773 From simonis at openjdk.org Wed Jul 5 17:25:24 2023 From: simonis at openjdk.org (Volker Simonis) Date: Wed, 5 Jul 2023 17:25:24 GMT Subject: RFR: 8311500: StackWalker.getCallerClass() can throw if invoked reflectively [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 14:41:21 GMT, Aleksey Shipilev wrote: > Neither PR nor the bug describes what the root cause it, so it is very hard to see if patch makes sense. Please describe the problem and solution briefly? Fair enough :) I've put the details into the initial comment and also renamed the new parameter according to the HS coding conventions. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14773#issuecomment-1622176346 From stuefe at openjdk.org Wed Jul 5 17:28:15 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 17:28:15 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v11] In-Reply-To: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: > (*Updated 2023-07-05 to reflect the current state of the patch*) > > This RFE adds the option to auto-trim the Glibc heap as part of the GC cycle. 
If the VM process suffered high temporary malloc spikes (regardless of whether from JVM- or user code), this could recover significant amounts of memory. > > We discussed this a year ago [1], but the item got pushed to the bottom of my work pile, therefore, it took longer than I thought. > > ### Motivation > > The Glibc is reluctant to return memory to the OS, more so than other allocators. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching, and a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap (the typical native application). The JVM, however, clusters allocations and for a lot of use cases rolls its own memory management via mmap. And app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. > > To help, Glibc exports an API to trim the C-heap: `malloc_trim(3)`. With JDK 18 [2], SAP contributed a new jcmd command to *manually* trim the C-heap on Linux. This RFE adds a complementary way to trim automatically. > > #### Is this even a problem? > > Yes. > > The JVM clusters most native memory allocations and satisfies them with mmap. But there are enough C-heap allocations left to cause malloc spikes that are subject of memory retention. Note that one example are hotspot arenas themselves. > > But many cases of high memory retention in Glibc I have seen in third-party JNI code. Libraries allocate large buffers via malloc as temporary buffers. In fact, since we introduced the jcmd "System.trim_native_heap", some of our customers started to call this command periodically in scripts to counter these issues. > > ### How trimming works > > Trimming is done via `malloc_trim(2)`. `malloc_trim` will iterate over all arenas and trim each one subsequently. 
While doing that, it will lock the arena, which may cause some (but not all) subsequent actions on the same arenas to block. glibc also trims automatically on free, but that is very limited (see https://github.com/openjdk/jdk/pull/10085#issuecomment-1619638641 for details). > > `malloc_trim` offers almost no way to control its behavior; in particular, no way to limit its runtime. Its runtime will depend on the size of the reclaimed memory. ... Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 41 commits: - fix windows build - Merge branch 'master' into JDK-8293114-GC-trim-native - wip - Merge branch 'master' into JDK-8293114-GC-trim-native - wip - Remove adaptive stepdown coding - Merge master - wip - Merge branch 'master' into JDK-8293114-GC-trim-native - wip - ... and 31 more: https://git.openjdk.org/jdk/compare/22e17c29...162b880a ------------- Changes: https://git.openjdk.org/jdk/pull/10085/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=10085&range=10 Stats: 653 lines in 21 files changed: 645 ins; 1 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/10085.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/10085/head:pull/10085 PR: https://git.openjdk.org/jdk/pull/10085 From stuefe at openjdk.org Wed Jul 5 17:28:19 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 17:28:19 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v10] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: On Thu, 2 Mar 2023 16:37:47 GMT, Thomas Stuefe wrote: >> (*Updated 2023-07-05 to reflect the current state of the patch*) >> >> This RFE adds the option to auto-trim the Glibc heap as part of the GC cycle. If the VM process suffered high temporary malloc spikes (regardless of whether from JVM- or user code), this could recover significant amounts of memory. 
>> >> We discussed this a year ago [1], but the item got pushed to the bottom of my work pile, therefore, it took longer than I thought. >> >> ### Motivation >> >> The Glibc is reluctant to return memory to the OS, more so than other allocators. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching, and a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap (the typical native application). The JVM, however, clusters allocations and for a lot of use cases rolls its own memory management via mmap. And app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. >> >> To help, Glibc exports an API to trim the C-heap: `malloc_trim(3)`. With JDK 18 [2], SAP contributed a new jcmd command to *manually* trim the C-heap on Linux. This RFE adds a complementary way to trim automatically. >> >> #### Is this even a problem? >> >> Yes. >> >> The JVM clusters most native memory allocations and satisfies them with mmap. But there are enough C-heap allocations left to cause malloc spikes that are subject of memory retention. Note that one example are hotspot arenas themselves. >> >> But many cases of high memory retention in Glibc I have seen in third-party JNI code. Libraries allocate large buffers via malloc as temporary buffers. In fact, since we introduced the jcmd "System.trim_native_heap", some of our customers started to call this command periodically in scripts to counter these issues. >> >> ### How trimming works >> >> Trimming is done via `malloc_trim(2)`. `malloc_trim` will iterate over all arenas and trim each one subsequently. While doing that, it will lock the arena, which may cause some (but not all) subsequent actions on the same arenas to block. 
glibc also trims automatically on free, but that is very limited (see https://github.com/openjdk/jdk/pull/10085#issuecomment-1619638641 for details). >> >> `malloc_trim` offers almost no way to control its behavior; in particular, no way to limit its runtime. Its run... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits: > > - wip > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - merge master > - wip > - wip > - rename GCTrimNative TrimNative > - rename NativeTrimmer > - rename > - src/hotspot/share/gc/shared/gcTrimNativeHeap.cpp > - ... and 24 more: https://git.openjdk.org/jdk/compare/99f5687e...5d41312e Not yet bot Down Bot. Down. ------------- PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1527036482 PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1604929041 From shade at openjdk.org Wed Jul 5 17:28:20 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 5 Jul 2023 17:28:20 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v10] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: On Fri, 23 Jun 2023 20:41:26 GMT, Thomas Stuefe wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits: >> >> - wip >> - Merge branch 'master' into JDK-8293114-GC-trim-native >> - wip >> - merge master >> - wip >> - wip >> - rename GCTrimNative TrimNative >> - rename NativeTrimmer >> - rename >> - src/hotspot/share/gc/shared/gcTrimNativeHeap.cpp >> - ... and 24 more: https://git.openjdk.org/jdk/compare/99f5687e...5d41312e > > Down Bot. Down. Do you need help moving this forward, @tstuefe? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1614454216 From rehn at openjdk.org Wed Jul 5 17:28:20 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 5 Jul 2023 17:28:20 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v10] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: <86zN5l0ylpFEFsUtxYKb9y_SkyeqzcH4CYAUTvmLYXY=.4082a3af-6ca4-4d53-8b84-9a8e5b6c4f27@github.com> On Fri, 23 Jun 2023 20:41:26 GMT, Thomas Stuefe wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits: >> >> - wip >> - Merge branch 'master' into JDK-8293114-GC-trim-native >> - wip >> - merge master >> - wip >> - wip >> - rename GCTrimNative TrimNative >> - rename NativeTrimmer >> - rename >> - src/hotspot/share/gc/shared/gcTrimNativeHeap.cpp >> - ... and 24 more: https://git.openjdk.org/jdk/compare/99f5687e...5d41312e > > Down Bot. Down. > Do you need help moving this forward, @tstuefe? Agreed, I think you should open PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1614505796 From stuefe at openjdk.org Wed Jul 5 17:28:20 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 17:28:20 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v10] In-Reply-To: <86zN5l0ylpFEFsUtxYKb9y_SkyeqzcH4CYAUTvmLYXY=.4082a3af-6ca4-4d53-8b84-9a8e5b6c4f27@github.com> References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> <86zN5l0ylpFEFsUtxYKb9y_SkyeqzcH4CYAUTvmLYXY=.4082a3af-6ca4-4d53-8b84-9a8e5b6c4f27@github.com> Message-ID: On Fri, 30 Jun 2023 11:13:46 GMT, Robbin Ehn wrote: >> Down Bot. Down. > >> Do you need help moving this forward, @tstuefe? > > Agreed, I think you should open PR. Thanks @robehn and @shipilev ! This keeps getting pushed down the pile. 
I'll dust this up next week and post a PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1614581789 From shade at openjdk.org Wed Jul 5 17:28:21 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 5 Jul 2023 17:28:21 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v10] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: On Thu, 2 Mar 2023 16:37:47 GMT, Thomas Stuefe wrote: >> (*Updated 2023-07-05 to reflect the current state of the patch*) >> >> This RFE adds the option to auto-trim the Glibc heap as part of the GC cycle. If the VM process suffered high temporary malloc spikes (regardless of whether from JVM- or user code), this could recover significant amounts of memory. >> >> We discussed this a year ago [1], but the item got pushed to the bottom of my work pile, therefore, it took longer than I thought. >> >> ### Motivation >> >> The Glibc is reluctant to return memory to the OS, more so than other allocators. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching, and a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap (the typical native application). The JVM, however, clusters allocations and for a lot of use cases rolls its own memory management via mmap. And app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. >> >> To help, Glibc exports an API to trim the C-heap: `malloc_trim(3)`. With JDK 18 [2], SAP contributed a new jcmd command to *manually* trim the C-heap on Linux. This RFE adds a complementary way to trim automatically. >> >> #### Is this even a problem? >> >> Yes. 
>> >> The JVM clusters most native memory allocations and satisfies them with mmap. But there are enough C-heap allocations left to cause malloc spikes that are subject of memory retention. Note that one example are hotspot arenas themselves. >> >> But many cases of high memory retention in Glibc I have seen in third-party JNI code. Libraries allocate large buffers via malloc as temporary buffers. In fact, since we introduced the jcmd "System.trim_native_heap", some of our customers started to call this command periodically in scripts to counter these issues. >> >> ### How trimming works >> >> Trimming is done via `malloc_trim(2)`. `malloc_trim` will iterate over all arenas and trim each one subsequently. While doing that, it will lock the arena, which may cause some (but not all) subsequent actions on the same arenas to block. glibc also trims automatically on free, but that is very limited (see https://github.com/openjdk/jdk/pull/10085#issuecomment-1619638641 for details). >> >> `malloc_trim` offers almost no way to control its behavior; in particular, no way to limit its runtime. Its run... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits: > > - wip > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - merge master > - wip > - wip > - rename GCTrimNative TrimNative > - rename NativeTrimmer > - rename > - src/hotspot/share/gc/shared/gcTrimNativeHeap.cpp > - ... and 24 more: https://git.openjdk.org/jdk/compare/99f5687e...5d41312e Okay, cool. The reason I am asking it that glibc "memory leaks" are not uncommon in production cases. There are quite a few libraries that churn native memory (looks at Netty), even with internal pooling. Having something that is backportable to 21u and 17u would be a plus. So I see that `malloc_trim` actually has the parameter `pad`: The pad argument specifies the amount of free space to leave untrimmed at the top of the heap. 
If this argument is 0, only the minimum amount of memory is maintained at the top of the heap (i.e., one page or less). A nonzero argument can be used to maintain some trailing space at the top of the heap in order to allow future allocations to be made without having to extend the heap with [sbrk(2)](https://man7.org/linux/man-pages/man2/sbrk.2.html). Does current glibc honor that argument at all? Can we use that to control the incrementality of the trim? Since `malloc_trim` return value also describes if trimming was successful, we could probably come up with the adaptive scheme that guesses which pad to use, so that we don't trim the entirety of the memory, but still trim something. ------------- PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1614633876 PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1621991402 From stuefe at openjdk.org Wed Jul 5 17:28:21 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 17:28:21 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v10] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: On Fri, 30 Jun 2023 13:12:35 GMT, Aleksey Shipilev wrote: > Okay, cool. The reason I am asking it that glibc "memory leaks" are not uncommon in production cases. There are quite a few libraries that churn native memory (looks at Netty), even with internal pooling. Having something that is backportable to 21u and 17u would be a plus. Thanks, nice to see a confirmation that this is useful. This patch started out simple, and then I fear I started to seriously over-engineer the GC part of it. I'll give it a look next week to see if I can dumb it down. > So I see that `malloc_trim` actually has the parameter `pad`: > > ``` > The pad argument specifies the amount of free space to leave > untrimmed at the top of the heap. 
If this argument is 0, only > the minimum amount of memory is maintained at the top of the heap > (i.e., one page or less). A nonzero argument can be used to > maintain some trailing space at the top of the heap in order to > allow future allocations to be made without having to extend the > heap with [sbrk(2)](https://man7.org/linux/man-pages/man2/sbrk.2.html). > ``` > > Does current glibc honor that argument at all? Can we use that to control the incrementality of the trim? Since `malloc_trim` return value also describes if trimming was successful, we could probably come up with the adaptive scheme that guesses which pad to use, so that we don't trim the entirety of the memory, but still trim something. This was one of the first things I tried last year. It has very little impact. IIRC it only applies to the main arena (the one using sbrk) and limits the amount by which the break was lowered? I may remember the details wrong, but my tests also showed very little effect. The bulk of memory reclamation comes from the glibc MADV_DONTNEED'ing free chunks. I think it does that without any bookkeeping, so for subsequent trims, it has no idea of how much memory of that range was paged in. So I think there is no way to implement a limiting trim via the size parameter. I thought about that and I think the only way to implement a limited trim would be to add a new API, since you cannot use the existing one without breaking compatibility. I always meant to ask Florian about this. I will tomorrow. In any case, this would only be a solution for future glibcs, not for the ones that are around.
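For anyone who wants to repeat the `pad` experiment discussed above, a minimal glibc-only sketch could look like this. `spike_then_trim` is a hypothetical helper written for this illustration; the only library call beyond malloc/free is `malloc_trim(size_t pad)`, which returns 1 if memory was released back to the system and 0 otherwise.

```cpp
#include <malloc.h>   // glibc-specific: declares malloc_trim()
#include <cstddef>
#include <cstdlib>
#include <vector>

// Creates a temporary malloc spike (allocate many blocks, then free them all)
// and then asks glibc to trim, leaving `pad` bytes untrimmed at the top of
// the heap. Returns the result of malloc_trim(): 1 if memory was released,
// 0 if not.
inline int spike_then_trim(std::size_t blocks, std::size_t block_size, std::size_t pad) {
    std::vector<void*> ptrs;
    ptrs.reserve(blocks);
    for (std::size_t i = 0; i < blocks; ++i) {
        ptrs.push_back(std::malloc(block_size));  // may be nullptr on failure
    }
    for (void* p : ptrs) {
        std::free(p);                             // free(nullptr) is a no-op
    }
    return malloc_trim(pad);
}
```

Calling this with different `pad` values while watching RSS (for example in /proc/self/status) is one way to check how much influence the argument has on a given glibc. Note that the return value only says whether anything was released, not how much, so it is a weak signal for an adaptive scheme.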
------------- PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1614843984 PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1622178112 From simonis at openjdk.org Wed Jul 5 17:28:22 2023 From: simonis at openjdk.org (Volker Simonis) Date: Wed, 5 Jul 2023 17:28:22 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v10] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: On Thu, 2 Mar 2023 16:37:47 GMT, Thomas Stuefe wrote: >> (*Updated 2023-07-05 to reflect the current state of the patch*) >> >> This RFE adds the option to auto-trim the Glibc heap as part of the GC cycle. If the VM process suffered high temporary malloc spikes (regardless of whether from JVM- or user code), this could recover significant amounts of memory. >> >> We discussed this a year ago [1], but the item got pushed to the bottom of my work pile, therefore, it took longer than I thought. >> >> ### Motivation >> >> The Glibc is reluctant to return memory to the OS, more so than other allocators. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching, and a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap (the typical native application). The JVM, however, clusters allocations and for a lot of use cases rolls its own memory management via mmap. And app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. >> >> To help, Glibc exports an API to trim the C-heap: `malloc_trim(3)`. With JDK 18 [2], SAP contributed a new jcmd command to *manually* trim the C-heap on Linux. This RFE adds a complementary way to trim automatically. >> >> #### Is this even a problem? >> >> Yes. 
>> >> The JVM clusters most native memory allocations and satisfies them with mmap. But there are enough C-heap allocations left to cause malloc spikes that are subject of memory retention. Note that one example are hotspot arenas themselves. >> >> But many cases of high memory retention in Glibc I have seen in third-party JNI code. Libraries allocate large buffers via malloc as temporary buffers. In fact, since we introduced the jcmd "System.trim_native_heap", some of our customers started to call this command periodically in scripts to counter these issues. >> >> ### How trimming works >> >> Trimming is done via `malloc_trim(2)`. `malloc_trim` will iterate over all arenas and trim each one subsequently. While doing that, it will lock the arena, which may cause some (but not all) subsequent actions on the same arenas to block. glibc also trims automatically on free, but that is very limited (see https://github.com/openjdk/jdk/pull/10085#issuecomment-1619638641 for details). >> >> `malloc_trim` offers almost no way to control its behavior; in particular, no way to limit its runtime. Its run... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits: > > - wip > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - merge master > - wip > - wip > - rename GCTrimNative TrimNative > - rename NativeTrimmer > - rename > - src/hotspot/share/gc/shared/gcTrimNativeHeap.cpp > - ... and 24 more: https://git.openjdk.org/jdk/compare/99f5687e...5d41312e My main concern with this change is increased latency. You wrote "*..concurrent malloc/frees are usually not blocked while trimming if they are satisfied from the local arena..*". Not sure what "*usually*" means here and how many mallocs are satisfied from a local arena. But introducing pauses up to a second seems significant for some applications. 
The other question is that I still don't understand if glibc-malloc will ever call `malloc_trim()` automatically (and in that case introduce the latency anyway). The manpage says that `malloc_trim()` "*..is automatically called by free(3) in certain circumstances; see the discussion of `M_TOP_PAD` and `M_TRIM_THRESHOLD` in `mallopt(3)`..*" but you reported that you couldn't observe any cleanup effect when playing around with `M_TRIM_THRESHOLD`. In the end, calling `malloc_trim()` periodically might even help to decrease latency if this prevents seldom, but longer automatic invocations of `malloc_trim()` by glibc itself. ------------- PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1617840919 From rehn at openjdk.org Wed Jul 5 17:28:22 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 5 Jul 2023 17:28:22 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v10] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: <2ZifNHgzbaWYqQj2Q24S1gA-ufpuNzGSaVFN53V519o=.1494349b-266d-447d-97c1-8dcdc38fcd9b@github.com> On Thu, 2 Mar 2023 16:37:47 GMT, Thomas Stuefe wrote: >> (*Updated 2023-07-05 to reflect the current state of the patch*) >> >> This RFE adds the option to auto-trim the Glibc heap as part of the GC cycle. If the VM process suffered high temporary malloc spikes (regardless of whether from JVM- or user code), this could recover significant amounts of memory. >> >> We discussed this a year ago [1], but the item got pushed to the bottom of my work pile, therefore, it took longer than I thought. >> >> ### Motivation >> >> The Glibc is reluctant to return memory to the OS, more so than other allocators. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. 
>> >> This is, effectively, caching, and a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap (the typical native application). The JVM, however, clusters allocations and for a lot of use cases rolls its own memory management via mmap. And app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. >> >> To help, Glibc exports an API to trim the C-heap: `malloc_trim(3)`. With JDK 18 [2], SAP contributed a new jcmd command to *manually* trim the C-heap on Linux. This RFE adds a complementary way to trim automatically. >> >> #### Is this even a problem? >> >> Yes. >> >> The JVM clusters most native memory allocations and satisfies them with mmap. But there are enough C-heap allocations left to cause malloc spikes that are subject of memory retention. Note that one example are hotspot arenas themselves. >> >> But many cases of high memory retention in Glibc I have seen in third-party JNI code. Libraries allocate large buffers via malloc as temporary buffers. In fact, since we introduced the jcmd "System.trim_native_heap", some of our customers started to call this command periodically in scripts to counter these issues. >> >> ### How trimming works >> >> Trimming is done via `malloc_trim(2)`. `malloc_trim` will iterate over all arenas and trim each one subsequently. While doing that, it will lock the arena, which may cause some (but not all) subsequent actions on the same arenas to block. glibc also trims automatically on free, but that is very limited (see https://github.com/openjdk/jdk/pull/10085#issuecomment-1619638641 for details). >> >> `malloc_trim` offers almost no way to control its behavior; in particular, no way to limit its runtime. Its run... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 34 commits: > > - wip > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - merge master > - wip > - wip > - rename GCTrimNative TrimNative > - rename NativeTrimmer > - rename > - src/hotspot/share/gc/shared/gcTrimNativeHeap.cpp > - ... and 24 more: https://git.openjdk.org/jdk/compare/99f5687e...5d41312e > My main concern with this change is increased latency. You wrote "_..concurrent malloc/frees are usually not blocked while trimming if they are satisfied from the local arena.._". Not sure what "_usually_" means here and how many mallocs are satisfied from a local arena. But introducing pauses up to a second seems significant for some applications. > > The other question is that I still don't understand if glibc-malloc will ever call `malloc_trim()` automatically (and in that case introduce the latency anyway). The manpage says that `malloc_trim()` "_..is automatically called by free(3) in certain circumstances; see the discussion of `M_TOP_PAD` and `M_TRIM_THRESHOLD` in `mallopt(3)`.._" but you reported that you couldn't observe any cleanup effect when playing around with `M_TRIM_THRESHOLD`. In the end, calling `malloc_trim()` periodically might even help to decrease latency if this prevents seldom, but longer automatic invocations of `malloc_trim()` by glibc itself. The trim performed automatically on some free() is one done in the 'chunk' you were freeing in. While the explicit call visits all 'chunks'. @jdksjolen can explain this more deeply. I share your concern. But as this is a opt-in and the benefits for a certain set of workloads overwhelms the risk of latency increases I'm for this change. 
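As a side note, the two knobs the man page mentions for the automatic trim on free can be set with `mallopt(3)`. A minimal sketch, assuming glibc; the helper name `configure_auto_trim` is made up for illustration and is not part of the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#ifdef __GLIBC__
#include <malloc.h>   // glibc-specific: mallopt, M_TRIM_THRESHOLD, M_TOP_PAD
#endif

// Hypothetical helper: configure the automatic trimming that free(3) may do
// at the top of the heap. mallopt returns 1 on success and 0 on error.
// Returns true only if glibc accepted both settings.
bool configure_auto_trim(size_t trim_threshold, size_t top_pad) {
  assert(trim_threshold <= 0x7fffffff && top_pad <= 0x7fffffff);
#ifdef __GLIBC__
  int ok1 = mallopt(M_TRIM_THRESHOLD, static_cast<int>(trim_threshold));
  int ok2 = mallopt(M_TOP_PAD, static_cast<int>(top_pad));
  return ok1 == 1 && ok2 == 1;
#else
  return false;  // not glibc: no such tunables
#endif
}
```

This only influences the limited trim done on free. It does not make glibc walk all arenas the way an explicit `malloc_trim` call does.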
------------- PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1618932039 From stuefe at openjdk.org Wed Jul 5 17:28:22 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 17:28:22 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v10] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: On Mon, 3 Jul 2023 10:25:49 GMT, Volker Simonis wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits: >> >> - wip >> - Merge branch 'master' into JDK-8293114-GC-trim-native >> - wip >> - merge master >> - wip >> - wip >> - rename GCTrimNative TrimNative >> - rename NativeTrimmer >> - rename >> - src/hotspot/share/gc/shared/gcTrimNativeHeap.cpp >> - ... and 24 more: https://git.openjdk.org/jdk/compare/99f5687e...5d41312e > > My main concern with this change is increased latency. You wrote "*..concurrent malloc/frees are usually not blocked while trimming if they are satisfied from the local arena..*". Not sure what "*usually*" means here and how many mallocs are satisfied from a local arena. But introducing pauses up to a second seems significant for some applications. > > The other question is that I still don't understand if glibc-malloc will ever call `malloc_trim()` automatically (and in that case introduce the latency anyway). The manpage says that `malloc_trim()` "*..is automatically called by free(3) in certain circumstances; see the discussion of `M_TOP_PAD` and `M_TRIM_THRESHOLD` in `mallopt(3)`..*" but you reported that you couldn't observe any cleanup effect when playing around with `M_TRIM_THRESHOLD`. In the end, calling `malloc_trim()` periodically might even help to decrease latency if this prevents seldom, but longer automatic invocations of `malloc_trim()` by glibc itself. @simonis @robehn Thanks for thinking this through. 
>My main concern with this change is increased latency. You wrote "..concurrent malloc/frees are usually not blocked while trimming if they are satisfied from the local arena..". Not sure what "usually" means here and how many mallocs are satisfied from a local arena. But introducing pauses up to a second seems significant for some applications.

From looking at the sources (glibc 2.31), I see `malloc_trim` iterates over all arenas and locks each arena while trimming it. I also see this lock getting locked when:
- creating, re-assigning arenas
- on statistics (`malloc_stats`, `mallinfo2`, `malloc_info`)
- on `arena_get_retry()` which AFAICS seems to be "stealing" from a neighboring arena if my arena is full
- on the free path, when adding chunks to the fast bin, but guarded by `__builtin_expect(.., 0)`, so probably a very rare path
- on realloc'ing non-mmapped chunks.

So, `malloc_trim` will inconvenience concurrent reallocs, and rarely frees, or allocations that cause arena stealing or allocating new arenas. I may have missed some cases, but it makes sense that glibc attempts to avoid locking as much as possible.

About the "up to a second" - this was measured on my machine with ~32GB of reclaimable memory. Having that much floating garbage in the C-heap would hopefully be rare.

>The other question is that I still don't understand if glibc-malloc will ever call malloc_trim() automatically (and in that case introduce the latency anyway). The manpage says that malloc_trim() "..is automatically called by free(3) in certain circumstances; see the discussion of M_TOP_PAD and M_TRIM_THRESHOLD in mallopt(3).." but you reported that you couldn't observe any cleanup effect when playing around with M_TRIM_THRESHOLD. In the end, calling malloc_trim() periodically might even help to decrease latency if this prevents seldom, but longer automatic invocations of malloc_trim() by glibc itself.

From looking at the sources, the glibc trims on free:
- the returned chunk may go into the thread local cache (tcache) or into the fastbin. In both cases, the chunk still counts as used. Nothing happens.
- Otherwise, the returned chunk gets merged with its immediate neighbors (once, so not recursively). If the resulting size is larger than 64K, glibc calls trim, but only for the arena the chunk is contained in.

As you can see, trim only happens sometimes. I did experiments with mallocing, then freeing, 64K 30000 times:
1) done from 10 threads will leave me with a remainder of 300-500MB unreclaimed RSS after all is freed.
2) done from 1 thread, or setting MALLOC_ARENA_MAX=1, ends up reclaiming most of the memory.

Unfortunately, most of C-Heap allocations are a lot finer grained than 64K.

*Update* I see we also lock on the malloc path if we don't pull the chunk from the tcache...

------------- PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1619638641 From stuefe at openjdk.org Wed Jul 5 17:42:05 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 17:42:05 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v10] In-Reply-To: <2ZifNHgzbaWYqQj2Q24S1gA-ufpuNzGSaVFN53V519o=.1494349b-266d-447d-97c1-8dcdc38fcd9b@github.com> References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> <2ZifNHgzbaWYqQj2Q24S1gA-ufpuNzGSaVFN53V519o=.1494349b-266d-447d-97c1-8dcdc38fcd9b@github.com> Message-ID: <-IgGOhdyw4eHbB3jrPrayyg-GVdhBXXt8pX8Lb6lMO8=.070b3202-e74d-42a8-802e-88c996ee8fc7@github.com> On Mon, 3 Jul 2023 17:39:59 GMT, Robbin Ehn wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 34 commits: >> >> - wip >> - Merge branch 'master' into JDK-8293114-GC-trim-native >> - wip >> - merge master >> - wip >> - wip >> - rename GCTrimNative TrimNative >> - rename NativeTrimmer >> - rename >> - src/hotspot/share/gc/shared/gcTrimNativeHeap.cpp >> - ... and 24 more: https://git.openjdk.org/jdk/compare/99f5687e...5d41312e > >> My main concern with this change is increased latency. You wrote "_..concurrent malloc/frees are usually not blocked while trimming if they are satisfied from the local arena.._". Not sure what "_usually_" means here and how many mallocs are satisfied from a local arena. But introducing pauses up to a second seems significant for some applications. >> >> The other question is that I still don't understand if glibc-malloc will ever call `malloc_trim()` automatically (and in that case introduce the latency anyway). The manpage says that `malloc_trim()` "_..is automatically called by free(3) in certain circumstances; see the discussion of `M_TOP_PAD` and `M_TRIM_THRESHOLD` in `mallopt(3)`.._" but you reported that you couldn't observe any cleanup effect when playing around with `M_TRIM_THRESHOLD`. In the end, calling `malloc_trim()` periodically might even help to decrease latency if this prevents seldom, but longer automatic invocations of `malloc_trim()` by glibc itself. > > The trim performed automatically on some free() is one done in the 'chunk' you were freeing in. > While the explicit call visits all 'chunks'. @jdksjolen can explain this more deeply. > > I share your concern. > But as this is a opt-in and the benefits for a certain set of workloads overwhelms the risk of latency increases I'm for this change. @robehn @zhengyu123 @shipilev @simonis Thank you all for your support and input. I dusted off the patch and simplified it: - removed the adaptive step-down logic, since that was overly involved and in my tests did not work that well - removed the expedite-trim logic. - Pauses now stack. 
So, in very few words, this patch
- adds an optional thread to execute trims at periodic intervals
- can be temporarily paused.
- I guarded sections that are vulnerable against concurrent work (GC STW phases) or that are doing bulk C-heap operations (e.g. monitor bulk deletion, stringtable cleanups, arena cleanups etc) with pauses.
- I'll do some more benchmarks over the next days, but honestly don't expect to see this rising above background noise. If I have time, I also will simulate heavy C-Heap activity to give the trim something to do.

------------- PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1622198358 From duke at openjdk.org Wed Jul 5 17:50:01 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 5 Jul 2023 17:50:01 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids [v9] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 17:15:18 GMT, Thomas Stuefe wrote: >> I recently spent too much time trying to understand the interleaving of narrow Klass decoding with CDS. Thanks to @iklam for clarifying some details. This patch aims to make the code easier to understand and more correct. >> >> (*2023-07-05 Updated to fit patch description to the agreed final form*) >> >> ---- >> >> CDS narrow Klass handling plays a role for archived heap objects. >> >> When dumping heap objects, we must recompute their narrow Klass ids since the relative positions of archived Klass instances change compared to their live counterparts in the dump time JVM. >> We recompute those narrow Klass ids using the following encoding scheme: >> - base = future assumed mapping start address >> - shift = dump time (!)
JVMs encoding shift (A) >> see `ArchiveHeapWriter::update_header_for_requested_obj` https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/cds/archiveHeapWriter.cpp#L419-L425 >> >> At CDS runtime, we load the CDS archive, then place the class space behind it. We then initialize narrow Klass encoding for the resulting combined Klass range such that: >> - encoding base is the range start address (mapping base) >> - encoding shift is always zero >> see `CompressedKlassPointers::initialize` : https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/oops/compressedOops.cpp#L195-L231 >> >> The lengthy comment there is misleading and partly wrong (regrettable since it was mine :) >> >> At dump time, when initializing the JVM, we also set the encoding base to klass range start and shift to zero (also `CompressedKlassPointers::initialize`). That is the shift we later use for (A); hence, that shift is zero. >> >> ------------------- >> >> There are minor things wrong with the current code. Mainly, the *dump time* VM's narrow Klass encoding scheme is irrelevant for the encoding employed on the future runtime archive since we recalculate the narrow Klass ids for archived heap objects. That means: >> >> - In `CompressedKlassPointers::initialize`, there is no need to fix the encoding base and shift for the *dump time* JVM. Dump time JVM can use whatever base+shift it pleases; it can use the same code path as when CDS is off (e.g. use zero-based encoding). >> >> - In `ArchiveHeapWriter::update_header_for_requested_obj`, we should not use the dump time JVM shift for pre-computing the narrow Klass ids. Instead, we should use the *minimal shift needed for the maximal possible future Klass rang... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Feedback Ashu Marked as reviewed by ashu-mehra at github.com (no known OpenJDK username). 
Thanks for the changes. lgtm! ------------- PR Review: https://git.openjdk.org/jdk/pull/14688#pullrequestreview-1515041679 PR Comment: https://git.openjdk.org/jdk/pull/14688#issuecomment-1622211868 From matsaave at openjdk.org Wed Jul 5 17:52:44 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 5 Jul 2023 17:52:44 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v4] In-Reply-To: References: Message-ID: > 8301996: Move field resolution information out of the cpCache Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Register naming improved ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14129/files - new: https://git.openjdk.org/jdk/pull/14129/files/2ef87d42..e59773eb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=02-03 Stats: 21 lines in 1 file changed: 2 ins; 4 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/14129.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14129/head:pull/14129 PR: https://git.openjdk.org/jdk/pull/14129 From matsaave at openjdk.org Wed Jul 5 17:52:48 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 5 Jul 2023 17:52:48 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 09:30:08 GMT, Martin Doerr wrote: >> Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: >> >> Coleen and Amit comments > > src/hotspot/cpu/aarch64/templateTable_aarch64.cpp line 2538: > >> 2536: load_resolved_field_entry(obj, cache, index, off, raw_flags, is_static); >> 2537: // Index holds the TOS >> 2538: __ mov(flags, index); > > This is very confusing. You call the same thing "TOS", "type" and "index". Please use consistent naming and make it more comprehensive. 
In addition, why do you need the extra move? Great point, this is very unclear. I will use better names which should remove the need for any moves. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1253440770 From rkennke at openjdk.org Wed Jul 5 18:01:25 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 5 Jul 2023 18:01:25 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v42] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. 
> > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Merge remote-tracking branch 'origin/JDK-8139457' into JDK-8139457 - Revert more non-essential changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/482c794d..6ea90df0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=41 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=40-41 Stats: 61 lines in 5 files changed: 33 ins; 14 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From fparain at openjdk.org Wed Jul 5 18:24:11 2023 From: fparain at openjdk.org (Frederic Parain) Date: Wed, 5 Jul 2023 18:24:11 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v2] In-Reply-To: References: Message-ID: On Thu, 29 Jun 2023 15:44:27 GMT, Matias Saavedra Silva wrote: >> 8301996: Move field resolution information out of the cpCache > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Coleen and Amit comments Thank you for tackling this big refactoring work of the cpCache. I've only reviewed the x86 and the shared part of this PR. There're several issues to be addressed before this work can move forward. Feel free to contact me if you have questions about my comments. src/hotspot/cpu/aarch64/templateTable_aarch64.cpp line 2686: > 2684: > 2685: Label notVolatile; > 2686: __ tbz(raw_flags, 0, notVolatile); Hard coded values should be avoided. 
The volatile bit shift should be declared as a constant in ResolvedFieldEntry and used here. src/hotspot/cpu/x86/templateTable_x86.cpp line 2943: > 2941: load_resolved_field_entry(obj, cache, index, off, flags, is_static); > 2942: // Index holds the TOS > 2943: __ mov(flags, index); Names are confusing. src/hotspot/cpu/x86/templateTable_x86.cpp line 2954: > 2952: // Make sure we don't need to mask edx after the above shift > 2953: assert(btos == 0, "change code, btos != 0"); > 2954: // Compare the method to zero Wrong comment. src/hotspot/cpu/x86/templateTable_x86.cpp line 3187: > 3185: > 3186: // Swap registers so flags has TOS state > 3187: __ xchgl(index, flags); I don't get this swap operation. Line 3178, index (containing the TosState) has been copied to flags (rax). The real flags for this field have been erased, and the volatile test below is not applied to the real flags but to the ToSState value. src/hotspot/cpu/x86/templateTable_x86.cpp line 3188: > 3186: // Swap registers so flags has TOS state > 3187: __ xchgl(index, flags); > 3188: __ andl(index, 0x1); An assert is required here to ensure that the volatile flag is at the least significant bit position. src/hotspot/cpu/x86/templateTable_x86.cpp line 3452: > 3450: load_resolved_field_entry(noreg, cache, rax, rbx, rdx); > 3451: // RBX: field offset, RCX: RAX: TOS, RDX: flags > 3452: __ andl(rdx, 0x1); //is_volatile An assert is required here to ensure that the volatile flag is at the least significant bit position. src/hotspot/share/interpreter/bytecodeTracer.cpp line 328: > 326: ConstantPool* constants = _current_method->constants(); > 327: if (constants->cache() == nullptr) { > 328: cp_index = i; // TODO: This is wrong on little-endian. See JDK-8309811. Fix the code or the comment. 
src/hotspot/share/interpreter/bytecodeTracer.cpp line 627: > 625: case Bytecodes::_putfield: > 626: case Bytecodes::_getfield: > 627: // TODO: get_index_u2() does not work here due to using java_u2 instead of native_u2 Is the TODO comment still relevant? src/hotspot/share/interpreter/interpreterRuntime.cpp line 719: > 717: ResolvedFieldEntry* entry = pool->resolved_field_entry_at(field_index); > 718: entry->fill_in(info.field_holder(), info.offset(), (u2)info.index(), (u1)state, (u1)get_code, (u1)put_code); > 719: entry->set_flags(info.access_flags().is_final(), info.access_flags().is_volatile()); I think there's an issue here: fill_in() will mark the field entry as resolved (by setting get_code or put_code), and by consequence usable by other threads *before* the flags are initialized. This means another thread could see the entry as being resolved but still get the uninitialized flags. src/hotspot/share/interpreter/interpreterRuntime.cpp line 1178: > 1176: h_obj = Handle(current, obj); > 1177: } > 1178: InstanceKlass* entry_f1 = entry->field_holder(); This variable could have a more explicit name (ik, klass or field_holder) src/hotspot/share/interpreter/rewriter.cpp line 54: > 52: switch (tag) { > 53: case JVM_CONSTANT_InterfaceMethodref: > 54: add_cp_cache_entry(i); InterfaceMethodref and Methodref cases could be grouped together. src/hotspot/share/oops/cpCache.cpp line 746: > 744: MetadataFactory::free_array(data, _resolved_indy_entries); > 745: if (_resolved_field_entries) > 746: MetadataFactory::free_array(data, _resolved_field_entries); Both _resolved_indy_entries and _resolved_field_entries should be set to nullptr. 
src/hotspot/share/oops/resolvedFieldEntry.hpp line 73: > 71: // Note: Only two flags exists at the moment but more could be added > 72: enum { > 73: is_final_shift = 1, // unused It would be more readable to declare all fields here: enum { is_volatile_shift = 0, is_final_shift = 1, }; And using those shift values consistently in setters/getters, even if the current value is zero, would prevent bugs if those flags are re-ordered later. src/hotspot/share/oops/resolvedFieldEntry.hpp line 95: > 93: return (put_code() == code); > 94: default: > 95: assert(code == Bytecodes::_nop, "Must be get, put, or nop bytecode"); In which cases is the is_resolved() method called with the _nop bytecode as argument? ------------- Changes requested by fparain (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14129#pullrequestreview-1507211765 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1247960640 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1253436948 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1253436652 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1253464804 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1253459999 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1253467116 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1253270694 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1247917510 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1248168064 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1248198237 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1248202649 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1253292341 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1247874434 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1247894933 From shade at
openjdk.org Wed Jul 5 18:45:26 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 5 Jul 2023 18:45:26 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v2] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: On Thu, 1 Sep 2022 06:47:27 GMT, Thomas Stuefe wrote: >> (*Updated 2023-07-05 to reflect the current state of the patch*) >> >> This RFE adds the option to auto-trim the Glibc heap as part of the GC cycle. If the VM process suffered high temporary malloc spikes (regardless of whether from JVM- or user code), this could recover significant amounts of memory. >> >> We discussed this a year ago [1], but the item got pushed to the bottom of my work pile, therefore, it took longer than I thought. >> >> ### Motivation >> >> The Glibc is reluctant to return memory to the OS, more so than other allocators. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching, and a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap (the typical native application). The JVM, however, clusters allocations and for a lot of use cases rolls its own memory management via mmap. And app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. >> >> To help, Glibc exports an API to trim the C-heap: `malloc_trim(3)`. With JDK 18 [2], SAP contributed a new jcmd command to *manually* trim the C-heap on Linux. This RFE adds a complementary way to trim automatically. >> >> #### Is this even a problem? >> >> Yes. >> >> The JVM clusters most native memory allocations and satisfies them with mmap. But there are enough C-heap allocations left to cause malloc spikes that are subject of memory retention. 
Note that one example are hotspot arenas themselves. >> >> But many cases of high memory retention in Glibc I have seen in third-party JNI code. Libraries allocate large buffers via malloc as temporary buffers. In fact, since we introduced the jcmd "System.trim_native_heap", some of our customers started to call this command periodically in scripts to counter these issues. >> >> ### How trimming works >> >> Trimming is done via `malloc_trim(2)`. `malloc_trim` will iterate over all arenas and trim each one subsequently. While doing that, it will lock the arena, which may cause some (but not all) subsequent actions on the same arenas to block. glibc also trims automatically on free, but that is very limited (see https://github.com/openjdk/jdk/pull/10085#issuecomment-1619638641 for details). >> >> `malloc_trim` offers almost no way to control its behavior; in particular, no way to limit its runtime. Its run... > > Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: > > - reduce test runtime on slow hardware > - make tests more stable on slow hardware src/hotspot/os/aix/os_aix.cpp line 2990: > 2988: void os::print_memory_mappings(char* addr, size_t bytes, outputStream* st) {} > 2989: > 2990: // stubbed-out trim-native support I think these comments should be more succinct. Example: "Native heap trimming is not implemented yet." (this tells it can be implemented in future) src/hotspot/os/linux/os_linux.cpp line 190: > 188: int keepcost; > 189: }; > 190: typedef struct glibc_mallinfo (*mallinfo_func_t)(void); Should be `os::Linux::glibc_mallinfo` for consistency? src/hotspot/os/linux/os_linux.cpp line 5440: > 5438: out->uordblks = (int) mi.uordblks; > 5439: out->fordblks = (int) mi.fordblks; > 5440: out->keepcost = (int) mi.keepcost; Style: please indent it so that `=` are in the same column? 
src/hotspot/os/linux/os_linux.cpp line 5452: > 5450: return true; > 5451: #else > 5452: return false; // musl Let's avoid comments like `// musl` -- it might mislead if we ever go for e.g. `uClibc` and friends? src/hotspot/os/linux/trimCHeapDCmd.cpp line 36: > 34: > 35: void TrimCLibcHeapDCmd::execute(DCmdSource source, TRAPS) { > 36: if (os::can_trim_native_heap()) { So, this dcmd command basically trims "asynchronously", in `GCTrimNative`-speak. This seems to be a bit at odds with `GCTrimNative` configured in "sync mode". Should it just notify the trimming thread in async mode, and leave it to GC in sync mode? src/hotspot/share/gc/shared/gcTrimNativeHeap.cpp line 52: > 50: // The mode is set as argument to GCTrimNative::initialize(). > 51: > 52: class NativeTrimmer : public ConcurrentGCThread { It is a bit ugly it pretends to be `ConcurrentGCThread`. Can it be just `NamedThread`? src/hotspot/share/gc/shared/gcTrimNativeHeap.cpp line 67: > 65: > 66: // Note: GCTrimNativeHeapInterval=0 -> zero wait time -> indefinite waits, disabling periodic trim > 67: const int64_t delay_ms = GCTrimNativeHeapInterval * 1000; I wonder if `run_service` should just exit if no periodic trims are configured, instead of wasting a thread here. 
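
For readers following the `os::can_trim_native_heap()` and `// musl` comments above, here is a minimal sketch of a glibc-guarded trim. It is an illustration only, not the code under review: `can_trim_native_heap` and `trim_native_heap` are hypothetical free functions standing in for the `os::` entry points, and the only real API used is glibc's `malloc_trim` from `<malloc.h>`. The `#else` branches deliberately say "not implemented" rather than naming one particular C library.

```cpp
#include <cassert>
#include <cstdlib>   // on glibc systems this also makes __GLIBC__ visible
#if defined(__GLIBC__)
#include <malloc.h>  // malloc_trim
#endif

// Hypothetical stand-in for os::can_trim_native_heap(); not HotSpot code.
inline bool can_trim_native_heap() {
#if defined(__GLIBC__)
  return true;
#else
  return false;  // native heap trimming is not implemented for this C library
#endif
}

// Attempts a trim. Returns false if trimming is unsupported here.
// When supported, *released reports whether glibc says it actually
// returned memory to the OS (malloc_trim returns 1 in that case, else 0).
inline bool trim_native_heap(bool* released) {
#if defined(__GLIBC__)
  *released = (malloc_trim(0) == 1);  // pad = 0: keep no extra slack at the top
  return true;
#else
  *released = false;
  return false;
#endif
}
```

Note that a return value of 0 from `malloc_trim` is not an error; it only means nothing could be released, which is why the sketch reports it separately from "supported".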
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r980059530 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r980062314 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r980064163 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r980066226 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r980101247 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r980094065 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r980095999 From shade at openjdk.org Wed Jul 5 18:45:18 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 5 Jul 2023 18:45:18 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v11] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: <7q-A34HZnue-oo56RV6hd6hVV3SNHlLL7wx-GQ6scm8=.20e79884-9add-4954-ac4a-1ea825c9b458@github.com> On Wed, 5 Jul 2023 17:28:15 GMT, Thomas Stuefe wrote: >> (*Updated 2023-07-05 to reflect the current state of the patch*) >> >> This RFE adds the option to auto-trim the Glibc heap as part of the GC cycle. If the VM process suffered high temporary malloc spikes (regardless of whether from JVM- or user code), this could recover significant amounts of memory. >> >> We discussed this a year ago [1], but the item got pushed to the bottom of my work pile, therefore, it took longer than I thought. >> >> ### Motivation >> >> The Glibc is reluctant to return memory to the OS, more so than other allocators. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching, and a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap (the typical native application). 
The JVM, however, clusters allocations and for a lot of use cases rolls its own memory management via mmap. And app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. >> >> To help, Glibc exports an API to trim the C-heap: `malloc_trim(3)`. With JDK 18 [2], SAP contributed a new jcmd command to *manually* trim the C-heap on Linux. This RFE adds a complementary way to trim automatically. >> >> #### Is this even a problem? >> >> Yes. >> >> The JVM clusters most native memory allocations and satisfies them with mmap. But there are enough C-heap allocations left to cause malloc spikes that are subject of memory retention. Note that one example are hotspot arenas themselves. >> >> But many cases of high memory retention in Glibc I have seen in third-party JNI code. Libraries allocate large buffers via malloc as temporary buffers. In fact, since we introduced the jcmd "System.trim_native_heap", some of our customers started to call this command periodically in scripts to counter these issues. >> >> ### How trimming works >> >> Trimming is done via `malloc_trim(2)`. `malloc_trim` will iterate over all arenas and trim each one subsequently. While doing that, it will lock the arena, which may cause some (but not all) subsequent actions on the same arenas to block. glibc also trims automatically on free, but that is very limited (see https://github.com/openjdk/jdk/pull/10085#issuecomment-1619638641 for details). >> >> `malloc_trim` offers almost no way to control its behavior; in particular, no way to limit its runtime. Its run... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 41 commits: > > - fix windows build > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - Remove adaptive stepdown coding > - Merge master > - wip > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - ... and 31 more: https://git.openjdk.org/jdk/compare/22e17c29...162b880a Cursory review follows. Generally, does the issue synopsis reflect what is going on correctly? This is not about "GC should trim" anymore, as we have a perfectly separate thread for this. In fact, we very specifically _do not_ trim during GC :) Related question if we want to use `gc, trim` tag for this, or just `trim`. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 918: > 916: } > 917: > 918: TrimNative::PauseMark trim_native_pause("gc"); Maybe instead of whack-a-mole game of putting the pause in most GC safepoint ops, we should "just" put this mark straight into `VMOperation`, so that _all_ safepoint ops are covered? Or, if we want to limit to GC ops, `VM_GC_Operation`? src/hotspot/share/gc/g1/g1FullCollector.cpp line 181: > 179: > 180: void G1FullCollector::prepare_collection() { > 181: Redundant. src/hotspot/share/gc/shared/gc_globals.hpp line 696: > 694: \ > 695: product(bool, TrimNativeHeap, false, EXPERIMENTAL, \ > 696: "GC will attempt to trim the native heap periodically. By " \ This is not trimmed by GC anymore, right? By a separate thread now? I would also avoid mentioning the default here, just refer to TrimNativeHeapInterval. Saves the headache when default actually changes. src/hotspot/share/gc/shared/gc_globals.hpp line 700: > 698: "changed using TrimNativeHeapInterval.") \ > 699: \ > 700: product(uint, TrimNativeHeapInterval, 60, EXPERIMENTAL, \ Most intervals like these are in milliseconds in Hotspot. Actually, I think it would be useful to have the sub-second interval for some short-lived workloads. 
src/hotspot/share/gc/shared/trimNative.cpp line 40:

> 38: class NativeTrimmerThread : public ConcurrentGCThread {
> 39:
> 40: Monitor* _lock;

`const`?

src/hotspot/share/gc/shared/trimNative.cpp line 104:

> 102: void execute_trim_and_log() const {
> 103: assert(os::can_trim_native_heap(), "Unexpected");
> 104: const int64_t tnow = now();

This line looks redundant.

src/hotspot/share/gc/shared/trimNative.hpp line 46:

> 44: // Pause periodic trimming while in scope; when leaving scope,
> 45: // resume periodic trimming.
> 46: struct PauseMark {

Maybe move it out? Looks a bit weird to see `TrimNative::PauseMark` instead of e.g. `TrimNativePauseMark`. Also "pause" might be confusing with GC. Maybe `TrimNativeSuspend` or even `NativeTrimSuspend`?

src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp line 288:

> 286: heap->pacer()->setup_for_idle();
> 287: }
> 288:

Redundant.

src/hotspot/share/gc/shenandoah/shenandoahPhaseTimings.hpp line 177:

> 175: \
> 176: f(conc_uncommit, "Concurrent Uncommit") \
> 177: f(conc_trim, "Concurrent Trim") \

Looks like a leftover.

src/hotspot/share/memory/arena.cpp line 101:

> 99: STATIC_ASSERT(_num_pools == 4);
> 100: return !_pools[0].empty() || !_pools[1].empty() ||
> 101: !_pools[2].empty() || !_pools[3].empty();

Suggestion:

    for (int i = 0; i < _num_pools; i++) {
      if (!_pools[i].empty()) return true;
    }
    return false;

src/hotspot/share/memory/arena.cpp line 105:

> 103:
> 104: static void clean() {
> 105: ThreadCritical tc;

Why do you need `ThreadCritical` here? I would have thought `PauseMark` handles everything, right?
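
To make the `PauseMark` naming discussion above concrete, here is a minimal sketch of a scoped suspend mark: the constructor suspends periodic trimming and the destructor resumes it, with a counter so that nested marks behave sensibly. Everything in it is hypothetical (`TrimSuspendState`, `NativeTrimSuspendMark`, and the use of `std::mutex` instead of a HotSpot `Monitor`); it shows the RAII pattern only, not the patch's actual implementation.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the trimmer's suspend bookkeeping; not HotSpot code.
class TrimSuspendState {
  std::mutex _lock;
  int _suspend_count = 0;  // > 0 means periodic trimming must not run

public:
  void suspend() {
    std::lock_guard<std::mutex> guard(_lock);
    ++_suspend_count;
  }

  void resume() {
    std::lock_guard<std::mutex> guard(_lock);
    assert(_suspend_count > 0 && "resume without matching suspend");
    --_suspend_count;
  }

  bool is_suspended() {
    std::lock_guard<std::mutex> guard(_lock);
    return _suspend_count > 0;
  }
};

// Suspends trimming while in scope; resumes when leaving scope.
class NativeTrimSuspendMark {
  TrimSuspendState& _state;

public:
  explicit NativeTrimSuspendMark(TrimSuspendState& state) : _state(state) {
    _state.suspend();
  }
  ~NativeTrimSuspendMark() { _state.resume(); }

  // A mark is tied to its scope, so copying it would unbalance the counter.
  NativeTrimSuspendMark(const NativeTrimSuspendMark&) = delete;
  NativeTrimSuspendMark& operator=(const NativeTrimSuspendMark&) = delete;
};
```

Placing one such mark in a common base operation, as suggested for `VM_Operation` or `VM_GC_Operation`, would then cover every derived operation without repeating the mark at each call site.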
------------- PR Review: https://git.openjdk.org/jdk/pull/10085#pullrequestreview-1120265106 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r1253439204 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r1253435572 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r1253445627 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r1253447207 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r1253483365 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r1253490281 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r1253482841 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r1253471989 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r1253486311 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r1253486975 PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r1253488630 From shade at openjdk.org Wed Jul 5 18:48:07 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 5 Jul 2023 18:48:07 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v11] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: On Wed, 5 Jul 2023 17:28:15 GMT, Thomas Stuefe wrote: >> (*Updated 2023-07-05 to reflect the current state of the patch*) >> >> This RFE adds the option to auto-trim the Glibc heap as part of the GC cycle. If the VM process suffered high temporary malloc spikes (regardless of whether from JVM- or user code), this could recover significant amounts of memory. >> >> We discussed this a year ago [1], but the item got pushed to the bottom of my work pile, therefore, it took longer than I thought. >> >> ### Motivation >> >> The Glibc is reluctant to return memory to the OS, more so than other allocators. Temporary malloc spikes often carry over as permanent RSS increase. 
Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching, and a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap (the typical native application). The JVM, however, clusters allocations and for a lot of use cases rolls its own memory management via mmap. And app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. >> >> To help, Glibc exports an API to trim the C-heap: `malloc_trim(3)`. With JDK 18 [2], SAP contributed a new jcmd command to *manually* trim the C-heap on Linux. This RFE adds a complementary way to trim automatically. >> >> #### Is this even a problem? >> >> Yes. >> >> The JVM clusters most native memory allocations and satisfies them with mmap. But there are enough C-heap allocations left to cause malloc spikes that are subject of memory retention. Note that one example are hotspot arenas themselves. >> >> But many cases of high memory retention in Glibc I have seen in third-party JNI code. Libraries allocate large buffers via malloc as temporary buffers. In fact, since we introduced the jcmd "System.trim_native_heap", some of our customers started to call this command periodically in scripts to counter these issues. >> >> ### How trimming works >> >> Trimming is done via `malloc_trim(2)`. `malloc_trim` will iterate over all arenas and trim each one subsequently. While doing that, it will lock the arena, which may cause some (but not all) subsequent actions on the same arenas to block. glibc also trims automatically on free, but that is very limited (see https://github.com/openjdk/jdk/pull/10085#issuecomment-1619638641 for details). >> >> `malloc_trim` offers almost no way to control its behavior; in particular, no way to limit its runtime. Its run... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 41 commits: > > - fix windows build > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - Remove adaptive stepdown coding > - Merge master > - wip > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - ... and 31 more: https://git.openjdk.org/jdk/compare/22e17c29...162b880a Maybe it would be cleaner to close this PR and open a new one, now that feature took another turn. I just realized my last review included comments I did not commit as review since last year! Oops. ------------- PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1622287665 From sspitsyn at openjdk.org Wed Jul 5 19:12:57 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 5 Jul 2023 19:12:57 GMT Subject: RFR: 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite [v3] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 15:35:16 GMT, Patricio Chilano Mateo wrote: >> Please review the following fix. Runtime methods called through the SharedRuntime::generate_resolve_blob() stub always return the value stored in _from_compiled_entry as the entry point to the callee method. This will either be the entry point to the compiled version of callee if there is one or the c2i adapter entry point. But this doesn't consider the case where an EnterInterpOnlyModeClosure handshake catches the JavaThread in the transition back to Java on those methods. In that case we should return the c2i adapter entry point even if there is a compiled entry point. 
Otherwise the JavaThread will continue calling the compiled versions of methods without noticing it's in interpreted only mode until it either calls a method that hasn't been compiled yet or it returns to the caller of that resolved callee where the change to interpreter only mode happened (since the EnterInterpOnlyModeClosure handshake marked all the frames on the stack for deoptimization). >> >> This is a long standing bug but has been made visible with the assert added as part of 8288949 where a related issue was fixed. There are more details in the bug comments about how this specific crash happens and its relation with 8288949. I also attached a reproducer. >> >> These runtime methods are already using JRT_BLOCK_ENTRY/JRT_BLOCK so that the entry point to the callee is fetched only after the last possible safepoint in JRT_BLOCK_END. This guarantees that we will not return an entry point to compiled code that has already been removed. So the fix just adds a check to verify if the JavaThread entered into interpreted only mode in that transition back to Java and to return the c2i entry point instead. >> >> I tested the patch in mach5 tiers 1-6. I also verified it with the reproducer I attached to the bug. I didn't include it as an additional test but I can do that if desired. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > revert to just remove assert added in 8288949 Just removing the asserts looks okay to me. Also, it will be safe to backport to 21. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14108#pullrequestreview-1515166946

From mchung at openjdk.org Wed Jul 5 19:16:53 2023
From: mchung at openjdk.org (Mandy Chung)
Date: Wed, 5 Jul 2023 19:16:53 GMT
Subject: RFR: 8311500: StackWalker.getCallerClass() can throw if invoked reflectively [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 5 Jul 2023 17:25:24 GMT, Volker Simonis wrote:

>> As the included jtreg test demonstrates, `StackWalker.getCallerClass()` can throw an `UnsupportedOperationException` if called reflectively. Currently this only happens if we invoke `StackWalker.getCallerClass()` recursively reflectively, but this issue will become more prominent once we fix [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447). The gory details follow below:
>>
>> The protocol between the Java API and the JVM for `StackWalker.getCallerClass()/walk()` is as follows:
>> - On the Java side, `StackWalker` calls into `StackStreamFactory` for the real work.
>> - For `StackWalker.getCallerClass()` `StackStreamFactory` basically creates a `Class[]` which will be passed down and filled in the JVM. For `StackWalker.walk()` it will normally be a `StackFrameInfo[]` (or a `LiveStackFrameInfo[]` if the internal `ExtendedOption.LOCALS_AND_OPERANDS` option was used).
>> - The default size of these arrays is currently `StackStreamFactory.SMALL_BATCH` which is 8 (but see [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447)).
>> - `StackStreamFactory` then calls `AbstractStackWalker.callStackWalk()` which is natively implemented in the VM by `JVM_CallStackWalk()`.
>> - `JVM_CallStackWalk()` calls `StackWalk::walk()` which calls `StackWalk::fetchFirstBatch()` which calls `StackWalk::fill_in_frames()` which walks the stack and fills in the available class/stackframe slots in the passed in array until the array is full or there are no more stack frames.
>> - Once `StackWalk::fill_in_frames()` returns, `StackWalk::fetchFirstBatch()` calls back to Java by invoking `AbstractStackWalker::doStackWalk()` to consume the result.
>> - `AbstractStackWalker::doStackWalk()` calls `consumeFrames()` (which is overridden depending on whether we initially called `getCallerClass()` or `walk()`) which consumes the frames until it either finishes (e.g. finds the caller class) or until there are no more frames.
>> - In the latter case `consumeFrames()` will call into the VM again by calling `AbstractStackWalker.fetchStackFrames()` to fetch additional frames from the stack.
>> - `AbstractStackWalker.fetchStackFrames()` is implemented by `JVM_MoreStackWalk()` which calls `StackWalk::fetchNextBatch()` which calls `StackWalk::fill_in_frames()` (the same method that already fetched the initial batch of frames).
>>
>> Following is a stacktrace of what I've explained so far:
>>
>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>> V [libjvm.so+0x1...
>
> Volker Simonis has updated the pull request incrementally with one additional commit since the last revision:
>
> Rename new parameter according to the HS coding conventions

Thanks for catching this issue. I agree that `Method::invoke` should be skipped in the caller-sensitive test in this case, but the fix isn't quite right. The caller-sensitive test should apply in any batch.
For example, `CSM` calls `getCallerClass` reflectively, I think the stack would look like this: java.lang.StackWalker::getCallerClass java.lang.invoke.DirectMethodHandle$Holder::invokeStatic java.lang.invoke.LambdaForm$MH/0x0000000800002c00::invoke : : jdk.internal.reflect.DirectMethodHandleAccessor::invokeImpl jdk.internal.reflect.DirectMethodHandleAccessor::invoke java.lang.reflect.Method::invoke CSM <--------- caller-sensitive method and UOE should be thrown In this case, UOE should be thrown. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14773#issuecomment-1622325630 From duke at openjdk.org Wed Jul 5 19:22:59 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 5 Jul 2023 19:22:59 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 16:43:18 GMT, Ioi Lam wrote: >> It's a bit disappointing for a PR aiming to make heap archiving GC agnostic, to make assumptions about GC internal memory layout, that doesn't apply to all collectors. >> We have discussed previously with @iklam an approach where materializing archived objects uses the normal object allocation APIs. That would for real make the heap archiving mechanism GC agnostic. I would rather see that being prototyped, than a not GC agnostic approach that we might throw away right after it gets integrated, in favour of the more GC agnostic approach. > >> Based on discussion with @fisk, I created an RFE for investigating his idea. Please see https://bugs.openjdk.org/browse/JDK-8310823 >> >> It seems quite promising to me, and will greatly reduce (or eliminate) the interface between CDS and the collectors. >> >> I would like to keep the current performance as possible, so I am leaning towards having an API for CDS to tell the collector about preferred location of the archived objects, to allow mmap'ing and reduce/avoid relocation. 
But such a hinting API will be much smaller than the ones proposed by this PR.
>>
>> (We'd also need a reverse API for the collector to tell CDS what the preferred address would be, during CDS dump time).
>
> I've done a very simple (and rough) implementation of @fisk 's idea, without any optimizations. Every relocation is done via a hashtable lookup. For the default CDS archive, this makes about 48000 relocations on start-up.
>
> https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc?expand=1
>
>
> $ perf stat -r 40 java --version > /dev/null
> 0.015065 +- 0.000228 seconds time elapsed ( +- 1.51% )
>
> $ perf stat -r 40 java -XX:+NewArchiveHeapLoading --version > /dev/null
> 0.020598 +- 0.000215 seconds time elapsed ( +- 1.04% )
>
> $ perf stat -r 40 java -XX:+UseSerialGC --version > /dev/null
> 0.013929 +- 0.000229 seconds time elapsed ( +- 1.64% )
>
> $ perf stat -r 40 java -XX:+UseSerialGC -XX:+NewArchiveHeapLoading --version > /dev/null
> 0.019107 +- 0.000222 seconds time elapsed ( +- 1.16% )
>
>
> The cost of the individual object allocation and relocation is about 5ms.
>
> So far the slowdown doesn't seem too outrageous.
>
> My next step is to optimize the relocation code to see how much faster it can get.
>
> I noticed that the objects in `ArchiveHeapLoader::newcode_runtime_allocate_objects()` are allocated in at least two contiguous blocks, due to TLAB overflow.

@iklam in your performance tests for "java -version" what is the "heap data relocation delta"? Is it non-zero? If so, can you also add the numbers for the runs with -Xmx128m, which would correspond to the best case where no relocation is done, just to add another data point.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1622331716 From matsaave at openjdk.org Wed Jul 5 19:24:03 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 5 Jul 2023 19:24:03 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v2] In-Reply-To: References: Message-ID: On Fri, 30 Jun 2023 14:14:53 GMT, Frederic Parain wrote: >> Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: >> >> Coleen and Amit comments > > src/hotspot/share/interpreter/bytecodeTracer.cpp line 627: > >> 625: case Bytecodes::_putfield: >> 626: case Bytecodes::_getfield: >> 627: // TODO: get_index_u2() does not work here due to using java_u2 instead of native_u2 > > Is the TODO comment still relevant? This was removed after the merge > src/hotspot/share/interpreter/interpreterRuntime.cpp line 719: > >> 717: ResolvedFieldEntry* entry = pool->resolved_field_entry_at(field_index); >> 718: entry->fill_in(info.field_holder(), info.offset(), (u2)info.index(), (u1)state, (u1)get_code, (u1)put_code); >> 719: entry->set_flags(info.access_flags().is_final(), info.access_flags().is_volatile()); > > I think there's an issue here: fill_in() will mark the field entry as resolved (by setting get_code or put_code), and by consequence usable by other threads *before* the flags are initialized. This means another thread could see the entry as being resolved but still get the uninitialized flags. Great catch! Switching these calls should be sufficient, correct? > src/hotspot/share/oops/resolvedFieldEntry.hpp line 95: > >> 93: return (put_code() == code); >> 94: default: >> 95: assert(code == Bytecodes::_nop, "Must be get, put, or nop bytecode"); > > In which cases the is_resolved() method is called with the _nop bytecode in argument? After some testing, this does not seem necessary, the bytecode can only be get or put based on the callers. 
The default case will be replaced with a ShouldNotReachHere() instead. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1253529383 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1253529663 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1253529233 From matsaave at openjdk.org Wed Jul 5 19:28:03 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 5 Jul 2023 19:28:03 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 15:18:51 GMT, Frederic Parain wrote: >> Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: >> >> Coleen and Amit comments > > src/hotspot/share/interpreter/bytecodeTracer.cpp line 328: > >> 326: ConstantPool* constants = _current_method->constants(); >> 327: if (constants->cache() == nullptr) { >> 328: cp_index = i; // TODO: This is wrong on little-endian. See JDK-8309811. > > Fix the code or the comment. This has been resolved with the merge ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1253534001 From pchilanomate at openjdk.org Wed Jul 5 19:29:58 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 5 Jul 2023 19:29:58 GMT Subject: RFR: 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite [v3] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 19:10:20 GMT, Serguei Spitsyn wrote: > Just removing the asserts looks okay to me. Also, it will be safe to backport to 21. Thanks, Serguei > Thanks for the review Serguei! 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/14108#issuecomment-1622345550

From sspitsyn at openjdk.org Wed Jul 5 19:31:04 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Wed, 5 Jul 2023 19:31:04 GMT
Subject: RFR: 8303086: SIGSEGV in JavaThread::is_interp_only_mode()
In-Reply-To:
References: <5PhlR3J_Cg-n5DSnX258xNjol34M6eqgTd3AZaoDNsc=.4c7d9382-9f5d-4bdd-b985-7ffc4149da67@github.com>
Message-ID:

On Sun, 2 Jul 2023 22:39:46 GMT, David Holmes wrote:

>> The JVMTI function `SetEventNotificationMode` can set notification mode globally (`event_thread == nullptr`) for all threads or for a specific thread (`event_thread != nullptr`). To get a stable mount/unmount view of virtual threads a JvmtiVTMSTransitionDisabler helper object is created:
>> `JvmtiVTMSTransitionDisabler disabler(event_thread);`
>>
>> If `event_thread == nullptr`, the VTMS transitions are disabled for all virtual threads,
>> otherwise they are disabled for a specific thread if it is virtual.
>> The call to `JvmtiEventController::set_user_enabled()` makes a call to `recompute_enabled()` at the end of its work to do the required bookkeeping. As part of this work, `recompute_thread_enabled(state)` is called for each thread from the `ThreadsListHandle`, not only for the given `event_thread`:
>>
>> ThreadsListHandle tlh;
>> for (; state != nullptr; state = state->next()) {
>> any_env_thread_enabled |= recompute_thread_enabled(state);
>> }
>>
>> This can cause crashes as VTMS transitions for other virtual threads are allowed.
>> Crashes are observed in this small function:
>>
>> bool is_interp_only_mode() {
>> return _thread == nullptr ? _saved_interp_only_mode != 0 : _thread->is_interp_only_mode();
>> }
>>
>> If `_thread != nullptr`, the call `_thread->is_interp_only_mode()` needs to be executed.
>> But the field `_thread` can already have been changed to `nullptr` by a VTMS transition.
>>
>> The fix is to always disable all transitions.
>> Thanks to Dan and Patricio for great analysis of this crash! >> >> Testing: >> - In progress: mach5 tiers 1-6 > > src/hotspot/share/prims/jvmtiEnv.cpp line 578: > >> 576: record_class_file_load_hook_enabled(); >> 577: } >> 578: JvmtiVTMSTransitionDisabler disabler; > > An explanatory comment would have been good. The fix has been already integrated. I'll add a comment when there is a chance. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14728#discussion_r1253537846 From stuefe at openjdk.org Wed Jul 5 19:55:04 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 19:55:04 GMT Subject: RFR: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids In-Reply-To: References: Message-ID: On Wed, 28 Jun 2023 15:48:01 GMT, Ioi Lam wrote: >> I recently spent too much time trying to understand the interleaving of narrow Klass decoding with CDS. Thanks to @iklam for clarifying some details. This patch aims to make the code easier to understand and more correct. >> >> (*2023-07-05 Updated to fit patch description to the agreed final form*) >> >> ---- >> >> CDS narrow Klass handling plays a role for archived heap objects. >> >> When dumping heap objects, we must recompute their narrow Klass ids since the relative positions of archived Klass instances change compared to their live counterparts in the dump time JVM. >> We recompute those narrow Klass ids using the following encoding scheme: >> - base = future assumed mapping start address >> - shift = dump time (!) JVMs encoding shift (A) >> see `ArchiveHeapWriter::update_header_for_requested_obj` https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/cds/archiveHeapWriter.cpp#L419-L425 >> >> At CDS runtime, we load the CDS archive, then place the class space behind it. 
We then initialize narrow Klass encoding for the resulting combined Klass range such that: >> - encoding base is the range start address (mapping base) >> - encoding shift is always zero >> see `CompressedKlassPointers::initialize` : https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/oops/compressedOops.cpp#L195-L231 >> >> The lengthy comment there is misleading and partly wrong (regrettable since it was mine :) >> >> At dump time, when initializing the JVM, we also set the encoding base to klass range start and shift to zero (also `CompressedKlassPointers::initialize`). That is the shift we later use for (A); hence, that shift is zero. >> >> ------------------- >> >> There are minor things wrong with the current code. Mainly, the *dump time* VM's narrow Klass encoding scheme is irrelevant for the encoding employed on the future runtime archive since we recalculate the narrow Klass ids for archived heap objects. That means: >> >> - In `CompressedKlassPointers::initialize`, there is no need to fix the encoding base and shift for the *dump time* JVM. Dump time JVM can use whatever base+shift it pleases; it can use the same code path as when CDS is off (e.g. use zero-based encoding). >> >> - In `ArchiveHeapWriter::update_header_for_requested_obj`, we should not use the dump time JVM shift for pre-computing the narrow Klass ids. Instead, we should use the *minimal shift needed for the maximal possible future Klass rang... > > Hi Thomas, thanks for doing the clean up. > > I am not quite sure about the meaning of `precomputed_narrow_klass_base_delta`. Is it intended to be used for Lilliput, where a non-zero value may be used? > > Since it's hard-coded to zero now, it's hard to observe how it would affect the calculation. 
> > Before Liliput: > > At dump time, all narrowKlass values are updated to be relative to `_requested_static_archive_bottom`, with `ArchiveHeapWriter::precomputed_narrow_klass_shift` (which is hard coded to zero) > > At runtime, we always set `CompressedKlassPointers::base()` to `header()->mapped_base_address()` in here: > > https://github.com/openjdk/jdk/blob/46e4ee1e80652203bd59d968ea72b27681bdf312/src/hotspot/share/cds/metaspaceShared.cpp#L1149 > > Currently it's impossible to set an alternative `CompressedClassSpaceBaseAddress` when using CDS: > > > develop(size_t, CompressedClassSpaceBaseAddress, 0, \ > "Force the class space to be allocated at this address or " \ > "fails VM initialization (requires -Xshare=off.") \ > > > Also we will always have zero shifts, as we don't allow CompressedClassSpaceSize to be larger than 3G: > > > product(size_t, CompressedClassSpaceSize, 1*G, \ > "Maximum size of class area in Metaspace when compressed " \ > "class pointers are used") \ > range(1*M, 3*G) \ > \ > > > This means we can actually always use the archived Java objects, and this "if" block can be turned into an assert > > > if (archive_narrow_klass_base != CompressedKlassPointers::base() || > narrow_klass_shift() != CompressedKlassPointers::shift()) { > log_info(cds)("CDS heap data cannot be used because the archive was created with an incompatible narrow klass encoding mode."); > return false; > > > I think for the current implementation, we might as well get rid of these fields in filemap.hpp. > > > size_t _narrow_klass_base_delta; // narrow Klass base and shift used to pre-calculate nKlass ids > int _narrow_klass_shift; // in archived objects (see comment in archiveHeapWriter.hpp) > > > Having them there will create more confusion. They may be useful for the future, but it's hard to understand how these fields would be used without implementing the "future" :-) Thanks @iklam and @ashu-mehra ! 
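The base/shift arithmetic discussed above can be sketched as follows. This is a minimal illustration in Java, not HotSpot code: `NarrowKlassSketch`, `encode`, `decode` and `minimalShift` are hypothetical names, and 32-bit narrow Klass ids are assumed.

```java
// Hypothetical illustration of base + shift narrow Klass encoding.
// None of these names are HotSpot identifiers.
final class NarrowKlassSketch {
    static final long MAX_NARROW = 0xFFFF_FFFFL; // assumes 32-bit narrow ids

    // Encode an address as an id relative to `base`, dropping `shift` low bits.
    static long encode(long address, long base, int shift) {
        long offset = address - base;
        if (offset < 0 || (offset & ((1L << shift) - 1)) != 0) {
            throw new IllegalArgumentException("address not encodable with this base/shift");
        }
        long narrow = offset >>> shift;
        if (narrow > MAX_NARROW) {
            throw new IllegalArgumentException("offset does not fit into a narrow id");
        }
        return narrow;
    }

    // Decode a narrow id back into an address.
    static long decode(long narrow, long base, int shift) {
        return base + (narrow << shift);
    }

    // Smallest shift for which every offset in [0, rangeSize) fits into a narrow id.
    static int minimalShift(long rangeSize) {
        int shift = 0;
        while (((rangeSize - 1) >>> shift) > MAX_NARROW) {
            shift++;
        }
        return shift;
    }
}
```

Under these assumptions a shift of zero is enough for any Klass range of up to 4 GB, so the ids written at dump time depend only on the base the archive is later mapped at, not on the dump time JVM's own encoding.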
------------- PR Comment: https://git.openjdk.org/jdk/pull/14688#issuecomment-1622398507 From stuefe at openjdk.org Wed Jul 5 19:55:06 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 5 Jul 2023 19:55:06 GMT Subject: Integrated: JDK-8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids In-Reply-To: References: Message-ID: On Wed, 28 Jun 2023 07:48:33 GMT, Thomas Stuefe wrote: > I recently spent too much time trying to understand the interleaving of narrow Klass decoding with CDS. Thanks to @iklam for clarifying some details. This patch aims to make the code easier to understand and more correct. > > (*2023-07-05 Updated to fit patch description to the agreed final form*) > > ---- > > CDS narrow Klass handling plays a role for archived heap objects. > > When dumping heap objects, we must recompute their narrow Klass ids since the relative positions of archived Klass instances change compared to their live counterparts in the dump time JVM. > We recompute those narrow Klass ids using the following encoding scheme: > - base = future assumed mapping start address > - shift = dump time (!) JVMs encoding shift (A) > see `ArchiveHeapWriter::update_header_for_requested_obj` https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/cds/archiveHeapWriter.cpp#L419-L425 > > At CDS runtime, we load the CDS archive, then place the class space behind it. 
We then initialize narrow Klass encoding for the resulting combined Klass range such that: > - encoding base is the range start address (mapping base) > - encoding shift is always zero > see `CompressedKlassPointers::initialize` : https://github.com/openjdk/jdk/blob/c3f10e847999ec254893de5a1a5de32fd07f715a/src/hotspot/share/oops/compressedOops.cpp#L195-L231 > > The lengthy comment there is misleading and partly wrong (regrettable since it was mine :) > > At dump time, when initializing the JVM, we also set the encoding base to klass range start and shift to zero (also `CompressedKlassPointers::initialize`). That is the shift we later use for (A); hence, that shift is zero. > > ------------------- > > There are minor things wrong with the current code. Mainly, the *dump time* VM's narrow Klass encoding scheme is irrelevant for the encoding employed on the future runtime archive since we recalculate the narrow Klass ids for archived heap objects. That means: > > - In `CompressedKlassPointers::initialize`, there is no need to fix the encoding base and shift for the *dump time* JVM. Dump time JVM can use whatever base+shift it pleases; it can use the same code path as when CDS is off (e.g. use zero-based encoding). > > - In `ArchiveHeapWriter::update_header_for_requested_obj`, we should not use the dump time JVM shift for pre-computing the narrow Klass ids. Instead, we should use the *minimal shift needed for the maximal possible future Klass range size*. The comment in ArchiveHeapWriter explains this in g... This pull request has now been integrated. 
Changeset: 0616648c Author: Thomas Stuefe URL: https://git.openjdk.org/jdk/commit/0616648c59215d001211423402c6444ce228f01e Stats: 159 lines in 8 files changed: 70 ins; 57 del; 32 mod 8311035: CDS should not use dump time JVM narrow Klass encoding to pre-compute Klass ids Reviewed-by: iklam ------------- PR: https://git.openjdk.org/jdk/pull/14688 From matsaave at openjdk.org Wed Jul 5 20:14:00 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 5 Jul 2023 20:14:00 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v5] In-Reply-To: References: Message-ID: <7yMnTEisZZggwjKlv5idBInF8rW3-Odd_UH8fy8xo2s=.c0f7577b-b44e-46e8-ae69-4dd7050db0d3@github.com> > 8301996: Move field resolution information out of the cpCache Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Frederic comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14129/files - new: https://git.openjdk.org/jdk/pull/14129/files/e59773eb..ef61b5b1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=03-04 Stats: 35 lines in 5 files changed: 7 ins; 4 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/14129.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14129/head:pull/14129 PR: https://git.openjdk.org/jdk/pull/14129 From sspitsyn at openjdk.org Wed Jul 5 20:18:13 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 5 Jul 2023 20:18:13 GMT Subject: [jdk21] RFR: 8303086: SIGSEGV in JavaThread::is_interp_only_mode() Message-ID: <7YTjvWV6Y598BtSLd1dwToqmDj0D5QZK3jXOTIPBXdU=.af12e295-bf97-42c5-a6cc-5c61bb367cf0@github.com> Clean backport from mainline jdk repo to jdk21 for the fix of: [8303086](https://bugs.openjdk.org/browse/JDK-8303086): SIGSEGV in JavaThread::is_interp_only_mode() Testing: - TBD: mach5 tiers 1-5 ------------- Commit messages: - Backport 
971c2efb698065c65dcf7373d8c3027f58d5f503 Changes: https://git.openjdk.org/jdk21/pull/96/files Webrev: https://webrevs.openjdk.org/?repo=jdk21&pr=96&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8303086 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk21/pull/96.diff Fetch: git fetch https://git.openjdk.org/jdk21.git pull/96/head:pull/96 PR: https://git.openjdk.org/jdk21/pull/96 From matsaave at openjdk.org Wed Jul 5 20:19:16 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 5 Jul 2023 20:19:16 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v6] In-Reply-To: References: Message-ID: > 8301996: Move field resolution information out of the cpCache Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Fred comments x86 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14129/files - new: https://git.openjdk.org/jdk/pull/14129/files/ef61b5b1..aabd5b45 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=04-05 Stats: 34 lines in 1 file changed: 1 ins; 6 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/14129.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14129/head:pull/14129 PR: https://git.openjdk.org/jdk/pull/14129 From matsaave at openjdk.org Wed Jul 5 20:23:57 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 5 Jul 2023 20:23:57 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v2] In-Reply-To: References: Message-ID: On Fri, 30 Jun 2023 14:40:34 GMT, Matias Saavedra Silva wrote: >> src/hotspot/cpu/aarch64/templateTable_aarch64.cpp line 2553: >> >>> 2551: if (!CompilerConfig::is_c1_or_interpreter_only_no_jvmci()){ >>> 2552: Label notVolatile; >>> 2553: __ andr(raw_flags, raw_flags, 0x1); >> >> It looks like `andr` is not needed here? 
> > I believe `andr` is still necessary here since `raw_flags` has two bits that are used. The is_volatile flag may be false but if is_final is true, the flag will not be zero without the and. It looks like I misunderstood what `tbz` does! I believe you are correct in suggesting that the `andr` is not necessary. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1253605354 From pchilanomate at openjdk.org Wed Jul 5 20:41:01 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Wed, 5 Jul 2023 20:41:01 GMT Subject: [jdk21] RFR: 8303086: SIGSEGV in JavaThread::is_interp_only_mode() In-Reply-To: <7YTjvWV6Y598BtSLd1dwToqmDj0D5QZK3jXOTIPBXdU=.af12e295-bf97-42c5-a6cc-5c61bb367cf0@github.com> References: <7YTjvWV6Y598BtSLd1dwToqmDj0D5QZK3jXOTIPBXdU=.af12e295-bf97-42c5-a6cc-5c61bb367cf0@github.com> Message-ID: On Wed, 5 Jul 2023 19:33:16 GMT, Serguei Spitsyn wrote: > Clean backport from mainline jdk repo to jdk21 for the fix of: > [8303086](https://bugs.openjdk.org/browse/JDK-8303086): SIGSEGV in JavaThread::is_interp_only_mode() > > Testing: > - TBD: mach5 tiers 1-5 Marked as reviewed by pchilanomate (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk21/pull/96#pullrequestreview-1515324386 From iklam at openjdk.org Wed Jul 5 21:09:55 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 5 Jul 2023 21:09:55 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 16:43:18 GMT, Ioi Lam wrote: >> It's a bit disappointing for a PR aiming to make heap archiving GC agnostic, to make assumptions about GC internal memory layout, that doesn't apply to all collectors. >> We have discussed previously with @iklam an approach where materializing archived objects uses the normal object allocation APIs. That would for real make the heap archiving mechanism GC agnostic. 
I would rather see that being prototyped, than a non-GC-agnostic approach that we might throw away right after it gets integrated, in favour of the more GC agnostic approach. > >> Based on discussion with @fisk, I created an RFE for investigating his idea. Please see https://bugs.openjdk.org/browse/JDK-8310823 >> >> It seems quite promising to me, and will greatly reduce (or eliminate) the interface between CDS and the collectors. >> >> I would like to keep the current performance as much as possible, so I am leaning towards having an API for CDS to tell the collector about the preferred location of the archived objects, to allow mmap'ing and reduce/avoid relocation. But such a hinting API will be much smaller than the ones proposed by this PR. >> >> (We'd also need a reverse API for the collector to tell CDS what the preferred address would be, during CDS dump time). > > I've done a very simple (and rough) implementation of @fisk 's idea, without any optimizations. Every relocation is done via a hashtable lookup. For the default CDS archive, this makes about 48000 relocations on start-up. > > https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc?expand=1 > > > $ perf stat -r 40 java --version > /dev/null > 0.015065 +- 0.000228 seconds time elapsed ( +- 1.51% ) > > $ perf stat -r 40 java -XX:+NewArchiveHeapLoading --version > /dev/null > 0.020598 +- 0.000215 seconds time elapsed ( +- 1.04% ) > > $ perf stat -r 40 java -XX:+UseSerialGC --version > /dev/null > 0.013929 +- 0.000229 seconds time elapsed ( +- 1.64% ) > > $ perf stat -r 40 java -XX:+UseSerialGC -XX:+NewArchiveHeapLoading --version > /dev/null > 0.019107 +- 0.000222 seconds time elapsed ( +- 1.16% ) > > > The cost of the individual object allocation and relocation is about 5ms. > > So far the slowdown doesn't seem too outrageous. > > My next step is to optimize the relocation code to see how much faster it can get.
> > I noticed that the objects in `ArchiveHeapLoader::newcode_runtime_allocate_objects()` are allocated in at least two contiguous blocks, due to TLAB overflow. > @iklam in your performance tests for "java -version" what is the "heap data relocation delta"? Is it non-zero? If so, can you also add the numbers for the runs with -Xmx128m which would correspond to the best case where no relocation is done just to add another data point. I first ran `java -Xshare:dump` so all the subsequent `java --version` runs use the same heap size as dump time. As a result, my "before" runs had a heap relocation delta of zero, which should correspond to the best start-up time. Anyway, I just wanted to get a rough baseline. I will do more detailed benchmarking after implementing the optimized relocation for the `-XX:+NewArchiveHeapLoading` case. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1622512296 From duke at openjdk.org Wed Jul 5 21:36:55 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 5 Jul 2023 21:36:55 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 21:07:01 GMT, Ioi Lam wrote: > I first ran java -Xshare:dump so all the subsequent java --version runs use the same heap size as dump time. As a result, my "before" runs had a heap relocation delta of zero, which should correspond to the best start-up time. Okay, thanks for clarifying. I thought `java --version` runs were using the default archive. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1622557048 From dholmes at openjdk.org Wed Jul 5 22:18:59 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 5 Jul 2023 22:18:59 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7] In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 23:23:07 GMT, Doug Simon wrote: >> In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. >> >> This PR adds support for `-Xlog:class+load+cause` and `-Xlog:class+load+cause+native` that produce output according to a new `LogClassLoadingCauseFor` VM flag: >> >> >> product(ccstr, LogClassLoadingCauseFor, nullptr, \ >> "Apply -Xlog:class+load+cause* to classes whose fully " \ >> "qualified name contains this string ("*" matches " \ >> "any class).") \ >> >> >> Example usage: >> >> java "-Xlog:class+load+cause*" -XX:LogClassLoadingCauseFor=java.util.concurrent.ConcurrentHashMap$V --version >> [0.075s][info][class,load,cause] Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView: >> [0.075s][info][class,load,cause] at java.util.concurrent.ConcurrentHashMap.values(java.base/ConcurrentHashMap.java:1263) >> [0.075s][info][class,load,cause] at jdk.internal.loader.NativeLibraries.find(java.base/NativeLibraries.java:102) >> [0.075s][info][class,load,cause] at java.lang.ClassLoader.findNative(java.base/ClassLoader.java:2457) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.init(java.base/Native Method) >> [0.075s][info][class,load,cause] at 
sun.nio.fs.UnixNativeDispatcher.(java.base/UnixNativeDispatcher.java:817) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystem.(java.base/UnixFileSystem.java:96) >> [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystem.(java.base/BsdFileSystem.java:50) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystem.(java.base/MacOSXFileSystem.java:52) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:44) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:37) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystemProvider.(java.base/UnixFileSystemProvider.java:78) >> [0.075s][info][class,load,cau... > > Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > use OS specific native stack printing in class load cause native stack logging Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14553#pullrequestreview-1515451688 From dholmes at openjdk.org Wed Jul 5 22:19:01 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 5 Jul 2023 22:19:01 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 13:03:53 GMT, Doug Simon wrote: >> It seems it is. I was not aware of that. So I need to fix `JavaThread::print_jni_stack`, and more generally consolidate this stack printing code. > > Yep, that sounds like a good idea. I assume that can be done in a follow up issue so I can now proceed to mach5 test and then merge this PR? Yes I will do the cleanup in a follow up issue. I will approve this PR now. You still need a second reviewer to approve. Thanks. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1253708977 From sspitsyn at openjdk.org Thu Jul 6 01:12:59 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 6 Jul 2023 01:12:59 GMT Subject: [jdk21] RFR: 8303086: SIGSEGV in JavaThread::is_interp_only_mode() In-Reply-To: <7YTjvWV6Y598BtSLd1dwToqmDj0D5QZK3jXOTIPBXdU=.af12e295-bf97-42c5-a6cc-5c61bb367cf0@github.com> References: <7YTjvWV6Y598BtSLd1dwToqmDj0D5QZK3jXOTIPBXdU=.af12e295-bf97-42c5-a6cc-5c61bb367cf0@github.com> Message-ID: On Wed, 5 Jul 2023 19:33:16 GMT, Serguei Spitsyn wrote: > Clean backport from mainline jdk repo to jdk21 for the fix of: > [8303086](https://bugs.openjdk.org/browse/JDK-8303086): SIGSEGV in JavaThread::is_interp_only_mode() > > Testing: > - TBD: mach5 tiers 1-5 Thank you for review, Patricio! ------------- PR Comment: https://git.openjdk.org/jdk21/pull/96#issuecomment-1622765842 From sspitsyn at openjdk.org Thu Jul 6 01:13:00 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 6 Jul 2023 01:13:00 GMT Subject: [jdk21] Integrated: 8303086: SIGSEGV in JavaThread::is_interp_only_mode() In-Reply-To: <7YTjvWV6Y598BtSLd1dwToqmDj0D5QZK3jXOTIPBXdU=.af12e295-bf97-42c5-a6cc-5c61bb367cf0@github.com> References: <7YTjvWV6Y598BtSLd1dwToqmDj0D5QZK3jXOTIPBXdU=.af12e295-bf97-42c5-a6cc-5c61bb367cf0@github.com> Message-ID: <4xXhNBuYiAfDaWjvXoRoazho6koSbOeM8Fk8XRpuEtg=.df098456-e7ec-43f3-991d-17c4a18aac68@github.com> On Wed, 5 Jul 2023 19:33:16 GMT, Serguei Spitsyn wrote: > Clean backport from mainline jdk repo to jdk21 for the fix of: > [8303086](https://bugs.openjdk.org/browse/JDK-8303086): SIGSEGV in JavaThread::is_interp_only_mode() > > Testing: > - TBD: mach5 tiers 1-5 This pull request has now been integrated. 
Changeset: f24c5540 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk21/commit/f24c5540ffd9ad6ef151338f64cd15f0a4df9ed1 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8303086: SIGSEGV in JavaThread::is_interp_only_mode() Reviewed-by: pchilanomate Backport-of: 971c2efb698065c65dcf7373d8c3027f58d5f503 ------------- PR: https://git.openjdk.org/jdk21/pull/96 From jpbempel at openjdk.org Thu Jul 6 05:25:14 2023 From: jpbempel at openjdk.org (Jean-Philippe Bempel) Date: Thu, 6 Jul 2023 05:25:14 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform Message-ID: Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in an unresolved state. So the constant pool merging process considered the entry with the pre-resolved class as different from the destination entry and created a new entry. We now try to consider them equal, especially for Methodref/Fieldref.
------------- Commit messages: - 8308762: Metaspace leak with Instrumentation.retransform Changes: https://git.openjdk.org/jdk/pull/14780/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14780&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308762 Stats: 62 lines in 4 files changed: 29 ins; 31 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14780.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14780/head:pull/14780 PR: https://git.openjdk.org/jdk/pull/14780 From stuefe at openjdk.org Thu Jul 6 05:36:34 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 6 Jul 2023 05:36:34 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v11] In-Reply-To: References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> Message-ID: On Wed, 5 Jul 2023 17:28:15 GMT, Thomas Stuefe wrote: >> (*Updated 2023-07-05 to reflect the current state of the patch*) >> >> This RFE adds the option to auto-trim the Glibc heap as part of the GC cycle. If the VM process suffered high temporary malloc spikes (regardless of whether from JVM- or user code), this could recover significant amounts of memory. >> >> We discussed this a year ago [1], but the item got pushed to the bottom of my work pile, therefore, it took longer than I thought. >> >> ### Motivation >> >> The Glibc is reluctant to return memory to the OS, more so than other allocators. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching, and a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap (the typical native application). The JVM, however, clusters allocations and for a lot of use cases rolls its own memory management via mmap. 
An app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. >> >> To help, Glibc exports an API to trim the C-heap: `malloc_trim(3)`. With JDK 18 [2], SAP contributed a new jcmd command to *manually* trim the C-heap on Linux. This RFE adds a complementary way to trim automatically. >> >> #### Is this even a problem? >> >> Yes. >> >> The JVM clusters most native memory allocations and satisfies them with mmap. But there are enough C-heap allocations left to cause malloc spikes that are subject to memory retention. Note that one example is the hotspot arenas themselves. >> >> But I have seen many cases of high memory retention in Glibc in third-party JNI code. Libraries allocate large buffers via malloc as temporary buffers. In fact, since we introduced the jcmd "System.trim_native_heap", some of our customers started to call this command periodically in scripts to counter these issues. >> >> ### How trimming works >> >> Trimming is done via `malloc_trim(3)`. `malloc_trim` will iterate over all arenas and trim each one in turn. While doing that, it will lock the arena, which may cause some (but not all) subsequent actions on the same arenas to block. glibc also trims automatically on free, but that is very limited (see https://github.com/openjdk/jdk/pull/10085#issuecomment-1619638641 for details). >> >> `malloc_trim` offers almost no way to control its behavior; in particular, no way to limit its runtime. Its run... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 41 commits: > > - fix windows build > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - Remove adaptive stepdown coding > - Merge master > - wip > - Merge branch 'master' into JDK-8293114-GC-trim-native > - wip > - ...
and 31 more: https://git.openjdk.org/jdk/compare/22e17c29...162b880a Closing this PR in favour of a new, slightly renamed one, as Aleksey suggested ------------- PR Comment: https://git.openjdk.org/jdk/pull/10085#issuecomment-1623016718 From dholmes at openjdk.org Thu Jul 6 06:47:56 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 6 Jul 2023 06:47:56 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 05:18:01 GMT, Jean-Philippe Bempel wrote: > Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in an unresolved state. So the constant pool merging process considered the entry with the pre-resolved class as different from the destination entry and created a new entry. We now try to consider them equal, especially for Methodref/Fieldref. Sorry, I'm having a lot of trouble trying to understand the fix in the context of the problem description as outlined in the bug report. The issue pertained only to the treatment of Throwable due to it being pre-resolved by the verifier, but your fix is looking at Fieldrefs and Methodrefs? There are also remaining comments about resolved and unresolved class entries deliberately not being considered the same. src/hotspot/share/oops/constantPool.cpp line 1281: > 1279: // Returns true if the current mismatch is due to a resolved/unresolved > 1280: // class pair. Otherwise, returns false. > 1281: bool ConstantPool::is_unresolved_class_mismatch(int index1, Has this been moved verbatim from jvmtiRedefineClasses.cpp? There are a couple of style nits with that existing code that don't fit this file: - parameters should line up, i.e. const under int - no comment on the closing brace of the method body.
src/hotspot/share/oops/constantPool.cpp line 1298: > 1296: } > 1297: > 1298: char *s1 = klass_name_at(index1)->as_C_string(); `as_C_string` needs an active `ResourceMark` - not sure where it is coming from even in the existing code. There should at least be a comment that an active ResourceMark is needed. ------------- PR Review: https://git.openjdk.org/jdk/pull/14780#pullrequestreview-1515832397 PR Review Comment: https://git.openjdk.org/jdk/pull/14780#discussion_r1253992199 PR Review Comment: https://git.openjdk.org/jdk/pull/14780#discussion_r1253988705 From stuefe at openjdk.org Thu Jul 6 07:04:23 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 6 Jul 2023 07:04:23 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v11] In-Reply-To: <7q-A34HZnue-oo56RV6hd6hVV3SNHlLL7wx-GQ6scm8=.20e79884-9add-4954-ac4a-1ea825c9b458@github.com> References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> <7q-A34HZnue-oo56RV6hd6hVV3SNHlLL7wx-GQ6scm8=.20e79884-9add-4954-ac4a-1ea825c9b458@github.com> Message-ID: <777YQVlFLc_r58lJDCMMF08xVdkJimzcAZlxlYmSMeo=.d0e1b8d9-8b41-42b5-b0f8-819aa8640636@github.com> On Wed, 5 Jul 2023 18:37:41 GMT, Aleksey Shipilev wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 41 commits: >> >> - fix windows build >> - Merge branch 'master' into JDK-8293114-GC-trim-native >> - wip >> - Merge branch 'master' into JDK-8293114-GC-trim-native >> - wip >> - Remove adaptive stepdown coding >> - Merge master >> - wip >> - Merge branch 'master' into JDK-8293114-GC-trim-native >> - wip >> - ... and 31 more: https://git.openjdk.org/jdk/compare/22e17c29...162b880a > > src/hotspot/share/memory/arena.cpp line 105: > >> 103: >> 104: static void clean() { >> 105: ThreadCritical tc; > > Why do you need `ThreadCritical` here? I would have thought `PauseMark` handles everything right? Unrelated to pause. 
I introduced an empty check on all pools, but that has to happen under lock protection too. So I moved up the 4 ThreadCriticals from the prune functions to this function. The point is to avoid pausing if nothing is done, which is most of the time. Also, instead of 4 calls to ThreadCritical, just one. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r1254019061 From dholmes at openjdk.org Thu Jul 6 07:19:01 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 6 Jul 2023 07:19:01 GMT Subject: RFR: 8311500: StackWalker.getCallerClass() can throw if invoked reflectively [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 17:25:24 GMT, Volker Simonis wrote: >> As the included jtreg test demonstrates, `StackWalker.getCallerClass()` can throw an `UnsupportedOperationException` if called reflectively. Currently this only happens if we invoke `StackWalker.getCallerClass()` recursively reflectively, but this issue will become more prominent once we fix [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447). The gory details follow below: >> >> The protocol between the Java API and the JVM for `StackWalker.getCallerClass()/walk()` is as follows: >> - On the Java side, `StackWalker` calls into `StackStreamFactory` for the real work. >> - For `StackWalker.getCallerClass()` `StackStreamFactory` basically creates a `Class[]` which will be passed down and filled in the JVM. For `StackWalker.walk()` it will normally be a `StackFrameInfo[]` (or a `LiveStackFrameInfo[]` if the internal `ExtendedOption.LOCALS_AND_OPERANDS` option was used). >> - The default size of these arrays is currently `StackStreamFactory.SMALL_BATCH` which is 8 (but see [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447)). >> - `StackStreamFactory` then calls `AbstractStackWalker.callStackWalk()` which is natively implemented in the VM by `JVM_CallStackWalk()`.
>> - `JVM_CallStackWalk()` calls `StackWalk::walk()` which calls `StackWalk::fetchFirstBatch()` which calls `StackWalk::fill_in_frames()` which walks the stack and fills in the available class/stackframe slots in the passed in array until the array is full or there are no more stack frames. >> - Once `StackWalk::fill_in_frames()` returns, `StackWalk::fetchFirstBatch()` calls back to Java by invoking `AbstractStackWalker::doStackWalk()` to consume the result. >> - `AbstractStackWalker::doStackWalk()` calls `consumeFrames()` (which is overridden depending on whether we initially called `getCallerClass()` or `walk()`) which consumes the frames until it either finishes (e.g. finds the caller class) or until there are no more frames. >> - In the latter case `consumeFrames()` will call into the VM again by calling `AbstractStackWalker.fetchStackFrames()` to fetch additional frames from the stack. >> - `AbstractStackWalker.fetchStackFrames()` is implemented by `JVM_MoreStackWalk()` which calls `StackWalk::fetchNextBatch()` which calls `StackWalk::fill_in_frames()` (the same method that already fetched the initial batch of frames). >> >> Following is a stacktrace of what I've explained so far: >> >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x1... > > Volker Simonis has updated the pull request incrementally with one additional commit since the last revision: > > Rename new parameter according to the HS coding conventions src/hotspot/share/prims/stackwalk.cpp line 501: > 499: KeepStackGCProcessedMark keep_stack(THREAD); > 500: numFrames = fill_in_frames(mode, stream, frame_count, start_index, > 501: frames_array, end_index, true, CHECK_NULL); Could you annotate the new argument please, i.e. `true /* first batch */` and `false /* not first batch */`. Thanks.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14773#discussion_r1254033569 From dholmes at openjdk.org Thu Jul 6 07:35:57 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 6 Jul 2023 07:35:57 GMT Subject: RFR: JDK-8311285: report some fontconfig related environment variables in hs_err file In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 07:03:36 GMT, Matthias Baesken wrote: > Hi David , FONTCONFIG_USE_MMAP sounds interesting, I saw this too in the list of environment variables. Should I add this one? It seems more useful, at first glance, than the others :) I find it hard to know where to draw the line here. Over time you could end up eventually adding everything. Would it be better to dump the entire environment to a file along-side the hs_err file? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14767#issuecomment-1623139664 From mbaesken at openjdk.org Thu Jul 6 07:58:54 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 6 Jul 2023 07:58:54 GMT Subject: RFR: JDK-8311285: report some fontconfig related environment variables in hs_err file In-Reply-To: References: Message-ID: <3s1cIlmc8GaNY_J8pfyLmbM-GjhVdyq0oqk8pD2WFJE=.103b3273-3e0b-480e-9ef1-6b66e05d3d68@github.com> On Tue, 4 Jul 2023 11:47:49 GMT, Matthias Baesken wrote: > There are a number of important environment variables influencing how fontconfig works. > See for example > https://man.archlinux.org/man/fonts-conf.5 > Some of them should be added to the list of reported environment variables in hs_err file because e.g. a bad setting for some of them can even lead sometimes to crashes. Hi David, > Would it be better to dump the entire environment to a file along-side the hs_err file? I am fine with this idea. However I am not sure if this is the policy of the OpenJDK. If you print everything, maybe some application dependent variables show up as well that have nothing to do with the JVM at all. 
For example, in my environment on my Windows laptop, there would be environment variables about OneDrive and about Perforce SCM, not sure if we want to print those. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14767#issuecomment-1623168412 From simonis at openjdk.org Thu Jul 6 08:00:54 2023 From: simonis at openjdk.org (Volker Simonis) Date: Thu, 6 Jul 2023 08:00:54 GMT Subject: RFR: 8311500: StackWalker.getCallerClass() can throw if invoked reflectively [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 19:14:07 GMT, Mandy Chung wrote: > Thanks for catching this issue. I agree that `Method::invoke` should be skipped the caller-sensitive test in this case but the fix isn't quite right. The caller-sensitive test should apply in any batch. For example, `CSM` calls `getCallerClass` reflectively, I think the stack would look like this: > > ``` > java.lang.StackWalker::getCallerClass > java.lang.invoke.DirectMethodHandle$Holder::invokeStatic > java.lang.invoke.LambdaForm$MH/0x0000000800002c00::invoke > : > : > jdk.internal.reflect.DirectMethodHandleAccessor::invokeImpl > jdk.internal.reflect.DirectMethodHandleAccessor::invoke > java.lang.reflect.Method::invoke > CSM <--------- caller-sensitive method and UOE should be thrown > ``` > > In this case, UOE should be thrown. Hi @mlchung , Thanks for your comments. I think you're right. More specifically, the caller-sensitive test should be applied to the first non-reflective, non-methodhandle frame, independent of the batch in which it appears. I'll add a new test for the case you've brought up and update the fix accordingly.
Best regards, Volker ------------- PR Comment: https://git.openjdk.org/jdk/pull/14773#issuecomment-1623171029 From stuefe at openjdk.org Thu Jul 6 10:07:25 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 6 Jul 2023 10:07:25 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap Message-ID: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. --------------- This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. ### Background: The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. 
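For readers who have not used it, a manual trim boils down to a single glibc call. The following is a minimal sketch and not code from this patch: the function name `trim_native_heap_once` is made up for illustration, and the `__GLIBC__` guard is an assumption about how one might keep it compiling on non-glibc platforms.

```cpp
#include <cstdlib>   // any libc header; on glibc this also defines __GLIBC__
#if defined(__GLIBC__)
#include <malloc.h>  // malloc_trim(3)
#endif

// Hypothetical helper (illustration only): ask glibc to return free C-heap
// memory to the OS. Returns true if glibc reports that memory was released.
// On non-glibc platforms it does nothing and returns false.
bool trim_native_heap_once() {
#if defined(__GLIBC__)
  // pad = 0: keep no extra slack at the top of the main arena.
  return malloc_trim(0) == 1;
#else
  return false;
#endif
}
```

Calling this right after a burst of `free()` calls is roughly what an externally scheduled `jcmd System.trim_native_heap` run triggers inside the process.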
#### GLIBC internals The following information I took from the glibc source code and from experimenting. ##### Why do we need to trim manually? Does the Glibc not trim on free? Upon `free()`, glibc may return memory to the OS if: - the returned block was mmap'ed - the returned block was not added to tcache or to fastbins - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: a) for the main arena, glibc attempts to lower the brk() b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. To increase the chance of auto-reclamation happening, one can do one or more things: - a) increase allocation sizes - b) limit mallocs to very few threads, ideally just one - c) set MALLOC_ARENA_MAX=1 - d) set the `glibc.malloc.trim_threshold` to a very low value, e.g., 1 But: - (a) is not possible for foreign code; note that even within hotspot, where we cluster allocations using hotspot arenas, the typical arena chunk size is too small to be auto-reclaimed - (b) may just not be feasible - (c) is *terrible* for performance since many C-Heap operations will compete over the lock of that single arena - (d) works if either (b) or (c) is in place and if the released block happens to border the top of the arena. And it also costs performance. The JVM only has limited influence on (a), none on (b), (c) is a really bad idea, and hence (d) often does little. That mirrors my practical experiences. ##### How does `malloc_trim()` differ from trimming on free() ? `malloc_trim()` will look for holes that are larger than a page, so it does not limit itself to reclaiming memory at the top of the arena.
It will then `madvise(MADV_DONTNEED)` those holes. It does that for every arena. ##### What are the cons of calling `malloc_trim()`? `malloc_trim()` cannot be interrupted. Once it runs, it runs. The runtime of `malloc_trim()` is not predictable. If there is nothing to reclaim, it is very fast (sub-ms). If there is a lot to reclaim (e.g. >32GB), I saw times of up to 800ms. Moreover, `malloc_trim`, while trimming each arena, locks the arena. That may lock out concurrent C-heap operations in the thread that uses this arena. Note, however, that this is rare since many operations will be satisfied from the tcache and therefore don't lock. ##### What about the `pad` parameter for `malloc_trim()` I found it has very little effect. It only affects how many bytes are preserved at the top of the main arena. It does not affect other arenas, nor does it affect how much space malloc_trim reclaims by releasing "holes", which is the main part of memory release. ### The Patch Patch adds new options (experimental): -XX:+GCTrimNativeHeap -XX:GCTrimNativeHeapInterval= (defaults to 60) `GCTrimNativeHeap` is off by default. If enabled, it will cause the VM to trim the native heap periodically. The period is defined by `GCTrimNativeHeapInterval`. Periodic trimming is done in its own thread. We cannot burden the ServiceThread, since the runtime of trims is unpredictable. The patch also adds a way to suspend trimming temporarily; if suspended, no trims will start, but ongoing trims will still finish. The patch uses this mechanism to suspend trimming during GC STW phases and whenever we are about to do bulk C-heap operations (e.g. deleting deflated monitors). ### Examples: This is an artificial test that causes two high malloc spikes with long idle periods. (yellow) NMT shows two spikes for malloc'ed memory; (red) RSS of the baseline JVM shows that we reach a maximum and then never recover. This is the glibc retaining the free'd memory. 
(blue) RSS of the patched JVM shows that we recover RSS in steps by doing periodic C-heap trimming. ![alloc-test](https://raw.githubusercontent.com/tstuefe/autotrim-experiments/master/alloc-c-heap-repro/basline-vs-autotrim30s.png) (See here for parameters: [run script](http://cr.openjdk.java.net/~stuefe/other/autotrim/run-all.sh)) ### Tests Tested older Glibc (2.31), and newer Glibc (2.35) (`mallinfo()` vs `mallinfo2()`), on Linux x64. Older versions of this patch were routinely tested at SAP for almost half a year. - [1] https://mail.openjdk.org/pipermail/hotspot-dev/2021-August/054323.html - [2] https://bugs.openjdk.org/browse/JDK-8269345 ------------- Commit messages: - Fix test for non-glibc platforms - Initial implementation Changes: https://git.openjdk.org/jdk/pull/14781/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8293114 Stats: 643 lines in 20 files changed: 637 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/14781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781 PR: https://git.openjdk.org/jdk/pull/14781 From jsjolen at openjdk.org Thu Jul 6 10:07:25 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 6 Jul 2023 10:07:25 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 6 Jul 2023 06:54:22 GMT, Thomas Stuefe wrote: > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically.
This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. 
> > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... >And app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. Are you talking about allocations into native memory that a Java application does of its own accord and not as a consequence of the JVM doing its own allocs? For compiling, for example. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1623304064 From stuefe at openjdk.org Thu Jul 6 10:07:25 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 6 Jul 2023 10:07:25 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 6 Jul 2023 09:20:25 GMT, Johan Sjölen wrote: > > And app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. > > Are you talking about allocations into native memory that a Java application does of its own accord and not as a consequence of the JVM doing its own allocs? For compiling, for example. This does not matter. The resulting malloc load is the sum of whatever we do and whatever the native code does. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1623385082 From stuefe at openjdk.org Thu Jul 6 10:07:57 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 6 Jul 2023 10:07:57 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 6 Jul 2023 06:54:22 GMT, Thomas Stuefe wrote: > This is a continuation of https://github.com/openjdk/jdk/pull/10085.
I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? 
> > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... Ping @shipilev and @robehn ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1623397350 From shade at openjdk.org Thu Jul 6 10:10:53 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 6 Jul 2023 10:10:53 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 6 Jul 2023 06:54:22 GMT, Thomas Stuefe wrote: > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. 
> > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... 
I don't think the comments from my yesterday's review were addressed :) There are some old comments (marked with `(Outdated)`, helpfully), but some are new. Please take a look at them? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1623400537 From stuefe at openjdk.org Thu Jul 6 10:16:25 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 6 Jul 2023 10:16:25 GMT Subject: RFR: JDK-8293114: GC should trim the native heap [v11] In-Reply-To: <7q-A34HZnue-oo56RV6hd6hVV3SNHlLL7wx-GQ6scm8=.20e79884-9add-4954-ac4a-1ea825c9b458@github.com> References: <23KpPM4oPV6F1nz3g5CvIqvuX-ANcsMH4GuVNXjR-Lw=.b8d0fa2d-bb85-4899-8e21-f68ea64b988d@github.com> <7q-A34HZnue-oo56RV6hd6hVV3SNHlLL7wx-GQ6scm8=.20e79884-9add-4954-ac4a-1ea825c9b458@github.com> Message-ID: <70PUTZhWuZYyoyF9lLQum0eXAGpNKWcnx7Q_D8vUi4A=.110c85d6-37b0-4440-bdb8-107e2e487738@github.com> On Wed, 5 Jul 2023 17:45:37 GMT, Aleksey Shipilev wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 41 commits: >> >> - fix windows build >> - Merge branch 'master' into JDK-8293114-GC-trim-native >> - wip >> - Merge branch 'master' into JDK-8293114-GC-trim-native >> - wip >> - Remove adaptive stepdown coding >> - Merge master >> - wip >> - Merge branch 'master' into JDK-8293114-GC-trim-native >> - wip >> - ... and 31 more: https://git.openjdk.org/jdk/compare/22e17c29...162b880a > > src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 918: > >> 916: } >> 917: >> 918: TrimNative::PauseMark trim_native_pause("gc"); > > Maybe instead of whack-a-mole game of putting the pause in most GC safepoint ops, we should "just" put this mark straight into `VMOperation`, so that _all_ safepoint ops are covered? Or, if we want to limit to GC ops, `VM_GC_Operation`? Oh, this is a good idea. 
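To make the idea concrete, here is a minimal sketch of what a scoped pause mark could look like if it were placed once in a common base operation instead of in each GC entry point. This is an illustration only, not the code in the PR: `TrimPauseMark`, `trim_is_paused` and the counter are hypothetical names, and the real `TrimNative::PauseMark` may be implemented differently.

```cpp
#include <atomic>

// Hypothetical pause counter (illustration only): a value above zero means
// that no new trim should be started.
std::atomic<int> g_trim_pause_count{0};

// RAII mark: pauses trimming for the lifetime of the object. Marks may nest,
// since each one only increments and later decrements the counter.
class TrimPauseMark {
public:
  TrimPauseMark()  { g_trim_pause_count.fetch_add(1); }
  ~TrimPauseMark() { g_trim_pause_count.fetch_sub(1); }
  TrimPauseMark(const TrimPauseMark&) = delete;
  TrimPauseMark& operator=(const TrimPauseMark&) = delete;
};

// Queried by the trimmer before it starts a new trim.
bool trim_is_paused() { return g_trim_pause_count.load() > 0; }
```

With such a mark as a member or local of the shared operation class, every derived operation would be covered without touching the individual call sites.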
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/10085#discussion_r1254232411 From stuefe at openjdk.org Thu Jul 6 10:49:55 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 6 Jul 2023 10:49:55 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 6 Jul 2023 10:08:03 GMT, Aleksey Shipilev wrote: > I don't think the comments from my yesterday's review were addressed :) There are some old comments (marked with `(Outdated)`, helpfully), but some are new. Please take a look at them? Of course, sorry. Seems I missed most of them. About Pause in all VM ops, an alternative would be to just check in the trimmer if we are at safepoint, and if yes treat it as pause. I'll see if that's easier (I'm worried about pulling a mutex or atomic increasing the pauser variable in every VM op we run). ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1623455016 From shade at openjdk.org Thu Jul 6 10:56:52 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 6 Jul 2023 10:56:52 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 6 Jul 2023 10:47:25 GMT, Thomas Stuefe wrote: > About Pause in all VM ops, an alternative would be to just check in the trimmer if we are at safepoint, and if yes treat it as pause. I'll see if that's easier (I'm worried about pulling a mutex or atomic increasing the pauser variable in every VM op we run). Oh yes, I like it. We can just check `SafepointSynchronize::is_at_safepoint()` and `SafepointSynchronize::is_synchronizing()`. Would be even better, because we could stop trimming when there is safepoint sync pending. Makes the whole thing much less intrusive. 
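A minimal sketch of that polling approach, with the safepoint state mocked so the snippet stands alone. Everything here is hypothetical: in the VM the check would call `SafepointSynchronize::is_at_safepoint()` and `SafepointSynchronize::is_synchronizing()` rather than read the stand-in variable below, and `trim_if_allowed` is not a function from the patch.

```cpp
#include <atomic>

// Stand-in for the VM's safepoint state (assumption for illustration only).
enum class MockSafepointState { not_synchronized, synchronizing, synchronized };
std::atomic<MockSafepointState> g_mock_safepoint_state{
    MockSafepointState::not_synchronized};

// True if a safepoint is in progress or is currently being requested.
bool safepoint_active_or_pending() {
  return g_mock_safepoint_state.load() != MockSafepointState::not_synchronized;
}

// One iteration of a periodic trimmer: skips the trim while a safepoint is
// active or pending. Returns true if do_trim was actually invoked.
template <typename TrimFn>
bool trim_if_allowed(TrimFn do_trim) {
  if (safepoint_active_or_pending()) {
    return false;
  }
  do_trim();
  return true;
}
```

The trimmer thread pays for one load per iteration, and VM operations need no extra bookkeeping, which is the appeal over taking a mutex or bumping a counter in every operation.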
------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1623462984 From jsjolen at openjdk.org Thu Jul 6 11:00:54 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 6 Jul 2023 11:00:54 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 6 Jul 2023 06:54:22 GMT, Thomas Stuefe wrote: > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`).
We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... src/hotspot/share/memory/arena.cpp line 105: > 103: } > 104: return true; > 105: } Something seems wrong here? Having only empty pools means that `::prune()` is a no-op. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254275721 From jpbempel at openjdk.org Thu Jul 6 11:05:54 2023 From: jpbempel at openjdk.org (Jean-Philippe Bempel) Date: Thu, 6 Jul 2023 11:05:54 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform In-Reply-To: References: Message-ID: <2IK4-mDcKnsZr6krAGeX1Yi_8WmTL9TMB5JCovHhKCU=.6b24c922-a875-4320-a0b4-62521dd36909@github.com> On Thu, 6 Jul 2023 06:45:15 GMT, David Holmes wrote: > Sorry I'm having a lot of trouble trying to understand the fix in the context of the problem description as outlined in the bug report. The issue pertained only to the treatment of Throwable due to it being pre-resolved by the verifier, but your fix is looking at Field and MethodRefs ?? For the Klass in itself, there is this method `is_unresolved_class_mismatch` that compares it correctly if entry differs by resolved and unresolved state. Except that for MethodRef and FieldRef, the Klass here is also compared but strictly without taking into account the already resolved state (in this case of the Throwable). So that's why I am adding the call to `is_unresolved_class_mismatch` for those cases. > There are also remaining comments about resolved and unresolved class entries deliberately not being considered the same. Sorry I don't understand what you mean, can you elaborate please? > Has this been moved verbatim from jvmtiRedefineClasses.cpp? yes > There are a couple of style nits with that existing code that don't fit this file: > > * parameters should line up ie. const under int > * no comment on the closing brace of the method body. 
Will adjust accordingly ------------- PR Comment: https://git.openjdk.org/jdk/pull/14780#issuecomment-1623474415 PR Review Comment: https://git.openjdk.org/jdk/pull/14780#discussion_r1254281202 From rkennke at openjdk.org Thu Jul 6 12:02:29 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 6 Jul 2023 12:02:29 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v43] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. > > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: More reverts of non-essential changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/6ea90df0..c263e362 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=42 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=41-42 Stats: 55 lines in 11 files changed: 14 ins; 27 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From djelinski at openjdk.org Thu Jul 6 12:09:06 2023 
From: djelinski at openjdk.org (Daniel Jeliński) Date: Thu, 6 Jul 2023 12:09:06 GMT Subject: RFR: 8311575: Fix invalid format parameters Message-ID: <-9xQE9UAMLnueaceqy4IhaMqprnPU-UAHBvPpRfqRgo=.f6810a10-7259-4781-87ce-b3bafbbb2767@github.com> Please review this change that fixes a few issues with printf-like function parameters. No new tests. I'll run tier1 & tier2 before integrating, just to make sure. ------------- Commit messages: - Fix invalid format arguments Changes: https://git.openjdk.org/jdk/pull/14783/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14783&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311575 Stats: 5 lines in 3 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/14783.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14783/head:pull/14783 PR: https://git.openjdk.org/jdk/pull/14783 From rkennke at openjdk.org Thu Jul 6 12:22:27 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 6 Jul 2023 12:22:27 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v44] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16.
> > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Fix copyrights - More reverts of non-essential stuff ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/c263e362..be7e6e6d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=43 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=42-43 Stats: 65 lines in 8 files changed: 20 ins; 30 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From stuefe at openjdk.org Thu Jul 6 12:24:53 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 6 Jul 2023 12:24:53 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 6 Jul 2023 10:58:01 GMT, Johan Sj?len wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. 
Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > src/hotspot/share/memory/arena.cpp line 105: > >> 103: } >> 104: return true; >> 105: } > > Something seems wrong here? 
Having only empty pools means that `::prune()` is a no-op. Thanks for catching this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254357340 From stuefe at openjdk.org Thu Jul 6 12:24:55 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 6 Jul 2023 12:24:55 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 6 Jul 2023 12:20:10 GMT, Thomas Stuefe wrote: >> src/hotspot/share/memory/arena.cpp line 105: >> >>> 103: } >>> 104: return true; >>> 105: } >> >> Something seems wrong here? Having only empty pools means that `::prune()` is a no-op. > > Thanks for catching this. All tests green though - somewhat scary that we don't have a test for that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254358962 From rkennke at openjdk.org Thu Jul 6 12:42:26 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 6 Jul 2023 12:42:26 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v45] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. 
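The gap arithmetic in the quoted description can be illustrated with a minimal sketch. This is a simplified model, not HotSpot code: `align_up` and `array_base_offset` are hypothetical helpers, and the 4-byte length field, the 8-byte word, and the length offset of 16 used in the usage note (8-byte mark word plus 8-byte uncompressed Klass pointer) are assumptions about a 64-bit layout.

```cpp
#include <cstddef>

// Round `value` up to the next multiple of `alignment` (a power of two).
constexpr std::size_t align_up(std::size_t value, std::size_t alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

// Illustrative model only: offset of the first array element, given where
// the length field starts and how the element base is aligned.
constexpr std::size_t array_base_offset(std::size_t length_offset,
                                        std::size_t element_size,
                                        bool word_aligned_base) {
  return align_up(length_offset + 4 /* assumed 4-byte length field */,
                  word_aligned_base ? 8 /* assumed 64-bit word */ : element_size);
}

// Layout from the description: length at offset 8, so the header ends at 12.
static_assert(array_base_offset(8, 1, true) == 16,
              "word alignment leaves a gap between offset 12 and 16");
static_assert(array_base_offset(8, 1, false) == 12,
              "byte elements can start right after the length field");
```

Under the same assumptions, a length field at offset 16 gives a base of 24 with word alignment and 20 for 4-byte elements, while 8-byte elements still start at 24.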
> > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: More copyright fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/be7e6e6d..36844dc7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=44 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=43-44 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From shade at openjdk.org Thu Jul 6 12:54:20 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 6 Jul 2023 12:54:20 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v45] In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 12:42:26 GMT, Roman Kennke wrote: >> See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. >> >> Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. >> >> Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. 
>> >> Testing: >> - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] tier1 (x86_64, x86_32, aarch64, riscv) >> - [x] tier2 (x86_64, aarch64, riscv) >> - [x] tier3 (x86_64, riscv) > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > More copyright fixes I like how much smaller this changeset got! I have no other comments, except a few stylistic nits: src/hotspot/cpu/aarch64/c1_MacroAssembler_aarch64.cpp line 196: > 194: if (len->is_valid()) { > 195: strw(len, Address(obj, arrayOopDesc::length_offset_in_bytes())); > 196: if (!is_aligned(arrayOopDesc::header_size_in_bytes(), BytesPerLong)) { `BytesPerLong` or `BytesPerWord`? Later code aligns for `BytesPerWord`. src/hotspot/cpu/x86/c1_MacroAssembler_x86.cpp line 182: > 180: movl(Address(obj, arrayOopDesc::length_offset_in_bytes()), len); > 181: #ifdef _LP64 > 182: if (!is_aligned(arrayOopDesc::header_size_in_bytes(), BytesPerLong)) { `BytesPerWord` here, to be consistent with other assemblers? ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/11044#pullrequestreview-1516467837 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1254390868 PR Review Comment: https://git.openjdk.org/jdk/pull/11044#discussion_r1254388809 From stuefe at openjdk.org Thu Jul 6 13:01:15 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 6 Jul 2023 13:01:15 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v2] In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: > This is a continuation of https://github.com/openjdk/jdk/pull/10085. 
I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? 
> > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... Thomas Stuefe has updated the pull request incrementally with three additional commits since the last revision: - seconds->ms - rename pause, unpause -> suspend, resume - fix ChunkPool::needs_cleaning ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14781/files - new: https://git.openjdk.org/jdk/pull/14781/files/9ba4448e..dd4cb477 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=00-01 Stats: 47 lines in 15 files changed: 6 ins; 0 del; 41 mod Patch: https://git.openjdk.org/jdk/pull/14781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781 PR: https://git.openjdk.org/jdk/pull/14781 From cslucas at openjdk.org Thu Jul 6 13:06:30 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 6 Jul 2023 13:06:30 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v21] In-Reply-To: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> References: 
<7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: > Can I please get reviews for this PR? > > The most common and frequent use of NonEscaping Phis merging object allocations is for debugging information. The two graphs below show numbers for Renaissance and DaCapo benchmarks - similar results are obtained for all other applications that I tested. > > With what frequency does each IR node type occurs as an allocation merge user? I.e., if the same node type uses a Phi N times the counter is incremented by N: > > ![image](https://user-images.githubusercontent.com/2249648/222280517-4dcf5871-2564-4207-b49e-22aee47fa49d.png) > > What are the most common users of allocation merges? I.e., if the same node type uses a Phi N times the counter is incremented by 1: > > ![image](https://user-images.githubusercontent.com/2249648/222280608-ca742a4e-1622-4e69-a778-e4db6805ea02.png) > > This PR adds support scalar replacing allocations participating in merges used as debug information OR as a base for field loads. I plan to create subsequent PRs to enable scalar replacement of merges used by other node types (CmpP is next on the list) subsequently. > > The approach I used for _rematerialization_ is pretty straightforward. It consists basically of the following. 1) New IR node (suggested by V. Kozlov), named SafePointScalarMergeNode, to represent a set of SafePointScalarObjectNode; 2) Each scalar replaceable input participating in a merge will get a SafePointScalarObjectNode like if it weren't part of a merge. 3) Add a new Class to support the rematerialization of SR objects that are part of a merge; 4) Patch HotSpot to be able to serialize and deserialize debug information related to allocation merges; 5) Patch C2 to generate unique types for SR objects participating in some allocation merges. 
> > The approach I used for _enabling the scalar replacement of some of the inputs of the allocation merge_ is also pretty straightforward: call `MemNode::split_through_phi` to, well, split AddP->Load* through the merge which will render the Phi useless. > > I tested this with JTREG tests tier 1-4 (Windows, Linux, and Mac) and didn't see regression. I also experimented with several applications and didn't see any failure. I also ran tests with "-ea -esa -Xbatch -Xcomp -XX:+UnlockExperimentalVMOptions -XX:-TieredCompilation -server -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions -XX:+StressLCM -XX:+StressGCM -XX:+StressCCP" and didn't observe any related failures. Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: - Merge branch 'openjdk:master' into rematerialization-of-merges - Addressing PR feedback. - Merge remote-tracking branch 'origin/master' into rematerialization-of-merges - Merge branch 'openjdk:master' into rematerialization-of-merges - Rome minor refactorings. - Merge remote-tracking branch 'origin/master' into rematerialization-of-merges Catching up with master. - Address PR review 6: debug format output & some refactoring. - Catching up with master branch. Merge remote-tracking branch 'origin/master' into rematerialization-of-merges - Address PR review 6: refactoring around rematerialization & improve test cases. - Address PR review 5: refactor on rematerialization & add tests. - ... 
and 12 more: https://git.openjdk.org/jdk/compare/97e99f01...25b683d6 ------------- Changes: https://git.openjdk.org/jdk/pull/12897/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12897&range=20 Stats: 2733 lines in 26 files changed: 2485 ins; 108 del; 140 mod Patch: https://git.openjdk.org/jdk/pull/12897.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/12897/head:pull/12897 PR: https://git.openjdk.org/jdk/pull/12897 From cslucas at openjdk.org Thu Jul 6 13:06:30 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 6 Jul 2023 13:06:30 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v18] In-Reply-To: <72OcyhmFKGyTwDy8LQ0blp5HG5dg5l9OsU5dh9osVxo=.73b3a79e-ff24-4f41-b39b-650a9036ee76@github.com> References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> <-A7bd8C0q5o1WuRSeSkYYnUoApV4s9uijPmiNB2Wteo=.c5bc944c-88a3-4228-bd41-091ac6c8fb1d@github.com> <72OcyhmFKGyTwDy8LQ0blp5HG5dg5l9OsU5dh9osVxo=.73b3a79e-ff24-4f41-b39b-650a9036ee76@github.com> Message-ID: <1gB4pzC79wZ9fs7t5eWE4yTlyYkz4oK1K36wc7MWgBo=.cc02b3a7-3b4d-4909-8013-746008f50058@github.com> On Tue, 20 Jun 2023 16:44:28 GMT, Vladimir Ivanov wrote: >> Thank you once more for the comments @iwanowww . I'll address them asap. >> >> Can I ask what requirements are there for a product flag? > >> Can I ask what requirements are there for a product flag? > > Product flags are treated as part of public API of the JVM. So, changes in behavior have to go through CSR process. Also, a product flag has to be deprecated/obsoleted first before it can be removed which takes multiple releases to happen. Better to avoid introducing new product flags unless it is well-justified or necessary. @iwanowww - I believe I've addressed all your comments so far. Is everything still looking good?
------------- PR Comment: https://git.openjdk.org/jdk/pull/12897#issuecomment-1623641674 From rkennke at openjdk.org Thu Jul 6 13:13:38 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 6 Jul 2023 13:13:38 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v46] In-Reply-To: References: Message-ID: <8qZTCEK9bHrm7JUyNG-g10fLXh-4FUhDWFUAOCgTytw=.10f6433f-97a6-4402-9915-adce7fda07e4@github.com> > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. > > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Revert arm changes, they have been cosmetic and are not needed ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/36844dc7..594f9ee1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=45 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=44-45 Stats: 10 lines in 3 files changed: 1 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From rkennke at openjdk.org Thu Jul 6 13:19:10 
2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 6 Jul 2023 13:19:10 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v47] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. > > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with three additional commits since the last revision: - Use BytesPerWord - Revert unnecessary change in s390 - Revert unnecessary change in PPC ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/594f9ee1..ad244e42 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=46 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=45-46 Stats: 4 lines in 4 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From rkennke at openjdk.org Thu Jul 6 13:24:14 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Thu, 6 Jul 2023 13:24:14 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v47] In-Reply-To: References: Message-ID: On 
Thu, 6 Jul 2023 13:19:10 GMT, Roman Kennke wrote: >> See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. >> >> Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. >> >> Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. >> >> Testing: >> - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] tier1 (x86_64, x86_32, aarch64, riscv) >> - [x] tier2 (x86_64, aarch64, riscv) >> - [x] tier3 (x86_64, riscv) > > Roman Kennke has updated the pull request incrementally with three additional commits since the last revision: > > - Use BytesPerWord > - Revert unnecessary change in s390 > - Revert unnecessary change in PPC I've simplified the PR significantly: - The gap is now usually cleared as part of the header initialization (as is already done for instance oops). This allows to use simple and fast word-sized clearing of the rest of the array. - Calculations of min and max array and tlab sizes are all reverted, because they are still conservatively correct. Optimizing those for a few bytes extra seems to be a rather complex task and should be done as separate PR. - The ARM parts could be reverted wholesale (sorry, @tstuefe) because 32 bit arch doesn't require any changes anymore. @RealFYang can you please check the RISCV port and bring it into the same shape as aarch64/x86? @TheRealMDoerr Can you ack PPC and s390 ports? I've only done very minor cleanups there, compared to early version. @coleenp please review again? 
Maybe bring others in as you see fit and/or run Mach5 testing (preferably also with -XX:-UseCompressedClassPointers, because that is what this PR is about) ------------- PR Comment: https://git.openjdk.org/jdk/pull/11044#issuecomment-1623674137 From coleenp at openjdk.org Thu Jul 6 13:38:58 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 6 Jul 2023 13:38:58 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 05:18:01 GMT, Jean-Philippe Bempel wrote: > Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in an unresolved state. So the constant pool merging process considered the entry with the pre-resolved class as different from the destination and created a new entry. We now try to consider it as equal, especially for Methodref/Fieldref. I wonder if compare_entry_to can just handle this case directly and not have this function. src/hotspot/share/oops/constantPool.cpp line 1302: > 1300: if (strcmp(s1, s2) != 0) { > 1301: return false; // strings don't match; not our special case > 1302: } klass_name_at() returns a Symbol* that's in the SymbolTable so this never needed to do a strcmp. It can test for equality. ------------- Changes requested by coleenp (Reviewer).
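The point of the review comment above, that names held in a symbol table can be compared by identity rather than with `strcmp`, can be sketched as follows. `InternTable` and `same_name` are hypothetical stand-ins written for illustration; they are not HotSpot's `SymbolTable` or `Symbol` API, and the sketch only assumes that each distinct name is stored exactly once.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for an interning table: every distinct string is
// stored once, and intern() returns a pointer to that single stored copy.
class InternTable {
 public:
  const std::string* intern(const std::string& s) {
    // insert() yields the already-present element if `s` was interned before.
    // Pointers to unordered_set elements stay valid across rehashing.
    return &*_symbols.insert(s).first;
  }

 private:
  std::unordered_set<std::string> _symbols;
};

// With interned names, an identity check replaces a character-wise compare.
inline bool same_name(const std::string* a, const std::string* b) {
  return a == b;
}

// Two separately built but equal names resolve to the same interned pointer.
inline void demo_interned_equality() {
  InternTable table;
  const std::string* a = table.intern("java/lang/Throwable");
  const std::string* b = table.intern(std::string("java/lang/") + "Throwable");
  assert(same_name(a, b));
}
```

The design trade-off is that the cost of comparing characters is paid once at interning time, so later comparisons are a single pointer check.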
PR Review: https://git.openjdk.org/jdk/pull/14780#pullrequestreview-1516550061 PR Review Comment: https://git.openjdk.org/jdk/pull/14780#discussion_r1254440703 From stuefe at openjdk.org Thu Jul 6 15:11:56 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 6 Jul 2023 15:11:56 GMT Subject: RFR: JDK-8310233: Linux: THP initialization is incorrect [v3] In-Reply-To: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> References: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> Message-ID: <5jcCSQifCgROrn4kH7aXsXchFSbSQJOaph9s5_xiJaM=.8568e0d8-cde1-4308-a5c6-4dd0edd6cdd5@github.com> On Tue, 4 Jul 2023 13:02:22 GMT, Thomas Stuefe wrote: >> Today, if we use UseTransparentHugePages, we assume that the static hugepage detection we do is valid for THPs: >> - that THPs use the page size (in hotspot used as "default large page size") found in /proc/meminfo (Hugepagesize) >> - that THPs are enabled if that page size is >0. >> >> Both assumptions are incorrect. THPs are enabled depending on the mode in `/sys/kernel/mm/transparent_hugepage/enabled` (tri-state). And the pagesize used by khugepaged is the one set in `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size`. The latter can differ from the default large page size on the system (e.g. static hugepage default size could be 1g, whereas THP hugepage size is 2m). >> >> ------ >> >> About the patch: >> >> This is a limited, minimally invasive patch to fix THP detection. The patch aims to be easy to downport. There is more work to do, which I will do in subsequent RFEs. >> >> Functionally, for *static* (non-THP) pages nothing changes. THP-mode now correctly detects THP support in the OS, and uses the correct page size (see examples below).
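The tri-state check described above can be sketched as a small parser. This is a minimal illustration, not the code in the patch: `THPMode`, `parse_thp_mode`, and `thp_usable` are hypothetical names, and the sketch assumes the kernel's usual format for `/sys/kernel/mm/transparent_hugepage/enabled`, where the active mode is the bracketed word (for example `always madvise [never]`). Reading the file itself is left out.

```cpp
#include <cassert>
#include <string>

enum class THPMode { always, madvise, never, unknown };

// Extract the active mode from the file content; the active mode is the
// word enclosed in square brackets.
inline THPMode parse_thp_mode(const std::string& text) {
  const std::string::size_type open = text.find('[');
  if (open == std::string::npos) return THPMode::unknown;
  const std::string::size_type close = text.find(']', open + 1);
  if (close == std::string::npos) return THPMode::unknown;
  const std::string mode = text.substr(open + 1, close - open - 1);
  if (mode == "always")  return THPMode::always;
  if (mode == "madvise") return THPMode::madvise;
  if (mode == "never")   return THPMode::never;
  return THPMode::unknown;
}

// A madvise()-based consumer can use THPs unless the mode is "never"
// (or could not be determined).
inline bool thp_usable(THPMode mode) {
  return mode == THPMode::always || mode == THPMode::madvise;
}

// A system with THPs disabled must not be treated as THP-capable.
inline void demo_thp_disabled() {
  assert(!thp_usable(parse_thp_mode("always madvise [never]")));
}
```

The THP page size would be read separately from `hpage_pmd_size`, since, as the description notes, it can differ from the static default hugepage size.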
>> >> ------------- >> >> Example 1: System has THPs disabled, but static hugepages (1g, 2m) configured: >> >> >> thomas at starfish $ cat /sys/kernel/mm/transparent_hugepage/enabled >> always madvise [never] >> thomas at starfish $ cat /proc/meminfo | grep Hugepage >> Hugepagesize: 1048576 kB >> >> >> Without patch, we incorrectly assume THPs are enabled, and that THP page size is 1G (!), which we then proceed and use as heap page size, causing the heap size to be rounded up from 512m -> 1G. But - even though it is printed as "1G page backed" in log output - in reality it will still 4K-page-backed: the `madvise(2)` we use to set the THP page size will be ignored by the system, since THPs are disabled. >> >> >> thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version >> [0.001s][info][pagesize] Using the default large page size: 1G >> [0.001s][info][pagesize] Usable page sizes: 4k, 2M, 1G >> ... >> [0.016s][info][pagesize] Heap: min=1G max=1G base=0x00000000c0000000 size=1G page_size=1G >> >> >> With patch, we correctly refuse to use large pages (and we log more info): >> >> >> thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version >> [0.001s][info][pagesize] Static hugepage support: 2M, 1G (default) >> [0.001s][info][pagesize] default pagesize: 1G >> [0.001s][info][pagesize] Transparent hugepage (THP) suppor... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Feedback johan Gentle ping, second review needed. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14739#issuecomment-1623851429 From stuefe at openjdk.org Thu Jul 6 15:19:24 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 6 Jul 2023 15:19:24 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v3] In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: <0jVrtltVHAxcPHn11Co5KBr_iNsFtgidx4svffvuF7Q=.9c17931d-184c-48b3-a270-1ed16d86e801@github.com> > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). 
We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... 
Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: restructuring ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14781/files - new: https://git.openjdk.org/jdk/pull/14781/files/dd4cb477..cd8c0f1b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=01-02 Stats: 234 lines in 18 files changed: 88 ins; 106 del; 40 mod Patch: https://git.openjdk.org/jdk/pull/14781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781 PR: https://git.openjdk.org/jdk/pull/14781 From stuefe at openjdk.org Thu Jul 6 15:25:03 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 6 Jul 2023 15:25:03 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v4] In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. 
Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... 
Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: last cleanups and shade feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14781/files - new: https://git.openjdk.org/jdk/pull/14781/files/cd8c0f1b..9b47c64b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=02-03 Stats: 7 lines in 3 files changed: 0 ins; 3 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/14781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781 PR: https://git.openjdk.org/jdk/pull/14781 From stuefe at openjdk.org Thu Jul 6 15:29:56 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 6 Jul 2023 15:29:56 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 6 Jul 2023 09:20:25 GMT, Johan Sjölen wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc.
It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > >>And app's malloc load can fluctuate wildly, with temporary spikes and long idle periods. > > Are you talking about allocations into native memory that a Java application does on its own accord and not as a consequence of the JVM doing its own allocs? For compiling, for example. 
@jdksjolen @shipilev

New version:

- I removed all manual pause calls from all GC code and replaced them with simply not trimming when at or near a safepoint. This is less invasive, since I expect the typical trim interval to be much larger than the interval at which we run our VM operations.
- The pauses in runtime code remain.
- I restored the original arena coding and just added the pause. Though this coding could greatly benefit from cleanups, I want to keep this patch focused and easy to downport.
- Since we no longer have close ties to GC coding, the tests don't need to run per GC, so I simplified the tests.

@shipilev: I kept the SuspendMark inside TrimNative, because I like it that way. Otherwise, I would name it something like TrimNativeSuspendMark, so nothing would be gained but another symbol at global scope.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1623878277

From coleenp at openjdk.org Thu Jul 6 15:32:01 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Thu, 6 Jul 2023 15:32:01 GMT
Subject: [jdk21] RFR: 8309140: ResourceHashtable failed "assert(~(_allocation_t[0] | allocation_mask) == (uintptr_t)this) failed: lost resource object"
Message-ID:

This is a clean backport of JDK-[8309140](https://bugs.openjdk.org/browse/JDK-8309140)

-------------

Commit messages:
 - Backport b6c789faad63f18e17ee7e5cefd024b3776fd469

Changes: https://git.openjdk.org/jdk21/pull/102/files
Webrev: https://webrevs.openjdk.org/?repo=jdk21&pr=102&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8309140
Stats: 99 lines in 11 files changed: 41 ins; 1 del; 57 mod
Patch: https://git.openjdk.org/jdk21/pull/102.diff
Fetch: git fetch https://git.openjdk.org/jdk21.git pull/102/head:pull/102

PR: https://git.openjdk.org/jdk21/pull/102

From lkorinth at openjdk.org Thu Jul 6 15:37:58 2023
From: lkorinth at openjdk.org (Leo Korinth)
Date: Thu, 6 Jul 2023 15:37:58 GMT
Subject: RFR: JDK-8293114: JVM should trim the native heap [v2]
In-Reply-To:
References:
<3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 6 Jul 2023 13:01:15 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? 
>> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with three additional commits since the last revision: > > - seconds->ms > - rename pause, unpause -> suspend, resume > - fix ChunkPool::needs_cleaning src/hotspot/share/runtime/trimNative.cpp line 22: > 20: * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA > 21: * or visit www.oracle.com if you need additional information or have any > 22: * questioSns. This copyright notice differs in "questioSns" (extra 'S') and lack of "DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. 
", might create problems for Oracle scripts ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254611231 From luhenry at openjdk.org Thu Jul 6 15:39:59 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Thu, 6 Jul 2023 15:39:59 GMT Subject: RFR: 8310949: RISC-V: Initialize UseUnalignedAccesses [v5] In-Reply-To: <1a4johJN_ATjrZQJ-Lbwb5wntLJmOonQEak_UAlAC5c=.82b4d3e8-32fb-4708-af7c-9819c11d2ea1@github.com> References: <1a4johJN_ATjrZQJ-Lbwb5wntLJmOonQEak_UAlAC5c=.82b4d3e8-32fb-4708-af7c-9819c11d2ea1@github.com> Message-ID: On Thu, 29 Jun 2023 05:49:24 GMT, Vladimir Kempik wrote: >>> > One case to consider: lets say I have a system with X big cores ( which support MISALIGNED_FAST) and X small cores ( with MISALIGNED_EMU) >>> > if I run some java workload on all cores, what should hw_prober return ? obvious result here is to use +AvoidUnallignedAccesses. >>> > If I run same java workload but use taskset to run it only on big cores, how will jdk's hw_prober code work ? should it work properly and disable AvoidUnallignedAccesses or it's too much and one need to manually set -XX:-AvoidUnallignedAccesses ? >>> >>> FYI: The only way today using hwprobe in example above is to query each cpu individually to find the cpu set that have fast and the cpu set have emulated. (or vector or any other extension which may differ) I have proposed that hwprobe should be able to also return a cpu set for some set of features. To either inform user of what affinity they should use, or if we want change the affinity of the VM automagically. >> >> I think Robbin brought this up at some meeting type thing, but IMO that's a pretty reasonable ask. We hadn't thought of it when writing the syscall, but we've got some flags for extensibility so I think we could make it work. >> >> Another option might be to tie this to some hueristics in userspace, maybe probing along the CPU topology or something. 
There's been some vague discussions about having a hwprobe userspace library to handle things like bit->string mappings, maybe we should just have it do this too? > > Hello @palmer-dabbelt, I meant I don't know how this thing should work "properly" when we have cpus with different capabilities (regarding misaligned access) and affinity manually set to misaligned_fast cores. @VladimirKempik the `riscv_hwprobe` is going to return the intersection of the features supported by all cores. In the case of a big.LITTLE set of cores, if the LITTLE cores don't support fast unaligned accesses, then `riscv_hwprobe` will return `SLOW` or `UNSUPPORTED` and `UseUnalignedAccesses` will then be false. That keeps us in the safer case where we will not generate unaligned accesses if we are not guaranteed that all cores support fast unaligned accesses. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14676#issuecomment-1623893449 From lkorinth at openjdk.org Thu Jul 6 15:41:57 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Thu, 6 Jul 2023 15:41:57 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v4] In-Reply-To: <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> Message-ID: On Thu, 6 Jul 2023 15:25:03 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. 
It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. 
In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > last cleanups and shade feedback The description says `-XX:GCTrimNativeHeapInterval= (defaults to 60)`, but the code says milliseconds. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1623895924 From iklam at openjdk.org Thu Jul 6 15:45:56 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 6 Jul 2023 15:45:56 GMT Subject: [jdk21] RFR: 8309140: ResourceHashtable failed "assert(~(_allocation_t[0] | allocation_mask) == (uintptr_t)this) failed: lost resource object" In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 15:24:45 GMT, Coleen Phillimore wrote: > This is a clean backport of JDK-[8309140](https://bugs.openjdk.org/browse/JDK-8309140) LGTM ------------- Marked as reviewed by iklam (Reviewer). PR Review: https://git.openjdk.org/jdk21/pull/102#pullrequestreview-1516836891 From vkempik at openjdk.org Thu Jul 6 15:48:58 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Thu, 6 Jul 2023 15:48:58 GMT Subject: RFR: 8310949: RISC-V: Initialize UseUnalignedAccesses [v5] In-Reply-To: <1a4johJN_ATjrZQJ-Lbwb5wntLJmOonQEak_UAlAC5c=.82b4d3e8-32fb-4708-af7c-9819c11d2ea1@github.com> References: <1a4johJN_ATjrZQJ-Lbwb5wntLJmOonQEak_UAlAC5c=.82b4d3e8-32fb-4708-af7c-9819c11d2ea1@github.com> Message-ID: On Thu, 29 Jun 2023 05:49:24 GMT, Vladimir Kempik wrote: >>> > One case to consider: lets say I have a system with X big cores ( which support MISALIGNED_FAST) and X small cores ( with MISALIGNED_EMU) >>> > if I run some java workload on all cores, what should hw_prober return ? obvious result here is to use +AvoidUnallignedAccesses. >>> > If I run same java workload but use taskset to run it only on big cores, how will jdk's hw_prober code work ? 
should it work properly and disable AvoidUnallignedAccesses or it's too much and one need to manually set -XX:-AvoidUnallignedAccesses ? >>> >>> FYI: The only way today using hwprobe in example above is to query each cpu individually to find the cpu set that have fast and the cpu set have emulated. (or vector or any other extension which may differ) I have proposed that hwprobe should be able to also return a cpu set for some set of features. To either inform user of what affinity they should use, or if we want change the affinity of the VM automagically. >> >> I think Robbin brought this up at some meeting type thing, but IMO that's a pretty reasonable ask. We hadn't thought of it when writing the syscall, but we've got some flags for extensibility so I think we could make it work. >> >> Another option might be to tie this to some hueristics in userspace, maybe probing along the CPU topology or something. There's been some vague discussions about having a hwprobe userspace library to handle things like bit->string mappings, maybe we should just have it do this too? > > Hello @palmer-dabbelt, I meant I don't know how this thing should work "properly" when we have cpus with different capabilities (regarding misaligned access) and affinity manually set to misaligned_fast cores. > @VladimirKempik the `riscv_hwprobe` is going to return the intersection of the features supported by all cores. In the case of a big.LITTLE set of cores, if the LITTLE cores don't support fast unaligned accesses, then `riscv_hwprobe` will return `SLOW` or `UNSUPPORTED` and `UseUnalignedAccesses` will then be false. > > That keeps us in the safer case where we will not generate unaligned accesses if we are not guaranteed that all cores support fast unaligned accesses. > > Hello Ludovic, that sounds good, thanks. 
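To make the conservative rule discussed in this thread concrete, here is a small sketch. It is an assumption-laden model, not the kernel interface or the JDK code: the `Misaligned` enum and `use_unaligned_accesses` are hypothetical names, and the per-core values are supplied by the caller rather than obtained from `riscv_hwprobe`.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical model of per-core misaligned-access performance.
// These names are illustrative and are not the kernel's hwprobe constants.
enum class Misaligned { Unsupported, Emulated, Slow, Fast };

// Conservative rule: enable unaligned accesses only if every core the
// process may run on reports Fast. An empty set is treated as "unknown"
// and therefore yields false.
bool use_unaligned_accesses(const std::vector<Misaligned>& cores) {
  if (cores.empty()) return false;
  return std::all_of(cores.begin(), cores.end(),
                     [](Misaligned m) { return m == Misaligned::Fast; });
}
```

With a mixed set such as `{Fast, Emulated}` the function returns false. If the set passed in contains only the big cores (the taskset case raised earlier), it returns true, which is exactly the per-cpu-set query that the plain intersection cannot answer.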
------------- PR Comment: https://git.openjdk.org/jdk/pull/14676#issuecomment-1623906174 From lkorinth at openjdk.org Thu Jul 6 16:52:56 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Thu, 6 Jul 2023 16:52:56 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v4] In-Reply-To: <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> Message-ID: <1Rg80DDM3zyG6aKEyW1EAOgOm7wnWIZVlazod7hyx1U=.ff4d81ff-a72d-452e-8b4f-82b9bc7fe473@github.com> On Thu, 6 Jul 2023 15:25:03 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. 
With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > last cleanups and shade feedback src/hotspot/share/runtime/trimNative.cpp line 42: > 40: class NativeTrimmerThread : public NamedThread { > 41: > 42: Monitor* const _lock; Maybe inline this instead of having it a pointer? Even if the thread and lock is not destroyed until VM shutdown, I always feel the need to match new in constructors with delete in destructors. 
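A rough sketch of what holding the lock by value could look like. This is only an illustration under assumptions: `TrimmerState` is a hypothetical class, and `std::mutex` stands in for HotSpot's `Monitor`, whose construction rules may differ.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for the trimmer thread's state.
class TrimmerState {
  std::mutex _lock;             // held by value: no new in the constructor, no delete needed
  bool _stop = false;
  unsigned _suspend_count = 0;  // number of outstanding suspend requests

public:
  void suspend() {
    std::lock_guard<std::mutex> guard(_lock);
    ++_suspend_count;
  }

  void resume() {
    std::lock_guard<std::mutex> guard(_lock);
    assert(_suspend_count > 0);  // resume must pair with an earlier suspend
    --_suspend_count;
  }

  bool is_suspended() {
    std::lock_guard<std::mutex> guard(_lock);
    return _suspend_count > 0;
  }

  void stop() {
    std::lock_guard<std::mutex> guard(_lock);
    _stop = true;
  }

  bool is_stopped() {
    std::lock_guard<std::mutex> guard(_lock);
    return _stop;
  }
};
```

The lock's lifetime is tied to the object, so there is no allocation to match with a deallocation. Suspend requests nest: the state stays suspended until every `suspend()` has been paired with a `resume()`.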
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254691667 From lkorinth at openjdk.org Thu Jul 6 17:00:55 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Thu, 6 Jul 2023 17:00:55 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v4] In-Reply-To: <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> Message-ID: On Thu, 6 Jul 2023 15:25:03 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). 
We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > last cleanups and shade feedback src/hotspot/share/runtime/trimNative.cpp line 78: > 76: static constexpr int safepoint_poll_ms = 250; > 77: > 78: static int64_t now() { return os::javaTimeMillis(); } I think it would be better to not use CLOCK_REALTIME in case of clock changes by NTP etc. 
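A sketch of the monotonic alternative, using the C++ standard library as a stand-in. `monotonic_now_ms` and `interval_elapsed` are hypothetical names. In HotSpot the analogous source would presumably be a nanoTime-style call such as `os::javaTimeNanos()`, but that mapping is an assumption here.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical replacement for now(): milliseconds from a monotonic clock.
// steady_clock never jumps backwards, so NTP or manual wall-clock changes
// cannot stretch or shorten the trim interval.
int64_t monotonic_now_ms() {
  using namespace std::chrono;
  return duration_cast<milliseconds>(steady_clock::now().time_since_epoch()).count();
}

// True once at least interval_ms have passed since last_ms.
bool interval_elapsed(int64_t last_ms, int64_t now_ms, int64_t interval_ms) {
  return now_ms - last_ms >= interval_ms;
}
```

The absolute value returned by `monotonic_now_ms()` has no calendar meaning. Only differences between two readings are useful, which is all the periodic trimmer needs.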
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254699171 From shade at openjdk.org Thu Jul 6 17:53:08 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 6 Jul 2023 17:53:08 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v4] In-Reply-To: <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> Message-ID: On Thu, 6 Jul 2023 15:25:03 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). 
We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > last cleanups and shade feedback Another read. I think we would need to tighten up style a bit too: the newline spacing style is different across the change. src/hotspot/share/runtime/globals.hpp line 1990: > 1988: "If TrimNativeHeap is enabled: interval, in ms, at which " \ > 1989: "the GC will attempt to trim the native heap.") \ > 1990: range(100, UINT_MAX) \ Still mentions "GC". Should we accept 1ms as valid interval? 100ms might be too big for short-running workloads. Very aggressive trim at 1ms would also be useful for stress-testing trim code. 
src/hotspot/share/runtime/trimNative.cpp line 44:

> 42: Monitor* const _lock;
> 43: bool _stop;
> 44: unsigned _suspend_count;

`uint16_t`, maybe?

src/hotspot/share/runtime/trimNative.cpp line 49:

> 47: uint64_t _num_trims_performed;
> 48:
> 49: bool suspended() const {

Style: `is_suspended()`.

src/hotspot/share/runtime/trimNative.cpp line 66:

> 64: }
> 65:
> 66: bool stopped() const {

`is_stopped()`?

src/hotspot/share/runtime/trimNative.cpp line 75:

> 73: SafepointSynchronize::is_at_safepoint() ||
> 74: SafepointSynchronize::is_synchronizing();
> 75: }

Suggestion:

    bool at_or_nearing_safepoint() const {
      return SafepointSynchronize::is_at_safepoint() ||
             SafepointSynchronize::is_synchronizing();
    }

src/hotspot/share/runtime/trimNative.cpp line 84:

> 82: run_inner();
> 83: log_info(trim)("NativeTrimmer stop.");
> 84: }

Do we need this logging? We can simplify and just inline `run_inner` here.

src/hotspot/share/runtime/trimNative.cpp line 99:

> 97:
> 98: if (trim_result) {
> 99: _num_trims_performed++;

Simplification: let's just use `Atomic::*` for `_num_trims_performed`, and this `trim_result` dance (which I think is only needed to get the update under lock?) is not needed.

src/hotspot/share/runtime/trimNative.cpp line 149:

> 147: return true;
> 148: } else {
> 149: log_info(trim)("Trim native heap (no details)");

Consistency: `Trim native heap: complete, no details`.
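
A minimal sketch of the `Atomic::*` simplification suggested for `_num_trims_performed` above. It uses `std::atomic` from standard C++ as a stand-in for HotSpot's own `Atomic` helpers, and `TrimStats` with its two methods are hypothetical names for illustration, not code from the patch. The point is only that a counter updated this way needs neither the lock nor the `trim_result` hand-off:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the trimmer's statistics; not the HotSpot class.
class TrimStats {
  std::atomic<uint64_t> _num_trims_performed{0};

public:
  // Called by the trimmer thread after a successful trim. No lock is held.
  void record_trim() {
    const uint64_t prev =
        _num_trims_performed.fetch_add(1, std::memory_order_relaxed);
    assert(prev != UINT64_MAX && "trim counter must not wrap");
    (void)prev;  // silence unused-variable warnings when NDEBUG is defined
  }

  // May be called from any thread that wants to report the count.
  uint64_t num_trims_performed() const {
    return _num_trims_performed.load(std::memory_order_relaxed);
  }
};
```

Relaxed ordering is assumed to be enough here because the value is only a statistic and does not publish other data. If readers had to observe state written before the increment, a stronger ordering would be needed.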
src/hotspot/share/runtime/trimNative.cpp line 177:

> 175: // No need to wakeup trimmer
> 176: }
> 177: log_debug(trim)("NativeTrimmer pause (%s) (%u)", reason, n);

Suggestion:

    log_debug(trim)("NativeTrimmer pause for %s (%u suspend requests)", reason, n);

src/hotspot/share/runtime/trimNative.cpp line 190:

> 188: }
> 189: }
> 190: log_debug(trim)("NativeTrimmer unpause (%s) (%u)", reason, n);

Suggestion:

    log_debug(trim)("NativeTrimmer unpause for %s (%u suspend requests)", reason, n);

src/hotspot/share/runtime/trimNative.hpp line 27:

> 25: */
> 26:
> 27: #ifndef SHARE_GC_SHARED_TRIMNATIVE_HPP

This is now `SHARE_RUNTIME_TRIMNATIVE_HPP`.

test/hotspot/jtreg/runtime/os/TestTrimNative.java line 243:

> 241: }
> 242:
> 243: if (args[0].equals("RUN")) {

The usual trick is to pull this "internal" main into a class, and then reference it. Like:

    public class Test {
      public void test() {
        runWith("...", "Test$Workload");
      }

      public static class Workload {
        public static void main(String... args) {}
      }
    }

test/hotspot/jtreg/runtime/os/TestTrimNative.java line 247:

> 245: System.out.println("Will spike now...");
> 246: for (int i = 0; i < numAllocations; i++) {
> 247: ptrs[i] = Unsafe.getUnsafe().allocateMemory(szAllocations);

Pull `Unsafe.getUnsafe()` into a local or a field?

test/hotspot/jtreg/runtime/os/TestTrimNative.java line 270:

> 268: runTest(Arrays.copyOfRange(args, 1, args.length));
> 269: } else if (args[0].equals("testOffOnNonCompliantPlatforms")) {
> 270: testOffOnNonCompliantPlatforms();

Inline these `test*`? This would give some symmetry against `runTest`.
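
To make the proposed pause/unpause log wording concrete, here is a rough standalone model of a nested suspend counter that produces those messages. `SuspendCounter` and everything in it are hypothetical: it assumes `n` is the number of outstanding suspend requests after the update, which the quoted lines do not confirm, and it uses `std::mutex` rather than HotSpot's `Monitor`.

```cpp
#include <cassert>
#include <cstdio>
#include <mutex>
#include <string>

// Hypothetical model of nested pause requests; names are illustrative only.
class SuspendCounter {
  mutable std::mutex _lock;
  unsigned _suspend_count = 0;

  // Builds the message in the format proposed in the review.
  static std::string message(const char* what, const char* reason, unsigned n) {
    char buf[160];
    std::snprintf(buf, sizeof(buf),
                  "NativeTrimmer %s for %s (%u suspend requests)",
                  what, reason, n);
    return std::string(buf);
  }

public:
  // Registers one pause request and returns the line that would be logged.
  std::string pause(const char* reason) {
    std::lock_guard<std::mutex> guard(_lock);
    const unsigned n = ++_suspend_count;
    return message("pause", reason, n);
  }

  // Drops one pause request; trimming may resume once the count reaches zero.
  std::string unpause(const char* reason) {
    std::lock_guard<std::mutex> guard(_lock);
    assert(_suspend_count > 0 && "unpause without matching pause");
    const unsigned n = --_suspend_count;
    return message("unpause", reason, n);
  }

  bool is_suspended() const {
    std::lock_guard<std::mutex> guard(_lock);
    return _suspend_count > 0;
  }
};
```

With this wording, a reader of the log can tell the reason apart from the counter without knowing the argument order, which appears to be the intent of the suggestion.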
------------- PR Review: https://git.openjdk.org/jdk/pull/14781#pullrequestreview-1516837594 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254621393 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254632878 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254622666 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254665880 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254633258 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254639542 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254653161 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254657016 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254678136 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254678539 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254749047 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254746310 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254743705 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254744410 From rehn at openjdk.org Thu Jul 6 18:14:55 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 6 Jul 2023 18:14:55 GMT Subject: [jdk21] RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. 
In-Reply-To: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> References: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> Message-ID: <6axzirORGK8G89CQtd3RUEeJHcvBGSH6fqUWty02oaw=.96571948-ace8-4ff4-8bb2-9da129499421@github.com> On Sun, 2 Jul 2023 18:35:30 GMT, Robbin Ehn wrote: > Hi all, > > This pull request contains a backport of commit [faf1b822](https://github.com/openjdk/jdk/commit/faf1b822d03b726413d77a2b247dfbbf4db7d57e) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Robbin Ehn on 2 Jul 2023 and was reviewed by Ludovic Henry, Thomas Stuefe and Fei Yang. > > Thanks! I missed clicking yes on collab in time, I can't push to this branch. I'll close and test backport again. ------------- PR Comment: https://git.openjdk.org/jdk21/pull/90#issuecomment-1624117057 From rehn at openjdk.org Thu Jul 6 18:14:56 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 6 Jul 2023 18:14:56 GMT Subject: [jdk21] Withdrawn: 8310656: RISC-V: __builtin___clear_cache can fail silently. In-Reply-To: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> References: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> Message-ID: On Sun, 2 Jul 2023 18:35:30 GMT, Robbin Ehn wrote: > Hi all, > > This pull request contains a backport of commit [faf1b822](https://github.com/openjdk/jdk/commit/faf1b822d03b726413d77a2b247dfbbf4db7d57e) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Robbin Ehn on 2 Jul 2023 and was reviewed by Ludovic Henry, Thomas Stuefe and Fei Yang. > > Thanks! This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/jdk21/pull/90 From matsaave at openjdk.org Thu Jul 6 18:57:16 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Thu, 6 Jul 2023 18:57:16 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v7] In-Reply-To: References: Message-ID: > 8301996: Move field resolution information out of the cpCache Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Interpreter fix and cleanup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14129/files - new: https://git.openjdk.org/jdk/pull/14129/files/aabd5b45..733860f3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=05-06 Stats: 12 lines in 2 files changed: 2 ins; 9 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14129.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14129/head:pull/14129 PR: https://git.openjdk.org/jdk/pull/14129 From dlong at openjdk.org Thu Jul 6 19:03:59 2023 From: dlong at openjdk.org (Dean Long) Date: Thu, 6 Jul 2023 19:03:59 GMT Subject: RFR: 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite [v3] In-Reply-To: References: Message-ID: <8q-44D-yY1-_40Y5qlhg-Y6ADgV9l3YieJix5qhSQSQ=.83714e4d-6514-4876-983a-740287059488@github.com> On Wed, 5 Jul 2023 15:35:16 GMT, Patricio Chilano Mateo wrote: >> Please review the following fix. Runtime methods called through the SharedRuntime::generate_resolve_blob() stub always return the value stored in _from_compiled_entry as the entry point to the callee method. This will either be the entry point to the compiled version of callee if there is one or the c2i adapter entry point. 
But this doesn't consider the case where an EnterInterpOnlyModeClosure handshake catches the JavaThread in the transition back to Java on those methods. In that case we should return the c2i adapter entry point even if there is a compiled entry point. Otherwise the JavaThread will continue calling the compiled versions of methods without noticing it's in interpreted only mode until it either calls a method that hasn't been compiled yet or it returns to the caller of that resolved callee where the change to interpreter only mode happened (since the EnterInterpOnlyModeClosure handshake marked all the frames on the stack for deoptimization). >> >> This is a long standing bug but has been made visible with the assert added as part of 8288949 where a related issue was fixed. There are more details in the bug comments about how this specific crash happens and its relation with 8288949. I also attached a reproducer. >> >> These runtime methods are already using JRT_BLOCK_ENTRY/JRT_BLOCK so that the entry point to the callee is fetched only after the last possible safepoint in JRT_BLOCK_END. This guarantees that we will not return an entry point to compiled code that has already been removed. So the fix just adds a check to verify if the JavaThread entered into interpreted only mode in that transition back to Java and to return the c2i entry point instead. >> >> I tested the patch in mach5 tiers 1-6. I also verified it with the reproducer I attached to the bug. I didn't include it as an additional test but I can do that if desired. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > revert to just remove assert added in 8288949 Marked as reviewed by dlong (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/14108#pullrequestreview-1517141810 From rehn at openjdk.org Thu Jul 6 19:11:58 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 6 Jul 2023 19:11:58 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v4] In-Reply-To: <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> Message-ID: On Thu, 6 Jul 2023 15:25:03 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). 
We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > last cleanups and shade feedback Thanks, looks good to me. src/hotspot/share/logging/logTag.hpp line 199: > 197: LOG_TAG(tlab) \ > 198: LOG_TAG(tracking) \ > 199: LOG_TAG(trim) \ Not sure if 'trim' is the best tag name for an average user? src/hotspot/share/runtime/trimNative.cpp line 137: > 135: os::size_change_t sc; > 136: Ticks start = Ticks::now(); > 137: log_debug(trim)("Trim native heap started..."); TraceTime not a good fit? ------------- Marked as reviewed by rehn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14781#pullrequestreview-1517121458 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254801114 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254805330 From rehn at openjdk.org Thu Jul 6 19:12:02 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 6 Jul 2023 19:12:02 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v4] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> Message-ID: <9TFQ9gFbpnvV1auQVmccys3RBvejX_z0HsBoN675jVM=.8300c365-2f71-4d8e-b4d5-0977853db25a@github.com> On Thu, 6 Jul 2023 16:14:20 GMT, Aleksey Shipilev wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> last cleanups and shade feedback > > src/hotspot/share/runtime/trimNative.cpp line 149: > >> 147: return true; >> 148: } else { >> 149: log_info(trim)("Trim native heap (no details)"); > > Consistency: `Trim native heap: complete, no details`. I would still like to know the trim_time. > test/hotspot/jtreg/runtime/os/TestTrimNative.java line 247: > >> 245: System.out.println("Will spike now..."); >> 246: for (int i = 0; i < numAllocations; i++) { >> 247: ptrs[i] = Unsafe.getUnsafe().allocateMemory(szAllocations); > > Pull `Unsafe.getUnsafe()` into a local or a field? Maybe use WB instead of unsafe? 
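
One way to read the request for the trim time in the log, sketched outside HotSpot: time the `malloc_trim(3)` call and report the duration on every path, including the one without size details. `TrimOutcome`, `timed_trim` and the message texts are assumptions made for this sketch; the real code would use `Ticks` or `TraceTime` and unified logging rather than `printf`.

```cpp
#include <cassert>
#include <chrono>
#include <cstdio>
#if defined(__GLIBC__)
#include <malloc.h>  // malloc_trim(3) is a glibc extension
#endif

// Hypothetical result type for this sketch; not part of the patch.
struct TrimOutcome {
  bool supported;    // false when built against a libc without malloc_trim
  bool released;     // true if glibc reported that memory went back to the OS
  long long micros;  // wall-clock duration of the attempt
};

TrimOutcome timed_trim() {
  using clock = std::chrono::steady_clock;
  const clock::time_point start = clock::now();

  bool supported = false;
  bool released = false;
#if defined(__GLIBC__)
  supported = true;
  released = (malloc_trim(0) == 1);  // 1: memory was released, 0: it was not
#endif

  const long long micros =
      std::chrono::duration_cast<std::chrono::microseconds>(
          clock::now() - start).count();
  assert(micros >= 0);

  // The duration is printed in both outcomes, so it is never lost.
  std::printf("Trim native heap: complete, %s (%lld us)\n",
              released ? "memory released" : "nothing released", micros);
  return TrimOutcome{supported, released, micros};
}
```

Calling `timed_trim()` after a burst of `malloc`/`free` traffic gives a rough feel for how long a trim takes on a given system; the numbers depend heavily on heap layout and are not representative of the JVM.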
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254816943 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1254806836 From pchilanomate at openjdk.org Thu Jul 6 19:16:00 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 6 Jul 2023 19:16:00 GMT Subject: RFR: 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite [v3] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 15:35:16 GMT, Patricio Chilano Mateo wrote: >> Please review the following fix. Runtime methods called through the SharedRuntime::generate_resolve_blob() stub always return the value stored in _from_compiled_entry as the entry point to the callee method. This will either be the entry point to the compiled version of callee if there is one or the c2i adapter entry point. But this doesn't consider the case where an EnterInterpOnlyModeClosure handshake catches the JavaThread in the transition back to Java on those methods. In that case we should return the c2i adapter entry point even if there is a compiled entry point. Otherwise the JavaThread will continue calling the compiled versions of methods without noticing it's in interpreted only mode until it either calls a method that hasn't been compiled yet or it returns to the caller of that resolved callee where the change to interpreter only mode happened (since the EnterInterpOnlyModeClosure handshake marked all the frames on the stack for deoptimization). >> >> This is a long standing bug but has been made visible with the assert added as part of 8288949 where a related issue was fixed. There are more details in the bug comments about how this specific crash happens and its relation with 8288949. I also attached a reproducer. 
>> >> These runtime methods are already using JRT_BLOCK_ENTRY/JRT_BLOCK so that the entry point to the callee is fetched only after the last possible safepoint in JRT_BLOCK_END. This guarantees that we will not return an entry point to compiled code that has already been removed. So the fix just adds a check to verify if the JavaThread entered into interpreted only mode in that transition back to Java and to return the c2i entry point instead. >> >> I tested the patch in mach5 tiers 1-6. I also verified it with the reproducer I attached to the bug. I didn't include it as an additional test but I can do that if desired. >> >> Thanks, >> Patricio > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > revert to just remove assert added in 8288949 Thanks for the review Dean! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14108#issuecomment-1624187227 From pchilanomate at openjdk.org Thu Jul 6 19:19:06 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 6 Jul 2023 19:19:06 GMT Subject: Integrated: 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite In-Reply-To: References: Message-ID: <1Or-sUALN4emQe2aQIu11mGEgHM6zj537dqhFutWfck=.a4d79db6-51e9-446c-b253-38f817f3fadc@github.com> On Tue, 23 May 2023 22:07:37 GMT, Patricio Chilano Mateo wrote: > Please review the following fix. Runtime methods called through the SharedRuntime::generate_resolve_blob() stub always return the value stored in _from_compiled_entry as the entry point to the callee method. This will either be the entry point to the compiled version of callee if there is one or the c2i adapter entry point. 
But this doesn't consider the case where an EnterInterpOnlyModeClosure handshake catches the JavaThread in the transition back to Java on those methods. In that case we should return the c2i adapter entry point even if there is a compiled entry point. Otherwise the JavaThread will continue calling the compiled versions of methods without noticing it's in interpreted only mode until it either calls a method that hasn't been compiled yet or it returns to the caller of that resolved callee where the change to interpreter only mode happened (since the EnterInterpOnlyModeClosure handshake marked all the frames on the stack for deoptimization). > > This is a long standing bug but has been made visible with the assert added as part of 8288949 where a related issue was fixed. There are more details in the bug comments about how this specific crash happens and its relation with 8288949. I also attached a reproducer. > > These runtime methods are already using JRT_BLOCK_ENTRY/JRT_BLOCK so that the entry point to the callee is fetched only after the last possible safepoint in JRT_BLOCK_END. This guarantees that we will not return an entry point to compiled code that has already been removed. So the fix just adds a check to verify if the JavaThread entered into interpreted only mode in that transition back to Java and to return the c2i entry point instead. > > I tested the patch in mach5 tiers 1-6. I also verified it with the reproducer I attached to the bug. I didn't include it as an additional test but I can do that if desired. > > Thanks, > Patricio This pull request has now been integrated. 
Changeset: 0c86c31b Author: Patricio Chilano Mateo URL: https://git.openjdk.org/jdk/commit/0c86c31bccd676e1cfbd35898ee16e89d5752688 Stats: 5 lines in 1 file changed: 0 ins; 5 del; 0 mod 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite Reviewed-by: dcubed, sspitsyn, dlong ------------- PR: https://git.openjdk.org/jdk/pull/14108 From zsong at openjdk.org Thu Jul 6 19:21:53 2023 From: zsong at openjdk.org (Zhao Song) Date: Thu, 6 Jul 2023 19:21:53 GMT Subject: [jdk21] RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. In-Reply-To: <6axzirORGK8G89CQtd3RUEeJHcvBGSH6fqUWty02oaw=.96571948-ace8-4ff4-8bb2-9da129499421@github.com> References: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> <6axzirORGK8G89CQtd3RUEeJHcvBGSH6fqUWty02oaw=.96571948-ace8-4ff4-8bb2-9da129499421@github.com> Message-ID: On Thu, 6 Jul 2023 18:12:14 GMT, Robbin Ehn wrote: > I missed clicking yes on collab in time, so I can't push to this branch. > > It seems I'm in some limbo, maybe @zhaosongzs can me help? Just back from lunch. Let me have a try. ------------- PR Comment: https://git.openjdk.org/jdk21/pull/90#issuecomment-1624194146 From zsong at openjdk.org Thu Jul 6 19:26:52 2023 From: zsong at openjdk.org (Zhao Song) Date: Thu, 6 Jul 2023 19:26:52 GMT Subject: [jdk21] RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. In-Reply-To: References: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> <6axzirORGK8G89CQtd3RUEeJHcvBGSH6fqUWty02oaw=.96571948-ace8-4ff4-8bb2-9da129499421@github.com> Message-ID: On Thu, 6 Jul 2023 19:19:30 GMT, Zhao Song wrote: > I missed clicking yes on collab in time, so I can't push to this branch. 
> > It seems I'm in some limbo, maybe @zhaosongzs can me help? It's weird, I couldn't see branch protection rule for this branch. So anyone should be able to push to this branch ------------- PR Comment: https://git.openjdk.org/jdk21/pull/90#issuecomment-1624199214 From zsong at openjdk.org Thu Jul 6 19:31:04 2023 From: zsong at openjdk.org (Zhao Song) Date: Thu, 6 Jul 2023 19:31:04 GMT Subject: [jdk21] RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. In-Reply-To: References: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> <6axzirORGK8G89CQtd3RUEeJHcvBGSH6fqUWty02oaw=.96571948-ace8-4ff4-8bb2-9da129499421@github.com> Message-ID: On Thu, 6 Jul 2023 19:19:30 GMT, Zhao Song wrote: > I missed clicking yes on collab in time, so I can't push to this branch. > > It seems I'm in some limbo, maybe @zhaosongzs can me help? Ah, I have invited you to be a collaborator in openjdk-bots/jdk21 ------------- PR Comment: https://git.openjdk.org/jdk21/pull/90#issuecomment-1624203139 From rehn at openjdk.org Thu Jul 6 19:44:54 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 6 Jul 2023 19:44:54 GMT Subject: [jdk21] RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. In-Reply-To: References: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> <6axzirORGK8G89CQtd3RUEeJHcvBGSH6fqUWty02oaw=.96571948-ace8-4ff4-8bb2-9da129499421@github.com> Message-ID: On Thu, 6 Jul 2023 19:27:47 GMT, Zhao Song wrote: > > I missed clicking yes on collab in time, so I can't push to this branch. > > It seems I'm in some limbo, maybe @zhaosongzs can me help? > > Ah, I have invited you to be a collaborator in openjdk-bots/jdk21 Thank you. But I still get perm denied :) I guess I'm doing something wrong? 
[rehn at rehn jdk21]$ git remote -vv origin ssh://github.com/openjdk-bots/jdk21 (fetch) origin ssh://github.com/openjdk-bots/jdk21 (push) [rehn at rehn jdk21]$ git branch -vv master aced11446ea [origin/master] 8310128: Switch with unnamed patterns erroneously non-exhaustive * robehn-backport-faf1b822 be7724a44e0 [origin/robehn-backport-faf1b822: ahead 1] Added missing global defines in jdk 21 [rehn at rehn jdk21]$ git push Enter passphrase for key '/home/rehn/.ssh/github': ERROR: Permission to openjdk-bots/jdk21.git denied to robehn. ------------- PR Comment: https://git.openjdk.org/jdk21/pull/90#issuecomment-1624216860 From zsong at openjdk.org Thu Jul 6 19:44:54 2023 From: zsong at openjdk.org (Zhao Song) Date: Thu, 6 Jul 2023 19:44:54 GMT Subject: [jdk21] RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. In-Reply-To: References: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> <6axzirORGK8G89CQtd3RUEeJHcvBGSH6fqUWty02oaw=.96571948-ace8-4ff4-8bb2-9da129499421@github.com> Message-ID: On Thu, 6 Jul 2023 19:40:35 GMT, Robbin Ehn wrote: > > > I missed clicking yes on collab in time, so I can't push to this branch. > > > It seems I'm in some limbo, maybe @zhaosongzs can me help? > > > > > > Ah, I have invited you to be a collaborator in openjdk-bots/jdk21 > > Thank you. But I still get perm denied :) I guess I'm doing something wrong? > > ``` > [rehn at rehn jdk21]$ git remote -vv > origin ssh://github.com/openjdk-bots/jdk21 (fetch) > origin ssh://github.com/openjdk-bots/jdk21 (push) > [rehn at rehn jdk21]$ git branch -vv > master aced11446ea [origin/master] 8310128: Switch with unnamed patterns erroneously non-exhaustive > * robehn-backport-faf1b822 be7724a44e0 [origin/robehn-backport-faf1b822: ahead 1] Added missing global defines in jdk 21 > [rehn at rehn jdk21]$ git push > Enter passphrase for key '/home/rehn/.ssh/github': > ERROR: Permission to openjdk-bots/jdk21.git denied to robehn. 
> ``` Have you accepted the invitation? From what I can see, the invitation is still pending. ------------- PR Comment: https://git.openjdk.org/jdk21/pull/90#issuecomment-1624218801 From rehn at openjdk.org Thu Jul 6 19:48:56 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 6 Jul 2023 19:48:56 GMT Subject: [jdk21] RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. In-Reply-To: References: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> <6axzirORGK8G89CQtd3RUEeJHcvBGSH6fqUWty02oaw=.96571948-ace8-4ff4-8bb2-9da129499421@github.com> Message-ID: <-Pmk8RCfPROdw1k4N9tqC-JQ9uHeND5M-a-eR7MDa0Y=.f616c172-7e33-477f-93ed-ac0bc3a950e3@github.com> On Thu, 6 Jul 2023 19:42:24 GMT, Zhao Song wrote: > > > > I missed clicking yes on collab in time, so I can't push to this branch. > > > > It seems I'm in some limbo, maybe @zhaosongzs can me help? > > > > > > > > > Ah, I have invited you to be a collaborator in openjdk-bots/jdk21 > > > > > > Thank you. But I still get perm denied :) I guess I'm doing something wrong? > > ``` > > [rehn at rehn jdk21]$ git remote -vv > > origin ssh://github.com/openjdk-bots/jdk21 (fetch) > > origin ssh://github.com/openjdk-bots/jdk21 (push) > > [rehn at rehn jdk21]$ git branch -vv > > master aced11446ea [origin/master] 8310128: Switch with unnamed patterns erroneously non-exhaustive > > * robehn-backport-faf1b822 be7724a44e0 [origin/robehn-backport-faf1b822: ahead 1] Added missing global defines in jdk 21 > > [rehn at rehn jdk21]$ git push > > Enter passphrase for key '/home/rehn/.ssh/github': > > ERROR: Permission to openjdk-bots/jdk21.git denied to robehn. > > ``` > > Have you accepted the invitation? From what I can see, the invitation is still pending. I clicked on a check mark :) Maybe it takes a while for it to propagate in the "eventually consistent world".
------------- PR Comment: https://git.openjdk.org/jdk21/pull/90#issuecomment-1624222466 From coleenp at openjdk.org Thu Jul 6 19:48:56 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 6 Jul 2023 19:48:56 GMT Subject: [jdk21] RFR: 8309140: ResourceHashtable failed "assert(~(_allocation_t[0] | allocation_mask) == (uintptr_t)this) failed: lost resource object" In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 15:24:45 GMT, Coleen Phillimore wrote: > This is a clean backport of JDK-[8309140](https://bugs.openjdk.org/browse/JDK-8309140) Thanks Ioi! ------------- PR Comment: https://git.openjdk.org/jdk21/pull/102#issuecomment-1624221798 From coleenp at openjdk.org Thu Jul 6 19:48:56 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 6 Jul 2023 19:48:56 GMT Subject: [jdk21] Integrated: 8309140: ResourceHashtable failed "assert(~(_allocation_t[0] | allocation_mask) == (uintptr_t)this) failed: lost resource object" In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 15:24:45 GMT, Coleen Phillimore wrote: > This is a clean backport of JDK-[8309140](https://bugs.openjdk.org/browse/JDK-8309140) This pull request has now been integrated. Changeset: 98ad856a Author: Coleen Phillimore URL: https://git.openjdk.org/jdk21/commit/98ad856a98bb302c4321662c2f5a3650369fae7f Stats: 99 lines in 11 files changed: 41 ins; 1 del; 57 mod 8309140: ResourceHashtable failed "assert(~(_allocation_t[0] | allocation_mask) == (uintptr_t)this) failed: lost resource object" Reviewed-by: iklam Backport-of: b6c789faad63f18e17ee7e5cefd024b3776fd469 ------------- PR: https://git.openjdk.org/jdk21/pull/102 From rehn at openjdk.org Thu Jul 6 20:16:14 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 6 Jul 2023 20:16:14 GMT Subject: [jdk21] RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. 
[v2] In-Reply-To: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> References: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> Message-ID: <0cOr4cGicJ3M49nFwLBTIt9w2aFpV1sUwl6xyf1Htfk=.6cb7355f-5760-477f-8420-4a9032563159@github.com> > Hi all, > > This pull request contains a backport of commit [faf1b822](https://github.com/openjdk/jdk/commit/faf1b822d03b726413d77a2b247dfbbf4db7d57e) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Robbin Ehn on 2 Jul 2023 and was reviewed by Ludovic Henry, Thomas Stuefe and Fei Yang. > > Thanks! Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Added missing global defines in jdk 21 Signed-off-by: Robbin Ehn ------------- Changes: - all: https://git.openjdk.org/jdk21/pull/90/files - new: https://git.openjdk.org/jdk21/pull/90/files/be817c39..be7724a4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk21&pr=90&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk21&pr=90&range=00-01 Stats: 10 lines in 1 file changed: 10 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk21/pull/90.diff Fetch: git fetch https://git.openjdk.org/jdk21.git pull/90/head:pull/90 PR: https://git.openjdk.org/jdk21/pull/90 From iklam at openjdk.org Thu Jul 6 23:47:16 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 6 Jul 2023 23:47:16 GMT Subject: RFR: 8311604: Simplify NOCOOPS requested addresses for archived heap objects Message-ID: This PR attempts to clean up some of the cruft in the existing code: - Simplified the calculation of "requested address" when `UseCompressedOops` is disabled -- the archived heap objects are always written starting from 0x10000000 - Removed `HeapShared::to_requested_address()` so we don't have two kinds of "requested address" - Updated the comments about "source" vs "buffered" vs "requested" addresses in
archiveHeapWriter.hpp - Removed `SerializeClosure::oop()` as the only oop we need to store into the archive header is `HeapShared::roots()`, which can be handled more easily with `FileMapHeader::_heap_roots_offset` - Removed some unnecessary dependencies on `G1CollectedHeap::heap()->reserved()` Also: - Moved SerializeClosure to its own header file to improve build time. - Fixed DeterministicDump.java, which wasn't archiving Java objects when `UseCompressedOops` was disabled. ------------- Commit messages: - 8311604: Simplify NOCOOPS requested addresses for archived heap objects Changes: https://git.openjdk.org/jdk/pull/14792/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14792&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311604 Stats: 369 lines in 23 files changed: 157 ins; 172 del; 40 mod Patch: https://git.openjdk.org/jdk/pull/14792.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14792/head:pull/14792 PR: https://git.openjdk.org/jdk/pull/14792 From dholmes at openjdk.org Fri Jul 7 00:37:55 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 7 Jul 2023 00:37:55 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 05:18:01 GMT, Jean-Philippe Bempel wrote: > Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in a unresolved state. So the constant pool merging process was considering the entry with pre-resolved class as different compared to the destination and create a new entry. We now try to consider it as equal specially for Methodref/Fieldref. 
src/hotspot/share/oops/constantPool.cpp line 1361: > 1359: int recur2 = cp2->uncached_klass_ref_index_at(index2); > 1360: bool match = compare_entry_to(recur1, cp2, recur2); > 1361: match |= is_unresolved_class_mismatch(recur1, cp2, recur2); This is changing the definition of a "match" such that in all cases an unresolved class entry is considered equal to a resolved class entry - but only for these Field and MethodRefs. I don't see how this connects back to the original problem. Nor do I see why this is always a correct thing to do. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14780#discussion_r1255064413 From dholmes at openjdk.org Fri Jul 7 00:48:54 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 7 Jul 2023 00:48:54 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform In-Reply-To: <2IK4-mDcKnsZr6krAGeX1Yi_8WmTL9TMB5JCovHhKCU=.6b24c922-a875-4320-a0b4-62521dd36909@github.com> References: <2IK4-mDcKnsZr6krAGeX1Yi_8WmTL9TMB5JCovHhKCU=.6b24c922-a875-4320-a0b4-62521dd36909@github.com> Message-ID: On Thu, 6 Jul 2023 11:02:48 GMT, Jean-Philippe Bempel wrote: > Sorry I don't understand what you mean, can you elaborate please? In `ConstantPool::compare_entry_to` it states: if (t1 != t2) { // Not the same entry type so there is nothing else to check. Note // that this style of checking will consider resolved/unresolved // class pairs as different. // From the ConstantPool* API point of view, this is correct // behavior. See VM_RedefineClasses::merge_constant_pools() to see how this // plays out in the context of ConstantPool* merging. return false; } and in `VM_RedefineClasses::merge_constant_pools()` it says // Pass 0: // The old_cp is copied to *merge_cp_p; this means that any code // using old_cp does not have to change.
This work looks like a // perfect fit for ConstantPool*::copy_cp_to(), but we need to // handle one special case: // - revert JVM_CONSTANT_Class to JVM_CONSTANT_UnresolvedClass // This will make verification happy. these comments both pertain to the problem at hand: the merge reverts the resolved class entry to be unresolved; unresolved entries are not considered matches for resolved ones, hence we grow the constant pool. Your fix is addressing the second part of the issue by forcing a match, but as per my other comment, it is not clear to me how the placement of that fix addresses the issue originally reported with the catch block, nor is it obvious to me that we should always consider resolved and unresolved class entries to be equal. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14780#issuecomment-1624480658 From dholmes at openjdk.org Fri Jul 7 04:09:54 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 7 Jul 2023 04:09:54 GMT Subject: RFR: 8311575: Fix invalid format parameters In-Reply-To: <-9xQE9UAMLnueaceqy4IhaMqprnPU-UAHBvPpRfqRgo=.f6810a10-7259-4781-87ce-b3bafbbb2767@github.com> References: <-9xQE9UAMLnueaceqy4IhaMqprnPU-UAHBvPpRfqRgo=.f6810a10-7259-4781-87ce-b3bafbbb2767@github.com> Message-ID: On Thu, 6 Jul 2023 11:25:11 GMT, Daniel Jeli?ski wrote: > Please review this change that fixes a few issues with printf-like function parameters. > > No new tests. I'll run tier1 & tier2 before integrating, just to make sure. Looks good - thanks for fixing. One suggested further fix. src/hotspot/share/runtime/arguments.cpp line 3951: > 3949: if (lvl == NMT_unknown) { > 3950: jio_fprintf(defaultStream::error_stream(), > 3951: "Syntax error, expecting -XX:NativeMemoryTracking=[off|summary|detail]"); I think it also needs a trailing `\n` ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14783#pullrequestreview-1517740505 PR Review Comment: https://git.openjdk.org/jdk/pull/14783#discussion_r1255220735 From dholmes at openjdk.org Fri Jul 7 04:45:56 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 7 Jul 2023 04:45:56 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v4] In-Reply-To: <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> Message-ID: On Thu, 6 Jul 2023 15:25:03 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. 
With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > last cleanups and shade feedback I had an initial look at this. Seems okay in principle. The naming/terminology needs some updating IMO: "trimNative" doesn't convey enough information, please use "trimNativeHeap". "trim" for logging tag is also too non-descript. As this is initially experimental I'm not overly concerned about the impact, though I do cringe at yet-another-VM-thread. FYI I will be away until next Thursday, but no need to wait for me for further comments. 
------------- PR Review: https://git.openjdk.org/jdk/pull/14781#pullrequestreview-1517807289 From dholmes at openjdk.org Fri Jul 7 04:55:52 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 7 Jul 2023 04:55:52 GMT Subject: RFR: JDK-8311285: report some fontconfig related environment variables in hs_err file In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 11:47:49 GMT, Matthias Baesken wrote: > There are a number of important environment variables influencing how fontconfig works. > See for example > https://man.archlinux.org/man/fonts-conf.5 > Some of them should be added to the list of reported environment variables in hs_err file because e.g. a bad setting for some of them can even lead sometimes to crashes. Yes I guess dumping the whole environment is a bit too extreme. As it is we have to be cautious about what external information we capture in the hs_err log. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14767#issuecomment-1624737626 From djelinski at openjdk.org Fri Jul 7 06:20:55 2023 From: djelinski at openjdk.org (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Fri, 7 Jul 2023 06:20:55 GMT Subject: RFR: 8311575: Fix invalid format parameters In-Reply-To: References: <-9xQE9UAMLnueaceqy4IhaMqprnPU-UAHBvPpRfqRgo=.f6810a10-7259-4781-87ce-b3bafbbb2767@github.com> Message-ID: On Fri, 7 Jul 2023 04:05:05 GMT, David Holmes wrote: >> Please review this change that fixes a few issues with printf-like function parameters. >> >> No new tests. I'll run tier1 & tier2 before integrating, just to make sure. > > src/hotspot/share/runtime/arguments.cpp line 3951: > >> 3949: if (lvl == NMT_unknown) { >> 3950: jio_fprintf(defaultStream::error_stream(), >> 3951: "Syntax error, expecting -XX:NativeMemoryTracking=[off|summary|detail]"); > > I think it also needs a trailing `\n` Good point. 
I triggered this error and it didn't look pretty: > java.exe -XX:NativeMemoryTracking=blah Syntax error, expecting -XX:NativeMemoryTracking=[off|summary|detail]Error: Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit. I noticed a few other spots in this file where trailing `\n` is missing. I'll fix them here as well. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14783#discussion_r1255296233 From djelinski at openjdk.org Fri Jul 7 06:42:11 2023 From: djelinski at openjdk.org (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Fri, 7 Jul 2023 06:42:11 GMT Subject: RFR: 8311575: Fix invalid format parameters [v2] In-Reply-To: <-9xQE9UAMLnueaceqy4IhaMqprnPU-UAHBvPpRfqRgo=.f6810a10-7259-4781-87ce-b3bafbbb2767@github.com> References: <-9xQE9UAMLnueaceqy4IhaMqprnPU-UAHBvPpRfqRgo=.f6810a10-7259-4781-87ce-b3bafbbb2767@github.com> Message-ID: > Please review this change that fixes a few issues with printf-like function parameters. > > No new tests. I'll run tier1 & tier2 before integrating, just to make sure. 
Daniel Jeli?ski has updated the pull request incrementally with one additional commit since the last revision: Add newlines ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14783/files - new: https://git.openjdk.org/jdk/pull/14783/files/db287ada..538d494f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14783&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14783&range=00-01 Stats: 8 lines in 1 file changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/14783.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14783/head:pull/14783 PR: https://git.openjdk.org/jdk/pull/14783 From stuefe at openjdk.org Fri Jul 7 06:51:58 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 7 Jul 2023 06:51:58 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v2] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: <4Bvm3M-_Gyq2n6hN_efnLWRwyvLo7jHmjRCm7OJm-6I=.264bfd97-407d-4d1c-badb-62202092b446@github.com> On Thu, 6 Jul 2023 15:35:12 GMT, Leo Korinth wrote: >> Thomas Stuefe has updated the pull request incrementally with three additional commits since the last revision: >> >> - seconds->ms >> - rename pause, unpause -> suspend, resume >> - fix ChunkPool::needs_cleaning > > src/hotspot/share/runtime/trimNative.cpp line 22: > >> 20: * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA >> 21: * or visit www.oracle.com if you need additional information or have any >> 22: * questioSns. > > This copyright notice differs in "questioSns" (extra 'S') and lack of "DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER. ", might create problems for Oracle scripts Done. Thanks for catching that. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1255329662 From stuefe at openjdk.org Fri Jul 7 06:58:56 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 7 Jul 2023 06:58:56 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v4] In-Reply-To: <1Rg80DDM3zyG6aKEyW1EAOgOm7wnWIZVlazod7hyx1U=.ff4d81ff-a72d-452e-8b4f-82b9bc7fe473@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> <1Rg80DDM3zyG6aKEyW1EAOgOm7wnWIZVlazod7hyx1U=.ff4d81ff-a72d-452e-8b4f-82b9bc7fe473@github.com> Message-ID: <_ZLVJk9-B9d9fkE0sHJPfU_IJtUnWpI1AS4lqNtf1L4=.43209323-b199-4c58-bb0b-b1e0100c7bfd@github.com> On Thu, 6 Jul 2023 16:49:47 GMT, Leo Korinth wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> last cleanups and shade feedback > > src/hotspot/share/runtime/trimNative.cpp line 42: > >> 40: class NativeTrimmerThread : public NamedThread { >> 41: >> 42: Monitor* const _lock; > > Maybe inline this instead of having it a pointer? Even if the thread and lock is not destroyed until VM shutdown, I always feel the need to match new in constructors with delete in destructors. I rather avoid that. There is a timing factor here: JVM issues ThreadNative::cleanup, which stops the thread, but the thread is not guaranteed to stop before VM ends (may be in a middle of a trim operation). I don't want to have to think about this. Ultimately, I don't see the point of the cleanup. I prefer a fast shutdown. Which is in line with a lot of code in the hotspot (e.g. we don't clean global mutexes either). But I'll add an assert to make sure only one NativeTrimmerThread is ever created. So we know its a singleton. 
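The singleton assert mentioned above could look roughly like the following. This is a minimal sketch in standard C++, not the HotSpot change itself: `TrimmerThreadSketch` and `instances_created()` are hypothetical names, and `std::atomic` and `assert` stand in for whatever primitives the real patch uses.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for a thread class that must only ever be created once.
class TrimmerThreadSketch {
  // Counts constructions for the lifetime of the process. It is never
  // decremented, because the object is not expected to be destroyed
  // before shutdown.
  static std::atomic<int> _instances;

public:
  TrimmerThreadSketch() {
    // fetch_add returns the value held before the increment.
    int previous = _instances.fetch_add(1);
    assert(previous == 0 && "only one instance may ever be created");
    (void)previous;  // avoid an unused-variable warning when asserts are compiled out
  }

  static int instances_created() { return _instances.load(); }
};

std::atomic<int> TrimmerThreadSketch::_instances{0};
```

Constructing a second `TrimmerThreadSketch` in a build with asserts enabled would abort at the `assert`, which documents the singleton assumption without requiring any teardown logic.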
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1255337316 From pli at openjdk.org Fri Jul 7 07:20:18 2023 From: pli at openjdk.org (Pengfei Li) Date: Fri, 7 Jul 2023 07:20:18 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: <5KRqQ2tEGzQRaxk8cYsu7iPXjYjeACidrtHFwDqhxDw=.36a3f286-4c1a-4c59-966a-b79e5ec7a21b@github.com> On Mon, 3 Jul 2023 14:37:03 GMT, Emanuel Peter wrote: >> src/hotspot/share/opto/vmaskloop.cpp line 978: >> >>> 976: >>> 977: // Update loop increment/decrement to the vector mask true count >>> 978: Node* true_cnt = new VectorMaskTrueCountNode(root_vmask, TypeInt::INT); >> >> This seems expensive to have to use inside the loop. Is there a way we could move this outside the loop? Because if we do take the backedge then we know that we have to take the full `stride`, right? > > I guess you would have to separate out the loop-internal uses and the outside uses of the `incr`. The inside uses would use the `stride` (or is there an exception?) and the outside ones could use the `VectorMaskTrueCountNode`. > > Doing something like that could have better performance. > This seems expensive to have to use inside the loop. Is there a way we could move this outside the loop? Because if we do take the backedge then we know that we have to take the full stride, right? That is not entirely right. We tried using the multiplied stride inside the loop and handling only the out-of-loop uses of the `incr` node. Miscompilation happens in rare corner cases where the loop limit is very close to the maximum value of `int`, as in the case below. for (int i = 2147483600; i < 2147483645; i++) { // ... } If we always take the full stride inside the vectorized loop, the induction variable may overflow and wrap around to a negative value before it reaches the loop limit. The backedge is then taken forever, and the finite loop is turned into an infinite loop.
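The wraparound described above can be reproduced with plain arithmetic. The sketch below is a simulation only, not C2 code: the helper names are hypothetical, the stride of 8 is an assumed vector width, and Java's 32-bit `int` wraparound is modeled with 64-bit arithmetic so that the C++ itself has no undefined behavior.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: simulates "for (i = init; i < limit; i += stride)".
// Returns the number of increments executed before the loop exits, or -1 if
// the induction variable would exceed the 32-bit int range before reaching
// the limit (the case in which a Java int wraps to a negative value).
int64_t trips_before_exit_or_wrap(int32_t init, int32_t limit, int32_t stride) {
  assert(stride > 0 && "this sketch only models upward-counting loops");
  int64_t i = init;  // 64-bit, so leaving the 32-bit range is detectable
  int64_t trips = 0;
  while (i < limit) {
    i += stride;
    trips++;
    if (i > std::numeric_limits<int32_t>::max()) {
      return -1;  // a 32-bit induction variable would have wrapped here
    }
  }
  return trips;
}

// Hypothetical helper: the value a Java int holds after a + b with wraparound.
int32_t wrapping_add(int32_t a, int32_t b) {
  const int64_t two_pow_32 = int64_t{1} << 32;
  int64_t sum = static_cast<int64_t>(a) + b;
  if (sum > std::numeric_limits<int32_t>::max()) sum -= two_pow_32;
  if (sum < std::numeric_limits<int32_t>::min()) sum += two_pow_32;
  return static_cast<int32_t>(sum);
}
```

With the bounds from the example, a stride of 1 finishes after 45 increments, while an assumed stride of 8 steps from 2147483640 past the `int` maximum, and `wrapping_add(2147483640, 8)` yields the minimum `int` value, which is again below the limit.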
I see that for general counted loops, C2 inserts a limit check predicate in the counted loop construction phase to avoid this issue (it's implemented in `PhaseIdealLoop::insert_loop_limit_check_predicate()`). But I'm not sure if it is possible (and worthwhile) to add a similar limit check predicate for post loops. It looks like current C2 post loops have no place to add extra loop predicates. What's your suggestion for this? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14581#discussion_r1255363034 From stuefe at openjdk.org Fri Jul 7 07:28:58 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 7 Jul 2023 07:28:58 GMT Subject: RFR: JDK-8310233: Fix THP detection on Linux [v3] In-Reply-To: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> References: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> Message-ID: On Tue, 4 Jul 2023 13:02:22 GMT, Thomas Stuefe wrote: >> Today, if we use UseTransparentHugePages, we assume that the static hugepage detection we do is valid for THPs: >> - that THPs use the page size (in hotspot used as "default large page size") found in /proc/meminfo (`Hugepagesize`) >> - that THPs are enabled if that page size is >0. >> >> Both assumptions are incorrect. THPs are enabled depending on the mode in `/sys/kernel/mm/transparent_hugepage/enabled` (tri-state). And the pagesize used by khugepaged is the one set in `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size`. The latter can differ from the default large page size on the system (e.g. static hugepage default size could be 1g, whereas THP hugepage size is 2m). >> >> ------ >> >> About the patch: >> >> This is a limited, minimally invasive patch to fix THP detection. The patch aims to be easy to downport. There is more work to do, which I will do in subsequent RFEs. >> >> Functionally, for *static* (non-THP) pages nothing changes.
THP-mode now correctly detects THP support in the OS, and uses the correct page size (see examples below). >> >> ------------- >> >> Example 1: System has THPs disabled, but static hugepages (1g, 2m) configured: >> >> >> thomas at starfish $ cat /sys/kernel/mm/transparent_hugepage/enabled >> always madvise [never] >> thomas at starfish $ cat /proc/meminfo | grep Hugepage >> Hugepagesize: 1048576 kB >> >> >> Without patch, we incorrectly assume THPs are enabled, and that THP page size is 1G (!), which we then proceed and use as heap page size, causing the heap size to be rounded up from 512m -> 1G. But - even though it is printed as "1G page backed" in log output - in reality it will still 4K-page-backed: the `madvise(2)` we use to set the THP page size will be ignored by the system, since THPs are disabled. >> >> >> thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version >> [0.001s][info][pagesize] Using the default large page size: 1G >> [0.001s][info][pagesize] Usable page sizes: 4k, 2M, 1G >> ... >> [0.016s][info][pagesize] Heap: min=1G max=1G base=0x00000000c0000000 size=1G page_size=1G >> >> >> With patch, we correctly refuse to use large pages (and we log more info): >> >> >> thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version >> [0.001s][info][pagesize] Static hugepage support: 2M, 1G (default) >> [0.001s][info][pagesize] default pagesize: 1G >> [0.001s][info][pagesize] Transparent hugepage (THP) suppor... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Feedback johan src/hotspot/os/linux/hugepages.cpp line 51: > 49: } > 50: > 51: // Scan /proc/meminfo and return value of Hugepagesize Reviewer notes: `scan_default_hugepagesize` is just a code move, it used to live in `scan_default_large_page_size` in os_linux.cpp. 
It is responsible for reading the default *static* hugepage size from /proc/meminfo. This code is mostly untouched to reduce chance for regressions (though the code could be tightened and cleaned for sure). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1255372374 From kbarrett at openjdk.org Fri Jul 7 07:46:01 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 7 Jul 2023 07:46:01 GMT Subject: RFR: 8311575: Fix invalid format parameters [v2] In-Reply-To: References: <-9xQE9UAMLnueaceqy4IhaMqprnPU-UAHBvPpRfqRgo=.f6810a10-7259-4781-87ce-b3bafbbb2767@github.com> Message-ID: On Fri, 7 Jul 2023 06:42:11 GMT, Daniel Jeli?ski wrote: >> Please review this change that fixes a few issues with printf-like function parameters. >> >> No new tests. I'll run tier1 & tier2 before integrating, just to make sure. > > Daniel Jeli?ski has updated the pull request incrementally with one additional commit since the last revision: > > Add newlines A couple pre-existing indentation nits, but otherwise looks good. src/hotspot/os/windows/perfMemory_windows.cpp line 225: > 223: // unexpected error, declare the path insecure > 224: if (PrintMiscellaneous && Verbose) { > 225: warning("could not get attributes for file %s: ", Too bad ATTRIBUTE_PRINTF is empty for Windows. Visual Studio doesn't seem to be an equivalent to gcc's `__attribute__((format(printf(...)))`. src/hotspot/share/runtime/arguments.cpp line 2841: > 2839: if (strncmp(jvmci_compiler, "graal", strlen("graal")) != 0) { > 2840: jio_fprintf(defaultStream::error_stream(), > 2841: "Value of jvmci.Compiler incompatible with +UseGraalJIT: %s", jvmci_compiler); [pre-existing] Unusual indentation. src/hotspot/share/runtime/arguments.cpp line 2858: > 2856: if (!JVMCIGlobals::enable_jvmci_product_mode(origin, use_graal_jit)) { > 2857: jio_fprintf(defaultStream::error_stream(), > 2858: "Unable to enable JVMCI in product mode"); [pre-existing] Unusual indentation. 
src/hotspot/share/runtime/arguments.cpp line 3951: > 3949: if (lvl == NMT_unknown) { > 3950: jio_fprintf(defaultStream::error_stream(), > 3951: "Syntax error, expecting -XX:NativeMemoryTracking=[off|summary|detail]", nullptr); This would have been caught already if JDK-8198918 had been fixed. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14783#pullrequestreview-1518088854 PR Review Comment: https://git.openjdk.org/jdk/pull/14783#discussion_r1255388370 PR Review Comment: https://git.openjdk.org/jdk/pull/14783#discussion_r1255344865 PR Review Comment: https://git.openjdk.org/jdk/pull/14783#discussion_r1255344723 PR Review Comment: https://git.openjdk.org/jdk/pull/14783#discussion_r1255371424 From pli at openjdk.org Fri Jul 7 07:53:15 2023 From: pli at openjdk.org (Pengfei Li) Date: Fri, 7 Jul 2023 07:53:15 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization In-Reply-To: <8KPkr2loby3RVIrYQBiXWv3Ph2E0saSLVDBMFHi88LQ=.b1ffb28d-54a8-4dcc-9472-e53b055a72ee@github.com> References: <8KPkr2loby3RVIrYQBiXWv3Ph2E0saSLVDBMFHi88LQ=.b1ffb28d-54a8-4dcc-9472-e53b055a72ee@github.com> Message-ID: On Thu, 29 Jun 2023 10:54:29 GMT, Emanuel Peter wrote: >> Hi @eme64, >> >> I guess you have done your first round of review. @fg1417 and I really appreciate all your constructive inputs. By reading your comments, I believe you have reviewed this patch in very detail. Thanks again! >> >> What I am doing now: >> >> - I'm trying to fix the issues which I think can be fixed immediately. >> - I'm trying to answer all your simple questions ASAP. >> >> For your request of big refactoring work, I feel like I personally may not have enough time and capability to complete it in a short time. We may need some discussion about it. But it's great to know more about your "hybrid vectorizer" plan from your feedback. It looks like a grand plan, and requires significant effort and cooperation. 
I strongly agree that we need some conversation to discuss where we should move forward and what we can cooperate. Could you give us a moment to digest your idea before we schedule a conversation? >> >> BTW: What's your preferred time for a conversation? We are based in Shanghai (GMT+8) > > Hi @pfustc ! > > I'm glad you appreciate my review. > >> For your request of big refactoring work, I feel like I personally may not have enough time and capability to complete it in a short time. > > Are you under some time constraint? No pressure from my side, take the time you need. > > I would very much love to have a conversation over a video call with you. I think that would be beneficial for all of us. The problem from our side (Oracle) are intellectual property concerns. OpenJDK emails and PR's are all under the Oracle Contributor Agreement. So there I'm free to have conversations. I'm trying to figure out if we can have a similar frame for a video call, sadly it may take a few weeks or months to get that sorted, as many people are on summer vacation. > > Please take some time to digest the feedback. This is a big change set, it will take a while to be ready for integration at any rate. And again, I would really urge you to consider some refactoring of SuperWord in a separate RFE before this change here. > > I'm looking forward to more collaboration - over PR comments, emails, and hopefully eventually video calls as well > Emanuel Hi @eme64, I just experimented with some initial SuperWord refactoring work but found that the refactoring process may cause more crashes/bugs with `PostLoopMultiversioning`. It seems that nobody is currently using this experimental feature or is interested in maintaining it. If we have already reached a consensus that we will abandon it eventually, shall we propose a PR to remove it first before doing the refactoring? I think this may speed up our refactoring process.
An alternative approach is keeping the legacy code in SuperWord for now but tolerating new bugs in `PostLoopMultiversioning`, which already has many bugs. What's your opinion on this? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14581#issuecomment-1624936112 From rehn at openjdk.org Fri Jul 7 08:03:05 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 7 Jul 2023 08:03:05 GMT Subject: [jdk21] RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. [v2] In-Reply-To: <0cOr4cGicJ3M49nFwLBTIt9w2aFpV1sUwl6xyf1Htfk=.6cb7355f-5760-477f-8420-4a9032563159@github.com> References: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> <0cOr4cGicJ3M49nFwLBTIt9w2aFpV1sUwl6xyf1Htfk=.6cb7355f-5760-477f-8420-4a9032563159@github.com> Message-ID: On Thu, 6 Jul 2023 20:16:14 GMT, Robbin Ehn wrote: >> Hi all, >> >> This pull request contains a backport of commit [faf1b822](https://github.com/openjdk/jdk/commit/faf1b822d03b726413d77a2b247dfbbf4db7d57e) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. >> >> The commit being backported was authored by Robbin Ehn on 2 Jul 2023 and was reviewed by Ludovic Henry, Thomas Stuefe and Fei Yang. >> >> Thanks! > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Added missing global defines in jdk 21 > > Signed-off-by: Robbin Ehn @tstuefe @luhenry @RealFYang any takers ? :) ------------- PR Comment: https://git.openjdk.org/jdk21/pull/90#issuecomment-1624957247 From clanger at openjdk.org Fri Jul 7 08:18:53 2023 From: clanger at openjdk.org (Christoph Langer) Date: Fri, 7 Jul 2023 08:18:53 GMT Subject: RFR: JDK-8311285: report some fontconfig related environment variables in hs_err file In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 11:47:49 GMT, Matthias Baesken wrote: > There are a number of important environment variables influencing how fontconfig works.
> See for example > https://man.archlinux.org/man/fonts-conf.5 > Some of them should be added to the list of reported environment variables in hs_err file because e.g. a bad setting for some of them can even lead sometimes to crashes. This looks like useful additions, I'd also vote for adding FONTCONFIG_USE_MMAP. ------------- Marked as reviewed by clanger (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14767#pullrequestreview-1518272552 From stuefe at openjdk.org Fri Jul 7 08:18:54 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 7 Jul 2023 08:18:54 GMT Subject: RFR: JDK-8311285: report some fontconfig related environment variables in hs_err file In-Reply-To: References: Message-ID: <63oYEz_DZFLl-oTb5P1K2aVjcMaukHKIFHeS9LhSPZE=.c6833c2e-8893-47b9-a783-77d55900a563@github.com> On Tue, 4 Jul 2023 11:47:49 GMT, Matthias Baesken wrote: > There are a number of important environment variables influencing how fontconfig works. > See for example > https://man.archlinux.org/man/fonts-conf.5 > Some of them should be added to the list of reported environment variables in hs_err file because e.g. a bad setting for some of them can even lead sometimes to crashes. LGTM, modulo David's advice (remove the comment, add FONTCONFIG_USE_MMAP). Cheers, Thomas ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14767#pullrequestreview-1518279517 From mli at openjdk.org Fri Jul 7 08:41:02 2023 From: mli at openjdk.org (Hamlin Li) Date: Fri, 7 Jul 2023 08:41:02 GMT Subject: RFR: 8311575: Fix invalid format parameters [v2] In-Reply-To: References: <-9xQE9UAMLnueaceqy4IhaMqprnPU-UAHBvPpRfqRgo=.f6810a10-7259-4781-87ce-b3bafbbb2767@github.com> Message-ID: On Fri, 7 Jul 2023 06:42:11 GMT, Daniel Jeli?ski wrote: >> Please review this change that fixes a few issues with printf-like function parameters. >> >> No new tests. I'll run tier1 & tier2 before integrating, just to make sure. 
> > Daniel Jeli?ski has updated the pull request incrementally with one additional commit since the last revision: > > Add newlines Marked as reviewed by mli (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14783#pullrequestreview-1518349954 From djelinski at openjdk.org Fri Jul 7 08:41:05 2023 From: djelinski at openjdk.org (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Fri, 7 Jul 2023 08:41:05 GMT Subject: RFR: 8311575: Fix invalid format parameters [v2] In-Reply-To: References: <-9xQE9UAMLnueaceqy4IhaMqprnPU-UAHBvPpRfqRgo=.f6810a10-7259-4781-87ce-b3bafbbb2767@github.com> Message-ID: On Fri, 7 Jul 2023 07:40:35 GMT, Kim Barrett wrote: >> Daniel Jeli?ski has updated the pull request incrementally with one additional commit since the last revision: >> >> Add newlines > > src/hotspot/os/windows/perfMemory_windows.cpp line 225: > >> 223: // unexpected error, declare the path insecure >> 224: if (PrintMiscellaneous && Verbose) { >> 225: warning("could not get attributes for file %s: ", > > Too bad ATTRIBUTE_PRINTF is empty for Windows. Visual Studio doesn't seem to be an equivalent to > gcc's `__attribute__((format(printf(...)))`. The best I could find in Visual Studio was [_Printf_format_string_](https://learn.microsoft.com/en-us/cpp/code-quality/annotating-function-parameters-and-return-values?view=msvc-170#format-string-parameters) It needs to be specified next to the parameter, and it's only checked when compiling with `-analyze`, so not really useful for us at this moment. > src/hotspot/share/runtime/arguments.cpp line 2858: > >> 2856: if (!JVMCIGlobals::enable_jvmci_product_mode(origin, use_graal_jit)) { >> 2857: jio_fprintf(defaultStream::error_stream(), >> 2858: "Unable to enable JVMCI in product mode"); > > [pre-existing] Unusual indentation. This kind of indentation is pretty common in this file. I'd rather not touch it here. 
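The compile-time format checking discussed above can be sketched as follows. This is an illustration under assumptions, not HotSpot's actual `ATTRIBUTE_PRINTF` definition: `SKETCH_ATTRIBUTE_PRINTF` and `format_message` are hypothetical names, and the macro simply expands to nothing on compilers other than GCC and Clang, which mirrors the situation described for Windows.

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>

// Hypothetical macro for illustration. On GCC and Clang it asks the compiler
// to check the variadic arguments against the format string; elsewhere it
// expands to nothing, so no checking happens.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_ATTRIBUTE_PRINTF(fmt_idx, first_arg_idx) \
  __attribute__((format(printf, fmt_idx, first_arg_idx)))
#else
#define SKETCH_ATTRIBUTE_PRINTF(fmt_idx, first_arg_idx)
#endif

// Parameter 1 is the format string; the variadic arguments start at position 2.
std::string format_message(const char* fmt, ...) SKETCH_ATTRIBUTE_PRINTF(1, 2);

// Formats printf-style into a std::string. Output longer than the local
// buffer is truncated; an encoding error yields an empty string.
std::string format_message(const char* fmt, ...) {
  char buf[256];
  va_list args;
  va_start(args, fmt);
  int written = std::vsnprintf(buf, sizeof(buf), fmt, args);
  va_end(args);
  if (written < 0) {
    return std::string();
  }
  return std::string(buf);
}
```

With the attribute active and format warnings enabled, a call that passes an argument the format string never consumes, or omits one that it needs, should be reported at compile time instead of going unnoticed.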
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14783#discussion_r1255462927 PR Review Comment: https://git.openjdk.org/jdk/pull/14783#discussion_r1255469856 From duke at openjdk.org Fri Jul 7 08:47:11 2023 From: duke at openjdk.org (sid8606) Date: Fri, 7 Jul 2023 08:47:11 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) Message-ID: Implementation of "Foreign Function & Memory API" for s390x. ------------- Commit messages: - Fix whitespace errors and boilerplate text - 8311630:[s390] Implementation of Foreign Function & Memory API (Preview) Changes: https://git.openjdk.org/jdk/pull/14801/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311630 Stats: 1630 lines in 20 files changed: 1567 ins; 3 del; 60 mod Patch: https://git.openjdk.org/jdk/pull/14801.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14801/head:pull/14801 PR: https://git.openjdk.org/jdk/pull/14801 From stuefe at openjdk.org Fri Jul 7 08:53:58 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 7 Jul 2023 08:53:58 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v4] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> Message-ID: On Thu, 6 Jul 2023 16:57:56 GMT, Leo Korinth wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> last cleanups and shade feedback > > src/hotspot/share/runtime/trimNative.cpp line 78: > >> 76: static constexpr int safepoint_poll_ms = 250; >> 77: >> 78: static int64_t now() { return os::javaTimeMillis(); } > > I think it would be better to not use CLOCK_REALTIME in case of clock changes by NTP etc. Okay, changed to os::elapsedTime(). 
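The concern about CLOCK_REALTIME can be illustrated outside of HotSpot with standard C++. The sketch below measures intervals with `std::chrono::steady_clock`, which is monotonic and therefore not affected by wall-clock adjustments such as NTP corrections. The function names are hypothetical, and this is not a description of how `os::elapsedTime()` is implemented; it only assumes that the replacement is backed by a monotonic source.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper: milliseconds read from a monotonic clock.
// The absolute value is meaningless; only differences are useful.
int64_t sketch_now_ms() {
  using namespace std::chrono;
  return duration_cast<milliseconds>(steady_clock::now().time_since_epoch()).count();
}

// Returns true once at least interval_ms milliseconds lie between
// last_ms and now_ms (both taken from sketch_now_ms()).
bool sketch_interval_elapsed(int64_t last_ms, int64_t now_ms, int64_t interval_ms) {
  assert(interval_ms >= 0);
  return (now_ms - last_ms) >= interval_ms;
}
```

With a wall clock, a backwards time adjustment could make `now_ms - last_ms` negative and delay a periodic action, and a forward jump could trigger it early; a monotonic clock avoids both cases.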
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1255488363 From amitkumar at openjdk.org Fri Jul 7 08:56:02 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 7 Jul 2023 08:56:02 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 07:55:03 GMT, sid8606 wrote: > Implementation of "Foreign Function & Memory API" for s390x. Will run test, maybe you want to adopt these changes. That's it for now. src/hotspot/cpu/s390/downcallLinker_s390.cpp line 2: > 1: /* > 2: * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved. Please add the header-year back src/hotspot/cpu/s390/downcallLinker_s390.cpp line 78: > 76: _frame_complete(0), > 77: _frame_size_slots(0), > 78: _oop_maps(NULL) { replace NULL with `nullptr` src/hotspot/cpu/s390/downcallLinker_s390.cpp line 105: > 103: bool needs_return_buffer, > 104: int captured_state_mask, > 105: bool needs_transition) { maybe indent it back (?) src/hotspot/cpu/s390/foreignGlobals_s390.cpp line 2: > 1: /* > 2: * Copyright (c) 2022, Oracle and/or its affiliates. All rights reserved. Please add the header-year back src/hotspot/cpu/s390/foreignGlobals_s390.cpp line 46: > 44: bool ABIDescriptor::is_volatile_reg(FloatRegister reg) const { > 45: return _float_argument_registers.contains(reg) > 46: || _float_additional_volatile_registers.contains(reg); maybe alignment needed (?) 
src/hotspot/cpu/s390/frame_s390.cpp line 231: > 229: UpcallStub* blob = _cb->as_upcall_stub(); > 230: JavaFrameAnchor* jfa = blob->jfa_for_frame(*this); > 231: return jfa->last_Java_sp() == NULL; Replace NULL with nullptr src/hotspot/cpu/s390/frame_s390.cpp line 235: > 233: > 234: frame frame::sender_for_upcall_stub_frame(RegisterMap* map) const { > 235: assert(map != NULL, "map must be set"); use nullptr src/java.base/share/classes/jdk/internal/foreign/abi/s390/S390Architecture.java line 3: > 1: /* > 2: * Copyright (c) 2020, 2023, Oracle and/or its affiliates. All rights reserved. > 3: * Copyright (c) 2023 SAP SE. All rights reserved. Add IBM's Copyrights ;-) ` Copyright (c) 2023, IBM Corp.` src/java.base/share/classes/jdk/internal/foreign/abi/s390/linux/LinuxS390CallArranger.java line 3: > 1: /* > 2: * Copyright (c) 2022, 2023, Oracle and/or its affiliates. All rights reserved. > 3: * Copyright (c) 2023 SAP SE. All rights reserved. Add IBM's copyright src/java.base/share/classes/jdk/internal/foreign/abi/s390/linux/LinuxS390Linker.java line 3: > 1: /* > 2: * Copyright (c) 2022, 2023, Oracle and/or its affiliates. All rights reserved. > 3: * Copyright (c) 2023 SAP SE. All rights reserved. Add IBM's copyright src/java.base/share/classes/jdk/internal/foreign/abi/s390/linux/TypeClass.java line 3: > 1: /* > 2: * Copyright (c) 2022, 2023, Oracle and/or its affiliates. All rights reserved. > 3: * Copyright (c) 2023 SAP SE. All rights reserved. Add IBM's copyright ------------- Changes requested by amitkumar (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/14801#pullrequestreview-1518282531 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255434276 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255447838 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255473422 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255435187 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255440455 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255449488 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255449998 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255464266 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255465193 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255465767 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255467623 From stuefe at openjdk.org Fri Jul 7 08:57:02 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 7 Jul 2023 08:57:02 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v4] In-Reply-To: <9TFQ9gFbpnvV1auQVmccys3RBvejX_z0HsBoN675jVM=.8300c365-2f71-4d8e-b4d5-0977853db25a@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> <9TFQ9gFbpnvV1auQVmccys3RBvejX_z0HsBoN675jVM=.8300c365-2f71-4d8e-b4d5-0977853db25a@github.com> Message-ID: On Thu, 6 Jul 2023 19:04:46 GMT, Robbin Ehn wrote: >> src/hotspot/share/runtime/trimNative.cpp line 149: >> >>> 147: return true; >>> 148: } else { >>> 149: log_info(trim)("Trim native heap (no details)"); >> >> Consistency: `Trim native heap: complete, no details`. > > I would still like to know the trim_time. 
Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1255492490 From luhenry at openjdk.org Fri Jul 7 09:10:07 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Fri, 7 Jul 2023 09:10:07 GMT Subject: [jdk21] RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. [v2] In-Reply-To: <0cOr4cGicJ3M49nFwLBTIt9w2aFpV1sUwl6xyf1Htfk=.6cb7355f-5760-477f-8420-4a9032563159@github.com> References: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> <0cOr4cGicJ3M49nFwLBTIt9w2aFpV1sUwl6xyf1Htfk=.6cb7355f-5760-477f-8420-4a9032563159@github.com> Message-ID: On Thu, 6 Jul 2023 20:16:14 GMT, Robbin Ehn wrote: >> Hi all, >> >> This pull request contains a backport of commit [faf1b822](https://github.com/openjdk/jdk/commit/faf1b822d03b726413d77a2b247dfbbf4db7d57e) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. >> >> The commit being backported was authored by Robbin Ehn on 2 Jul 2023 and was reviewed by Ludovic Henry, Thomas Stuefe and Fei Yang. >> >> Thanks! > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Added missing global defines in jdk 21 > > Signed-off-by: Robbin Ehn Marked as reviewed by luhenry (Committer). 
------------- PR Review: https://git.openjdk.org/jdk21/pull/90#pullrequestreview-1518400983 From jsjolen at openjdk.org Fri Jul 7 09:28:07 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 7 Jul 2023 09:28:07 GMT Subject: RFR: 8310170: Use sp's argument to improve performance of outputStream::indent and remove SP_USE_TABS [v2] In-Reply-To: <2S1OiClWEjbIrH09T0TYODUlJ3rgbTElAHRq7DTb0f8=.5e8313b7-1c01-43a7-b075-bf89a83ca919@github.com> References: <2S1OiClWEjbIrH09T0TYODUlJ3rgbTElAHRq7DTb0f8=.5e8313b7-1c01-43a7-b075-bf89a83ca919@github.com> Message-ID: <6ewFBLFNR4sG5mGbGeWt0RCT4yU8TxMvh-4TbGZ8ZJQ=.3b6af7fd-fceb-4e5e-a5d8-dbaa2200198d@github.com> On Tue, 4 Jul 2023 12:24:10 GMT, Johan Sjölen wrote: >> Hi, >> >> Please consider this small enhancement. `sp` takes a count argument, this was never used by `indent`, let's just use it. > > Johan Sjölen has updated the pull request incrementally with one additional commit since the last revision: > > Remove SP_USE_TABS One structured concurrency test fails on GHA on Linux-x86. That can't be thanks to this patch. Integrating. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14502#issuecomment-1625121627 From jsjolen at openjdk.org Fri Jul 7 09:28:08 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Fri, 7 Jul 2023 09:28:08 GMT Subject: Integrated: 8310170: Use sp's argument to improve performance of outputStream::indent and remove SP_USE_TABS In-Reply-To: References: Message-ID: On Thu, 15 Jun 2023 20:50:11 GMT, Johan Sjölen wrote: > Hi, > > Please consider this small enhancement. `sp` takes a count argument, this was never used by `indent`, let's just use it. This pull request has now been integrated.
Changeset: 92ca670b Author: Johan Sjölen URL: https://git.openjdk.org/jdk/commit/92ca670bf3342aa6d50ddb35e55daed16a285d10 Stats: 11 lines in 1 file changed: 0 ins; 9 del; 2 mod 8310170: Use sp's argument to improve performance of outputStream::indent and remove SP_USE_TABS Reviewed-by: shade, dholmes, stuefe ------------- PR: https://git.openjdk.org/jdk/pull/14502 From duke at openjdk.org Fri Jul 7 10:01:20 2023 From: duke at openjdk.org (sid8606) Date: Fri, 7 Jul 2023 10:01:20 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: > Implementation of "Foreign Function & Memory API" for s390x. sid8606 has updated the pull request incrementally with one additional commit since the last revision: Address Amit's review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14801/files - new: https://git.openjdk.org/jdk/pull/14801/files/68b2470b..e719c164 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=00-01 Stats: 17 lines in 7 files changed: 1 ins; 0 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/14801.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14801/head:pull/14801 PR: https://git.openjdk.org/jdk/pull/14801 From duke at openjdk.org Fri Jul 7 10:01:21 2023 From: duke at openjdk.org (sid8606) Date: Fri, 7 Jul 2023 10:01:21 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 08:53:06 GMT, Amit Kumar wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Address Amit's review comments > > Will run test, maybe you want to adopt these changes. That's it for now. Thank you @offamitkumar for the quick reviews. I have addressed the comments in a new commit.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14801#issuecomment-1625163831 From mbaesken at openjdk.org Fri Jul 7 10:32:12 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 7 Jul 2023 10:32:12 GMT Subject: RFR: JDK-8311285: report some fontconfig related environment variables in hs_err file [v2] In-Reply-To: References: Message-ID: > There are a number of important environment variables influencing how fontconfig works. > See for example > https://man.archlinux.org/man/fonts-conf.5 > Some of them should be added to the list of reported environment variables in hs_err file because e.g. a bad setting for some of them can even lead sometimes to crashes. Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: add FONTCONFIG_USE_MMAP ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14767/files - new: https://git.openjdk.org/jdk/pull/14767/files/76583497..4d19b2d7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14767&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14767&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14767.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14767/head:pull/14767 PR: https://git.openjdk.org/jdk/pull/14767 From mbaesken at openjdk.org Fri Jul 7 10:32:12 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 7 Jul 2023 10:32:12 GMT Subject: RFR: JDK-8311285: report some fontconfig related environment variables in hs_err file In-Reply-To: References: Message-ID: <9zKcjLhKHYecpsqUAlcfplJ_epLw5QeJsQ95LeIDc6s=.de0b7c3c-aa1c-4843-a899-1d50766933b6@github.com> On Tue, 4 Jul 2023 11:47:49 GMT, Matthias Baesken wrote: > There are a number of important environment variables influencing how fontconfig works. 
> See for example > https://man.archlinux.org/man/fonts-conf.5 > Some of them should be added to the list of reported environment variables in hs_err file because e.g. a bad setting for some of them can even lead sometimes to crashes. Hi Thomas and Christoph, thanks for the reviews ! I removed the comment, added FONTCONFIG_USE_MMAP. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14767#issuecomment-1625201822 From mbaesken at openjdk.org Fri Jul 7 10:32:13 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 7 Jul 2023 10:32:13 GMT Subject: Integrated: JDK-8311285: report some fontconfig related environment variables in hs_err file In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 11:47:49 GMT, Matthias Baesken wrote: > There are a number of important environment variables influencing how fontconfig works. > See for example > https://man.archlinux.org/man/fonts-conf.5 > Some of them should be added to the list of reported environment variables in hs_err file because e.g. a bad setting for some of them can even lead sometimes to crashes. This pull request has now been integrated. Changeset: 0ef03f12 Author: Matthias Baesken URL: https://git.openjdk.org/jdk/commit/0ef03f122866f010ebf50683097e9b92e41cdaad Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod 8311285: report some fontconfig related environment variables in hs_err file Reviewed-by: clanger, stuefe ------------- PR: https://git.openjdk.org/jdk/pull/14767 From stuefe at openjdk.org Fri Jul 7 11:29:55 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 7 Jul 2023 11:29:55 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v4] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> Message-ID: On Fri, 7 Jul 2023 04:42:53 GMT, David Holmes wrote: > I had an initial look at this. Seems okay in principle. 
The naming/terminology needs some updating IMO: "trimNative" doesn't convey enough information, please use "trimNativeHeap". "trim" for logging tag is also too non-descript. > > As this is initially experimental I'm not overly concerned about the impact, though I do cringe at yet-another-VM-thread. > > FYI I will be away until next Thursday, but no need to wait for me for further comments. Thank you @dholmes-ora. I dislike the introduction of another thread too, but the benefits are undeniably there, and we need to keep the invocation of malloc_trim away from other code paths. That said, maybe a future glibc will offer us a ways to influence trim to make its runtime predictable. Which would remove the need of an own thread. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1625271681 From stuefe at openjdk.org Fri Jul 7 11:38:56 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 7 Jul 2023 11:38:56 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v4] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com> Message-ID: <_I0WXej4rpV3-EkJXKg6N9ISkOWxfE_axZ11SVM8FxM=.4083d7b7-06b7-4e1e-8177-fe64b7759161@github.com> On Thu, 6 Jul 2023 18:50:45 GMT, Robbin Ehn wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> last cleanups and shade feedback > > src/hotspot/share/runtime/trimNative.cpp line 137: > >> 135: os::size_change_t sc; >> 136: Ticks start = Ticks::now(); >> 137: log_debug(trim)("Trim native heap started..."); > > TraceTime not a good fit? I'll revert to using elapsedTime for all timing needs here. TraceTime seems too much hassle. I want the time combined with other output in one line, and I don't want to provide a TraceTimerLogPrintFunc. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1255690649 From djelinski at openjdk.org Fri Jul 7 12:17:10 2023 From: djelinski at openjdk.org (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Fri, 7 Jul 2023 12:17:10 GMT Subject: RFR: 8311575: Fix invalid format parameters [v2] In-Reply-To: References: <-9xQE9UAMLnueaceqy4IhaMqprnPU-UAHBvPpRfqRgo=.f6810a10-7259-4781-87ce-b3bafbbb2767@github.com> Message-ID: On Fri, 7 Jul 2023 06:42:11 GMT, Daniel Jeliński wrote: >> Please review this change that fixes a few issues with printf-like function parameters. >> >> No new tests. I'll run tier1 & tier2 before integrating, just to make sure. > > Daniel Jeliński has updated the pull request incrementally with one additional commit since the last revision: > > Add newlines Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14783#issuecomment-1625325327 From djelinski at openjdk.org Fri Jul 7 12:17:11 2023 From: djelinski at openjdk.org (Daniel =?UTF-8?B?SmVsacWEc2tp?=) Date: Fri, 7 Jul 2023 12:17:11 GMT Subject: Integrated: 8311575: Fix invalid format parameters In-Reply-To: <-9xQE9UAMLnueaceqy4IhaMqprnPU-UAHBvPpRfqRgo=.f6810a10-7259-4781-87ce-b3bafbbb2767@github.com> References: <-9xQE9UAMLnueaceqy4IhaMqprnPU-UAHBvPpRfqRgo=.f6810a10-7259-4781-87ce-b3bafbbb2767@github.com> Message-ID: On Thu, 6 Jul 2023 11:25:11 GMT, Daniel Jeliński wrote: > Please review this change that fixes a few issues with printf-like function parameters. > > No new tests. I'll run tier1 & tier2 before integrating, just to make sure. This pull request has now been integrated.
Changeset: 34004e16 Author: Daniel Jeliński URL: https://git.openjdk.org/jdk/commit/34004e1666f6adf0e52af553c30b6b0006b4cfb6 Stats: 12 lines in 3 files changed: 0 ins; 0 del; 12 mod 8311575: Fix invalid format parameters Reviewed-by: dholmes, kbarrett, mli ------------- PR: https://git.openjdk.org/jdk/pull/14783 From rkennke at openjdk.org Fri Jul 7 13:03:28 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 7 Jul 2023 13:03:28 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v48] In-Reply-To: References: Message-ID: <3FIoBLM3h8gnTDj1DPvMSEDFiMfK2_oo1jLwz7SuJu4=.437c806e-41c2-4899-be84-9482c620ebc4@github.com> > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16.
> > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: - Merge remote-tracking branch 'origin/JDK-8139457' into JDK-8139457 - Re-instance ZGC changes that initialize the gap in arrays ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/ad244e42..3e37e785 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=47 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=46-47 Stats: 22 lines in 2 files changed: 20 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From jvernee at openjdk.org Fri Jul 7 13:13:06 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Fri, 7 Jul 2023 13:13:06 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 10:01:20 GMT, sid8606 wrote: >> Implementation of "Foreign Function & Memory API" for s390x. > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Address Amit's review comments Overall looks great! 
I'd mostly like to understand what's going on with: https://github.com/openjdk/jdk/pull/14801#discussion_r1255718844 src/hotspot/cpu/s390/downcallLinker_s390.cpp line 159: > 157: int allocated_frame_size = 0; > 158: assert(_abi._shadow_space_bytes == frame::z_abi_160_size, "expected space according to ABI"); > 159: allocated_frame_size = frame::z_abi_160_size; I'm assuming that the 160 byte space is part of the C ABI, not the Java ABI, right? In that case you could just use `_abi._shadow_space_bytes` here, as this code doesn't necessarily know it's calling into C. i.e. that knowledge is captured in the Java code. src/hotspot/cpu/s390/downcallLinker_s390.cpp line 162: > 160: allocated_frame_size += arg_shuffle.out_arg_bytes(); > 161: > 162: bool should_save_return_value = !_needs_return_buffer && _needs_transition;; Since return buffer is not implemented here, I suggest adding an assert that checks that `_needs_return_buffer` is always false. src/hotspot/cpu/s390/downcallLinker_s390.cpp line 207: > 205: __ z_lg(callerSP, _z_abi(callers_sp), Z_SP); // preset (used to access caller frame argument slots) > 206: __ block_comment("{ argument shuffle"); > 207: arg_shuffle.generate(_masm, as_VMStorage(callerSP), frame::z_jit_out_preserve_size, _abi._shadow_space_bytes, locs); I'm not sure exactly what `callerSP` is doing, but it seems to be Z_SP + bias? Why can't the `in_stk_bias` parameter be used for that? (and then use `tmp` for the shuffle reg). 
src/hotspot/cpu/s390/foreignGlobals_s390.cpp line 115: > 113: switch (to_reg.type()) { > 114: case StorageType::INTEGER: > 115: if (to_reg.segment_mask() == REG64_MASK && from_reg.segment_mask() == REG32_MASK ) { Since this deals with 32 bit regs as well, might want to rename the function to just `move_reg` (i.e drop the `64`) src/hotspot/cpu/s390/foreignGlobals_s390.cpp line 186: > 184: case StorageType::FRAME_DATA: { > 185: switch (from_reg.stack_size()) { > 186: case 8: __ mem2reg_opt(Z_R0_scratch, Address (callerSP, reg2offset(from_reg, in_stk_bias)), true); break; A potential issue here is that Z_R0_scratch could be used by the target ABI, that's why the shuffle register is passed as an argument on other platforms. (It also makes it clearer in the calling code that that register is used somehow). src/hotspot/cpu/s390/upcallLinker_s390.cpp line 141: > 139: > 140: // The Java call uses the JIT ABI, but we also call C. > 141: int out_arg_area = MAX2(frame::z_jit_out_preserve_size + arg_shuffle.out_arg_bytes(), (int)frame::z_abi_160_size); What do you mean here with "but we also call C"? Upcall stubs are always calling into Java, though the source ABI is unknown. src/hotspot/cpu/s390/upcallLinker_s390.cpp line 172: > 170: // |---------------------| = frame_bottom_offset = frame_size > 171: // | (optional) | > 172: // | ret_buf | There's no return buffer, so this can be removed. src/hotspot/cpu/s390/upcallLinker_s390.cpp line 232: > 230: > 231: // return value shuffle > 232: if (!needs_return_buffer) { I suggest using an assert here instead. src/hotspot/cpu/s390/upcallLinker_s390.cpp line 243: > 241: case T_CHAR: > 242: case T_INT: > 243: __ z_lgfr(Z_RET, Z_RET); // Clear garbage in high half. Same as PPC here; this should really be done on the Java side. 
(See: https://github.com/openjdk/jdk/pull/12708#issuecomment-1440433079 and related discussion) src/java.base/share/classes/jdk/internal/foreign/abi/AbstractLinker.java line 78: > 76: @CallerSensitive > 77: public final MethodHandle downcallHandle(MemorySegment symbol, FunctionDescriptor function, Option... options) { > 78: Reflection.ensureNativeAccess(Reflection.getCallerClass(), Linker.class, "downcallHandle"); Looks spurious? Please undo src/java.base/share/classes/jdk/internal/foreign/abi/s390/S390Architecture.java line 115: > 113: > 114: private static VMStorage floatRegister(int index) { > 115: return new VMStorage(StorageType.FLOAT, REG64_MASK, index, "v" + index); Maybe this should be `"f"` instead of `"v"`? (given the names of the variables above) Suggestion: return new VMStorage(StorageType.FLOAT, REG64_MASK, index, "f" + index); src/java.base/share/classes/jdk/internal/foreign/abi/s390/linux/LinuxS390CallArranger.java line 136: > 134: return returnLayout > 135: .filter(GroupLayout.class::isInstance) > 136: .filter(layout -> layout instanceof GroupLayout) These lines both do the same, so one can be removed. src/java.base/share/classes/jdk/internal/foreign/abi/s390/linux/TypeClass.java line 114: > 112: return false; > 113: } > 114: } I believe this loop is not needed, since above it's determined that `scalarLayouts` has only 1 element. test/jdk/java/foreign/TestClassLoaderFindNative.java line 63: > 61: public void testVariableSymbolLookup() { > 62: MemorySegment segment = SymbolLookup.loaderLookup().find("c").get().reinterpret(ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN ? 1 : 4); > 63: assertEquals(segment.get(JAVA_BYTE, ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN ? 0 : 3), 42); Could you explain why this is needed? It looks like the lookup is returning the wrong address? 
test/jdk/java/foreign/TestIllegalLink.java line 57: > 55: > 56: private static final boolean IS_SYSV = CABI.current() == CABI.SYS_V; > 57: private static final boolean byteorder = ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN; Please rename this field to something more descriptive, e.g. `IS_LE` (and capitalize). test/jdk/java/foreign/TestLayouts.java line 46: > 44: public class TestLayouts { > 45: > 46: boolean byteorder = ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN; Same here test/jdk/java/foreign/callarranger/platform/PlatformLayouts.java line 312: > 310: * This class defines layout constants modelling standard primitive types supported by the S390 ABI. > 311: */ > 312: public static final class S390 { These are only needed if you plan on adding a CallArranger test as well. ------------- PR Review: https://git.openjdk.org/jdk/pull/14801#pullrequestreview-1518533244 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255680103 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255712765 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255718844 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255660955 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255723226 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255733518 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255735380 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255735815 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255743888 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255645400 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255640717 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255629673 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255624777 PR Review Comment: 
https://git.openjdk.org/jdk/pull/14801#discussion_r1255619049 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255619320 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255620308 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255621583 From shade at openjdk.org Fri Jul 7 13:30:13 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 7 Jul 2023 13:30:13 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v48] In-Reply-To: <3FIoBLM3h8gnTDj1DPvMSEDFiMfK2_oo1jLwz7SuJu4=.437c806e-41c2-4899-be84-9482c620ebc4@github.com> References: <3FIoBLM3h8gnTDj1DPvMSEDFiMfK2_oo1jLwz7SuJu4=.437c806e-41c2-4899-be84-9482c620ebc4@github.com> Message-ID: On Fri, 7 Jul 2023 13:03:28 GMT, Roman Kennke wrote: >> See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. >> >> Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. >> >> Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. >> >> Testing: >> - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] tier1 (x86_64, x86_32, aarch64, riscv) >> - [x] tier2 (x86_64, aarch64, riscv) >> - [x] tier3 (x86_64, riscv) > > Roman Kennke has updated the pull request incrementally with two additional commits since the last revision: > > - Merge remote-tracking branch 'origin/JDK-8139457' into JDK-8139457 > - Re-instance ZGC changes that initialize the gap in arrays Still looks fine to me. Others need to review in likely case we missed something. 
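The gap described in the quoted PR text can be worked out with a little arithmetic. The following is a rough sketch, not code from the patch: it assumes a 64-bit VM with 8-byte words and a 4-byte array length field, and it assumes the new base offset is simply the end of the length field rounded up to the element size. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Rounds value up to the next multiple of alignment (must be a power of two).
size_t sketch_align_up(size_t value, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// The array header ends right after the 4-byte length field (assumed size).
size_t sketch_header_end(size_t length_offset) {
  return length_offset + 4;
}

// Behaviour described as the status quo: elements start at a word-aligned offset.
size_t sketch_base_word_aligned(size_t length_offset) {
  return sketch_align_up(sketch_header_end(length_offset), 8);
}

// Assumed behaviour after the change: align only as much as the element needs.
size_t sketch_base_element_aligned(size_t length_offset, size_t element_size) {
  return sketch_align_up(sketch_header_end(length_offset), element_size);
}
```

With uncompressed class pointers the length field is assumed to sit at offset 16 (8-byte mark word plus 8-byte class pointer), so a `byte[]` would start at 24 under word alignment and at 20 under element alignment. For the Lilliput layout mentioned in the description (length at offset 8), the word-aligned base is 16, which is the gap between offsets 12 and 16, while an `int[]` could start at 12.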
------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/11044#pullrequestreview-1518887145 From stuefe at openjdk.org Fri Jul 7 13:36:22 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 7 Jul 2023 13:36:22 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v5] In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. 
That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 19 additional commits since the last revision: - rework test - Test Unsafe->WB - Renamings - Fix include guard name - Aleksey: changes 3 - Aleksey: cosmetic changes 2 - Aleksey: cosmetic changes 1 - change to os::elapsedTime - Add missing gtest - Purge all mentionings of GC from patch - ... 
and 9 more: https://git.openjdk.org/jdk/compare/312f640b...c919203b

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/14781/files
  - new: https://git.openjdk.org/jdk/pull/14781/files/9b47c64b..c919203b

Webrevs:
  - full: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=04
  - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=03-04

Stats: 2794 lines in 43 files changed: 1466 ins; 910 del; 418 mod
Patch: https://git.openjdk.org/jdk/pull/14781.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781

PR: https://git.openjdk.org/jdk/pull/14781

From stuefe at openjdk.org Fri Jul 7 13:41:04 2023
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Fri, 7 Jul 2023 13:41:04 GMT
Subject: RFR: JDK-8293114: JVM should trim the native heap [v4]
In-Reply-To:
References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com>
Message-ID:

On Thu, 6 Jul 2023 15:38:56 GMT, Leo Korinth wrote:

>> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
>>
>> last cleanups and shade feedback
>
> The description says `-XX:GCTrimNativeHeapInterval= (defaults to 60)`, but the code says milliseconds.

Thanks @lkorinth @shipilev @robehn for the reviews.

Next version:
- renamed TrimNative namespace to NativeHeapTrimmer, the log tag to "trimnh", files, include guards etc
- Reworked the trimmer thread to:
  - uniformly use elapsedTime
  - Use atomics for the trim count
  - get rid of run_inner
  - made suspend count 16bit
  - tightened code around the trim loop
- added minimal gtest
- reworked jtreg test to use whitebox (which required a new WB method for pre-touching memory) and reshaped the test according to Aleksey's suggestions.
- and lots of other smaller stuff.

Just went through the remarks again and I hope I got everything.
-------------

PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1625430752

From duke at openjdk.org Fri Jul 7 14:38:00 2023
From: duke at openjdk.org (sid8606)
Date: Fri, 7 Jul 2023 14:38:00 GMT
Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v2]
In-Reply-To:
References:
Message-ID: <6QyvVMmV5y98wD5_BsFRVEL9-GFXUcOm5_pc0dRtQBw=.8a7e4b93-0633-4333-a1aa-f3af865f073d@github.com>

On Fri, 7 Jul 2023 10:32:54 GMT, Jorn Vernee wrote:

>> sid8606 has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Address Amit's review comments
>
> test/jdk/java/foreign/TestClassLoaderFindNative.java line 63:
>
>> 61: public void testVariableSymbolLookup() {
>> 62: MemorySegment segment = SymbolLookup.loaderLookup().find("c").get().reinterpret(ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN ? 1 : 4);
>> 63: assertEquals(segment.get(JAVA_BYTE, ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN ? 0 : 3), 42);
>
> Could you explain why this is needed? It looks like the lookup is returning the wrong address?

s390x runs in big-endian mode, so the least significant byte of an integer-sized value sits at the highest address.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255921714

From duke at openjdk.org Fri Jul 7 14:43:05 2023
From: duke at openjdk.org (Ashutosh Mehra)
Date: Fri, 7 Jul 2023 14:43:05 GMT
Subject: RFR: JDK-8293114: JVM should trim the native heap [v4]
In-Reply-To:
References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <2pqzHiCRT7vsx__-YySEPBQzWhFGrY1ubzFEezhBBig=.c2677fbc-c461-4ccb-8807-028ba18e23a0@github.com>
Message-ID:

On Fri, 7 Jul 2023 13:38:34 GMT, Thomas Stuefe wrote:

> (just noticed the patch adds +666 lines, bad sign, I should add another line somewhere).

It also deletes 2 lines so that makes it 664 ?
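
As an aside on the s390x reply in the FFM thread above (PR 14801): the byte offsets `0` and `3` in the quoted test follow from where the least significant byte of a 4-byte integer is stored under each byte order. The sketch below is not taken from the PR. It uses only `java.nio.ByteBuffer`, and the class and method names (`LsbOffsetDemo`, `lsbOffset`) are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative only: LsbOffsetDemo and lsbOffset are hypothetical names.
class LsbOffsetDemo {
    // Returns the offset, within a 4-byte int, of the byte that holds
    // the least significant 8 bits when stored in the given byte order.
    static int lsbOffset(ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(4).order(order);
        buf.putInt(0, 42); // 42 fits in one byte, so exactly one byte is non-zero
        for (int i = 0; i < 4; i++) {
            if (buf.get(i) == 42) {
                return i;
            }
        }
        throw new AssertionError("value byte not found");
    }

    public static void main(String[] args) {
        System.out.println("little-endian LSB offset: " + lsbOffset(ByteOrder.LITTLE_ENDIAN));
        System.out.println("big-endian LSB offset:    " + lsbOffset(ByteOrder.BIG_ENDIAN));
        System.out.println("native order here:        " + ByteOrder.nativeOrder());
    }
}
```

On a little-endian platform the value byte is at offset 0; on a big-endian platform such as s390x it is at offset 3, which matches the `? 0 : 3` selection in the test.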
------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1625515410 From duke at openjdk.org Fri Jul 7 14:58:00 2023 From: duke at openjdk.org (sid8606) Date: Fri, 7 Jul 2023 14:58:00 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 12:15:39 GMT, Jorn Vernee wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Address Amit's review comments > > src/hotspot/cpu/s390/upcallLinker_s390.cpp line 141: > >> 139: >> 140: // The Java call uses the JIT ABI, but we also call C. >> 141: int out_arg_area = MAX2(frame::z_jit_out_preserve_size + arg_shuffle.out_arg_bytes(), (int)frame::z_abi_160_size); > > What do you mean here with "but we also call C"? Upcall stubs are always calling into Java, though the source ABI is unknown. We do native calls in this stub, so make sure allocated stack is big enough. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1255953719 From iklam at openjdk.org Fri Jul 7 14:58:02 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 7 Jul 2023 14:58:02 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7] In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 23:23:07 GMT, Doug Simon wrote: >> In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. 
>> >> This PR adds support for `-Xlog:class+load+cause` and `-Xlog:class+load+cause+native` that produce output according to a new `LogClassLoadingCauseFor` VM flag: >> >> >> product(ccstr, LogClassLoadingCauseFor, nullptr, \ >> "Apply -Xlog:class+load+cause* to classes whose fully " \ >> "qualified name contains this string ("*" matches " \ >> "any class).") \ >> >> >> Example usage: >> >> java "-Xlog:class+load+cause*" -XX:LogClassLoadingCauseFor=java.util.concurrent.ConcurrentHashMap$V --version >> [0.075s][info][class,load,cause] Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView: >> [0.075s][info][class,load,cause] at java.util.concurrent.ConcurrentHashMap.values(java.base/ConcurrentHashMap.java:1263) >> [0.075s][info][class,load,cause] at jdk.internal.loader.NativeLibraries.find(java.base/NativeLibraries.java:102) >> [0.075s][info][class,load,cause] at java.lang.ClassLoader.findNative(java.base/ClassLoader.java:2457) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.init(java.base/Native Method) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.(java.base/UnixNativeDispatcher.java:817) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystem.(java.base/UnixFileSystem.java:96) >> [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystem.(java.base/BsdFileSystem.java:50) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystem.(java.base/MacOSXFileSystem.java:52) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:44) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:37) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystemProvider.(java.base/UnixFileSystemProvider.java:78) >> [0.075s][info][class,load,cau... > > Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. 
The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > use OS specific native stack printing in class load cause native stack logging Changes requested by iklam (Reviewer). src/hotspot/share/oops/instanceKlass.cpp line 3749: > 3747: > 3748: print_class_load_cause_logging(); > 3749: This prints the stacks before printing the name of the class. I think the name should be printed first. ------------- PR Review: https://git.openjdk.org/jdk/pull/14553#pullrequestreview-1519101603 PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1255952066 From duke at openjdk.org Fri Jul 7 15:14:55 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Fri, 7 Jul 2023 15:14:55 GMT Subject: RFR: 8311604: Simplify NOCOOPS requested addresses for archived heap objects In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 23:39:34 GMT, Ioi Lam wrote: > This PR attempts to clean up some of the cruds in the existing code: > > - Simplified the calculation of "requested address" when `UseCompressedOops` is disabled -- the archived heap objects are always written starting from 0x10000000 > - Removed `HeapShared::to_requested_address()` so we don't have two kinds of "requested address" > - Updated the comments about "source" vs "buffered" vs "requested" addresses in archiveHeapWriter.hpp > - Removed `SerializeClosure::oop()` as the only oop we need to store into the archive header is `HeapShared::roots()`, which can be handled more easily with `FileMapHeader::_heap_roots_offset` > - Removed some unnecessary dependencies on `G1CollectedHeap::heap()->reserved()` > > Also: > - Moved SerializeClosure to its own header file to improve build time. > - Fixed DeterministicDump.java, which wasn't archiving Java objects when `UseCompressedOops` was disabled. Marked as reviewed by ashu-mehra at github.com (no known OpenJDK username). minor nitpicks, otherwise looks good! 
src/hotspot/share/cds/archiveHeapWriter.hpp line 80:

> 78: //
> 79: // - "buffered objects" are copies of the "source objects", and are stored in into
> 80: // ArchiveHeapWriter::_buffer, which is a GrowableArray that sites outside of

s/sites/sits

src/hotspot/share/cds/filemap.cpp line 2072:

> 2070: // We can avoid relocation if each region is mapped into the exact same address
> 2071: // where it was at dump time.
> 2072: return (address)ArchiveHeapWriter::NOCOOPS_REQUESTED_BASE;

Can you please update the comment above this line as well to remove `each region`, as we only have a single heap region now.

-------------

PR Review: https://git.openjdk.org/jdk/pull/14792#pullrequestreview-1519128072
PR Comment: https://git.openjdk.org/jdk/pull/14792#issuecomment-1625561347
PR Review Comment: https://git.openjdk.org/jdk/pull/14792#discussion_r1255974422
PR Review Comment: https://git.openjdk.org/jdk/pull/14792#discussion_r1255981846

From kvn at openjdk.org Fri Jul 7 15:16:30 2023
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Fri, 7 Jul 2023 15:16:30 GMT
Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2]
In-Reply-To:
References:
Message-ID:

On Mon, 3 Jul 2023 07:37:22 GMT, Pengfei Li wrote:

>> ## TL;DR
>>
>> This patch completely re-implements C2's experimental post loop vectorization for better stability, maintainability and performance. Compared with the original implementation, this new implementation adds a standalone loop phase in C2's ideal loop phases and can vectorize more post loops. The original implementation and all code related to multi-versioned post loops are deleted in this patch. More details about this patch can be found in the document replied in this pull request.
>
> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision:
>
> Address part of comments from Emanuel

Yes, you can remove old code first. And work on new implementation after that.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14581#issuecomment-1625560237

From epeter at openjdk.org Fri Jul 7 15:16:30 2023
From: epeter at openjdk.org (Emanuel Peter)
Date: Fri, 7 Jul 2023 15:16:30 GMT
Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 7 Jul 2023 15:11:20 GMT, Vladimir Kozlov wrote:

>> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Address part of comments from Emanuel
>
> Yes, you can remove old code first. And work on new implementation after that.

Thanks for weighing in @vnkozlov . I think that is the best way as well. It makes refactoring much easier.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14581#issuecomment-1625560727

From dnsimon at openjdk.org Fri Jul 7 15:30:04 2023
From: dnsimon at openjdk.org (Doug Simon)
Date: Fri, 7 Jul 2023 15:30:04 GMT
Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7]
In-Reply-To:
References:
Message-ID:

On Fri, 7 Jul 2023 14:53:50 GMT, Ioi Lam wrote:

>> Doug Simon has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>>
>> use OS specific native stack printing in class load cause native stack logging
>
> src/hotspot/share/oops/instanceKlass.cpp line 3749:
>
>> 3747:
>> 3748: print_class_load_cause_logging();
>> 3749:
>
> This prints the stacks before printing the name of the class. I think the name should be printed first.

Not sure what you mean. The `class+load+cause` log output includes the name before the stack. See the updated examples in this PR's description.
If you mean `class+load+cause` log output precedes `class+load` log output, then yes, that is the case and it's intentional to keep these logging features independent of each other as David suggested (https://github.com/openjdk/jdk/pull/14553#discussion_r1249847278). Since they are independent, it should not matter which output comes first. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1256006266 From simonis at openjdk.org Fri Jul 7 15:48:53 2023 From: simonis at openjdk.org (Volker Simonis) Date: Fri, 7 Jul 2023 15:48:53 GMT Subject: RFR: 8311500: StackWalker.getCallerClass() can throw if invoked reflectively [v2] In-Reply-To: References: Message-ID: <3VRIubx6RtYXAe87vPtEjh0jOFaK6W9xGEY4_mATMRA=.0e77fd3a-6d7c-4d0f-a7b8-094a91849f02@github.com> On Wed, 5 Jul 2023 19:14:07 GMT, Mandy Chung wrote: >> Volker Simonis has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename new parameter according to the HS coding conventions > > Thanks for catching this issue. I agree that `Method::invoke` should be skipped the caller-sensitive test in this case but the fix isn't quite right. The caller-sensitive test should apply in any batch. For example, `CSM` calls `getCallerClass` reflectively, I think the stack would look like this: > > > java.lang.StackWalker::getCallerClass > java.lang.invoke.DirectMethodHandle$Holder::invokeStatic > java.lang.invoke.LambdaForm$MH/0x0000000800002c00::invoke > : > : > jdk.internal.reflect.DirectMethodHandleAccessor::invokeImpl > jdk.internal.reflect.DirectMethodHandleAccessor::invoke > java.lang.reflect.Method::invoke > CSM <--------- caller-sensitive method and UOE should be thrown > > > > In this case, UOE should be thrown. 
By the way, great that you raised this issue @mlchung :)

While writing a test for it I realized that the current implementation is *already failing* to throw an `UnsupportedOperationException` if `StackWalker::getCallerClass()` is called reflectively from a `@CallerSensitive` method (because the caller-sensitive method is then not the first method, i.e. not at `index == start_index`):

[0,161s][debug][stackwalk] Start walking: mode 6 skip 0 frames batch size 6
[0,161s][debug][stackwalk]
[0,161s][debug][stackwalk] skip java.lang.StackStreamFactory$AbstractStackWalker::callStackWalk
[0,161s][debug][stackwalk] skip java.lang.StackStreamFactory$AbstractStackWalker::beginStackWalk
[0,161s][debug][stackwalk] skip java.lang.StackStreamFactory$AbstractStackWalker::walkHelper
[0,161s][debug][stackwalk] skip java.lang.StackStreamFactory$AbstractStackWalker::walk
[0,161s][debug][stackwalk] skip java.lang.StackStreamFactory$CallerClassFinder::findCaller
[0,161s][debug][stackwalk] skip java.lang.StackWalker::getCallerClass
[0,161s][debug][stackwalk] fill_in_frames limit=6 start=2 frames length=8
[0,161s][debug][stackwalk] hidden method: java.lang.invoke.DirectMethodHandle$Holder::invokeSpecial
[0,161s][debug][stackwalk] hidden method: java.lang.invoke.LambdaForm$MH/0x00000007c0084c00::invoke
[0,161s][debug][stackwalk] hidden method: java.lang.invoke.Invokers$Holder::invokeExact_MT
[0,161s][debug][stackwalk] hidden method: java.util.CallerSensitiveMethod$$InjectedInvoker/0x00000007c0085000::reflect_invoke_V
[0,161s][debug][stackwalk] hidden method: java.lang.invoke.DirectMethodHandle$Holder::invokeStatic
[0,161s][debug][stackwalk] hidden method: java.lang.invoke.LambdaForm$MH/0x00000007c0085400::invokeExact_MT
[0,161s][debug][stackwalk] hidden method: jdk.internal.reflect.DirectMethodHandleAccessor::invokeImpl
[0,161s][debug][stackwalk] 2: frame method: jdk.internal.reflect.DirectMethodHandleAccessor::invoke bci=24
[0,161s][debug][stackwalk] 2: done frame method:
jdk.internal.reflect.DirectMethodHandleAccessor::invoke
[0,161s][debug][stackwalk] 3: frame method: java.lang.reflect.Method::invoke bci=90
[0,161s][debug][stackwalk] 3: done frame method: java.lang.reflect.Method::invoke
[0,161s][debug][stackwalk] 4: frame method: java.util.CallerSensitiveMethod::getCallerClass bci=110
[0,161s][debug][stackwalk] 4: done frame method: java.util.CallerSensitiveMethod::getCallerClass
[0,161s][debug][stackwalk] 5: frame method: ReflectiveGetCallerClassTest::main bci=38
[0,161s][debug][stackwalk] 5: done frame method: ReflectiveGetCallerClassTest::main
[0,161s][debug][stackwalk] fill_in_frames done frames_decoded=4 at_end=1
doStackWalk: skip 0 start 2 end 6
frame 2: class jdk.internal.reflect.DirectMethodHandleAccessor
skip: frame 2 class jdk.internal.reflect.DirectMethodHandleAccessor
next frame at 2: class jdk.internal.reflect.DirectMethodHandleAccessor (origin 2 fence 6)
skip: frame 3 class java.lang.reflect.Method
next frame at 3: class java.lang.reflect.Method (origin 3 fence 6)
next frame at 4: class java.util.CallerSensitiveMethod (origin 4 fence 6)
next frame at 5: class ReflectiveGetCallerClassTest (origin 5 fence 6)
CallerSensitiveMethod::getCallerClass() called from class ReflectiveGetCallerClassTest
Exception in thread "main" java.lang.RuntimeException: Expected UnsupportedOperationException when calling StackWalker::getCallerClass() from @CallerSensitive method reflectively
at java.base/java.util.CallerSensitiveMethod.getCallerClass(CallerSensitiveMethod.java:79)
at ReflectiveGetCallerClassTest.main(ReflectiveGetCallerClassTest.java:84)

The existing test under `test/jdk/java/lang/StackWalker/CallerSensitiveMethod/src/java.base/java/util/CSM.java` for calls from `@CallerSensitive` methods only handles direct but not reflective calls.

I'll post a patch which will hopefully fix both cases later today.
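
For readers without a JDK build at hand, the following is a minimal sketch of the baseline behaviour this thread builds on: invoking `StackWalker.getCallerClass()` through `Method.invoke` from ordinary application code, where the documented contract is that reflection frames are filtered out. It does not reproduce the `@CallerSensitive` case from the log above, since that annotation is internal to `java.base`. The class names `ReflectiveCallerDemo` and `Client` are made up for illustration and are not part of the PR or its tests.

```java
import java.lang.reflect.Method;

// Illustrative only: not the jtreg test discussed in the thread.
class ReflectiveCallerDemo {
    private static final StackWalker WALKER =
            StackWalker.getInstance(StackWalker.Option.RETAIN_CLASS_REFERENCE);

    // Calls StackWalker::getCallerClass via core reflection. The frames of
    // Method::invoke and its accessors are expected to be skipped, so the
    // result should be the class of whoever called this method.
    static Class<?> callerOfThisMethod() {
        try {
            Method getCallerClass = StackWalker.class.getMethod("getCallerClass");
            return (Class<?>) getCallerClass.invoke(WALKER);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static final class Client {
        static Class<?> probe() {
            return callerOfThisMethod();
        }
    }

    public static void main(String[] args) {
        System.out.println("caller seen through reflection: " + Client.probe());
    }
}
```

Here `Client.probe()` should yield `Client.class`. The failure described above is the complementary case: when the method calling `getCallerClass()` reflectively is itself `@CallerSensitive`, an `UnsupportedOperationException` is expected but is not thrown.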
-------------

PR Comment: https://git.openjdk.org/jdk/pull/14773#issuecomment-1625607182

From iklam at openjdk.org Fri Jul 7 15:54:59 2023
From: iklam at openjdk.org (Ioi Lam)
Date: Fri, 7 Jul 2023 15:54:59 GMT
Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7]
In-Reply-To:
References:
Message-ID:

On Fri, 7 Jul 2023 15:27:04 GMT, Doug Simon wrote:

>> src/hotspot/share/oops/instanceKlass.cpp line 3749:
>>
>>> 3747:
>>> 3748: print_class_load_cause_logging();
>>> 3749:
>>
>> This prints the stacks before printing the name of the class. I think the name should be printed first.
>
> Not sure what you mean. The `class+load+cause` log output includes the name before the stack. See the updated examples in this PR's description.
>
> If you mean `class+load+cause` log output precedes `class+load` log output, then yes, that is the case and it's intentional to keep these logging features independent of each other as David suggested (https://github.com/openjdk/jdk/pull/14553#discussion_r1249847278). Since they are independent, it should not matter which output comes first.

Even if they are independent, relevant output should be grouped together. Currently, if we specify `-Xlog:class+load*=debug`, the output would look like this:

Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView
...
[many many lines of output]
...
java.util.concurrent.ConcurrentHashMap$ValuesView source: shared objects file
klass: 0x000000080033a0f8 super: 0x000000080033a320 loader: [loader data: 0x00007faa34113890 of 'bootstrap']

This makes it difficult to see all the information of this class at a glance.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1256041975

From mchung at openjdk.org Fri Jul 7 16:12:56 2023
From: mchung at openjdk.org (Mandy Chung)
Date: Fri, 7 Jul 2023 16:12:56 GMT
Subject: RFR: 8311500: StackWalker.getCallerClass() can throw if invoked reflectively [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 5 Jul 2023 17:25:24 GMT, Volker Simonis wrote:

>> As the included jtreg test demonstrates, `StackWalker.getCallerClass()` can throw an `UnsupportedOperationException` if called reflectively. Currently this only happens if we invoke `StackWalker.getCallerClass()` recursively reflectively, but this issue will become more prominent once we fix [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447). The gory details follow below:
>>
>> The protocol between the Java API and the JVM for `StackWalker.getCallerClass()/walk()` is as follows:
>> - On the Java side, `StackWalker` calls into `StackStreamFactory` for the real work.
>> - For `StackWalker.getCallerClass()` `StackStreamFactory` basically creates a `Class[]` which will be passed down and filled in the JVM. For `StackWalker.walk()` it will normally be a `StackFrameInfo[]` (or a `LiveStackFrameInfo[]` if the internal `ExtendedOption.LOCALS_AND_OPERANDS` option was used).
>> - The default size of these arrays is currently `StackStreamFactory.SMALL_BATCH` which is 8 (but see [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447)).
>> - `StackStreamFactory` then calls `AbstractStackWalker.callStackWalk()` which is natively implemented in the VM by `JVM_CallStackWalk()`.
>> - `JVM_CallStackWalk()` calls `StackWalk::walk()` which calls `StackWalk::fetchFirstBatch()` which calls `StackWalk::fill_in_frames()` which walks the stack and fills in the available class/stackframe slots in the passed in array until the array is full or there are no more stack frames.
>> - Once `StackWalk::fill_in_frames()` returns, `StackWalk::fetchFirstBatch()` calls back to Java by invoking `AbstractStackWalker::doStackWalk()` to consume the result.
>> - `AbstractStackWalker::doStackWalk()` calls `consumeFrames()` (which is overridden depending on whether we initially called `getCallerClass()` or `walk()`) which consumes the frames until it either finishes (e.g. finds the caller class) or until there are no more frames.
>> - In the latter case `consumeFrames()` will call into the VM again by calling `AbstractStackWalker.fetchStackFrames()` to fetch additional frames from the stack.
>> - `AbstractStackWalker.fetchStackFrames()` is implemented by `JVM_MoreStackWalk()` which calls `StackWalk::fetchNextBatch()` which calls `StackWalk::fill_in_frames()` (the same method that already fetched the initial batch of frames).
>>
>> Following is a stacktrace of what I've explained so far:
>>
>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>> V [libjvm.so+0x1...
>
> Volker Simonis has updated the pull request incrementally with one additional commit since the last revision:
>
> Rename new parameter according to the HS coding conventions

Yes, that's the bug. `index == start_index` assumes the first frame is the caller of `getCallerClass`, so it misses the reflective case.
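
The difference between the two checks can be pictured with a small toy model. This is only a sketch under simplifying assumptions: the `Frame` type, the method names and the frame lists are hypothetical and do not correspond to the HotSpot code in `StackWalk::fill_in_frames()`. Reflection frames are treated as transparent here, even though `Method::invoke` is itself caller-sensitive in the real JDK.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these types or methods exist in HotSpot.
class CallerCheckModel {
    static final class Frame {
        final String method;
        final boolean reflective;      // Method::invoke and its accessor frames
        final boolean callerSensitive; // stands in for @CallerSensitive
        Frame(String method, boolean reflective, boolean callerSensitive) {
            this.method = method;
            this.reflective = reflective;
            this.callerSensitive = callerSensitive;
        }
    }

    // Models a check that looks only at the frame at the start index.
    static boolean rejectsCheckingOnlyFirstFrame(List<Frame> frames) {
        return !frames.isEmpty() && frames.get(0).callerSensitive;
    }

    // Models a check that first skips reflection frames and then tests
    // the first real caller of getCallerClass.
    static boolean rejectsCheckingFirstRealCaller(List<Frame> frames) {
        for (Frame f : frames) {
            if (!f.reflective) {
                return f.callerSensitive;
            }
        }
        return false;
    }

    // Frames above getCallerClass when a caller-sensitive method calls it directly.
    static List<Frame> directCall() {
        return Arrays.asList(
                new Frame("CallerSensitiveMethod::getCallerClass", false, true),
                new Frame("Test::main", false, false));
    }

    // Frames above getCallerClass when the same method calls it via Method::invoke.
    static List<Frame> reflectiveCall() {
        return Arrays.asList(
                new Frame("DirectMethodHandleAccessor::invoke", true, false),
                new Frame("Method::invoke", true, false),
                new Frame("CallerSensitiveMethod::getCallerClass", false, true),
                new Frame("Test::main", false, false));
    }
}
```

In this model both checks reject the direct call, but only the second one rejects the reflective call, which is the gap described in the reply above.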
------------- PR Comment: https://git.openjdk.org/jdk/pull/14773#issuecomment-1625638581 From dnsimon at openjdk.org Fri Jul 7 16:32:22 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 7 Jul 2023 16:32:22 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v8] In-Reply-To: References: Message-ID: > In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. > > This PR adds support for `-Xlog:class+load+cause` and `-Xlog:class+load+cause+native` that produce output according to a new `LogClassLoadingCauseFor` VM flag: > > > product(ccstr, LogClassLoadingCauseFor, nullptr, \ > "Apply -Xlog:class+load+cause* to classes whose fully " \ > "qualified name contains this string ("*" matches " \ > "any class).") \ > > > Example usage: > > java "-Xlog:class+load+cause*" -XX:LogClassLoadingCauseFor=java.util.concurrent.ConcurrentHashMap$V --version > [0.075s][info][class,load,cause] Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView: > [0.075s][info][class,load,cause] at java.util.concurrent.ConcurrentHashMap.values(java.base/ConcurrentHashMap.java:1263) > [0.075s][info][class,load,cause] at jdk.internal.loader.NativeLibraries.find(java.base/NativeLibraries.java:102) > [0.075s][info][class,load,cause] at java.lang.ClassLoader.findNative(java.base/ClassLoader.java:2457) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.init(java.base/Native Method) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.(java.base/UnixNativeDispatcher.java:817) > [0.075s][info][class,load,cause] at 
sun.nio.fs.UnixFileSystem.(java.base/UnixFileSystem.java:96) > [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystem.(java.base/BsdFileSystem.java:50) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystem.(java.base/MacOSXFileSystem.java:52) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:44) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:37) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystemProvider.(java.base/UnixFileSystemProvider.java:78) > [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystemProvider.(java.b... Doug Simon has updated the pull request incrementally with one additional commit since the last revision: log class+load+cause after class+load ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14553/files - new: https://git.openjdk.org/jdk/pull/14553/files/e6ffca32..d8418684 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14553&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14553&range=06-07 Stats: 91 lines in 1 file changed: 22 ins; 25 del; 44 mod Patch: https://git.openjdk.org/jdk/pull/14553.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14553/head:pull/14553 PR: https://git.openjdk.org/jdk/pull/14553 From dnsimon at openjdk.org Fri Jul 7 16:32:23 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 7 Jul 2023 16:32:23 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 15:51:58 GMT, Ioi Lam wrote: >> Not sure what you mean. The `class+load+cause` log output includes the name before the stack. See the updated examples in this PR's description. 
>> >> If you mean `class+load+cause` log output precedes `class+load` log output, then yes, that is the case and it's intentional to keep these logging features independent of each other as David suggested (https://github.com/openjdk/jdk/pull/14553#discussion_r1249847278). Since they are independent, it should not matter which output comes first. > > Even if they are independent, relevant output should be grouped together. Currently, if we specify `-Xlog:class+load*=debug`, the output would look like this: > > > Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView > ... > [many many lines of output] > ... > java.util.concurrent.ConcurrentHashMap$ValuesView source: shared objects file > klass: 0x000000080033a0f8 super: 0x000000080033a320 loader: [loader data: 0x00007faa34113890 of 'bootstrap'] > > This makes it difficult to see all the information of this class at a glance. If I'm interested in class load cause logging of `java.util.concurrent.ConcurrentHashMap$ValuesView`, I don't particularly care about the `class+load` output for that class. Note that some of the `class+load+placeholders` output is going to be shown after `class+load+cause` anyway. That said, I've pushed a commit to implement your request: d84186840b874ba74e270f6ac51cc0ed71d655f5 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1256098870 From iklam at openjdk.org Fri Jul 7 17:02:02 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 7 Jul 2023 17:02:02 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 16:27:30 GMT, Doug Simon wrote: >> Even if they are independent, relevant output should be grouped together. Currently, if we specify `-Xlog:class+load*=debug`, the output would look like this: >> >> >> Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView >> ... >> [many many lines of output] >> ... 
>> java.util.concurrent.ConcurrentHashMap$ValuesView source: shared objects file >> klass: 0x000000080033a0f8 super: 0x000000080033a320 loader: [loader data: 0x00007faa34113890 of 'bootstrap'] >> >> This makes it difficult to see all the information of this class at a glance. > > If I'm interested in class load cause logging of `java.util.concurrent.ConcurrentHashMap$ValuesView`, I don't particularly care about the `class+load` output for that class. > Note that some of the `class+load+placeholders` output is going to be shown after `class+load+cause` anyway. > That said, I've pushed a commit to implement your request: d84186840b874ba74e270f6ac51cc0ed71d655f5 It's not about personal preferences. We should organize the logging so that they are grouped logically. That way, they can be more useful for others who might use the log tags in a different way. The `class+load+placeholders` logs are for dealing with parallel loading of the same class by different threads. This happens before the class is actually loaded, so those logs cannot appear after the main `class+load` tag. 
***

To minimize the delta in this PR, and limit code indentation, it's better to move the old code to a new function, so you have something like:

void InstanceKlass::print_class_load_logging(ClassLoaderData* loader_data,
                                             const ModuleEntry* module_entry,
                                             const ClassFileStream* cfs) const {
  if (ClassListWriter::is_enabled()) {
    ClassListWriter::write(this, cfs);
  }
>>> add
  print_class_load_helper();
  print_class_load_cause_logging();
}

void InstanceKlass::print_class_load_helper(ClassLoaderData* loader_data,
                                            const ModuleEntry* module_entry,
                                            const ClassFileStream* cfs) const {
<<< end

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1256138240

From iklam at openjdk.org Fri Jul 7 17:12:09 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 7 Jul 2023 17:12:09 GMT Subject: RFR: 8311604: Simplify NOCOOPS requested addresses for archived heap objects [v2] In-Reply-To: References: Message-ID:

> This PR attempts to clean up some of the cruds in the existing code:
>
> - Simplified the calculation of "requested address" when `UseCompressedOops` is disabled -- the archived heap objects are always written starting from 0x10000000
> - Removed `HeapShared::to_requested_address()` so we don't have two kinds of "requested address"
> - Updated the comments about "source" vs "buffered" vs "requested" addresses in archiveHeapWriter.hpp
> - Removed `SerializeClosure::oop()` as the only oop we need to store into the archive header is `HeapShared::roots()`, which can be handled more easily with `FileMapHeader::_heap_roots_offset`
> - Removed some unnecessary dependencies on `G1CollectedHeap::heap()->reserved()`
>
> Also:
> - Moved SerializeClosure to its own header file to improve build time.
> - Fixed DeterministicDump.java, which wasn't archiving Java objects when `UseCompressedOops` was disabled.
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @ashu-mehra review; updated source code comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14792/files - new: https://git.openjdk.org/jdk/pull/14792/files/88d2b338..fb8a2165 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14792&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14792&range=00-01 Stats: 4 lines in 2 files changed: 1 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14792.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14792/head:pull/14792 PR: https://git.openjdk.org/jdk/pull/14792 From iklam at openjdk.org Fri Jul 7 17:12:11 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 7 Jul 2023 17:12:11 GMT Subject: RFR: 8311604: Simplify NOCOOPS requested addresses for archived heap objects [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 15:10:20 GMT, Ashutosh Mehra wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> @ashu-mehra review; updated source code comments > > src/hotspot/share/cds/filemap.cpp line 2072: > >> 2070: // We can avoid relocation if each region is mapped into the exact same address >> 2071: // where it was at dump time. >> 2072: return (address)ArchiveHeapWriter::NOCOOPS_REQUESTED_BASE; > > Can you please update the comment above this line as well to remove `each region` as we only have single heap region now. Thanks for the review. I updated the comment to clarify that this address was hard-coded at dump time, and explained the reason for that. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14792#discussion_r1256146742 From dnsimon at openjdk.org Fri Jul 7 18:01:14 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 7 Jul 2023 18:01:14 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v9] In-Reply-To: References: Message-ID: > In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. > > This PR adds support for `-Xlog:class+load+cause` and `-Xlog:class+load+cause+native` that produce output according to a new `LogClassLoadingCauseFor` VM flag: > > > product(ccstr, LogClassLoadingCauseFor, nullptr, \ > "Apply -Xlog:class+load+cause* to classes whose fully " \ > "qualified name contains this string ("*" matches " \ > "any class).") \ > > > Example usage: > > java "-Xlog:class+load+cause*" -XX:LogClassLoadingCauseFor=java.util.concurrent.ConcurrentHashMap$V --version > [0.075s][info][class,load,cause] Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView: > [0.075s][info][class,load,cause] at java.util.concurrent.ConcurrentHashMap.values(java.base/ConcurrentHashMap.java:1263) > [0.075s][info][class,load,cause] at jdk.internal.loader.NativeLibraries.find(java.base/NativeLibraries.java:102) > [0.075s][info][class,load,cause] at java.lang.ClassLoader.findNative(java.base/ClassLoader.java:2457) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.init(java.base/Native Method) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.(java.base/UnixNativeDispatcher.java:817) > [0.075s][info][class,load,cause] at 
sun.nio.fs.UnixFileSystem.(java.base/UnixFileSystem.java:96) > [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystem.(java.base/BsdFileSystem.java:50) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystem.(java.base/MacOSXFileSystem.java:52) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:44) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:37) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystemProvider.(java.base/UnixFileSystemProvider.java:78) > [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystemProvider.(java.b... Doug Simon has updated the pull request incrementally with one additional commit since the last revision: refactor body of print_class_load_logging in print_class_load_helper ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14553/files - new: https://git.openjdk.org/jdk/pull/14553/files/d8418684..4b910b16 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14553&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14553&range=07-08 Stats: 100 lines in 2 files changed: 34 ins; 22 del; 44 mod Patch: https://git.openjdk.org/jdk/pull/14553.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14553/head:pull/14553 PR: https://git.openjdk.org/jdk/pull/14553 From dnsimon at openjdk.org Fri Jul 7 18:01:15 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 7 Jul 2023 18:01:15 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 16:59:05 GMT, Ioi Lam wrote: > We should organize the logging so that they are grouped logically. There's nothing ungrouped about the original order; the order of `class+load` and `class+load+cause` were swapped but they were still grouped together. 
> This happens before the class is actually loaded, so those logs cannot appear after the main class+load tag. Maybe I'm interpreting the output incorrectly, but that's not what I see: > java "-Xlog:class+load=debug,class+load+cause=info,class+load+placeholders=debug" -XX:LogClassLoadingCauseFor=java.lang.StringCoding --version ... [1.917s][debug][class,load,placeholders] entry java/lang/StringCoding : find_and_add LOAD_INSTANCE [1.917s][debug][class,load,placeholders] loadInstanceThreadQ threads:0x000000013080a000, [1.917s][debug][class,load,placeholders] superThreadQ threads: [1.917s][debug][class,load,placeholders] defineThreadQ threads: [1.917s][info ][class,load ] java.lang.StringCoding source: /Users/dnsimon/dev/jdk-jdk/open/build/macosx-aarch64/jdk/modules/java.base [1.917s][debug][class,load ] klass: 0x000000030007acf8 super: 0x0000000300041170 loader: [loader data: 0x00006000034ccaa0 of 'bootstrap'] bytes: 1361 checksum: af8c723a [1.917s][info ][class,load,cause ] Java stack when loading java.lang.StringCoding: [1.917s][info ][class,load,cause ] at java.lang.String.encodeUTF8(java.base/String.java:1302) [1.917s][info ][class,load,cause ] at java.lang.String.encode(java.base/String.java:867) [1.917s][info ][class,load,cause ] at java.lang.String.getBytes(java.base/String.java:1818) [1.917s][info ][class,load,cause ] at sun.nio.fs.Util.toBytes(java.base/Util.java:55) [1.917s][info ][class,load,cause ] at sun.nio.fs.UnixFileSystem.(java.base/UnixFileSystem.java:82) [1.917s][info ][class,load,cause ] at sun.nio.fs.BsdFileSystem.(java.base/BsdFileSystem.java:50) [1.917s][info ][class,load,cause ] at sun.nio.fs.MacOSXFileSystem.(java.base/MacOSXFileSystem.java:52) [1.917s][info ][class,load,cause ] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:44) [1.917s][info ][class,load,cause ] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:37) [1.917s][info ][class,load,cause ] at 
sun.nio.fs.UnixFileSystemProvider.(java.base/UnixFileSystemProvider.java:78) [1.917s][info ][class,load,cause ] at sun.nio.fs.BsdFileSystemProvider.(java.base/BsdFileSystemProvider.java:38) [1.917s][info ][class,load,cause ] at sun.nio.fs.MacOSXFileSystemProvider.(java.base/MacOSXFileSystemProvider.java:39) [1.917s][info ][class,load,cause ] at sun.nio.fs.DefaultFileSystemProvider.(java.base/DefaultFileSystemProvider.java:35) [1.917s][info ][class,load,cause ] at java.nio.file.FileSystems.getDefault(java.base/FileSystems.java:186) [1.917s][info ][class,load,cause ] at java.nio.file.Path.of(java.base/Path.java:148) [1.917s][info ][class,load,cause ] at jdk.internal.module.SystemModuleFinders.ofSystem(java.base/SystemModuleFinders.java:188) [1.917s][info ][class,load,cause ] at jdk.internal.module.ModuleBootstrap.boot2(java.base/ModuleBootstrap.java:244) [1.917s][info ][class,load,cause ] at jdk.internal.module.ModuleBootstrap.boot(java.base/ModuleBootstrap.java:174) [1.917s][info ][class,load,cause ] at java.lang.System.initPhase2(java.base/System.java:2227) [1.917s][debug][class,load,placeholders] entry java/lang/StringCoding : find_and_add DEFINE_CLASS [1.917s][debug][class,load,placeholders] loadInstanceThreadQ threads:0x000000013080a000, [1.917s][debug][class,load,placeholders] superThreadQ threads: [1.917s][debug][class,load,placeholders] defineThreadQ threads:0x000000013080a000, [1.917s][debug][class,load,placeholders] entry java/lang/StringCoding : find_and_remove DEFINE_CLASS , InstanceKlass 'java/lang/StringCoding' [1.917s][debug][class,load,placeholders] loadInstanceThreadQ threads:0x000000013080a000, [1.917s][debug][class,load,placeholders] superThreadQ threads: [1.917s][debug][class,load,placeholders] defineThreadQ threads:0x000000013080a000, [1.917s][debug][class,load,placeholders] entry java/lang/StringCoding : find_and_remove LOAD_INSTANCE , InstanceKlass 'java/lang/StringCoding' ... 
I've pushed another commit with your suggested change: 4b910b16dbd6f72a3d74fc0c72703aec5e7d9671 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1256216296 From jvernee at openjdk.org Fri Jul 7 18:02:55 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Fri, 7 Jul 2023 18:02:55 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 14:54:54 GMT, sid8606 wrote: >> src/hotspot/cpu/s390/upcallLinker_s390.cpp line 141: >> >>> 139: >>> 140: // The Java call uses the JIT ABI, but we also call C. >>> 141: int out_arg_area = MAX2(frame::z_jit_out_preserve_size + arg_shuffle.out_arg_bytes(), (int)frame::z_abi_160_size); >> >> What do you mean here with "but we also call C"? Upcall stubs are always calling into Java, though the source ABI is unknown. > > We do native calls in this stub, so make sure allocated stack frame size is big enough. Ah, right thanks. That's good. >> test/jdk/java/foreign/TestClassLoaderFindNative.java line 63: >> >>> 61: public void testVariableSymbolLookup() { >>> 62: MemorySegment segment = SymbolLookup.loaderLookup().find("c").get().reinterpret(ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN ? 1 : 4); >>> 63: assertEquals(segment.get(JAVA_BYTE, ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN ? 0 : 3), 42); >> >> Could you explain why this is needed? It looks like the lookup is returning the wrong address? > > Since s390x runs in Big Endian mode. We get LSB on higher address of integer size. Ah, wait, now I see. The native side uses `int` as a type, but we try to load it as a `JAVA_BYTE`. I think this is a bug in the test. The Java side should use `JAVA_INT` instead, and the size passed to `reinterpret` should be `4` (which matches the native type). What happens if you make that change instead? Does the test pass then? 
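The byte-offset issue discussed above can be illustrated outside the test. The following is only a sketch under stated assumptions: it uses `java.nio.ByteBuffer` instead of the Foreign Function & Memory API, and the class name `EndianDemo` and its helper methods are hypothetical, not part of the PR. It shows that a 4-byte `int` holding 42 has its least significant byte at offset 0 in little-endian order and at offset 3 in big-endian order, while reading the full 4 bytes in the matching order yields 42 either way.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class EndianDemo {
    // Encode a 4-byte int in the given byte order and return the raw bytes.
    static byte[] intBytes(int value, ByteOrder order) {
        return ByteBuffer.allocate(4).order(order).putInt(value).array();
    }

    // Offset of the least significant byte of a 4-byte int for the given order.
    static int lsbOffset(ByteOrder order) {
        return order == ByteOrder.LITTLE_ENDIAN ? 0 : 3;
    }

    // Read the whole 4-byte value back using the same byte order.
    static int readInt(byte[] bytes, ByteOrder order) {
        return ByteBuffer.wrap(bytes).order(order).getInt();
    }

    public static void main(String[] args) {
        for (ByteOrder order : new ByteOrder[] {ByteOrder.LITTLE_ENDIAN, ByteOrder.BIG_ENDIAN}) {
            byte[] raw = intBytes(42, order);
            // A single-byte read only sees 42 at the order-dependent offset.
            System.out.println(order + ": byte at offset " + lsbOffset(order)
                    + " = " + raw[lsbOffset(order)]);
            // A full-width read needs no offset adjustment.
            System.out.println(order + ": full int = " + readInt(raw, order));
        }
    }
}
```

Reading the value at its full native width, as suggested in the review, removes the need for any endianness-dependent offset in the test.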
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1256224573 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1256219721 From lmesnik at openjdk.org Fri Jul 7 18:13:02 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 7 Jul 2023 18:13:02 GMT Subject: RFR: JDK-8305962: update jcstress to 0.16 [v2] In-Reply-To: References: Message-ID: > The fix changes jcstress version and update some parameters used by the jtreg wrapper. Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14742/files - new: https://git.openjdk.org/jdk/pull/14742/files/d7ef2058..c21f4e65 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14742&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14742&range=00-01 Stats: 5 lines in 1 file changed: 3 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14742.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14742/head:pull/14742 PR: https://git.openjdk.org/jdk/pull/14742 From lmesnik at openjdk.org Fri Jul 7 18:13:04 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 7 Jul 2023 18:13:04 GMT Subject: RFR: JDK-8305962: update jcstress to 0.16 In-Reply-To: References: Message-ID: On Fri, 30 Jun 2023 20:09:50 GMT, Leonid Mesnik wrote: > The fix changes jcstress version and update some parameters used by the jtreg wrapper. Aleksey, thank you for the useful comments. Our infra is still required to use the jtreg wrapper to run jcstress. I verified that it still works. We are going to vary the stress level in testing (with the jtreg factor).
------------- PR Comment: https://git.openjdk.org/jdk/pull/14742#issuecomment-1625784269 From lmesnik at openjdk.org Fri Jul 7 18:13:07 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 7 Jul 2023 18:13:07 GMT Subject: RFR: JDK-8305962: update jcstress to 0.16 [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 10:15:26 GMT, Aleksey Shipilev wrote: >> Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: >> >> fixes > > test/hotspot/jtreg/applications/jcstress/JcstressRunner.java line 114: > >> 112: // The "default" preset might take days for some tests >> 113: // so use sanity testing by default. >> 114: String mode = "sanity"; > > `sanity` mode is incorrect for actual testing runs. Should be at least `quick`. fixed > test/hotspot/jtreg/applications/jcstress/JcstressRunner.java line 129: > >> 127: >> 128: extraFlags.add("-sc"); >> 129: extraFlags.add("false"); > > Is this to shorten the execution time? I'd recommend `-af GLOBAL` too then, since we are reducing the test matrix for it anyway. fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14742#discussion_r1256240981 PR Review Comment: https://git.openjdk.org/jdk/pull/14742#discussion_r1256241706 From shade at openjdk.org Fri Jul 7 18:14:56 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 7 Jul 2023 18:14:56 GMT Subject: RFR: JDK-8305962: update jcstress to 0.16 [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 18:13:02 GMT, Leonid Mesnik wrote: >> The fix changes jcstress version and update some parameters used by the jtreg wrapper. > > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > fixes Marked as reviewed by shade (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/14742#pullrequestreview-1519460567 From shade at openjdk.org Fri Jul 7 18:17:53 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 7 Jul 2023 18:17:53 GMT Subject: RFR: JDK-8305962: update jcstress to 0.16 [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 18:13:02 GMT, Leonid Mesnik wrote: >> The fix changes jcstress version and update some parameters used by the jtreg wrapper. > > Leonid Mesnik has updated the pull request incrementally with one additional commit since the last revision: > > fixes All right then! Talk to me for next jcstress update, you'll need adjustments to these options. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14742#issuecomment-1625798668 From duke at openjdk.org Fri Jul 7 18:28:56 2023 From: duke at openjdk.org (sid8606) Date: Fri, 7 Jul 2023 18:28:56 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 17:55:35 GMT, Jorn Vernee wrote: >> Since s390x runs in Big Endian mode. We get LSB on higher address of integer size. > > Ah, wait, now I see. The native side uses `int` as a type, but we try to load it as a `JAVA_BYTE`. I think this is a bug in the test. The Java side should use `JAVA_INT` instead, and the size passed to `reinterpret` should be `4` (which matches the native type). What happens if you make that change instead? Does the test pass then? Yes, it does pass the test with above changes. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1256271583 From lmesnik at openjdk.org Fri Jul 7 19:17:10 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Fri, 7 Jul 2023 19:17:10 GMT Subject: Integrated: JDK-8305962: update jcstress to 0.16 In-Reply-To: References: Message-ID: On Fri, 30 Jun 2023 20:09:50 GMT, Leonid Mesnik wrote: > The fix changes jcstress version and update some parameters used by the jtreg wrapper. This pull request has now been integrated. Changeset: 292ee630 Author: Leonid Mesnik URL: https://git.openjdk.org/jdk/commit/292ee630ae32c3b50363b10ffa6090e57ffef1e8 Stats: 64 lines in 5 files changed: 53 ins; 0 del; 11 mod 8305962: update jcstress to 0.16 Reviewed-by: shade ------------- PR: https://git.openjdk.org/jdk/pull/14742 From iklam at openjdk.org Fri Jul 7 19:29:58 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 7 Jul 2023 19:29:58 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v9] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 18:01:14 GMT, Doug Simon wrote: >> In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. 
>> >> This PR adds support for `-Xlog:class+load+cause` and `-Xlog:class+load+cause+native` that produce output according to a new `LogClassLoadingCauseFor` VM flag: >> >> >> product(ccstr, LogClassLoadingCauseFor, nullptr, \ >> "Apply -Xlog:class+load+cause* to classes whose fully " \ >> "qualified name contains this string ("*" matches " \ >> "any class).") \ >> >> >> Example usage: >> >> java "-Xlog:class+load+cause*" -XX:LogClassLoadingCauseFor=java.util.concurrent.ConcurrentHashMap$V --version >> [0.075s][info][class,load,cause] Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView: >> [0.075s][info][class,load,cause] at java.util.concurrent.ConcurrentHashMap.values(java.base/ConcurrentHashMap.java:1263) >> [0.075s][info][class,load,cause] at jdk.internal.loader.NativeLibraries.find(java.base/NativeLibraries.java:102) >> [0.075s][info][class,load,cause] at java.lang.ClassLoader.findNative(java.base/ClassLoader.java:2457) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.init(java.base/Native Method) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.(java.base/UnixNativeDispatcher.java:817) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystem.(java.base/UnixFileSystem.java:96) >> [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystem.(java.base/BsdFileSystem.java:50) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystem.(java.base/MacOSXFileSystem.java:52) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:44) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:37) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystemProvider.(java.base/UnixFileSystemProvider.java:78) >> [0.075s][info][class,load,cau... 
> > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > refactor body of print_class_load_logging in print_class_load_helper Marked as reviewed by iklam (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14553#pullrequestreview-1519651288 From iklam at openjdk.org Fri Jul 7 19:30:00 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 7 Jul 2023 19:30:00 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v7] In-Reply-To: References: Message-ID: <2aIK2jwi8HKRv2NHg8xF9x4b8apZz6AOauk8NGSsTWI=.305e38e4-70ee-4d5a-87b8-1c4b3411a522@github.com> On Fri, 7 Jul 2023 17:54:34 GMT, Doug Simon wrote: > > We should organize the logging so that they are grouped logically. > > There's nothing ungrouped about the original order; the order of `class+load` and `class+load+cause` were swapped but they were still grouped together. My point is if you want to see both the stack and the loader for a class, now you can see them right next to each other in the log file, without having to scroll through pages of call stacks. The new version looks good to me. > > This happens before the class is actually loaded, so those logs cannot appear after the main class+load tag. > > Maybe I'm interpreting the output incorrectly, but that's not what I see: > > ``` > > java "-Xlog:class+load=debug,class+load+cause=info,class+load+placeholders=debug" -XX:LogClassLoadingCauseFor=java.lang.StringCoding --version > ... 
> [1.917s][debug][class,load,placeholders] entry java/lang/StringCoding : find_and_add LOAD_INSTANCE > [1.917s][debug][class,load,placeholders] loadInstanceThreadQ threads:0x000000013080a000, > [1.917s][debug][class,load,placeholders] superThreadQ threads: > [1.917s][debug][class,load,placeholders] defineThreadQ threads: > [1.917s][info ][class,load ] java.lang.StringCoding source: /Users/dnsimon/dev/jdk-jdk/open/build/macosx-aarch64/jdk/modules/java.base > [1.917s][debug][class,load ] klass: 0x000000030007acf8 super: 0x0000000300041170 loader: [loader data: 0x00006000034ccaa0 of 'bootstrap'] bytes: 1361 checksum: af8c723a > [1.917s][info ][class,load,cause ] Java stack when loading java.lang.StringCoding: Before the class is actually parsed and loaded, its name is first entered into the place holder table. The logging reflects that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1256353304 From jvernee at openjdk.org Fri Jul 7 19:38:55 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Fri, 7 Jul 2023 19:38:55 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 18:25:58 GMT, sid8606 wrote: >> Ah, wait, now I see. The native side uses `int` as a type, but we try to load it as a `JAVA_BYTE`. I think this is a bug in the test. The Java side should use `JAVA_INT` instead, and the size passed to `reinterpret` should be `4` (which matches the native type). What happens if you make that change instead? Does the test pass then? > > Yes, it does pass the test with above changes. Ok, please use the above changes instead then. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1256361842 From coleenp at openjdk.org Fri Jul 7 19:59:02 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 7 Jul 2023 19:59:02 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v9] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 18:01:14 GMT, Doug Simon wrote: >> In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. >> >> This PR adds support for `-Xlog:class+load+cause` and `-Xlog:class+load+cause+native` that produce output according to a new `LogClassLoadingCauseFor` VM flag: >> >> >> product(ccstr, LogClassLoadingCauseFor, nullptr, \ >> "Apply -Xlog:class+load+cause* to classes whose fully " \ >> "qualified name contains this string ("*" matches " \ >> "any class).") \ >> >> >> Example usage: >> >> java "-Xlog:class+load+cause*" -XX:LogClassLoadingCauseFor=java.util.concurrent.ConcurrentHashMap$V --version >> [0.075s][info][class,load,cause] Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView: >> [0.075s][info][class,load,cause] at java.util.concurrent.ConcurrentHashMap.values(java.base/ConcurrentHashMap.java:1263) >> [0.075s][info][class,load,cause] at jdk.internal.loader.NativeLibraries.find(java.base/NativeLibraries.java:102) >> [0.075s][info][class,load,cause] at java.lang.ClassLoader.findNative(java.base/ClassLoader.java:2457) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.init(java.base/Native Method) >> [0.075s][info][class,load,cause] at 
sun.nio.fs.UnixNativeDispatcher.(java.base/UnixNativeDispatcher.java:817) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystem.(java.base/UnixFileSystem.java:96) >> [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystem.(java.base/BsdFileSystem.java:50) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystem.(java.base/MacOSXFileSystem.java:52) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:44) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:37) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystemProvider.(java.base/UnixFileSystemProvider.java:78) >> [0.075s][info][class,load,cau... > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > refactor body of print_class_load_logging in print_class_load_helper Thanks for implementing the suggestions to do this with logging. Looks good. ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14553#pullrequestreview-1519699004 From ccheung at openjdk.org Fri Jul 7 22:33:58 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Fri, 7 Jul 2023 22:33:58 GMT Subject: RFR: 8311604: Simplify NOCOOPS requested addresses for archived heap objects [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 17:12:09 GMT, Ioi Lam wrote: >> This PR attempts to clean up some of the cruds in the existing code: >> >> - Simplified the calculation of "requested address" when `UseCompressedOops` is disabled -- the archived heap objects are always written starting from 0x10000000 >> - Removed `HeapShared::to_requested_address()` so we don't have two kinds of "requested address" >> - Updated the comments about "source" vs "buffered" vs "requested" addresses in archiveHeapWriter.hpp >> - Removed `SerializeClosure::oop()` as the only oop we need to store into the archive header is `HeapShared::roots()`, which can be handled more easily with `FileMapHeader::_heap_roots_offset` >> - Removed some unnecessary dependencies on `G1CollectedHeap::heap()->reserved()` >> >> Also: >> - Moved SerializeClosure to its own header file to improve build time. >> - Fixed DeterministicDump.java, which wasn't archiving Java objects when `UseCompressedOops` was disabled. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @ashu-mehra review; updated source code comments Just couple of nits. src/hotspot/share/cds/filemap.cpp line 290: > 288: st->print_cr("- serialized_data_offset: " SIZE_FORMAT_X, _serialized_data_offset); > 289: st->print_cr("- heap_begin: " INTPTR_FORMAT, p2i(_heap_begin)); > 290: st->print_cr("- heap_end: " INTPTR_FORMAT, p2i(_heap_end)); Maybe print the _heap_roots_offset? 
src/hotspot/share/cds/serializeClosure.hpp line 26: > 24: > 25: #ifndef SHARED_CDS_SERIALIZECLOSURE_HPP > 26: #define SHARED_CDS_SERIALIZECLOSURE_HPP Since the directory name is `share`, I think the above should be `SHARE_` instead of `SHARED_`. src/hotspot/share/cds/serializeClosure.hpp line 68: > 66: }; > 67: > 68: #endif // SHARED_CDS_SERIALIZECLOSURE_HPP Should be `SHARE_` instead of `SHARED_`. ------------- Marked as reviewed by ccheung (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14792#pullrequestreview-1519956788 PR Review Comment: https://git.openjdk.org/jdk/pull/14792#discussion_r1256561142 PR Review Comment: https://git.openjdk.org/jdk/pull/14792#discussion_r1256526160 PR Review Comment: https://git.openjdk.org/jdk/pull/14792#discussion_r1256526970 From iklam at openjdk.org Fri Jul 7 23:18:55 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 7 Jul 2023 23:18:55 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Wed, 5 Jul 2023 21:34:12 GMT, Ashutosh Mehra wrote: > > I first ran java -Xshare:dump so all the subsequent java --version runs use the same heap size as dump time. As a result, my "before" runs had a heap relocation delta of zero, which should correspond to the best start-up time. > > Okay, thanks for clarifying. I thought `java --version` runs were using the default archive. I haven't done any optimizations yet, but I fixed a few problems in the slow-path code. 
https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc?expand=1

# Before: no relocation
$ perf stat -r 40 java --version > /dev/null
0.015872 +- 0.000238 seconds time elapsed ( +- 1.50% )

# Before: force relocation (quick)
$ perf stat -r 40 java -Xmx4g --version > /dev/null
0.016691 +- 0.000385 seconds time elapsed ( +- 2.31% )

# Before: force relocation ("quick relocation not possible")
$ perf stat -r 40 java -Xmx2g --version > /dev/null
0.017385 +- 0.000230 seconds time elapsed ( +- 1.32% )

# After
$ perf stat -r 40 java -XX:+NewArchiveHeapLoading --version > /dev/null
0.018780 +- 0.000225 seconds time elapsed ( +- 1.20% )

So the slow path is just about 3 ms slower than the fastest "before" case. Looking at the detailed timing breakdown (`os::thread_cpu_time()` = ns):

$ java -XX:+NewArchiveHeapLoading -Xlog:cds+gc --version
[0.006s][info][cds,gc] Num objs : 24184
[0.006s][info][cds,gc] Num bytes : 1074640
[0.006s][info][cds,gc] Per obj bytes : 44
[0.006s][info][cds,gc] Num references (incl nulls) : 87109
[0.006s][info][cds,gc] Num references relocated : 43225
[0.006s][info][cds,gc] Allocation Time : 1605084 <<<< A
[0.006s][info][cds,gc] Relocation Time : 1246894
[0.006s][info][cds,gc] Table(s) dispose Time : 1306

$ java -XX:+NewArchiveHeapLoading -XX:NewArchiveHeapNumAllocs=2 -Xlog:cds+gc --version
[0.006s][info][cds,gc] Allocation Time : 2203781 <<<< B

$ java -XX:+NewArchiveHeapLoading -XX:NewArchiveHeapNumAllocs=-1 -Xlog:cds+gc --version
[0.003s][info][cds,gc] Allocation Time : 282125 <<<< C

$ java -XX:+NewArchiveHeapLoading -XX:NewArchiveHeapNumAllocs=0 -Xlog:cds+gc --version
[0.004s][info][cds,gc] Allocation Time : 854249 <<<< D

The cost of allocating all the objects is about 0.6 ms (B - A). Determining the size of the objects costs about 0.3 ms (C). The cost of storing the relocation into the hash tables is about 0.6 ms (D - C).
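The three estimates above follow directly from the four "Allocation Time" readings. As a small worked check, the sketch below redoes the subtraction in nanoseconds: B - A is 598697 ns and D - C is 572124 ns, both of which round to roughly 0.6 ms, and C alone is 282125 ns, roughly 0.3 ms. The class name `AllocTimingBreakdown` and its method names are hypothetical, and the labels for what each run measures are taken only from the message itself, not from the VM source.

```java
public class AllocTimingBreakdown {
    // "Allocation Time" readings in nanoseconds, copied from the log output above.
    static final long A = 1_605_084L; // default run
    static final long B = 2_203_781L; // -XX:NewArchiveHeapNumAllocs=2
    static final long C = 282_125L;   // -XX:NewArchiveHeapNumAllocs=-1
    static final long D = 854_249L;   // -XX:NewArchiveHeapNumAllocs=0

    // Cost attributed to allocating all the objects (B - A).
    static long allocationNs() {
        return B - A;
    }

    // Cost attributed to determining object sizes (C).
    static long sizingNs() {
        return C;
    }

    // Cost attributed to storing relocations into the hash tables (D - C).
    static long tableStoreNs() {
        return D - C;
    }

    static double toMillis(long ns) {
        return ns / 1_000_000.0;
    }

    public static void main(String[] args) {
        System.out.println("allocation (B - A):  " + allocationNs() + " ns = " + toMillis(allocationNs()) + " ms");
        System.out.println("sizing (C):          " + sizingNs() + " ns = " + toMillis(sizingNs()) + " ms");
        System.out.println("table store (D - C): " + tableStoreNs() + " ns = " + toMillis(tableStoreNs()) + " ms");
    }
}
```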
I hope to implement a fast path for relocation that avoids using the hash tables at all. If we can get the total alloc + reloc time to be about 1.5ms, then it would be just as fast as before when relocation is enabled. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1626352390 From jiangli at openjdk.org Sat Jul 8 00:22:09 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Sat, 8 Jul 2023 00:22:09 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking Message-ID: Move StringTable to JavaClassFile namespace. ------------- Commit messages: - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking Changes: https://git.openjdk.org/jdk/pull/14808/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14808&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311661 Stats: 59 lines in 21 files changed: 14 ins; 0 del; 45 mod Patch: https://git.openjdk.org/jdk/pull/14808.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14808/head:pull/14808 PR: https://git.openjdk.org/jdk/pull/14808 From iklam at openjdk.org Sat Jul 8 00:25:23 2023 From: iklam at openjdk.org (Ioi Lam) Date: Sat, 8 Jul 2023 00:25:23 GMT Subject: RFR: 8311604: Simplify NOCOOPS requested addresses for archived heap objects [v3] In-Reply-To: References: Message-ID: > This PR attempts to clean up some of the cruds in the existing code: > > - Simplified the calculation of "requested address" when `UseCompressedOops` is disabled -- the archived heap objects are always written starting from 0x10000000 > - Removed `HeapShared::to_requested_address()` so we don't have two kinds of "requested address" > - Updated the comments about "source" vs "buffered" vs "requested" addresses in archiveHeapWriter.hpp > - Removed `SerializeClosure::oop()` as the only oop we need to store into the archive header is 
`HeapShared::roots()`, which can be handled more easily with `FileMapHeader::_heap_roots_offset` > - Removed some unnecessary dependencies on `G1CollectedHeap::heap()->reserved()` > > Also: > - Moved SerializeClosure to its own header file to improve build time. > - Fixed DeterministicDump.java, which wasn't archiving Java objects when `UseCompressedOops` was disabled. Ioi Lam has updated the pull request incrementally with two additional commits since the last revision: - @calvinccheung review: print heap_roots_offset - @calvinccheung review: fixed include guard in headers; fixed misaligned line escapes in cds_globals.hpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14792/files - new: https://git.openjdk.org/jdk/pull/14792/files/fb8a2165..7f37fd2b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14792&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14792&range=01-02 Stats: 34 lines in 9 files changed: 1 ins; 0 del; 33 mod Patch: https://git.openjdk.org/jdk/pull/14792.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14792/head:pull/14792 PR: https://git.openjdk.org/jdk/pull/14792 From iklam at openjdk.org Sat Jul 8 00:25:24 2023 From: iklam at openjdk.org (Ioi Lam) Date: Sat, 8 Jul 2023 00:25:24 GMT Subject: RFR: 8311604: Simplify NOCOOPS requested addresses for archived heap objects [v3] In-Reply-To: References: Message-ID: <-8qacZ1rJG-cRuJaclHyofQ6QvOVE77K54WkWhF0_iE=.093a60cd-0964-4210-8b2c-e9530d977725@github.com> On Fri, 7 Jul 2023 22:27:27 GMT, Calvin Cheung wrote: >> Ioi Lam has updated the pull request incrementally with two additional commits since the last revision: >> >> - @calvinccheung review: print heap_roots_offset >> - @calvinccheung review: fixed include guard in headers; fixed misaligned line escapes in cds_globals.hpp > > src/hotspot/share/cds/filemap.cpp line 290: > >> 288: st->print_cr("- serialized_data_offset: " SIZE_FORMAT_X, _serialized_data_offset); >> 289: st->print_cr("- heap_begin: " 
INTPTR_FORMAT, p2i(_heap_begin)); >> 290: st->print_cr("- heap_end: " INTPTR_FORMAT, p2i(_heap_end)); > > Maybe print the _heap_roots_offset? Hi Calvin, thanks for the review. I added the `_heap_roots_offset` into the debug output. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14792#discussion_r1256647933 From iklam at openjdk.org Sat Jul 8 00:25:24 2023 From: iklam at openjdk.org (Ioi Lam) Date: Sat, 8 Jul 2023 00:25:24 GMT Subject: RFR: 8311604: Simplify NOCOOPS requested addresses for archived heap objects [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 21:56:51 GMT, Calvin Cheung wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> @ashu-mehra review; updated source code comments > > src/hotspot/share/cds/serializeClosure.hpp line 68: > >> 66: }; >> 67: >> 68: #endif // SHARED_CDS_SERIALIZECLOSURE_HPP > > Should be `SHARE_` instead of `SHARED_`. I fixed the `SHARED_` include guards in this file, plus a few other headers in the `share/cds/`directory. I also took this opportunity to fix some indent issues in cds_globals.hpp. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14792#discussion_r1256647900 From duke at openjdk.org Sat Jul 8 04:31:56 2023 From: duke at openjdk.org (sid8606) Date: Sat, 8 Jul 2023 04:31:56 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 19:35:53 GMT, Jorn Vernee wrote: >> Yes, it does pass the test with above changes. > > Ok, please use the above changes instead then. 
Sure, Thank you ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1256959506 From stuefe at openjdk.org Sat Jul 8 06:03:11 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 8 Jul 2023 06:03:11 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v5] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Fri, 7 Jul 2023 13:36:22 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. 
That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 19 additional commits since the last revision: > > - rework test > - Test Unsafe->WB > - Renamings > - Fix include guard name > - Aleksey: changes 3 > - Aleksey: cosmetic changes 2 > - Aleksey: cosmetic changes 1 > - change to os::elapsedTime > - Add missing gtest > - Purge all mentionings of GC from patch > - ... and 9 more: https://git.openjdk.org/jdk/compare/451722a5...c919203b > > (just noticed the patch adds +666 lines, bad sign, I should add another line somewhere). 
> > It also deletes 2 lines so that makes it 664 wink We are good then :-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1626879059 From stuefe at openjdk.org Sat Jul 8 06:17:43 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 8 Jul 2023 06:17:43 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v6] In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). 
We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... 
Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: fix 32-bit ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14781/files - new: https://git.openjdk.org/jdk/pull/14781/files/c919203b..34233069 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=04-05 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781 PR: https://git.openjdk.org/jdk/pull/14781 From dnsimon at openjdk.org Sat Jul 8 08:00:11 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Sat, 8 Jul 2023 08:00:11 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v9] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 18:01:14 GMT, Doug Simon wrote: >> In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. 
>> >> This PR adds support for `-Xlog:class+load+cause` and `-Xlog:class+load+cause+native` that produce output according to a new `LogClassLoadingCauseFor` VM flag: >> >> >> product(ccstr, LogClassLoadingCauseFor, nullptr, \ >> "Apply -Xlog:class+load+cause* to classes whose fully " \ >> "qualified name contains this string ("*" matches " \ >> "any class).") \ >> >> >> Example usage: >> >> java "-Xlog:class+load+cause*" -XX:LogClassLoadingCauseFor=java.util.concurrent.ConcurrentHashMap$V --version >> [0.075s][info][class,load,cause] Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView: >> [0.075s][info][class,load,cause] at java.util.concurrent.ConcurrentHashMap.values(java.base/ConcurrentHashMap.java:1263) >> [0.075s][info][class,load,cause] at jdk.internal.loader.NativeLibraries.find(java.base/NativeLibraries.java:102) >> [0.075s][info][class,load,cause] at java.lang.ClassLoader.findNative(java.base/ClassLoader.java:2457) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.init(java.base/Native Method) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.(java.base/UnixNativeDispatcher.java:817) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystem.(java.base/UnixFileSystem.java:96) >> [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystem.(java.base/BsdFileSystem.java:50) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystem.(java.base/MacOSXFileSystem.java:52) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:44) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:37) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystemProvider.(java.base/UnixFileSystemProvider.java:78) >> [0.075s][info][class,load,cau... 
> > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > refactor body of print_class_load_logging in print_class_load_helper Thanks for all the reviews and useful input. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14553#issuecomment-1626901135 From dnsimon at openjdk.org Sat Jul 8 08:00:13 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Sat, 8 Jul 2023 08:00:13 GMT Subject: Integrated: 8193513: add support for printing a stack trace on class loading In-Reply-To: References: Message-ID: On Tue, 20 Jun 2023 09:19:42 GMT, Doug Simon wrote: > In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest. However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. 
> > This PR adds support for `-Xlog:class+load+cause` and `-Xlog:class+load+cause+native` that produce output according to a new `LogClassLoadingCauseFor` VM flag: > > > product(ccstr, LogClassLoadingCauseFor, nullptr, \ > "Apply -Xlog:class+load+cause* to classes whose fully " \ > "qualified name contains this string ("*" matches " \ > "any class).") \ > > > Example usage: > > java "-Xlog:class+load+cause*" -XX:LogClassLoadingCauseFor=java.util.concurrent.ConcurrentHashMap$V --version > [0.075s][info][class,load,cause] Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView: > [0.075s][info][class,load,cause] at java.util.concurrent.ConcurrentHashMap.values(java.base/ConcurrentHashMap.java:1263) > [0.075s][info][class,load,cause] at jdk.internal.loader.NativeLibraries.find(java.base/NativeLibraries.java:102) > [0.075s][info][class,load,cause] at java.lang.ClassLoader.findNative(java.base/ClassLoader.java:2457) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.init(java.base/Native Method) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.(java.base/UnixNativeDispatcher.java:817) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystem.(java.base/UnixFileSystem.java:96) > [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystem.(java.base/BsdFileSystem.java:50) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystem.(java.base/MacOSXFileSystem.java:52) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:44) > [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:37) > [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystemProvider.(java.base/UnixFileSystemProvider.java:78) > [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystemProvider.(java.b... This pull request has now been integrated. 
Changeset: 4a1fcb60 Author: Doug Simon URL: https://git.openjdk.org/jdk/commit/4a1fcb6063fd5fad9ff9763359e7c79401e4fa92 Stats: 110 lines in 6 files changed: 105 ins; 0 del; 5 mod 8193513: add support for printing a stack trace on class loading Reviewed-by: dholmes, iklam, coleenp ------------- PR: https://git.openjdk.org/jdk/pull/14553 From stuefe at openjdk.org Sat Jul 8 08:01:32 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 8 Jul 2023 08:01:32 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v7] In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. 
With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... 
Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: - Add test with 1ms trim interval - No need for atomics ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14781/files - new: https://git.openjdk.org/jdk/pull/14781/files/34233069..aa4dbc0b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=05-06 Stats: 47 lines in 2 files changed: 23 ins; 6 del; 18 mod Patch: https://git.openjdk.org/jdk/pull/14781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781 PR: https://git.openjdk.org/jdk/pull/14781 From stuefe at openjdk.org Sat Jul 8 09:41:59 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 8 Jul 2023 09:41:59 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 23:16:00 GMT, Ioi Lam wrote: >>> I first ran java -Xshare:dump so all the subsequent java --version runs use the same heap size as dump time. As a result, my "before" runs had a heap relocation delta of zero, which should correspond to the best start-up time. >> >> Okay, thanks for clarifying. I thought `java --version` runs were using the default archive. > >> > I first ran java -Xshare:dump so all the subsequent java --version runs use the same heap size as dump time. As a result, my "before" runs had a heap relocation delta of zero, which should correspond to the best start-up time. >> >> Okay, thanks for clarifying. I thought `java --version` runs were using the default archive. > > I haven't done any optimizations yet, but I fixed a few problems in the slow-path code. 
> > https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc?expand=1 > > > # Before: no relocation > $ perf stat -r 40 java --version > /dev/null > 0.015872 +- 0.000238 seconds time elapsed ( +- 1.50% ) > > # Before: force relocation (quick) > $ perf stat -r 40 java -Xmx4g --version > /dev/null > 0.016691 +- 0.000385 seconds time elapsed ( +- 2.31% ) > > # Before: force relocation ("quick relocation not possible") > $ perf stat -r 40 java -Xmx2g --version > /dev/null > 0.017385 +- 0.000230 seconds time elapsed ( +- 1.32% ) > > # After > $ perf stat -r 40 java -XX:+NewArchiveHeapLoading --version > /dev/null > 0.018780 +- 0.000225 seconds time elapsed ( +- 1.20% ) > > > So the slow path is just about 3ms slower than the fastest "before" case. > > Looking at the detailed timing break down (`os::thread_cpu_time()` = ns): > > > $ java -XX:+NewArchiveHeapLoading -Xlog:cds+gc --version > [0.006s][info][cds,gc] Num objs : 24184 > [0.006s][info][cds,gc] Num bytes : 1074640 > [0.006s][info][cds,gc] Per obj bytes : 44 > [0.006s][info][cds,gc] Num references (incl nulls) : 87109 > [0.006s][info][cds,gc] Num references relocated : 43225 > [0.006s][info][cds,gc] Allocation Time : 1605084 <<<< A > [0.006s][info][cds,gc] Relocation Time : 1246894 > [0.006s][info][cds,gc] Table(s) dispose Time : 1306 > > $ java -XX:+NewArchiveHeapLoading -XX:NewArchiveHeapNumAllocs=2 -Xlog:cds+gc --version > [0.006s][info][cds,gc] Allocation Time : 2203781 <<<< B > > $ java -XX:+NewArchiveHeapLoading -XX:NewArchiveHeapNumAllocs=-1 -Xlog:cds+gc --version > [0.003s][info][cds,gc] Allocation Time : 282125 <<<< C > > $ java -XX:+NewArchiveHeapLoading -XX:NewArchiveHeapNumAllocs=0 -Xlog:cds+gc --version > [0.004s][info][cds,gc] Allocation Time : 854249 <<<< D > > > The cost of allocating a... > > @iklam in your performance tests for "java -version" what is the "heap data relocation delta"? Is it non-zero? 
If so, can you also add the numbers for the runs with -Xmx128m which would correspond to the best case where no relocation is done just to add another data point. > > I first ran `java -Xshare:dump` so all the subsequent `java --version` runs use the same heap size as dump time. As a result, my "before" runs had a heap relocation delta of zero, which should correspond to the best start-up time. > I must be missing something basic, but I don't understand Ashu's question nor the answer. You now allocate space for each object individually, then memcopy it to the allotted space, right? How could there *not* be relocation involved, for either absolute or narrow oop? The chance that the object is allocated at exactly the same address or at the same offset to heap start as in the dump JVM must be vanishingly low, right? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1626985939 From duke at openjdk.org Sat Jul 8 10:48:15 2023 From: duke at openjdk.org (sid8606) Date: Sat, 8 Jul 2023 10:48:15 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: > Implementation of "Foreign Function & Memory API" for s390x (Big Endian). 
sid8606 has updated the pull request incrementally with one additional commit since the last revision: Address suggestions from Jorn Vernee ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14801/files - new: https://git.openjdk.org/jdk/pull/14801/files/e719c164..f916864d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=01-02 Stats: 137 lines in 11 files changed: 1 ins; 77 del; 59 mod Patch: https://git.openjdk.org/jdk/pull/14801.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14801/head:pull/14801 PR: https://git.openjdk.org/jdk/pull/14801 From duke at openjdk.org Sat Jul 8 10:59:58 2023 From: duke at openjdk.org (sid8606) Date: Sat, 8 Jul 2023 10:59:58 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: <72UoBC1ZL2QmnHBoEj1SoH2xn-Aj2UZj29HKiLQs-IY=.84b05a3a-aaf2-4d25-88d6-af6c5a3d33ad@github.com> On Fri, 7 Jul 2023 10:33:10 GMT, Jorn Vernee wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Address Amit's review comments > > test/jdk/java/foreign/TestIllegalLink.java line 57: > >> 55: >> 56: private static final boolean IS_SYSV = CABI.current() == CABI.SYS_V; >> 57: private static final boolean byteorder = ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN; > > Please rename this field to something more descriptive, e.g. `IS_LE` (and capitalize). Thanks, Made the changes. > test/jdk/java/foreign/TestLayouts.java line 46: > >> 44: public class TestLayouts { >> 45: >> 46: boolean byteorder = ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN; > > Same here Fixed > test/jdk/java/foreign/callarranger/platform/PlatformLayouts.java line 312: > >> 310: * This class defines layout constants modelling standard primitive types supported by the S390 ABI. 
>> 311: */ >> 312: public static final class S390 { > > These are only needed if you plan on adding a CallArranger test as well. Removed For now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1257226257 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1257226355 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1257226562 From duke at openjdk.org Sat Jul 8 11:06:00 2023 From: duke at openjdk.org (sid8606) Date: Sat, 8 Jul 2023 11:06:00 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 11:56:28 GMT, Jorn Vernee wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Address Amit's review comments > > src/hotspot/cpu/s390/downcallLinker_s390.cpp line 162: > >> 160: allocated_frame_size += arg_shuffle.out_arg_bytes(); >> 161: >> 162: bool should_save_return_value = !_needs_return_buffer && _needs_transition;; > > Since return buffer is not implemented here, I suggest adding an assert that checks that `_needs_return_buffer` is always false. Added assert. > src/hotspot/cpu/s390/foreignGlobals_s390.cpp line 115: > >> 113: switch (to_reg.type()) { >> 114: case StorageType::INTEGER: >> 115: if (to_reg.segment_mask() == REG64_MASK && from_reg.segment_mask() == REG32_MASK ) { > > Since this deals with 32 bit regs as well, might want to rename the function to just `move_reg` (i.e drop the `64`) Renamed. > src/java.base/share/classes/jdk/internal/foreign/abi/AbstractLinker.java line 78: > >> 76: @CallerSensitive >> 77: public final MethodHandle downcallHandle(MemorySegment symbol, FunctionDescriptor function, Option... options) { >> 78: Reflection.ensureNativeAccess(Reflection.getCallerClass(), Linker.class, "downcallHandle"); > > Looks spurious? 
Please undo Fixed > src/java.base/share/classes/jdk/internal/foreign/abi/s390/S390Architecture.java line 115: > >> 113: >> 114: private static VMStorage floatRegister(int index) { >> 115: return new VMStorage(StorageType.FLOAT, REG64_MASK, index, "v" + index); > > Maybe this should be `"f"` instead of `"v"`? (given the names of the variables above) > Suggestion: > > return new VMStorage(StorageType.FLOAT, REG64_MASK, index, "f" + index); Fixed > src/java.base/share/classes/jdk/internal/foreign/abi/s390/linux/LinuxS390CallArranger.java line 136: > >> 134: return returnLayout >> 135: .filter(GroupLayout.class::isInstance) >> 136: .filter(layout -> layout instanceof GroupLayout) > > These lines both do the same, so one can be removed. Fixed > src/java.base/share/classes/jdk/internal/foreign/abi/s390/linux/TypeClass.java line 114: > >> 112: return false; >> 113: } >> 114: } > > I believe this loop is not needed, since above it's determined that `scalarLayouts` has only 1 element. Removed, Thank you ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1257228786 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1257228670 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1257228415 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1257228265 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1257228164 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1257228044 From duke at openjdk.org Sat Jul 8 11:09:59 2023 From: duke at openjdk.org (sid8606) Date: Sat, 8 Jul 2023 11:09:59 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 12:02:33 GMT, Jorn Vernee wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Address Amit's review comments > > 
src/hotspot/cpu/s390/downcallLinker_s390.cpp line 207: > >> 205: __ z_lg(callerSP, _z_abi(callers_sp), Z_SP); // preset (used to access caller frame argument slots) >> 206: __ block_comment("{ argument shuffle"); >> 207: arg_shuffle.generate(_masm, as_VMStorage(callerSP), frame::z_jit_out_preserve_size, _abi._shadow_space_bytes, locs); > > I'm not sure exactly what `callerSP` is doing, but it seems to be Z_SP + bias? Why can't the `in_stk_bias` parameter be used for that? (and then use `tmp` for the shuffle reg). Here we saving the caller frame i.e java stack pointer to use in Argument shuffling, I made the changes to use frame pointer Z_R11 now. hence removed this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1257229839 From duke at openjdk.org Sat Jul 8 11:13:58 2023 From: duke at openjdk.org (sid8606) Date: Sat, 8 Jul 2023 11:13:58 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 12:06:39 GMT, Jorn Vernee wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Address Amit's review comments > > src/hotspot/cpu/s390/foreignGlobals_s390.cpp line 186: > >> 184: case StorageType::FRAME_DATA: { >> 185: switch (from_reg.stack_size()) { >> 186: case 8: __ mem2reg_opt(Z_R0_scratch, Address (callerSP, reg2offset(from_reg, in_stk_bias)), true); break; > > A potential issue here is that Z_R0_scratch could be used by the target ABI, that's why the shuffle register is passed as an argument on other platforms. (It also makes it clearer in the calling code that that register is used somehow). Now using Z_R11 frame pointer and passing shuffle register to use as temp. 
> src/hotspot/cpu/s390/upcallLinker_s390.cpp line 172: > >> 170: // |---------------------| = frame_bottom_offset = frame_size >> 171: // | (optional) | >> 172: // | ret_buf | > > There's no return buffer, so this can be removed. Fixed > src/hotspot/cpu/s390/upcallLinker_s390.cpp line 232: > >> 230: >> 231: // return value shuffle >> 232: if (!needs_return_buffer) { > > I suggest using an assert here instead. Added an assert, thanks ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1257230962 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1257231066 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1257231195 From jvernee at openjdk.org Sat Jul 8 14:24:56 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Sat, 8 Jul 2023 14:24:56 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Sat, 8 Jul 2023 10:48:15 GMT, sid8606 wrote: >> Implementation of "Foreign Function & Memory API" for s390x (Big Endian). > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Address suggestions from Jorn Vernee The changes look good, thanks. I'll run some testing in our CI. Not sure if you want to fix https://github.com/openjdk/jdk/pull/14801#discussion_r1255743888 now or later. ------------- PR Review: https://git.openjdk.org/jdk/pull/14801#pullrequestreview-1520712793 From iklam at openjdk.org Sat Jul 8 14:39:56 2023 From: iklam at openjdk.org (Ioi Lam) Date: Sat, 8 Jul 2023 14:39:56 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Mon, 19 Jun 2023 20:06:23 GMT, Ashutosh Mehra wrote: >> This PR adds GC APIs to be implemented by all collectors for proper handling of archive heap space. 
Currently only G1 is updated to use these APIs which just involves renaming the existing G1 APIs. >> In addition to that filemap.cpp is updated to replace calls to `G1CollectedHeap::heap()` with `Universe::heap()` to avoid G1 specific code as much as possible. >> >> At many places in filemap.cpp heap range is requested from GC. All collectors except ZGC have contiguous heap and set `CollectedHeap::_reserved` to the heap range, so it can be easily exposed to the CDS code. This is done in this patch through `CollectedHeap::reserved` API. But for ZGC the heap can be discontiguous which makes it tricky to expose the heap range. >> Another point to note is that most of the usage for heap range is for logging purpose, but there is one place where it is used for setting the `mapping_offset` in `FileMapInfo::write_region()` based on the heap start. So purely based on the functional requirement, we only need the heap start address, not the range. >> >> To keep things simple and considering ZGC does not currently support archive heap, i refrained from tackling the issue of discontiguous heap range in this PR. > > Ashutosh Mehra has updated the pull request incrementally with three additional commits since the last revision: > > - Remove unnecessary assert and condition for UseG1GC > > Signed-off-by: Ashutosh Mehra > - Rename CollectedHeap::reserved() to CollectedHeap::reserved_range() > > Signed-off-by: Ashutosh Mehra > - Rename alloc_archive_space to allocate_archive_space > > Signed-off-by: Ashutosh Mehra > > I first ran `java -Xshare:dump` so all the subsequent `java --version` runs use the same heap size as dump time. As a result, my "before" runs had a heap relocation delta of zero, which should correspond to the best start-up time. > > I must be missing something basic, but I don't understand Ashu's question nor the answer. > > You now allocate space for each object individually, then memcopy it to the allotted space, right? 
How could there _not_ be relocation involved, for either absolute or narrow oop? The chance that the object is allocated at exactly the same address or at the same offset to heap start as in the dump JVM must be vanishingly low, right? In my command lines, `java --version` runs the existing code (mmap the archive heap). The new code (individual oop allocation) is used only with the `-XX:+NewArchiveHeapLoading` flag. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1627354368 From iklam at openjdk.org Sun Jul 9 04:25:26 2023 From: iklam at openjdk.org (Ioi Lam) Date: Sun, 9 Jul 2023 04:25:26 GMT Subject: RFR: 8311604: Simplify NOCOOPS requested addresses for archived heap objects [v4] In-Reply-To: References: Message-ID: <0fh4UzgTdWJvvUPzUGAs2_2ACB6I3bCjx0bfOnvBuqA=.e3cc7190-78bc-4857-abf8-62ef2b6294a6@github.com> > This PR attempts to clean up some of the cruds in the existing code: > > - Simplified the calculation of "requested address" when `UseCompressedOops` is disabled -- the archived heap objects are always written starting from 0x10000000 > - Removed `HeapShared::to_requested_address()` so we don't have two kinds of "requested address" > - Updated the comments about "source" vs "buffered" vs "requested" addresses in archiveHeapWriter.hpp > - Removed `SerializeClosure::oop()` as the only oop we need to store into the archive header is `HeapShared::roots()`, which can be handled more easily with `FileMapHeader::_heap_roots_offset` > - Removed some unnecessary dependencies on `G1CollectedHeap::heap()->reserved()` > > Also: > - Moved SerializeClosure to its own header file to improve build time. > - Fixed DeterministicDump.java, which wasn't archiving Java objects when `UseCompressedOops` was disabled. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains five additional commits since the last revision: - Merge branch 'master' into 8311604-simplify-nocoops-requested-addr-for-archived-heap - @calvinccheung review: print heap_roots_offset - @calvinccheung review: fixed include guard in headers; fixed misaligned line escapes in cds_globals.hpp - @ashu-mehra review; updated source code comments - 8311604: Simplify NOCOOPS requested addresses for archived heap objects ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14792/files - new: https://git.openjdk.org/jdk/pull/14792/files/7f37fd2b..d1b56a2a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14792&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14792&range=02-03 Stats: 2826 lines in 55 files changed: 1410 ins; 1080 del; 336 mod Patch: https://git.openjdk.org/jdk/pull/14792.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14792/head:pull/14792 PR: https://git.openjdk.org/jdk/pull/14792 From iklam at openjdk.org Sun Jul 9 15:21:15 2023 From: iklam at openjdk.org (Ioi Lam) Date: Sun, 9 Jul 2023 15:21:15 GMT Subject: RFR: 8311604: Simplify NOCOOPS requested addresses for archived heap objects In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 15:12:08 GMT, Ashutosh Mehra wrote: >> This PR attempts to clean up some of the cruds in the existing code: >> >> - Simplified the calculation of "requested address" when `UseCompressedOops` is disabled -- the archived heap objects are always written starting from 0x10000000 >> - Removed `HeapShared::to_requested_address()` so we don't have two kinds of "requested address" >> - Updated the comments about "source" vs "buffered" vs "requested" addresses in archiveHeapWriter.hpp >> - Removed `SerializeClosure::oop()` as the only oop we need to store into the archive header is `HeapShared::roots()`, which can be handled more easily with `FileMapHeader::_heap_roots_offset` >> - Removed some unnecessary dependencies on `G1CollectedHeap::heap()->reserved()` >> >> 
Also: >> - Moved SerializeClosure to its own header file to improve build time. >> - Fixed DeterministicDump.java, which wasn't archiving Java objects when `UseCompressedOops` was disabled. > > minor nitpicks, otherwise looks good! Thanks @ashu-mehra and @calvinccheung for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14792#issuecomment-1627744432 From iklam at openjdk.org Sun Jul 9 15:21:17 2023 From: iklam at openjdk.org (Ioi Lam) Date: Sun, 9 Jul 2023 15:21:17 GMT Subject: Integrated: 8311604: Simplify NOCOOPS requested addresses for archived heap objects In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 23:39:34 GMT, Ioi Lam wrote: > This PR attempts to clean up some of the cruds in the existing code: > > - Simplified the calculation of "requested address" when `UseCompressedOops` is disabled -- the archived heap objects are always written starting from 0x10000000 > - Removed `HeapShared::to_requested_address()` so we don't have two kinds of "requested address" > - Updated the comments about "source" vs "buffered" vs "requested" addresses in archiveHeapWriter.hpp > - Removed `SerializeClosure::oop()` as the only oop we need to store into the archive header is `HeapShared::roots()`, which can be handled more easily with `FileMapHeader::_heap_roots_offset` > - Removed some unnecessary dependencies on `G1CollectedHeap::heap()->reserved()` > > Also: > - Moved SerializeClosure to its own header file to improve build time. > - Fixed DeterministicDump.java, which wasn't archiving Java objects when `UseCompressedOops` was disabled. This pull request has now been integrated. 
Changeset: 581f90e2 Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/581f90e242b8a943215a223189d171b7ede37785 Stats: 403 lines in 30 files changed: 159 ins; 172 del; 72 mod 8311604: Simplify NOCOOPS requested addresses for archived heap objects Reviewed-by: ccheung ------------- PR: https://git.openjdk.org/jdk/pull/14792 From haosun at openjdk.org Mon Jul 10 01:46:07 2023 From: haosun at openjdk.org (Hao Sun) Date: Mon, 10 Jul 2023 01:46:07 GMT Subject: RFR: 8310355: Move the stub test from initialize_final_stubs() to test/hotspot/gtest In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 04:45:31 GMT, Hao Sun wrote: > Three groups of runtime routines, i.e. arraycopy, copy and fill, are tested inside function `initialize_final_stubs()`. The test runs every time the debug VM is started. > > I think it's a usual convention that it's better not to run functional tests on startup. Hence, this patch proposes to move the stub test under `test/hotspot/gtest`. > > It's one copy-paste patch, except the following two minor changes: > > 1) Remove `ASSERT` condition check, and the gtest case will be run for release build as well. > > 2) Use the gtest helper `ASSERT_TRUE()` to replace the `assert()`. > > Note that the downside is that we won't catch stub implementation errors immediately on startup. > > Test: > > 1) Cross compilations on arm32/s390/ppc/riscv passed. > > 2) tier1 test passed on Linux/AArch64, Linux/x86_64 and macOS/Apple silicon. Note that hotspot/gtest is run in tier1. > > 3) VM builds with several different options (e.g., zero build, release build, fastdebug build, client variant, no C1/c2) passed on Linux/AArch64 and Linux/x86_64. > > 4) I manually injected errors in arraycopy/copy/fill stubs and verified `make test TEST="gtest:StubRoutines"` fails or not. But this check only worked for array fill stub. 
For the remaining two stubs, VM build (`make images`) failed firstly before running the gtest, because the built VM with these "erroneous" stubs were executed to do something such as loading Java core library code. Ping? Can anyone help take a look at this patch? Thanks in advance. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14765#issuecomment-1627943335 From iklam at openjdk.org Mon Jul 10 05:39:08 2023 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 10 Jul 2023 05:39:08 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 23:16:00 GMT, Ioi Lam wrote: > I hope to implement a fast path for relocation that avoids using the hash tables at all. If we can get the total alloc + reloc time to be about 1.5ms, then it would be just as fast as before when relocation is enabled. I've implemented a fast relocation lookup. It currently uses a table of the same size as the archived heap objects, but I can reduce that to 1/2 the size. See https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc?expand=1 This is implemented by about 330 lines of code in archiveHeapLoader.cpp. The code is templatized to try out different approaches (like `-XX:+NahlRawAlloc` and `-XX:+NahlUseAccessAPI`), so it can be further simplified. There's only one thing that's not yet implemented -- the equivalent of `ArchiveHeapLoader::patch_native_pointers()`. I'll do that next. $ java -XX:+NewArchiveHeapLoading -Xmx128m -Xlog:cds+gc --version [0.004s][info][cds,gc] Delayed allocation records alloced: 640 [0.004s][info][cds,gc] Load Time: 1388458 The whole allocation + reloc takes about 1.4ms. It's about 1.25ms slower in the worst case (when the "old" code doesn't have to relocate -- see the `(**)` in the table below). It's 0.8ms slower when the "old" code has to relocate. 
All times are in ms, for "java --version"

====================================
Dump: java -Xshare:dump -Xmx128m

G1        old      new      diff
  128m   14.476   15.754   +1.277 (**)
 8192m   15.359   16.085   +0.726

Serial    old      new
  128m   13.442   14.241   +0.798
 8192m   13.740   14.532   +0.791

====================================
Dump: java -Xshare:dump -Xmx8192m

G1        old      new      diff
  128m   14.975   15.787   +0.812
 2048m   16.239   17.035   +0.796
 8192m   14.821   16.042   +1.221 (**)

Serial    old      new
  128m   13.444   14.167   +0.723
 8192m   13.717   14.502   +0.785

While the code is slower than before, it's a lot simpler. It works on all collectors. I tested on ZGC, but I think Shenandoah should work as well.

The cost is about 1.3 ms per MB of archived heap objects. This may be acceptable as it's a small fraction of JVM bootstrap. We have about 1MB of archived objects now, and we don't expect this size to drastically increase in the near future.

The extra memory cost is:

- a temporary in-memory copy of the archived heap objects
- a temporary table of 1/2 the size of the archived heap objects

The former can be reduced by reading the archived objects in a stream. The latter can be reduced by a more elaborate relocation algorithm that assumes that most of the allocated objects are in a contiguous block. Such changes may cause further slow down.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1628247922

From fyang at openjdk.org  Mon Jul 10 06:28:02 2023
From: fyang at openjdk.org (Fei Yang)
Date: Mon, 10 Jul 2023 06:28:02 GMT
Subject: [jdk21] RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently.
[v2] In-Reply-To: <0cOr4cGicJ3M49nFwLBTIt9w2aFpV1sUwl6xyf1Htfk=.6cb7355f-5760-477f-8420-4a9032563159@github.com> References: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> <0cOr4cGicJ3M49nFwLBTIt9w2aFpV1sUwl6xyf1Htfk=.6cb7355f-5760-477f-8420-4a9032563159@github.com> Message-ID: <5EQ9ia2tsksrpIWOSXjNlWWLGQ01wrdHQ3rHZ9KfbLU=.4a085c87-6670-48bf-9f74-521e01c88271@github.com> On Thu, 6 Jul 2023 20:16:14 GMT, Robbin Ehn wrote: >> Hi all, >> >> This pull request contains a backport of commit [faf1b822](https://github.com/openjdk/jdk/commit/faf1b822d03b726413d77a2b247dfbbf4db7d57e) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. >> >> The commit being backported was authored by Robbin Ehn on 2 Jul 2023 and was reviewed by Ludovic Henry, Thomas Stuefe and Fei Yang. >> >> Thanks! > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Added missing global defines in jdk 21 > > Signed-off-by: Robbin Ehn Marked as reviewed by fyang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk21/pull/90#pullrequestreview-1521305089 From dzhang at openjdk.org Mon Jul 10 07:13:00 2023 From: dzhang at openjdk.org (Dingli Zhang) Date: Mon, 10 Jul 2023 07:13:00 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v7] In-Reply-To: References: Message-ID: <58-yt8-aJSeLF3WAAd0TMMNrNTOz7NbS73zNXO2FMy0=.fb39c8fc-5762-485e-8bdd-5d508c56d52d@github.com> On Thu, 6 Jul 2023 18:57:16 GMT, Matias Saavedra Silva wrote: >> 8301996: Move field resolution information out of the cpCache > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Interpreter fix and cleanup > It looks like I misunderstood what `tbz` does! I believe you are correct in suggesting that the `andr` is not necessary. Hi, @matias9927, Thanks for update! 
We have already done the adaptation for RISC-V locally and we are currently testing. I will update the test results and give the corresponding patch later. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14129#issuecomment-1628369561 From shade at openjdk.org Mon Jul 10 11:12:23 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Jul 2023 11:12:23 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v7] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: <9Y8PuWRM5tHL9gPGjZ3K991gSFH4AJZqD2tQVZbCQHA=.0a711197-4c7b-4bc4-93b0-1cf53a706261@github.com> On Sat, 8 Jul 2023 08:01:32 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. 
With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: > > - Add test with 1ms trim interval > - No need for atomics This is a nice progress! Another read follow. Bikeshedding: I think we also need to decide how we call this thing (and related symbols): "native heap trim" or "trim native heap". AFAICT, from there, the log tag name follows, the options name follow, the VM symbol names follow. src/hotspot/share/logging/logTag.hpp line 199: > 197: LOG_TAG(tlab) \ > 198: LOG_TAG(tracking) \ > 199: LOG_TAG(trimnh) /* trim native heap */ \ `nh` is confusing. `trimnative`, `nativetrim`, or `nativeheaptrim`? 
I think there is a precedent for long multi-word tags with `valuebasedclasses` :)

src/hotspot/share/runtime/globals.hpp line 1990:

> 1988: "If TrimNativeHeap is enabled: interval, in ms, at which " \
> 1989: "the to trim the native heap.") \
> 1990: range(1, UINT_MAX) \

I have a suggestion that simplifies UX, I think. In other cases where we do these kinds of intervals, we just use `0` as the "off" value. See for example `AsyncDeflationInterval`, `GuaranteedSafepointInterval`. This would allow users to supply one option only.

We would then need to decide if we turn this thing on by default (probably with large interval), or we default to `0` for "off". I would prefer to go for `0`. I understand that would force users to decide on proper trim interval when enabling, but I think that's a feature, not a bug. We would not have to go into discussions if the 60 second default is good enough or not.

Something like this:

  product(uintx, TrimNativeHeapInterval, 0, EXPERIMENTAL, \
          "Attempt to trim the native heap every so many milliseconds, " \
          "if platform supports it. Lower values provide better footprint " \
          "under native allocation spikes, while higher values come with " \
          "less overhead. Use 0 to disable trimming." \

src/hotspot/share/runtime/trimNativeHeap.cpp line 54:

> 52: }
> 53:
> 54: unsigned inc_suspend_count() {

If `_suspend_count` is `uint16_t`, the methods that use it should also return `uint16_t`?

src/hotspot/share/runtime/trimNativeHeap.cpp line 79:

> 77: // in seconds
> 78: static double now() { return os::elapsedTime(); }
> 79: static double to_ms(double seconds) { return seconds * 1000.0; }

Would you like to just do it in `int64_t` representing milliseconds? The common way to get it is `nanos_to_millis(os::elapsed_counter())`. Turning this to integer would also obviate stuff like `MAX2(1.0, now - time)`.
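
A minimal sketch of the integer-millisecond bookkeeping suggested in the comment above; it is not the code in the PR. It assumes `std::chrono::steady_clock` as a stand-in for the VM's own clock, and `now_ms` and `remaining_wait_ms` are hypothetical names chosen only for illustration:

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>

// Monotonic "now" in whole milliseconds.
// Assumption: std::chrono::steady_clock replaces the VM-internal clock here.
std::int64_t now_ms() {
  using namespace std::chrono;
  return duration_cast<milliseconds>(
      steady_clock::now().time_since_epoch()).count();
}

// Hypothetical helper: milliseconds left until the next trim is due.
// Returns 0 when the deadline has already passed, so no floating-point
// clamping is involved.
std::int64_t remaining_wait_ms(std::int64_t now, std::int64_t next_trim_time) {
  return std::max<std::int64_t>(0, next_trim_time - now);
}
```

With timestamps kept as integer milliseconds, the remaining wait can be compared with a millisecond interval option directly, without converting between seconds and milliseconds.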
src/hotspot/share/runtime/trimNativeHeap.cpp line 88: > 86: > 87: void run() override { > 88: LogStartStop logStartStop; Suggestion: LogStartStopMark lssm; src/hotspot/share/runtime/trimNativeHeap.cpp line 92: > 90: for (;;) { > 91: double tnow = now(); > 92: const double interval_secs = (double)TrimNativeHeapInterval / 1000; This division can be outside the loop. src/hotspot/share/runtime/trimNativeHeap.cpp line 106: > 104: ml.wait(wait_ms); > 105: } else if (at_or_nearing_safepoint()) { > 106: ml.wait(safepoint_poll_ms); OK, so here is a little problem. Suppose I want to run trims very often, like every 10ms. This loop would stall for 250ms when safepoint is detected, which throws off this guarantee. Can we instead go and sleep for `TrimNativeHeapInterval`? AFAICs, this plays nicely with heuristic guidance (short intervals -> more interference), and it would best-effort stall for twice the interval when safepoint interjects. ------------- PR Review: https://git.openjdk.org/jdk/pull/14781#pullrequestreview-1521753659 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258038321 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258056866 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258061662 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258074902 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258079941 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258063727 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258086409 From eastigeevich at openjdk.org Mon Jul 10 11:30:28 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 10 Jul 2023 11:30:28 GMT Subject: RFR: 8307352: AARCH64: Improve itable_stub [v2] In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 05:37:35 GMT, Boris Ulasevich wrote: >> This is a change for AARCH similar to https://github.com/openjdk/jdk/pull/13460 >> >> The 
change replaces two separate iterations over the itable with a new algorithm consisting of two loops. First, we look for a match with resolved_klass, checking for a match with holder_klass along the way. Then we continue iterating (not starting over) the itable using the second loop, checking only for a match with holder_klass. >> >> InterfaceCalls openjdk benchmark performance results on A53, A72, Neoverse N1 and V1 micro-architectures: >> >> >> Cortex-A53 (Pi 3 Model B Rev 1.2) >> >> test1stInt2Types 37.5 37.358 0.38 >> test1stInt3Types 160.166 148.04 8.19 >> test1stInt5Types 158.131 147.955 6.88 >> test2ndInt2Types 52.634 53.291 -1.23 >> test2ndInt3Types 201.39 181.603 10.90 >> test2ndInt5Types 195.722 176.707 10.76 >> testIfaceCall 157.453 140.498 12.07 >> testIfaceExtCall 175.46 154.351 13.68 >> testMonomorphic 32.052 32.039 0.04 >> AVG: 6.85 >> >> Cortex-A72 (Pi 4 Model B Rev 1.2) >> >> test1stInt2Types 27.4796 27.4738 0.02 >> test1stInt3Types 66.0085 64.9374 1.65 >> test1stInt5Types 67.9812 66.2316 2.64 >> test2ndInt2Types 32.0581 32.062 -0.01 >> test2ndInt3Types 68.2715 65.6643 3.97 >> test2ndInt5Types 68.1012 65.8024 3.49 >> testIfaceCall 64.0684 64.1811 -0.18 >> testIfaceExtCall 91.6226 81.5867 12.30 >> testMonomorphic 26.7161 26.7142 0.01 >> AVG: 2.66 >> >> Neoverse N1 (m6g.metal) >> >> test1stInt2Types 2.9104 2.9086 0.06 >> test1stInt3Types 10.9642 10.2909 6.54 >> test1stInt5Types 10.9607 10.2856 6.56 >> test2ndInt2Types 3.3410 3.3478 -0.20 >> test2ndInt3Types 12.3291 11.3089 9.02 >> test2ndInt5Types 12.328 11.2704 9.38 >> testIfaceCall 11.0598 10.3657 6.70 >> testIfaceExtCall 13.0692 11.2826 15.84 >> testMonomorphic 2.2354 2.2341 0.06 >> AVG: 6.00 >> >> Neoverse V1 (c7g.2xlarge) >> >> test1stInt2Types 2.2317 2.2320 -0.01 >> test1stInt3Types 6.6884 6.1911 8.03 >> test1stInt5Types 6.7334 6.2193 8.27 >> test2ndInt2Types 2.4002 2.4013 -0.04 >> test2ndInt3Types 7.9603 7.0372 13.12 >> test2ndInt5Types 7.9532 7.0474 12.85 >> testIfaceCall 6.7028 6.3272 5.94 
>> testIfaceExtCall 8.3253 6.941...
>
> Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains one commit:
>
> 8307352: AARCH64: Improve itable_stub

src/hotspot/cpu/aarch64/vtableStubs_aarch64.cpp line 200:

> 198: temp_reg, temp_reg2, temp_reg3, itable_index, L_no_such_interface);
> 199:
> 200: const ptrdiff_t lookupSize = __ pc() - start_pc;

I think you can rename `lookupSize` into `codesize`.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13792#discussion_r1258108283

From mgronlun at openjdk.org  Mon Jul 10 11:35:42 2023
From: mgronlun at openjdk.org (Markus Grönlund)
Date: Mon, 10 Jul 2023 11:35:42 GMT
Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v3]
In-Reply-To:
References:
Message-ID:

> Greetings,
>
> please help review this fix for some problematic situations in JFR where data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes is in the JIRA issue.
>
> Testing: jdk_jfr, stress testing.
>
> Thanks
> Markus

Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision:

  adjustments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/14761/files
  - new: https://git.openjdk.org/jdk/pull/14761/files/aa498f7e..07428b28

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=01-02

  Stats: 17 lines in 4 files changed: 4 ins; 3 del; 10 mod
  Patch: https://git.openjdk.org/jdk/pull/14761.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14761/head:pull/14761

PR: https://git.openjdk.org/jdk/pull/14761

From eastigeevich at openjdk.org  Mon Jul 10 11:36:21 2023
From: eastigeevich at openjdk.org (Evgeny Astigeevich)
Date: Mon, 10 Jul 2023 11:36:21 GMT
Subject: RFR: 8307352: AARCH64: Improve itable_stub [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 20 Jun 2023 13:34:45 GMT, Boris Ulasevich wrote:

>>> Yes, there are similar calls to lookup_interface_method stubs from TemplateTable::invokeinterface and from VtableStubs::create_itable_stub. Unfortunately, the instructions in between the lookup_interface_method calls are different, making it difficult to create a common stub for them. Performance is not an issue for the interpreter.
>>
>> Really? if performance were not an issue for the interpreter we'd all use the C++ interpreter.
>
> What I am trying to say is that I have not found any benchmark or application where a performance impact of TemplateTable::invokeinterface is visible. I would be happy to rework it as well once we see a benchmark which clearly demonstrates the gain, and the gain is substantial to justify the code complexity.

@bulasevich
> What I am trying to say is that I have not found any benchmark

IMHO you can run the benchmark mentioned in the description with `-Xint`.
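
For readers following this thread, the two-loop itable lookup that the PR description outlines can be sketched roughly as below. This is an illustration under stated assumptions, not the generated AArch64 stub: `ItableEntry` and `lookup_itable_offset` are hypothetical names, the itable is modelled as a plain vector, and a return value of `-1` stands in for the jump to `L_no_such_interface`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified itable entry: an interface identity plus the
// offset of that interface's method block. Not HotSpot's real layout.
struct ItableEntry {
  const void* interface_klass;
  int offset;
};

// Returns the offset recorded for holder_klass, or -1 when either
// resolved_klass or holder_klass is missing from the itable.
int lookup_itable_offset(const std::vector<ItableEntry>& itable,
                         const void* resolved_klass,
                         const void* holder_klass) {
  int holder_offset = -1;
  std::size_t i = 0;

  // Loop 1: look for resolved_klass, noting holder_klass if we pass it.
  for (; i < itable.size(); ++i) {
    if (itable[i].interface_klass == holder_klass) {
      holder_offset = itable[i].offset;
    }
    if (itable[i].interface_klass == resolved_klass) {
      break;
    }
  }
  if (i == itable.size()) {
    return -1;  // receiver does not implement resolved_klass
  }
  if (holder_offset != -1) {
    return holder_offset;  // holder was already seen in loop 1
  }

  // Loop 2: continue from the current position (no restart), checking
  // only for holder_klass.
  for (++i; i < itable.size(); ++i) {
    if (itable[i].interface_klass == holder_klass) {
      return itable[i].offset;
    }
  }
  return -1;
}
```

The point of the design is that the table is walked at most once in total, instead of once per interface being searched for.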
> Unfortunately, the instructions in between the lookup_interface_method calls are different, making it difficult to create a common stub for them. If it needs a lot of work, we can create a JBS issue to implement the same approach for the interpreter. When it is being implemented some common parts might be identified and refactored. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13792#discussion_r1258115516 From stuefe at openjdk.org Mon Jul 10 12:31:21 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 10 Jul 2023 12:31:21 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v7] In-Reply-To: <9Y8PuWRM5tHL9gPGjZ3K991gSFH4AJZqD2tQVZbCQHA=.0a711197-4c7b-4bc4-93b0-1cf53a706261@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <9Y8PuWRM5tHL9gPGjZ3K991gSFH4AJZqD2tQVZbCQHA=.0a711197-4c7b-4bc4-93b0-1cf53a706261@github.com> Message-ID: On Mon, 10 Jul 2023 11:06:23 GMT, Aleksey Shipilev wrote: >> Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: >> >> - Add test with 1ms trim interval >> - No need for atomics > > src/hotspot/share/runtime/trimNativeHeap.cpp line 106: > >> 104: ml.wait(wait_ms); >> 105: } else if (at_or_nearing_safepoint()) { >> 106: ml.wait(safepoint_poll_ms); > > OK, so here is a little problem. Suppose I want to run trims very often, like every 10ms. This loop would stall for 250ms when safepoint is detected, which throws off this guarantee. Can we instead go and sleep for `TrimNativeHeapInterval`? AFAICs, this plays nicely with heuristic guidance (short intervals -> more interference), and it would best-effort stall for twice the interval when safepoint interjects. But then we have a problem for larger trim intervals. Loosing one or multiple trim attempts because a safepoint happened to happen hurts if the interval is e.g. 5 minutes. 
We could either wait for `MIN2(TrimNativeHeapInterval, safepoint_poll_ms)`. Or, at the cost of one Mutex grab per safepoint, I could do a `notify_all()` at the end of a safepoint. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258176657 From stuefe at openjdk.org Mon Jul 10 12:36:16 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 10 Jul 2023 12:36:16 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v7] In-Reply-To: <9Y8PuWRM5tHL9gPGjZ3K991gSFH4AJZqD2tQVZbCQHA=.0a711197-4c7b-4bc4-93b0-1cf53a706261@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <9Y8PuWRM5tHL9gPGjZ3K991gSFH4AJZqD2tQVZbCQHA=.0a711197-4c7b-4bc4-93b0-1cf53a706261@github.com> Message-ID: On Mon, 10 Jul 2023 10:37:18 GMT, Aleksey Shipilev wrote: >> Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: >> >> - Add test with 1ms trim interval >> - No need for atomics > > src/hotspot/share/runtime/globals.hpp line 1990: > >> 1988: "If TrimNativeHeap is enabled: interval, in ms, at which " \ >> 1989: "the to trim the native heap.") \ >> 1990: range(1, UINT_MAX) \ > > I have a suggestion that simplifies UX, I think. In other cases where we do these kinds of intervals, we just use `0` as the "off" value. See for example `AsyncDeflationInterval`, `GuaranteedSafepointInterval`. This would allow users to supply one option only. > > We would then need to decide if we turn this thing on by default (probably with large interval), or we default to `0` for "off". I would prefer to go for `0`. I understand that would force users to decide on proper trim interval when enabling, but I think that's a feature, not a bug. We would not have to go into discussions if the 60 second default is good enough or not. 
>
> Something like this:
>
>
>   product(uintx, TrimNativeHeapInterval, 0, EXPERIMENTAL,                   \
>           "Attempt to trim the native heap every so many milliseconds, "    \
>           "if platform supports it. Lower values provide better footprint " \
>           "under native allocation spikes, while higher values come with "  \
>           "less overhead. Use 0 to disable trimming."                       \

Makes sense. I was afraid of people shooting themselves in the foot if not presented with a sensible default, but you are right, we have precedent with similar switches.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258185263

From duke at openjdk.org Mon Jul 10 12:59:02 2023
From: duke at openjdk.org (Ashutosh Mehra)
Date: Mon, 10 Jul 2023 12:59:02 GMT
Subject: RFR: 8311102: Write annotations in the classfile dumped by SA
In-Reply-To:
References:
Message-ID:

On Fri, 30 Jun 2023 14:32:03 GMT, Ashutosh Mehra wrote:

> Please review this PR that enables ClassWriter to write annotations to the class file being dumped.
>
> The fields annotations are stored in `Annotations::_fields_annotations` which is of type `Array<Array<u1>*>`. There is no class in SA that can represent it. I have added ArrayOfU1Array to correspond to the type `Array<Array<u1>*>` and it works. I believe there are better approaches but that would require a bit more refactoring of the classes representing Array types. I will leave that for future work for now.
>
> Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb`
> Tested it manually by dumping j.l.String class and comparing the annotations with the original class using javap.
> The test case mentioned in [JDK-8311101](https://bugs.openjdk.org/browse/JDK-8311101) would provide better coverage.

Looking for reviewers for this. TIA.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14735#issuecomment-1628904428 From stuefe at openjdk.org Mon Jul 10 13:06:12 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 10 Jul 2023 13:06:12 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v7] In-Reply-To: <9Y8PuWRM5tHL9gPGjZ3K991gSFH4AJZqD2tQVZbCQHA=.0a711197-4c7b-4bc4-93b0-1cf53a706261@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <9Y8PuWRM5tHL9gPGjZ3K991gSFH4AJZqD2tQVZbCQHA=.0a711197-4c7b-4bc4-93b0-1cf53a706261@github.com> Message-ID: On Mon, 10 Jul 2023 10:54:11 GMT, Aleksey Shipilev wrote: >> Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: >> >> - Add test with 1ms trim interval >> - No need for atomics > > src/hotspot/share/runtime/trimNativeHeap.cpp line 79: > >> 77: // in seconds >> 78: static double now() { return os::elapsedTime(); } >> 79: static double to_ms(double seconds) { return seconds * 1000.0; } > > Would you like to just do it in `int64_t` representing milliseconds? The common way to get it is `nanos_to_millis(os::elapsed_counter())`. > > Turning this to integer would also obviate stuff like `MAX2(1.0, now - time)`. I would like to retain sub-ms precision for printing. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258230419 From rkennke at openjdk.org Mon Jul 10 13:28:50 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 10 Jul 2023 13:28:50 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v49] In-Reply-To: References: Message-ID: <3-ULrfsZinkdVexeYoJl7A8hsQMdjTNAh0JXONjOn3k=.d0af1f99-81aa-4c59-824f-e6e41d576ce1@github.com> > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. 
> > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. > > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. > > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: Fix GetObjectSizeIntrinsicsTest.java to work correctly with +/-UseCCP ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/3e37e785..89fd83f5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=48 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=47-48 Stats: 11 lines in 1 file changed: 3 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From shade at openjdk.org Mon Jul 10 13:42:18 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 10 Jul 2023 13:42:18 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v7] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <9Y8PuWRM5tHL9gPGjZ3K991gSFH4AJZqD2tQVZbCQHA=.0a711197-4c7b-4bc4-93b0-1cf53a706261@github.com> Message-ID: On Mon, 10 Jul 2023 12:27:44 GMT, Thomas Stuefe wrote: >> src/hotspot/share/runtime/trimNativeHeap.cpp line 106: >> >>> 
104: ml.wait(wait_ms);
>>> 105: } else if (at_or_nearing_safepoint()) {
>>> 106: ml.wait(safepoint_poll_ms);
>>
>> OK, so here is a little problem. Suppose I want to run trims very often, like every 10ms. This loop would stall for 250ms when safepoint is detected, which throws off this guarantee. Can we instead go and sleep for `TrimNativeHeapInterval`? AFAICs, this plays nicely with heuristic guidance (short intervals -> more interference), and it would best-effort stall for twice the interval when safepoint interjects.
>
> But then we have a problem for larger trim intervals. Losing one or multiple trim attempts because a safepoint happened to happen hurts if the interval is e.g. 5 minutes.
>
> We could either wait for `MIN2(TrimNativeHeapInterval, safepoint_poll_ms)`.
>
> Or, at the cost of one Mutex grab per safepoint, I could do a `notify_all()` at the end of a safepoint.

Yes, waiting for `MIN2(TNHI, )` would be my preference. Not sure how 250ms was chosen, probably to be slightly above `MaxGCPauseMillis`? Should document the reasoning a bit.

Let's not grab more mutexes during safepoint. This is an opportunistic feature, we should not risk deadlock/longer safepoints.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258282439

From stuefe at openjdk.org Mon Jul 10 13:53:36 2023
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Mon, 10 Jul 2023 13:53:36 GMT
Subject: RFR: JDK-8293114: JVM should trim the native heap [v8]
In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com>
References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com>
Message-ID:

> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085.
> > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. 
"Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 32 additional commits since the last revision: - Make test spikes more pronounced - Dont query procfs if logging is off - rename logtag again - When probing for safepoint end, use the smaller of (interval, 250ms) - Remove TrimNativeHeap and expand TrimNativeHeapInterval - Improve comments for non-supportive platforms - Aleksey cosmetics - suspend count return 16 bits - Fix linker errors - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap - ... and 22 more: https://git.openjdk.org/jdk/compare/a892b88f...15566761 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14781/files - new: https://git.openjdk.org/jdk/pull/14781/files/aa4dbc0b..15566761 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=06-07 Stats: 2080 lines in 82 files changed: 1021 ins; 878 del; 181 mod Patch: https://git.openjdk.org/jdk/pull/14781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781 PR: https://git.openjdk.org/jdk/pull/14781 From stuefe at openjdk.org Mon Jul 10 13:53:39 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 10 Jul 2023 13:53:39 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v7] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Sat, 8 Jul 2023 08:01:32 GMT, Thomas Stuefe wrote: >> This is a continuation of 
https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? 
>>
>> Upon `free()`, glibc may return memory to the OS if:
>> - the returned block was mmap'ed
>> - the returned block was not added to tcache or to fastbins
>> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case:
>>   a) for the main arena, glibc attempts to lower the brk()
>>   b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap.
>> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed.
>>
>> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely.
>>
>> To increase the ...
>
> Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision:
>
> - Add test with 1ms trim interval
> - No need for atomics

New version, mostly addressing feedback from @shipilev.

Other than that:
- slightly changed the way we log when trimming (one line only)
- only calculate the RSS reduction if we actually log; this saves an access to procfs, which may matter for small intervals
- slight test tweaks

Re-executed the tests for release, fastdebug, fastdebug x86.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1629006971

From stuefe at openjdk.org Mon Jul 10 13:54:27 2023
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Mon, 10 Jul 2023 13:54:27 GMT
Subject: RFR: JDK-8293114: JVM should trim the native heap [v7]
In-Reply-To:
References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com>
 <9Y8PuWRM5tHL9gPGjZ3K991gSFH4AJZqD2tQVZbCQHA=.0a711197-4c7b-4bc4-93b0-1cf53a706261@github.com>
Message-ID:

On Mon, 10 Jul 2023 13:38:50 GMT, Aleksey Shipilev wrote:

>> But then we have a problem for larger trim intervals.
Losing one or multiple trim attempts because a safepoint happened to happen hurts if the interval is e.g. 5 minutes.
>>
>> We could either wait for `MIN2(TrimNativeHeapInterval, safepoint_poll_ms)`.
>>
>> Or, at the cost of one Mutex grab per safepoint, I could do a `notify_all()` at the end of a safepoint.
>
> Yes, waiting for `MIN2(TNHI, )` would be my preference. Not sure how 250ms was chosen, probably to be slightly above `MaxGCPauseMillis`? Should document the reasoning a bit.
>
> Let's not grab more mutexes during safepoint. This is an opportunistic feature, we should not risk deadlock/longer safepoints.

It was chosen arbitrarily: large enough to have a higher chance of "slipping" in between safepoints for larger trim intervals, but not so small that it wastes CPU. Admittedly, I spent less time thinking about this than this sentence takes to write.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258304917

From coleenp at openjdk.org Mon Jul 10 14:05:08 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Mon, 10 Jul 2023 14:05:08 GMT
Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking
In-Reply-To:
References:
Message-ID:

On Sat, 8 Jul 2023 00:15:01 GMT, Jiangli Zhou wrote:

> Move StringTable to JavaClassFile namespace.

This seems fine to me. We don't seem to have a namespace name convention in the style guide (was expecting lower case). But this would be an appropriate name to move SymbolTable to also if need be.

-------------

Marked as reviewed by coleenp (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14808#pullrequestreview-1522169606 From duke at openjdk.org Mon Jul 10 14:07:22 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Mon, 10 Jul 2023 14:07:22 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v8] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Mon, 10 Jul 2023 13:53:36 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. 
That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 32 additional commits since the last revision: > > - Make test spikes more pronounced > - Dont query procfs if logging is off > - rename logtag again > - When probing for safepoint end, use the smaller of (interval, 250ms) > - Remove TrimNativeHeap and expand TrimNativeHeapInterval > - Improve comments for non-supportive platforms > - Aleksey cosmetics > - suspend count return 16 bits > - Fix linker errors > - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap > - ... 
and 22 more: https://git.openjdk.org/jdk/compare/061a10c7...15566761 src/hotspot/share/runtime/trimNativeHeap.cpp line 139: > 137: double t2 = now(); > 138: if (sc.after != SIZE_MAX) { > 139: const size_t delta = sc.after < sc.before ? (sc.before - sc.after) : (sc.after - sc.before); @tstuefe under what situations can `sc.after` be more than `sc.before` after trimming? Is it to handle the case where memory allocations happened in-between the malloc_trim() and the calls to get process memory? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258323486 From pchilanomate at openjdk.org Mon Jul 10 14:45:45 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 10 Jul 2023 14:45:45 GMT Subject: [jdk21] RFR: 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite Message-ID: Clean backport of [JDK-8302351](https://bugs.openjdk.org/browse/JDK-8302351). 
Thanks, Patricio ------------- Commit messages: - Backport 0c86c31bccd676e1cfbd35898ee16e89d5752688 Changes: https://git.openjdk.org/jdk21/pull/106/files Webrev: https://webrevs.openjdk.org/?repo=jdk21&pr=106&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8302351 Stats: 5 lines in 1 file changed: 0 ins; 5 del; 0 mod Patch: https://git.openjdk.org/jdk21/pull/106.diff Fetch: git fetch https://git.openjdk.org/jdk21.git pull/106/head:pull/106 PR: https://git.openjdk.org/jdk21/pull/106 From duke at openjdk.org Mon Jul 10 14:55:07 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Mon, 10 Jul 2023 14:55:07 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 05:35:53 GMT, Ioi Lam wrote: >>> > I first ran java -Xshare:dump so all the subsequent java --version runs use the same heap size as dump time. As a result, my "before" runs had a heap relocation delta of zero, which should correspond to the best start-up time. >>> >>> Okay, thanks for clarifying. I thought `java --version` runs were using the default archive. >> >> I haven't done any optimizations yet, but I fixed a few problems in the slow-path code. 
>>
>> https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc?expand=1
>>
>>
>> # Before: no relocation
>> $ perf stat -r 40 java --version > /dev/null
>> 0.015872 +- 0.000238 seconds time elapsed ( +- 1.50% )
>>
>> # Before: force relocation (quick)
>> $ perf stat -r 40 java -Xmx4g --version > /dev/null
>> 0.016691 +- 0.000385 seconds time elapsed ( +- 2.31% )
>>
>> # Before: force relocation ("quick relocation not possible")
>> $ perf stat -r 40 java -Xmx2g --version > /dev/null
>> 0.017385 +- 0.000230 seconds time elapsed ( +- 1.32% )
>>
>> # After
>> $ perf stat -r 40 java -XX:+NewArchiveHeapLoading --version > /dev/null
>> 0.018780 +- 0.000225 seconds time elapsed ( +- 1.20% )
>>
>>
>> So the slow path is just about 3ms slower than the fastest "before" case.
>>
>> Looking at the detailed timing break down (`os::thread_cpu_time()` = ns):
>>
>>
>> $ java -XX:+NewArchiveHeapLoading -Xlog:cds+gc --version
>> [0.006s][info][cds,gc] Num objs                    : 24184
>> [0.006s][info][cds,gc] Num bytes                   : 1074640
>> [0.006s][info][cds,gc] Per obj bytes               : 44
>> [0.006s][info][cds,gc] Num references (incl nulls) : 87109
>> [0.006s][info][cds,gc] Num references relocated    : 43225
>> [0.006s][info][cds,gc] Allocation Time             : 1605084 <<<< A
>> [0.006s][info][cds,gc] Relocation Time             : 1246894
>> [0.006s][info][cds,gc] Table(s) dispose Time       : 1306
>>
>> $ java -XX:+NewArchiveHeapLoading -XX:NewArchiveHeapNumAllocs=2 -Xlog:cds+gc --version
>> [0.006s][info][cds,gc] Allocation Time             : 2203781 <<<< B
>>
>> $ java -XX:+NewArchiveHeapLoading -XX:NewArchiveHeapNumAllocs=-1 -Xlog:cds+gc --version
>> [0.003s][info][cds,gc] Allocation Time             : 282125 <<<< C
>>
>> $ java -XX:+NewArchiveHeapLoading -XX:NewArchiveHeapNumAllocs=0 -Xlog:cds+gc --version
>> [0.004s][inf...
>
>> I hope to implement a fast path for relocation that avoids using the hash tables at all.
If we can get the total alloc + reloc time to be about 1.5ms, then it would be just as fast as before when relocation is enabled.
>
> I've implemented a fast relocation lookup. It currently uses a table of the same size as the archived heap objects, but I can reduce that to 1/2 the size.
>
> See https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc?expand=1
>
> This is implemented by about 330 lines of code in archiveHeapLoader.cpp. The code is templatized to try out different approaches (like `-XX:+NahlRawAlloc` and `-XX:+NahlUseAccessAPI`), so it can be further simplified.
>
> There's only one thing that's not yet implemented -- the equivalent of `ArchiveHeapLoader::patch_native_pointers()`. I'll do that next.
>
>
> $ java -XX:+NewArchiveHeapLoading -Xmx128m -Xlog:cds+gc --version
> [0.004s][info][cds,gc] Delayed allocation records alloced: 640
> [0.004s][info][cds,gc] Load Time: 1388458
>
>
> The whole allocation + reloc takes about 1.4ms. It's about 1.25ms slower in the worst case (when the "old" code doesn't have to relocate -- see the `(**)` in the table below). It's 0.8ms slower when the "old" code has to relocate.
>
>
> All times are in ms, for "java --version"
>
> ====================================
> Dump: java -Xshare:dump -Xmx128m
>
> G1        old      new     diff
>  128m  14.476   15.754   +1.277 (**)
> 8192m  15.359   16.085   +0.726
>
>
> Serial    old      new
>  128m  13.442   14.241   +0.798
> 8192m  13.740   14.532   +0.791
>
> ====================================
> Dump: java -Xshare:dump -Xmx8192m
>
> G1        old      new     diff
>  128m  14.975   15.787   +0.812
> 2048m  16.239   17.035   +0.796
> 8192m  14.821   16.042   +1.221 (**)
>
>
> Serial    old      new
>  128m  13.444   14.167   +0.723
> 8192m  13.717   14.502   +0.785
>
>
> While the code is slower than before, it's a lot simpler. It works on all collectors. I tested on ZGC, but I think Shenandoah should work as well.
>
> The cost is about 1.3 ms per MB of archived heap objects.
This may be acceptable as it's a small fraction of JVM bootstrap. We have about 1MB of archived objects now, and we don't expect this size to drastically increase in the near future. > > The extra memory cost is: > > - a temporary in-memory copy of the archived heap objects > - a temporary table of 1/2 the size of the archived heap objects > > The former can be reduced by readi... @iklam can you please elaborate a bit on relocation optimizations being done by the patch. Without any background on the idea, it is difficult to infer it from the code. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1629132283 From coleenp at openjdk.org Mon Jul 10 15:10:25 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 10 Jul 2023 15:10:25 GMT Subject: [jdk21] RFR: 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 14:23:18 GMT, Patricio Chilano Mateo wrote: > Clean backport of [JDK-8302351](https://bugs.openjdk.org/browse/JDK-8302351). > > Thanks, > Patricio Looks good. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk21/pull/106#pullrequestreview-1522314384 From eastigeevich at openjdk.org Mon Jul 10 15:19:18 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 10 Jul 2023 15:19:18 GMT Subject: RFR: 8307352: AARCH64: Improve itable_stub [v2] In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 05:37:35 GMT, Boris Ulasevich wrote: >> This is a change for AARCH similar to https://github.com/openjdk/jdk/pull/13460 >> >> The change replaces two separate iterations over the itable with a new algorithm consisting of two loops. 
First, we look for a match with resolved_klass, checking for a match with holder_klass along the way. Then we continue iterating (not starting over) the itable using the second loop, checking only for a match with holder_klass.
>>
>> InterfaceCalls openjdk benchmark performance results on A53, A72, Neoverse N1 and V1 micro-architectures:
>>
>>
>> Cortex-A53 (Pi 3 Model B Rev 1.2)
>>
>> test1stInt2Types    37.5      37.358     0.38
>> test1stInt3Types   160.166   148.04      8.19
>> test1stInt5Types   158.131   147.955     6.88
>> test2ndInt2Types    52.634    53.291    -1.23
>> test2ndInt3Types   201.39    181.603    10.90
>> test2ndInt5Types   195.722   176.707    10.76
>> testIfaceCall      157.453   140.498    12.07
>> testIfaceExtCall   175.46    154.351    13.68
>> testMonomorphic     32.052    32.039     0.04
>> AVG: 6.85
>>
>> Cortex-A72 (Pi 4 Model B Rev 1.2)
>>
>> test1stInt2Types    27.4796   27.4738    0.02
>> test1stInt3Types    66.0085   64.9374    1.65
>> test1stInt5Types    67.9812   66.2316    2.64
>> test2ndInt2Types    32.0581   32.062    -0.01
>> test2ndInt3Types    68.2715   65.6643    3.97
>> test2ndInt5Types    68.1012   65.8024    3.49
>> testIfaceCall       64.0684   64.1811   -0.18
>> testIfaceExtCall    91.6226   81.5867   12.30
>> testMonomorphic     26.7161   26.7142    0.01
>> AVG: 2.66
>>
>> Neoverse N1 (m6g.metal)
>>
>> test1stInt2Types     2.9104    2.9086    0.06
>> test1stInt3Types    10.9642   10.2909    6.54
>> test1stInt5Types    10.9607   10.2856    6.56
>> test2ndInt2Types     3.3410    3.3478   -0.20
>> test2ndInt3Types    12.3291   11.3089    9.02
>> test2ndInt5Types    12.328    11.2704    9.38
>> testIfaceCall       11.0598   10.3657    6.70
>> testIfaceExtCall    13.0692   11.2826   15.84
>> testMonomorphic      2.2354    2.2341    0.06
>> AVG: 6.00
>>
>> Neoverse V1 (c7g.2xlarge)
>>
>> test1stInt2Types     2.2317    2.2320   -0.01
>> test1stInt3Types     6.6884    6.1911    8.03
>> test1stInt5Types     6.7334    6.2193    8.27
>> test2ndInt2Types     2.4002    2.4013   -0.04
>> test2ndInt3Types     7.9603    7.0372   13.12
>> test2ndInt5Types     7.9532    7.0474   12.85
>> testIfaceCall        6.7028    6.3272    5.94
>> testIfaceExtCall     8.3253    6.941...
> > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since the last revision: > > 8307352: AARCH64: Improve itable_stub src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1233: > 1231: ldr(temp_itbl_klass, Address(recv_klass, scan_temp, Address::lsl(3))); > 1232: mov(holder_offset, zr); > 1233: lea(scan_temp, Address(recv_klass, scan_temp, Address::lsl(3))); The comments to the code help little to understand the origins of the generated code. I understand the generated code corresponds to: temp_itbl_klass = (itableOffsetEntry*)(recv_klass->start_of_itable())[0].interface_klass() which is temp_itbl_klass = *(recv_klass + Klass::vtable_start_offset() + sizeof(intptr_t)*recv_klass->_vtable_length + itableOffsetEntry::offset_offset()) You store `recv_klass + Klass::vtable_start_offset() +itableOffsetEntry::offset_offset()` into `recv_klass`. BTW, you assume `sizeof(intptr_t) == 8` to use `Address::lsl(3)`. Should this be mentioned in comments? You store `recv_klass + sizeof(intptr_t)*recv_klass->_vtable_length` into `scan_temp` to access `itable[i].interface_klass()`. I think the code is worth adding more comments. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13792#discussion_r1258423029 From pchilanomate at openjdk.org Mon Jul 10 16:00:19 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 10 Jul 2023 16:00:19 GMT Subject: [jdk21] RFR: 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 15:06:44 GMT, Coleen Phillimore wrote: > Looks good. > Thanks for the review Coleen! 
------------- PR Comment: https://git.openjdk.org/jdk21/pull/106#issuecomment-1629244380 From iklam at openjdk.org Mon Jul 10 16:15:14 2023 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 10 Jul 2023 16:15:14 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 05:35:53 GMT, Ioi Lam wrote: >>> > I first ran java -Xshare:dump so all the subsequent java --version runs use the same heap size as dump time. As a result, my "before" runs had a heap relocation delta of zero, which should correspond to the best start-up time. >>> >>> Okay, thanks for clarifying. I thought `java --version` runs were using the default archive. >> >> I haven't done any optimizations yet, but I fixed a few problems in the slow-path code. >> >> https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc?expand=1 >> >> >> # Before: no relocation >> $ perf stat -r 40 java --version > /dev/null >> 0.015872 +- 0.000238 seconds time elapsed ( +- 1.50% ) >> >> # Before: force relocation (quick) >> $ perf stat -r 40 java -Xmx4g --version > /dev/null >> 0.016691 +- 0.000385 seconds time elapsed ( +- 2.31% ) >> >> # Before: force relocation ("quick relocation not possible") >> $ perf stat -r 40 java -Xmx2g --version > /dev/null >> 0.017385 +- 0.000230 seconds time elapsed ( +- 1.32% ) >> >> # After >> $ perf stat -r 40 java -XX:+NewArchiveHeapLoading --version > /dev/null >> 0.018780 +- 0.000225 seconds time elapsed ( +- 1.20% ) >> >> >> So the slow path is just about 3ms slower than the fastest "before" case. 
>> >> Looking at the detailed timing break down (`os::thread_cpu_time()` = ns): >> >> >> $ java -XX:+NewArchiveHeapLoading -Xlog:cds+gc --version >> [0.006s][info][cds,gc] Num objs : 24184 >> [0.006s][info][cds,gc] Num bytes : 1074640 >> [0.006s][info][cds,gc] Per obj bytes : 44 >> [0.006s][info][cds,gc] Num references (incl nulls) : 87109 >> [0.006s][info][cds,gc] Num references relocated : 43225 >> [0.006s][info][cds,gc] Allocation Time : 1605084 <<<< A >> [0.006s][info][cds,gc] Relocation Time : 1246894 >> [0.006s][info][cds,gc] Table(s) dispose Time : 1306 >> >> $ java -XX:+NewArchiveHeapLoading -XX:NewArchiveHeapNumAllocs=2 -Xlog:cds+gc --version >> [0.006s][info][cds,gc] Allocation Time : 2203781 <<<< B >> >> $ java -XX:+NewArchiveHeapLoading -XX:NewArchiveHeapNumAllocs=-1 -Xlog:cds+gc --version >> [0.003s][info][cds,gc] Allocation Time : 282125 <<<< C >> >> $ java -XX:+NewArchiveHeapLoading -XX:NewArchiveHeapNumAllocs=0 -Xlog:cds+gc --version >> [0.004s][inf... > >> I hope to implement a fast path for relocation that avoids using the hash tables at all. If we can get the total alloc + reloc time to be about 1.5ms, then it would be just as fast as before when relocation is enabled. > > I've implemented a fast relocation lookup. It currently uses a table of the same size as the archived heap objects, but I can reduce that to 1/2 the size. > > See https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc?expand=1 > > This is implemented by about 330 lines of code in archiveHeapLoader.cpp. The code is templatized to try out different approaches (like `-XX:+NahlRawAlloc` and `-XX:+NahlUseAccessAPI`), so it can be further simplified. > > There's only one thing that's not yet implemented -- the equivalent of `ArchiveHeapLoader::patch_native_pointers()`. I'll do that next. 
> > > $ java -XX:+NewArchiveHeapLoading -Xmx128m -Xlog:cds+gc --version > [0.004s][info][cds,gc] Delayed allocation records alloced: 640 > [0.004s][info][cds,gc] Load Time: 1388458 > > > The whole allocation + reloc takes about 1.4ms. It's about 1.25ms slower in the worst case (when the "old" code doesn't have to relocate -- see the `(**)` in the table below). It's 0.8ms slower when the "old" code has to relocate. > > > All times are in ms, for "java --version" > > ==================================== > Dump: java -Xshare:dump -Xmx128m > > G1 old new diff > 128m 14.476 15.754 +1.277 (**) > 8192m 15.359 16.085 +0.726 > > > Serial old new > 128m 13.442 14.241 +0.798 > 8192m 13.740 14.532 +0.791 > > ==================================== > Dump: java -Xshare:dump -Xmx8192m > > G1 old new diff > 128m 14.975 15.787 +0.812 > 2048m 16.239 17.035 +0.796 > 8192m 14.821 16.042 +1.221 (**) > > > Serial old new > 128m 13.444 14.167 +0.723 > 8192m 13.717 14.502 +0.785 > > > While the code is slower than before, it's a lot simpler. It works on all collectors. I tested on ZGC, but I think Shenandoah should work as well. > > The cost is about 1.3 ms per MB of archived heap objects. This may be acceptable as it's a small fraction of JVM bootstrap. We have about 1MB of archived objects now, and we don't expect this size to drastically increase in the near future. > > The extra memory cost is: > > - a temporary in-memory copy of the archived heap objects > - a temporary table of 1/2 the size of the archived heap objects > > The former can be reduced by readi... > @iklam can you please elaborate a bit on relocation optimizations being done by the patch. Without any background on the idea, it is difficult to infer it from the code. The algorithm tries to materialize all objects and relocate their oop fields in a single pass. 
(Each object has a "stream address" (its location in the input stream) and a "materialized address" (its location in the runtime heap).)

- Materialize one object from the input stream.
- Enter the materialized address of this object in the `reloc_table`. Since the input stream is contiguous, we can index `reloc_table` by the offset of this object's stream address from the bottom of the input stream.
- For each non-null oop pointer in the materialized object:
  - If the pointee's stream address is lower than that of the current object, update the pointer with the pointee's materialized address, which is already in `reloc_table`.
  - Otherwise, enter the location of this pointer into the pointee's `reloc_table` entry, as a node in a linked list of the `Dst` type. When the pointee is materialized, it walks its own `Dst` list and relocates all pointers to itself.

My branch contains a separate patch for [JDK-8251330](https://bugs.openjdk.org/browse/JDK-8251330) -- the input stream is ordered such that:

- the first 50% of the input stream contains no pointers, so relocation can be skipped altogether
- in the remaining input stream, about 90% of the 43225 pointers point below the current object, so they can be relocated quickly. Fewer than 640 `Dst` are needed for the "delayed relocation".
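For readers following along, the single-pass scheme described above can be sketched roughly as follows. This is an illustrative model only, not the code in archiveHeapLoader.cpp: `StreamObj`, `HeapObj`, `RelocEntry`, `LoadResult`, `materialize` and `heap_base` are hypothetical names, objects are addressed by index rather than by real addresses, and each `reloc_table` entry keeps the materialized address and the pending `Dst` records side by side instead of sharing one slot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified model of the single-pass materialize + relocate idea.
constexpr std::int64_t kNull = -1;

struct StreamObj {                    // object as laid out in the input stream
  std::vector<std::int64_t> fields;   // each field: stream index of the pointee, or kNull
};

struct HeapObj {                      // object after materialization
  std::vector<std::int64_t> fields;   // each field: materialized address, or kNull
};

struct Dst {                          // a pointer slot that is waiting for its pointee
  std::size_t obj;                    // index of the already materialized object
  std::size_t field;                  // field within that object
};

struct RelocEntry {
  std::int64_t materialized = kNull;  // known once the object has been materialized
  std::vector<Dst> pending;           // "delayed relocation" records for this pointee
};

struct LoadResult {
  std::vector<HeapObj> heap;
  std::size_t delayed = 0;            // number of Dst records that were needed
};

LoadResult materialize(const std::vector<StreamObj>& stream, std::int64_t heap_base) {
  LoadResult result;
  result.heap.reserve(stream.size());
  std::vector<RelocEntry> reloc_table(stream.size());

  for (std::size_t cur = 0; cur < stream.size(); ++cur) {
    // Materialize one object; its fields still hold stream indices at this point.
    result.heap.push_back(HeapObj{stream[cur].fields});
    const std::int64_t addr = heap_base + static_cast<std::int64_t>(cur);

    // Record the materialized address, indexed by the object's stream position.
    reloc_table[cur].materialized = addr;

    // Patch every earlier pointer that was waiting for this object.
    for (const Dst& d : reloc_table[cur].pending) {
      result.heap[d.obj].fields[d.field] = addr;
    }
    reloc_table[cur].pending.clear();

    // Relocate this object's own non-null pointers.
    for (std::size_t f = 0; f < stream[cur].fields.size(); ++f) {
      const std::int64_t target = stream[cur].fields[f];
      if (target == kNull) continue;
      const std::size_t t = static_cast<std::size_t>(target);
      if (t <= cur) {
        // Pointee (or the object itself) is already materialized: direct lookup.
        result.heap[cur].fields[f] = reloc_table[t].materialized;
      } else {
        // Pointee comes later in the stream: remember this slot for later.
        reloc_table[t].pending.push_back(Dst{cur, f});
        ++result.delayed;
      }
    }
  }
  return result;
}
```

In this model only forward references cost a `Dst` record; backward and self references are resolved by a single table lookup, which is why ordering the stream so that most pointers point "below" the current object keeps the delayed list short.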
------------- PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1629268586 From duke at openjdk.org Mon Jul 10 16:31:18 2023 From: duke at openjdk.org (sid8606) Date: Mon, 10 Jul 2023 16:31:18 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 12:24:10 GMT, Jorn Vernee wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Address Amit's review comments > > src/hotspot/cpu/s390/upcallLinker_s390.cpp line 243: > >> 241: case T_CHAR: >> 242: case T_INT: >> 243: __ z_lgfr(Z_RET, Z_RET); // Clear garbage in high half. > > Same as PPC here; this should really be done on the Java side. (See: https://github.com/openjdk/jdk/pull/12708#issuecomment-1440433079 and related discussion) I plan to fix it later once done with experiments with this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1258555188 From jiangli at openjdk.org Mon Jul 10 16:40:06 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Mon, 10 Jul 2023 16:40:06 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: > Move StringTable to JavaClassFile namespace. Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Merge branch 'master' into JDK-8311661 - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. 
- 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14808/files - new: https://git.openjdk.org/jdk/pull/14808/files/cb9a4a22..c31ad972 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14808&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14808&range=00-01 Stats: 594 lines in 42 files changed: 293 ins; 179 del; 122 mod Patch: https://git.openjdk.org/jdk/pull/14808.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14808/head:pull/14808 PR: https://git.openjdk.org/jdk/pull/14808 From jiangli at openjdk.org Mon Jul 10 16:40:08 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Mon, 10 Jul 2023 16:40:08 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: <5-v5-ztURBv_BUUG3D3NmJyUk3qOKrO92KlKzzzsiV0=.b73cba5b-632c-4867-b4ed-fe8fb8b636cc@github.com> On Mon, 10 Jul 2023 14:02:14 GMT, Coleen Phillimore wrote: > This seems fine with me. We don't seem to have a namespace name convention in the style guide (was expecting lower case). But this would be an appropriate name to move SymbolTable to also if need be. Thanks for the review, Coleen. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1629320659 From jvernee at openjdk.org Mon Jul 10 16:50:59 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Mon, 10 Jul 2023 16:50:59 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Sat, 8 Jul 2023 10:48:15 GMT, sid8606 wrote: >> Implementation of "Foreign Function & Memory API" for s390x (Big Endian). > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Address suggestions from Jorn Vernee Tests came back green ------------- Marked as reviewed by jvernee (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14801#pullrequestreview-1522544318 From vlivanov at openjdk.org Mon Jul 10 17:04:30 2023 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Mon, 10 Jul 2023 17:04:30 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v21] In-Reply-To: References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: <1s2gGfT_bsyz2mtAr3UFbXKlXniyiK2Hk4lZmBm_Crk=.89639816-872b-436f-9863-d5044e4a9ea5@github.com> On Thu, 6 Jul 2023 13:06:30 GMT, Cesar Soares Lucas wrote: >> Can I please get reviews for this PR? >> >> The most common and frequent use of NonEscaping Phis merging object allocations is for debugging information. The two graphs below show numbers for Renaissance and DaCapo benchmarks - similar results are obtained for all other applications that I tested. >> >> With what frequency does each IR node type occurs as an allocation merge user? I.e., if the same node type uses a Phi N times the counter is incremented by N: >> >> ![image](https://user-images.githubusercontent.com/2249648/222280517-4dcf5871-2564-4207-b49e-22aee47fa49d.png) >> >> What are the most common users of allocation merges? I.e., if the same node type uses a Phi N times the counter is incremented by 1: >> >> ![image](https://user-images.githubusercontent.com/2249648/222280608-ca742a4e-1622-4e69-a778-e4db6805ea02.png) >> >> This PR adds support scalar replacing allocations participating in merges used as debug information OR as a base for field loads. I plan to create subsequent PRs to enable scalar replacement of merges used by other node types (CmpP is next on the list) subsequently. >> >> The approach I used for _rematerialization_ is pretty straightforward. It consists basically of the following. 1) New IR node (suggested by V. 
Kozlov), named SafePointScalarMergeNode, to represent a set of SafePointScalarObjectNode; 2) Each scalar replaceable input participating in a merge will get a SafePointScalarObjectNode like if it weren't part of a merge. 3) Add a new Class to support the rematerialization of SR objects that are part of a merge; 4) Patch HotSpot to be able to serialize and deserialize debug information related to allocation merges; 5) Patch C2 to generate unique types for SR objects participating in some allocation merges. >> >> The approach I used for _enabling the scalar replacement of some of the inputs of the allocation merge_ is also pretty straightforward: call `MemNode::split_through_phi` to, well, split AddP->Load* through the merge which will render the Phi useless. >> >> I tested this with JTREG tests tier 1-4 (Windows, Linux, and Mac) and didn't see regression. I also experimented with several applications and didn't see any failure. I also ran tests with "-ea -esa -Xbatch -Xcomp -XX:+UnlockExperimentalVMOptions -XX:-TieredCompilation -server -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions -XX:+StressLCM -XX:+StressGCM -XX:+StressCCP" and didn't observe any related failures. > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Merge branch 'openjdk:master' into rematerialization-of-merges > - Addressing PR feedback. > - Merge remote-tracking branch 'origin/master' into rematerialization-of-merges > - Merge branch 'openjdk:master' into rematerialization-of-merges > - Rome minor refactorings. > - Merge remote-tracking branch 'origin/master' into rematerialization-of-merges > Catching up with master. > - Address PR review 6: debug format output & some refactoring. > - Catching up with master branch. > > Merge remote-tracking branch 'origin/master' into rematerialization-of-merges > - Address PR review 6: refactoring around rematerialization & improve test cases. 
> - Address PR review 5: refactor on rematerialization & add tests. > - ... and 12 more: https://git.openjdk.org/jdk/compare/97e99f01...25b683d6 The patch looks good. I resubmitted testing with the latest version and the results are clean. ------------- Marked as reviewed by vlivanov (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/12897#pullrequestreview-1522564439 From mgronlun at openjdk.org Mon Jul 10 17:25:23 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 10 Jul 2023 17:25:23 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v4] In-Reply-To: References: Message-ID: > Greetings, > > please help review this fix for some problematic situations in JFR where data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes is in the JIRA issue. > > Testing: jdk_jfr, stress testing. > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: commentary ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14761/files - new: https://git.openjdk.org/jdk/pull/14761/files/07428b28..76b27a4d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=02-03 Stats: 16 lines in 2 files changed: 11 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/14761.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14761/head:pull/14761 PR: https://git.openjdk.org/jdk/pull/14761 From egahlin at openjdk.org Mon Jul 10 18:32:13 2023 From: egahlin at openjdk.org (Erik Gahlin) Date: Mon, 10 Jul 2023 18:32:13 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v4] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 17:25:23 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this fix for some problematic situations in JFR where
data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes is in the JIRA issue. >> >> Testing: jdk_jfr, stress testing. >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > commentary Looks good, but the intrinsics need to be reviewed by the compiler team. ------------- Marked as reviewed by egahlin (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14761#pullrequestreview-1522693975 From kvn at openjdk.org Mon Jul 10 19:55:07 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 10 Jul 2023 19:55:07 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v21] In-Reply-To: References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: On Thu, 6 Jul 2023 13:06:30 GMT, Cesar Soares Lucas wrote: >> Can I please get reviews for this PR? >> >> The most common and frequent use of NonEscaping Phis merging object allocations is for debugging information. The two graphs below show numbers for Renaissance and DaCapo benchmarks - similar results are obtained for all other applications that I tested. >> >> With what frequency does each IR node type occurs as an allocation merge user? I.e., if the same node type uses a Phi N times the counter is incremented by N: >> >> ![image](https://user-images.githubusercontent.com/2249648/222280517-4dcf5871-2564-4207-b49e-22aee47fa49d.png) >> >> What are the most common users of allocation merges? I.e., if the same node type uses a Phi N times the counter is incremented by 1: >> >> ![image](https://user-images.githubusercontent.com/2249648/222280608-ca742a4e-1622-4e69-a778-e4db6805ea02.png) >> >> This PR adds support scalar replacing allocations participating in merges used as debug information OR as a base for field loads.
I plan to create subsequent PRs to enable scalar replacement of merges used by other node types (CmpP is next on the list) subsequently. >> >> The approach I used for _rematerialization_ is pretty straightforward. It consists basically of the following. 1) New IR node (suggested by V. Kozlov), named SafePointScalarMergeNode, to represent a set of SafePointScalarObjectNode; 2) Each scalar replaceable input participating in a merge will get a SafePointScalarObjectNode like if it weren't part of a merge. 3) Add a new Class to support the rematerialization of SR objects that are part of a merge; 4) Patch HotSpot to be able to serialize and deserialize debug information related to allocation merges; 5) Patch C2 to generate unique types for SR objects participating in some allocation merges. >> >> The approach I used for _enabling the scalar replacement of some of the inputs of the allocation merge_ is also pretty straightforward: call `MemNode::split_through_phi` to, well, split AddP->Load* through the merge which will render the Phi useless. >> >> I tested this with JTREG tests tier 1-4 (Windows, Linux, and Mac) and didn't see regression. I also experimented with several applications and didn't see any failure. I also ran tests with "-ea -esa -Xbatch -Xcomp -XX:+UnlockExperimentalVMOptions -XX:-TieredCompilation -server -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions -XX:+StressLCM -XX:+StressGCM -XX:+StressCCP" and didn't observe any related failures. > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Merge branch 'openjdk:master' into rematerialization-of-merges > - Addressing PR feedback. > - Merge remote-tracking branch 'origin/master' into rematerialization-of-merges > - Merge branch 'openjdk:master' into rematerialization-of-merges > - Rome minor refactorings. 
> - Merge remote-tracking branch 'origin/master' into rematerialization-of-merges > Catching up with master. > - Address PR review 6: debug format output & some refactoring. > - Catching up with master branch. > > Merge remote-tracking branch 'origin/master' into rematerialization-of-merges > - Address PR review 6: refactoring around rematerialization & improve test cases. > - Address PR review 5: refactor on rematerialization & add tests. > - ... and 12 more: https://git.openjdk.org/jdk/compare/97e99f01...25b683d6 The final version looks good to me too. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/12897#pullrequestreview-1522912018 From jvernee at openjdk.org Mon Jul 10 21:25:04 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Mon, 10 Jul 2023 21:25:04 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 16:40:06 GMT, Jiangli Zhou wrote: >> Move StringTable to JavaClassFile namespace. > > Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8311661 > - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. > - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking The JBS issue says: "That avoids having to fix the application or libraries code, which may not be practical." but I'm not sure how changing hotspot every time an application defines a conflicting symbol is more practical? This seems like a band-aid to make one particular application work. I think a more principled solution is needed that avoids symbol conflicts going forward, if we are serious about supporting static linking of hotspot. 
For instance, the symbols that _should_ be exported from hotspot are found in several files in make/data/hotspot-symbols. Is it possible to limit the symbols exported/visible in the static library to those symbols instead? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1629746328 From coleenp at openjdk.org Mon Jul 10 22:14:13 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 10 Jul 2023 22:14:13 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 16:40:06 GMT, Jiangli Zhou wrote: >> Move StringTable to JavaClassFile namespace. > > Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8311661 > - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. > - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking How about the namespace being HotSpotClassfile (StringTable is in the classfile directory) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1629799696 From jiangli at openjdk.org Mon Jul 10 22:30:15 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Mon, 10 Jul 2023 22:30:15 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 21:19:05 GMT, Jorn Vernee wrote: >> Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8311661 >> - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. >> - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking > > The JBS issue says: "That avoids having to fix the application or libraries code, which may not be practical." but I'm not sure how changing hotspot every time an application defines a conflicting symbol is more practical? This patch seems like a band-aid to make one particular application work. I think a more principled solution is needed that avoids symbol conflicts going forward, if we are serious about supporting static linking of hotspot. > > For instance, the symbols that _should_ be exported from hotspot are found in several files in make/data/hotspot-symbols. Is it possible to limit the symbols exported/visible in the static library to those symbols instead? @JornVernee thanks for looking into this. > The JBS issue says: "That avoids having to fix the application or libraries code, which may not be practical." but I'm not sure how changing hotspot every time an application defines a conflicting symbol is more practical? This patch seems like a band-aid to make one particular application work. I think a more principled solution is needed that avoids symbol conflicts going forward, if we are serious about supporting static linking of hotspot. Changing the application or libraries code is not practical in this case as we don't have the full control of what applications or libraries may be linked with. In some cases, developers may not have access to the actual code or required insights to resolve the linkage failures. Fixing the hotspot code using namespace in this case resolves the issue from the source at once, which makes this as a more principled approach. 
> > For instance, the symbols that _should_ be exported from hotspot are found in several files in make/data/hotspot-symbols. Is it possible to limit the symbols exported/visible in the static library to those symbols instead? The linking failure caused by the duplicate symbol issue concerns internal linkage vs. global linkage, which is different from symbol exporting, IIUC. When symbols with global linkage are duplicated across compilation units that are linked together, the link fails. Exporting a symbol, on the other hand, puts the symbol in the export table so that it can be looked up dynamically. Please see some detailed discussions in: https://github.com/openjdk/jdk/pull/13451#discussion_r1166157315 https://github.com/openjdk/jdk/pull/13451#discussion_r1166209365 https://github.com/openjdk/jdk/pull/13451#discussion_r1166283513 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1629814917 From jiangli at openjdk.org Mon Jul 10 22:33:11 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Mon, 10 Jul 2023 22:33:11 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: <-w0TpWxSW9CR-Ooh1tWC4c7pDMtuuzliFxR16KiYycM=.29f921b4-f97e-4154-bb7a-f9b988e7a916@github.com> On Mon, 10 Jul 2023 22:11:26 GMT, Coleen Phillimore wrote: > How about the namespace being HotSpotClassfile (StringTable is in the classfile directory) `HotSpotClassfile` sounds ok. I did some search and `HotSpotClassfile` is not used by anyone.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1629817584 From jvernee at openjdk.org Mon Jul 10 22:52:12 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Mon, 10 Jul 2023 22:52:12 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: <9IerwETUxPNI933g1f3XRjtyWYn0tx6nZfkGkR8rOu8=.52f4af41-243c-442a-a6d2-08465c8748b6@github.com> On Mon, 10 Jul 2023 22:27:12 GMT, Jiangli Zhou wrote: > Changing the application or libraries code is not practical in this case as we don't have the full control of what applications or libraries may be linked with. In some cases, developers may not have access to the actual code or required insights to resolve the linkage failures. Fixing the hotspot code using namespace in this case resolves the issue from the source at once, which makes this as a more principled approach. I'm not suggesting that the application code should be changed. I'm suggesting that a broader solution should be found that avoids this issue in the future, rather than having to patch hotspot again for the next application that comes along and defines a symbol that conflicts. That's what I mean by "principled". I.e. we should be able to explain how to avoid symbol conflicts when statically linking applications against hotspot. I don't think that telling users that, if they have a symbol conflict: "just file a bug report and we will patch hotspot to use a different symbol" is good enough. For instance, we could put all of hotspot into a `HotSpot` namespace, and then document this name space name and require that applications that want to statically link against hotspot don't use the `HotSpot` name space. That would also avoid having to fix up uses of these symbols inside hotspot itself, since things in the same namespace can refer to each other already, without needing a qualifier. 
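To make the namespace idea concrete, here is a minimal sketch, assuming a hypothetical `HotSpot` namespace (no such namespace exists in the JDK sources today). The two `StringTable` classes and the helper functions `vm_table_owner` and `app_table_owner` are stand-ins invented for illustration. Because the enclosing namespace becomes part of each mangled symbol name, the linker sees two distinct sets of symbols instead of a duplicate `StringTable::StringTable`.

```cpp
#include <cassert>
#include <string>

// Stand-in for a class that an application or third-party library defines
// in the global namespace.
class StringTable {
 public:
  std::string owner() const { return "application"; }
};

// Hypothetical namespace wrapping VM-internal code.
namespace HotSpot {

class StringTable {
 public:
  std::string owner() const { return "hotspot"; }
};

// Code inside the namespace needs no qualifier: the unqualified name
// resolves to HotSpot::StringTable first.
inline std::string vm_table_owner() {
  StringTable table;
  return table.owner();
}

}  // namespace HotSpot

// Code outside the namespace keeps using the global StringTable unchanged.
inline std::string app_table_owner() {
  StringTable table;
  return table.owner();
}
```

This single-file sketch only demonstrates name lookup; the actual conflict arises at link time across separately compiled objects, where the same reasoning applies to the mangled names.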
> The linking failure caused by the duplicate symbol issue concerns internal linkage vs. global linkage, which is different from symbol exporting, IIUC. When symbols with global linkage are duplicated across compilation units that are linked together, the link fails. Exporting a symbol, on the other hand, puts the symbol in the export table so that it can be looked up dynamically. Please see some detailed discussions in: > > [#13451 (comment)](https://github.com/openjdk/jdk/pull/13451#discussion_r1166157315) [#13451 (comment)](https://github.com/openjdk/jdk/pull/13451#discussion_r1166209365) [#13451 (comment)](https://github.com/openjdk/jdk/pull/13451#discussion_r1166283513) Thanks for the pointers. So it sounds like limiting the symbols 'exported' by a static library is not really possible. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1629833628 From jiangli at openjdk.org Mon Jul 10 23:28:12 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Mon, 10 Jul 2023 23:28:12 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: <9IerwETUxPNI933g1f3XRjtyWYn0tx6nZfkGkR8rOu8=.52f4af41-243c-442a-a6d2-08465c8748b6@github.com> References: <9IerwETUxPNI933g1f3XRjtyWYn0tx6nZfkGkR8rOu8=.52f4af41-243c-442a-a6d2-08465c8748b6@github.com> Message-ID: On Mon, 10 Jul 2023 22:49:02 GMT, Jorn Vernee wrote: > I'm not suggesting that the application code should be changed. I'm suggesting that a broader solution should be found that avoids this issue in the future, rather than having to patch hotspot again for the next application that comes along and defines a symbol that conflicts. That's what I mean by "principled". I.e. we should be able to explain how to avoid symbol conflicts when statically linking applications against hotspot. Thanks for the clarification, @JornVernee.
> I don't think that telling users that, if they have a symbol conflict: "just file a bug report and we will patch hotspot to use a different symbol" is good enough. Agreed. > > For instance, we could put all of hotspot into a `HotSpot` namespace, and then document this name space name and require that applications that want to statically link against hotspot don't use the `HotSpot` name space. That would also avoid having to fix up uses of these symbols inside hotspot itself, since things in the same namespace can refer to each other already, without needing a qualifier. I think using a hotspot special namespace can be a good general solution for resolving and avoiding any potential duplication symbol issue like this. The good news is that with our prototype and testing on JDK 11, we didn't find many duplicate symbol issues with hotspot code specifically. So far we only observed issues with: - `StringTable::StringTable` - `Thread` - `ProfileData` > Thanks for the pointers. So it sounds like limiting the symbols 'exported' by a static library is not really possible. Right. 
We've looked into following solutions for linkage failures due to symbol conflicts and each solution may be used in different cases when applicable: - Changing to static function/variable, so the symbol has internal linkage (as suggested by @AlanBateman in earlier code review for symbol issues in JDK library code) - Use namespace - Symbol (function, variable) renaming ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1629864824 From jiangli at openjdk.org Tue Jul 11 00:16:20 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 11 Jul 2023 00:16:20 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: <-w0TpWxSW9CR-Ooh1tWC4c7pDMtuuzliFxR16KiYycM=.29f921b4-f97e-4154-bb7a-f9b988e7a916@github.com> References: <-w0TpWxSW9CR-Ooh1tWC4c7pDMtuuzliFxR16KiYycM=.29f921b4-f97e-4154-bb7a-f9b988e7a916@github.com> Message-ID: On Mon, 10 Jul 2023 22:30:29 GMT, Jiangli Zhou wrote: > > How about the namespace being HotSpotClassfile (StringTable is in the classfile directory) > > `HotSpotClassfile` sounds ok. > > I did some search and `HotSpotClassfile` is not used by anyone. I'll change the namespace from `JavaClassFile` to `HotSpotClassFile` as Coleen suggested. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1629902921 From coleenp at openjdk.org Tue Jul 11 00:16:22 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 11 Jul 2023 00:16:22 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 16:40:06 GMT, Jiangli Zhou wrote: >> Move StringTable to JavaClassFile namespace. > > Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8311661 > - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. > - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking What solutions do you have to resolve "Thread" and "ProfileData"? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1629904499 From jiangli at openjdk.org Tue Jul 11 00:28:54 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 11 Jul 2023 00:28:54 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 00:13:19 GMT, Coleen Phillimore wrote: > What solutions do you have to resolve "Thread" and "ProfileData"? For the `Thread` symbol issue, we simply #defined the symbol to `HotspotBaseThread` in globalDefinitions.hpp in our prototype on JDK 11, to avoid a large delta. It would be a large change (in terms of touched files and lines) with a namespace change. I'll file a bug for the Thread symbol for discussion. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1629914710 From jiangli at openjdk.org Tue Jul 11 00:44:13 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 11 Jul 2023 00:44:13 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 00:26:24 GMT, Jiangli Zhou wrote: > > What solutions do you have to resolve "Thread" and "ProfileData"? > > For the `Thread` symbol issue, we simply #defined the symbol to `HotspotBaseThread` in globalDefinitions.hpp in our prototype on JDK 11, to avoid a large delta. It would be a large change (in terms of touched files and lines) with a namespace change. I'll file a bug for the Thread symbol for discussion.
https://bugs.openjdk.org/browse/JDK-8311846 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1629924519 From coleenp at openjdk.org Tue Jul 11 01:06:16 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 11 Jul 2023 01:06:16 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 05:18:01 GMT, Jean-Philippe Bempel wrote: > Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in a unresolved state. So the constant pool merging process was considering the entry with pre-resolved class as different compared to the destination and create a new entry. We now try to consider it as equal specially for Methodref/Fieldref. ConstantPool merging is still really complicated and JVM_CONSTANT_Class/UnresolvedClass (and UnresolvedClassInError) may have changed to not track with what RedefineClasses does. If I'm reading this correctly, we are always calling find_matching_entry() and compare_entries_to() between the merged_cp and scratch_cp. Since we unresolve classes in merge_cp, and scratch_cp hasn't resolved anything yet, how are we getting this mismatch? Can you post a stack trace for us? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14780#issuecomment-1629940553 From coleenp at openjdk.org Tue Jul 11 01:37:27 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 11 Jul 2023 01:37:27 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 16:40:06 GMT, Jiangli Zhou wrote: >> Move StringTable to JavaClassFile namespace. > > Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8311661 > - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. > - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking Ok, yeah, I'd hate for Thread to get a namespace:: prepended to the million uses of it. Maybe the same approach should be used for StringTable too, since they're solving the same problem. Is there an #if STATIC_LINK or something that surrounds this #define? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1629965664 From iklam at openjdk.org Tue Jul 11 04:26:12 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 11 Jul 2023 04:26:12 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 16:40:06 GMT, Jiangli Zhou wrote: >> Move StringTable to JavaClassFile namespace. > > Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8311661 > - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. > - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking If there are only 3 problematic symbols, wouldn't it be easier to configure the JDK with bash configure --with-extra-cxxflags='-DStringTable=HotSpotStringTable -DThread=HotSpotThread -DProfileData=HotSpotProfileData' That way, you don't need to submit a new PR every time there's a new conflict. 
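The `--with-extra-cxxflags` suggestion above works purely at the preprocessor level. The following is a rough sketch of that mechanism under stated assumptions: the `#define` is written in the source only to keep the example self-contained (in the suggestion it would come from `-DStringTable=HotSpotStringTable` on the compiler command line), and the class bodies and `vm_table_name` are hypothetical.

```cpp
#include <cassert>
#include <string>

// Stand-in for -DStringTable=HotSpotStringTable on the command line.
#define StringTable HotSpotStringTable

// The preprocessor rewrites this to "class HotSpotStringTable".
class StringTable {
 public:
  static std::string name() { return "vm"; }
};

// Also rewritten, so this calls HotSpotStringTable::name().
inline std::string vm_table_name() { return StringTable::name(); }

// Translation units built without the macro see the plain name again.
#undef StringTable

// Hypothetical application class; no clash with the renamed VM class.
class StringTable {
 public:
  static std::string name() { return "app"; }
};

inline void self_check() {
  assert(vm_table_name() == "vm");
  assert(StringTable::name() == "app");
}
```

One caveat of this approach is that the macro rewrites every token spelled `StringTable` in the affected translation units, including any that appear in third-party headers included there.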
------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1630091307 From jiangli at openjdk.org Tue Jul 11 05:08:54 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 11 Jul 2023 05:08:54 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 01:34:13 GMT, Coleen Phillimore wrote: > Ok, yeah, I'd hate for Thread to get a namespace:: prepended to the million uses of it. Maybe the same approach should be used for StringTable too, since they're solving the same problem. `#define` renaming is a good workaround to avoid editing larger number of files and reducing integration/merge issues. Using namespace seems to be a better choice in the JDK mainline. We can also reduce the needed edits with `using`. > Is there an #if STATIC_LINK or something that surrounds this #define? We are proposing removing `#ifdef STATIC_BUILD` in https://github.com/jianglizhou/jdk/tree/static-java (not complete). That allows building `.so` and `.a` from the same set of object files. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1630123619 From jiangli at openjdk.org Tue Jul 11 05:13:20 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 11 Jul 2023 05:13:20 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 04:23:08 GMT, Ioi Lam wrote: > If there are only 3 problematic symbols, wouldn't it be easier to configure the JDK with > > ``` > bash configure --with-extra-cxxflags='-DStringTable=HotSpotStringTable -DThread=HotSpotThread -DProfileData=HotSpotProfileData' > ``` > > That way, you don't need to submit a new PR every time there's a new conflict. Thanks, I haven't explored that option. Seems to be a useful quick fix when experimenting with JDK static linking. 
As we are working towards a longer term and more general solution, resolving the issues in hotspot code seems to be better. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1630130242 From stuefe at openjdk.org Tue Jul 11 05:15:21 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 11 Jul 2023 05:15:21 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v8] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Mon, 10 Jul 2023 14:04:33 GMT, Ashutosh Mehra wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 32 additional commits since the last revision: >> >> - Make test spikes more pronounced >> - Dont query procfs if logging is off >> - rename logtag again >> - When probing for safepoint end, use the smaller of (interval, 250ms) >> - Remove TrimNativeHeap and expand TrimNativeHeapInterval >> - Improve comments for non-supportive platforms >> - Aleksey cosmetics >> - suspend count return 16 bits >> - Fix linker errors >> - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap >> - ... and 22 more: https://git.openjdk.org/jdk/compare/eb3249a7...15566761 > > src/hotspot/share/runtime/trimNativeHeap.cpp line 139: > >> 137: double t2 = now(); >> 138: if (sc.after != SIZE_MAX) { >> 139: const size_t delta = sc.after < sc.before ? (sc.before - sc.after) : (sc.after - sc.before); > > @tstuefe under what situations can `sc.after` be more than `sc.before` after trimming? Is it to handle the case where memory allocations happened in-between the malloc_trim() and the calls to get process memory? Yes. The numbers we print out are RSS; there is no way to get the exact figure for how much memory has been released by the glibc (not even itself knows). 
RSS is influenced by many factors, and it's a rough outline only. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1259195531 From cjplummer at openjdk.org Tue Jul 11 05:25:13 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 11 Jul 2023 05:25:13 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA In-Reply-To: References: Message-ID: <4S1x7PphEjaz7kCe2uJTmFyaIccpEIn9fh52Sr4neMg=.6a3dac9e-feb0-459e-8972-e7cdbb48855e@github.com> On Fri, 30 Jun 2023 14:32:03 GMT, Ashutosh Mehra wrote: > Please review this PR that enables ClassWriter to write annotations to the class file being dumped. > > The fields annotations are stored in `Annotations::_fields_annotations` which is of type `Array<Array<u1>*>`. There is no class in SA that can represent it. I have added ArrayOfU1Array to correspond to the type `Array<Array<u1>*>` and it works. I believe there are better approaches but that would require a bit more refactoring of the classes representing Array types. I will leave that for future work for now. > > Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` > Tested it manually by dumping j.l.String class and comparing the annotations with the original class using javap. > The test case mentioned in [JDK-8311101](https://bugs.openjdk.org/browse/JDK-8311101) would provide better coverage. Overall the changes look good. Just a few minor suggestions. src/hotspot/share/runtime/vmStructs.cpp line 972: > 970: unchecked_nonstatic_field(Array<Klass*>, _data, sizeof(Klass*)) \ > 971: unchecked_nonstatic_field(Array<ResolvedIndyEntry>, _data, sizeof(ResolvedIndyEntry)) \ > 972: unchecked_nonstatic_field(Array<Array<u1>*>, _data, sizeof(Array<u1>*)) \ Fix alignment of the _data column.
src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Annotations.java line 74: > 72: public U1Array getFieldAnnotations(int fieldIndex) { > 73: Address addr = fieldsAnnotations.getValue(getAddress()); > 74: ArrayOfU1Array annotationsArray = VMObjectFactory.newObject(ArrayOfU1Array.class, addr); How about caching this result so you don't need to allocate a new object every time this API is called. Same thing in `getFieldTypeAnnotations()`. src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ConstMethod.java line 451: > 449: offset += 1; > 450: } > 451: Address addr = getAddress().getAddressAt((getSize() - offset) * VM.getVM().getAddressSize()); A comment on the address computation would be useful here and in the changes below. ------------- Changes requested by cjplummer (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14735#pullrequestreview-1507639071 PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1248137517 PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1259190786 PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1259199965 From stuefe at openjdk.org Tue Jul 11 07:45:05 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 11 Jul 2023 07:45:05 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA In-Reply-To: References: Message-ID: On Fri, 30 Jun 2023 14:32:03 GMT, Ashutosh Mehra wrote: > Please review this PR that enables ClassWriter to write annotations to the class file being dumped. > > The fields annotations are stored in `Annotations::_fields_annotations` which is of type `Array<Array<u1>*>`. There is no class in SA that can represent it. I have added ArrayOfU1Array to correspond to the type `Array<Array<u1>*>` and it works. I believe there are better approaches but that would require a bit more refactoring of the classes representing Array types. I will leave that for future work for now.
> > Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` > Tested it manually by dumping j.l.String class and comparing the annotations with the original class using javap. > The test case mentioned in [JDK-8311101](https://bugs.openjdk.org/browse/JDK-8311101) would provide better coverage. Hi @ashu-mehra, looks mostly good. Some nits inline. I tried it out and it works fine. What would be needed to make the Annotations appear in the "printall" command? I was somehow expecting to see at least something like "Annotation at xxxx". Cheers, Thomas src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ConstMethod.java line 470: > 468: if (hasParameterAnnotations()) { > 469: offset += 1; > 470: } Code here and in other places could be tightened: int offset = (hasMethodAnnotations() ? 1 : 0) + (hasParameterAnnotations() ? 1 : 0) + (hasTypeAnnotations() ? 1 : 0); src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ConstMethod.java line 494: > 492: offset += 1; > 493: } > 494: Address addr = getAddress().getAddressAt((getSize() - offset) * VM.getVM().getAddressSize()); Factor this address calculation out, and as @plummercj remarked, comment it? src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/InstanceKlass.java line 874: > 872: return null; > 873: } > 874: } Would calling these functions even be valid to call if Annotations are not present? If yes, maybe return Optional? But since the rest of the code does not use Optional either, maybe rather keep the code the same. Up to you, feel free to ignore this. 
------------- PR Review: https://git.openjdk.org/jdk/pull/14735#pullrequestreview-1523513001 PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1259248538 PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1259249446 PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1259259800 From stuefe at openjdk.org Tue Jul 11 07:45:07 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 11 Jul 2023 07:45:07 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA In-Reply-To: References: Message-ID: <-EeFleA4CD947u9g-IJBsRe9HpGSXLRUOxAkfP9L-0w=.32772548-808f-490b-bd09-8c6d91cf1913@github.com> On Tue, 11 Jul 2023 06:31:24 GMT, Thomas Stuefe wrote: >> Please review this PR that enables ClassWriter to write annotations to the class file being dumped. >> >> The fields annotations are stored in `Annotations::_fields_annotations` which is of type `Array<Array<u1>*>`. There is no class in SA that can represent it. I have added ArrayOfU1Array to correspond to the type `Array<Array<u1>*>` and it works. I believe there are better approaches but that would require a bit more refactoring of the classes representing Array types. I will leave that for future work for now. >> >> Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` >> Tested it manually by dumping j.l.String class and comparing the annotations with the original class using javap. >> The test case mentioned in [JDK-8311101](https://bugs.openjdk.org/browse/JDK-8311101) would provide better coverage. > > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ConstMethod.java line 470: > >> 468: if (hasParameterAnnotations()) { >> 469: offset += 1; >> 470: } > > Code here and in other places could be tightened: > > > int offset = (hasMethodAnnotations() ? 1 : 0) + > (hasParameterAnnotations() ? 1 : 0) + > (hasTypeAnnotations() ? 1 : 0); Possibly even factor it out into separate functions like e.g. `offsetOfGenericSignatureIndex` does.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1259254940 From stuefe at openjdk.org Tue Jul 11 07:51:13 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 11 Jul 2023 07:51:13 GMT Subject: RFR: JDK-8310233: Fix THP detection on Linux [v3] In-Reply-To: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> References: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> Message-ID: <0pppsqFlXMHIQPd91i6twOr-vt3JoxQt4wFwGrBqeYI=.b328e568-a114-4634-9bd8-858128e9aa18@github.com> On Tue, 4 Jul 2023 13:02:22 GMT, Thomas Stuefe wrote: >> Today, if we use UseTransparentHugePages, we assume that the *static* hugepage detection we do is valid for THPs: >> - that THPs use the page size (in hotspot used as "default large page size") found in `/proc/meminfo` ("Hugepagesize") >> - that THPs are enabled if that page size is >0. >> >> Both assumptions are incorrect: >> >> - whether THPs are enabled should be checked at `/sys/kernel/mm/transparent_hugepage/enabled`, which is a tri-state value ("always", "madvise", "never"). THPs are available for the first two states. >> - The page size employed by `khugepaged` is set in `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size`. It can differ from the default page size used for static hugepages. For example, we could configure a system such that it uses 1G static hugepages, but the THP page size would still be 2M. >> >> ------ >> >> About the patch: >> >> This is a limited, minimally invasive patch to fix THP detection. The patch aims to be easy to downport. There is more work to do, which I will do in subsequent RFEs. >> >> Functionally, for *static* (non-THP) pages nothing changes. THP-mode now correctly detects THP support in the OS, and uses the correct page size (see examples below).
>> >> ------------- >> >> Example 1: System has THPs disabled, but static hugepages (1g, 2m) configured: >> >> >> thomas at starfish $ cat /sys/kernel/mm/transparent_hugepage/enabled >> always madvise [never] >> thomas at starfish $ cat /proc/meminfo | grep Hugepage >> Hugepagesize: 1048576 kB >> >> >> Without patch, we incorrectly assume THPs are enabled, and that THP page size is 1G (!), which we then proceed and use as heap page size, causing the heap size to be rounded up from 512m -> 1G. But - even though it is printed as "1G page backed" in log output - in reality it will still 4K-page-backed: the `madvise(2)` we use to set the THP page size will be ignored by the system, since THPs are disabled. >> >> >> thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version >> [0.001s][info][pagesize] Using the default large page size: 1G >> [0.001s][info][pagesize] Usable page sizes: 4k, 2M, 1G >> ... >> [0.016s][info][pagesize] Heap: min=1G max=1G base=0x00000000c0000000 size=1G page_size=1G >> >> >> With patch, we correctly refuse to use large pages (and we log more info): >> >> >> thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version >> [0.001s][info][pagesize] Static... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Feedback johan May I have a second review? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14739#issuecomment-1630323843 From vkempik at openjdk.org Tue Jul 11 08:38:02 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Tue, 11 Jul 2023 08:38:02 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v2] In-Reply-To: References: Message-ID: On Thu, 29 Jun 2023 05:50:06 GMT, Vladimir Kempik wrote: >> Please review this fix. 
It eliminates misaligned loads in String.compare on RISC-V. >> >> For small compares (<= 72 bytes), the intrinsic in c2_MacroAssembler.cpp is used. >> It reads (in the UU/LL case) 8 bytes per loop iteration, and at the end it reads the tail with a misaligned load of the last 8 bytes of the string. >> >> So if the string length is not a multiple of 8 bytes, the last load is misaligned, and it also reads and compares some data that was already processed. >> >> I have changed that to compare only the last length%8 bytes, using the SHORT_STRING part of the intrinsic for UL/LU. For UU/LL I have made an optimised version. >> >> Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947) I've got no perf drop on thead, which supports misaligned access (with +AvoidUnalignedAccesses). >> >> Improvements to the inflate_XX() methods give 3-5% improvements for UL/LU cases on thead, and almost no perf change on hifive. >> >> For large strings, the intrinsics from stubGenerator.cpp are used. >> For UU/LL (generate_compare_long_string_same_encoding), I have just replaced the misaligned ld with load_long_misaligned. Since this tail reading is not on the hot path, this gives a small penalty on thead with -XX:+AvoidUnalignedAccesses. >> >> Large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp. >> These changes are partially based on feilongjiang's patch, but I have changed tail reading to prevent reading past the end of the string. I have observed no perf difference between feilongjiang's version and mine. >> >> This also enables the regression test for String.compare, which previously was aarch64-only. >> >> Testing: tier1 and tier2 clean on hifive. >> >> JMH testing, hifive: >> before: >>
>> Benchmark                                   (delta)  (size)  Mode  Cnt    Score   Error  Units
>> StringCompareToDifferentLength.compareToLL        2      24  avgt    9    6.474 ± 1.475  ms/op
>> StringCompareToDifferentLength.compareToLL        2      36  avgt    9  125.823 ± 1.947  ms/op
>> StringCompareToDifferentLength.compareToLL        2      72  avgt    9   10.512 ± 0.236  ms/op
>> StringCompareToDifferentLength.compareToLL        2     128  avgt    9   13.032 ± 0.821  ms/op
>> StringCompareToDifferentLength.compareToLL        2     256  avgt    9   18.983 ± 0.318  ms/op
>> StringCompareToDifferentLength.compareToLL        2     512  avgt    9   29.925 ± ...
> > Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: > > Simplify tail for shrot string compare Now that the 17u backport is done, can we please get back to this PR? We have the compare case for small strings reviewed; only the case for large strings is left, where the UU/LL case is pretty simple (just replacing ld with misaligned_load) and the case for large UL/LU strings is the only complex part. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14534#issuecomment-1630392602 From duke at openjdk.org Tue Jul 11 09:10:17 2023 From: duke at openjdk.org (Ilya Gavrilin) Date: Tue, 11 Jul 2023 09:10:17 GMT Subject: RFR: 8311862: RISC-V: small improvements to slli Message-ID: Please review this small change for slli and slli.uw. The slli change allows replacing slli Rd, Rs, 0 with addi Rd, Rs, 0 (and no instruction is emitted if Rd == Rs). addi with 0 has a higher chance of being just a register renaming that does not use the ALU at all. We have observed a small positive effect on hifive (and no change on thead). This patch also changes slli.uw so that it can be used without an additional check for UseZba, providing a fallback when Zba is not available. Testing: tier1 and tier2 on hifive, also hotspot/jtreg/compiler/intrinsics/string tests on QEMU with UseZba. Performance on hifive, before:
| Benchmark | Mode | Cnt | Score | | Error | Units |
|:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
| StringIndexOf.advancedWithShortSub1 | avgt | 25 | 4035.143 | ± | 191.262 | ns/op |
| StringIndexOf.advancedWithShortSub2 | avgt | 25 | 1145.807 | ± | 14.610 | ns/op |
with patch:
| Benchmark | Mode | Cnt | Score | | Error | Units |
|:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
| StringIndexOf.advancedWithShortSub1 | avgt | 25 | 3613.943 | ± | 178.153 | ns/op |
| StringIndexOf.advancedWithShortSub2 | avgt | 25 | 923.169 | ± | 47.123 | ns/op |
------------- Commit messages: - initial commit Changes: https://git.openjdk.org/jdk/pull/14823/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14823&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311862 Stats: 31 lines in 4 files changed: 24 ins; 1 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/14823.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14823/head:pull/14823 PR: https://git.openjdk.org/jdk/pull/14823 From luhenry at openjdk.org Tue Jul 11 09:20:01 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Tue, 11 Jul 2023 09:20:01 GMT Subject: RFR: 8311862: RISC-V: small improvements to slli In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 09:02:39 GMT, Ilya Gavrilin wrote: > Please review this small change for slli and slli.uw. > The slli change allows replacing slli Rd, Rs, 0 with addi Rd, Rs, 0 (and no instruction is emitted if Rd == Rs). > addi with 0 has a higher chance of being just a register renaming that does not use the ALU at all. > We have observed a small positive effect on hifive (and no change on thead). > This patch also changes slli.uw so that it can be used without an additional check for UseZba, providing a fallback when Zba is not available. > Testing: tier1 and tier2 on hifive, also hotspot/jtreg/compiler/intrinsics/string tests on QEMU with UseZba. > > Performance on hifive, before:
> | Benchmark | Mode | Cnt | Score | | Error | Units |
> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
> | StringIndexOf.advancedWithShortSub1 | avgt | 25 | 4035.143 | ± | 191.262 | ns/op |
> | StringIndexOf.advancedWithShortSub2 | avgt | 25 | 1145.807 | ± | 14.610 | ns/op |
>
> with patch:
> | Benchmark | Mode | Cnt | Score | | Error | Units |
> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
> | StringIndexOf.advancedWithShortSub1 | avgt | 25 | 3613.943 | ± | 178.153 | ns/op |
> | StringIndexOf.advancedWithShortSub2 | avgt | 25 | 923.169 | ± | 47.123 | ns/op |
LGTM ------------- Marked as reviewed by luhenry (Committer). PR Review: https://git.openjdk.org/jdk/pull/14823#pullrequestreview-1523829100 From eliu at openjdk.org Tue Jul 11 09:37:03 2023 From: eliu at openjdk.org (Eric Liu) Date: Tue, 11 Jul 2023 09:37:03 GMT Subject: RFR: 8306136: [vectorapi] Intrinsics of VectorMask.laneIsSet() In-Reply-To: References: <74WpJFbQAX7TMMMr-qK9nUcS_lxHHbJEmzTuWbpahfk=.97501257-dd82-43ce-b334-fb6caab35118@github.com> Message-ID: On Wed, 21 Jun 2023 19:04:39 GMT, Paul Sandoz wrote: >> VectorMask.laneIsSet() [1] is implemented based on VectorMask.toLong() [2], and its performance highly depends on the intrinsification of toLong(). However, if `toLong()` fails to intrinsify, on some architectures or unsupported species, it's much more expensive than pure getBits(). Besides, some CPUs (e.g. with Arm Neon) may not have efficient instructions to implement toLong(), so we propose to intrinsify VectorMask.laneIsSet separately. >> >> This patch optimizes laneIsSet() by calling the existing intrinsic method VectorSupport.extract(), so it does not introduce a new intrinsic method. The C2 compiler intrinsification logic to support _VectorExtract has also been extended to better support laneIsSet(). It tries to extract the mask's lane value with an ExtractUB node if the hardware backend supports it. On hardware without ExtractUB backend support, C2 will still try to generate toLong() related nodes, which behaves the same as before the patch. >> >> Key changes in this patch: >> >> 1. 
Reuse intrinsic `VectorSupport.extract()` on the Java side. No new intrinsic method is introduced. >> 2. In the compiler, `ExtractUBNode` is generated if the backend supports it. If not, the original "toLong" pattern is generated if it's implemented. Otherwise, it uses the default Java `getBits[i]` rather than the expensive and complicated toLong() based implementation. >> 3. Enable `ExtractUBNode` on AArch64 to extract the lane value for a vector mask in the compiler, together with changing its bottom type to TypeInt::BOOL. This helps optimize the conditional selection generated by >> >> ``` >> >> public boolean laneIsSet(int i) { >> return VectorSupport.extract(..., defaultImpl) == 1L; >> } >> >> ``` >> >> [Test] >> hotspot:compiler/vectorapi and jdk/incubator/vector passed. >> >> [Performance] >> >> The table below shows the performance gain on a Neon machine with 128-bit vector size. For 64 and 128 SPECIES, the improvement is caused by this intrinsic. For other SPECIES, which cannot be intrinsified, the performance gain comes from the default Java implementation changes, i.e. getBits[i] vs. toLong(). >> >>
>> Benchmark                      Gain (after/before)
>> microMaskLaneIsSetByte128_con  2.47
>> microMaskLaneIsSetByte128_var  1.82
>> microMaskLaneIsSetByte256_con  3.01
>> microMaskLaneIsSetByte256_var  3.04
>> microMaskLaneIsSetByte512_con  4.83
>> microMaskLaneIsSetByte512_var  4.86
>> microMaskLaneIsSetByt...
> > Getting crashes on linux-x64-debug when using these VM options: > > -XX:UseAVX=3 -XX:+UnlockDiagnosticVMOptions -XX:+UseKNLSetting > > For a test tag of `tier3-vector-avx512` when running tests for `open/test/jdk/:jdk_vector`. 
> > Relevant bits from the HS error log file: > > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (/opt/mach5/mesos/work_dir/slaves/cd627e65-f015-4fb1-a1d2-b6c9b8127f98-S9618/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/24abb99d-ff0d-447b-9153-bb7048d6d487/runs/d1fc7a0c-634f-4df4-9d5f-6a464bb177b9/workspace/open/src/hotspot/share/opto/vectorIntrinsics.cpp:2586), pid=1090716, tid=1090731 > # assert(!Matcher::has_predicated_vectors()) failed: should be > # > ... > Current CompileTask: > C2: 9402 989 b jdk.incubator.vector.Int256Vector$Int256Mask::laneIsSet (38 bytes) > > Stack: [0x00007fa325bfc000,0x00007fa325cfd000], sp=0x00007fa325cf78f0, free space=1006k > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x1815626] LibraryCallKit::inline_vector_extract()+0xc26 (vectorIntrinsics.cpp:2586) > V [libjvm.so+0x121cc94] LibraryIntrinsic::generate(JVMState*)+0x1c4 (library_call.cpp:117) > V [libjvm.so+0x853141] CallGenerator::do_late_inline_helper()+0x9b1 (callGenerator.cpp:695) > V [libjvm.so+0x9ea704] Compile::inline_incrementally_one()+0xd4 (compile.cpp:2015) > V [libjvm.so+0x9eb873] Compile::inline_incrementally(PhaseIterGVN&)+0x273 (compile.cpp:2098) > V [libjvm.so+0x9eebc7] Compile::Optimize()+0x427 (compile.cpp:2233) > V [libjvm.so+0x9f1f75] Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x1aa5 (compile.cpp:839) > V [libjvm.so+0x84bc04] C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x3c4 (c2compiler.cpp:118) > V [libjvm.so+0x9fdf10] CompileBroker::invoke_compiler_on_method(CompileTask*)+0xa00 (compileBroker.cpp:2265) > V [libjvm.so+0x9fed98] CompileBroker::compiler_thread_loop()+0x618 (compileBroker.cpp:1944) > V [libjvm.so+0xeb6dec] JavaThread::thread_main_inner()+0xcc (javaThread.cpp:719) > V [libjvm.so+0x17970aa] Thread::call_run()+0xba (thread.cpp:217) > V [libjvm.so+0x149715c] 
thread_native_entry(Thread*)+0x11c (os_linux.cpp:775) > ... > model name : Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz > flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl xtopology c... @PaulSandoz @jatin-bhateja Could you help to take a look? Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14200#issuecomment-1630485381 From pli at openjdk.org Tue Jul 11 10:06:36 2023 From: pli at openjdk.org (Pengfei Li) Date: Tue, 11 Jul 2023 10:06:36 GMT Subject: RFR: 8308994: C2: Re-implement experimental post loop vectorization [v2] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 15:11:20 GMT, Vladimir Kozlov wrote: >> Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Address part of comments from Emanuel > > Yes, you can remove old code first. And work on new implementation after that. Thanks @vnkozlov and @eme64, I just created https://github.com/openjdk/jdk/pull/14824 for the legacy code cleanup. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14581#issuecomment-1630536093 From stuefe at openjdk.org Tue Jul 11 10:53:07 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 11 Jul 2023 10:53:07 GMT Subject: RFR: JDK-8310233: Fix THP detection on Linux [v3] In-Reply-To: References: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> Message-ID: On Tue, 4 Jul 2023 13:11:38 GMT, Johan Sjölen wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Feedback johan > > Thank you Thomas! > > These changes look good to me now, I'm approving this. @jdksjolen was kind enough to put this through Oracle's CI, all good.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14739#issuecomment-1630601455 From mgronlun at openjdk.org Tue Jul 11 12:41:21 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 11 Jul 2023 12:41:21 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v5] In-Reply-To: References: Message-ID: > Greetings, > > please help review this fix for some problematic situations in JFR where data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes is in the JIRA issue. > > Testing: jdk_jfr, stress testing. > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: only transient promotion buffers for disk configurations ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14761/files - new: https://git.openjdk.org/jdk/pull/14761/files/76b27a4d..635d05e8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14761.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14761/head:pull/14761 PR: https://git.openjdk.org/jdk/pull/14761 From coleenp at openjdk.org Tue Jul 11 12:59:02 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 11 Jul 2023 12:59:02 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 01:26:44 GMT, Coleen Phillimore wrote: > Please review changes to fix -Wconversion warnings that come from assembler_.cpp by adding narrow_casts to the emit_int8,16,24, and 32 functions. And some other fixups with checked_cast. > > Ran tier1 on Oracle platforms, and tier1-4 on linux-x64-debug, linux-aarch64-debug, windows-x64-debug.
Thanks for looking at this and your help. I think I replied to all your comments but I think there's more work to do to make this safer. ------------- PR Review: https://git.openjdk.org/jdk/pull/14822#pullrequestreview-1524185447 From coleenp at openjdk.org Tue Jul 11 12:59:05 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 11 Jul 2023 12:59:05 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 02:09:40 GMT, Dean Long wrote: >> Please review changes to fix -Wconversion warnings that come from assembler_.cpp by adding narrow_casts to the emit_int8,16,24, and 32 functions. And some other fixups with checked_cast. >> >> Ran tier1 on Oracle platforms, and tier1-4 on linux-x64-debug, linux-aarch64-debug, windows-x64-debug. > > src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 224: > >> 222: unsigned target = *(unsigned *)a; >> 223: target &= ~mask; >> 224: target |= checked_cast(val); > > Any value that doesn't fit in 32 bits is going to fail, so it's tempting to force the callers to pass 32-bit types, but that's a bigger change. How about something like this: > > static ALWAYSINLINE void patch(address a, int msb, int lsb, uint32_t val) { > /* original code, no additional checked_cast needed */ > } > > static ALWAYSINLINE void patch(address a, int msb, int lsb, uint64_t val) { > patch(a, msb, lsb, checked_cast(val)); > } I'll try this suggestion. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14822#discussion_r1259690565 From coleenp at openjdk.org Tue Jul 11 12:59:07 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 11 Jul 2023 12:59:07 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers In-Reply-To: References: <3IzSseKzc1au1HBwc6jZCV15Qmqdu_A1_O9FTUHzx5Y=.be7154c1-8488-4d12-ba66-cf476178b5c7@github.com> Message-ID: On Tue, 11 Jul 2023 12:29:09 GMT, Coleen Phillimore wrote: >> uint64_t uval64 = val; >> unsigned uval = checked_cast(uval64); >> This won't work for negative values, right? > > val comes in signed, so we want to just chop off the sign. checked_cast(signed val) will assert. checked_cast<> doesn't do sign conversion. We don't have a cast that does sign conversion. Not sure I trust this either. I'm going to write a gtest for this. >> How many callers are passing in negative values and actually need these convenience functions? > > The overloading was really unhappy with the version of the functions that pass uint8_t for all the arguments. The callers might pass a couple uint8_t but then also a random selection of int and for one or more of the other parameters. > How many callers? Actually quite a lot for the emit_int16/24/32 ones. They pass an int imm8 value and have some (value | encode) parameter where encode is an int passed in. 
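The sign-conversion concern above can be illustrated with a minimal sketch. It assumes a round-trip style check (cast to the target type, cast back, compare); that is an assumption about how a checked cast might behave, not a copy of HotSpot's `checked_cast`. The names `survives_round_trip` and `int64_to_u32_bits` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed model of a round-trip check: the cast is accepted only if casting
// back to the source type reproduces the original value.
template <typename To, typename From>
constexpr bool survives_round_trip(From value) {
  return static_cast<From>(static_cast<To>(value)) == value;
}

// Hypothetical helper: int64_t -> int32_t (size only), then int32_t ->
// uint32_t (sign only), so each step changes one property at a time.
inline uint32_t int64_to_u32_bits(int64_t value) {
  assert(survives_round_trip<int32_t>(value) && "value does not fit in 32 bits");
  const int32_t narrowed = static_cast<int32_t>(value);
  return static_cast<uint32_t>(narrowed);
}
```

Under this model a negative `int64_t` fails a direct round trip through `uint32_t`, because the value comes back zero-extended, while the two-step route keeps the low 32 bits without tripping the check.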
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14822#discussion_r1259682829 PR Review Comment: https://git.openjdk.org/jdk/pull/14822#discussion_r1259685864 From coleenp at openjdk.org Tue Jul 11 12:59:11 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 11 Jul 2023 12:59:11 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers In-Reply-To: References: <3IzSseKzc1au1HBwc6jZCV15Qmqdu_A1_O9FTUHzx5Y=.be7154c1-8488-4d12-ba66-cf476178b5c7@github.com> Message-ID: On Tue, 11 Jul 2023 02:31:00 GMT, Dean Long wrote: >> src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 268: >> >>> 266: unsigned mask = checked_cast(right_n_bits(nbits)); >>> 267: uval &= mask; >>> 268: f(checked_cast(uval), lsb + nbits - 1, lsb); >> >> Suggestion: >> >> f(uval, lsb + nbits - 1, lsb); > > See my comment below about not trusting checked_cast to do the right thing for int64_t --> unsigned. This one seems to mask off the top half so the checked_cast<> will succeed, ie just change the type. >> src/hotspot/share/asm/assembler.hpp line 305: >> >>> 303: >>> 304: >>> 305: public: >> >> I don't think we need this. See below. > > Nevermind, I tried my alternative idea below and it didn't work. For these particular cases where we only care about going to uint8_t, we could check is8bit(). Another trick I've seen is checking if (val >> width) is 0 or -1. There is an is8bit() test that precedes these casts. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14822#discussion_r1259682275 PR Review Comment: https://git.openjdk.org/jdk/pull/14822#discussion_r1259684184 From mgronlun at openjdk.org Tue Jul 11 13:22:32 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 11 Jul 2023 13:22:32 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v6] In-Reply-To: References: Message-ID: > Greetings, > > please help review this fix for some problematic situations in JFR where data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes is in the JIRA issue. > > Testing: jdk_jfr, stress testing. > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: debug aid removals ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14761/files - new: https://git.openjdk.org/jdk/pull/14761/files/635d05e8..be1d0da1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=04-05 Stats: 2 lines in 2 files changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14761.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14761/head:pull/14761 PR: https://git.openjdk.org/jdk/pull/14761 From stuefe at openjdk.org Tue Jul 11 13:51:45 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 11 Jul 2023 13:51:45 GMT Subject: RFR: JDK-8311870: Split CompressedKlassPointers from compressedOops.hpp Message-ID: In preparation for some Lilliput-related changes, I'd like to get some purely mechanical code moves out of the way. It would also improve separation of concerns and reduces include header bloat.
In particular, this patch does: 1) Move `CompressedKlassPointers` from `compressedOops.(cpp|hpp|inline.hpp)` to `compressedKlass.(cpp|hpp|inline.hpp)` 2) flatten the `NarrowPtrStruct _narrow_klass` to `address _base; int _shift` (its implicit null check member is not needed for Klass and it has little merit otherwise). 3) moved `narrowKlass` from `oopsHierarchy.hpp` to `compressedKlass.hpp` 4) remove `KlassAlignment` and `LogKlassAlignment` (the word-sized variants, not xxxInBytes) since they are unused 5) Move `KlassEncodingMetaspaceMax`, `LogKlassAlignmentInBytes` and `KlassAlignmentInBytes` to compressedKlass.hpp 6) Fixed all include issues (including existing missing includes) 7) Fixed VM struct because of (2) Note that nothing functional is changed. ------------- Commit messages: - JDK-8311870-Split-CompressedKlassPointers-from-compressedOops.hpp Changes: https://git.openjdk.org/jdk/pull/14826/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14826&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311870 Stats: 565 lines in 35 files changed: 333 ins; 220 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/14826.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14826/head:pull/14826 PR: https://git.openjdk.org/jdk/pull/14826 From stuefe at openjdk.org Tue Jul 11 13:51:46 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 11 Jul 2023 13:51:46 GMT Subject: RFR: JDK-8311870: Split CompressedKlassPointers from compressedOops.hpp In-Reply-To: References: Message-ID: <0LIGZH-MTEMG4PArLyvq-HlvwawBVspeZlH9aHs2HVc=.88598f88-2501-45dd-8a16-dc01c32a80b6@github.com> On Tue, 11 Jul 2023 11:20:19 GMT, Thomas Stuefe wrote: > In preparation for some Lilliput-related changes, I'd like to get some purely mechanical code moves out of the way. It would also improve separation of concerns and reduces include header bloat. 
> > In particular, this patch does: > > 1) Move `CompressedKlassPointers` from `compressedOops.(cpp|hpp|inline.hpp)` to `compressedKlass.(cpp|hpp|inline.hpp)` > > 2) flatten the `NarrowPtrStruct _narrow_klass` to `address _base; int _shift` (its implicit null check member is not needed for Klass and it has little merit otherwise). > > 3) moved `narrowKlass` from `oopsHierarchy.hpp` to `compressedKlass.hpp` > > 4) remove `KlassAlignment` and `LogKlassAlignment` (the word-sized variants, not xxxInBytes) since they are unused > > 5) Move `KlassEncodingMetaspaceMax`, `LogKlassAlignmentInBytes` and `KlassAlignmentInBytes` to compressedKlass.hpp > > 6) Fixed all include issues (including existing missing includes) > > 7) Fixed VM struct because of (2) > > Note that nothing functional is changed. pinging @rkennke ------------- PR Comment: https://git.openjdk.org/jdk/pull/14826#issuecomment-1630864393 From aph at openjdk.org Tue Jul 11 13:55:12 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 11 Jul 2023 13:55:12 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers In-Reply-To: References: <3IzSseKzc1au1HBwc6jZCV15Qmqdu_A1_O9FTUHzx5Y=.be7154c1-8488-4d12-ba66-cf476178b5c7@github.com> Message-ID: <7Jw3Met4X1Mg3BglG2dxW2WfWTuLRYA_PSP0gp3T0zE=.4dfd8320-5a9e-47a7-aa72-a5344141d647@github.com> On Tue, 11 Jul 2023 12:45:02 GMT, Coleen Phillimore wrote: >> val comes in signed, so we want to just chop off the sign. checked_cast(signed val) will assert. checked_cast<> doesn't do sign conversion. We don't have a cast that does sign conversion. > > Not sure I trust this either. I'm going to write a gtest for this. We already have a hard guarantee that this signed value fits in the signed field. There's no need for any more asserting for that: just chop it off. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14822#discussion_r1259768367 From aph at openjdk.org Tue Jul 11 13:55:14 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 11 Jul 2023 13:55:14 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers In-Reply-To: References: Message-ID: <0RBmevBRzQKhh8V7FhQo7SFVj6aolkE2beO2HW-UbgE=.72eb6552-c2b1-4a6e-a4d3-bdf274e039c0@github.com> On Tue, 11 Jul 2023 12:39:09 GMT, Coleen Phillimore wrote: >> src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 556: >> >>> 554: } else { >>> 555: i->f(0b01, 25, 24); >>> 556: i->f(checked_cast(offset() >> size), 21, 10); >> >> I remember there being issues with checked_cast and sign extension. When going from int64_t to unsigned and back, I think we need to do int64_t --> int32_t --> unsigned, and not int64_t --> uint64_t --> unsigned. Is that what checked_cast will do? To be safe, or at least make it easier to understand, shouldn't we use checked_cast only to change the size or sign, but not both? So going from int64_t to unsigned would require two checked_casts. > > The assignment does the sign conversion first. The mask removes the top half with sign extension (right_n_bits is a macro that somehow returns intptr_t). So the checked_cast<> just converts unsigned 64 bit to unsigned 32, which shouldn't be necessary since we just chopped off the top bits. > > > uint64_t uval = val; > unsigned mask = checked_cast(right_n_bits(nbits)); > uval &= mask; > f(checked_cast(uval), lsb + nbits - 1, lsb); That's right.
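The "convert the sign first, mask, then narrow" order described above can be sketched as follows. This is only an illustration: `signed_field_bits` is a hypothetical name, and the mask is computed inline rather than with HotSpot's `right_n_bits`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper modelled on the quoted snippet: returns the low `nbits`
// bits of a signed value as an unsigned 32-bit field.
inline uint32_t signed_field_bits(int64_t val, int nbits) {
  assert(nbits > 0 && nbits < 32);
  const uint64_t mask = (uint64_t{1} << nbits) - 1;          // low nbits set
  const uint64_t uval = static_cast<uint64_t>(val) & mask;   // sign conversion first, then chop
  return static_cast<uint32_t>(uval);                        // lossless: uval < 2^nbits
}
```

Because the mask clears everything above `nbits`, the final narrowing cannot lose information, which matches the remark that a further check on that cast should not be necessary.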
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14822#discussion_r1259769005 From pchilanomate at openjdk.org Tue Jul 11 13:59:18 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 11 Jul 2023 13:59:18 GMT Subject: [jdk21] Integrated: 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite In-Reply-To: References: Message-ID: <6infr5EwuKvJ1b41nsDoDpGC99PzJutWcHpEzlYechM=.4255f0a1-a0bf-4c6d-99d6-2e2d017cccdd@github.com> On Mon, 10 Jul 2023 14:23:18 GMT, Patricio Chilano Mateo wrote: > Clean backport of [JDK-8302351](https://bugs.openjdk.org/browse/JDK-8302351). > > Thanks, > Patricio This pull request has now been integrated. Changeset: 308b4c63 Author: Patricio Chilano Mateo URL: https://git.openjdk.org/jdk21/commit/308b4c63cb4a09f9c0bc2457ed224b8675534a22 Stats: 5 lines in 1 file changed: 0 ins; 5 del; 0 mod 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite Reviewed-by: coleenp Backport-of: 0c86c31bccd676e1cfbd35898ee16e89d5752688 ------------- PR: https://git.openjdk.org/jdk21/pull/106 From fyang at openjdk.org Tue Jul 11 14:43:12 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 11 Jul 2023 14:43:12 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 08:35:14 GMT, Vladimir Kempik wrote: > Now that 17u backport is done, can we please get back to this PR? 
we have got the compare case for small strings reviewed, only the case for large strings is left, where the case for UU/LL is pretty simple (just replacing ld with misaligned_load) and the case for large UL/LU strings is the only complex thing Yeah, I think I will be able to take another look tomorrow. Sorry for the delay. I think the case for large UL/LU strings is getting more complex too. Is it possible to simplify it similarly with misaligned_load? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14534#issuecomment-1630957703 From coleenp at openjdk.org Tue Jul 11 15:01:10 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 11 Jul 2023 15:01:10 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 12:51:27 GMT, Coleen Phillimore wrote: >> src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 224: >> >>> 222: unsigned target = *(unsigned *)a; >>> 223: target &= ~mask; >>> 224: target |= checked_cast(val); >> >> Any value that doesn't fit in 32 bits is going to fail, so it's tempting to force the callers to pass 32-bit types, but that's a bigger change. How about something like this: >> >> static ALWAYSINLINE void patch(address a, int msb, int lsb, uint32_t val) { >> /* original code, no additional checked_cast needed */ >> } >> >> static ALWAYSINLINE void patch(address a, int msb, int lsb, uint64_t val) { >> patch(a, msb, lsb, checked_cast(val)); >> } > > I'll try this suggestion. src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp:614:35: error: call to 'patch' is ambiguous [2023-07-11T14:00:01,568Z] void set_kind(int order_kind) { Instruction_aarch64::patch(addr_at(0), 11, 8, order_kind); } [2023-07-11T14:00:01,568Z] ^~~~~~~~~~~~~~~~~~~~~~~~~~ The compiler thinks it's ambiguous.
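The ambiguity arises because an `int` argument converts equally well to `uint32_t` and to `uint64_t`, so neither overload is a better match. One possible way around it, shown here only as a sketch and not as what the PR ended up doing, is to keep a single 32-bit worker and add a template that accepts every other integral type and narrows it with a range check. `patch_word` is a hypothetical stand-in that patches an instruction word passed by value instead of writing through an `address`.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical 32-bit worker: replaces bits [msb:lsb] of insn with val.
inline uint32_t patch_word(uint32_t insn, int msb, int lsb, uint32_t val) {
  assert(msb >= lsb && msb < 32 && lsb >= 0);
  const int nbits = msb - lsb + 1;
  const uint32_t field_mask = (nbits == 32) ? 0xFFFFFFFFu : ((1u << nbits) - 1u);
  assert((val & ~field_mask) == 0 && "field too big for insn");
  return (insn & ~(field_mask << lsb)) | (val << lsb);
}

// Template for every integral type except uint32_t. For an int argument this
// is an exact match, so overload resolution no longer has two equally good
// conversions to choose from.
template <typename T,
          typename = std::enable_if_t<std::is_integral<T>::value &&
                                      !std::is_same<T, uint32_t>::value>>
inline uint32_t patch_word(uint32_t insn, int msb, int lsb, T val) {
  const uint64_t wide = static_cast<uint64_t>(val);
  // Negative or oversized values become large after widening and are rejected.
  assert(wide <= UINT32_MAX && "value does not fit in 32 bits");
  return patch_word(insn, msb, lsb, static_cast<uint32_t>(wide));
}
```

With this shape a call such as `patch_word(word, 11, 8, order_kind)` with an `int` argument selects the template, and a `uint32_t` argument selects the worker directly.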
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14822#discussion_r1259869970 From coleenp at openjdk.org Tue Jul 11 15:31:13 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 11 Jul 2023 15:31:13 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 16:40:06 GMT, Jiangli Zhou wrote: >> Move StringTable to JavaClassFile namespace. > > Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8311661 > - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. > - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking When you add in Thread, the namespace solution seems a lot less attractive unless you can do "using namespace hotspot;" around all the code in the JVM without altering names everywhere. Having sets of different HotSpotX namespaces seems a lot less attractive in the code and isn't really more general. So I don't think this is a good change anymore. Maybe there is a build option to wrap names as Ioi suggested or some global namespace wrapping. But then we'd just want to wrap the whole thing in namespace hotspot. This code was written before namespaces existed everywhere. 
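The "wrap everything in one namespace plus a using directive" idea mentioned above could look roughly like the sketch below. `HOTSPOT_NAMESPACE` is a hypothetical build-time define, not an existing HotSpot macro, and the `StringTable` here is a stub for illustration only.

```cpp
// HOTSPOT_NAMESPACE is a hypothetical define that a build could override,
// for example with -DHOTSPOT_NAMESPACE=hs1, to pick the wrapping namespace.
#ifndef HOTSPOT_NAMESPACE
#define HOTSPOT_NAMESPACE hotspot
#endif

namespace HOTSPOT_NAMESPACE {

// Stub standing in for the real class; only the name matters here.
class StringTable {
 public:
  static bool has_work() { return false; }
};

}  // namespace HOTSPOT_NAMESPACE

// A single using directive keeps existing call sites unqualified.
using namespace HOTSPOT_NAMESPACE;

inline bool service_thread_has_string_work() {
  return StringTable::has_work();  // resolves to HOTSPOT_NAMESPACE::StringTable
}
```

The linker-level symbol then carries the namespace in its mangled name, which is what avoids the duplicate-symbol clash in a static link. Whether a blanket using directive is acceptable across the whole code base is exactly the open question in this thread.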
------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631038577 From coleenp at openjdk.org Tue Jul 11 15:37:06 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 11 Jul 2023 15:37:06 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 16:40:06 GMT, Jiangli Zhou wrote: >> Move StringTable to JavaClassFile namespace. > > Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8311661 > - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. > - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking I think I've changed my mind. Since static build is a build time issue, then maybe a source solution shouldn't be done for this. src/hotspot/share/runtime/serviceThread.cpp line 119: > 117: (has_dcmd_notification_event = (!UseNotificationThread && DCmdFactory::has_pending_jmx_notification())) | > 118: (stringtable_work = JavaClassFile::StringTable::has_work()) | > 119: (symboltable_work = SymbolTable::has_work()) | It does look strange since you'd expect SymbolTable to be in the same namespace. ------------- Changes requested by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14808#pullrequestreview-1524561569 PR Review Comment: https://git.openjdk.org/jdk/pull/14808#discussion_r1259912370 From vkempik at openjdk.org Tue Jul 11 15:44:03 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Tue, 11 Jul 2023 15:44:03 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v2] In-Reply-To: References: Message-ID: <0XSKJuDcgGV2zYkOTFnlZpSItiSrCG49uckyp2vQJH4=.dd47a82c-620a-44fc-98ba-82c8a11d1011@github.com> On Thu, 29 Jun 2023 05:50:06 GMT, Vladimir Kempik wrote: >> Please review this fix. It eliminates misaligned loads in String.compare on RISC-V. >> >> For small compares (<= 72 bytes), the intrinsic in c2_MacroAssembler.cpp is used. >> It reads (in the case of UU/LL) 8 bytes per loop iteration, and at the end it reads the tail with a misaligned load of the last 8 bytes of the string. >> >> So if the string length is not a multiple of 8 bytes, the last load is misaligned, and it also re-reads and re-compares some data that was already processed. >> >> I have changed that to compare only the last length%8 bytes using the SHORT_STRING part of the intrinsic for UL/LU. But for UU/LL I have made an optimised version. >> >> Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947) I've got no perf drop on thead (with +AvoidUnalignedAccesses), which supports misaligned access. >> >> Improvements to the inflate_XX() methods give 3-5% improvements for UL/LU cases on thead, and almost no perf change on hifive. >> >> For large strings, the intrinsics from stubGenerator.cpp are used. >> For UU/LL (generate_compare_long_string_same_encoding), I have just replaced the misaligned ld with load_long_misaligned. Since this tail reading is not on the hot path, this gives a small penalty for thead when -XX:+AvoidUnalignedAccesses.
>> >> Large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp: >> These changes are partially based on feilongjiang's patch, but I have changed tail reading to prevent reading past the end of string. I have observed no perf difference between feilongjiang's and my version. >> >> This also enables the regression test for string.Compare, which previously was aarch64-only. >> >> Testing: tier1 and tier2 clean on hifive. >> >> JMH testing, hifive: >> before: >> >> Benchmark (delta) (size) Mode Cnt Score Error Units >> StringCompareToDifferentLength.compareToLL 2 24 avgt 9 6.474 ± 1.475 ms/op >> StringCompareToDifferentLength.compareToLL 2 36 avgt 9 125.823 ± 1.947 ms/op >> StringCompareToDifferentLength.compareToLL 2 72 avgt 9 10.512 ± 0.236 ms/op >> StringCompareToDifferentLength.compareToLL 2 128 avgt 9 13.032 ± 0.821 ms/op >> StringCompareToDifferentLength.compareToLL 2 256 avgt 9 18.983 ± 0.318 ms/op >> StringCompareToDifferentLength.compareToLL 2 512 avgt 9 29.925 ± ... > > Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: > > Simplify tail for shrot string compare The case for large LU/UL was made very unfriendly to the hifive (and to the misaligned case in general) from the ground up, doing a 4-byte (but not 8-byte) aligned LD on every iteration. There is not much potential for simplifying it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14534#issuecomment-1631060617 From stuefe at openjdk.org Tue Jul 11 15:47:04 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 11 Jul 2023 15:47:04 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 16:40:06 GMT, Jiangli Zhou wrote: >> Move StringTable to JavaClassFile namespace. > > Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase.
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8311661 > - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. > - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking Hi, to prevent clashes like this, libraries that support static linking tend to define a global namespace or a common prefix, usually switchable via defines. IIRC sqlite does this, zlib, jemalloc... If we could pull such a thing off with a minimum of invasiveness, it would be a much more robust solution. If we introduce such a namespace, the name could be very short and non-descriptive, and then be defined (or completely switched off by default) via compile option. That way one could even link two hotspots if one wanted (only one usable at a time for other reasons). If combined with "using" somewhere, maybe in precompiled.hpp, this solution may not be that invasive. Cheers, Thomas ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631065356 From simonis at openjdk.org Tue Jul 11 15:47:35 2023 From: simonis at openjdk.org (Volker Simonis) Date: Tue, 11 Jul 2023 15:47:35 GMT Subject: RFR: 8311500: StackWalker.getCallerClass() can throw if invoked reflectively [v3] In-Reply-To: References: Message-ID: > As the included jtreg test demonstrates, `StackWalker.getCallerClass()` can throw an `UnsupportedOperationException` if called reflectively. Currently this only happens if we invoke `StackWalker.getCallerClass()` recursively reflectively, but this issue will become more prominent once we fix [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447).
The gory details follow below: > > The protocol between the Java API and the JVM for `StackWalker.getCallerClass()/walk()` is as follows: > - On the Java side, `StackWalker` calls into `StackStreamFactory` for the real work. > - For `StackWalker.getCallerClass()` `StackStreamFactory` basically creates a `Class[]` which will be passed down and filled in the JVM. For `StackWalker.walk()` it will normally be a `StackFrameInfo[]` (or a `LiveStackFrameInfo[]` if the internal `ExtendedOption.LOCALS_AND_OPERANDS` option was used). > - The default size of these arrays is currently `StackStreamFactory.SMALL_BATCH` which is 8 (but see [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447)). > - `StackStreamFactory` then calls `AbstractStackWalker.callStackWalk()` which is natively implemented in the VM by `JVM_CallStackWalk()`. > - `JVM_CallStackWalk()` calls `StackWalk::walk()` which calls `StackWalk::fetchFirstBatch()` which calls `StackWalk::fill_in_frames()` which walks the stack and fills in the available class/stackframe slots in the passed in array until the array is full or there are no more stack frames. > - Once `StackWalk::fill_in_frames()` returns, `StackWalk::fetchFirstBatch()` calls back to Java by invoking `AbstractStackWalker::doStackWalk()` to consume the result. > - `AbstractStackWalker::doStackWalk()` calls `consumeFrames()` (which is overridden depending on whether we initially called `getCallerClass()` or `walk()`) which consumes the frames until it either finishes (e.g. finds the caller class) or until there are no more frames. > - In the latter case `consumeFrames()` will call into the VM again by calling `AbstractStackWalker.fetchStackFrames()` to fetch additional frames from the stack. > - `AbstractStackWalker.fetchStackFrames()` is implemented by `JVM_MoreStackWalk()` which calls `StackWalk::fetchNextBatch()` which calls `StackWalk::fill_in_frames()` (the same method that already fetched the initial batch of frames).
> > Following is a stacktrace of what I've explained so far: > > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [libjvm.so+0x143a96a] StackWalk::fill_in_frames... Volker Simonis has updated the pull request incrementally with one additional commit since the last revision: Fixed case when calling getCallerClass() from a @CallerSensitive method reflectively ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14773/files - new: https://git.openjdk.org/jdk/pull/14773/files/1ec6d90b..08143fdb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14773&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14773&range=01-02 Stats: 293 lines in 14 files changed: 187 ins; 87 del; 19 mod Patch: https://git.openjdk.org/jdk/pull/14773.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14773/head:pull/14773 PR: https://git.openjdk.org/jdk/pull/14773 From simonis at openjdk.org Tue Jul 11 15:48:13 2023 From: simonis at openjdk.org (Volker Simonis) Date: Tue, 11 Jul 2023 15:48:13 GMT Subject: RFR: 8311500: StackWalker.getCallerClass() can throw if invoked reflectively [v2] In-Reply-To: References: Message-ID: <_PFAIwFztDmWVKgVFpZcYqFjarJHum0O5-qbQbuhFDY=.3f907681-5434-4875-aac2-838ffc72ec00@github.com> On Wed, 5 Jul 2023 17:25:24 GMT, Volker Simonis wrote: >> As the included jtreg test demonstrates, `StackWalker.getCallerClass()` can throw an `UnsupportedOperationException` if called reflectively. Currently this only happens if we invoke `StackWalker.getCallerClass()` recursively reflectively, but this issue will become more prominent once we fix [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447). The gory details follow below: >> >> The protocol between the Java API and the JVM for `StackWalker.getCallerClass()/walk()` is as follows: >> - On the Java side, `StackWalker` calls into `StackStreamFactory` for the real work. 
>> - For `StackWalker.getCallerClass()` `StackStreamFactory` basically creates a `Class[]` which will be passed down and filled in the JVM. For `StackWalker.walk()` it will normally be a `StackFrameInfo[]` (or a `LiveStackFrameInfo[]` if the internal `ExtendedOption.LOCALS_AND_OPERANDS` option was used). >> - The default size of this array is currently `StackStreamFactory.SMALL_BATCH` which is 8 (but see [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447)). >> - `StackStreamFactory` then calls `AbstractStackWalker.callStackWalk()` which is natively implemented in the VM by `JVM_CallStackWalk()`. >> - `JVM_CallStackWalk()` calls `StackWalk::walk()` which calls `StackWalk::fetchFirstBatch()` which calls `StackWalk::fill_in_frames()` which walks the stack and fills in the available class/stackframe slots in the passed-in array until the array is full or there are no more stack frames. >> - Once `StackWalk::fill_in_frames()` returns, `StackWalk::fetchFirstBatch()` calls back to Java by invoking `AbstractStackWalker::doStackWalk()` to consume the result. >> - `AbstractStackWalker::doStackWalk()` calls `consumeFrames()` (which is overridden depending on whether we initially called `getCallerClass()` or `walk()`) which consumes the frames until it either finishes (e.g. finds the caller class) or until there are no more frames. >> - In the latter case `consumeFrames()` will call into the VM again by calling `AbstractStackWalker.fetchStackFrames()` to fetch additional frames from the stack. >> - `AbstractStackWalker.fetchStackFrames()` is implemented by `JVM_MoreStackWalk()` which calls `StackWalk::fetchNextBatch()` which calls `StackWalk::fill_in_frames()` (the same method that already fetched the initial batch of frames). >> >> Following is a stacktrace of what I've explained so far: >> >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x1...
> > Volker Simonis has updated the pull request incrementally with one additional commit since the last revision: > > Rename new parameter according to the HS coding conventions As discussed before, the test for a `@CallerSensitive` method must be applied to the first non-reflective, non-methodhandle frame below `StackWalker::getCallerClass()`, independent of the batch in which it appears. Because we cannot predict which batch this will be, I've removed the previous solution which added a special flag to distinguish the first from the follow-up batches. Instead I've added a new flag `_needs_caller_sensitive_check` to the `BaseFrameStream` class because every invocation of `StackWalker::getCallerClass()` will always use a single instance of `BaseFrameStream`. Currently the check for `@CallerSensitive` methods can only be done in the VM because for `StackWalker::getCallerClass()` we only pass the `Class` of each stack frame back to Java but not the corresponding `Method`. On the other hand, we currently skip reflective and methodhandle frames in Java (i.e. with `isReflectionFrame()` and `isMethodHandleFrame()`). In order to avoid the need to pass the stack frame methods back to Java I've therefore implemented (and combined) the check for reflective and methodhandle frames in the VM (see `is_reflection_or_methodhandle_frame()`). Because `isMethodHandleFrame()` was only used in `ClassBuffer::consumeFrames()` it can be removed. I think we can also skip the remaining test for reflective frames in `AbstractStackWalker::peekFrame()` for the `getCallerClass()` case (because that check is now already done in the VM) but I'll leave that for [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447). I've also moved the `ReflectiveGetCallerClassTest.java` test under `java/lang/StackWalker/CallerSensitiveMethod` such that I can use the `java/util/CSM.java` class which gets injected into the base module for the new `@CallerSensitive` test.
Finally I realized that some `StackWalker` tests fail if we run with `-javaoption:-XX:+ShowHiddenFrames`. Ideally these tests should be fixed to make them agnostic to `ShowHiddenFrames` but for now I've added a `@requires vm.opt.ShowHiddenFrames == false` tag to exclude them if run with `-XX:+ShowHiddenFrames`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14773#issuecomment-1631065562 From simonis at openjdk.org Tue Jul 11 15:55:14 2023 From: simonis at openjdk.org (Volker Simonis) Date: Tue, 11 Jul 2023 15:55:14 GMT Subject: RFR: 8311500: StackWalker.getCallerClass() throws UOE if invoked reflectively [v2] In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 07:16:19 GMT, David Holmes wrote: >> Volker Simonis has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename new parameter according to the HS coding conventions > > src/hotspot/share/prims/stackwalk.cpp line 501: > >> 499: KeepStackGCProcessedMark keep_stack(THREAD); >> 500: numFrames = fill_in_frames(mode, stream, frame_count, start_index, >> 501: frames_array, end_index, true, CHECK_NULL); > > Could you annotate the new argument please i.e. `true /* first batch */` and `false /* not first batch */`. Thanks. I've removed these arguments in the latest version. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14773#discussion_r1259939849 From iklam at openjdk.org Tue Jul 11 16:23:22 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 11 Jul 2023 16:23:22 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 15:44:30 GMT, Thomas Stuefe wrote: > Hi, > > to prevent clashes like this, libraries that support static linking tend to define a global namespace or a common prefix, usually switchable via defines. IIRC sqlite does this, zlib, jemalloc...
If we could pull such a thing off with a minimum of invasiveness, it would be a much more robust solution. Do the libraries that you mention just enclose every .cpp and .hpp file in `namespace xxx {` and `}`? Something like // thread.hpp #include <....> #include <....> #include <....> namespace hotspot { // <<< add class Thread {....} } // <<< add That doesn't look too bad to me, as we just need to do this once, and it can (probably) be done with a script. > > If we introduce such a namespace, the name could be very short and non-descriptive, and then be defined (or completely switched off by default) via compile option. That way one could even link two hotspots if one wanted (only one usable at a time for other reasons). > > If combined with "using" somewhere, maybe in precompiled.hpp, this solution may not be that invasive. Are you suggesting something like this? // precompiled.hpp namespace hotspot {} using Bar = hotspot::Bar; // bar.h, the "Bar" class is automatically put inside "hotspot" namespace. class Bar { static int a() { return 0; } }; I tried it but couldn't get it to work. The `using` must be put after the type of `hotspot::Bar` has already been declared. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631120256 From aph at openjdk.org Tue Jul 11 16:37:04 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 11 Jul 2023 16:37:04 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 14:58:18 GMT, Coleen Phillimore wrote: >> I'll try this suggestion. > > src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp:614:35: error: call to 'patch' is ambiguous > [2023-07-11T14:00:01,568Z] void set_kind(int order_kind) { Instruction_aarch64::patch(addr_at(0), 11, 8, order_kind); } > [2023-07-11T14:00:01,568Z] ^~~~~~~~~~~~~~~~~~~~~~~~~~ > > The compiler thinks it's ambiguous. I'm not surprised.
I think that overloading `patch()` will be a trail of tears. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14822#discussion_r1259989146 From jvernee at openjdk.org Tue Jul 11 16:43:15 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Tue, 11 Jul 2023 16:43:15 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 16:40:06 GMT, Jiangli Zhou wrote: >> Move StringTable to JavaClassFile namespace. > > Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8311661 > - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. > - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking `using namespace hotspot;` could be used I think: > using namespace ns-name ; | (5) > 5) [using-directive](https://en.cppreference.com/w/cpp/language/namespace#Using-directives): From the point of view of unqualified [name lookup](https://en.cppreference.com/w/cpp/language/lookup) of any name after a using-directive and until the end of the scope in which it appears, every name from ns-name is visible as if it were declared in the nearest enclosing namespace which contains both the using-directive and ns-name. 
(From: https://en.cppreference.com/w/cpp/language/namespace) But, if we put everything in the same `hotspot` namespace we shouldn't need to add `hotspot::` qualifiers since things in the same namespace can already refer to each other: namespace hotspot { class Bar { public: static int a() { return 0; } }; int func(Bar* bar) { // Bar is unqualified here return bar->a(); } } ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631147930 From jiangli at openjdk.org Tue Jul 11 16:49:12 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 11 Jul 2023 16:49:12 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 16:19:40 GMT, Ioi Lam wrote: > > Hi, > > to prevent clashes like this, libraries that support static linking tend to define a global namespace or a common prefix, usually switchable via defines. IIRC sqlite does this, zlib, jemalloc... If we could pull such a thing off with a minimum of invasiveness, it would be a much more robust solution. > > Do the libraries that you mention just enclose every .cpp and .hpp files with `namespace xxx {` and `}`. Both zlib and jemalloc achieve name space isolations via #define, although not in C++ namespace definition. I think that's what @tstuefe was referring to. > > Something like > > // thread.hpp > #include <....> > #include <....> > #include <....> > namespace hotspot { // <<< add > class Thread {....} > } // <<< add > That doesn't look too bad to me, as we just need to do this once, and it can (probably) be done with a script. Global namespace (as @JornVernee has suggested) does sound attractive as long as we can do it without very invasive changes. It's also a more future-proof solution. I'll do some exploration as well. 
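To make the "switchable via defines" idea combined with a C++ namespace a bit more concrete, here is a minimal sketch. It is only an illustration under stated assumptions: the macros `HOTSPOT_NS`, `HOTSPOT_NS_BEGIN`, `HOTSPOT_NS_END` and `HOTSPOT_NS_USING`, as well as the toy `Thread` class and `current_id()`, are hypothetical and do not exist in HotSpot or in this PR.

```cpp
// Hypothetical sketch: none of these macros exist in HotSpot.
// Build with -DHOTSPOT_NS=hs (any valid identifier) to enable the namespace;
// leave HOTSPOT_NS undefined to keep every name in the global namespace.
#ifdef HOTSPOT_NS
#define HOTSPOT_NS_BEGIN namespace HOTSPOT_NS {
#define HOTSPOT_NS_END   }
// Declare the (possibly still empty) namespace first so that the
// using-directive is well-formed before any header has added to it.
#define HOTSPOT_NS_USING namespace HOTSPOT_NS {} using namespace HOTSPOT_NS;
#else
#define HOTSPOT_NS_BEGIN
#define HOTSPOT_NS_END
#define HOTSPOT_NS_USING
#endif

// Would live in a commonly included header (e.g. a precompiled header).
HOTSPOT_NS_USING

// thread.hpp (toy stand-in for a real header)
HOTSPOT_NS_BEGIN
class Thread {
 public:
  static constexpr int id() { return 1; }
};
HOTSPOT_NS_END

// thread.cpp: unqualified use compiles in both configurations.
constexpr int current_id() { return Thread::id(); }
```

With the macro undefined the code compiles exactly as before; with `-DHOTSPOT_NS=hs` the class becomes `hs::Thread` and its symbols get a different mangled name, while the unqualified call site stays unchanged. Whether this scales to the real headers is a separate question that the sketch does not answer.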
------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631156523 From iklam at openjdk.org Tue Jul 11 17:05:56 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 11 Jul 2023 17:05:56 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 16:39:50 GMT, Jorn Vernee wrote: >> Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8311661 >> - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. >> - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking > > `using namespace hotspot;` could be used I think: > >> using namespace ns-name ; | (5) > >> 5) [using-directive](https://en.cppreference.com/w/cpp/language/namespace#Using-directives): From the point of view of unqualified [name lookup](https://en.cppreference.com/w/cpp/language/lookup) of any name after a using-directive and until the end of the scope in which it appears, every name from ns-name is visible as if it were declared in the nearest enclosing namespace which contains both the using-directive and ns-name. > > (From: https://en.cppreference.com/w/cpp/language/namespace) > > But, if we put everything in the same `hotspot` namespace we shouldn't need to add `hotspot::` qualifiers since things in the same namespace can already refer to each other: > > > namespace hotspot { > class Bar { > public: > static int a() { return 0; } > }; > > int func(Bar* bar) { // Bar is unqualified here > return bar->a(); > } > } > Global namespace (as @JornVernee has suggested) does sound attractive as long as we can do it without very invasive changes. It's also a more future-proof solution. 
I'll do some exploration as well. Something like this might work. We need to enclose all declarations inside `namespace hotspot {...}`, so all .hpp files need to be touched (which I think is OK). We can add the `using namespace hotspot` in precompiled.hpp so that we can avoid modifying most .cpp files. `using namespace` affects only the lookup of existing names. It doesn't affect the declaration of new types (like Bar in my example below). // precompiled.hpp namespace hotspot {} using namespace hotspot; // foo.hpp namespace hotspot { class Foo { public: static int x(); }; } // bar.hpp class Bar { // declares ::Bar, not hotspot::Bar public: static int a() { return 2; } }; // foo.cpp int Foo::x() { return 1; } // defines hotspot::Foo::x() int main() { printf("%d %d\n", Foo::x(), hotspot::Foo::x()); //printf("%d %d\n", Bar::a(), hotspot::Bar::a()); <-- this doesn't work } ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631178570 From dlong at openjdk.org Tue Jul 11 17:43:19 2023 From: dlong at openjdk.org (Dean Long) Date: Tue, 11 Jul 2023 17:43:19 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 16:34:18 GMT, Andrew Haley wrote: >> src/hotspot/cpu/aarch64/nativeInst_aarch64.hpp:614:35: error: call to 'patch' is ambiguous >> [2023-07-11T14:00:01,568Z] void set_kind(int order_kind) { Instruction_aarch64::patch(addr_at(0), 11, 8, order_kind); } >> [2023-07-11T14:00:01,568Z] ^~~~~~~~~~~~~~~~~~~~~~~~~~ >> >> The compiler thinks it's ambiguous. > > I'm not surprised. I think that overloading `patch()` will be a trail of tears. OK, never mind.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14822#discussion_r1260072071 From mgronlun at openjdk.org Tue Jul 11 18:42:22 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 11 Jul 2023 18:42:22 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v7] In-Reply-To: References: Message-ID: <04qc8fKL99C9_zOXZe7gMuPt_O4E0aJ-KqrV8wfb9W4=.37d24c71-3fdf-4977-aa91-cdc765a7ab2c@github.com> > Greetings, > > please help review this fix for some problematic situations in JFR where data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes is in the JIRA issue. > > Testing: jdk_jfr, stress testing. > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: include for ByteSize ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14761/files - new: https://git.openjdk.org/jdk/pull/14761/files/be1d0da1..08210ac9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=05-06 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14761.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14761/head:pull/14761 PR: https://git.openjdk.org/jdk/pull/14761 From duke at openjdk.org Tue Jul 11 19:15:54 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Tue, 11 Jul 2023 19:15:54 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 07:41:01 GMT, Thomas Stuefe wrote: > What would be needed to make the Annotations appear in the "printall" command? I was somehow expecting to see at least something like "Annotation at xxxx". I am not sure what all details `printall` is expected to emit out. Looking at the code, printall doesn't seem to use ClassWriter.
It uses HTMLGenerator to format the method data. I can emit something like "Annotation at xxxx" but it would be more useful if it can display the contents of the annotations. Unfortunately HTMLGenerator doesn't understand Annotations at all. Probably it is better left for another task. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14735#issuecomment-1631375871 From mgronlun at openjdk.org Tue Jul 11 19:20:32 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 11 Jul 2023 19:20:32 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v8] In-Reply-To: References: Message-ID: > Greetings, > > please help review this fix for some problematic situations in JFR where data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes is in the JIRA issue. > > Testing: jdk_jfr, stress testing. > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: build x86_32 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14761/files - new: https://git.openjdk.org/jdk/pull/14761/files/08210ac9..97854c2e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14761.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14761/head:pull/14761 PR: https://git.openjdk.org/jdk/pull/14761 From fparain at openjdk.org Tue Jul 11 19:34:15 2023 From: fparain at openjdk.org (Frederic Parain) Date: Tue, 11 Jul 2023 19:34:15 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v7] In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 18:57:16 GMT, Matias Saavedra Silva wrote: >> 8301996: Move field resolution information out of the cpCache > > Matias Saavedra
Silva has updated the pull request incrementally with one additional commit since the last revision: > > Interpreter fix and cleanup Thank you for addressing the issues that were spotted. Looks good to me now. ------------- Marked as reviewed by fparain (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14129#pullrequestreview-1525030809 From duke at openjdk.org Tue Jul 11 19:39:29 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Tue, 11 Jul 2023 19:39:29 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v2] In-Reply-To: References: Message-ID: > Please review this PR that enables ClassWriter to write annotations to the class file being dumped. > > The fields annotations are stored in `Annotations::_fields_annotations` which is of type `Array<Array<u1>*>`. There is no class in SA that can represent it. I have added ArrayOfU1Array to correspond to the type `Array<Array<u1>*>` and it works. I believe there are better approaches but that would require a bit more refactoring of the classes representing Array types. I will leave that for future work for now. > > Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` > Tested it manually by dumping j.l.String class and comparing the annotations with the original class using javap. > The test case mentioned in [JDK-8311101](https://bugs.openjdk.org/browse/JDK-8311101) would provide better coverage.
Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: Review comments Signed-off-by: Ashutosh Mehra ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14735/files - new: https://git.openjdk.org/jdk/pull/14735/files/66f2c104..889537fd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14735&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14735&range=00-01 Stats: 58 lines in 2 files changed: 32 ins; 21 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/14735.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14735/head:pull/14735 PR: https://git.openjdk.org/jdk/pull/14735 From duke at openjdk.org Tue Jul 11 19:39:29 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Tue, 11 Jul 2023 19:39:29 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v2] In-Reply-To: <4S1x7PphEjaz7kCe2uJTmFyaIccpEIn9fh52Sr4neMg=.6a3dac9e-feb0-459e-8972-e7cdbb48855e@github.com> References: <4S1x7PphEjaz7kCe2uJTmFyaIccpEIn9fh52Sr4neMg=.6a3dac9e-feb0-459e-8972-e7cdbb48855e@github.com> Message-ID: On Fri, 30 Jun 2023 17:59:15 GMT, Chris Plummer wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> Review comments >> >> Signed-off-by: Ashutosh Mehra > > src/hotspot/share/runtime/vmStructs.cpp line 972: > >> 970: unchecked_nonstatic_field(Array<Klass*>, _data, sizeof(Klass*)) \ >> 971: unchecked_nonstatic_field(Array<ResolvedIndyEntry>, _data, sizeof(ResolvedIndyEntry)) \ >> 972: unchecked_nonstatic_field(Array<Array<u1>*>, _data, sizeof(Array<u1>*)) \ > > Fix alignment of the _data column. Fixed.
> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Annotations.java line 74: > >> 72: public U1Array getFieldAnnotations(int fieldIndex) { >> 73: Address addr = fieldsAnnotations.getValue(getAddress()); >> 74: ArrayOfU1Array annotationsArray = VMObjectFactory.newObject(ArrayOfU1Array.class, addr); > > How about caching this result so you don't need to allocate a new object every time this API is called. Same thing in `getFieldTypeAnnotations()`. I think VMObjectFactory is a better place to implement the caching behavior so that all such patterns can benefit from it. I think it is better addressed in another task. > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ConstMethod.java line 451: > >> 449: offset += 1; >> 450: } >> 451: Address addr = getAddress().getAddressAt((getSize() - offset) * VM.getVM().getAddressSize()); > > A comment on the address computation would be useful here and in the changes below. Added a comment about the layout of the annotation pointers. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1260185912 PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1260194851 PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1260193404 From duke at openjdk.org Tue Jul 11 19:39:29 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Tue, 11 Jul 2023 19:39:29 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v2] In-Reply-To: <-EeFleA4CD947u9g-IJBsRe9HpGSXLRUOxAkfP9L-0w=.32772548-808f-490b-bd09-8c6d91cf1913@github.com> References: <-EeFleA4CD947u9g-IJBsRe9HpGSXLRUOxAkfP9L-0w=.32772548-808f-490b-bd09-8c6d91cf1913@github.com> Message-ID: On Tue, 11 Jul 2023 06:39:09 GMT, Thomas Stuefe wrote: >> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ConstMethod.java line 470: >> >>> 468: if (hasParameterAnnotations()) { >>> 469: offset += 1; >>> 470: } >> >> Code here and in other places could be tightened: >> >> >> int offset = 
(hasMethodAnnotations() ? 1 : 0) + >> (hasParameterAnnotations() ? 1 : 0) + >> (hasTypeAnnotations() ? 1 : 0); > > Possibly even factor it out into separate functions like e.g. `offsetOfGenericSignatureIndex` does. Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1260193515 From duke at openjdk.org Tue Jul 11 19:56:25 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Tue, 11 Jul 2023 19:56:25 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v3] In-Reply-To: References: Message-ID: > Please review this PR that enables ClassWriter to write annotations to the class file being dumped. > > The fields annotations are stored in `Annotations::_fields_annotations` which is of type `Array<Array<u1>*>`. There is no class in SA that can represent it. I have added ArrayOfU1Array to correspond to the type `Array<Array<u1>*>` and it works. I believe there are better approaches but that would require a bit more refactoring of the classes representing Array types. I will leave that for future work for now. > > Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` > Tested it manually by dumping j.l.String class and comparing the annotations with the original class using javap. > The test case mentioned in [JDK-8311101](https://bugs.openjdk.org/browse/JDK-8311101) would provide better coverage.
Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: Some code motion and factoring out common code Signed-off-by: Ashutosh Mehra ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14735/files - new: https://git.openjdk.org/jdk/pull/14735/files/889537fd..1d79e734 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14735&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14735&range=01-02 Stats: 123 lines in 1 file changed: 63 ins; 56 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/14735.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14735/head:pull/14735 PR: https://git.openjdk.org/jdk/pull/14735 From duke at openjdk.org Tue Jul 11 19:56:25 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Tue, 11 Jul 2023 19:56:25 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v3] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 06:32:24 GMT, Thomas Stuefe wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> Some code motion and factoring out common code >> >> Signed-off-by: Ashutosh Mehra > > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ConstMethod.java line 494: > >> 492: offset += 1; >> 493: } >> 494: Address addr = getAddress().getAddressAt((getSize() - offset) * VM.getVM().getAddressSize()); > > Factor this address calculation out, and as @plummercj remarked, comment it? Done. > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/InstanceKlass.java line 874: > >> 872: return null; >> 873: } >> 874: } > > Would calling these functions even be valid to call if Annotations are not present? > > If yes, maybe return Optional? But since the rest of the code does not use Optional either, maybe rather keep the code the same. > > Up to you, feel free to ignore this. I would keep it as is following existing code pattern. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1260208546 PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1260209779 From dlong at openjdk.org Tue Jul 11 19:57:09 2023 From: dlong at openjdk.org (Dean Long) Date: Tue, 11 Jul 2023 19:57:09 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 16:40:06 GMT, Jiangli Zhou wrote: >> Move StringTable to JavaClassFile namespace. > > Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8311661 > - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. > - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking I think we already used -fvisibility=hidden to solve problems like this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631426872 From jiangli at openjdk.org Tue Jul 11 20:21:06 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 11 Jul 2023 20:21:06 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 19:53:58 GMT, Dean Long wrote: > I think we already used -fvisibility=hidden to solve problems like this. Right, the `stringTable.o` was already compiled with `-fvisibility=hidden` for example. `-fvisibility=hidden` didn't resolve the duplicate symbol issue with static linking. We ran into the duplicate symbol linking failure when user code also defines the symbol and is statically linked together with hotspot `libjvm.a`.
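For readers following along, a minimal sketch of why a namespace sidesteps this kind of clash: a class declared inside a namespace and a global class of the same name are distinct entities with distinct mangled names, so both can end up in one statically linked image. The class bodies and the `owner()`, `vm_owner()` and `user_owner()` helpers below are hypothetical stand-ins; the sketch does not reproduce the actual link failure, which involved the out-of-line `StringTable::StringTable` symbol.

```cpp
// Hypothetical stand-ins; the real HotSpot class is far larger.
namespace hotspot {
class StringTable {
 public:
  static constexpr int owner() { return 1; }  // the "VM" version
};
}  // namespace hotspot

// A user library that happens to choose the same class name.
class StringTable {
 public:
  static constexpr int owner() { return 2; }  // the "user" version
};

// Qualified names select each class unambiguously.
constexpr int vm_owner()   { return hotspot::StringTable::owner(); }
constexpr int user_owner() { return ::StringTable::owner(); }
```

One caveat of the sketch: a translation unit that sees both classes and also has `using namespace hotspot;` in effect could not refer to `StringTable` unqualified, because the lookup would find both declarations. In the scenario described here the user code and the VM code live in separate translation units, so that should not arise.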
------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631455504 From cjplummer at openjdk.org Tue Jul 11 20:37:07 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 11 Jul 2023 20:37:07 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v3] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 19:56:25 GMT, Ashutosh Mehra wrote: >> Please review this PR that enables ClassWriter to write annotations to the class file being dumped. >> >> The fields annotations are stored in `Annotations::_fields_annotations` which is of type `Array<Array<u1>*>`. There is no class in SA that can represent it. I have added ArrayOfU1Array to correspond to the type `Array<Array<u1>*>` and it works. I believe there are better approaches but that would require a bit more refactoring of the classes representing Array types. I will leave that for future work for now. >> >> Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` >> Tested it manually by dumping j.l.String class and comparing the annotations with the original class using javap. >> The test case mentioned in [JDK-8311101](https://bugs.openjdk.org/browse/JDK-8311101) would provide better coverage. > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Some code motion and factoring out common code > > Signed-off-by: Ashutosh Mehra src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ConstMethod.java line 641: > 639: // | | > 640: // V V > 641: // | ... | default | type | parameter | method | So the `...` part is represented by `getSize()`? It would be good to call that out. Also point out that each of the annotations pointers is optional.
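The layout in that comment can be sketched as a small calculation. This is only an illustration, assuming (as the diagram and the `getSize() - offset` computation quoted earlier in the thread suggest) that the optional annotation pointers occupy the last words of the `ConstMethod` in the order default, type, parameter, method, and that an absent pointer takes no slot. The `AnnotationFlags` struct and all function names are hypothetical, not the SA or HotSpot API.

```cpp
// Hypothetical model of the "has annotations" flags; not the real ConstMethod.
struct AnnotationFlags {
  bool has_method;
  bool has_parameter;
  bool has_type;
  bool has_default;
};

// Each function returns how many words back from the end of the object the
// pointer lives (1 == last word). Callers are expected to check the matching
// has_* flag first, since an absent pointer has no slot at all.
constexpr int method_annotations_slot(AnnotationFlags) { return 1; }

constexpr int parameter_annotations_slot(AnnotationFlags f) {
  return 1 + (f.has_method ? 1 : 0);
}

constexpr int type_annotations_slot(AnnotationFlags f) {
  return 1 + (f.has_method ? 1 : 0) + (f.has_parameter ? 1 : 0);
}

constexpr int default_annotations_slot(AnnotationFlags f) {
  return 1 + (f.has_method ? 1 : 0) + (f.has_parameter ? 1 : 0) +
         (f.has_type ? 1 : 0);
}

// Word index of a slot, given the object size in words (the role that
// getSize() plays in the quoted SA code).
constexpr long slot_word_index(long size_in_words, int slot) {
  return size_in_words - slot;
}
```

For example, with method and type annotations present but no parameter annotations, the type annotations pointer would sit two words from the end; multiplying the resulting word index by the address size gives the byte offset that the quoted `getAddressAt(...)` call uses.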
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1260251805 From cjplummer at openjdk.org Tue Jul 11 20:37:04 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 11 Jul 2023 20:37:04 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v3] In-Reply-To: References: <4S1x7PphEjaz7kCe2uJTmFyaIccpEIn9fh52Sr4neMg=.6a3dac9e-feb0-459e-8972-e7cdbb48855e@github.com> Message-ID: On Tue, 11 Jul 2023 19:34:29 GMT, Ashutosh Mehra wrote: >> src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/Annotations.java line 74: >> >>> 72: public U1Array getFieldAnnotations(int fieldIndex) { >>> 73: Address addr = fieldsAnnotations.getValue(getAddress()); >>> 74: ArrayOfU1Array annotationsArray = VMObjectFactory.newObject(ArrayOfU1Array.class, addr); >> >> How about caching this result so you don't need to allocate a new object every time this API is called. Same thing in `getFieldTypeAnnotations()`. > > I think VMObjectFactory is a better place to implement the caching behavior so that all such patterns can benefit from it. I think it is better addressed in another task. I think maybe you misunderstood what I meant by "cache". I'm not talking about a hashmap of weak references or anything like that. Just add a `ArrayOfU1Array annotationsArray` field to the Annotations object and store the result there. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1260246699 From dlong at openjdk.org Tue Jul 11 20:38:13 2023 From: dlong at openjdk.org (Dean Long) Date: Tue, 11 Jul 2023 20:38:13 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 16:40:06 GMT, Jiangli Zhou wrote: >> Move StringTable to JavaClassFile namespace. > > Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8311661 > - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. > - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking Does passing -Wl,--exclude-libs,ALL to gcc work for static libs? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631474057 From jiangli at openjdk.org Tue Jul 11 20:44:15 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 11 Jul 2023 20:44:15 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 17:03:12 GMT, Ioi Lam wrote: > Something like this might work. We need to enclose all declarations inside `namespace hotspot {...}`, so all .hpp files need to be touched (which I think is OK). I just looked briefly. For `.hpp` files alone in `hotspot` dir, we have 2276. Although the number of files is large, I think the actual changes are trivial. Still want to make sure everyone is okay with the changes at this scale. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631481093 From iklam at openjdk.org Tue Jul 11 21:11:14 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 11 Jul 2023 21:11:14 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: <5cl-bjjTvbpCd2yM0xRcarMllvDMqyapqD2ec2_Z_mc=.205b5d59-6bd5-4978-873b-49f32bf458a3@github.com> On Tue, 11 Jul 2023 20:41:16 GMT, Jiangli Zhou wrote: > > Something like this might work. We need to enclose all declarations inside `namespace hotspot {...}`, so all .hpp files need to be touched (which I think is OK). > > I just looked briefly. 
For `.hpp` files alone in `hotspot` dir, we have 2276. Although the number of files is large, I think the actual changes are trivial. Still want to make sure everyone is okay with the changes at this scale. If adding `using hotspot` in precompiled.hpp works, we can roll this out in stages. At the beginning, we can just put the three problematic classes (`Thread`, `StringTable` and `ProfileData`) in the `hotspot` namespace, so only a few files need to be changed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631511102 From vladimir.kozlov at oracle.com Tue Jul 11 21:50:16 2023 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 11 Jul 2023 14:50:16 -0700 Subject: CFV: New HotSpot Group Member: Fei Yang Message-ID: <9fefeea5-59a3-450c-83dc-1bfd4ff92f14@oracle.com> Hi, I hereby nominate Fei Yang to Membership in the HotSpot Group. Fei is one of the main authors of the RISC-V port. His main area of expertise is the HotSpot JIT compilers. One of his notable contributions is the backend of C1 and C2 JIT compilers for RISC-V. He has ported Loom to Linux/RISC-V platform and has been active in AArch64 port project since 2015. Fei is currently the JDK Reviewer and the Project Lead for RISC-V Port [1], which implements a full-featured JVM on Linux/RISC-V platform, including the template interpreter, C1/C2 and all current mainline GCs. That work had been integrated to mainline since JDK 19 and backported to JDK 17u recently. Fei reviewed more than 150 pull requests and contributed more than 50 recorded changes to the JDK project [2]. Votes are due by Tuesday, 25 July 2023 at 22:00 UTC. Only current Members of the HotSpot Group [3] are eligible to vote on this nomination. Votes must be cast in the open by replying to this mailing list. For Lazy Consensus voting instructions, see [4]. 
Best regards, Vladimir K [1] https://openjdk.org/census#riscv-port [2] https://github.com/search?q=author%3ARealFYang+repo%3Aopenjdk%2Fjdk&type=commits [3] https://openjdk.org/census#hotspot [4] https://openjdk.org/groups/#member-vote From vladimir.x.ivanov at oracle.com Tue Jul 11 21:53:01 2023 From: vladimir.x.ivanov at oracle.com (Vladimir Ivanov) Date: Tue, 11 Jul 2023 14:53:01 -0700 Subject: CFV: New HotSpot Group Member: Fei Yang In-Reply-To: <9fefeea5-59a3-450c-83dc-1bfd4ff92f14@oracle.com> References: <9fefeea5-59a3-450c-83dc-1bfd4ff92f14@oracle.com> Message-ID: Vote: yes Best regards, Vladimir Ivanov On 7/11/23 14:50, Vladimir Kozlov wrote: > I hereby nominate Fei Yang to Membership in the HotSpot Group. From vladimir.kozlov at oracle.com Tue Jul 11 21:57:07 2023 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Tue, 11 Jul 2023 14:57:07 -0700 Subject: CFV: New HotSpot Group Member: Fei Yang In-Reply-To: <9fefeea5-59a3-450c-83dc-1bfd4ff92f14@oracle.com> References: <9fefeea5-59a3-450c-83dc-1bfd4ff92f14@oracle.com> Message-ID: Vote: yes Thanks, Vladimir K On 7/11/23 2:50 PM, Vladimir Kozlov wrote: > Hi, > > I hereby nominate Fei Yang to Membership in the HotSpot Group. > > Fei is one of the main authors of the RISC-V port. His main area of > expertise is the HotSpot JIT compilers. One of his notable contributions > is the backend of C1 and C2 JIT compilers for RISC-V. He has ported Loom to > Linux/RISC-V platform and has been active in AArch64 port project since 2015. > > Fei is currently the JDK Reviewer and the Project Lead for RISC-V Port [1], > which implements a full-featured JVM on Linux/RISC-V platform, including > the template interpreter, C1/C2 and all current mainline GCs. That work had > been integrated to mainline since JDK 19 and backported to JDK 17u recently. > > Fei reviewed more than 150 pull requests and contributed more than > 50 recorded changes to the JDK project [2]. > > Votes are due by Tuesday, 25 July 2023 at 22:00 UTC.
> > Only current Members of the HotSpot Group [3] are eligible to vote on this nomination. > Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [4]. > > Best regards, > Vladimir K > > [2] https://openjdk.org/census#riscv-port > [1] https://github.com/search?q=author%3ARealFYang+repo%3Aopenjdk%2Fjdk&type=commits > [3] https://openjdk.org/census#hotspot > [4] https://openjdk.org/groups/#member-vote From jiangli at openjdk.org Tue Jul 11 22:26:10 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 11 Jul 2023 22:26:10 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 20:35:13 GMT, Dean Long wrote: > Does passing -Wl,--exclude-libs,ALL to gcc work for static libs? I just did a quick test using `gcc` with testing code added to define `StringTable` related symbols outside `libjvm.a`. With `-Wl,--exclude-libs,ALL`, statically linking the launcher executable with `libjvm.a` and the testing code fails due to multiple definition of the related symbols. I guess that's expected as `--exclude-libs` is for controlling symbol exporting. From https://linux.die.net/man/1/ld#:~:text=Specifying%20%22%2D%2Dexclude%2Dlibs%20ALL,and%20for%20ELF%20targeted%20ports. --exclude-libs lib,lib,... Specifies a list of archive libraries from which symbols should not be automatically exported. The library names may be delimited by commas or colons. Specifying "--exclude-libs ALL" excludes symbols in all archive libraries from automatic export. This option is available only for the i386 PE targeted port of the linker and for ELF targeted ports. For i386 PE , symbols explicitly listed in a .def file are still exported, regardless of this option. For ELF targeted ports, symbols affected by this option will be treated as hidden.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631585756 From iklam at openjdk.org Tue Jul 11 22:41:06 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 11 Jul 2023 22:41:06 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 16:40:06 GMT, Jiangli Zhou wrote: >> Move StringTable to JavaClassFile namespace. > > Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'master' into JDK-8311661 > - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. > - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking I found a way to hide the unwanted symbols in libjvm.a. This requires `ld --relocatable` and `objcopy --keep-global-symbols=...`. See the prototype here: - https://github.com/iklam/tools/tree/main/misc/staticlib So potentially we can do this completely in the makefiles, without adding namespaces to HotSpot. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631597197 From coleen.phillimore at oracle.com Tue Jul 11 22:55:50 2023 From: coleen.phillimore at oracle.com (coleen.phillimore at oracle.com) Date: Tue, 11 Jul 2023 18:55:50 -0400 Subject: CFV: New HotSpot Group Member: Fei Yang In-Reply-To: <9fefeea5-59a3-450c-83dc-1bfd4ff92f14@oracle.com> References: <9fefeea5-59a3-450c-83dc-1bfd4ff92f14@oracle.com> Message-ID: <49fcd5a2-7d78-1272-e886-c8b1c17db8a3@oracle.com> Vote: yes On 7/11/23 5:50 PM, Vladimir Kozlov wrote: > Hi, > > I hereby nominate Fei Yang to Membership in the HotSpot Group. > > Fei is one of the main authors of the RISC-V port. His main area of > expertise is the HotSpot JIT compilers. 
One of his notable contributions > is the backend of C1 and C2 JIT compilers for RISC-V. He has ported > Loom to > Linux/RISC-V platform and has been active in AArch64 port project > since 2015. > > Fei is currently the JDK Reviewer and the Project Lead for RISC-V Port > [1], > which implements a full-featured JVM on Linux/RISC-V platform, including > the template interpreter, C1/C2 and all current mainline GCs. That > work had > been integrated to mainline since JDK 19 and backported to JDK 17u > recently. > > Fei reviewed more than 150 pull requests and contributed more than > 50 recorded changes to the JDK project [2]. > > Votes are due by Tuesday, 25 July 2023 at 22:00 UTC. > > Only current Members of the HotSpot Group [3] are eligible to vote on > this nomination. > Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [4]. > > Best regards, > Vladimir K > > [2] https://openjdk.org/census#riscv-port > [1] > https://github.com/search?q=author%3ARealFYang+repo%3Aopenjdk%2Fjdk&type=commits > [3] https://openjdk.org/census#hotspot > [4] https://openjdk.org/groups/#member-vote From jiangli at openjdk.org Tue Jul 11 23:00:14 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Tue, 11 Jul 2023 23:00:14 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 22:38:10 GMT, Ioi Lam wrote: > I found a way to hide the unwanted symbols in libjvm.a. This requires `ld --relocatable` and `objcopy --keep-global-symbols=...`. See the prototype here: > > * https://github.com/iklam/tools/tree/main/misc/staticlib > > So potentially we can do this completely in the makefiles, without adding namespaces to HotSpot. Yeah, `objcopy` can be used to localize symbols. One of my colleague implemented symbol localizing for `libfreetype.a` and `libharfbuzz.a` for static linking issue. 
In some cases, user might want to link with a different version of the harfbuzz library than the version linked with the JDK code. Then multiple versions of the libraries could be linked together into the executable. That was a solution suggested by C++ experts and it worked. Doing partial linking that produces a single `.o` file simplifies the work of `objcopy`. This is not a very portable solution though. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631611220 From dean.long at oracle.com Tue Jul 11 23:00:27 2023 From: dean.long at oracle.com (dean.long at oracle.com) Date: Tue, 11 Jul 2023 16:00:27 -0700 Subject: CFV: New HotSpot Group Member: Fei Yang In-Reply-To: <9fefeea5-59a3-450c-83dc-1bfd4ff92f14@oracle.com> References: <9fefeea5-59a3-450c-83dc-1bfd4ff92f14@oracle.com> Message-ID: Vote: yes On 7/11/23 2:50 PM, Vladimir Kozlov wrote: > Hi, > > I hereby nominate Fei Yang to Membership in the HotSpot Group. > > Fei is one of the main authors of the RISC-V port. His main area of > expertise is the HotSpot JIT compilers. One of his notable contributions > is the backend of C1 and C2 JIT compilers for RISC-V. He has ported > Loom to > Linux/RISC-V platform and has been active in AArch64 port project > since 2015. > > Fei is currently the JDK Reviewer and the Project Lead for RISC-V Port > [1], > which implements a full-featured JVM on Linux/RISC-V platform, including > the template interpreter, C1/C2 and all current mainline GCs. That > work had > been integrated to mainline since JDK 19 and backported to JDK 17u > recently. > > Fei reviewed more than 150 pull requests and contributed more than > 50 recorded changes to the JDK project [2]. > > Votes are due by Tuesday, 25 July 2023 at 22:00 UTC. > > Only current Members of the HotSpot Group [3] are eligible to vote on > this nomination. > Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [4]. 
> > Best regards, > Vladimir K > > [2] https://openjdk.org/census#riscv-port > [1] > https://github.com/search?q=author%3ARealFYang+repo%3Aopenjdk%2Fjdk&type=commits > [3] https://openjdk.org/census#hotspot > [4] https://openjdk.org/groups/#member-vote From iklam at openjdk.org Tue Jul 11 23:37:03 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 11 Jul 2023 23:37:03 GMT Subject: RFR: JDK-8311870: Split CompressedKlassPointers from compressedOops.hpp In-Reply-To: References: Message-ID: <1d_j8Ww3KoZiliCAEwPRnsGrKZSy6CaKoAPOacGzfCA=.bc2e418a-e260-48ba-947f-a77342673f01@github.com> On Tue, 11 Jul 2023 11:20:19 GMT, Thomas Stuefe wrote: > In preparation for some Lilliput-related changes, I'd like to get some purely mechanical code moves out of the way. It would also improve separation of concerns and reduces include header bloat. > > In particular, this patch does: > > 1) Move `CompressedKlassPointers` from `compressedOops.(cpp|hpp|inline.hpp)` to `compressedKlass.(cpp|hpp|inline.hpp)` > > 2) flatten the `NarrowPtrStruct _narrow_klass` to `address _base; int _shift` (its implicit null check member is not needed for Klass and it has little merit otherwise). > > 3) moved `narrowKlass` from `oopsHierarchy.hpp` to `compressedKlass.hpp` > > 4) remove `KlassAlignment` and `LogKlassAlignment` (the word-sized variants, not xxxInBytes) since they are unused > > 5) Move `KlassEncodingMetaspaceMax`, `LogKlassAlignmentInBytes` and `KlassAlignmentInBytes` to compressedKlass.hpp > > 6) Fixed all include issues (including existing missing includes) > > 7) Fixed VM struct because of (2) > > Note that nothing functional is changed. Looks reasonable to me. ------------- Marked as reviewed by iklam (Reviewer). 
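To illustrate item (2) of the patch description quoted above, here is a minimal sketch of how a flattened base-plus-shift pair can encode and decode a narrow class pointer. The `KlassEncoding` struct and the `encode`/`decode` helpers are hypothetical names chosen for illustration and are not HotSpot's actual declarations; the sketch assumes 64-bit addresses and a 32-bit narrow value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the flattened state: one base address and one
// shift, with no extra "implicit null check" flag.
struct KlassEncoding {
  std::uint64_t base;   // lowest address of the encodable range
  int           shift;  // log2 of the assumed alignment, in bytes
};

using narrow_klass_t = std::uint32_t;

// Encode: subtract the base, then drop the alignment bits.
inline narrow_klass_t encode(const KlassEncoding& e, std::uint64_t addr) {
  assert(addr >= e.base);
  const std::uint64_t offset = addr - e.base;
  // The address must be aligned to (1 << shift) bytes.
  assert((offset & ((std::uint64_t{1} << e.shift) - 1)) == 0);
  const std::uint64_t narrow = offset >> e.shift;
  // The result must fit into 32 bits.
  assert(narrow <= UINT32_MAX);
  return static_cast<narrow_klass_t>(narrow);
}

// Decode: restore the alignment bits, then add the base back.
inline std::uint64_t decode(const KlassEncoding& e, narrow_klass_t nk) {
  return e.base + (static_cast<std::uint64_t>(nk) << e.shift);
}
```

With a base of `0x800000000` and a shift of 3, the address `0x800010290` encodes to `0x2052` and decodes back to the same address; with a shift of 0 the narrow value is simply the offset `0x10290`.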
PR Review: https://git.openjdk.org/jdk/pull/14826#pullrequestreview-1525290781 From coleenp at openjdk.org Wed Jul 12 00:43:54 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 12 Jul 2023 00:43:54 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers [v2] In-Reply-To: References: Message-ID: > Please review changes to fix -Wconversion warnings that come from assembler_.cpp by adding narrow_casts to the emit_int8,16,24, and 32 functions. And some other fixups with checked_cast. > > Ran tier1 on Oracle platforms, and tier1-4 on linux-x64-debug, linux-aarch64-debug, windows-x64-debug. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix narrow_cast in assembler to check for signed ranges when signed arguments are passed in. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14822/files - new: https://git.openjdk.org/jdk/pull/14822/files/08354ed5..12b1e95b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14822&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14822&range=00-01 Stats: 10 lines in 4 files changed: 3 ins; 1 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/14822.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14822/head:pull/14822 PR: https://git.openjdk.org/jdk/pull/14822 From iklam at openjdk.org Wed Jul 12 00:46:34 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 12 Jul 2023 00:46:34 GMT Subject: RFR: 8308903: Print detailed info for Java objects in -Xlog:cds+map Message-ID: This PR adds detailed printing of oop information with `-Xlog:cds+map+oop=trace`, or simply `-Xlog:cds+map*=trace`. The information is useful for debugging contents of the CDS archived heap objects. The output looks like `oopDesc::print_on(tty)`, but we need to print the pointers using the locations of the objects at runtime. The examples below show how a `String` references its `value` array. 
Example with `-XX:-UseCompressedOops`: 0x00000000100001f0: @@ Object java.lang.String 0x00000000100001f0: 0000006ff6ab8d01 88d47c5b00010290 0000000000010000 0000000010000210 - klass: 'java/lang/String' 0x0000000800010290 - ---- fields (total size 4 words): - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) - private final 'coder' 'B' @16 0 (0x00) - private 'hashIsZero' 'Z' @17 false (0x00) - injected 'flags' 'B' @18 1 (0x01) - private final 'value' '[B' @24 0x0000000010000210 [B length: 6 0x0000000010000210: @@ Object [B length: 6 0x0000000010000210: 000000693b708001 00000006000024c8 0000574f5252414e - klass: {type array byte} 0x00000008000024c8 - 0: 4e N - 1: 41 A - 2: 52 R - 3: 52 R - 4: 4f O - 5: 57 W Example with `-XX:+UseCompressedOops`. Note that the narrorOop is also printed: 0x00000007ffc001b8: @@ Object (0xfff80037) java.lang.String 0x00000007ffc001b8: f6ab8d01 0000006f 00010290 88d47c5b 00010000 fff8003a - klass: 'java/lang/String' 0x0000000800010290 - ---- fields (total size 3 words): - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) - private final 'coder' 'B' @16 0 (0x00) - private 'hashIsZero' 'Z' @17 false (0x00) - injected 'flags' 'B' @18 1 (0x01) - private final 'value' '[B' @20 0x00000007ffc001d0 (0xfff8003a) [B length: 6 0x00000007ffc001d0: @@ Object (0xfff8003a) [B length: 6 0x00000007ffc001d0: 3b708001 00000069 000024c8 00000006 5252414e 0000574f - klass: {type array byte} 0x00000008000024c8 - 0: 4e N - 1: 41 A - 2: 52 R - 3: 52 R - 4: 4f O - 5: 57 W ------------- Commit messages: - 8308903: Print detailed info for Java objects in -Xlog:cds+map Changes: https://git.openjdk.org/jdk/pull/14841/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14841&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308903 Stats: 205 lines in 7 files changed: 164 ins; 12 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/14841.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14841/head:pull/14841 PR: https://git.openjdk.org/jdk/pull/14841 From 
jiangli at openjdk.org Wed Jul 12 01:29:24 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Wed, 12 Jul 2023 01:29:24 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: <5cl-bjjTvbpCd2yM0xRcarMllvDMqyapqD2ec2_Z_mc=.205b5d59-6bd5-4978-873b-49f32bf458a3@github.com> References: <5cl-bjjTvbpCd2yM0xRcarMllvDMqyapqD2ec2_Z_mc=.205b5d59-6bd5-4978-873b-49f32bf458a3@github.com> Message-ID: <0J49MlUO8xVYipLytP2Fzd9KNWZfd0J3fge92B0Mq3g=.388bfe76-2854-466b-a64c-3a93d00b168a@github.com> On Tue, 11 Jul 2023 21:08:01 GMT, Ioi Lam wrote: > > > Something like this might work. We need to enclose all declarations inside `namespace hotspot {...}`, so all .hpp files need to be touched (which I think is OK). > > > > > > I just looked briefly. For `.hpp` files alone in `hotspot` dir, we have 2276. Although the number of files is large, I think the actual changes are trivial. Still want to make sure everyone is okay with the changes at this scale. > > If adding `using hotspot` in precompiled.hpp works, we can roll this out in stages. At the beginning, we can just put the three problematic classes (`Thread`, `StringTable` and `ProfileData`) in the `hotspot` namespace, so only a few files need to be changed. Moving to the namespace incrementally seems reasonable to me. I will let others chime in with their thoughts. Adding a `using` directive to precompiled.hpp does help avoid adding the namespace qualifier in many places. We still need to put the implementation code inside the corresponding `namespace` block, or qualify the names explicitly. I experimented with StringTable and only needed to edit stringTable.* and precompiled.hpp. I tested with and without `--disable-precompiled-headers` and both built ok. The C++ references that I read discourage using the `using` directive. https://stackoverflow.com/questions/1452721/why-is-using-namespace-std-considered-bad-practice has some helpful information.
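A minimal sketch of the trade-off described here, assuming a namespace literally called `hotspot` as floated earlier in the thread. `other_lib` is a hypothetical stand-in for the unrelated library that defines a clashing `StringTable`, and `owner()` is an invented member used only to tell the two classes apart. The point it tries to show: a `using` directive lets unqualified uses keep compiling, while out-of-class definitions still need either an enclosing `namespace hotspot { ... }` block or a qualified name.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for another library linked into the same executable.
// In the real static-linking case its StringTable lives in a different
// translation unit; a namespace is used here only so one file compiles.
namespace other_lib {
struct StringTable {
  static std::string owner() { return "other_lib"; }
};
}  // namespace other_lib

// The JVM's class moved into a namespace, so its mangled symbol names no
// longer collide with the other library's symbols at link time.
namespace hotspot {
struct StringTable {
  static std::string owner();  // declared here, defined below
};
}  // namespace hotspot

// Stand-in for the directive that would sit in precompiled.hpp.
using namespace hotspot;

// The out-of-class definition must name the namespace explicitly
// (or be wrapped in a `namespace hotspot { ... }` block).
std::string hotspot::StringTable::owner() { return "hotspot"; }

// Existing call sites can stay unqualified thanks to the using directive.
std::string unqualified_owner() { return StringTable::owner(); }
```

Calling `unqualified_owner()` resolves to `hotspot::StringTable::owner()`, while `other_lib::StringTable::owner()` remains a distinct symbol. Note that if a second `StringTable` were visible at global scope in the same translation unit, the unqualified lookup would become ambiguous, which is one reason the directive is usually discouraged outside controlled cases like this one.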
For the namespace usages within hotspot, there is probably no concern with pollution. It seems eventually all code would be moved to the namespace then `using` would not be needed. `Thread` change would be bigger. There are references like `class Thread;`. It seems to be a good idea to be handled with https://bugs.openjdk.org/browse/JDK-8311846 separately. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1631715225 From dzhang at openjdk.org Wed Jul 12 01:34:10 2023 From: dzhang at openjdk.org (Dingli Zhang) Date: Wed, 12 Jul 2023 01:34:10 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v7] In-Reply-To: <58-yt8-aJSeLF3WAAd0TMMNrNTOz7NbS73zNXO2FMy0=.fb39c8fc-5762-485e-8bdd-5d508c56d52d@github.com> References: <58-yt8-aJSeLF3WAAd0TMMNrNTOz7NbS73zNXO2FMy0=.fb39c8fc-5762-485e-8bdd-5d508c56d52d@github.com> Message-ID: <6Q3MUDL5x67bZoEro_z9ywzBb0PBzKboI35GPSPQPsU=.74e5718d-15c5-4315-9967-bdf528ed4be2@github.com> On Mon, 10 Jul 2023 07:10:23 GMT, Dingli Zhang wrote: > > It looks like I misunderstood what `tbz` does! I believe you are correct in suggesting that the `andr` is not necessary. > > Hi, @matias9927, Thanks for update! We have already done the adaptation for RISC-V locally and we are currently testing. I will update the test results and give the corresponding patch later. Hi, @zifeihan and I have finished the RISC-V part and tier1-3 looks good. Please help us to add the RISC-V part, thanks a lot! 
https://github.com/DingliZhang/jdk/commit/a14433d37c908362982bf2571a62f42bc236e7d8 on this branch: https://github.com/DingliZhang/jdk/tree/pr-14129-rebase ------------- PR Comment: https://git.openjdk.org/jdk/pull/14129#issuecomment-1631718358 From coleenp at openjdk.org Wed Jul 12 01:35:26 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 12 Jul 2023 01:35:26 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers [v3] In-Reply-To: References: Message-ID: > Please review changes to fix -Wconversion warnings that come from assembler_.cpp by adding narrow_casts to the emit_int8,16,24, and 32 functions. And some other fixups with checked_cast. > > Ran tier1 on Oracle platforms, and tier1-4 on linux-x64-debug, linux-aarch64-debug, windows-x64-debug. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Unset Wconversion ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14822/files - new: https://git.openjdk.org/jdk/pull/14822/files/12b1e95b..71c8fdec Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14822&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14822&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14822.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14822/head:pull/14822 PR: https://git.openjdk.org/jdk/pull/14822 From tobias.hartmann at oracle.com Wed Jul 12 05:05:43 2023 From: tobias.hartmann at oracle.com (Tobias Hartmann) Date: Wed, 12 Jul 2023 07:05:43 +0200 Subject: CFV: New HotSpot Group Member: Fei Yang In-Reply-To: <9fefeea5-59a3-450c-83dc-1bfd4ff92f14@oracle.com> References: <9fefeea5-59a3-450c-83dc-1bfd4ff92f14@oracle.com> Message-ID: Vote: yes Best regards, Tobias On 11.07.23 23:50, Vladimir Kozlov wrote: > Hi, > > I hereby nominate Fei Yang to Membership in the HotSpot Group. > > Fei is one of the main authors of the RISC-V port. 
His main area of > expertise is the HotSpot JIT compilers. One of his notable contributions > is the backend of C1 and C2 JIT compilers for RISC-V. He has ported Loom to > Linux/RISC-V platform and has been active in AArch64 port project since 2015. > > Fei is currently the JDK Reviewer and the Project Lead for RISC-V Port [1], > which implements a full-featured JVM on Linux/RISC-V platform, including > the template interpreter, C1/C2 and all current mainline GCs. That work had > been integrated to mainline since JDK 19 and backported to JDK 17u recently. > > Fei reviewed more than 150 pull requests and contributed more than > 50 recorded changes to the JDK project [2]. > > Votes are due by Tuesday, 25 July 2023 at 22:00 UTC. > > Only current Members of the HotSpot Group [3] are eligible to vote on this nomination. > Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [4]. > > Best regards, > Vladimir K > > [2] https://openjdk.org/census#riscv-port > [1] https://github.com/search?q=author%3ARealFYang+repo%3Aopenjdk%2Fjdk&type=commits > [3] https://openjdk.org/census#hotspot > [4] https://openjdk.org/groups/#member-vote From stuefe at openjdk.org Wed Jul 12 05:18:19 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 12 Jul 2023 05:18:19 GMT Subject: RFR: JDK-8311870: Split CompressedKlassPointers from compressedOops.hpp In-Reply-To: <1d_j8Ww3KoZiliCAEwPRnsGrKZSy6CaKoAPOacGzfCA=.bc2e418a-e260-48ba-947f-a77342673f01@github.com> References: <1d_j8Ww3KoZiliCAEwPRnsGrKZSy6CaKoAPOacGzfCA=.bc2e418a-e260-48ba-947f-a77342673f01@github.com> Message-ID: On Tue, 11 Jul 2023 23:33:52 GMT, Ioi Lam wrote: > Looks reasonable to me. Thank you Ioi. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14826#issuecomment-1631862898 From sspitsyn at openjdk.org Wed Jul 12 05:29:41 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 12 Jul 2023 05:29:41 GMT Subject: RFR: 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach Message-ID: This is an issue with a dynamic load of a JVMTI agent into running VM. The `VM_SetNotifyJvmtiEventsMode` enabling operation makes a call to the function `count_transitions_and_correct_jvmti_thread_states()`. This function in its turn make a call to the function `correct_jvmti_thread_state()`. But it does it conditionally, only if the field `_whitebox_used` is `true`. The test provided in the bug report showed that it has to be called unconditionally as the assumption that it is only needed on the subsequent `notifyJvmti` enabling is incorrect. Then the field `_whitebox_used` is not needed anymore and removed in this fix. Some obsolete comments are removed or updated. New test is added: `test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest`. It is failed without the fix and passed with the fix. 
Testing: - ran new test `test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest` - mach5 tiers 1-5 are good ------------- Commit messages: - fix traling spaces and remove unneeded empty lines in new test - 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach Changes: https://git.openjdk.org/jdk/pull/14842/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14842&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311556 Stats: 198 lines in 3 files changed: 183 ins; 13 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14842.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14842/head:pull/14842 PR: https://git.openjdk.org/jdk/pull/14842 From thomas.stuefe at gmail.com Wed Jul 12 05:31:40 2023 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Wed, 12 Jul 2023 07:31:40 +0200 Subject: CFV: New HotSpot Group Member: Fei Yang In-Reply-To: <9fefeea5-59a3-450c-83dc-1bfd4ff92f14@oracle.com> References: <9fefeea5-59a3-450c-83dc-1bfd4ff92f14@oracle.com> Message-ID: Vote: yes On Tue, Jul 11, 2023 at 11:50 PM Vladimir Kozlov wrote: > Hi, > > I hereby nominate Fei Yang to Membership in the HotSpot Group. > > Fei is one of the main authors of the RISC-V port. His main area of > expertise is the HotSpot JIT compilers. One of his notable contributions > is the backend of C1 and C2 JIT compilers for RISC-V. He has ported Loom to > Linux/RISC-V platform and has been active in AArch64 port project since > 2015. > > Fei is currently the JDK Reviewer and the Project Lead for RISC-V Port [1], > which implements a full-featured JVM on Linux/RISC-V platform, including > the template interpreter, C1/C2 and all current mainline GCs. That work had > been integrated to mainline since JDK 19 and backported to JDK 17u > recently. > > Fei reviewed more than 150 pull requests and contributed more than > 50 recorded changes to the JDK project [2]. > > Votes are due by Tuesday, 25 July 2023 at 22:00 UTC.
> > Only current Members of the HotSpot Group [3] are eligible to vote on this > nomination. > Votes must be cast in the open by replying to this mailing list. > > For Lazy Consensus voting instructions, see [4]. > > Best regards, > Vladimir K > > [2] https://openjdk.org/census#riscv-port > [1] > https://github.com/search?q=author%3ARealFYang+repo%3Aopenjdk%2Fjdk&type=commits > [3] https://openjdk.org/census#hotspot > [4] https://openjdk.org/groups/#member-vote > From stuefe at openjdk.org Wed Jul 12 05:55:13 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 12 Jul 2023 05:55:13 GMT Subject: RFR: 8308903: Print detailed info for Java objects in -Xlog:cds+map In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 00:39:30 GMT, Ioi Lam wrote: > The output looks like oopDesc::print_on(tty), but we need to print the pointers using the locations of the objects at runtime. I don't understand. Are you not experimenting with a new allocation mode that would basically make the runtime address unpredictable? Or is this still for the old way of things where the heap archive is mapped as-is ? Other than that, this looks very useful. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14841#issuecomment-1631891189 From lmesnik at openjdk.org Wed Jul 12 06:23:12 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Wed, 12 Jul 2023 06:23:12 GMT Subject: RFR: 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 05:02:53 GMT, Serguei Spitsyn wrote: > This is an issue with a dynamic load of a JVMTI agent into running VM. > The `VM_SetNotifyJvmtiEventsMode` enabling operation makes a call to the function `count_transitions_and_correct_jvmti_thread_states()`. This function in its turn make a call to the function `correct_jvmti_thread_state()`.
But it does it conditionally, only if the field `_whitebox_used` is `true`. > The test provided in the bug report showed that it has to be called unconditionally as the assumption that it is only needed on the subsequent `notifyJvmti` enabling is incorrect. > > Then the field `_whitebox_used` is not needed anymore and removed in this fix. > Some obsolete comments are removed or updated. > > New test is added: `test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest`. > It is failed without the fix and passed with the fix. > > Testing: > - ran new test `test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest` > - mach5 tiers 1-5 are good Marked as reviewed by lmesnik (Reviewer). test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest/VThreadTLSTest.java line 29: > 27: * @requires vm.continuations > 28: * @requires vm.jvmti > 29: * @compile VThreadTLSTest.java I believe that this line is redundant. jtreg compiles test automatically. test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest/VThreadTLSTest.java line 39: > 37: * @requires vm.continuations > 38: * @requires vm.jvmti > 39: * @compile VThreadTLSTest.java I believe that this line is redundant like 29. 
------------- PR Review: https://git.openjdk.org/jdk/pull/14842#pullrequestreview-1525583011 PR Review Comment: https://git.openjdk.org/jdk/pull/14842#discussion_r1260645294 PR Review Comment: https://git.openjdk.org/jdk/pull/14842#discussion_r1260645647 From fjiang at openjdk.org Wed Jul 12 06:41:55 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Wed, 12 Jul 2023 06:41:55 GMT Subject: RFR: 8311862: RISC-V: small improvements to slli In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 09:02:39 GMT, Ilya Gavrilin wrote: > Please review this small change for slli and slli.uw > slli change allows to replace slli Rd, Rs, 0 with addi Rd, Rs, 0 (and no operation emitted if Rd == Rs) > addi with 0 has higher chances to be just a register renaming and not utilise ALU at all > We have observed small positive effect on hifive (and no change on thead). > Also this patch changes slli.uw and allows it to be used without additional check for UseZba, also providing fallback when Zba is not available > testing: tier1 and tier2 on hifive, also hotspot/jtreg/compiler/intrinsics/string tests on Qemu with UseZba > > performance on hifive, before: > | Benchmark | Mode | Cnt | Score | | Error | Units | > |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:| > | StringIndexOf.advancedWithShortSub1 | avgt | 25 | 4035.143 | ± | 191.262 | ns/op | > | StringIndexOf.advancedWithShortSub2 | avgt | 25 | 1145.807 | ± | 14.610 | ns/op | > > with patch: > | Benchmark | Mode | Cnt | Score | | Error | Units | > |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:| > | StringIndexOf.advancedWithShortSub1 | avgt | 25 | 3613.943 | ± | 178.153 | ns/op | > | StringIndexOf.advancedWithShortSub2 | avgt | 25 | 923.169 | ± 
| 47.123 | ns/op | Looks good, with one nit: src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4342: > 4340: _slli_uw(Rd, Rs, shamt); > 4341: } else { > 4342: slli(Rd, Rs, shamt+32); Suggestion: slli(Rd, Rs, shamt + 32); ------------- Marked as reviewed by fjiang (Author). PR Review: https://git.openjdk.org/jdk/pull/14823#pullrequestreview-1525617190 PR Review Comment: https://git.openjdk.org/jdk/pull/14823#discussion_r1260670748 From duke at openjdk.org Wed Jul 12 07:04:15 2023 From: duke at openjdk.org (sid8606) Date: Wed, 12 Jul 2023 07:04:15 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Sat, 8 Jul 2023 10:48:15 GMT, sid8606 wrote: >> Implementation of "Foreign Function & Memory API" for s390x (Big Endian). > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Address suggestions from Jorn Vernee @TheRealMDoerr , @reinrich Please review this PR. I would like to hear your reviews on the platform-specific changes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14801#issuecomment-1631958291 From duke at openjdk.org Wed Jul 12 07:35:20 2023 From: duke at openjdk.org (Ilya Gavrilin) Date: Wed, 12 Jul 2023 07:35:20 GMT Subject: RFR: 8311862: RISC-V: small improvements to slli [v2] In-Reply-To: References: Message-ID: > Please review this small change for slli and slli.uw > slli change allows to replace slli Rd, Rs, 0 with addi Rd, Rs, 0 (and no operation emitted if Rd == Rs) > addi with 0 has higher chances to be just a register renaming and not utilise ALU at all > We have observed small positive effect on hifive (and no change on thead).
> Also this patch changes slli.uw and allows it to be used without additional check for UseZba, also providing fallback when Zba is not available > testing: tier1 and tier2 on hifive, also hotspot/jtreg/compiler/intrinsics/string tests on Qemu with UseZba > > performance on hifive, before: > | Benchmark | Mode | Cnt | Score | | Error | Units | > |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:| > | StringIndexOf.advancedWithShortSub1 | avgt | 25 | 4035.143 | ± | 191.262 | ns/op | > | StringIndexOf.advancedWithShortSub2 | avgt | 25 | 1145.807 | ± | 14.610 | ns/op | > > with patch: > | Benchmark | Mode | Cnt | Score | | Error | Units | > |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:| > | StringIndexOf.advancedWithShortSub1 | avgt | 25 | 3613.943 | ± | 178.153 | ns/op | > | StringIndexOf.advancedWithShortSub2 | avgt | 25 | 923.169 | ± | 47.123 | ns/op | Ilya Gavrilin has updated the pull request incrementally with one additional commit since the last revision: Fix typo in macroAssembler_riscv.cpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14823/files - new: https://git.openjdk.org/jdk/pull/14823/files/70bce628..2df17399 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14823&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14823&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14823.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14823/head:pull/14823 PR: https://git.openjdk.org/jdk/pull/14823 From duke at openjdk.org Wed Jul 12 07:45:15 2023 From: duke at openjdk.org (Ilya Gavrilin) Date: Wed, 12 Jul 2023 07:45:15 GMT Subject: RFR: 8311862: RISC-V: small improvements to slli [v2] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 06:38:11 GMT, Feilong Jiang wrote: >> Ilya Gavrilin has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix
typo in macroAssembler_riscv.cpp > > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4342: > >> 4340: _slli_uw(Rd, Rs, shamt); >> 4341: } else { >> 4342: slli(Rd, Rs, shamt+32); > > Suggestion: > > slli(Rd, Rs, shamt + 32); Thank you for your suggestion ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14823#discussion_r1260741827 From sspitsyn at openjdk.org Wed Jul 12 08:01:55 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 12 Jul 2023 08:01:55 GMT Subject: RFR: 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach [v2] In-Reply-To: References: Message-ID: > This is an issue with a dynamic load of a JVMTI agent into running VM. > The `VM_SetNotifyJvmtiEventsMode` enabling operation makes a call to the function `count_transitions_and_correct_jvmti_thread_states()`. This function in its turn make a call to the function `correct_jvmti_thread_state()`. But it does it conditionally, only if the field `_whitebox_used` is `true`. > The test provided in the bug report showed that it has to be called unconditionally as the assumption that it is only needed on the subsequent `notifyJvmti` enabling is incorrect. > > Then the field `_whitebox_used` is not needed anymore and removed in this fix. > Some obsolete comments are removed or updated. > > New test is added: `test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest`. > It is failed without the fix and passed with the fix. 
> > Testing: > - ran new test `test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest` > - mach5 tiers 1-5 are good Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: removed unneeded @compile commands from new test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14842/files - new: https://git.openjdk.org/jdk/pull/14842/files/f582fbc0..743188b5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14842&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14842&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 2 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14842.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14842/head:pull/14842 PR: https://git.openjdk.org/jdk/pull/14842 From sspitsyn at openjdk.org Wed Jul 12 08:01:59 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 12 Jul 2023 08:01:59 GMT Subject: RFR: 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach [v2] In-Reply-To: References: Message-ID: <95ap33tLjqW86oEv0QLda_wCPOwYsmVV1QNeTNX8Zq0=.33bf605a-cffe-4b9e-9f40-7cce5dd656ab@github.com> On Wed, 12 Jul 2023 06:11:04 GMT, Leonid Mesnik wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: removed unneeded @compile commands from new test > > test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest/VThreadTLSTest.java line 29: > >> 27: * @requires vm.continuations >> 28: * @requires vm.jvmti >> 29: * @compile VThreadTLSTest.java > > I believe that this line is redundant. jtreg compiles test automatically. Yes. I forgot to remove it. :) > test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest/VThreadTLSTest.java line 39: > >> 37: * @requires vm.continuations >> 38: * @requires vm.jvmti >> 39: * @compile VThreadTLSTest.java > > I believe that this line is redundant like 29. Yes. I forgot to remove it. 
:) Fixed now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14842#discussion_r1260756205 PR Review Comment: https://git.openjdk.org/jdk/pull/14842#discussion_r1260757587 From rkennke at openjdk.org Wed Jul 12 08:13:10 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 12 Jul 2023 08:13:10 GMT Subject: RFR: JDK-8311870: Split CompressedKlassPointers from compressedOops.hpp In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 11:20:19 GMT, Thomas Stuefe wrote: > In preparation for some Lilliput-related changes, I'd like to get some purely mechanical code moves out of the way. It would also improve separation of concerns and reduces include header bloat. > > In particular, this patch does: > > 1) Move `CompressedKlassPointers` from `compressedOops.(cpp|hpp|inline.hpp)` to `compressedKlass.(cpp|hpp|inline.hpp)` > > 2) flatten the `NarrowPtrStruct _narrow_klass` to `address _base; int _shift` (its implicit null check member is not needed for Klass and it has little merit otherwise). > > 3) moved `narrowKlass` from `oopsHierarchy.hpp` to `compressedKlass.hpp` > > 4) remove `KlassAlignment` and `LogKlassAlignment` (the word-sized variants, not xxxInBytes) since they are unused > > 5) Move `KlassEncodingMetaspaceMax`, `LogKlassAlignmentInBytes` and `KlassAlignmentInBytes` to compressedKlass.hpp > > 6) Fixed all include issues (including existing missing includes) > > 7) Fixed VM struct because of (2) > > Note that nothing functional is changed. Looks good! Looks good to me! ------------- Marked as reviewed by rkennke (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14826#pullrequestreview-1525765735 PR Review: https://git.openjdk.org/jdk/pull/14826#pullrequestreview-1525767509 From rrich at openjdk.org Wed Jul 12 08:33:16 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 12 Jul 2023 08:33:16 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen Message-ID: This PR introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in Gerrit production systems. The algorithm to share scanning large arrays is supposed to be a straightforward extension of the scheme implemented in `PSCardTable::scavenge_contents_parallel`. - A worker scans the part of a large array located in its stripe - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. #### Performance testing ##### BigArrayInOldGenRR.java [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a microbenchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation.
The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. Observations: * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at least 2 workers if possible (see `WorkerPolicy::calc_active_workers`). I suspect that the inverse scaling in the test is caused by using task stealing to distribute the work, which is too fine-grained. Each task refers to a single array element that needs to be traced. Neighboring array elements are likely to be taken by different threads. The array reference and card table updates by different threads result in poor cache performance due to false sharing. ##### Large Gerrit Instance on JDK11 We've received reports about long young pauses of Parallel GC of a large [Gerrit](https://en.wikipedia.org/wiki/Gerrit_(software)) Instance (-Xmx256g, -Xmn128g, 176 cores, 352 Intel HyperThreads). There were spikes of 20s, 30s and up to 50s young pauses. To begin with, Parallel configures 223 worker threads. Given that 2 HT on the same core cannot run concurrently we decided to configure just 100 gc threads.
We noticed that there was always one thread taking extremely long for old-to-young-roots-task. With tracing we saw there were one or two arrays being scanned with 4-5 million elements. We've tried to tune task stealing and further reduce the number of threads but if it helped to reduce the peaks then average young pauses would increase. Finally we've implemented parallel scanning of large arrays. With an [experimental backport to jdk11](https://github.com/openjdk/jdk11u-dev/compare/master...reinrich:jdk11u-dev:ps_parallel_scanning_of_large_arrays_in_old__BACKPORT_JDK11) of this pr the extremely long young pauses did not occur anymore. Results are presented in [gerrit_young_gcs.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/gerrit_young_gcs.pdf)([gerrit_young_gcs.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/gerrit_young_gcs.ods)). Full logs are given with [javagc-2023-04-28_07-52-02.log.gz](https://cr.openjdk.org/~rrich/webrevs/8310031/javagc-2023-04-28_07-52-02.log.gz) and [javagc-2023-07-05__with_fix.log.gz](https://cr.openjdk.org/~rrich/webrevs/8310031/javagc-2023-07-05__with_fix.log.gz). ##### Renaissance Benchmarks I've used the Renaissance Benchmark Suite to test this PR. The benchmarks do not seem to put a lot of stress on gc though. [renaissance.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/renaissance.pdf)([renaissance.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/renaissance.ods)) shows the results. Differences in the results look like noise to me. #### Functional testing JTReg tests: tier1-4 of hotspot and jdk. All of Langtools and jaxp. The tests were executed repeatedly and also once with Parallel as default GC. All testing was done with fastdebug and release builds on the main platforms and also on Linux/PPC64le. 
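As a rough illustration of the stripe-based sharing rule described earlier in this message, the following standalone C++ sketch computes which slice of a large array the owner of a given stripe would scan. It is not taken from the patch: `Range`, `slice_for_stripe` and the plain word-index model are hypothetical names and simplifications, dirty-card filtering is left out, and the actual logic lives in `PSCardTable::scavenge_contents_parallel`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Half-open range [begin, end) of word indices. Hypothetical helper type.
struct Range {
  std::size_t begin;
  std::size_t end;
  bool empty() const { return begin >= end; }
};

// Returns the part of 'array' that the owner of stripe number 'stripe' scans,
// assuming stripes are consecutive blocks of 'stripe_size' words from index 0.
// Rule modelled here (an assumption based on the description above):
//  - a stripe owner scans the part of the array inside its own stripe;
//  - the end of the array that reaches into a further stripe is scanned by
//    the owner of the previous stripe, so the owner of that last stripe
//    skips it.
Range slice_for_stripe(Range array, std::size_t stripe_size, std::size_t stripe) {
  assert(stripe_size > 0 && array.begin < array.end);
  const std::size_t stripe_begin = stripe * stripe_size;
  const std::size_t stripe_end = stripe_begin + stripe_size;

  const std::size_t begin = std::max(array.begin, stripe_begin);
  std::size_t end = std::min(array.end, stripe_end);
  if (begin >= end) {
    return Range{0, 0};  // the array does not overlap this stripe
  }

  // The array has a tail if it does not end exactly on a stripe boundary.
  const bool has_tail = (array.end % stripe_size) != 0;
  const std::size_t tail_stripe = array.end / stripe_size;

  if (has_tail && stripe == tail_stripe && array.begin < stripe_begin) {
    return Range{0, 0};  // tail crossing into this stripe: previous owner scans it
  }
  if (has_tail && stripe + 1 == tail_stripe) {
    end = array.end;     // also scan the tail reaching into the next stripe
  }
  return Range{begin, end};
}
```

With a stripe size of 100 words and an array occupying words 50 to 330, stripe 0 covers 50 to 100, stripe 1 covers 100 to 200, stripe 2 covers 200 to 330 (including the tail), and stripe 3 scans nothing, so every word is visited exactly once.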
------------- Commit messages: - 8310031: Parallel: Implement better work distribution for large object arrays in old gen Changes: https://git.openjdk.org/jdk/pull/14846/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8310031 Stats: 148 lines in 6 files changed: 139 ins; 1 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From shade at openjdk.org Wed Jul 12 08:44:27 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 12 Jul 2023 08:44:27 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v8] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Mon, 10 Jul 2023 13:53:36 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. 
The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 32 additional commits since the last revision: > > - Make test spikes more pronounced > - Dont query procfs if logging is off > - rename logtag again > - When probing for safepoint end, use the smaller of (interval, 250ms) > - Remove TrimNativeHeap and expand TrimNativeHeapInterval > - Improve comments for non-supportive platforms > - Aleksey cosmetics > - suspend count return 16 bits > - Fix linker errors > - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap > - ... and 22 more: https://git.openjdk.org/jdk/compare/417c1c87...15566761 Hopefully a final read. I think there are minor things left, see comments. Alternatively, apply this patch over your current PR, which contains fixes for my comments, and then some polishing: [trimnative-shipilev-1.patch](https://github.com/openjdk/jdk/files/12025768/trimnative-shipilev-1.patch) ...also, Windows builds are failing. src/hotspot/share/runtime/trimNativeHeap.cpp line 47: > 45: > 46: // Statistics > 47: unsigned _num_trims_performed; Sorry for the nit, but this is `uint16_t` too then, for consistency? src/hotspot/share/runtime/trimNativeHeap.cpp line 75: > 73: SafepointSynchronize::is_synchronizing(); > 74: } > 75: static constexpr int64_t safepoint_poll_ms = 250; Let's document this a little: // Upper limit for the backoff during pending/in-progress safepoint. // Chosen as reasonable value to balance the overheads of waking up // during the safepoint, which might have undesired effects on latencies, // and the accuracy in tracking the trimming interval. static constexpr int64_t safepoint_poll_ms = 250; src/hotspot/share/runtime/trimNativeHeap.cpp line 90: > 88: assert(NativeHeapTrimmer::enabled(), "Only call if enabled"); > 89: > 90: LogStartStopMark logStartStop; Hotspot style: no camel case for local identifiers. 
src/hotspot/share/runtime/trimNativeHeap.cpp line 110: > 108: } else if (at_or_nearing_safepoint()) { > 109: const int64_t wait_ms = MIN2((int64_t)TrimNativeHeapInterval, safepoint_poll_ms); > 110: ml.wait(safepoint_poll_ms); `MIN2(..., ...)` might work better without a cast? Also, let's actually use `wait_ms` here :) src/hotspot/share/runtime/trimNativeHeap.cpp line 117: > 115: tnow = now(); > 116: > 117: } while (at_or_nearing_safepoint() || is_suspended() || next_trim_time > tnow); Maybe invert this to pre-condition while? src/hotspot/share/runtime/trimNativeHeap.cpp line 120: > 118: } // Lock scope > 119: > 120: // 2 - Trim outside of lock protection. There is no `1 -` to match this `2 -` to. src/hotspot/share/runtime/trimNativeHeap.hpp line 46: > 44: static void cleanup(); > 45: > 46: static uint64_t num_trims_performed(); Need this for anything? Does not seem to be implemented. test/hotspot/jtreg/runtime/os/TestTrimNative.java line 34: > 32: * @build jdk.test.whitebox.WhiteBox > 33: * @run driver jdk.test.lib.helpers.ClassFileInstaller jdk.test.whitebox.WhiteBox > 34: * @run main/othervm -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI TestTrimNative trimNative I see that we always spawn a VM with Whitebox enabled explicitly there. Do you need to enable Whitebox for these? Also, can these be just `@run driver`? 
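To make the `wait_ms` remark concrete, here is a minimal sketch in standard C++ of the intended computation: take the smaller of the trim interval and the safepoint poll limit, and pass that value to the wait call instead of the constant. It uses `std::min` rather than HotSpot's `MIN2`, and `safepoint_backoff_ms` is a hypothetical helper name, not something from the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Upper limit for the backoff while a safepoint is pending or in progress
// (value taken from the reviewed snippet).
static constexpr int64_t safepoint_poll_ms = 250;

// Hypothetical helper: how long to wait before re-checking the safepoint
// state. Both operands are int64_t, so no cast is needed at the call site.
int64_t safepoint_backoff_ms(int64_t trim_interval_ms) {
  assert(trim_interval_ms >= 0);
  return std::min<int64_t>(trim_interval_ms, safepoint_poll_ms);
}
```

In the reviewed loop this would correspond to computing `wait_ms` once and then calling `ml.wait(wait_ms)`, so that a trim interval shorter than 250 ms is actually honoured.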
------------- PR Review: https://git.openjdk.org/jdk/pull/14781#pullrequestreview-1522725836 PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1632093967 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258705275 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258833848 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258835428 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1260797898 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1260809509 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1260799630 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1258840799 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1260764210 From vkempik at openjdk.org Wed Jul 12 08:49:37 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Wed, 12 Jul 2023 08:49:37 GMT Subject: RFR: 8310656: RISC-V: __builtin___clear_cache can fail silently. [v3] In-Reply-To: <4eyogulkaSvi1d-xVbPCAp_mwRSD5sHyfysJj2Gat2A=.abfd20ed-c21d-4150-b25c-e4f9a5b71868@github.com> References: <6JeLSyWD6twDLD83OPiG-_5lTgGVHn8dh-rKkc7scmM=.9b7be0e7-cb20-44c6-8d28-d72c00d41edf@github.com> <4eyogulkaSvi1d-xVbPCAp_mwRSD5sHyfysJj2Gat2A=.abfd20ed-c21d-4150-b25c-e4f9a5b71868@github.com> Message-ID: On Sat, 1 Jul 2023 11:11:15 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> We recently had a bug where user were missing permissions to use this syscall. >> Which caused crashing on, according to hs_err on things like "addi x11, x24, 0" with SIGILL. >> If it fails it is even possible to execute valid but 'old' instruction which may not lead to a crash, instead the program misbehaves. >> >> To avoid this mess I suggest that we first test the syscall during vm init and we use it directly. >> This way we can make sure it never fails. >> >> Tested failing syscall with qemu, tested t1 in qemu, t1 on jh7110 in-progress. 
> > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - merge update and nits > - Merge branch 'master' into 8310656 > - added back data barrier > > Signed-off-by: Robbin Ehn > - 8310656: RISC-V: __builtin___clear_cache can fail silently. would you mind backporting it to 17u-dev as well ? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14670#issuecomment-1632101141 From shade at openjdk.org Wed Jul 12 08:51:23 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 12 Jul 2023 08:51:23 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v8] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Mon, 10 Jul 2023 13:53:36 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. 
The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 32 additional commits since the last revision: > > - Make test spikes more pronounced > - Dont query procfs if logging is off > - rename logtag again > - When probing for safepoint end, use the smaller of (interval, 250ms) > - Remove TrimNativeHeap and expand TrimNativeHeapInterval > - Improve comments for non-supportive platforms > - Aleksey cosmetics > - suspend count return 16 bits > - Fix linker errors > - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap > - ... and 22 more: https://git.openjdk.org/jdk/compare/31ef4f2a...15566761 One more thing, as I play with it: the GC logging does not have a comma before timestamp, see: [1.210s][info][trimnative] Trim native heap (1): RSS+Swap: 1192M->1191M (-1552K), 0.353ms [1.528s][info][gc ] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 91M->78M(1024M) 73.040ms ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1632104321 From aph at openjdk.org Wed Jul 12 08:56:17 2023 From: aph at openjdk.org (Andrew Haley) Date: Wed, 12 Jul 2023 08:56:17 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers [v3] In-Reply-To: References: Message-ID: <5Gygzr9uLJCx5a7mS4w_fDs-_krL3SA9pO1P0UC97Gc=.8bc76719-f6ca-4352-9052-2ccc62b7f70b@github.com> On Wed, 12 Jul 2023 01:35:26 GMT, Coleen Phillimore wrote: >> Please review changes to fix -Wconversion warnings that come from assembler_.cpp by adding narrow_casts to the emit_int8,16,24, and 32 functions. And some other fixups with checked_cast. >> >> Ran tier1 on Oracle platforms, and tier1-4 on linux-x64-debug, linux-aarch64-debug, windows-x64-debug. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Unset Wconversion src/hotspot/share/asm/assembler.hpp line 304: > 302: } else { > 303: return checked_cast(x); > 304: } Indentation looks wrong? 
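For readers unfamiliar with the idea behind the checked narrowing discussed in this PR, the sketch below shows one common way to express it in standard C++: cast to the narrower type, cast back, and verify that the value survived. This is only an illustration; `fits_in` and `checked_narrow` are hypothetical names, and HotSpot's own `checked_cast` and `narrow_cast` may differ in detail.

```cpp
#include <cassert>
#include <cstdint>

// True if 'value' is unchanged by a round trip through type To.
template <typename To, typename From>
bool fits_in(From value) {
  return static_cast<From>(static_cast<To>(value)) == value;
}

// Narrows 'value' to To, asserting (in debug builds) that nothing is lost.
// The explicit static_cast also keeps -Wconversion quiet at the call site.
template <typename To, typename From>
To checked_narrow(From value) {
  assert(fits_in<To>(value) && "value does not survive narrowing");
  return static_cast<To>(value);
}
```

A caller emitting a single byte could then write something like `checked_narrow<uint8_t>(opcode)`, which documents the intent and traps out-of-range values in debug builds.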
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14822#discussion_r1260841166 From stuefe at openjdk.org Wed Jul 12 09:10:27 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 12 Jul 2023 09:10:27 GMT Subject: RFR: JDK-8311870: Split CompressedKlassPointers from compressedOops.hpp In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 08:11:00 GMT, Roman Kennke wrote: > Looks good to me! Thank you @rkennke ! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14826#issuecomment-1632130286 From stuefe at openjdk.org Wed Jul 12 09:10:29 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 12 Jul 2023 09:10:29 GMT Subject: Integrated: JDK-8311870: Split CompressedKlassPointers from compressedOops.hpp In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 11:20:19 GMT, Thomas Stuefe wrote: > In preparation for some Lilliput-related changes, I'd like to get some purely mechanical code moves out of the way. It would also improve separation of concerns and reduces include header bloat. > > In particular, this patch does: > > 1) Move `CompressedKlassPointers` from `compressedOops.(cpp|hpp|inline.hpp)` to `compressedKlass.(cpp|hpp|inline.hpp)` > > 2) flatten the `NarrowPtrStruct _narrow_klass` to `address _base; int _shift` (its implicit null check member is not needed for Klass and it has little merit otherwise). > > 3) moved `narrowKlass` from `oopsHierarchy.hpp` to `compressedKlass.hpp` > > 4) remove `KlassAlignment` and `LogKlassAlignment` (the word-sized variants, not xxxInBytes) since they are unused > > 5) Move `KlassEncodingMetaspaceMax`, `LogKlassAlignmentInBytes` and `KlassAlignmentInBytes` to compressedKlass.hpp > > 6) Fixed all include issues (including existing missing includes) > > 7) Fixed VM struct because of (2) > > Note that nothing functional is changed. This pull request has now been integrated. 
Changeset: 753bd563 Author: Thomas Stuefe URL: https://git.openjdk.org/jdk/commit/753bd563ecca6bb5ff9b5ebc0957bc1854dce78d Stats: 565 lines in 35 files changed: 333 ins; 220 del; 12 mod 8311870: Split CompressedKlassPointers from compressedOops.hpp Reviewed-by: iklam, rkennke ------------- PR: https://git.openjdk.org/jdk/pull/14826 From fyang at openjdk.org Wed Jul 12 09:26:34 2023 From: fyang at openjdk.org (Fei Yang) Date: Wed, 12 Jul 2023 09:26:34 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v47] In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 13:21:30 GMT, Roman Kennke wrote: >> Roman Kennke has updated the pull request incrementally with three additional commits since the last revision: >> >> - Use BytesPerWord >> - Revert unnecessary change in s390 >> - Revert unnecessary change in PPC > > I've simplified the PR significantly: > - The gap is now usually cleared as part of the header initialization (as is already done for instance oops). This allows to use simple and fast word-sized clearing of the rest of the array. > - Calculations of min and max array and tlab sizes are all reverted, because they are still conservatively correct. Optimizing those for a few bytes extra seems to be a rather complex task and should be done as separate PR. > - The ARM parts could be reverted wholesale (sorry, @tstuefe) because 32 bit arch doesn't require any changes anymore. > > @RealFYang can you please check the RISCV port and bring it into the same shape as aarch64/x86? > @TheRealMDoerr Can you ack PPC and s390 ports? I've only done very minor cleanups there, compared to early version. > @coleenp please review again? Maybe bring others in as you see fit and/or run Mach5 testing (preferably also with -XX:-UseCompressedClassPointers, because that is what this PR is about) @rkennke : Here you go. Small update for RISC-V by referencing AArch64. 
[11044-riscv-fix.txt](https://github.com/openjdk/jdk/files/12026158/11044-riscv-fix.txt) Run non-trivial benchmark workloads, will perform more tests. ------------- PR Comment: https://git.openjdk.org/jdk/pull/11044#issuecomment-1632158856 From simonis at openjdk.org Wed Jul 12 11:49:18 2023 From: simonis at openjdk.org (Volker Simonis) Date: Wed, 12 Jul 2023 11:49:18 GMT Subject: RFR: 8307352: AARCH64: Improve itable_stub [v2] In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 05:37:35 GMT, Boris Ulasevich wrote: >> This is a change for AARCH similar to https://github.com/openjdk/jdk/pull/13460 >> >> The change replaces two separate iterations over the itable with a new algorithm consisting of two loops. First, we look for a match with resolved_klass, checking for a match with holder_klass along the way. Then we continue iterating (not starting over) the itable using the second loop, checking only for a match with holder_klass. >> >> InterfaceCalls openjdk benchmark performance results on A53, A72, Neoverse N1 and V1 micro-architectures: >> >> >> Cortex-A53 (Pi 3 Model B Rev 1.2) >> >> test1stInt2Types 37.5 37.358 0.38 >> test1stInt3Types 160.166 148.04 8.19 >> test1stInt5Types 158.131 147.955 6.88 >> test2ndInt2Types 52.634 53.291 -1.23 >> test2ndInt3Types 201.39 181.603 10.90 >> test2ndInt5Types 195.722 176.707 10.76 >> testIfaceCall 157.453 140.498 12.07 >> testIfaceExtCall 175.46 154.351 13.68 >> testMonomorphic 32.052 32.039 0.04 >> AVG: 6.85 >> >> Cortex-A72 (Pi 4 Model B Rev 1.2) >> >> test1stInt2Types 27.4796 27.4738 0.02 >> test1stInt3Types 66.0085 64.9374 1.65 >> test1stInt5Types 67.9812 66.2316 2.64 >> test2ndInt2Types 32.0581 32.062 -0.01 >> test2ndInt3Types 68.2715 65.6643 3.97 >> test2ndInt5Types 68.1012 65.8024 3.49 >> testIfaceCall 64.0684 64.1811 -0.18 >> testIfaceExtCall 91.6226 81.5867 12.30 >> testMonomorphic 26.7161 26.7142 0.01 >> AVG: 2.66 >> >> Neoverse N1 (m6g.metal) >> >> test1stInt2Types 2.9104 2.9086 0.06 >> test1stInt3Types 
10.9642 10.2909 6.54 >> test1stInt5Types 10.9607 10.2856 6.56 >> test2ndInt2Types 3.3410 3.3478 -0.20 >> test2ndInt3Types 12.3291 11.3089 9.02 >> test2ndInt5Types 12.328 11.2704 9.38 >> testIfaceCall 11.0598 10.3657 6.70 >> testIfaceExtCall 13.0692 11.2826 15.84 >> testMonomorphic 2.2354 2.2341 0.06 >> AVG: 6.00 >> >> Neoverse V1 (c7g.2xlarge) >> >> test1stInt2Types 2.2317 2.2320 -0.01 >> test1stInt3Types 6.6884 6.1911 8.03 >> test1stInt5Types 6.7334 6.2193 8.27 >> test2ndInt2Types 2.4002 2.4013 -0.04 >> test2ndInt3Types 7.9603 7.0372 13.12 >> test2ndInt5Types 7.9532 7.0474 12.85 >> testIfaceCall 6.7028 6.3272 5.94 >> testIfaceExtCall 8.3253 6.941... > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains one additional commit since the last revision: > > 8307352: AARCH64: Improve itable_stub I wonder if you can get rid of `temp_reg3` (and the `holder_offset` argument of `MacroAssembler::lookup_interface_method_stub()`) by defining `resolved_klass_reg` as `r17` in `VtableStubs::create_itable_stub()` and defining `Register holder_offset = method_result` as local variable in `MacroAssembler::lookup_interface_method_stub()` (the usages of `holder_offset` and `method_result` don't seem to overlap). Otherwise looks good. ------------- PR Review: https://git.openjdk.org/jdk/pull/13792#pullrequestreview-1526166815 From simonis at openjdk.org Wed Jul 12 11:49:21 2023 From: simonis at openjdk.org (Volker Simonis) Date: Wed, 12 Jul 2023 11:49:21 GMT Subject: RFR: 8307352: AARCH64: Improve itable_stub [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 11:26:41 GMT, Evgeny Astigeevich wrote: >> Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains one additional commit since the last revision: >> >> 8307352: AARCH64: Improve itable_stub > > src/hotspot/cpu/aarch64/vtableStubs_aarch64.cpp line 200: > >> 198: temp_reg, temp_reg2, temp_reg3, itable_index, L_no_such_interface); >> 199: >> 200: const ptrdiff_t lookupSize = __ pc() - start_pc; > > I think you can rename `lookupSize` into `codesize`. Agree. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13792#discussion_r1261058060 From coleenp at openjdk.org Wed Jul 12 12:17:31 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 12 Jul 2023 12:17:31 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers [v4] In-Reply-To: References: Message-ID: > Please review changes to fix -Wconversion warnings that come from assembler_.cpp by adding narrow_casts to the emit_int8,16,24, and 32 functions. And some other fixups with checked_cast. > > Ran tier1 on Oracle platforms, and tier1-4 on linux-x64-debug, linux-aarch64-debug, windows-x64-debug. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Fix indentation, thanks for pointing that out @aph. 
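A minimal sketch of the checked-narrowing pattern discussed in this thread, for readers unfamiliar with it. This is not the HotSpot `checked_cast` or `narrow_cast` code; the names `fits_in`, `narrow_or_assert` and `emit_u8` are hypothetical and exist only for this illustration. The idea is to make each narrowing conversion explicit, so that -Wconversion stays quiet, while debug builds still catch values that do not fit.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical helper (not HotSpot code): true if the integer value v is
// exactly representable in the integer type To.
template <typename To, typename From>
constexpr bool fits_in(From v) {
  static_assert(std::is_integral<To>::value && std::is_integral<From>::value,
                "integer types only");
  // The value must survive the round trip and keep its sign.
  return static_cast<From>(static_cast<To>(v)) == v &&
         ((v < From(0)) == (static_cast<To>(v) < To(0)));
}

// Hypothetical checked narrowing: asserts in debug builds, then converts
// with an explicit cast so the compiler sees no implicit narrowing.
template <typename To, typename From>
To narrow_or_assert(From v) {
  assert(fits_in<To>(v) && "value does not fit in the narrower type");
  return static_cast<To>(v);
}

// Hypothetical emitter: appends one unsigned byte to a code buffer.
inline void emit_u8(std::vector<std::uint8_t>& buf, int value) {
  buf.push_back(narrow_or_assert<std::uint8_t>(value));
}
```

Callers keep passing `int` values, and the single cast inside the emitter is the only place where narrowing happens.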
------------- Changes: - all: https://git.openjdk.org/jdk/pull/14822/files - new: https://git.openjdk.org/jdk/pull/14822/files/71c8fdec..c3c41267 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14822&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14822&range=02-03 Stats: 8 lines in 1 file changed: 0 ins; 1 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/14822.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14822/head:pull/14822 PR: https://git.openjdk.org/jdk/pull/14822 From jpbempel at openjdk.org Wed Jul 12 12:44:13 2023 From: jpbempel at openjdk.org (Jean-Philippe Bempel) Date: Wed, 12 Jul 2023 12:44:13 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 00:34:59 GMT, David Holmes wrote: >> Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in a unresolved state. So the constant pool merging process was considering the entry with pre-resolved class as different compared to the destination and create a new entry. We now try to consider it as equal specially for Methodref/Fieldref. > > src/hotspot/share/oops/constantPool.cpp line 1361: > >> 1359: int recur2 = cp2->uncached_klass_ref_index_at(index2); >> 1360: bool match = compare_entry_to(recur1, cp2, recur2); >> 1361: match |= is_unresolved_class_mismatch(recur1, cp2, recur2); > > This is changing the definition of a "match" such that in all cases an unresolved class entry is considered equal to a resolved class entry - but only for these Field and MethodRefs. I don't see how this connects back to the original problem. Nor do I see why this is alway a correct thing to do. I agree that it may be not the right place to call is_unresolved_class_mismatch method because we capture all cases calling compare_entry_to. 
For a JVM_CONSTANT_Class, a mismatch reported by compare_entry_to is recovered by the call to is_unresolved_class_mismatch in VM_RedefineClasses::merge_constant_pools. The idea is to do the same for other references to the class in FieldRefs and (Interface)MethodRefs, because otherwise the entry does not match the pre-resolved Throwable class. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14780#discussion_r1261120394 From eastigeevich at openjdk.org Wed Jul 12 13:37:14 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Wed, 12 Jul 2023 13:37:14 GMT Subject: RFR: 8307352: AARCH64: Improve itable_stub [v2] In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 05:37:35 GMT, Boris Ulasevich wrote: >> This is a change for AARCH similar to https://github.com/openjdk/jdk/pull/13460 >> >> The change replaces two separate iterations over the itable with a new algorithm consisting of two loops. First, we look for a match with resolved_klass, checking for a match with holder_klass along the way. Then we continue iterating (not starting over) the itable using the second loop, checking only for a match with holder_klass.
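The two-loop scan described in the quoted paragraph can be sketched in plain C++ as follows. This is an illustration under stated assumptions, not the AArch64 stub: `Klass`, `ItableEntry` and `lookup_holder_offset` are hypothetical stand-ins, and the itable is modelled as a `std::vector` rather than the in-memory table the stub walks.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Klass {};  // hypothetical stand-in for a class descriptor

struct ItableEntry {          // hypothetical itable row
  const Klass* interface_klass;
  int offset;                 // offset of that interface's method block
};

// Returns the offset recorded for `holder`, or -1 if either `resolved`
// or `holder` is missing from the table.
int lookup_holder_offset(const std::vector<ItableEntry>& itable,
                         const Klass* resolved, const Klass* holder) {
  int holder_offset = -1;
  std::size_t i = 0;
  // Loop 1: search for resolved, noting holder if we pass it on the way.
  for (; i < itable.size(); ++i) {
    if (itable[i].interface_klass == holder) holder_offset = itable[i].offset;
    if (itable[i].interface_klass == resolved) break;
  }
  if (i == itable.size()) return -1;           // resolved is not implemented
  if (holder_offset >= 0) return holder_offset;
  // Loop 2: continue from the next entry (no restart), holder only.
  for (++i; i < itable.size(); ++i) {
    if (itable[i].interface_klass == holder) return itable[i].offset;
  }
  return -1;                                   // holder is not implemented
}
```

Each entry is visited at most once, which is the difference from running two full scans one after the other.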
>> >> InterfaceCalls openjdk benchmark performance results on A53, A72, Neoverse N1 and V1 micro-architectures: >> >> >> Cortex-A53 (Pi 3 Model B Rev 1.2) >> >> test1stInt2Types 37.5 37.358 0.38 >> test1stInt3Types 160.166 148.04 8.19 >> test1stInt5Types 158.131 147.955 6.88 >> test2ndInt2Types 52.634 53.291 -1.23 >> test2ndInt3Types 201.39 181.603 10.90 >> test2ndInt5Types 195.722 176.707 10.76 >> testIfaceCall 157.453 140.498 12.07 >> testIfaceExtCall 175.46 154.351 13.68 >> testMonomorphic 32.052 32.039 0.04 >> AVG: 6.85 >> >> Cortex-A72 (Pi 4 Model B Rev 1.2) >> >> test1stInt2Types 27.4796 27.4738 0.02 >> test1stInt3Types 66.0085 64.9374 1.65 >> test1stInt5Types 67.9812 66.2316 2.64 >> test2ndInt2Types 32.0581 32.062 -0.01 >> test2ndInt3Types 68.2715 65.6643 3.97 >> test2ndInt5Types 68.1012 65.8024 3.49 >> testIfaceCall 64.0684 64.1811 -0.18 >> testIfaceExtCall 91.6226 81.5867 12.30 >> testMonomorphic 26.7161 26.7142 0.01 >> AVG: 2.66 >> >> Neoverse N1 (m6g.metal) >> >> test1stInt2Types 2.9104 2.9086 0.06 >> test1stInt3Types 10.9642 10.2909 6.54 >> test1stInt5Types 10.9607 10.2856 6.56 >> test2ndInt2Types 3.3410 3.3478 -0.20 >> test2ndInt3Types 12.3291 11.3089 9.02 >> test2ndInt5Types 12.328 11.2704 9.38 >> testIfaceCall 11.0598 10.3657 6.70 >> testIfaceExtCall 13.0692 11.2826 15.84 >> testMonomorphic 2.2354 2.2341 0.06 >> AVG: 6.00 >> >> Neoverse V1 (c7g.2xlarge) >> >> test1stInt2Types 2.2317 2.2320 -0.01 >> test1stInt3Types 6.6884 6.1911 8.03 >> test1stInt5Types 6.7334 6.2193 8.27 >> test2ndInt2Types 2.4002 2.4013 -0.04 >> test2ndInt3Types 7.9603 7.0372 13.12 >> test2ndInt5Types 7.9532 7.0474 12.85 >> testIfaceCall 6.7028 6.3272 5.94 >> testIfaceExtCall 8.3253 6.941... > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains one commit: > > 8307352: AARCH64: Improve itable_stub src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1283: > 1281: bind(L_loop_scan_resolved); > 1282: ldr(temp_itbl_klass, Address(pre(scan_temp, scan_step))); > 1283: bind(L_loop_scan_resolved_entry); Is the indentation of `bind` correct? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13792#discussion_r1261189923 From dnsimon at openjdk.org Wed Jul 12 13:56:31 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 12 Jul 2023 13:56:31 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests Message-ID: This PR adds support for a `vm.libgraal.enabled` property that can be used with the jtreg `@requires` tag to determine if the VM under test is using libgraal as the JIT. ------------- Commit messages: - add support for vm.libgraal.enabled property Changes: https://git.openjdk.org/jdk/pull/14851/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14851&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311946 Stats: 23 lines in 3 files changed: 23 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14851.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14851/head:pull/14851 PR: https://git.openjdk.org/jdk/pull/14851 From jpbempel at openjdk.org Wed Jul 12 14:04:04 2023 From: jpbempel at openjdk.org (Jean-Philippe Bempel) Date: Wed, 12 Jul 2023 14:04:04 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform In-Reply-To: References: <2IK4-mDcKnsZr6krAGeX1Yi_8WmTL9TMB5JCovHhKCU=.6b24c922-a875-4320-a0b4-62521dd36909@github.com> Message-ID: <2h3nnsUGPKr-m57732KEyubp_ogkTtE_QTpjy2c81lc=.6f86e032-9b37-4178-9298-926502b1cdb2@github.com> On Fri, 7 Jul 2023 00:46:09 GMT, David Holmes wrote: > these comments both pertain to the problem at hand: the merge reverts the resolved class entry to be unresolved; unresolved entries are not considered matches for resolved ones, hence we grow the constant pool. 
And if I am reading this correctly from `VM_RedefineClasses::merge_constant_pools`:

    bool match = scratch_cp->compare_entry_to(scratch_i, *merge_cp_p, scratch_i);
    if (match) {
      // found a match at the same index so nothing more to do
      continue;
    } else if (is_unresolved_class_mismatch(scratch_cp, scratch_i, *merge_cp_p, scratch_i)) {
      // The mismatch in compare_entry_to() above is because of a
      // resolved versus unresolved class entry at the same index
      // with the same string value. Since Pass 0 reverted any
      // class entries to unresolved class entries in *merge_cp_p,
      // we go with the unresolved class entry.
      continue;
    }

In the case of `JVM_CONSTANT_Class`/`JVM_CONSTANT_UnresolvedClass` this is handled specially with is_unresolved_class_mismatch. But the same problem arises again when dealing with `JVM_CONSTANT_Methodref` et al., because the entry in the scratch pool also refers to an unresolved class, while in old_cp the class is resolved. The idea behind my fix proposal is to do the same as for `JVM_CONSTANT_Class`. But I can understand your concern that introducing this in a method that could be used outside of the class redefinition code is worrisome. I can try to find another way to do the same, but specifically in the context of class redefinition.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14780#issuecomment-1632587401 From rkennke at openjdk.org Wed Jul 12 14:14:46 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Wed, 12 Jul 2023 14:14:46 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v50] In-Reply-To: References: Message-ID: > See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. > > Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements.
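The gap described above comes down to simple offset arithmetic, sketched here. The layout numbers are assumptions for a 64-bit VM (8-byte mark word, 4-byte length field, 8-byte words), and `array_base_offset` is a hypothetical helper that aligns the base to the element size as a simplification; it is not taken from the patch.

```cpp
#include <cassert>

// Round value up to a power-of-two alignment.
constexpr int align_up(int value, int alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

// Assumed 64-bit layout constants (illustrative only).
constexpr int kMarkWordBytes = 8;
constexpr int kLengthFieldBytes = 4;
constexpr int kWordBytes = 8;

// Hypothetical helper: offset of element 0. The header is mark word,
// klass field (8 bytes uncompressed, 4 compressed, 0 if absent), length.
constexpr int array_base_offset(int klass_field_bytes, int element_bytes,
                                bool word_aligned_base) {
  return align_up(kMarkWordBytes + klass_field_bytes + kLengthFieldBytes,
                  word_aligned_base ? kWordBytes : element_bytes);
}
```

With an uncompressed 8-byte klass field the header ends at offset 20, so a word-aligned base lands at 24 and leaves 4 unused bytes, whereas a byte or int array could start at 20. With a compressed 4-byte klass field the header already ends at 16 and there is no gap.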
> > Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. > > Testing: > - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) > - [x] tier1 (x86_64, x86_32, aarch64, riscv) > - [x] tier2 (x86_64, aarch64, riscv) > - [x] tier3 (x86_64, riscv) Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: RISCV fixes by @RealYFang ------------- Changes: - all: https://git.openjdk.org/jdk/pull/11044/files - new: https://git.openjdk.org/jdk/pull/11044/files/89fd83f5..b54cd4bf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=49 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=11044&range=48-49 Stats: 22 lines in 1 file changed: 7 ins; 9 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/11044.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11044/head:pull/11044 PR: https://git.openjdk.org/jdk/pull/11044 From jpbempel at openjdk.org Wed Jul 12 14:17:17 2023 From: jpbempel at openjdk.org (Jean-Philippe Bempel) Date: Wed, 12 Jul 2023 14:17:17 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform In-Reply-To: References: Message-ID: <9qUwwSSO-rpcOlcgXyydiX697xEMZ2qKVRmTgkbX9ts=.9b9f987e-cfeb-4b46-a923-885a67e25f97@github.com> On Tue, 11 Jul 2023 01:03:29 GMT, Coleen Phillimore wrote: > ConstantPool merging is still really complicated and JVM_CONSTANT_Class/UnresolvedClass (and UnresolvedClassInError) may have changed to not track with what RedefineClasses does. If I'm reading this correctly, we are always calling find_matching_entry() and compare_entries_to() between the merged_cp and scratch_cp. Since we unresolve classes in merge_cp, and scratch_cp hasn't resolved anything yet, how are we getting this mismatch? 
Can you post a stack trace for us? I have tried to explain this in my comment to David above. Tell me if anything is still unclear. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14780#issuecomment-1632608966 From matsaave at openjdk.org Wed Jul 12 15:08:41 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 12 Jul 2023 15:08:41 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v8] In-Reply-To: References: Message-ID: > 8301996: Move field resolution information out of the cpCache Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: RISCV port ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14129/files - new: https://git.openjdk.org/jdk/pull/14129/files/733860f3..69c695d1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=06-07 Stats: 185 lines in 3 files changed: 101 ins; 36 del; 48 mod Patch: https://git.openjdk.org/jdk/pull/14129.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14129/head:pull/14129 PR: https://git.openjdk.org/jdk/pull/14129 From cslucas at openjdk.org Wed Jul 12 15:12:23 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 12 Jul 2023 15:12:23 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v21] In-Reply-To: References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: On Thu, 6 Jul 2023 13:06:30 GMT, Cesar Soares Lucas wrote: >> Can I please get reviews for this PR? >> >> The most common and frequent use of NonEscaping Phis merging object allocations is for debugging information. The two graphs below show numbers for Renaissance and DaCapo benchmarks - similar results are obtained for all other applications that I tested.
>> >> With what frequency does each IR node type occurs as an allocation merge user? I.e., if the same node type uses a Phi N times the counter is incremented by N: >> >> ![image](https://user-images.githubusercontent.com/2249648/222280517-4dcf5871-2564-4207-b49e-22aee47fa49d.png) >> >> What are the most common users of allocation merges? I.e., if the same node type uses a Phi N times the counter is incremented by 1: >> >> ![image](https://user-images.githubusercontent.com/2249648/222280608-ca742a4e-1622-4e69-a778-e4db6805ea02.png) >> >> This PR adds support scalar replacing allocations participating in merges used as debug information OR as a base for field loads. I plan to create subsequent PRs to enable scalar replacement of merges used by other node types (CmpP is next on the list) subsequently. >> >> The approach I used for _rematerialization_ is pretty straightforward. It consists basically of the following. 1) New IR node (suggested by V. Kozlov), named SafePointScalarMergeNode, to represent a set of SafePointScalarObjectNode; 2) Each scalar replaceable input participating in a merge will get a SafePointScalarObjectNode like if it weren't part of a merge. 3) Add a new Class to support the rematerialization of SR objects that are part of a merge; 4) Patch HotSpot to be able to serialize and deserialize debug information related to allocation merges; 5) Patch C2 to generate unique types for SR objects participating in some allocation merges. >> >> The approach I used for _enabling the scalar replacement of some of the inputs of the allocation merge_ is also pretty straightforward: call `MemNode::split_through_phi` to, well, split AddP->Load* through the merge which will render the Phi useless. >> >> I tested this with JTREG tests tier 1-4 (Windows, Linux, and Mac) and didn't see regression. I also experimented with several applications and didn't see any failure. 
I also ran tests with "-ea -esa -Xbatch -Xcomp -XX:+UnlockExperimentalVMOptions -XX:-TieredCompilation -server -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions -XX:+StressLCM -XX:+StressGCM -XX:+StressCCP" and didn't observe any related failures. > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 22 commits: > > - Merge branch 'openjdk:master' into rematerialization-of-merges > - Addressing PR feedback. > - Merge remote-tracking branch 'origin/master' into rematerialization-of-merges > - Merge branch 'openjdk:master' into rematerialization-of-merges > - Rome minor refactorings. > - Merge remote-tracking branch 'origin/master' into rematerialization-of-merges > Catching up with master. > - Address PR review 6: debug format output & some refactoring. > - Catching up with master branch. > > Merge remote-tracking branch 'origin/master' into rematerialization-of-merges > - Address PR review 6: refactoring around rematerialization & improve test cases. > - Address PR review 5: refactor on rematerialization & add tests. > - ... and 12 more: https://git.openjdk.org/jdk/compare/97e99f01...25b683d6 Thank you all for reviewing this PR! Your feedback made it much better! ------------- PR Comment: https://git.openjdk.org/jdk/pull/12897#issuecomment-1632721475 From matsaave at openjdk.org Wed Jul 12 15:18:02 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 12 Jul 2023 15:18:02 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v8] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 15:08:41 GMT, Matias Saavedra Silva wrote: >> 8301996: Move field resolution information out of the cpCache > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > RISCV port > > > It looks like I misunderstood what `tbz` does! 
I believe you are correct in suggesting that the `andr` is not necessary. > > > > > > Hi, @matias9927, Thanks for update! We have already done the adaptation for RISC-V locally and we are currently testing. I will update the test results and give the corresponding patch later. > > Hi, @zifeihan and I have finished the RISC-V part and tier1-3 looks good. Please help us to add the RISC-V part, thanks a lot! [DingliZhang at a14433d](https://github.com/DingliZhang/jdk/commit/a14433d37c908362982bf2571a62f42bc236e7d8) on this branch: https://github.com/DingliZhang/jdk/tree/pr-14129-rebase Thank you for your contribution! I really appreciate the help. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14129#issuecomment-1632732076 From stuefe at openjdk.org Wed Jul 12 15:20:12 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 12 Jul 2023 15:20:12 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v3] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 19:13:01 GMT, Ashutosh Mehra wrote: > > What would be needed to make the Annotations appear in the "printall" command? I was somehow expecting to see at least something like "Annotation at xxxx". > > I am not sure what all details `printall` is expected to emit out. Looking at the code, printall doesn't seem to use ClassWriter. It uses HTMLGenerator to format the method data. I can emit something like "Annotation at xxxx" but it would be more useful if it can display the contents of the annotations. Unfortunately HTMLGenerator doesn't understand Annotations at all. Probably it is better left for another task. No problem at all. I only thought about this because it would have made a regression test very easy (there is a jtreg test that calls printall and then parses the output).
------------- PR Comment: https://git.openjdk.org/jdk/pull/14735#issuecomment-1632735373 From iklam at openjdk.org Wed Jul 12 16:42:13 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 12 Jul 2023 16:42:13 GMT Subject: RFR: 8308903: Print detailed info for Java objects in -Xlog:cds+map In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 05:51:58 GMT, Thomas Stuefe wrote: > > The output looks like oopDesc::print_on(tty), but we need to print the pointers using the locations of the objects at runtime. > > > > I don't understand. Are you not experimenting with a new allocation mode that would basically make the runtime address unpredictable? Or is this still for the old way of things where the heap archive is mapped as-is ? > > > > Other than that, this looks very useful. This map file is useful for both the existing and experimental loading code. My experimental loading code uses the same format for input. I partially wrote this PR to test the new loading code, which assumes contiguous objects in the input stream. So at runtime, I just need to look at the lower 3 digits, which would be the same as in the map file. The map file also shows how the objects are grouped. For example, I am reordering the objects so that the first 50% of them don't need relocation. That can be verified with the map file. Also, even with the new loading approach, we could have a future optimization where the GC knows that we are still in the archive loading stage, and can return contiguous objects that start with the requested address. This way, we can avoid relocation. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14841#issuecomment-1632868383 From jiangli at openjdk.org Wed Jul 12 16:54:28 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Wed, 12 Jul 2023 16:54:28 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v3] In-Reply-To: References: Message-ID: > Move StringTable to JavaClassFile namespace. 
Jiangli Zhou has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge branch 'master' into JDK-8311661 - - Change to use a hotspot global namespace, based on the discussions in this PR thread. The namespace is 'hotspot_jvm', which may have less chance of collisions than 'hotspot'. - Use 'using' directive in precompiled.hpp, as suggested by others in the PR comments. - Merge branch 'master' into JDK-8311661 - Move '} // namespace JavaClassFile' to after '#endif //INCLUDE_CDS_JAVA_HEAP'. - 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14808/files - new: https://git.openjdk.org/jdk/pull/14808/files/c31ad972..3385b743 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14808&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14808&range=01-02 Stats: 1783 lines in 118 files changed: 1218 ins; 333 del; 232 mod Patch: https://git.openjdk.org/jdk/pull/14808.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14808/head:pull/14808 PR: https://git.openjdk.org/jdk/pull/14808 From kvn at openjdk.org Wed Jul 12 18:22:12 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 12 Jul 2023 18:22:12 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests In-Reply-To: References: Message-ID: <1WwNYGuJqlY5sywpgtpC9mzaWfZ-gdbF4-WgLukMe0I=.96ba2e7b-edbc-4b32-b45a-b3446d8403ec@github.com> On Wed, 12 Jul 2023 13:48:23 GMT, Doug Simon wrote: > This PR adds support for a `vm.libgraal.enabled` property that can be used with the jtreg `@requires` tag to determine if the VM under test is using libgraal as the JIT. 
test/lib/jdk/test/whitebox/code/Compiler.java line 103: > 101: return false; > 102: } > 103: Boolean useJvmciNativeLibrary = WB.getBooleanVMFlag("UseJVMCICompiler"); Typo: "UseJVMCICompiler". Should be "UseJVMCINativeLibrary". ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14851#discussion_r1261565058 From duke at openjdk.org Wed Jul 12 18:48:42 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 12 Jul 2023 18:48:42 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v4] In-Reply-To: References: Message-ID: > Please review this PR that enables ClassWriter to write annotations to the class file being dumped. > > The fields annotations are stored in `Annotations::_fields_annotations` which is of type `Array*>`. There is no class in SA that can represent it. I have added ArrayOfU1Array to correspond to the type `Array*>` and it works. I believe there are better approaches but that would require a bit more refactoring of the classes representing Array types. I will leave that for future work for now. > > Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` > Tested it manually by dumping j.l.String class and comparing the annotations with the original class using javap. > The test case mentioned in [JDK-8311101](https://bugs.openjdk.org/browse/JDK-8311101) would provide better coverage. 
Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: More review comments Signed-off-by: Ashutosh Mehra ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14735/files - new: https://git.openjdk.org/jdk/pull/14735/files/1d79e734..f21b53ab Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14735&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14735&range=02-03 Stats: 20 lines in 2 files changed: 10 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/14735.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14735/head:pull/14735 PR: https://git.openjdk.org/jdk/pull/14735 From duke at openjdk.org Wed Jul 12 18:48:44 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 12 Jul 2023 18:48:44 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v4] In-Reply-To: References: <4S1x7PphEjaz7kCe2uJTmFyaIccpEIn9fh52Sr4neMg=.6a3dac9e-feb0-459e-8972-e7cdbb48855e@github.com> Message-ID: On Tue, 11 Jul 2023 20:28:29 GMT, Chris Plummer wrote: >> I think VMObjectFactory is a better place to implement the caching behavior so that all such patterns can benefit from it. I think it is better addressed in another task. > > I think maybe you misunderstood what I meant by "cache". I'm not talking about a hashmap of weak references or anything like that. Just add a `ArrayOfU1Array annotationsArray` field to the Annotations object and store the result there. Got it. Updated the code as suggested. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1261593029 From duke at openjdk.org Wed Jul 12 18:48:48 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 12 Jul 2023 18:48:48 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v3] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 20:33:59 GMT, Chris Plummer wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> Some code motion and factoring out common code >> >> Signed-off-by: Ashutosh Mehra > > src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/oops/ConstMethod.java line 641: > >> 639: // | | >> 640: // V V >> 641: // | ... | default | type | parameter | method | > > So the `...` part is represented by `getSize()`? It would be good to call that out. Also point out that each of the annotations pointers is optional. Updated the comment. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14735#discussion_r1261592735 From coleenp at openjdk.org Wed Jul 12 19:00:21 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Wed, 12 Jul 2023 19:00:21 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 05:18:01 GMT, Jean-Philippe Bempel wrote: > Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in a unresolved state. So the constant pool merging process was considering the entry with pre-resolved class as different compared to the destination and create a new entry. We now try to consider it as equal specially for Methodref/Fieldref. Hi, I now see why there's a resolved class in the scratch constant pool (ie the constant pool for the new class bytes). 
It looks like javac has preresolved it for us, which you said but I didn't know why. The merged constant pool always has an unresolved class copy of that class entry, because we create the merged constant pool with all unresolved classes. I now believe that ConstantPool::compare_entry_to should always compare class and resolved class as equivalent. Since the klass_name_at() function is the same call for both unresolved and resolved class, you could change the tag at the beginning of compare_entry_to() to JVM_CONSTANT_UnresolvedClass if it's resolved, like we do for error classes. Then remove the is_unresolved_class_mismatch function entirely (the comment at that call might be useful to explain why you're changing the tag in compare_entry_to). This code is only used for RedefineClasses, but even if it wasn't this comparison is the right thing to do. For special measure, you could #if INCLUDE_JVMTI around this and the operand functions that call it or file an RFE to do so for further cleanup (compare_entry_to, find_matching_entry, find_matching_operand, compare_operand_to). ------------- PR Comment: https://git.openjdk.org/jdk/pull/14780#issuecomment-1633051058 From jiangli at openjdk.org Wed Jul 12 19:29:31 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Wed, 12 Jul 2023 19:29:31 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v4] In-Reply-To: References: Message-ID: <3mXE9RE6cj-apUgKbUvy0yxneWeSXHmMURj0JMhZKxw=.0a532743-5258-47c2-84bf-590f0a2660cc@github.com> > Move StringTable to JavaClassFile namespace. Jiangli Zhou has updated the pull request incrementally with one additional commit since the last revision: Add hotspot_jvm:: to StringTable in javaClasses.hpp. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/14808/files - new: https://git.openjdk.org/jdk/pull/14808/files/3385b743..db681ecf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14808&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14808&range=02-03 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14808.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14808/head:pull/14808 PR: https://git.openjdk.org/jdk/pull/14808 From dnsimon at openjdk.org Wed Jul 12 19:46:19 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 12 Jul 2023 19:46:19 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v2] In-Reply-To: References: Message-ID: > This PR adds support for a `vm.libgraal.enabled` property that can be used with the jtreg `@requires` tag to determine if the VM under test is using libgraal as the JIT. Doug Simon has updated the pull request incrementally with one additional commit since the last revision: fix wrong flag name ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14851/files - new: https://git.openjdk.org/jdk/pull/14851/files/38d29d35..c0cbca79 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14851&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14851&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14851.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14851/head:pull/14851 PR: https://git.openjdk.org/jdk/pull/14851 From dnsimon at openjdk.org Wed Jul 12 19:46:21 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 12 Jul 2023 19:46:21 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v2] In-Reply-To: <1WwNYGuJqlY5sywpgtpC9mzaWfZ-gdbF4-WgLukMe0I=.96ba2e7b-edbc-4b32-b45a-b3446d8403ec@github.com> References: <1WwNYGuJqlY5sywpgtpC9mzaWfZ-gdbF4-WgLukMe0I=.96ba2e7b-edbc-4b32-b45a-b3446d8403ec@github.com> Message-ID: On Wed, 
12 Jul 2023 18:18:49 GMT, Vladimir Kozlov wrote: >> Doug Simon has updated the pull request incrementally with one additional commit since the last revision: >> >> fix wrong flag name > > test/lib/jdk/test/whitebox/code/Compiler.java line 103: > >> 101: return false; >> 102: } >> 103: Boolean useJvmciNativeLibrary = WB.getBooleanVMFlag("UseJVMCICompiler"); > > Typo: "UseJVMCICompiler". Should be "UseJVMCINativeLibrary". That's embarrassing! Thanks for catching it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14851#discussion_r1261650544 From kvn at openjdk.org Wed Jul 12 21:21:03 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 12 Jul 2023 21:21:03 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v2] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 19:46:19 GMT, Doug Simon wrote: >> This PR adds support for a `vm.libgraal.enabled` property that can be used with the jtreg `@requires` tag to determine if the VM under test is using libgraal as the JIT. > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > fix wrong flag name Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14851#pullrequestreview-1527244691 From dholmes at openjdk.org Wed Jul 12 21:37:02 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 12 Jul 2023 21:37:02 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 18:57:14 GMT, Coleen Phillimore wrote: > It looks like javac has preresolved it for us, which you said but I didn't know why. The preresolving is done by the verifier. 
https://bugs.openjdk.org/browse/JDK-8308762?focusedCommentId=14584707&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14584707 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14780#issuecomment-1633242164 From jiangli at openjdk.org Wed Jul 12 23:10:56 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Wed, 12 Jul 2023 23:10:56 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2] In-Reply-To: <0J49MlUO8xVYipLytP2Fzd9KNWZfd0J3fge92B0Mq3g=.388bfe76-2854-466b-a64c-3a93d00b168a@github.com> References: <5cl-bjjTvbpCd2yM0xRcarMllvDMqyapqD2ec2_Z_mc=.205b5d59-6bd5-4978-873b-49f32bf458a3@github.com> <0J49MlUO8xVYipLytP2Fzd9KNWZfd0J3fge92B0Mq3g=.388bfe76-2854-466b-a64c-3a93d00b168a@github.com> Message-ID: On Wed, 12 Jul 2023 01:26:08 GMT, Jiangli Zhou wrote: > Add `using ` to precompiled.hpp does help avoid adding `::` in many places. We still need to put the implementation code inside `namespace { ...}`, or use `::`. I experimented with StringTable and only needed to edit stringTable.* and precompiled.hpp. I tested with and without `--disable-precompiled-headers` and both built ok. src/hotspot/share/classfile/javaClasses.hpp also needs a change, otherwise it fails to build on Windows. I updated the PR with suggestions incorporated. Please take a look at the updated version, thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1633318213 From cjplummer at openjdk.org Wed Jul 12 23:23:54 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 12 Jul 2023 23:23:54 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v4] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 18:48:42 GMT, Ashutosh Mehra wrote: >> Please review this PR that enables ClassWriter to write annotations to the class file being dumped. >> >> The fields annotations are stored in `Annotations::_fields_annotations` which is of type `Array<Array<u1>*>`.
There is no class in SA that can represent it. I have added ArrayOfU1Array to correspond to the type `Array<Array<u1>*>` and it works. I believe there are better approaches but that would require a bit more refactoring of the classes representing Array types. I will leave that for future work for now. >> >> Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` >> Tested it manually by dumping j.l.String class and comparing the annotations with the original class using javap. >> The test case mentioned in [JDK-8311101](https://bugs.openjdk.org/browse/JDK-8311101) would provide better coverage. > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > More review comments > > Signed-off-by: Ashutosh Mehra Marked as reviewed by cjplummer (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14735#pullrequestreview-1527362780 From cjplummer at openjdk.org Thu Jul 13 00:53:18 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Thu, 13 Jul 2023 00:53:18 GMT Subject: RFR: 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach [v2] In-Reply-To: References: Message-ID: <8StbT0fyZV89YHANACgI2Lew5kde0fO7TRlI-IP-6h4=.407bf7a1-504f-4fdb-ba1a-965f6741b127@github.com> On Wed, 12 Jul 2023 08:01:55 GMT, Serguei Spitsyn wrote: >> This is an issue with a dynamic load of a JVMTI agent into running VM. >> The `VM_SetNotifyJvmtiEventsMode` enabling operation makes a call to the function `count_transitions_and_correct_jvmti_thread_states()`. This function in its turn make a call to the function `correct_jvmti_thread_state()`. But it does it conditionally, only if the field `_whitebox_used` is `true`. >> The test provided in the bug report showed that it has to be called unconditionally as the assumption that it is only needed on the subsequent `notifyJvmti` enabling is incorrect. >> >> Then the field `_whitebox_used` is not needed anymore and removed in this fix.
>> Some obsolete comments are removed or updated. >> >> New test is added: `test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest`. >> It is failed without the fix and passed with the fix. >> >> Testing: >> - ran new test `test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest` >> - mach5 tiers 1-5 are good > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: removed unneeded @compile commands from new test Looks good other than one minor typo. test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest/VThreadTLSTest.java line 68: > 66: return; > 67: } > 68: for (int repetion = 0; repetion < 10; repetion++) { repetion -> repetition ------------- Marked as reviewed by cjplummer (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14842#pullrequestreview-1527411302 PR Review Comment: https://git.openjdk.org/jdk/pull/14842#discussion_r1261855278 From sspitsyn at openjdk.org Thu Jul 13 01:28:57 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 13 Jul 2023 01:28:57 GMT Subject: RFR: 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach [v3] In-Reply-To: References: Message-ID: > This is an issue with a dynamic load of a JVMTI agent into running VM. > The `VM_SetNotifyJvmtiEventsMode` enabling operation makes a call to the function `count_transitions_and_correct_jvmti_thread_states()`. This function in its turn make a call to the function `correct_jvmti_thread_state()`. But it does it conditionally, only if the field `_whitebox_used` is `true`. > The test provided in the bug report showed that it has to be called unconditionally as the assumption that it is only needed on the subsequent `notifyJvmti` enabling is incorrect. > > Then the field `_whitebox_used` is not needed anymore and removed in this fix. > Some obsolete comments are removed or updated. 
> > New test is added: `test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest`. > It is failed without the fix and passed with the fix. > > Testing: > - ran new test `test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest` > - mach5 tiers 1-5 are good Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: renamed incorrectly spelled local variable ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14842/files - new: https://git.openjdk.org/jdk/pull/14842/files/743188b5..0a86e71d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14842&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14842&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14842.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14842/head:pull/14842 PR: https://git.openjdk.org/jdk/pull/14842 From sspitsyn at openjdk.org Thu Jul 13 01:29:01 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 13 Jul 2023 01:29:01 GMT Subject: RFR: 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach [v2] In-Reply-To: <8StbT0fyZV89YHANACgI2Lew5kde0fO7TRlI-IP-6h4=.407bf7a1-504f-4fdb-ba1a-965f6741b127@github.com> References: <8StbT0fyZV89YHANACgI2Lew5kde0fO7TRlI-IP-6h4=.407bf7a1-504f-4fdb-ba1a-965f6741b127@github.com> Message-ID: On Thu, 13 Jul 2023 00:48:46 GMT, Chris Plummer wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: removed unneeded @compile commands from new test > > test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest/VThreadTLSTest.java line 68: > >> 66: return; >> 67: } >> 68: for (int repetion = 0; repetion < 10; repetion++) { > > repetion -> repetition Thanks! Renamed it to `count`. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14842#discussion_r1261873276 From sspitsyn at openjdk.org Thu Jul 13 01:59:26 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 13 Jul 2023 01:59:26 GMT Subject: Integrated: 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 05:02:53 GMT, Serguei Spitsyn wrote: > This is an issue with a dynamic load of a JVMTI agent into running VM. > The `VM_SetNotifyJvmtiEventsMode` enabling operation makes a call to the function `count_transitions_and_correct_jvmti_thread_states()`. This function in its turn make a call to the function `correct_jvmti_thread_state()`. But it does it conditionally, only if the field `_whitebox_used` is `true`. > The test provided in the bug report showed that it has to be called unconditionally as the assumption that it is only needed on the subsequent `notifyJvmti` enabling is incorrect. > > Then the field `_whitebox_used` is not needed anymore and removed in this fix. > Some obsolete comments are removed or updated. > > New test is added: `test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest`. > It is failed without the fix and passed with the fix. > > Testing: > - ran new test `test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest` > - mach5 tiers 1-5 are good This pull request has now been integrated. 
Changeset: 11a5115c Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/11a5115caf179a1bbed5311e12ed3851e026c5c5 Stats: 196 lines in 3 files changed: 181 ins; 13 del; 2 mod 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach Reviewed-by: lmesnik, cjplummer ------------- PR: https://git.openjdk.org/jdk/pull/14842 From sspitsyn at openjdk.org Thu Jul 13 03:48:35 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 13 Jul 2023 03:48:35 GMT Subject: [jdk21] RFR: 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach Message-ID: Clean backport from mainline jdk repo to jdk21 for the fix of: [8311556](https://bugs.openjdk.org/browse/JDK-8311556): GetThreadLocalStorage not working for vthreads mounted during JVMTI attach Testing: - TBD: mach5 tiers 1-5 ------------- Commit messages: - Backport 11a5115caf179a1bbed5311e12ed3851e026c5c5 Changes: https://git.openjdk.org/jdk21/pull/117/files Webrev: https://webrevs.openjdk.org/?repo=jdk21&pr=117&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311556 Stats: 196 lines in 3 files changed: 181 ins; 13 del; 2 mod Patch: https://git.openjdk.org/jdk21/pull/117.diff Fetch: git fetch https://git.openjdk.org/jdk21.git pull/117/head:pull/117 PR: https://git.openjdk.org/jdk21/pull/117 From dlong at openjdk.org Thu Jul 13 04:18:18 2023 From: dlong at openjdk.org (Dean Long) Date: Thu, 13 Jul 2023 04:18:18 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers [v3] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 01:35:26 GMT, Coleen Phillimore wrote: >> Please review changes to fix -Wconversion warnings that come from assembler_.cpp by adding narrow_casts to the emit_int8,16,24, and 32 functions. And some other fixups with checked_cast. >> >> Ran tier1 on Oracle platforms, and tier1-4 on linux-x64-debug, linux-aarch64-debug, windows-x64-debug. 
> > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Unset Wconversion This seems more than adequate for removing -Wconversion warnings. ------------- Marked as reviewed by dlong (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14822#pullrequestreview-1527541944 From sspitsyn at openjdk.org Thu Jul 13 04:38:23 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 13 Jul 2023 04:38:23 GMT Subject: RFR: 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach [v3] In-Reply-To: References: Message-ID: <_8TFhnIV7_DC-kzYHgMh-PIfCfbeTqk9vTtpCTwm8J0=.1cafab6f-1e11-4129-b1a4-3f58595f642b@github.com> On Thu, 13 Jul 2023 01:28:57 GMT, Serguei Spitsyn wrote: >> This is an issue with a dynamic load of a JVMTI agent into running VM. >> The `VM_SetNotifyJvmtiEventsMode` enabling operation makes a call to the function `count_transitions_and_correct_jvmti_thread_states()`. This function in its turn make a call to the function `correct_jvmti_thread_state()`. But it does it conditionally, only if the field `_whitebox_used` is `true`. >> The test provided in the bug report showed that it has to be called unconditionally as the assumption that it is only needed on the subsequent `notifyJvmti` enabling is incorrect. >> >> Then the field `_whitebox_used` is not needed anymore and removed in this fix. >> Some obsolete comments are removed or updated. >> >> New test is added: `test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest`. >> It is failed without the fix and passed with the fix. >> >> Testing: >> - ran new test `test/hotspot/jtreg/serviceability/jvmti/vthread/VThreadTLSTest` >> - mach5 tiers 1-5 are good > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: renamed incorrectly spelled local variable Leonid a Chris, thank you for review! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14842#issuecomment-1633531286 From thartmann at openjdk.org Thu Jul 13 05:16:11 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 13 Jul 2023 05:16:11 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v2] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 19:46:19 GMT, Doug Simon wrote: >> This PR adds support for a `vm.libgraal.enabled` property that can be used with the jtreg `@requires` tag to determine if the VM under test is using libgraal as the JIT. > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > fix wrong flag name Looks good. ------------- Marked as reviewed by thartmann (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14851#pullrequestreview-1527582636 From thartmann at openjdk.org Thu Jul 13 05:46:14 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 13 Jul 2023 05:46:14 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v8] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 19:20:32 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this fix for some problematic situations in JFR where data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes is in the JIRA issue. >> >> Testing: jdk_jfr, stress testing. >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > build x86_32 The C2 changes look good to me. I guess this needs some testing on platforms not supported by Oracle. src/hotspot/share/opto/library_call.cpp line 3024: > 3022: > 3023: // Load the current value of the notified field in the JfrThreadLocal.
> 3024: Node* notified_offset = basic_plus_adr(top(), tls_ptr, in_bytes(NOTIFY_OFFSET_JFR)); Suggestion: Node* notified_offset = basic_plus_adr(top(), tls_ptr, in_bytes(NOTIFY_OFFSET_JFR)); ------------- Marked as reviewed by thartmann (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14761#pullrequestreview-1527596166 PR Review Comment: https://git.openjdk.org/jdk/pull/14761#discussion_r1262003063 From fyang at openjdk.org Thu Jul 13 06:59:10 2023 From: fyang at openjdk.org (Fei Yang) Date: Thu, 13 Jul 2023 06:59:10 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v8] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 19:20:32 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this fix for some problematic situations in JFR where data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes is in the JIRA issue. >> >> Testing: jdk_jfr, stress testing. >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > build x86_32 FYI: Performed `jdk_jfr` test on my 64-core linux-riscv64 server, result is clean. Hope that helps. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14761#issuecomment-1633670208 From rehn at openjdk.org Thu Jul 13 07:33:28 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 13 Jul 2023 07:33:28 GMT Subject: [jdk21] Integrated: 8310656: RISC-V: __builtin___clear_cache can fail silently.
In-Reply-To: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> References: <9rOeyG4D3mhVi_ql6nVKeNEwOyzxUuD8PRIS_NsMbz4=.adda14aa-457d-4564-88aa-27fe26aafddc@github.com> Message-ID: On Sun, 2 Jul 2023 18:35:30 GMT, Robbin Ehn wrote: > Hi all, > > This pull request contains a backport of commit [faf1b822](https://github.com/openjdk/jdk/commit/faf1b822d03b726413d77a2b247dfbbf4db7d57e) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Robbin Ehn on 2 Jul 2023 and was reviewed by Ludovic Henry, Thomas Stuefe and Fei Yang. > > Thanks! This pull request has now been integrated. Changeset: 5f1d7627 Author: Robbin Ehn URL: https://git.openjdk.org/jdk21/commit/5f1d762750a0d4c20da5b23d54f922dbb5cbbe34 Stats: 131 lines in 3 files changed: 126 ins; 1 del; 4 mod 8310656: RISC-V: __builtin___clear_cache can fail silently. Reviewed-by: luhenry, fyang Backport-of: faf1b822d03b726413d77a2b247dfbbf4db7d57e ------------- PR: https://git.openjdk.org/jdk21/pull/90 From dholmes at openjdk.org Thu Jul 13 07:48:41 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 13 Jul 2023 07:48:41 GMT Subject: RFR: 8311541: JavaThread::print_jni_stack doesn't support native stacks on all platforms Message-ID: [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) modified print_jni_stack to print the native stack when there are no Java frames. To do that it used VMError::print_native_stack, however that function is only for platforms that support stack-walking by following frames, on other platforms (i.e. Windows and AIX) we need to use os::platform_print_native_stack. I'm not trying to consolidate the different versions of the stack printing code in this PR so that it is more easily backported to where [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) was. The test has been updated to work on Windows (taking advantage of two other recent enhancements - see JBS for details). 
Testing: - tiers 1-3 (sanity) - TestNativeStack regression test Thanks ------------- Commit messages: - Only save symbols on Windows - other platforms don't need it - Merge branch 'master' into 8311541-print-jni-stack - interim Changes: https://git.openjdk.org/jdk/pull/14862/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14862&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311541 Stats: 51 lines in 5 files changed: 43 ins; 1 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/14862.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14862/head:pull/14862 PR: https://git.openjdk.org/jdk/pull/14862 From aph at openjdk.org Thu Jul 13 08:35:15 2023 From: aph at openjdk.org (Andrew Haley) Date: Thu, 13 Jul 2023 08:35:15 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers [v4] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 12:17:31 GMT, Coleen Phillimore wrote: >> Please review changes to fix -Wconversion warnings that come from assembler_.cpp by adding narrow_casts to the emit_int8,16,24, and 32 functions. And some other fixups with checked_cast. >> >> Ran tier1 on Oracle platforms, and tier1-4 on linux-x64-debug, linux-aarch64-debug, windows-x64-debug. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix indentation, thanks for pointing that out @aph. Looks good, thanks. After all that back-and-forth I think you've come up with a clean and maintainable solution to a tricky problem. ------------- Marked as reviewed by aph (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14822#pullrequestreview-1527904118 From fyang at openjdk.org Thu Jul 13 09:42:05 2023 From: fyang at openjdk.org (Fei Yang) Date: Thu, 13 Jul 2023 09:42:05 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v2] In-Reply-To: References: Message-ID: <5mpUd4sUz6TSOzToHZeJsNcB31XewH19f_sTWIq9Gxk=.f71c7250-c196-4c0a-8abe-9cdb2bc195cb@github.com> On Thu, 29 Jun 2023 05:50:06 GMT, Vladimir Kempik wrote: >> Please review this fix. It eliminates misaligned loads in String.compare on RISC-V. >> >> For small compares ( <= 72 bytes), the intrinsic in c2_MacroAssembler.cpp is used, >> it reads (in case of UU/LL) 8 bytes per loop, and at the end, it reads the tail - a misaligned load of the last 8 bytes from the string. >> >> So if the string length is not a multiple of 8 bytes, the last load is misaligned; it also re-reads and re-compares some data that was already processed. >> >> I have changed that to compare only the last length%8 bytes using the SHORT_STRING part of the intrinsic for UL/LU. But for UU/LL I have made an optimised version. >> >> Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947) I've got no perf drop on thead (with +AvoidUnalignedAccesses), which supports misaligned access. >> >> Improvements to inflate_XX() methods give 3-5% improvements for UL/LU cases on thead, almost no perf change on hifive. >> >> For large strings, the intrinsics from stubGenerator.cpp are used. >> For UU/LL - generate_compare_long_string_same_encoding, I have just replaced the misaligned ld with load_long_misaligned. Since this tail reading is not on the hot path, this gives a small penalty for thead when -XX:+AvoidUnalignedAccesses.
>> >> Large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp: >> These changes are partially based on feilongjiang's patch, but I have changed tail reading to prevent reading past the end of the string. I have observed no perf difference between feilongjiang's and my version. >> >> This also enables the regression test for String.compare, which previously was aarch64-only. >> >> Testing: tier1 and tier2 clean on hifive. >> >> JMH testing, hifive: >> before: >> >> Benchmark (delta) (size) Mode Cnt Score Error Units >> StringCompareToDifferentLength.compareToLL 2 24 avgt 9 6.474 ± 1.475 ms/op >> StringCompareToDifferentLength.compareToLL 2 36 avgt 9 125.823 ± 1.947 ms/op >> StringCompareToDifferentLength.compareToLL 2 72 avgt 9 10.512 ± 0.236 ms/op >> StringCompareToDifferentLength.compareToLL 2 128 avgt 9 13.032 ± 0.821 ms/op >> StringCompareToDifferentLength.compareToLL 2 256 avgt 9 18.983 ± 0.318 ms/op >> StringCompareToDifferentLength.compareToLL 2 512 avgt 9 29.925 ± ... > > Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: > > Simplify tail for shrot string compare OK. I have gone through the changes in function `generate_compare_long_string_different_encoding` and here is what I am thinking. I see there are two purposes for these changes: 1. Make sure that the memory accesses in `SMALL_LOOP` are 8-byte aligned on strL; 2. Avoid unaligned accesses in the loop tail (the if-else structure with the AvoidUnalignedAccesses check); I am fine with the changes for the first purpose. In fact, my previous comment was about the second one. Since it's the loop tail, making use of misaligned_load (and thus eliminating the if block) shouldn't affect much. Please consider that.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14534#issuecomment-1633907075 From dnsimon at openjdk.org Thu Jul 13 09:48:24 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Thu, 13 Jul 2023 09:48:24 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v3] In-Reply-To: References: Message-ID: <1rgJ4y7S7okKDolNYuSEPgYuN73rw63p8IROzbRQz_w=.a0f34434-e483-42a3-8bb0-bff2838b0603@github.com> > This PR adds support for a `vm.libgraal.enabled` property that can be used with the jtreg `@requires` tag to determine if the VM under test is using libgraal as the JIT. Doug Simon has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: - add support for vm.libgraal property and fix problems with vm.libgraal.enabled property - Merge remote-tracking branch 'openjdk-jdk/master' into JDK-8311946 - fix wrong flag name - add support for vm.libgraal.enabled property ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14851/files - new: https://git.openjdk.org/jdk/pull/14851/files/c0cbca79..ce7c1cba Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14851&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14851&range=01-02 Stats: 10960 lines in 131 files changed: 5120 ins; 5610 del; 230 mod Patch: https://git.openjdk.org/jdk/pull/14851.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14851/head:pull/14851 PR: https://git.openjdk.org/jdk/pull/14851 From dnsimon at openjdk.org Thu Jul 13 09:52:05 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Thu, 13 Jul 2023 09:52:05 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v3] In-Reply-To: <1rgJ4y7S7okKDolNYuSEPgYuN73rw63p8IROzbRQz_w=.a0f34434-e483-42a3-8bb0-bff2838b0603@github.com> References: 
<1rgJ4y7S7okKDolNYuSEPgYuN73rw63p8IROzbRQz_w=.a0f34434-e483-42a3-8bb0-bff2838b0603@github.com> Message-ID: On Thu, 13 Jul 2023 09:48:24 GMT, Doug Simon wrote: >> This PR adds support for `vm.libgraal` and `vm.libgraal.enabled` properties that can be used with the jtreg `@requires` tag to determine if a libgraal specific test should run. >> >> * `vm.libgraal`: the libgraal shared library file is present >> * `vm.libgraal.enabled`: libgraal is used as JIT compiler > > Doug Simon has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - add support for vm.libgraal property and fix problems with vm.libgraal.enabled property > - Merge remote-tracking branch 'openjdk-jdk/master' into JDK-8311946 > - fix wrong flag name > - add support for vm.libgraal.enabled property I have pushed further changes to fix a bug (`vm.libgraal.enabled` was returning `VMProps.isGraalEnabled()`) and to add the `vm.libgraal` property. The latter is actually what one most often wants, as it lets libgraal specific tests run on a JDK that includes libgraal but does not enable it by default, without having to put `TEST_OPTS=JAVA_OPTIONS="-XX:+UnlockExperimentalVMOptions -XX:+UseJVMCICompiler"` on the `make test` command line. Please re-review in light of these changes.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14851#issuecomment-1633922569 From luhenry at openjdk.org Thu Jul 13 12:44:12 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Thu, 13 Jul 2023 12:44:12 GMT Subject: RFR: 8311862: RISC-V: small improvements to slli [v2] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 07:35:20 GMT, Ilya Gavrilin wrote: >> Please review this small change for slli and slli.uw >> slli change allows replacing slli Rd, Rs, 0 with addi Rd, Rs, 0 (and no operation emitted if Rd == Rs) >> addi with 0 has a higher chance of being just a register renaming that does not use the ALU at all >> We have observed a small positive effect on hifive (and no change on thead). >> Also, this patch changes slli.uw so that it can be used without an additional check for UseZba, and provides a fallback when Zba is not available >> testing: tier1 and tier2 on hifive, also hotspot/jtreg/compiler/intrinsics/string tests on Qemu with UseZba >> >> performance on hifive, before:
>> | Benchmark | Mode | Cnt | Score | | Error | Units |
>> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
>> | StringIndexOf.advancedWithShortSub1 | avgt | 25 | 4035.143 | ± | 191.262 | ns/op |
>> | StringIndexOf.advancedWithShortSub2 | avgt | 25 | 1145.807 | ± | 14.610 | ns/op |
>>
>> with patch:
>> | Benchmark | Mode | Cnt | Score | | Error | Units |
>> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
>> | StringIndexOf.advancedWithShortSub1 | avgt | 25 | 3613.943 | ± | 178.153 | ns/op |
>> | StringIndexOf.advancedWithShortSub2 | avgt | 25 | 923.169 | ± | 47.123 | ns/op |
> > Ilya Gavrilin has updated the pull request incrementally with one additional commit since the last revision: > > Fix typo in macroAssembler_riscv.cpp
Marked as reviewed by luhenry (Committer).
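[Editor's illustration, not part of the original message] The zero-shift special case described in the quoted PR text can be modeled with a toy emitter. This is a sketch under assumptions: `ToyAssembler` is a hypothetical stand-in that records instruction text, it is not HotSpot's `MacroAssembler`, and it reads "no operation emitted if Rd == Rs" as "nothing is emitted at all".

```cpp
#include <string>
#include <vector>

// Toy emitter that records textual instructions instead of encoding them.
struct ToyAssembler {
  std::vector<std::string> emitted;

  static std::string reg(int r) { return "x" + std::to_string(r); }

  // Model of the described slli behaviour.
  void slli(int rd, int rs, unsigned shamt) {
    if (shamt == 0) {
      // A shift by zero is a plain register copy. When source and destination
      // are the same register there is nothing to do.
      if (rd != rs) {
        emitted.push_back("addi " + reg(rd) + ", " + reg(rs) + ", 0");
      }
      return;
    }
    emitted.push_back("slli " + reg(rd) + ", " + reg(rs) + ", " + std::to_string(shamt));
  }
};
```

Calling `slli(10, 10, 0)` on this model records nothing, `slli(11, 24, 0)` records `addi x11, x24, 0`, and any non-zero shift amount is passed through unchanged.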
------------- PR Review: https://git.openjdk.org/jdk/pull/14823#pullrequestreview-1528350707 From vkempik at openjdk.org Thu Jul 13 12:44:57 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Thu, 13 Jul 2023 12:44:57 GMT Subject: RFR: 8310949: RISC-V: Initialize UseUnalignedAccesses [v5] In-Reply-To: References: Message-ID: <6Pi3CEPYmPsn63RN-CuB7G5-kkgIQ9ip3NVXPUt-RpM=.6c9795c2-5a0a-4a9a-91ab-aeace9ffaab4@github.com> On Wed, 28 Jun 2023 09:20:13 GMT, Ludovic Henry wrote: >> 8310949: RISC-V: Initialize UseUnalignedAccesses > > Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: > > review Marked as reviewed by vkempik (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14676#pullrequestreview-1528351511 From luhenry at openjdk.org Thu Jul 13 12:44:58 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Thu, 13 Jul 2023 12:44:58 GMT Subject: RFR: 8310949: RISC-V: Initialize UseUnalignedAccesses [v5] In-Reply-To: References: Message-ID: On Wed, 28 Jun 2023 09:52:53 GMT, Robbin Ehn wrote: >> One case to consider: >> lets say I have a system with X big cores ( which support MISALIGNED_FAST) and X small cores ( with MISALIGNED_EMU) >> >> if I run some java workload on all cores, what should hw_prober return ? obvious result here is to use +AvoidUnallignedAccesses. >> >> If I run same java workload but use taskset to run it only on big cores, how will jdk's hw_prober code work ? should it work properly and disable AvoidUnallignedAccesses or it's too much and one need to manually set -XX:-AvoidUnallignedAccesses ? > >> One case to consider: lets say I have a system with X big cores ( which support MISALIGNED_FAST) and X small cores ( with MISALIGNED_EMU) >> >> if I run some java workload on all cores, what should hw_prober return ? obvious result here is to use +AvoidUnallignedAccesses. 
>> >> If I run same java workload but use taskset to run it only on big cores, how will jdk's hw_prober code work ? should it work properly and disable AvoidUnallignedAccesses or it's too much and one need to manually set -XX:-AvoidUnallignedAccesses ? > > FYI: > The only way today using hwprobe in example above is to query each cpu individually to find the cpu set that have fast and the cpu set have emulated. (or vector or any other extension which may differ) > I have proposed that hwprobe should be able to also return a cpu set for some set of features. > To either inform user of what affinity they should use, or if we want change the affinity of the VM automagically. @robehn @VladimirKempik @RealFYang could I please get another review? Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14676#issuecomment-1634172351 From duke at openjdk.org Thu Jul 13 13:18:58 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Thu, 13 Jul 2023 13:18:58 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v4] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 15:17:10 GMT, Thomas Stuefe wrote: >>> What would be needed to make the Annotations appear in the "printall" command? I was somehow expecting to see at least something like "Annotation at xxxx". >> >> I am not sure what all details `printall` is expected to emit out. Looking at the code, printall doesn't seem to use ClassWriter. It uses HTMLGenerator to format the method data. I can emit something like "Annotation at xxxx" but it would be more useful if it can display the contents of the annotations. Unfortunately HTMLGenerator doesn't understand Annotations at all. Probably it is better left for another task. > >> > What would be needed to make the Annotations appear in the "printall" command? I was somehow expecting to see at least something like "Annotation at xxxx". >> >> I am not sure what all details `printall` is expected to emit out. 
Looking at the code, printall doesn't seem to use ClassWriter. It uses HTMLGenerator to format the method data. I can emit something like "Annotation at xxxx" but it would be more useful if it can display the contents of the annotations. Unfortunately HTMLGenerator doesn't understand Annotations at all. Probably it is better left for another task. > > No problem at all. I only thought about this because it would have made writing a regression test very easy (there is a jtreg test that calls printall and then parses the output). @tstuefe can I get your approval as well if there are no other concerns to address? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14735#issuecomment-1634227457 From stuefe at openjdk.org Thu Jul 13 13:27:14 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 13 Jul 2023 13:27:14 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v4] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 18:48:42 GMT, Ashutosh Mehra wrote: >> Please review this PR that enables ClassWriter to write annotations to the class file being dumped. >> >> The fields annotations are stored in `Annotations::_fields_annotations` which is of type `Array<Array<u1>*>`. There is no class in SA that can represent it. I have added ArrayOfU1Array to correspond to the type `Array<Array<u1>*>` and it works. I believe there are better approaches but that would require a bit more refactoring of the classes representing Array types. I will leave that for future work for now. >> >> Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` >> Tested it manually by dumping j.l.String class and comparing the annotations with the original class using javap. >> The test case mentioned in [JDK-8311101](https://bugs.openjdk.org/browse/JDK-8311101) would provide better coverage.
> > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > More review comments > > Signed-off-by: Ashutosh Mehra Looks good, thanks! ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14735#pullrequestreview-1528442782 From fyang at openjdk.org Thu Jul 13 13:55:08 2023 From: fyang at openjdk.org (Fei Yang) Date: Thu, 13 Jul 2023 13:55:08 GMT Subject: RFR: 8310949: RISC-V: Initialize UseUnalignedAccesses [v5] In-Reply-To: References: Message-ID: On Wed, 28 Jun 2023 09:20:13 GMT, Ludovic Henry wrote: >> 8310949: RISC-V: Initialize UseUnalignedAccesses > > Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: > > review Hello, could you please provide some JMH numbers on platforms like T-Head (-XX:+UseUnalignedAccesses vs -XX:-UseUnalignedAccesses)? I haven't seen such numbers yet. We only tested on Unmatched before: https://cr.openjdk.org/~dzhang/TestUseUnalignedAccesses/ ------------- PR Comment: https://git.openjdk.org/jdk/pull/14676#issuecomment-1634286612 From coleenp at openjdk.org Thu Jul 13 13:57:07 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 13 Jul 2023 13:57:07 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 05:18:01 GMT, Jean-Philippe Bempel wrote: > Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in an unresolved state. So the constant pool merging process was considering the entry with the pre-resolved class as different from the destination and created a new entry. We now try to consider it as equal, especially for Methodref/Fieldref. I see. I was looking at jcod output but that doesn't distinguish resolved vs.
unresolved class. At any rate, the verifier will re-resolve the class when it verifies the merged constant pool. It's still safe to consider these types as the same. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14780#issuecomment-1634290376 From duke at openjdk.org Thu Jul 13 13:57:08 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Thu, 13 Jul 2023 13:57:08 GMT Subject: RFR: 8311102: Write annotations in the classfile dumped by SA [v4] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 23:21:12 GMT, Chris Plummer wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> More review comments >> >> Signed-off-by: Ashutosh Mehra > > Marked as reviewed by cjplummer (Reviewer). @plummercj @tstuefe thank you for reviewing this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14735#issuecomment-1634289240 From coleenp at openjdk.org Thu Jul 13 14:38:12 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 13 Jul 2023 14:38:12 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform In-Reply-To: References: Message-ID: On Thu, 6 Jul 2023 05:18:01 GMT, Jean-Philippe Bempel wrote: > Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in an unresolved state. So the constant pool merging process was considering the entry with the pre-resolved class as different from the destination and created a new entry. We now try to consider it as equal, especially for Methodref/Fieldref. Also there is a nice test harness for class redefinition in the test/hotspot/jtreg/serviceability/jvmti/RedefineClasses tests that you might be able to use to add a test for this.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14780#issuecomment-1634358162 From vkempik at openjdk.org Thu Jul 13 14:43:13 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Thu, 13 Jul 2023 14:43:13 GMT Subject: RFR: 8310949: RISC-V: Initialize UseUnalignedAccesses [v5] In-Reply-To: References: Message-ID: <-4cAowIAFmn0bKAnAL0JAOiGCqNRkbQ1HD9JztHV81I=.d8a1595f-d333-4d6a-a408-df658884f3a4@github.com> On Wed, 28 Jun 2023 09:20:13 GMT, Ludovic Henry wrote: >> 8310949: RISC-V: Initialize UseUnalignedAccesses > > Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: > > review Can we deduce UseUnalignedAccesses from the value of AvoidUnalignedAccesses, or vice versa? We currently have two flags, which is confusing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14676#issuecomment-1634367476 From duke at openjdk.org Thu Jul 13 14:49:23 2023 From: duke at openjdk.org (Ilya Gavrilin) Date: Thu, 13 Jul 2023 14:49:23 GMT Subject: RFR: 8310949: RISC-V: Initialize UseUnalignedAccesses [v5] In-Reply-To: References: Message-ID: On Wed, 28 Jun 2023 09:20:13 GMT, Ludovic Henry wrote: >> 8310949: RISC-V: Initialize UseUnalignedAccesses > > Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: > > review Results on T-Head board
-XX:-UseUnalignedAccesses:
Benchmark (size) Mode Cnt Score Error Units
TestUseUnalignedAccesses.testPutIntUnaligned 100 thrpt 15 1907.338 ± 6.943 ops/ms
TestUseUnalignedAccesses.testPutLongUnaligned 100 thrpt 15 1200.164 ± 2.577 ops/ms
TestUseUnalignedAccesses.testPutShortUnaligned 100 thrpt 15 2005.762 ± 8.693 ops/ms
-XX:+UseUnalignedAccesses:
Benchmark (size) Mode Cnt Score Error Units
TestUseUnalignedAccesses.testPutIntUnaligned 100 thrpt 15 2997.842 ± 11.228 ops/ms
TestUseUnalignedAccesses.testPutLongUnaligned 100 thrpt 15 3058.505 ± 11.970 ops/ms
TestUseUnalignedAccesses.testPutShortUnaligned 100 thrpt 15 3157.681 ± 39.818 ops/ms
------------- PR Comment: https://git.openjdk.org/jdk/pull/14676#issuecomment-1634377606 From fyang at openjdk.org Thu Jul 13 14:57:08 2023 From: fyang at openjdk.org (Fei Yang) Date: Thu, 13 Jul 2023 14:57:08 GMT Subject: RFR: 8310949: RISC-V: Initialize UseUnalignedAccesses [v5] In-Reply-To: References: Message-ID: <_l8PevT1BfVhxRSmaaNxjVBbeLlx2h_aw9ramMA56bQ=.95ef2f7f-7c78-4fc2-bdc3-e22afc62cdc2@github.com> On Wed, 28 Jun 2023 09:20:13 GMT, Ludovic Henry wrote: >> 8310949: RISC-V: Initialize UseUnalignedAccesses > > Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: > > review Marked as reviewed by fyang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14676#pullrequestreview-1528648668 From coleenp at openjdk.org Thu Jul 13 16:13:05 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 13 Jul 2023 16:13:05 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers [v4] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 12:17:31 GMT, Coleen Phillimore wrote: >> Please review changes to fix -Wconversion warnings that come from assembler_.cpp by adding narrow_casts to the emit_int8,16,24, and 32 functions. And some other fixups with checked_cast. >> >> Ran tier1 on Oracle platforms, and tier1-4 on linux-x64-debug, linux-aarch64-debug, windows-x64-debug. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix indentation, thanks for pointing that out @aph. Thank you for your time and attention, @aph and @dean-long.
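[Editor's illustration, not part of the original message] The kind of checked narrowing that silences `-Wconversion` at an `emit_int8`-style call site can be sketched as below. Everything here is an assumption made for illustration: `fits_in`, `checked_narrow` and `ToyCodeBuffer` are hypothetical names, and HotSpot's real `narrow_cast` and `checked_cast` helpers are not reproduced.

```cpp
#include <cassert>
#include <cstdint>

// True if value survives a round trip through To with the same sign.
template <typename To, typename From>
constexpr bool fits_in(From value) {
  const To narrowed = static_cast<To>(value);
  return static_cast<From>(narrowed) == value &&
         ((value < From(0)) == (narrowed < To(0)));
}

// Explicit narrowing that asserts in debug builds instead of silently truncating.
template <typename To, typename From>
To checked_narrow(From value) {
  assert(fits_in<To>(value) && "value does not fit in the destination type");
  return static_cast<To>(value);
}

// Toy byte sink standing in for a code buffer.
struct ToyCodeBuffer {
  std::uint8_t bytes[16] = {};
  int pos = 0;
  void emit_int8(std::uint8_t b) { bytes[pos++] = b; }
};
```

A caller holding an `int` opcode would then write `buf.emit_int8(checked_narrow<std::uint8_t>(opcode))`, which makes the narrowing explicit for the compiler and checks it at run time in debug builds.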
------------- PR Comment: https://git.openjdk.org/jdk/pull/14822#issuecomment-1634518638 From kvn at openjdk.org Thu Jul 13 16:50:13 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 13 Jul 2023 16:50:13 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v3] In-Reply-To: <1rgJ4y7S7okKDolNYuSEPgYuN73rw63p8IROzbRQz_w=.a0f34434-e483-42a3-8bb0-bff2838b0603@github.com> References: <1rgJ4y7S7okKDolNYuSEPgYuN73rw63p8IROzbRQz_w=.a0f34434-e483-42a3-8bb0-bff2838b0603@github.com> Message-ID: <9HNzCfbkF-A9hUY_p7gGDHV3mKVx6CqT41TmtY1A8qU=.e8c6106c-30a6-4fad-8b3c-72c4e20ade75@github.com> On Thu, 13 Jul 2023 09:48:24 GMT, Doug Simon wrote: >> This PR adds support for `vm.libgraal` and `vm.libgraal.enabled` properties that can be used with the jtreg `@requires` tag to determine if a libgraal specific test should run. >> >> * `vm.libgraal`: the libgraal shared library file is present >> * `vm.libgraal.enabled`: libgraal is used as JIT compiler > > Doug Simon has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - add support for vm.libgraal property and fix problems with vm.libgraal.enabled property > - Merge remote-tracking branch 'openjdk-jdk/master' into JDK-8311946 > - fix wrong flag name > - add support for vm.libgraal.enabled property test/jtreg-ext/requires/VMProps.java line 127: > 125: map.put("vm.graal.enabled", this::isGraalEnabled); > 126: // vm.libgraal is true if the libgraal shared library file is present > 127: map.put("vm.libgraal", this::hasLibraal); May be the property also should be named "vm.hasLibgraal" to be clear what it means. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14851#discussion_r1262818055 From duke at openjdk.org Thu Jul 13 17:18:28 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Thu, 13 Jul 2023 17:18:28 GMT Subject: Integrated: 8311102: Write annotations in the classfile dumped by SA In-Reply-To: References: Message-ID: On Fri, 30 Jun 2023 14:32:03 GMT, Ashutosh Mehra wrote: > Please review this PR that enables ClassWriter to write annotations to the class file being dumped. > > The fields annotations are stored in `Annotations::_fields_annotations` which is of type `Array<Array<u1>*>`. There is no class in SA that can represent it. I have added ArrayOfU1Array to correspond to the type `Array<Array<u1>*>` and it works. I believe there are better approaches but that would require a bit more refactoring of the classes representing Array types. I will leave that for future work for now. > > Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` > Tested it manually by dumping j.l.String class and comparing the annotations with the original class using javap. > The test case mentioned in [JDK-8311101](https://bugs.openjdk.org/browse/JDK-8311101) would provide better coverage. This pull request has now been integrated.
Changeset: c710e711 Author: Ashutosh Mehra Committer: Thomas Stuefe URL: https://git.openjdk.org/jdk/commit/c710e711780b3c334fdb9e1299b3c39a2b48649e Stats: 426 lines in 9 files changed: 408 ins; 5 del; 13 mod 8311102: Write annotations in the classfile dumped by SA Reviewed-by: cjplummer, stuefe ------------- PR: https://git.openjdk.org/jdk/pull/14735 From stuefe at openjdk.org Thu Jul 13 17:46:05 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 13 Jul 2023 17:46:05 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v8] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Mon, 10 Jul 2023 18:50:15 GMT, Aleksey Shipilev wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 32 additional commits since the last revision: >> >> - Make test spikes more pronounced >> - Dont query procfs if logging is off >> - rename logtag again >> - When probing for safepoint end, use the smaller of (interval, 250ms) >> - Remove TrimNativeHeap and expand TrimNativeHeapInterval >> - Improve comments for non-supportive platforms >> - Aleksey cosmetics >> - suspend count return 16 bits >> - Fix linker errors >> - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap >> - ... and 22 more: https://git.openjdk.org/jdk/compare/0fd1ca68...15566761 > > src/hotspot/share/runtime/trimNativeHeap.cpp line 47: > >> 45: >> 46: // Statistics >> 47: unsigned _num_trims_performed; > > Sorry for the nit, but this is `uint16_t` too then, for consistency? No, since `_num_trims_performed` is the number of trims performed during the lifetime of the JVM. It should probably be bumped to 64-bit, now that we have millisecond intervals.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1262877532 From shade at openjdk.org Thu Jul 13 17:46:06 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 13 Jul 2023 17:46:06 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v8] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 13 Jul 2023 17:40:47 GMT, Thomas Stuefe wrote: >> src/hotspot/share/runtime/trimNativeHeap.cpp line 47: >> >>> 45: >>> 46: // Statistics >>> 47: unsigned _num_trims_performed; >> >> Sorry for the nit, but this is `uint16_t` too then, for consistency? > > No, since `_num_trims_performed` is the number of trims performed during the lifetime of the JVM. It should probably be bumped to 64-bit, now that we have millisecond intervals. Yeah, my patch, see the link above, does it as `uint64_t` :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1262879033 From stuefe at openjdk.org Thu Jul 13 17:46:06 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 13 Jul 2023 17:46:06 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v8] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 13 Jul 2023 17:42:28 GMT, Aleksey Shipilev wrote: >> No, since `_num_trims_performed` is the number of trims performed during the lifetime of the JVM. It should probably be bumped to 64-bit, now that we have millisecond intervals. > > Yeah, my patch, see the link above, does it as `uint64_t` :) Ah, I see you meant uint64_t, at least your patch says so.
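[Editor's illustration, not part of the original message] The width discussion above comes down to simple arithmetic, sketched here. The worst case of one trim per millisecond is an assumption made for illustration (the thread only says intervals are now given in milliseconds), and `seconds_until_wrap` is a hypothetical helper.

```cpp
#include <cstdint>
#include <limits>

// Seconds until an unsigned counter of type T wraps around to zero,
// assuming it is incremented once every interval_ms milliseconds.
template <typename T>
constexpr double seconds_until_wrap(double interval_ms) {
  return (static_cast<double>(std::numeric_limits<T>::max()) + 1.0) * interval_ms / 1000.0;
}
```

Under that assumption a 16-bit counter wraps after 65536 trims, roughly 65.5 seconds; a 32-bit counter (the usual width of `unsigned`) wraps after about 49.7 days; a 64-bit counter does not wrap within any realistic JVM lifetime.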
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1262879521 From mgronlun at openjdk.org Thu Jul 13 17:56:39 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 13 Jul 2023 17:56:39 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v8] In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 06:56:06 GMT, Fei Yang wrote: > FYI: Performed `jdk_jfr` test on my 64-core linux-riscv64 server, result is clean. Hope that helps. Thanks Fei, it helps! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14761#issuecomment-1634654935 From mgronlun at openjdk.org Thu Jul 13 17:56:42 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 13 Jul 2023 17:56:42 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v4] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 18:28:40 GMT, Erik Gahlin wrote: >> Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: >> >> commentary > > Looks good, but the intrinsics need to be reviewed by the compiler team. Thank you @egahlin and @TobiHartmann for your reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14761#issuecomment-1634658773 From mgronlun at openjdk.org Thu Jul 13 17:56:37 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 13 Jul 2023 17:56:37 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v9] In-Reply-To: References: Message-ID: > Greetings, > > please help review this fix for some problematic situations in JFR where data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes is in the JIRA issue. > > Testing: jdk_jfr, stress testing.
> > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: remove space Co-authored-by: Tobias Hartmann ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14761/files - new: https://git.openjdk.org/jdk/pull/14761/files/97854c2e..fe248010 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=07-08 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14761.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14761/head:pull/14761 PR: https://git.openjdk.org/jdk/pull/14761 From iklam at openjdk.org Thu Jul 13 18:05:10 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 13 Jul 2023 18:05:10 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v4] In-Reply-To: <3mXE9RE6cj-apUgKbUvy0yxneWeSXHmMURj0JMhZKxw=.0a532743-5258-47c2-84bf-590f0a2660cc@github.com> References: <3mXE9RE6cj-apUgKbUvy0yxneWeSXHmMURj0JMhZKxw=.0a532743-5258-47c2-84bf-590f0a2660cc@github.com> Message-ID: <7RYF5SEdxG-CBlpMo4Ts1ZznVZPTdU9VHmhalQvKW-Q=.0d515dc8-666a-49a5-9098-2caeeaaf8d79@github.com> On Wed, 12 Jul 2023 19:29:31 GMT, Jiangli Zhou wrote: >> Move StringTable to 'hotspot_jvm' namespace. > > Jiangli Zhou has updated the pull request incrementally with one additional commit since the last revision: > > Add hotspot_jvm:: to StringTable in javaClasses.hpp. Hi Jiangli, Putting the HotSpot code in a namespace is a fundamental change that should be done in a JEP. The idea of using namespaces for HotSpot has been around for a while, but so far there hasn't been a strong motivation for doing it. Perhaps static linking would finally give us a go-ahead reason. The JEP should discuss the goals, design choices, risks and alternatives. For example: - Other than static linking, what other problems can we solve with namespaces?
Knowing the other goals may help us in choosing a design. - Do we want a single namespace, or multiple namespaces (one for GC, one for JIT, etc.)? - Do we want to change over incrementally (as in this PR), or in a single step? - Do we want to enable namespaces optionally (e.g., only for static linking)? - With namespaces, the debug symbols will be much bigger, and we could also run into issues with debuggers and other tools. - There may be alternatives for static linking that may have less impact than namespaces. The [hotspot-dev mailing list](https://mail.openjdk.org/mailman/listinfo/hotspot-dev) would be the best place to have such discussions. Also, at this time, many OpenJDK developers who are interested in this topic are on vacation, so we probably need to wait till the end of the summer so everyone has a chance to chime in. A JEP may feel like a lot of process, but for this change, I think it's the best avenue for exploration. Thanks Ioi ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1634674049 From stuefe at openjdk.org Thu Jul 13 18:08:03 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 13 Jul 2023 18:08:03 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v9] In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes.
It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. 
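[Editor's illustration, not part of the original message] The manual trim described above can be tried outside the JVM with a few lines of C++. This is a minimal sketch, assuming a Linux system with glibc: `trim_c_heap_sketch` and `spike_then_trim` are hypothetical names, and this is not the JVM's periodic trimmer. Whether `malloc_trim` actually releases anything depends on the allocator state, so the sketch does not promise a particular result.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>
#if defined(__GLIBC__)
#include <malloc.h>  // malloc_trim
#endif

// Returns 1 if glibc reports that memory was released to the OS, 0 if not,
// and -1 when the program is not built against glibc.
int trim_c_heap_sketch() {
#if defined(__GLIBC__)
  return malloc_trim(0);
#else
  return -1;
#endif
}

// Simulates a temporary malloc spike, frees everything, then trims.
int spike_then_trim(std::size_t count, std::size_t block_size) {
  std::vector<void*> blocks;
  blocks.reserve(count);
  for (std::size_t i = 0; i < count; i++) {
    void* p = std::malloc(block_size);
    if (p == nullptr) break;
    std::memset(p, 0xAB, block_size);  // touch the memory so it counts towards RSS
    blocks.push_back(p);
  }
  for (void* p : blocks) {
    std::free(p);
  }
  return trim_c_heap_sketch();
}
```

Comparing the process RSS (for example from `/proc/self/status`) before and after the call is one way to observe the retention effect the message describes.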
> > To increase the chance of auto-reclamation happening, one can do one or more t... Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 35 additional commits since the last revision: - Fix windows build - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap - Alekseys patch - Make test spikes more pronounced - Dont query procfs if logging is off - rename logtag again - When probing for safepoint end, use the smaller of (interval, 250ms) - Remove TrimNativeHeap and expand TrimNativeHeapInterval - Improve comments for non-supportive platforms - Aleksey cosmetics - ... and 25 more: https://git.openjdk.org/jdk/compare/c9433971...e821d518 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14781/files - new: https://git.openjdk.org/jdk/pull/14781/files/15566761..e821d518 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=07-08 Stats: 13783 lines in 460 files changed: 7083 ins; 6176 del; 524 mod Patch: https://git.openjdk.org/jdk/pull/14781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781 PR: https://git.openjdk.org/jdk/pull/14781 From stuefe at openjdk.org Thu Jul 13 18:08:10 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 13 Jul 2023 18:08:10 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v8] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Wed, 12 Jul 2023 08:47:47 GMT, Aleksey Shipilev wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 32 additional commits since the last revision:
>>
>>  - Make test spikes more pronounced
>>  - Dont query procfs if logging is off
>>  - rename logtag again
>>  - When probing for safepoint end, use the smaller of (interval, 250ms)
>>  - Remove TrimNativeHeap and expand TrimNativeHeapInterval
>>  - Improve comments for non-supportive platforms
>>  - Aleksey cosmetics
>>  - suspend count return 16 bits
>>  - Fix linker errors
>>  - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap
>>  - ... and 22 more: https://git.openjdk.org/jdk/compare/9fd07c73...15566761
>
> One more thing, as I play with it: the GC logging does not have a comma before timestamp, see:
>
>
> [1.210s][info][trimnative] Trim native heap (1): RSS+Swap: 1192M->1191M (-1552K), 0.353ms
> [1.528s][info][gc        ] GC(0) Pause Young (Normal) (G1 Evacuation Pause) 91M->78M(1024M) 73.040ms
>
>
> Also, maybe the logging tag already says this is trimmer, and what we want to point out is this was periodic trim. The value in parentheses in GC logs is heap capacity, which makes trimmer delta confusing, but we can live with that. Do we really want to say "RSS+Swap" here? I think this would be cleaner:
>
>
> [3.214s][info][trimnative] Periodic Trim (1): 1261M->1197M (-65848K) 0.353ms

Thanks @shipilev! We are closing in. Applied your patch, fixed Windows.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1634672151

From shade at openjdk.org Thu Jul 13 18:08:11 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Thu, 13 Jul 2023 18:08:11 GMT
Subject: RFR: JDK-8293114: JVM should trim the native heap [v8]
In-Reply-To:
References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com>
Message-ID:

On Mon, 10 Jul 2023 13:53:36 GMT, Thomas Stuefe wrote:

>> This is a continuation of https://github.com/openjdk/jdk/pull/10085.
>> ...
>
> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 32 additional commits since the last revision:
>
>  - Make test spikes more pronounced
>  - Dont query procfs if logging is off
>  - rename logtag again
>  - When probing for safepoint end, use the smaller of (interval, 250ms)
>  - Remove TrimNativeHeap and expand TrimNativeHeapInterval
>  - Improve comments for non-supportive platforms
>  - Aleksey cosmetics
>  - suspend count return 16 bits
>  - Fix linker errors
>  - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap
>  - ... and 22 more: https://git.openjdk.org/jdk/compare/9fd07c73...15566761

Yes, the only thing left is to bikeshed the logging statements a little.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1634674497

From stuefe at openjdk.org Thu Jul 13 18:08:11 2023
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Thu, 13 Jul 2023 18:08:11 GMT
Subject: RFR: JDK-8293114: JVM should trim the native heap [v8]
In-Reply-To:
References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com>
Message-ID:

On Mon, 10 Jul 2023 13:53:36 GMT, Thomas Stuefe wrote:

>> This is a continuation of https://github.com/openjdk/jdk/pull/10085. ...
>
> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 32 additional commits since the last revision:
>
>  - Make test spikes more pronounced
>  - Dont query procfs if logging is off
>  - rename logtag again
>  - When probing for safepoint end, use the smaller of (interval, 250ms)
>  - Remove TrimNativeHeap and expand TrimNativeHeapInterval
>  - Improve comments for non-supportive platforms
>  - Aleksey cosmetics
>  - suspend count return 16 bits
>  - Fix linker errors
>  - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap
>  - ... and 22 more: https://git.openjdk.org/jdk/compare/9fd07c73...15566761

Oh wait, I forgot your "RSS+Swap" trim message suggestion. One sec...

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1634673174

From dnsimon at openjdk.org Thu Jul 13 18:27:15 2023
From: dnsimon at openjdk.org (Doug Simon)
Date: Thu, 13 Jul 2023 18:27:15 GMT
Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v3]
In-Reply-To: <9HNzCfbkF-A9hUY_p7gGDHV3mKVx6CqT41TmtY1A8qU=.e8c6106c-30a6-4fad-8b3c-72c4e20ade75@github.com>
References: <1rgJ4y7S7okKDolNYuSEPgYuN73rw63p8IROzbRQz_w=.a0f34434-e483-42a3-8bb0-bff2838b0603@github.com>
 <9HNzCfbkF-A9hUY_p7gGDHV3mKVx6CqT41TmtY1A8qU=.e8c6106c-30a6-4fad-8b3c-72c4e20ade75@github.com>
Message-ID:

On Thu, 13 Jul 2023 16:46:52 GMT, Vladimir Kozlov wrote:

>> Doug Simon has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision:
>>
>>  - add support for vm.libgraal property and fix problems with vm.libgraal.enabled property
>>  - Merge remote-tracking branch 'openjdk-jdk/master' into JDK-8311946
>>  - fix wrong flag name
>>  - add support for vm.libgraal.enabled property
>
> test/jtreg-ext/requires/VMProps.java line 127:
>
>> 125:         map.put("vm.graal.enabled", this::isGraalEnabled);
>> 126:         // vm.libgraal is true if the libgraal shared library file is present
>> 127:         map.put("vm.libgraal", this::hasLibraal);
>
> May be the property also should be named "vm.hasLibgraal" to be clear what it means.

I'm not so sure. The current naming seems consistent with other existing properties:

    map.put("vm.jvmti", this::vmHasJVMTI);
    map.put("vm.jvmci", this::vmJvmci);

However, I can change it if you insist.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14851#discussion_r1262919222

From stuefe at openjdk.org Thu Jul 13 18:43:14 2023
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Thu, 13 Jul 2023 18:43:14 GMT
Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v2]
In-Reply-To: <0J49MlUO8xVYipLytP2Fzd9KNWZfd0J3fge92B0Mq3g=.388bfe76-2854-466b-a64c-3a93d00b168a@github.com>
References: <5cl-bjjTvbpCd2yM0xRcarMllvDMqyapqD2ec2_Z_mc=.205b5d59-6bd5-4978-873b-49f32bf458a3@github.com>
 <0J49MlUO8xVYipLytP2Fzd9KNWZfd0J3fge92B0Mq3g=.388bfe76-2854-466b-a64c-3a93d00b168a@github.com>
Message-ID:

On Wed, 12 Jul 2023 01:26:08 GMT, Jiangli Zhou wrote:

> Moving to the namespace incrementally seems to be reasonable to me. Will let others to chime in on this for their thoughts.

I also like the incremental namespace solution.

@iklam wrote:

> The [hotspot-dev mailing list](https://mail.openjdk.org/mailman/listinfo/hotspot-dev) would be the best place to have such discussions. Also, at this time, many OpenJDK developers who are interested on this topic are on vacation, so we probably need to wait till the end of the summer so everyone has a chance to chime in.
>
> JEP may feel like a lot of process, but for this change, I think it's the best avenue for exploration.

I agree with @iklam that we should discuss this on hotspot-dev. It will affect the whole JDK and constitute a change that is visible from outside, so we should get this right.

I also think that making the hotspot linkable in a static way could be very useful.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1634720487

From jiangli at openjdk.org Thu Jul 13 18:53:08 2023
From: jiangli at openjdk.org (Jiangli Zhou)
Date: Thu, 13 Jul 2023 18:53:08 GMT
Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v4]
In-Reply-To: <7RYF5SEdxG-CBlpMo4Ts1ZznVZPTdU9VHmhalQvKW-Q=.0d515dc8-666a-49a5-9098-2caeeaaf8d79@github.com>
References: <3mXE9RE6cj-apUgKbUvy0yxneWeSXHmMURj0JMhZKxw=.0a532743-5258-47c2-84bf-590f0a2660cc@github.com>
 <7RYF5SEdxG-CBlpMo4Ts1ZznVZPTdU9VHmhalQvKW-Q=.0d515dc8-666a-49a5-9098-2caeeaaf8d79@github.com>
Message-ID:

On Thu, 13 Jul 2023 18:01:47 GMT, Ioi Lam wrote:

> Hi Jiangli,
>
> Putting the HotSpot code in a namespace is a fundamental change that should be done in a JEP. The idea of using namespaces for HotSpot has been around for a while, but so far there hasn't been a strong motivation for doing it. Perhaps static linking would finally give us a go-ahead reason.
>
> The JEP should discuss the goals, design choices, risks and alternatives. For example:
>
> * Other than static linking, what other problems can we solve with namespaces? Knowing the other goals may help us in choosing a design.
> * Do we want a single namespace, or multiple namespaces (one for GC, one for JIT, etc)
> * Do we want to change over incrementally (as in this PR), or in a single step
> * Do we want to enable namespaces optionally (e.g., only for static linking)?
> * With namespaces, the debug symbols will be much bigger, and we could also run into issues with debuggers and other tools.
> * There may be alternatives for static linking that may have less impact than namespaces.
>
> The [hotspot-dev mailing list](https://mail.openjdk.org/mailman/listinfo/hotspot-dev) would be the best place to have such discussions. Also, at this time, many OpenJDK developers who are interested on this topic are on vacation, so we probably need to wait till the end of the summer so everyone has a chance to chime in.
>
> JEP may feel like a lot of process, but for this change, I think it's the best avenue for exploration.
>
> Thanks
> Ioi

Thanks for the response, @iklam. Going through the JEP for the hotspot namespace makes a lot of sense. Waiting until the end of summer also sounds good to me.

Maybe it's still early to ask the question, would you be willing to drive the JEP for the namespace changes? I'm happy to write up the JEP otherwise.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1634731589

From duke at openjdk.org Thu Jul 13 18:54:27 2023
From: duke at openjdk.org (duke)
Date: Thu, 13 Jul 2023 18:54:27 GMT
Subject: Withdrawn: JDK-8296360: Track native memory used by zlib via NMT
In-Reply-To:
References:
Message-ID:

On Fri, 4 Nov 2022 14:35:00 GMT, Thomas Stuefe wrote:

> This patch adds NMT tracking to the zlib.
>
> *Please note: we currently discuss whether NMT can be expanded across the JDK in this ML discussion [1]. This PR depends on the outcome of that discussion and won't proceed unless greenlighted. But since [1] is stalled, I post the PR for RFR to get some feedback on the code itself and for people to try it out.*
>
> NMT tracks hotspot native allocations but does not cover the JDK (apart from small exceptions). But the native memory footprint of JDK libraries can be very significant. Recently we had a customer whose zlib footprint went into the ~40GB range. We analyzed the problem with an in-house memory tracker, but things would have been a lot simpler using NMT.
>
> This patch instruments the zlib via zalloc hooks, which is minimally invasive. It does not touch zlib sources, so it works with both the bundled zlib and system zlib. All the instrumentation happens in the JVM zlib wrapper.
>
>
> -   j.u.zip (deflate) (reserved=624, committed=624)
>                       (malloc=624 #3)
>
> -   j.u.zip (inflate) (reserved=221377904, committed=221377904)
>                       (malloc=221377904 #60877)
>
> -         Zip (other) (reserved=8163446896, committed=8163446896)
>                       (malloc=8163446896 #182622)
>
>
> [1] https://mail.openjdk.org/pipermail/core-libs-dev/2022-November/096197.html

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.org/jdk/pull/10988

From kvn at openjdk.org Thu Jul 13 18:56:17 2023
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Thu, 13 Jul 2023 18:56:17 GMT
Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v3]
In-Reply-To:
References: <1rgJ4y7S7okKDolNYuSEPgYuN73rw63p8IROzbRQz_w=.a0f34434-e483-42a3-8bb0-bff2838b0603@github.com>
 <9HNzCfbkF-A9hUY_p7gGDHV3mKVx6CqT41TmtY1A8qU=.e8c6106c-30a6-4fad-8b3c-72c4e20ade75@github.com>
Message-ID:

On Thu, 13 Jul 2023 18:23:49 GMT, Doug Simon wrote:

>> test/jtreg-ext/requires/VMProps.java line 127:
>>
>>> 125:         map.put("vm.graal.enabled", this::isGraalEnabled);
>>> 126:         // vm.libgraal is true if the libgraal shared library file is present
>>> 127:         map.put("vm.libgraal", this::hasLibraal);
>>
>> May be the property also should be named "vm.hasLibgraal" to be clear what it means.
>
> I'm not so sure. The current naming seems consistent with other existing properties:
>
>     map.put("vm.jvmti", this::vmHasJVMTI);
>     map.put("vm.jvmci", this::vmJvmci);
>
> However, I can change it if you insist.

I would prefer naming similar to "vm.hasJFR", "vm.hasSA", "vm.hasDTrace". It is unfortunate that we used "vm.jvmci" and "vm.jvmti" naming to check presence of features.

Also libgraal is not part of VM. May be you need to use "jdk.hasLibgraal". There are other properties which use "jdk." already.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14851#discussion_r1262943734 From stuefe at openjdk.org Thu Jul 13 18:56:56 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 13 Jul 2023 18:56:56 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v10] In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. 
That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... 
Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: Bikeshed Trim log lines ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14781/files - new: https://git.openjdk.org/jdk/pull/14781/files/e821d518..c8d46436 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=08-09 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781 PR: https://git.openjdk.org/jdk/pull/14781 From stuefe at openjdk.org Thu Jul 13 18:56:57 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 13 Jul 2023 18:56:57 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v8] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: <6LrnfldcsMG6bd6RgcNi24V-eEnNmWhOM8WFAMab8Q0=.a1ca0263-ff73-431d-8dc5-b7e7f4608479@github.com> On Thu, 13 Jul 2023 18:02:08 GMT, Aleksey Shipilev wrote: > Yes, the only thing left is to bikeshed the logging statements a little. Okay done. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1634730982 From shade at openjdk.org Thu Jul 13 18:57:05 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 13 Jul 2023 18:57:05 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v9] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 13 Jul 2023 18:08:03 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. 
>> ...
>
> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 35 additional commits since the last revision:
>
>  - Fix windows build
>  - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap
>  - Alekseys patch
>  - Make test spikes more pronounced
>  - Dont query procfs if logging is off
>  - rename logtag again
>  - When probing for safepoint end, use the smaller of (interval, 250ms)
>  - Remove TrimNativeHeap and expand TrimNativeHeapInterval
>  - Improve comments for non-supportive platforms
>  - Aleksey cosmetics
>  - ... and 25 more: https://git.openjdk.org/jdk/compare/d177dba7...e821d518

Want to replace "Native heap trimmer" with "Periodic native heap trimmer" too? Would be clear that we are suspending only the periodic one. The DCmd command would still be accepted and acted upon.

Thinking about it, maybe we should do a follow-up PR and just forward that request to this thread? If so, we don't need to rename it to "Periodic".
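
For readers following the trimmer discussion: per the PR description quoted throughout this thread, the manual trim comes down to glibc's `malloc_trim(3)`. Below is a minimal sketch of a malloc spike followed by an explicit trim. The function name `spike_then_trim` and its parameters are invented for illustration, and this is not the HotSpot implementation; it only assumes that `malloc_trim(0)` is available when compiling against glibc.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>
#if defined(__GLIBC__)
#include <malloc.h>   // malloc_trim is a glibc extension
#endif

// Hypothetical helper: simulates a temporary malloc spike and then asks
// glibc to hand free C-heap memory back to the OS.
// Returns 1 if glibc reports that memory was released, 0 if nothing was
// released, and -1 if malloc_trim is not available on this platform.
int spike_then_trim(std::size_t blocks, std::size_t block_size) {
  std::vector<void*> ptrs;
  ptrs.reserve(blocks);
  for (std::size_t i = 0; i < blocks; ++i) {
    void* p = std::malloc(block_size);
    if (p == nullptr) {
      break;                          // stop the spike if allocation fails
    }
    std::memset(p, 1, block_size);    // touch the memory so it counts towards RSS
    ptrs.push_back(p);
  }
  for (void* p : ptrs) {
    std::free(p);                     // freed, but glibc may keep the pages cached
  }
#if defined(__GLIBC__)
  return malloc_trim(0);              // pad = 0: release as much as possible
#else
  return -1;
#endif
}
```

Calling `spike_then_trim(100000, 256)` and comparing RSS before and after is one way to observe the retention effect; how much is actually released depends on the glibc version and heap layout.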

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1634733370

From dnsimon at openjdk.org Thu Jul 13 19:22:11 2023
From: dnsimon at openjdk.org (Doug Simon)
Date: Thu, 13 Jul 2023 19:22:11 GMT
Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v3]
In-Reply-To:
References: <1rgJ4y7S7okKDolNYuSEPgYuN73rw63p8IROzbRQz_w=.a0f34434-e483-42a3-8bb0-bff2838b0603@github.com>
 <9HNzCfbkF-A9hUY_p7gGDHV3mKVx6CqT41TmtY1A8qU=.e8c6106c-30a6-4fad-8b3c-72c4e20ade75@github.com>
Message-ID: <_tyPX0dDfjSNBc_CM1eZjJ35eTQxApAI9wm2XoO9XaA=.7a22cc2d-2be7-40f7-a182-57676e2a356d@github.com>

On Thu, 13 Jul 2023 18:52:43 GMT, Vladimir Kozlov wrote:

>> I'm not so sure. The current naming seems consistent with other existing properties:
>>
>>     map.put("vm.jvmti", this::vmHasJVMTI);
>>     map.put("vm.jvmci", this::vmJvmci);
>>
>> However, I can change it if you insist.
>
> I would prefer naming similar to "vm.hasJFR", "vm.hasSA", "vm.hasDTrace". It is unfortunate that we used "vm.jvmci" and "vm.jvmti" naming to check presence of features.
>
> Also libgraal is not part of VM. May be you need to use "jdk.hasLibgraal". There are other properties which use "jdk." already.

Ok, `hasLibgraal` is fine.
In what sense is libgraal not part of the VM? I would have thought that a JIT compiler is more "vm" than "jdk"?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14851#discussion_r1262967313

From amenkov at openjdk.org Thu Jul 13 19:26:19 2023
From: amenkov at openjdk.org (Alex Menkov)
Date: Thu, 13 Jul 2023 19:26:19 GMT
Subject: RFR: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads
Message-ID:

The change fixes handling of "suspended" bit in VT state.
The code looks very strange. java_lang_VirtualThread::RUNNING == 3, so line 803 clears JVMTI_THREAD_STATE_ALIVE(1) and JVMTI_THREAD_STATE_TERMINATED(2)
Per log this code came from loom repo with VT integration.
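
The arithmetic described above can be sketched as follows. This is a minimal illustration, assuming the line in question masks a JVMTI state word with the complement of the virtual-thread `RUNNING` constant; it is not the actual HotSpot code. The JVMTI bit values are the ones from the JVMTI specification, while `kVThreadRunning` and `clear_with_mask` are names invented for this example.

```cpp
#include <cstdint>

// JVMTI thread state bits, as documented in the JVMTI specification.
constexpr std::uint32_t JVMTI_THREAD_STATE_ALIVE      = 0x0001;
constexpr std::uint32_t JVMTI_THREAD_STATE_TERMINATED = 0x0002;
constexpr std::uint32_t JVMTI_THREAD_STATE_RUNNABLE   = 0x0004;
constexpr std::uint32_t JVMTI_THREAD_STATE_SUSPENDED  = 0x100000;

// Hypothetical stand-in for java_lang_VirtualThread::RUNNING. It is an
// enumeration value (3), not a JVMTI bit, which is the source of the confusion.
constexpr std::uint32_t kVThreadRunning = 3;

// Clears every bit of 'mask' in 'state'.
constexpr std::uint32_t clear_with_mask(std::uint32_t state, std::uint32_t mask) {
  return state & ~mask;
}

// 3 == ALIVE | TERMINATED, so masking with ~3 drops the ALIVE bit of an
// alive, runnable, suspended thread instead of touching any "running" bit.
static_assert(kVThreadRunning ==
                  (JVMTI_THREAD_STATE_ALIVE | JVMTI_THREAD_STATE_TERMINATED),
              "RUNNING overlaps exactly the ALIVE and TERMINATED bits");
static_assert(clear_with_mask(JVMTI_THREAD_STATE_ALIVE |
                                  JVMTI_THREAD_STATE_RUNNABLE |
                                  JVMTI_THREAD_STATE_SUSPENDED,
                              kVThreadRunning) ==
                  (JVMTI_THREAD_STATE_RUNNABLE | JVMTI_THREAD_STATE_SUSPENDED),
              "ALIVE is lost after masking");
```

The two `static_assert`s are checked at compile time, so the block documents the overlap without needing a `main`.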

Testing: tier1-4, updated GetThreadStateMountedTest.java

-------------

Commit messages:
 - fix

Changes: https://git.openjdk.org/jdk/pull/14878/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14878&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8310584
  Stats: 4 lines in 2 files changed: 0 ins; 3 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/14878.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14878/head:pull/14878

PR: https://git.openjdk.org/jdk/pull/14878

From mchung at openjdk.org Thu Jul 13 19:30:05 2023
From: mchung at openjdk.org (Mandy Chung)
Date: Thu, 13 Jul 2023 19:30:05 GMT
Subject: RFR: 8311500: StackWalker.getCallerClass() throws UOE if invoked reflectively [v3]
In-Reply-To:
References:
Message-ID:

On Tue, 11 Jul 2023 15:47:35 GMT, Volker Simonis wrote:

>> As the included jtreg test demonstrates, `StackWalker.getCallerClass()` can throw an `UnsupportedOperationException` if called reflectively. Currently this only happens if we invoke `StackWalker.getCallerClass()` recursively reflectively, but this issue will become more prominent once we fix [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447). The gory details follow below:
>>
>> The protocol between the Java API and the JVM for `StackWalker.getCallerClass()/walk()` is as follows:
>> - On the Java side, `StackWalker` calls into `StackStreamFactory` for the real work.
>> - For `StackWalker.getCallerClass()` `StackStreamFactory` basically creates a `Class[]` which will be passed down and filled in the JVM. For `StackWalker.walk()` it will normally be a `StackFrameInfo[]` (or a `LiveStackFrameInfo[]` if the internal `ExtendedOption.LOCALS_AND_OPERANDS` option was used).
>> - The default size of these arrays is currently `StackStreamFactory.SMALL_BATCH` which is 8 (but see [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447)).
>> - `StackStreamFactory` then calls `AbstractStackWalker.callStackWalk()` which is natively implemented in the VM by `JVM_CallStackWalk()`.
>> - `JVM_CallStackWalk()` calls `StackWalk::walk()` which calls `StackWalk::fetchFirstBatch()` which calls `StackWalk::fill_in_frames()` which walks the stack and fills in the available class/stackframe slots in the passed in array until the array is full or there are no more stack frames.
>> - Once `StackWalk::fill_in_frames()` returns, `StackWalk::fetchFirstBatch()` calls back to Java by invoking `AbstractStackWalker::doStackWalk()` to consume the result.
>> - `AbstractStackWalker::doStackWalk()` calls `consumeFrames()` (which is overridden depending on whether we initially called `getCallerClass()` or `walk()`) which consumes the frames until it either finishes (e.g. finds the caller class) or until there are no more frames.
>> - In the latter case `consumeFrames()` will call into the VM again by calling `AbstractStackWalker.fetchStackFrames()` to fetch additional frames from the stack.
>> - `AbstractStackWalker.fetchStackFrames()` is implemented by `JVM_MoreStackWalk()` which calls `StackWalk::fetchNextBatch()` which calls `StackWalk::fill_in_frames()` (the same method that already fetched the initial batch of frames).
>>
>> Following is a stacktrace of what I've explained so far:
>>
>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>> V [libjvm.so+0x1...
>
> Volker Simonis has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fixed case when calling getCallerClass() from a @CallerSensitive method reflectively

I wonder if we should move the check that throws UOE when called by a caller-sensitive method into Java, as general guidance to implement the runtime in Java where desirable. That means the VM must fill in not only the class in the buffer but also a flag indicating whether the method is caller-sensitive. It's a tradeoff of the buffer size. The common case for `getCallerClass` is being invoked statically, and it should find the class in the first batch.
Only if it's invoked reflectively and if filtered in the Java, it'll fetch more batches and hence the performance would not be as fast as filtered in VM but I think that's okay since it's uncommon. Would you have cycle to implement this alternative and determine any performance impact to common cases? Then evaluate this further. The benchmark is at `test/micro/org/openjdk/bench/java/lang/StackWalkBench.java`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14773#issuecomment-1634783438 From iklam at openjdk.org Thu Jul 13 19:47:05 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 13 Jul 2023 19:47:05 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v4] In-Reply-To: References: <3mXE9RE6cj-apUgKbUvy0yxneWeSXHmMURj0JMhZKxw=.0a532743-5258-47c2-84bf-590f0a2660cc@github.com> <7RYF5SEdxG-CBlpMo4Ts1ZznVZPTdU9VHmhalQvKW-Q=.0d515dc8-666a-49a5-9098-2caeeaaf8d79@github.com> Message-ID: On Thu, 13 Jul 2023 18:49:47 GMT, Jiangli Zhou wrote: > Maybe it's still early to ask the question, would you be willing to drive the JEP for the namespace changes? I'm happy to write up the JEP otherwise. I'll be happy to facilitate the discussion for this topic, but I am really not an expert on C++ namespaces. Also, I think such a JEP should be led by someone who has an interest of putting HotSpot in namespaces. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1634811516 From shade at openjdk.org Thu Jul 13 19:59:18 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 13 Jul 2023 19:59:18 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v10] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 13 Jul 2023 18:56:56 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. 
I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? 
>> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Bikeshed Trim log lines Realized that in production, we would like to see why trimmer might be late. I think this would look even better: [trimnative-shipilev-2.patch](https://github.com/openjdk/jdk/files/12043977/trimnative-shipilev-2.patch) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1634823724 From kvn at openjdk.org Thu Jul 13 20:54:54 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 13 Jul 2023 20:54:54 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v3] In-Reply-To: <_tyPX0dDfjSNBc_CM1eZjJ35eTQxApAI9wm2XoO9XaA=.7a22cc2d-2be7-40f7-a182-57676e2a356d@github.com> References: <1rgJ4y7S7okKDolNYuSEPgYuN73rw63p8IROzbRQz_w=.a0f34434-e483-42a3-8bb0-bff2838b0603@github.com> <9HNzCfbkF-A9hUY_p7gGDHV3mKVx6CqT41TmtY1A8qU=.e8c6106c-30a6-4fad-8b3c-72c4e20ade75@github.com> <_tyPX0dDfjSNBc_CM1eZjJ35eTQxApAI9wm2XoO9XaA=.7a22cc2d-2be7-40f7-a182-57676e2a356d@github.com> Message-ID: On Thu, 13 Jul 2023 19:18:47 GMT, Doug Simon wrote: >> I would prefer naming similar to "vm.hasJFR", "vm.hasSA", "vm.hasDTrace". 
It is unfortunate that we used "vm.jvmci" and "vm.jvmti" naming to check presence of features. >> >> Also libgraal is not part of VM. May be you need to use "jdk.hasLibgraal". There are other properties which use "jdk." already. > > Ok, `hasLibgraal` is fine. > In what sense is libgraal not part of the VM? I would have thought that a JIT compiler is more "vm" than "jdk"? In a sense that the libgraal **file** is separate from VM and its generation is not part of HotSpot VM build. `hasLibgraal` property checks the presence of this separate file and not a feature which is part of VM build (like jvmci). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14851#discussion_r1263041686 From kvn at openjdk.org Thu Jul 13 20:54:55 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 13 Jul 2023 20:54:55 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v3] In-Reply-To: References: <1rgJ4y7S7okKDolNYuSEPgYuN73rw63p8IROzbRQz_w=.a0f34434-e483-42a3-8bb0-bff2838b0603@github.com> <9HNzCfbkF-A9hUY_p7gGDHV3mKVx6CqT41TmtY1A8qU=.e8c6106c-30a6-4fad-8b3c-72c4e20ade75@github.com> <_tyPX0dDfjSNBc_CM1eZjJ35eTQxApAI9wm2XoO9XaA=.7a22cc2d-2be7-40f7-a182-57676e2a356d@github.com> Message-ID: On Thu, 13 Jul 2023 20:49:11 GMT, Vladimir Kozlov wrote: >> Ok, `hasLibgraal` is fine. >> In what sense is libgraal not part of the VM? I would have thought that a JIT compiler is more "vm" than "jdk"? > > In a sense that the libgraal **file** is separate from VM and its generation is not part of HotSpot VM build. > `hasLibgraal` property checks the presence of this separate file and not a feature which is part of VM build (like jvmci). "vm.libgraal.enabled" is VM's property because it enables VM's Graal JIT version which uses code in libgraal. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14851#discussion_r1263043724 From kvn at openjdk.org Thu Jul 13 21:02:15 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 13 Jul 2023 21:02:15 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v3] In-Reply-To: References: <1rgJ4y7S7okKDolNYuSEPgYuN73rw63p8IROzbRQz_w=.a0f34434-e483-42a3-8bb0-bff2838b0603@github.com> <9HNzCfbkF-A9hUY_p7gGDHV3mKVx6CqT41TmtY1A8qU=.e8c6106c-30a6-4fad-8b3c-72c4e20ade75@github.com> <_tyPX0dDfjSNBc_CM1eZjJ35eTQxApAI9wm2XoO9XaA=.7a22cc2d-2be7-40f7-a182-57676e2a356d@github.com> Message-ID: <6IOVoHyDM9cvcvRairnaxu7gyRHLV9S8N-Vcbfk3cCg=.9dfe5dd5-04a6-4274-a78b-de497ab59757@github.com> On Thu, 13 Jul 2023 20:52:02 GMT, Vladimir Kozlov wrote: >> In a sense that the libgraal **file** is separate from VM and its generation is not part of HotSpot VM build. >> `hasLibgraal` property checks the presence of this separate file and not a feature which is part of VM build (like jvmci). > > "vm.libgraal.enabled" is VM's property because it enables VM's Graal JIT version which uses code in libgraal. On the other hand, it is not really a big matter. I am fine with "vm.hasLibgraal" since it is used only by the VM. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14851#discussion_r1263049447 From dholmes at openjdk.org Thu Jul 13 21:15:28 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 13 Jul 2023 21:15:28 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v9] In-Reply-To: References: Message-ID: On Fri, 7 Jul 2023 18:01:14 GMT, Doug Simon wrote: >> In the development of libgraal, it has been very useful to see why a given class is loaded (e.g., trying to reduce startup time by avoiding unnecessary eager class loading). One way to do this is to see the stack trace when the VM loads a class. The only possibility currently is to add a static initializer to the class of interest.
However, not only is this not always possible but it doesn't correlate with class loading but with class initialization. >> >> This PR adds support for `-Xlog:class+load+cause` and `-Xlog:class+load+cause+native` that produce output according to a new `LogClassLoadingCauseFor` VM flag: >> >> >> product(ccstr, LogClassLoadingCauseFor, nullptr, \ >> "Apply -Xlog:class+load+cause* to classes whose fully " \ >> "qualified name contains this string ("*" matches " \ >> "any class).") \ >> >> >> Example usage: >> >> java "-Xlog:class+load+cause*" -XX:LogClassLoadingCauseFor=java.util.concurrent.ConcurrentHashMap$V --version >> [0.075s][info][class,load,cause] Java stack when loading java.util.concurrent.ConcurrentHashMap$ValuesView: >> [0.075s][info][class,load,cause] at java.util.concurrent.ConcurrentHashMap.values(java.base/ConcurrentHashMap.java:1263) >> [0.075s][info][class,load,cause] at jdk.internal.loader.NativeLibraries.find(java.base/NativeLibraries.java:102) >> [0.075s][info][class,load,cause] at java.lang.ClassLoader.findNative(java.base/ClassLoader.java:2457) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.init(java.base/Native Method) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixNativeDispatcher.(java.base/UnixNativeDispatcher.java:817) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystem.(java.base/UnixFileSystem.java:96) >> [0.075s][info][class,load,cause] at sun.nio.fs.BsdFileSystem.(java.base/BsdFileSystem.java:50) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystem.(java.base/MacOSXFileSystem.java:52) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:44) >> [0.075s][info][class,load,cause] at sun.nio.fs.MacOSXFileSystemProvider.newFileSystem(java.base/MacOSXFileSystemProvider.java:37) >> [0.075s][info][class,load,cause] at sun.nio.fs.UnixFileSystemProvider.(java.base/UnixFileSystemProvider.java:78) >> 
[0.075s][info][class,load,cau... > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > refactor body of print_class_load_logging in print_class_load_helper src/hotspot/share/oops/instanceKlass.cpp line 3835: > 3833: cfs->length(), > 3834: ClassLoader::crc32(0, (const char*)cfs->buffer(), > 3835: cfs->length())); The indentation changes you made here (and elsewhere) are incorrect - the original was correct: arguments should line up ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1263059028 From dnsimon at openjdk.org Thu Jul 13 21:29:25 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Thu, 13 Jul 2023 21:29:25 GMT Subject: RFR: 8193513: add support for printing a stack trace on class loading [v9] In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 21:12:18 GMT, David Holmes wrote: >> Doug Simon has updated the pull request incrementally with one additional commit since the last revision: >> >> refactor body of print_class_load_logging in print_class_load_helper > > src/hotspot/share/oops/instanceKlass.cpp line 3835: > >> 3833: cfs->length(), >> 3834: ClassLoader::crc32(0, (const char*)cfs->buffer(), >> 3835: cfs->length())); > > The indentation changes you made here (and elsewhere) are incorrect - the original was correct: arguments should line up Sorry about that. I'll piggy back a fix onto https://github.com/openjdk/jdk/pull/14851. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14553#discussion_r1263067992 From dnsimon at openjdk.org Thu Jul 13 21:36:17 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Thu, 13 Jul 2023 21:36:17 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v3] In-Reply-To: <6IOVoHyDM9cvcvRairnaxu7gyRHLV9S8N-Vcbfk3cCg=.9dfe5dd5-04a6-4274-a78b-de497ab59757@github.com> References: <1rgJ4y7S7okKDolNYuSEPgYuN73rw63p8IROzbRQz_w=.a0f34434-e483-42a3-8bb0-bff2838b0603@github.com> <9HNzCfbkF-A9hUY_p7gGDHV3mKVx6CqT41TmtY1A8qU=.e8c6106c-30a6-4fad-8b3c-72c4e20ade75@github.com> <_tyPX0dDfjSNBc_CM1eZjJ35eTQxApAI9wm2XoO9XaA=.7a22cc2d-2be7-40f7-a182-57676e2a356d@github.com> <6IOVoHyDM9cvcvRairnaxu7gyRHLV9S8N-Vcbfk3cCg=.9dfe5dd5-04a6-4274-a78b-de497ab59757@github.com> Message-ID: On Thu, 13 Jul 2023 20:59:10 GMT, Vladimir Kozlov wrote: >> "vm.libgraal.enabled" is VM's property because it enables VM's Graal JIT version which uses code in libgraal. > > On the other hand, it is not really a big matter. I am fine with "vm.hasLibgraal" since it is used only by the VM. You've convinced me - I'll rename it to `jdk.hasLibgraal`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14851#discussion_r1263072718 From dnsimon at openjdk.org Thu Jul 13 21:55:34 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Thu, 13 Jul 2023 21:55:34 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v4] In-Reply-To: References: Message-ID: > This PR adds support for `jdk.hasLibgraal` and `vm.libgraal.enabled` properties that can be used with the jtreg `@requires` tag to determine if a libgraal specific test should run.
> > * `jdk.hasLibgraal`: the libgraal shared library file is present > * `vm.libgraal.enabled`: libgraal is used as JIT compiler Doug Simon has updated the pull request incrementally with three additional commits since the last revision: - rename vm.hasLibgraal to jdk.hasLibgraal - rename vm.libgraal to vm.hasLibgraal - fix indentation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14851/files - new: https://git.openjdk.org/jdk/pull/14851/files/ce7c1cba..05760ec3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14851&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14851&range=02-03 Stats: 9 lines in 3 files changed: 0 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/14851.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14851/head:pull/14851 PR: https://git.openjdk.org/jdk/pull/14851 From duke at openjdk.org Thu Jul 13 22:44:48 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Thu, 13 Jul 2023 22:44:48 GMT Subject: RFR: 8311976: Inconsistency in usage of CITimeVerbose to generate compilation logs Message-ID: Please review this PR for controlling timing information of C1 compilation phases using CITimeVerbose option, same as for C2 compilations. I also took this opportunity to fix some other minor issues with logging: 1. The PhaseTraceTime object should be constructed after setting the compiler data as PhaseTraceTime constructor calls `Compilation::current()`. For this reason I moved the statement `PhaseTraceTime timeit(_t_compile)` after the call to `_env->set_compiler_data(this);`. 2. Previous step also allowed to remove the nullptr check for `Compilation::current()` in PhaseTraceTime constructor. 3. I noticed the call to CompileLog->done() only prints `phase_done` tag and ignores all other parameters passed to it. This was due to a bug in `xmlStream::va_done` which is also fixed in here. 4.
Remove unnecessary statements in TracePhase destructor as the object already has the fields computed in the constructor. 5. Some bikeshedding like TimerName -> TimerId and TracePhase::C -> TracePhase::_compile I felt these are all minor fixes so I clubbed them together here. If it feels inappropriate I can pull them into their own PRs. Testing: GHA testing passed ------------- Commit messages: - 8311976: Inconsistency in usage of CITimeVerbose to generate compilation logs Changes: https://git.openjdk.org/jdk/pull/14880/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14880&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311976 Stats: 39 lines in 4 files changed: 3 ins; 19 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/14880.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14880/head:pull/14880 PR: https://git.openjdk.org/jdk/pull/14880 From kvn at openjdk.org Thu Jul 13 23:02:13 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 13 Jul 2023 23:02:13 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v4] In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 21:55:34 GMT, Doug Simon wrote: >> This PR adds support for `jdk.hasLibgraal` and `vm.libgraal.enabled` properties that can be used with the jtreg `@requires` tag to determine if a libgraal specific test should run. >> >> * `jdk.hasLibgraal`: the libgraal shared library file is present >> * `vm.libgraal.enabled`: libgraal is used as JIT compiler > > Doug Simon has updated the pull request incrementally with three additional commits since the last revision: > > - rename vm.hasLibgraal to jdk.hasLibgraal > - rename vm.libgraal to vm.hasLibgraal > - fix indentation Good. ------------- Marked as reviewed by kvn (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14851#pullrequestreview-1529343167 From kvn at openjdk.org Thu Jul 13 23:10:16 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 13 Jul 2023 23:10:16 GMT Subject: RFR: 8311976: Inconsistency in usage of CITimeVerbose to generate compilation logs In-Reply-To: References: Message-ID: <4qSH1d3mzgu1jtSa-O3r-MqBqzpzWpCYHK5Bn0lc14Q=.e2aae8a2-5639-4db9-9a25-1718d9f6bbf4@github.com> On Thu, 13 Jul 2023 22:35:43 GMT, Ashutosh Mehra wrote: > Please review this PR for controlling timing information of C1 compilation phases using CITimeVerbose option, same as for C2 compilations. > I also took this opportunity to fix some other minor issues with logging: > 1. The PhaseTraceTime object should be constructed after setting the compiler data as PhaseTraceTime constructor calls `Compilation::current()`. For this reason I moved the statement `PhaseTraceTime timeit(_t_compile)` after the call to `_env->set_compiler_data(this);`. > 2. Previous step also allowed to remove the nullptr check for `Compilation::current()` in PhaseTraceTime constructor. > 3. I noticed the call to CompileLog->done() only prints `phase_done` tag and ignores all other parameters passed to it. This was due to a bug in `xmlStream::va_done` which is also fixed in here. > 4. Remove unnecessary statements in TracePhase destructor as the object already has the fields computed in the constructor. > 5. Some bikeshedding like TimerName -> TimerId and TracePhase::C -> TracePhase::_compile > > I felt these are all minor fixes so I clubbed them together here. If it feels inappropriate I can pull them into their own PRs. > > Testing: GHA testing passed @ashu-mehra, can you show output before and after these changes? It would be nice to see what you are fixing.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14880#issuecomment-1635039351 From duke at openjdk.org Thu Jul 13 23:56:14 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Thu, 13 Jul 2023 23:56:14 GMT Subject: RFR: 8311976: Inconsistency in usage of CITimeVerbose to generate compilation logs In-Reply-To: <4qSH1d3mzgu1jtSa-O3r-MqBqzpzWpCYHK5Bn0lc14Q=.e2aae8a2-5639-4db9-9a25-1718d9f6bbf4@github.com> References: <4qSH1d3mzgu1jtSa-O3r-MqBqzpzWpCYHK5Bn0lc14Q=.e2aae8a2-5639-4db9-9a25-1718d9f6bbf4@github.com> Message-ID: On Thu, 13 Jul 2023 23:07:35 GMT, Vladimir Kozlov wrote: >> Please review this PR for controlling timing information of C1 compilation phases using CITimeVerbose option, same as for C2 compilations. >> I also took this opportunity to fix some other minor issues with logging: >> 1. The PhaseTraceTime object should be constructed after setting the compiler data as PhaseTraceTime constructor calls `Compilation::current()`. For this reason I moved the statement `PhaseTraceTime timeit(_t_compile)` after the call to `_env->set_compiler_data(this);`. >> 2. Previous step also allowed to remove the nullptr check for `Compilation::current()` in PhaseTraceTime constructor. >> 3. I noticed the call to CompileLog->done() only prints `phase_done` tag and ignores all other parameters passed to it. This was due to a bug in `xmlStream::va_done` which is also fixed in here. >> 4. Remove unnecessary statements in TracePhase destructor as the object already has the fields computed in the constructor. >> 5. Some bikeshedding like TimerName -> TimerId and TracePhase::C -> TracePhase::_compile >> >> I felt these are all minor fixes so I clubbed them together here. If it feels inappropriate I can pull them into their own PRs. >> >> Testing: GHA testing passed > > @ashu-mehra, can you show output before and after these changes? It would be nice to see what you are fixing. @vnkozlov The first change is in the amount of information in the C1 compilation log.
Before the fix with -XX:+LogCompilation (irrespective of CITimeVerbose), one example of each C1 and C2 compilation: [log excerpt not preserved] Note that for C1 all phases are displayed, but not for C2. To include the phase information for C2, `CITimeVerbose` needs to be specified. After the fix with -XX:+LogCompilation, example of C1 compilation: [log excerpt not preserved] and with -XX:+LogCompilation -XX:+CITimeVerbose example of C1 compilation: [log excerpt not preserved] Note that phase information for C1 compilation is now displayed only with `CITimeVerbose` option. Output of C2 compilation in this context does not change with this patch. The other change is in the `phase_done` tag to include the name of the phase just completed along with other data. Before the fix: [log excerpt not preserved] After the fix: [log excerpt not preserved] This makes it easier to trace the start and end of a phase even when there are sub-phases in between. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14880#issuecomment-1635066400 From dholmes at openjdk.org Fri Jul 14 01:03:23 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 14 Jul 2023 01:03:23 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v4] In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 21:55:34 GMT, Doug Simon wrote: >> This PR adds support for `jdk.hasLibgraal` and `vm.libgraal.enabled` properties that can be used with the jtreg `@requires` tag to determine if a libgraal specific test should run. >> >> * `jdk.hasLibgraal`: the libgraal shared library file is present >> * `vm.libgraal.enabled`: libgraal is used as JIT compiler > > Doug Simon has updated the pull request incrementally with three additional commits since the last revision: > > - rename vm.hasLibgraal to jdk.hasLibgraal > - rename vm.libgraal to vm.hasLibgraal > - fix indentation test/jtreg-ext/requires/VMProps.java line 498: > 496: * @return true if the libgraal shared library file is present.
> 497: */ > 498: protected String hasLibraal() { Typo: 'g' is missing ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14851#discussion_r1263172925 From kvn at openjdk.org Fri Jul 14 02:08:04 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 14 Jul 2023 02:08:04 GMT Subject: RFR: 8311976: Inconsistency in usage of CITimeVerbose to generate compilation logs In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 22:35:43 GMT, Ashutosh Mehra wrote: > Please review this PR for controlling timing information of C1 compilation phases using CITimeVerbose option, same as for C2 compilations. > I also took this opportunity to fix some other minor issues with logging: > 1. The PhaseTraceTime object should be constructed after setting the compiler data as PhaseTraceTime constructor calls `Compilation::current()`. For this reason I moved the statement `PhaseTraceTime timeit(_t_compile)` after the call to `_env->set_compiler_data(this);`. > 2. Previous step also allowed to remove the nullptr check for `Compilation::current()` in PhaseTraceTime constructor. > 3. I noticed the call to CompileLog->done() only prints `phase_done` tag and ignores all other parameters passed to it. This was due to a bug in `xmlStream::va_done` which is also fixed in here. > 4. Remove unnecessary statements in TracePhase destructor as the object already has the fields computed in the constructor. > 5. Some bikeshedding like TimerName -> TimerId and TracePhase::C -> TracePhase::_compile > > I felt these are all minor fixes so I clubbed them together here. If it feels inappropriate I can pull them into their own PRs. > > Testing: GHA testing passed Looks good. Thank you for showing output change. ------------- Marked as reviewed by kvn (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14880#pullrequestreview-1529480673 From dholmes at openjdk.org Fri Jul 14 02:50:15 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 14 Jul 2023 02:50:15 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v4] In-Reply-To: <3mXE9RE6cj-apUgKbUvy0yxneWeSXHmMURj0JMhZKxw=.0a532743-5258-47c2-84bf-590f0a2660cc@github.com> References: <3mXE9RE6cj-apUgKbUvy0yxneWeSXHmMURj0JMhZKxw=.0a532743-5258-47c2-84bf-590f0a2660cc@github.com> Message-ID: On Wed, 12 Jul 2023 19:29:31 GMT, Jiangli Zhou wrote: >> Move StringTable to 'hotspot_jvm' namespace. > > Jiangli Zhou has updated the pull request incrementally with one additional commit since the last revision: > > Add hotspot_jvm:: to StringTable in javaClasses.hpp. I am concerned that something that [JDK-8303796](https://bugs.openjdk.org/browse/JDK-8303796) claimed to be a small enhancement is in fact requiring a lot of changes in different areas of the codebase. There are already a number of "duplicate symbol definition" issues that have been filed and fixed in an ad-hoc manner in the JDK, which is the kind of whack-a-mole approach that we should not be taking. If static linking requires symbol isolation then a generally applicable approach to doing that should be investigated (in a separate project? Leyden?) and then brought to mainline via a JEP (not just for the hotspot aspect as currently suggested). 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1635186576 From fyang at openjdk.org Fri Jul 14 03:23:56 2023 From: fyang at openjdk.org (Fei Yang) Date: Fri, 14 Jul 2023 03:23:56 GMT Subject: RFR: 8311862: RISC-V: small improvements to slli [v2] In-Reply-To: References: Message-ID: <5ynRnUT23xx3Pszrjx0JuXXOotRcCyo5gT2BrermTOQ=.10875e35-cf7a-4727-bb42-5bf9162dd16a@github.com> On Wed, 12 Jul 2023 07:35:20 GMT, Ilya Gavrilin wrote: >> Please review this small change for slli and slli.uw >> slli change allows to replace slli Rd, Rs, 0 with addi Rd, Rs, 0 (and no operation emitted if Rd == Rs) >> addi with 0 has higher chances to be just a register renaming and not utilise ALU at all >> We have observed small positive effect on hifive (and no change on thead). >> Also this patch changes slli.uw and allows it to be used without additional check for UseZba, also providing fallback when Zba is not available >> testing: tier1 and tier2 on hifive, also hotspot/jtreg/compiler/intrinsics/string tests on Qemu with UseZba >> >> performance on hifive, before: >> | Benchmark | Mode | Cnt | Score | | Error | Units | >> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:| >> | StringIndexOf.advancedWithShortSub1 | avgt | 25 | 4035.143 | ± | 191.262 | ns/op | >> | StringIndexOf.advancedWithShortSub2 | avgt | 25 | 1145.807 | ± | 14.610 | ns/op | >> >> with patch: >> | Benchmark | Mode | Cnt | Score | | Error | Units | >> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:| >> | StringIndexOf.advancedWithShortSub1 | avgt | 25 | 3613.943 | ± | 178.153 | ns/op | >> | StringIndexOf.advancedWithShortSub2 | avgt | 25 | 923.169 | ± | 47.123 | ns/op | > > Ilya Gavrilin has updated the pull request incrementally with one additional commit since the last revision: > > Fix typo in macroAssembler_riscv.cpp Changes requested by fyang (Reviewer).
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4343: > 4341: } else { > 4342: slli(Rd, Rs, shamt + 32); > 4343: srli(Rd, Rd, 32); I don't think this code in else block will work in the case when `shamt` >= 32. Note that the `slli.uw` instruction is the same as `slli` with `zext.w` performed on the least-significant word of `Rs` before shifting. So you might want to do a combination of 32-bit zero extension and `slli` on `Rs` instead. ------------- PR Review: https://git.openjdk.org/jdk/pull/14823#pullrequestreview-1529534808 PR Review Comment: https://git.openjdk.org/jdk/pull/14823#discussion_r1263249121 From sspitsyn at openjdk.org Fri Jul 14 03:45:55 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 14 Jul 2023 03:45:55 GMT Subject: RFR: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 19:18:38 GMT, Alex Menkov wrote: > The change fixes handling of "suspended" bit in VT state. > The code looks very strange. > java_lang_VirtualThread::RUNNING == 3, so line 803 clears JVMTI_THREAD_STATE_ALIVE(1) and JVMTI_THREAD_STATE_TERMINATED(2) > Per log this code came from loom repo with VT integration. > > Testing: tier1-4, updated GetThreadStateMountedTest.java This looks good. Thank you for filing bug and fixing it! I've one question besides this fix. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14878#pullrequestreview-1529548022 From sspitsyn at openjdk.org Fri Jul 14 03:58:13 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 14 Jul 2023 03:58:13 GMT Subject: RFR: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 19:18:38 GMT, Alex Menkov wrote: > The change fixes handling of "suspended" bit in VT state. > The code looks very strange. 
> java_lang_VirtualThread::RUNNING == 3, so line 803 clears JVMTI_THREAD_STATE_ALIVE(1) and JVMTI_THREAD_STATE_TERMINATED(2) > Per log this code came from loom repo with VT integration. > > Testing: tier1-4, updated GetThreadStateMountedTest.java @alexmenkov Do you consider backporting this to 21? src/hotspot/share/prims/jvmtiEnvBase.cpp line 804: > 802: if (ext_suspended && ((state & JVMTI_THREAD_STATE_ALIVE) != 0)) { > 803: state |= JVMTI_THREAD_STATE_SUSPENDED; > 804: } One question unrelated to this bug and your fix. I wonder if any check and handling is needed for the case: `if (ext_suspended && ((state & JVMTI_THREAD_STATE_ALIVE) == 0))` Not sure this condition is even possible. But do we need to add an assert here? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14878#issuecomment-1635229248 PR Review Comment: https://git.openjdk.org/jdk/pull/14878#discussion_r1263261620 From dholmes at openjdk.org Fri Jul 14 05:07:14 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 14 Jul 2023 05:07:14 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v10] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 13 Jul 2023 18:56:56 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. 
Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... 
> > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Bikeshed Trim log lines How have you determined all of the places that need `NativeHeapTrimmer::SuspendMark`? Couple of minor comments below but otherwise this seems good. I'll take it for a spin through our CI. Thanks src/hotspot/share/runtime/globals.hpp line 1992: > 1990: "the platform supports that. Lower values will reclaim memory " \ > 1991: "more eagerly at the cost of higher overhead. A value of 0 " \ > 1992: "(default) disables the native heap trimming.") \ Nit: s/the// src/hotspot/share/runtime/synchronizer.cpp line 1650: > 1648: > 1649: static size_t delete_monitors(GrowableArray* delete_list) { > 1650: NativeHeapTrimmer::SuspendMark trim_native_pause("monitor deletion"); `sm` will do for the name - as per other uses. src/hotspot/share/runtime/trimNativeHeap.cpp line 83: > 81: > 82: // in seconds > 83: static double now() { return os::elapsedTime(); } Do you need the wrapper for this rather than using `os::elapsedTime()` directly? test/hotspot/gtest/runtime/test_trim_native.cpp line 44: > 42: NativeHeapTrimmer::SuspendMark sm2("Test2"); > 43: { > 44: NativeHeapTrimmer::SuspendMark sm3("Test3"); What is this actually testing? Everything could be a no-op. 
------------- PR Review: https://git.openjdk.org/jdk/pull/14781#pullrequestreview-1529587038 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1263283862 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1263284320 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1263285447 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1263292200 From vkempik at openjdk.org Fri Jul 14 05:21:15 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Fri, 14 Jul 2023 05:21:15 GMT Subject: RFR: 8311862: RISC-V: small improvements to slli [v2] In-Reply-To: <5ynRnUT23xx3Pszrjx0JuXXOotRcCyo5gT2BrermTOQ=.10875e35-cf7a-4727-bb42-5bf9162dd16a@github.com> References: <5ynRnUT23xx3Pszrjx0JuXXOotRcCyo5gT2BrermTOQ=.10875e35-cf7a-4727-bb42-5bf9162dd16a@github.com> Message-ID: On Fri, 14 Jul 2023 03:18:31 GMT, Fei Yang wrote: >> Ilya Gavrilin has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix typo in macroAssembler_riscv.cpp > > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4343: > >> 4341: } else { >> 4342: slli(Rd, Rs, shamt + 32); >> 4343: srli(Rd, Rd, 32); > > I don't think this code in else block will work in the case when `shamt` >= 32. Note that the `slli.uw` instruction is the same as `slli` with `zext.w` performed on the least-significant word of `Rs` before shifting. So you might want to do a combination of 32-bit zero extension and `slli` on `Rs` instead. slli.uw with shamt >= 32 will be same as ```add Rd, X0, X0```, isn't it ? 
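For anyone following the `shamt` question, here is a small stand-alone C++ simulation. It is not HotSpot code. It assumes the Zba definition of `slli.uw` (zero-extend the low 32 bits of the source register, then shift left by `shamt`) and models the two-instruction else-branch quoted above as plain 64-bit arithmetic. All function names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumed reference semantics of slli.uw (Zba): zero-extend the low word of
// rs, then shift left by shamt (0..63 on RV64).
uint64_t slli_uw_ref(uint64_t rs, unsigned shamt) {
  return (rs & 0xFFFFFFFFull) << shamt;
}

// Model of the else-branch under review: slli by (shamt + 32), then srli by 32.
// The first shift is only encodable while shamt + 32 <= 63.
uint64_t else_branch_under_review(uint64_t rs, unsigned shamt) {
  assert(shamt < 32);
  return (rs << (shamt + 32)) >> 32;
}

// One possible alternative (an assumption, not the final patch):
// zero-extend first (slli 32, srli 32), then apply the requested shift.
uint64_t zext_then_shift(uint64_t rs, unsigned shamt) {
  uint64_t low_word = (rs << 32) >> 32;
  return low_word << shamt;
}
```

Under that assumption, `slli_uw_ref(1, 32)` is `0x100000000` rather than zero, and the modelled else-branch already differs from the reference for `rs = 0x80000000`, `shamt = 1` (it yields 0, the reference yields `0x100000000`). The third function matches the reference for every `shamt` in 0..63.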
He can just add handling for that special case then ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14823#discussion_r1263300447 From stuefe at openjdk.org Fri Jul 14 05:31:20 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 05:31:20 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v10] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 13 Jul 2023 19:55:49 GMT, Aleksey Shipilev wrote: > Realized that in production, we would like to see why trimmer might be late. I think this would look even better: [trimnative-shipilev-2.patch](https://github.com/openjdk/jdk/files/12043977/trimnative-shipilev-2.patch) I thought about this too, but you don't really want to know if it was suspended for every wait interval, but for every trim interval. In other words, you want to know how many trims had been moved up because a safepoint had been happening, and how many trims had been skipped due to pause. Getting these infos is harder than just increasing a counter. Is it worth the added complexity? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1635294737 From stuefe at openjdk.org Fri Jul 14 05:39:12 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 05:39:12 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v9] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: <4A5ePuT8w2r4mV9m1gufLqFEV3qexzw4SLaX73uze64=.d2d1267b-60c0-4bd5-9613-963f2bb38a13@github.com> On Thu, 13 Jul 2023 18:51:09 GMT, Aleksey Shipilev wrote: > Want to replace "Native heap trimmer" with "Periodic native heap trimmer" too? Would be clear that we are suspending only the periodic one. The DCmd command would still be accepted and acted upon. 
Thinking about it, maybe we should do a follow-up PR and just forward that request to this thread? If so, we don't need to rename it to "Periodic". I had initially considered it and then avoided it since it would add too much unnecessary complexity. E.g., you would then have to start the native trimmer thread even if periodic trimming is off, on the off chance that someone may want to issue a trim from outside. Or, you'd need to start the trimmer thread delayed and on-demand if someone issues a dcmd; then you need to deal with the fact that the trimmer thread may or may not be there. Another issue then would be whether trimming would expedite the current trim, aka restart the interval timer, or be additive to the periodic trim. My first solution did all that, and it was a lot more complex. I like the simplicity of this patch. Customers issuing trims concurrently via dcmd will hopefully be a very rare occurrence - all those folks that used to issue the dcmd scripted every second hopefully will just switch to the periodic trimming. Idk. It's all solvable, it's just code I guess. Maybe for a later RFE?
Integrating the dcmd with periodic trimming may have one advantage: it keeps customers who issue the dcmd via script, and then forget they did that, from shooting themselves in the foot :-P ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1635301221 From dholmes at openjdk.org Fri Jul 14 05:41:18 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 14 Jul 2023 05:41:18 GMT Subject: RFR: JDK-8310233: Fix THP detection on Linux [v3] In-Reply-To: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> References: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> Message-ID: On Tue, 4 Jul 2023 13:02:22 GMT, Thomas Stuefe wrote: >> Today, if we use UseTransparentHugePages, we assume that the *static* hugepage detection we do is valid for THPs: >> - that THPs use the page size (in hotspot used as "default large page size") found in `/proc/meminfo` ("Hugepagesize") >> - that THPs are enabled if that page size is >0. >> >> Both assumptions are incorrect: >> >> - whether THPs are enabled should be checked at `/sys/kernel/mm/transparent_hugepage/enabled`, which is a tri-state value ("always", "madvise", "never"). THPs are available for the first two states. >> - The page size employed by `khugepaged` is set in `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size`. It can differ from the default page size used for static hugepages. For example, we could configure a system such that it uses 1G static hugepages, but the THP page size would still be 2M. >> >> ------ >> >> About the patch: >> >> This is a limited, minimally invasive patch to fix THP detection. The patch aims to be easy to downport. There is more work to do, which I will do in subsequent RFEs. >> >> Functionally, for *static* (non-THP) pages nothing changes. THP-mode now correctly detects THP support in the OS, and uses the correct page size (see examples below).
>> >> ------------- >> >> Example 1: System has THPs disabled, but static hugepages (1g, 2m) configured: >> >> >> thomas@starfish $ cat /sys/kernel/mm/transparent_hugepage/enabled >> always madvise [never] >> thomas@starfish $ cat /proc/meminfo | grep Hugepage >> Hugepagesize: 1048576 kB >> >> >> Without patch, we incorrectly assume THPs are enabled, and that THP page size is 1G (!), which we then proceed to use as heap page size, causing the heap size to be rounded up from 512m -> 1G. But - even though it is printed as "1G page backed" in log output - in reality it will still be 4K-page-backed: the `madvise(2)` we use to set the THP page size will be ignored by the system, since THPs are disabled. >> >> >> thomas@starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version >> [0.001s][info][pagesize] Using the default large page size: 1G >> [0.001s][info][pagesize] Usable page sizes: 4k, 2M, 1G >> ... >> [0.016s][info][pagesize] Heap: min=1G max=1G base=0x00000000c0000000 size=1G page_size=1G >> >> >> With patch, we correctly refuse to use large pages (and we log more info): >> >> >> thomas@starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version >> [0.001s][info][pagesize] Static... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Feedback johan Not an area I am familiar with but scanning through the code this seems reasonable. A few minor comments below. Thanks. src/hotspot/os/linux/hugepages.cpp line 92: > 90: > 91: // Given a file that contains a single (integral) number, return that number, 0 and false if failed. > 92: static bool read_number_file(const char* file, size_t* out) { Why `bool` return when you don't check it? `*out == 0` also indicates failure. Though why use an out parameter instead of just returning the value?
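The design question in this comment (a `bool` result plus an out parameter versus returning the value) can be illustrated with a minimal stand-alone sketch. This is not the code from `hugepages.cpp`; `parse_number_sketch` and `read_number_file_sketch` are hypothetical names, and the sketch only assumes that the file holds a single unsigned decimal number.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: parse one unsigned decimal number from text.
// Returns true and stores the value in *out; returns false and leaves
// *out untouched if no number could be read.
bool parse_number_sketch(const char* text, size_t* out) {
  assert(text != nullptr && out != nullptr);
  unsigned long long v = 0;
  if (std::sscanf(text, "%llu", &v) != 1) {
    return false;
  }
  *out = static_cast<size_t>(v);
  return true;
}

// Hypothetical file reader built on top of the parser. A missing or
// unreadable file is reported through the return value, so a legitimate
// value of 0 stays distinguishable from a failure.
bool read_number_file_sketch(const char* path, size_t* out) {
  FILE* f = std::fopen(path, "r");
  if (f == nullptr) {
    return false;
  }
  char buf[64];
  const bool got_line = (std::fgets(buf, sizeof(buf), f) != nullptr);
  std::fclose(f);
  return got_line && parse_number_sketch(buf, out);
}
```

With this shape a caller is expected to check the return value before using `*out`; returning the number directly would be shorter but would fold "file says 0" and "could not read" into one result.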
src/hotspot/share/utilities/globalDefinitions.hpp line 413: > 411: > 412: #define EXACTFMT SIZE_FORMAT "%s" > 413: #define EXACTFMTARGS(s) byte_size_in_exact_unit(s), exact_unit_for_byte_size(s) Not sure this is worthy of inclusion in globalDefinitions. And I struggle to understand the use of "exact" in this context (pre-existing). test/hotspot/jtreg/runtime/os/HugePageConfiguration.java line 65: > 63: } > 64: > 65: @java.lang.Override The `java.lang` is not necessary and typically not used. ------------- PR Review: https://git.openjdk.org/jdk/pull/14739#pullrequestreview-1529621792 PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1263311652 PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1263308053 PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1263305494 From stuefe at openjdk.org Fri Jul 14 05:56:21 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 05:56:21 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v10] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 13 Jul 2023 18:56:56 GMT, Thomas Stuefe wrote: > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Bikeshed Trim log lines Hi David, > How have you determined all of the places that need `NativeHeapTrimmer::SuspendMark`?
> We now avoid trimming if the VM is in or nearing a SafePoint, which takes care of the majority of phases where we don't want to trim. The rest of the suspensions was just based on my gut feeling: mallocs are usually spread out more, but some releases happen in bulk, and those are easily identifiable and have the SuspendMark. When thinking this up, I considered a lot of alternatives, e.g., hooking into os::malloc/free and setting a dead mans timer each time to give the process a "cool down" after invocations - all in the assumption that mallocs are clustered. But that was too expensive - we did a lot of work to make the normal malloc paths with NMT=off very cheap - and also ineffective since we don't see malloc use from outside hotspot (JDK or third-party), and those are the main targets. To really catch every malloc, I even hooked into glibc malloc hooks, but that was horribly ugly and they removed the malloc hooks in recent glibc versions. Bottom line, I think the current solution is a good compromise of complexity vs usefulness. > Couple of minor comments below but otherwise this seems good. I'll take it for a spin through our CI. > Thank you! 
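The suspend-mark idea described above (pause trimming around bulk releases, with nesting allowed) can be sketched as a scoped counter. This is a toy model under stated assumptions, not the HotSpot implementation: `TrimmerModel`, its counter, and `may_trim()` are invented for illustration, and only the `SuspendMark` name is borrowed from the thread.

```cpp
#include <atomic>
#include <cassert>

// Toy model of a trimmer that can be paused by nested scoped marks.
class TrimmerModel {
  std::atomic<int> _suspend_count{0};

public:
  void suspend() { _suspend_count.fetch_add(1); }

  void resume() {
    const int old = _suspend_count.fetch_sub(1);
    assert(old > 0 && "resume without matching suspend");
    (void)old;  // silence unused warning when asserts are compiled out
  }

  // The periodic trimmer would check this before each trim.
  bool may_trim() const { return _suspend_count.load() == 0; }
  int suspend_count() const { return _suspend_count.load(); }

  // RAII mark: trimming is paused for the lifetime of the object.
  class SuspendMark {
    TrimmerModel& _trimmer;

  public:
    explicit SuspendMark(TrimmerModel& t) : _trimmer(t) { _trimmer.suspend(); }
    ~SuspendMark() { _trimmer.resume(); }
    SuspendMark(const SuspendMark&) = delete;
    SuspendMark& operator=(const SuspendMark&) = delete;
  };
};
```

Because the count is observable in this model, a test can assert on it after each nested scope instead of only checking that nothing crashes.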
------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1635314453 From stuefe at openjdk.org Fri Jul 14 06:10:23 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 06:10:23 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v10] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Fri, 14 Jul 2023 04:47:23 GMT, David Holmes wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Bikeshed Trim log lines > > src/hotspot/share/runtime/trimNativeHeap.cpp line 83: > >> 81: >> 82: // in seconds >> 83: static double now() { return os::elapsedTime(); } > > Do you need the wrapper for this rather than using `os::elapsedTime()` directly? I rather keep it since it keeps the caller sites a bit better readable. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1263332231 From dholmes at openjdk.org Fri Jul 14 06:15:07 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 14 Jul 2023 06:15:07 GMT Subject: RFR: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads In-Reply-To: References: Message-ID: <96m94D7WjS-lCa9jfrxkz4MNFYisVBjTABH31Ba4roY=.3d35d95a-c253-4e2b-b2c0-7084ca6016f2@github.com> On Thu, 13 Jul 2023 19:18:38 GMT, Alex Menkov wrote: > The change fixes handling of "suspended" bit in VT state. > The code looks very strange. > java_lang_VirtualThread::RUNNING == 3, so line 803 clears JVMTI_THREAD_STATE_ALIVE(1) and JVMTI_THREAD_STATE_TERMINATED(2) > Per log this code came from loom repo with VT integration. > > Testing: tier1-4, updated GetThreadStateMountedTest.java The change seems consistent with the definition of `GetThreadState`. But I note that the interrupt bit should also only be set if the target is alive. 
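The point about only setting the suspended and interrupted bits for a live target can be shown with plain bit arithmetic. The sketch below is stand-alone and does not include `jvmti.h`; the constants are local copies of the JVMTI thread state bits mentioned in the thread, and `add_suspend_and_interrupt` is a hypothetical helper, not the function in `jvmtiEnvBase.cpp`.

```cpp
#include <cassert>
#include <cstdint>

// Local copies of JVMTI thread state bits, redefined here so the sketch
// compiles on its own.
constexpr uint32_t STATE_ALIVE       = 0x0001;
constexpr uint32_t STATE_TERMINATED  = 0x0002;
constexpr uint32_t STATE_RUNNABLE    = 0x0004;
constexpr uint32_t STATE_SUSPENDED   = 0x100000;
constexpr uint32_t STATE_INTERRUPTED = 0x200000;

// Hypothetical helper: add the suspended/interrupted bits only when the
// target is alive, leaving every other bit of the state untouched.
uint32_t add_suspend_and_interrupt(uint32_t state, bool ext_suspended, bool interrupted) {
  if ((state & STATE_ALIVE) != 0) {
    if (ext_suspended) {
      state |= STATE_SUSPENDED;
    }
    if (interrupted) {
      state |= STATE_INTERRUPTED;
    }
  }
  return state;
}

// What masking with ~3 does to a state word: since 3 == ALIVE | TERMINATED,
// both of those bits are cleared, which is the effect described in the
// PR summary for the old code.
uint32_t clear_with_value_3(uint32_t state) {
  return state & ~3u;
}
```

For example, an alive, runnable state passed through `clear_with_value_3` is left with only the runnable bit, which no longer describes a live thread.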
------------- PR Review: https://git.openjdk.org/jdk/pull/14878#pullrequestreview-1529676753 From stuefe at openjdk.org Fri Jul 14 06:26:21 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 06:26:21 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v10] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Fri, 14 Jul 2023 05:02:34 GMT, David Holmes wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Bikeshed Trim log lines > > test/hotspot/gtest/runtime/test_trim_native.cpp line 44: > >> 42: NativeHeapTrimmer::SuspendMark sm2("Test2"); >> 43: { >> 44: NativeHeapTrimmer::SuspendMark sm3("Test3"); > > What is this actually testing? Everything could be a no-op. Whether it blows up or not with internal asserts that test suspend count overflow. But you are right; this raises another issue. One is that this is never tested since we need a gtest variant running with native trimming enabled. Another is that we need a printout of the trim state, at least for VMError::report, and that could be used here too to check the expected suspend count. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1263344482 From dnsimon at openjdk.org Fri Jul 14 06:39:26 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 14 Jul 2023 06:39:26 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v5] In-Reply-To: References: Message-ID: > This PR adds support for `jdk.hasLibgraal` and `vm.libgraal.enabled` properties that can be used with the jtreg `@requires` tag to determine if a libgraal specific test should run. 
> > * `jdk.hasLibgraal`: the libgraal shared library file is present > * `vm.libgraal.enabled`: libgraal is used as JIT compiler Doug Simon has updated the pull request incrementally with one additional commit since the last revision: fix spelling ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14851/files - new: https://git.openjdk.org/jdk/pull/14851/files/05760ec3..1fe7a305 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14851&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14851&range=03-04 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14851.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14851/head:pull/14851 PR: https://git.openjdk.org/jdk/pull/14851 From dholmes at openjdk.org Fri Jul 14 06:46:13 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 14 Jul 2023 06:46:13 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v10] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 13 Jul 2023 18:56:56 GMT, Thomas Stuefe wrote: > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Bikeshed Trim log lines Test needs a fix for non-Linux. ------------- Changes requested by dholmes (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14781#pullrequestreview-1529718875 From stuefe at openjdk.org Fri Jul 14 07:00:18 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 07:00:18 GMT Subject: RFR: JDK-8310233: Fix THP detection on Linux [v3] In-Reply-To: References: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> Message-ID: On Fri, 14 Jul 2023 05:37:56 GMT, David Holmes wrote: > Not an area I am familiar with but scanning through the code this seems reasonable. A few minor comments below. > > Thanks. Many thanks, @dholmes-ora! Remarks follow. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14739#issuecomment-1635374935 From stuefe at openjdk.org Fri Jul 14 07:04:06 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 07:04:06 GMT Subject: RFR: JDK-8310233: Fix THP detection on Linux [v3] In-Reply-To: References: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> Message-ID: On Fri, 14 Jul 2023 05:30:50 GMT, David Holmes wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Feedback johan > > src/hotspot/share/utilities/globalDefinitions.hpp line 413: > >> 411: >> 412: #define EXACTFMT SIZE_FORMAT "%s" >> 413: #define EXACTFMTARGS(s) byte_size_in_exact_unit(s), exact_unit_for_byte_size(s) > > Not sure this is worthy of inclusion in globalDefinitions. And I struggle to understand the use of "exact" in this context (pre-existing). I find it very useful tbh, but of course, that is subjective. These macros complement the existing "PROPER_" format variants above. "PROPER" can lie, since it may round to the chosen SI unit. "PROPER" is useful for human-compatible pretty printing. "EXACT" is similar to proper, but will always choose the units such that no information is lost. E.g. 1G+1K will be shown in "KB". 
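To make the "EXACT" behaviour concrete, here is a small stand-alone sketch of the idea: pick the largest binary unit that divides the byte size without remainder, so nothing is lost to rounding. The helper names and the unit labels are assumptions for illustration and are not the HotSpot functions behind `EXACTFMTARGS`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr uint64_t K = 1024;
constexpr uint64_t M = 1024 * K;
constexpr uint64_t G = 1024 * M;

// Hypothetical helper: the largest unit that represents s exactly.
const char* exact_unit_sketch(uint64_t s) {
  if (s >= G && s % G == 0) return "G";
  if (s >= M && s % M == 0) return "M";
  if (s >= K && s % K == 0) return "K";
  return "B";
}

// Hypothetical helper: s expressed in the unit chosen above.
uint64_t in_exact_unit_sketch(uint64_t s) {
  if (s >= G && s % G == 0) return s / G;
  if (s >= M && s % M == 0) return s / M;
  if (s >= K && s % K == 0) return s / K;
  return s;
}
```

With these helpers, 1G+1K comes out as 1048577 in the kilobyte unit, while a rounding "proper" style formatter would typically show it as 1 gigabyte.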
A comment would certainly be useful, maybe for a followup RFE. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1263375519 From stuefe at openjdk.org Fri Jul 14 07:10:18 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 07:10:18 GMT Subject: RFR: JDK-8310233: Fix THP detection on Linux [v3] In-Reply-To: References: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> Message-ID: On Fri, 14 Jul 2023 05:36:41 GMT, David Holmes wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Feedback johan > > src/hotspot/os/linux/hugepages.cpp line 92: > >> 90: >> 91: // Given a file that contains a single (integral) number, return that number, 0 and false if failed. >> 92: static bool read_number_file(const char* file, size_t* out) { > > Why `bool` return when you don't check it? Also `*out == 0` also indicates failure. Though why use an out parameter instead of just returning the value? I will fix the caller to handle the error. Keeping the error state separate from a "0" return value is just cleaner, and makes it possible to move/reuse this function should the need arise. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14739#discussion_r1263380896 From duke at openjdk.org Fri Jul 14 07:32:49 2023 From: duke at openjdk.org (Ilya Gavrilin) Date: Fri, 14 Jul 2023 07:32:49 GMT Subject: RFR: 8311862: RISC-V: small improvements to slli [v3] In-Reply-To: References: Message-ID: > Please review this small change for slli and slli.uw > The slli change allows replacing slli Rd, Rs, 0 with addi Rd, Rs, 0 (and no instruction is emitted if Rd == Rs) > An addi with 0 is more likely to be handled as a plain register rename that does not use the ALU at all > We have observed a small positive effect on hifive (and no change on thead).
> Also, this patch changes slli.uw so that it can be used without an additional check for UseZba, and provides a fallback when Zba is not available
> Testing: tier1 and tier2 on hifive, plus the hotspot/jtreg/compiler/intrinsics/string tests on QEMU with UseZba
>
> Performance on hifive, before:
> | Benchmark | Mode | Cnt | Score | | Error | Units |
> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
> | StringIndexOf.advancedWithShortSub1 | avgt | 25 | 4035.143 | ± | 191.262 | ns/op |
> | StringIndexOf.advancedWithShortSub2 | avgt | 25 | 1145.807 | ± | 14.610 | ns/op |
>
> With patch:
> | Benchmark | Mode | Cnt | Score | | Error | Units |
> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
> | StringIndexOf.advancedWithShortSub1 | avgt | 25 | 3613.943 | ± | 178.153 | ns/op |
> | StringIndexOf.advancedWithShortSub2 | avgt | 25 | 923.169 | ± | 47.123 | ns/op |

Ilya Gavrilin has updated the pull request incrementally with one additional commit since the last revision: Revert changes related with slli_uw ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14823/files - new: https://git.openjdk.org/jdk/pull/14823/files/2df17399..6a35ea6f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14823&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14823&range=01-02 Stats: 17 lines in 4 files changed: 1 ins; 12 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/14823.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14823/head:pull/14823 PR: https://git.openjdk.org/jdk/pull/14823 From dlong at openjdk.org Fri Jul 14 08:09:41 2023 From: dlong at openjdk.org (Dean Long) Date: Fri, 14 Jul 2023 08:09:41 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part Message-ID: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> This is hopefully the last set of integer overflow fixes for hotspot.
Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390. ------------- Commit messages: - uint - use unsigned - counters are uint - make some counters unsigned - misc overflow fixes Changes: https://git.openjdk.org/jdk/pull/14883/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14883&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8312077 Stats: 310 lines in 25 files changed: 3 ins; 0 del; 307 mod Patch: https://git.openjdk.org/jdk/pull/14883.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14883/head:pull/14883 PR: https://git.openjdk.org/jdk/pull/14883 From dlong at openjdk.org Fri Jul 14 08:38:40 2023 From: dlong at openjdk.org (Dean Long) Date: Fri, 14 Jul 2023 08:38:40 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v2] In-Reply-To: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: > This is hopefully the last set of integer overflow fixes for hotspot. Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390. 
Dean Long has updated the pull request incrementally with two additional commits since the last revision: - windows build - windows build ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14883/files - new: https://git.openjdk.org/jdk/pull/14883/files/75e87462..7cc45534 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14883&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14883&range=00-01 Stats: 5 lines in 4 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/14883.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14883/head:pull/14883 PR: https://git.openjdk.org/jdk/pull/14883 From aph at openjdk.org Fri Jul 14 08:43:20 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 14 Jul 2023 08:43:20 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v2] In-Reply-To: References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: On Fri, 14 Jul 2023 08:38:40 GMT, Dean Long wrote: >> This is hopefully the last set of integer overflow fixes for hotspot. Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390. > > Dean Long has updated the pull request incrementally with two additional commits since the last revision: > > - windows build > - windows build src/hotspot/share/jfr/periodic/sampling/jfrThreadSampler.cpp line 557: > 555: } > 556: > 557: if (next_j <= sleep_to_next) { It looks to me like this changes semantics so that it's now broken if `X_period_millis` wraps around. Do you actually want to change semantics in this case? 
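One common way a signed-to-unsigned change can alter the meaning of a `<=` comparison is sketched below. This is a generic illustration, not the code in `jfrThreadSampler.cpp`: the helper names and the "period minus elapsed" computation are assumptions made only to show the wrap-around effect.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper with signed arithmetic: an overdue deadline gives a
// negative remainder, which still compares as "less than or equal".
int64_t remaining_signed(int64_t period, int64_t elapsed) {
  return period - elapsed;
}

// Same computation with unsigned arithmetic: an overdue deadline wraps
// modulo 2^64 and becomes a very large value instead of a negative one.
uint64_t remaining_unsigned(uint64_t period, uint64_t elapsed) {
  return period - elapsed;
}

// Comparison in the style of `if (next_j <= sleep_to_next)`.
bool picks_signed(int64_t next, int64_t sleep_to_next) {
  return next <= sleep_to_next;
}

bool picks_unsigned(uint64_t next, uint64_t sleep_to_next) {
  return next <= sleep_to_next;
}
```

With a period of 10 and 25 elapsed, the signed variant yields -15 and the comparison against 20 is true; the unsigned variant yields 2^64 - 15 and the same comparison is false.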
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14883#discussion_r1263467252 From duke at openjdk.org Fri Jul 14 08:43:18 2023 From: duke at openjdk.org (ExE Boss) Date: Fri, 14 Jul 2023 08:43:18 GMT Subject: RFR: 8311500: StackWalker.getCallerClass() throws UOE if invoked reflectively [v3] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 15:47:35 GMT, Volker Simonis wrote: >> As the included jtreg test demonstrates, `StackWalker.getCallerClass()` can throw an `UnsupportedOperationException` if called reflectively. Currently this only happens if we invoke `StackWalker.getCallerClass()` recursively reflectively, but this issue will become more prominent once we fix [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447). The gory details follow below: >> >> The protocol between the Java API and the JVM for `StackWalker.getCallerClass()/walk()` is as follows: >> - On the Java side, `StackWalker` calls into `StackStreamFactory` for the real work. >> - For `StackWalker.getCallerClass()` `StackStreamFactory` basically creates a `Class[]` which will be passed down and filled in the JVM. For `StackWalker.walk()` it will normally be a `StackFrameInfo[]` (or a `LiveStackFrameInfo[]` if the internal `ExtendedOption.LOCALS_AND_OPERANDS` option was used). >> - The default size of these arrays is currently `StackStreamFactory.SMALL_BATCH` which is 8 (but see [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447)). >> - `StackStreamFactory` then calls `AbstractStackWalker.callStackWalk()` which is natively implemented in the VM by `JVM_CallStackWalk()`.
>> - `JVM_CallStackWalk()` calls `StackWalk::walk()` which calls `StackWalk::fetchFirstBatch()` which calls `StackWalk::fill_in_frames()` which walks the stack and fills in the available class/stackframe slots in the passed in array until the array is full or there are no more stack frames. >> - Once `StackWalk::fill_in_frames()` returns, `StackWalk::fetchFirstBatch()` calls back to Java by invoking `AbstractStackWalker::doStackWalk()` to consume the result. >> - `AbstractStackWalker::doStackWalk()` calls `consumeFrames()` (which is overridden depending on whether we initially called `getCallerClass()` or `walk()`) which consumes the frames until it either finishes (e.g. finds the caller class) or until there are no more frames. >> - In the latter case `consumeFrames()` will call into the VM again by calling `AbstractStackWalker.fetchStackFrames()` to fetch additional frames from the stack. >> - `AbstractStackWalker.fetchStackFrames()` is implemented by `JVM_MoreStackWalk()` which calls `StackWalk::fetchNextBatch()` which calls `StackWalk::fill_in_frames()` (the same method that already fetched the initial batch of frames). >> >> Following is a stacktrace of what I've explained so far: >> >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x1...
> > Volker Simonis has updated the pull request incrementally with one additional commit since the last revision: > > Fixed case when calling getCallerClass() from a @CallerSensitive method reflectively src/hotspot/share/prims/stackwalk.cpp line 164: > 162: method->method_holder()->is_subtype_of(constructor_accessor) || > 163: // MethodHandle frames are not hidden and StackWalker::getCallerClass has to filter them out > 164: method->method_holder()->name()->starts_with("java/lang/invoke")); Shouldn't this be: Suggestion: method->method_holder()->name()->starts_with("java/lang/invoke/")); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14773#discussion_r1263467101 From shade at openjdk.org Fri Jul 14 08:47:23 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 14 Jul 2023 08:47:23 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v9] In-Reply-To: <4A5ePuT8w2r4mV9m1gufLqFEV3qexzw4SLaX73uze64=.d2d1267b-60c0-4bd5-9613-963f2bb38a13@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> <4A5ePuT8w2r4mV9m1gufLqFEV3qexzw4SLaX73uze64=.d2d1267b-60c0-4bd5-9613-963f2bb38a13@github.com> Message-ID: On Fri, 14 Jul 2023 05:36:28 GMT, Thomas Stuefe wrote: > Idk. Its all solvable, its just code I guess. Maybe for a later RFE? Integrating the dcmd with periodic trimming may have one pro, that is preventing customers from shooting themselves in the foot who issue the dcmd via script and then forgot they did that :-P Yes, I am not saying we should do it in this PR. Just thinking ahead on interaction with manual trim.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1635516979 From dholmes at openjdk.org Fri Jul 14 08:47:25 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 14 Jul 2023 08:47:25 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v10] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 13 Jul 2023 18:56:56 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. 
That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... 
> > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Bikeshed Trim log lines test/hotspot/jtreg/runtime/os/TestTrimNative.java line 285: > 283: checkExpectedLogMessages(output, false, 0); > 284: parseOutputAndLookForNegativeTrim(output, 0, 0); > 285: output.shouldContain("Native trim not supported on this platform"); This needs updating - the current log output is: Native heap trim is not supported on this platform ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1263360337 From shade at openjdk.org Fri Jul 14 08:47:24 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 14 Jul 2023 08:47:24 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v10] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Fri, 14 Jul 2023 05:28:15 GMT, Thomas Stuefe wrote: > > Realized that in production, we would like to see why trimmer might be late. I think this would look even better: [trimnative-shipilev-2.patch](https://github.com/openjdk/jdk/files/12043977/trimnative-shipilev-2.patch) > > I thought about this too, but you don't really want to know if it was suspended for every wait interval, but for every trim interval. In other words, you want to know how many trims had been moved up because a safepoint had been happening, and how many trims had been skipped due to pause. Well, yes. Isn't that what my patch did? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1635518611 From dlong at openjdk.org Fri Jul 14 08:53:33 2023 From: dlong at openjdk.org (Dean Long) Date: Fri, 14 Jul 2023 08:53:33 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v3] In-Reply-To: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: > This is hopefully the last set of integer overflow fixes for hotspot. Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390. Dean Long has updated the pull request incrementally with one additional commit since the last revision: fix IndexSetWatch ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14883/files - new: https://git.openjdk.org/jdk/pull/14883/files/7cc45534..ace9c66a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14883&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14883&range=01-02 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14883.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14883/head:pull/14883 PR: https://git.openjdk.org/jdk/pull/14883 From stuefe at openjdk.org Fri Jul 14 09:00:13 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 09:00:13 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v10] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: <1LV3UuuoMAYpY0PRJWxxhJ0OvQdPgtSqOvl5lgdlcWQ=.fcaf7a11-5687-4349-a4c0-d88c7b41b9e2@github.com> On Fri, 14 Jul 2023 08:45:04 GMT, Aleksey Shipilev wrote: > > > Realized that in production, we would like to see why trimmer might be late. 
I think this would look even better: [trimnative-shipilev-2.patch](https://github.com/openjdk/jdk/files/12043977/trimnative-shipilev-2.patch) > > > > > > I thought about this too, but you don't really want to know if it was suspended for every wait interval, but for every trim interval. In other words, you want to know how many trims had been moved up because a safepoint had been happening, and how many trims had been skipped due to pause. > > Well, yes. Isn't that what my patch did? I thought you wanted to collect overarching stats about how many trims got delayed. I see now you wanted to do something much simpler. Okay. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1635533457 From luhenry at openjdk.org Fri Jul 14 09:17:21 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Fri, 14 Jul 2023 09:17:21 GMT Subject: RFR: 8310949: RISC-V: Initialize UseUnalignedAccesses [v5] In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 13:52:02 GMT, Fei Yang wrote: >> Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: >> >> review > > Hello, could you please provide some JMH numbers on platforms like T-Head? (-XX:+UseUnalignedAccesses vs -XX:-UseUnalignedAccesses)? I haven't seen such numbers yet. We only tested on Unmatched before: https://cr.openjdk.org/~dzhang/TestUseUnalignedAccesses/ @RealFYang I expect the T-Head to return that it supports fast unaligned memory accesses, as that's clearly the case. @VladimirKempik agreed that 1 flag could be used instead of 2. However, these 2 flags are there on all other platforms already. I would then do a follow-up change to merge them across all platforms and not just RISC-V. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14676#issuecomment-1635556966 From vkempik at openjdk.org Fri Jul 14 09:25:18 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Fri, 14 Jul 2023 09:25:18 GMT Subject: RFR: 8310949: RISC-V: Initialize UseUnalignedAccesses [v5] In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 13:52:02 GMT, Fei Yang wrote: >> Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: >> >> review > > Hello, could you please provide some JMH numbers on platforms like T-Head? (-XX:+UseUnalignedAccesses vs -XX:-UseUnalignedAccesses)? I haven't seen such numbers yet. We only tested on Unmatched before: https://cr.openjdk.org/~dzhang/TestUseUnalignedAccesses/ > @RealFYang I expect the T-Head to return that it supports fast unaligned memory accesses, as that's clearly the case. Don't expect thead to be equipped with 6.4+ kernel any time soon ------------- PR Comment: https://git.openjdk.org/jdk/pull/14676#issuecomment-1635566884 From dlong at openjdk.org Fri Jul 14 09:25:18 2023 From: dlong at openjdk.org (Dean Long) Date: Fri, 14 Jul 2023 09:25:18 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v3] In-Reply-To: References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: On Fri, 14 Jul 2023 08:39:53 GMT, Andrew Haley wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> fix IndexSetWatch > > src/hotspot/share/jfr/periodic/sampling/jfrThreadSampler.cpp line 557: > >> 555: } >> 556: >> 557: if (next_j <= sleep_to_next) { > > It looks to me like this changes semantics so that it's now broken if `X_period_millis` wraps around. Do you actually want to change semantics in this case? 
Yes, if you mean the case where X_period_millis is max_jlong, next_X is near max_jlong, and the old subtract of a negative sleep_to_next would have caused an overflow, I don't think we want to produce a sample in that case. But I could be wrong. I had comments explaining all the pathological cases, but I guess I removed them. The other one is when next_j < 0, next_n < 0, sleep_to_next < 0, next_j != next_n. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14883#discussion_r1263508485 From fyang at openjdk.org Fri Jul 14 09:28:16 2023 From: fyang at openjdk.org (Fei Yang) Date: Fri, 14 Jul 2023 09:28:16 GMT Subject: RFR: 8311862: RISC-V: small improvements to slli [v3] In-Reply-To: References: Message-ID: On Fri, 14 Jul 2023 07:32:49 GMT, Ilya Gavrilin wrote: >> Please review this small change for slli and slli.uw >> slli change allows replacing slli Rd, Rs, 0 with addi Rd, Rs, 0 (and no operation emitted if Rd == Rs) >> addi with 0 has higher chances to be just a register renaming and not utilise ALU at all >> We have observed small positive effect on hifive (and no change on thead). >> Also this patch changes slli.uw and allows it to be used without additional check for UseZba, also providing fallback when Zba is not available >> testing: tier1 and tier2 on hifive, also hotspot/jtreg/compiler/intrinsics/string tests on Qemu with UseZba >> >> performance on hifive, before: >> | Benchmark | Mode | Cnt | Score | | Error | Units | >> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:| >> | StringIndexOf.advancedWithShortSub1 | avgt | 25 | 4035.143 | ± | 191.262 | ns/op | >> | StringIndexOf.advancedWithShortSub2 | avgt | 25 | 1145.807 | ± | 14.610 | ns/op | >> >> with patch: >> | Benchmark | Mode | Cnt | Score | | Error | Units | >> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:| >> | StringIndexOf.advancedWithShortSub1 | avgt | 25 | 3613.943 | ±
| 178.153 | ns/op | >> | StringIndexOf.advancedWithShortSub2 | avgt | 25 | 923.169 | ± | 47.123 | ns/op | > > Ilya Gavrilin has updated the pull request incrementally with one additional commit since the last revision: > > Revert changes related with slli_uw PS: I think we should also change the JBS title as I see this not only touches `slli` but also `srli` and `srai`. Suggestion: "8311862: RISC-V: small improvements to shift immediate instructions". ------------- PR Comment: https://git.openjdk.org/jdk/pull/14823#issuecomment-1635571224 From dlong at openjdk.org Fri Jul 14 09:37:41 2023 From: dlong at openjdk.org (Dean Long) Date: Fri, 14 Jul 2023 09:37:41 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v4] In-Reply-To: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: > This is hopefully the last set of integer overflow fixes for hotspot. Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390.
Dean Long has updated the pull request incrementally with one additional commit since the last revision: missing include ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14883/files - new: https://git.openjdk.org/jdk/pull/14883/files/ace9c66a..90cde0be Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14883&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14883&range=02-03 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14883.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14883/head:pull/14883 PR: https://git.openjdk.org/jdk/pull/14883 From simonis at openjdk.org Fri Jul 14 09:38:58 2023 From: simonis at openjdk.org (Volker Simonis) Date: Fri, 14 Jul 2023 09:38:58 GMT Subject: RFR: 8311500: StackWalker.getCallerClass() throws UOE if invoked reflectively [v3] In-Reply-To: References: Message-ID: <6eWskV7TxrdcO1sxuenv_afJOMj9AF-uBvtCjpNJBqY=.f20d0283-c913-46ed-879b-137b92971b48@github.com> On Fri, 14 Jul 2023 08:39:44 GMT, ExE Boss wrote: >> Volker Simonis has updated the pull request incrementally with one additional commit since the last revision: >> >> Fixed case when calling getCallerClass() from a @CallerSensitive method reflectively > > src/hotspot/share/prims/stackwalk.cpp line 164: > >> 162: method->method_holder()->is_subtype_of(constructor_accessor) || >> 163: // MethodHandle frames are not hidden and StackWalker::getCallerClass has to filter them out >> 164: method->method_holder()->name()->starts_with("java/lang/invoke")); > > Shouldn't this be: > Suggestion: > > method->method_holder()->name()->starts_with("java/lang/invoke/")); Thanks, you're right, this is a copy/paste error. It currently doesn't make any difference, until somebody introduces a new package in `java.lang` that starts with `invoke*` :) Fixed locally, will be in the next commit.
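The effect of the missing trailing slash can be shown with a minimal standalone sketch. It is an illustration only, not HotSpot code: it uses `std::string` instead of HotSpot's `Symbol`, the helpers `starts_with`, `is_invoke_frame_loose` and `is_invoke_frame_strict` are hypothetical names, and the package `java/lang/invokedynamic` is made up to stand in for "a new package in `java.lang` that starts with `invoke*`".

```cpp
#include <string>

// Hypothetical stand-in for Symbol::starts_with(); this is not HotSpot code.
static bool starts_with(const std::string& name, const std::string& prefix) {
  return name.compare(0, prefix.size(), prefix) == 0;
}

// Prefix as originally written: it also matches sibling packages whose
// name merely begins with "invoke", such as the made-up
// "java/lang/invokedynamic".
static bool is_invoke_frame_loose(const std::string& holder) {
  return starts_with(holder, "java/lang/invoke");
}

// Prefix with the trailing slash: it matches only classes in
// java/lang/invoke and its subpackages.
static bool is_invoke_frame_strict(const std::string& holder) {
  return starts_with(holder, "java/lang/invoke/");
}
```

With these helpers, `java/lang/invoke/MethodHandle` is accepted by both checks, while the made-up `java/lang/invokedynamic/Foo` is accepted only by the loose one.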
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14773#discussion_r1263522481 From luhenry at openjdk.org Fri Jul 14 09:40:16 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Fri, 14 Jul 2023 09:40:16 GMT Subject: RFR: 8310949: RISC-V: Initialize UseUnalignedAccesses [v5] In-Reply-To: References: Message-ID: On Fri, 14 Jul 2023 09:22:12 GMT, Vladimir Kempik wrote: > Don't expect thead to be equipped with 6.4+ kernel any time soon Absolutely, and until that's the case, the behaviour stays the same as it currently is, so no regressions. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14676#issuecomment-1635587850 From stuefe at openjdk.org Fri Jul 14 09:52:04 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 09:52:04 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v11] In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. 
It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... 
Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: - Aleksey trim stats; state printerich; David better gtest; misc stuff - David simple cosmetics ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14781/files - new: https://git.openjdk.org/jdk/pull/14781/files/c8d46436..34037c8f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=09-10 Stats: 145 lines in 8 files changed: 135 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/14781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781 PR: https://git.openjdk.org/jdk/pull/14781 From stuefe at openjdk.org Fri Jul 14 09:52:06 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 09:52:06 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v10] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Fri, 14 Jul 2023 06:43:15 GMT, David Holmes wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Bikeshed Trim log lines > > Test needs a fix for non-Linux. Hi @dholmes-ora, @shipilev, New version: - added the log Aleksey suggested. Tested manually with +SafepointALot, works fine. Briefly considered making a patch out of it but did not since such a test is not reliable enough unless one runs a longer time, which I wanted to avoid. - Discussion with David made me realize a status print function would be nice. Added said function. It is now called for hs_err file generation as well as in dcmd VM.info. - I then used that function to beef up the gtest. I also now call the gtest (only the relevant os.trim... subsection) as a separate jtreg-controlled test. 
- fixed the unsupported-platforms-case test ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1635599120 From mgronlun at openjdk.org Fri Jul 14 09:53:43 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Fri, 14 Jul 2023 09:53:43 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v10] In-Reply-To: References: Message-ID: > Greetings, > > please help review this fix for some problematic situations in JFR where data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes is in the JIRA issue. > > Testing: jdk_jfr, stress testing. > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: Apply suggestions from code review if instead of ifdef ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14761/files - new: https://git.openjdk.org/jdk/pull/14761/files/fe248010..2fc39b77 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14761&range=08-09 Stats: 5 lines in 5 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/14761.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14761/head:pull/14761 PR: https://git.openjdk.org/jdk/pull/14761 From mgronlun at openjdk.org Fri Jul 14 09:53:51 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Fri, 14 Jul 2023 09:53:51 GMT Subject: RFR: 8303134: JFR: Missing stack trace during chunk rotation stress [v9] In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 17:56:37 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this fix for some problematic situations in JFR where data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes is in the JIRA issue.
>> >> Testing: jdk_jfr, stress testing. >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > remove space > > Co-authored-by: Tobias Hartmann src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 8305: > 8303: } > 8304: > 8305: #ifdef INCLUDE_JFR Suggestion: #if INCLUDE_JFR src/hotspot/cpu/arm/stubGenerator_arm.cpp line 3162: > 3160: } > 3161: > 3162: #ifdef INCLUDE_JFR Suggestion: #if INCLUDE_JFR src/hotspot/cpu/ppc/stubGenerator_ppc.cpp line 4769: > 4767: } > 4768: > 4769: #ifdef INCLUDE_JFR Suggestion: #if INCLUDE_JFR src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4052: > 4050: } > 4051: > 4052: #ifdef INCLUDE_JFR Suggestion: #if INCLUDE_JFR src/hotspot/cpu/s390/stubGenerator_s390.cpp line 3146: > 3144: } > 3145: > 3146: #ifdef INCLUDE_JFR Suggestion: #if INCLUDE_JFR ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14761#discussion_r1263530276 PR Review Comment: https://git.openjdk.org/jdk/pull/14761#discussion_r1263529887 PR Review Comment: https://git.openjdk.org/jdk/pull/14761#discussion_r1263528987 PR Review Comment: https://git.openjdk.org/jdk/pull/14761#discussion_r1263528443 PR Review Comment: https://git.openjdk.org/jdk/pull/14761#discussion_r1263527754 From stuefe at openjdk.org Fri Jul 14 10:07:32 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 10:07:32 GMT Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space Message-ID: TL;DR This patch introduces a new reservation API to reserve memory in low address space; depending on the OS, it may use optimized placement techniques. That new API is used to optimize the placement of class space and CDS for zero-based encoding. A future RFE will use the same API to optimize the zero-based heap reservation and thereby consolidate a lot of coding. We also plan to use this API in other places, e.g. for Shenandoah CollectionSet reservation.
------- With CDS off or at dump time, we currently attempt to optimize class space location by reserving in low address ranges. We do this by examining the *java heap* end (which has been allocated at that point) and, if that had been allocated in lower address regions, attempt to allocate adjacent to it. Essentially, we piggyback on what we hope for is an optimized heap placement. If that fails, we attempt to map at HeapBaseMinAddress. This approach has many disadvantages: - it depends on the VM either using CompressedOops and getting a zero-based heap or the region around HeapBaseMinAddress being free. - HeapBaseMinAddress is an odd choice: it is 2G on all platforms, for reasons unknown to me, but that denies us half of the valuable low address range below 4G right away. - We only get 1 shot. It's either one of these two addresses. - And we only use this strategy for CDS=off or CDS=dumptime; we don't use it for the CDS-runtime-fallback-case when attaching to the primary attach point failed. - It assumes narrow Klass encoding uses the same geometry (bit size, shift) as narrow Oops, which is not guaranteed with future developments (lilliput). - It actually reduces the chance of getting a zero-based *java heap*. This is because when attempting to place the heap, we leave a gap for what we assume will be the later class space. That gap is `CompressedClassSpaceSize` bytes, which is often grossly over-dimensioned. A zero-based heap is more valuable than a zero-based class space. Therefore the heap should get the best chance of low-address heap reservation. - It introduces an unnecessary dependency between heap reservation, narrow Oop encoding, and class space reservation. That makes the code base brittle. - Getting the heap region to place class space adjacent to it is actually tricky. We lack a common get-heaprange-API because ZGC. This code misuses the CompressedOops interface. 
But CompressedOops is the encoding range and thus only loosely correlated to the heap range (the latter must contain the former). The fact that `CompressedOops::end()` returns the heap range end can be seen as an aberration since the actual encoding range end usually far outreaches the heap range end. But as long as this code relies on it returning the heap range end, we cannot fix that. ------------- This RFE instead proposes a different approach: - Let us have an API, `os::attempt_reserve_memory_below(address max, ...)`. This API will do its best to reserve memory with a given size and alignment below a given max address. It will, on supporting OSes, attempt to use OS-specific means to find a suitable address space hole to place the reservation in. Otherwise, it will do the typical ladder-reservation approach with an adjustable maximum number of tries. - Let's use this API to reserve zero-base-friendly class space. Let's remove all knowledge of heap and CompressedOops. Now we are independent of what the heap does. It may or may not be located in lower address ranges. If it is, the new API will work around it and find a gap to place the class space; if it is not, even better. - Let's remove the "leave a gap for class space" logic from Heap reservation. We don't need it. It is harmful: Heap should have the best chance for zero-based - if I can only have one, I would rather have a zero-based heap than a zero-based class space. The end result will be a JVM with a much better chance to get zero-based class space *and* zero-based heap; we will have removed dependencies between heap and class space; we will have an API that can be used for similar problems (e.g. an obvious future enhancement would be to use this new reservation API for zero-based heap reservation as well, and other places could use it too, e.g. Shenandoah CollectionSet reservation).
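The ladder-reservation fallback mentioned in the first bullet can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in the PR: the names `attempt_reserve_below`, `TryReserveFn` and `max_tries` are hypothetical, the step size (one alignment unit per probe) is chosen only for simplicity, and the actual reservation is abstracted into a callback so the sketch does not depend on any OS API.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>

// Hypothetical callback: tries to reserve [addr, addr + size) and reports
// success. In a real VM this would wrap an OS-level "reserve at this
// address" attempt; here it is left abstract.
using TryReserveFn = std::function<bool(uintptr_t addr, size_t size)>;

// Sketch of a ladder reservation (an assumption, not HotSpot's code):
// probe aligned addresses downward from `max` until one probe succeeds or
// `max_tries` probes have failed. Returns 0 on failure.
// `alignment` must be a non-zero power of two.
static uintptr_t attempt_reserve_below(uintptr_t max, size_t size,
                                       size_t alignment, unsigned max_tries,
                                       const TryReserveFn& try_reserve) {
  if (size == 0 || alignment == 0 ||
      (alignment & (alignment - 1)) != 0 || size > max) {
    return 0;
  }
  // Highest aligned start address whose range still ends at or below max.
  uintptr_t candidate = (max - size) & ~(static_cast<uintptr_t>(alignment) - 1);
  for (unsigned i = 0; i < max_tries; i++) {
    if (candidate == 0) break;                 // never hand out address zero
    if (try_reserve(candidate, size)) return candidate;
    if (candidate < alignment) break;          // no room for another step down
    candidate -= alignment;
  }
  return 0;
}
```

The bounded loop corresponds to the "adjustable maximum number of tries"; a caller that cannot get a low address within that budget would fall back to an unconstrained reservation.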
------------- Commit messages: - better zero-based reservation strategy Changes: https://git.openjdk.org/jdk/pull/14867/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14867&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8312018 Stats: 490 lines in 11 files changed: 458 ins; 30 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14867.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14867/head:pull/14867 PR: https://git.openjdk.org/jdk/pull/14867 From stuefe at openjdk.org Fri Jul 14 10:07:32 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 10:07:32 GMT Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 10:17:40 GMT, Thomas Stuefe wrote: > TL;DR This patch introduces a new reservation API to reserve memory in low address space; depending on the OS, it may use optimized placement techniques. That new API is used to optimize the placement of class space and CDS for zero-based encoding. > > A future RFE will use the same API to optimize the zero-based heap reservation and thereby consolidate a lot of coding. We also plan to use this API in other places, e.g. for Shenandoah CollectionSet reservation. > > ------- > > With CDS off or at dump time, we currently attempt to optimize class space location by reserving in low address ranges. > > We do this by examining the *java heap* end (which has been allocated at that point) and, if that had been allocated in lower address regions, attempt to allocate adjacent to it. Essentially, we piggyback on what we hope for is an optimized heap placement. If that fails, we attempt to map at HeapBaseMinAddress. > > This approach has many disadvantages: > > - it depends on the VM either using CompressedOops and getting a zero-based heap or the region around HeapBaseMinAddress being free. 
> > - HeapBaseMinAddress is an odd choice: it is 2G on all platforms, for reasons unknown to me, but that denies us half of the valuable low address range below 4G right away. > > - We only get 1 shot. It's either one of these two addresses. > > - And we only use this strategy for CDS=off or CDS=dumptime; we don't use it for the CDS-runtime-fallback-case when attaching to the primary attach point failed. > > - It assumes narrow Klass encoding uses the same geometry (bit size, shift) as narrow Oops, which is not guaranteed with future developments (lilliput). > > - It actually reduces the chance of getting a zero-based *java heap*. This is because when attempting to place the heap, we leave a gap for what we assume will be the later class space. That gap is `CompressedClassSpaceSize` bytes, which is often grossly over-dimensioned. A zero-based heap is more valuable than a zero-based class space. Therefore the heap should get the best chance of low-address heap reservation. > > - It introduces an unnecessary dependency between heap reservation, narrow Oop encoding, and class space reservation. That makes the code base brittle. > > - Getting the heap region to place class space adjacent to it is actually tricky. We lack a common get-heaprange-API because ZGC. This code misuses the CompressedOops interface. But CompressedOops is the encoding range and thus only loosely correlated to the heap range (the latter must cont... 
Pinging @iklam ------------- PR Comment: https://git.openjdk.org/jdk/pull/14867#issuecomment-1635621496 From dlong at openjdk.org Fri Jul 14 10:33:51 2023 From: dlong at openjdk.org (Dean Long) Date: Fri, 14 Jul 2023 10:33:51 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v5] In-Reply-To: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: > This is hopefully the last set of integer overflow fixes for hotspot. Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390. Dean Long has updated the pull request incrementally with one additional commit since the last revision: fix IndexSetWatch range ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14883/files - new: https://git.openjdk.org/jdk/pull/14883/files/90cde0be..7ed08326 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14883&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14883&range=03-04 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14883.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14883/head:pull/14883 PR: https://git.openjdk.org/jdk/pull/14883 From dzhang at openjdk.org Fri Jul 14 10:49:28 2023 From: dzhang at openjdk.org (Dingli Zhang) Date: Fri, 14 Jul 2023 10:49:28 GMT Subject: RFR: 8309258: RISC-V: Add riscv_hwprobe syscall [v8] In-Reply-To: References: <89HFXISwtycccRZBgh11aTPm9S3ZwQudjonnIEQN2qU=.0dd93a20-0bc3-4ab5-84c6-a0a133ee00e6@github.com> Message-ID: On Mon, 19 Jun 2023 17:13:15 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Linux kernel 6.4 will come with the new syscall https://docs.kernel.org/riscv/hwprobe.html to determine CPU and features. 
>> RV CPU features/capabilities can vastly differ and it is not feasible for users to manually enable the correct feature set.
>> Today the VM uses the ELF aux vector and cpuinfo to gather some information about CPU capabilities.
>>
>> Currently features are tracked with a bit field; this is insufficient.
>> There are many capabilities and these can have values attached to them.
>> CPU features should also be possible to turn if we can determine vendor (hwprobe).
>>
>> This patch adds the syscall, uses the syscall in combination with the aux and cpuinfo to enable features by default.
>> If there is a vendor specific path it calls that in addition.
>> Then we build the feature string (and bit field) and update flags accordingly.
>>
>> Tested t1 and hwprobe with:
>> https://lore.kernel.org/qemu-devel/7f8d733df6e9b6151e9efb843d55441348805e70.camel at rivosinc.com/
>
> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision:
>
> removed q and zbc from rivos common features

Hi, I noticed that the fastdebug build crashed when running `java -version` on SG2042 (64 cores, c910):

$ ./java -version
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGILL (0x4) at pc=0x0000003f73a70892, pid=347134, tid=347135
#
# JRE version: (22.0) (fastdebug build )
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 22-internal-adhoc.zhangdingli.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-riscv64)
# Problematic frame:
# v ~StubRoutines::jint_disjoint_arraycopy 0x0000003f73a70892
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/perfxlab01/jdkbuild/compare/jdk_after/bin/core.347134)
#
# An error report file with more information is saved as:
# /home/perfxlab01/jdkbuild/compare/jdk_after/bin/hs_err_pid347134.log
[0.727s][warning][os] Loading hsdis library failed
#
#
Aborted (core dumped)

The instruction at the crash pc is `020f6007`, which is `vle32.v v0,(t5)`.
Complete log: https://cr.openjdk.org/~dzhang/hs_err_pid347134.log

System Information:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.10
Release: 22.10
Codename: kinetic

$ uname -a
Linux riscv-02 6.1.31 #6 SMP Tue Jul 11 18:48:27 CST 2023 riscv64 riscv64 riscv64 GNU/Linux

$ cat /proc/cpuinfo
processor : 0
hart : 4
isa : rv64imafdcv
mmu : sv39
mvendorid : 0x5b7
marchid : 0x0
mimpid : 0x0
...

With the release build, `java -version` doesn't crash, but when that build is set as JAVA_HOME and `jtreg -version` is executed, it crashes for similar reasons. We can see the flag in the release build:

$ ./java -XX:+PrintFlagsFinal -XX:+UnlockExperimentalVMOptions -version | grep UseRVV
bool UseRVV = true {ARCH experimental} {default}
bool UseRVVForBigIntegerShiftIntrinsics = true {ARCH product} {default}

The problem seems to be that c910 only supports rvv0.7.1, while openjdk only supports rvv1.0. `UseRVV` is automatically enabled on SG2042, even though it's an experimental VMOption. Maybe we should set `UPDATE_DEFAULT(UseRVV)` to `NO_UPDATE_DEFAULT` for now or do something depending on the version of rvv?

I would appreciate it if some advice is given, thank you!
------------- PR Comment: https://git.openjdk.org/jdk/pull/14445#issuecomment-1635676161 From mgronlun at openjdk.org Fri Jul 14 10:52:22 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Fri, 14 Jul 2023 10:52:22 GMT Subject: Integrated: 8303134: JFR: Missing stack trace during chunk rotation stress In-Reply-To: References: Message-ID: On Mon, 3 Jul 2023 18:33:30 GMT, Markus Grönlund wrote: > Greetings, > > please help review this fix for some problematic situations in JFR where data can be lost. Most problems originate from writing event data in the wrong epoch due to safepointing. Detailed information about the changes is in the JIRA issue. > > Testing: jdk_jfr, stress testing. > > Thanks > Markus This pull request has now been integrated. Changeset: 7539cc09 Author: Markus Grönlund URL: https://git.openjdk.org/jdk/commit/7539cc092d0a6b5604351d19e555101fcff75f58 Stats: 835 lines in 36 files changed: 698 ins; 76 del; 61 mod 8303134: JFR: Missing stack trace during chunk rotation stress Reviewed-by: egahlin, thartmann ------------- PR: https://git.openjdk.org/jdk/pull/14761 From aph at openjdk.org Fri Jul 14 11:03:13 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 14 Jul 2023 11:03:13 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v5] In-Reply-To: References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: On Fri, 14 Jul 2023 09:21:50 GMT, Dean Long wrote: >> src/hotspot/share/jfr/periodic/sampling/jfrThreadSampler.cpp line 557: >> >>> 555: } >>> 556: >>> 557: if (next_j <= sleep_to_next) { >> >> It looks to me like this changes semantics so that it's now broken if `X_period_millis` wraps around. Do you actually want to change semantics in this case?
> > Yes, if you mean the case where X_period_millis is max_jlong, next_X is near max_jlong, and the old subtract of a negative sleep_to_next would have caused an overflow, I don't think we want to produce a sample in that case. But I could be wrong. > > I had comments explaining all the pathological cases, but I guess I removed them. The other one is when next_j < 0, next_n < 0, sleep_to_next < 0, next_j != next_n. OK, I have no objections if you've covered all the bases. Having said that, I'm not sure that possibly-significant semantic changes like this one should be almost hidden in a patch like this one, because a maintenance programmer in a decade or two's time might think it was a mistake. So maybe the comments were good. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14883#discussion_r1263600566 From mgronlun at openjdk.org Fri Jul 14 11:48:21 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Fri, 14 Jul 2023 11:48:21 GMT Subject: [jdk21] RFR: 8303134: JFR: Missing stack trace during chunk rotation stress Message-ID: Backport.
------------- Commit messages: - Backport 7539cc092d0a6b5604351d19e555101fcff75f58 Changes: https://git.openjdk.org/jdk21/pull/129/files Webrev: https://webrevs.openjdk.org/?repo=jdk21&pr=129&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8303134 Stats: 839 lines in 36 files changed: 700 ins; 76 del; 63 mod Patch: https://git.openjdk.org/jdk21/pull/129.diff Fetch: git fetch https://git.openjdk.org/jdk21.git pull/129/head:pull/129 PR: https://git.openjdk.org/jdk21/pull/129 From rkennke at openjdk.org Fri Jul 14 11:51:17 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 14 Jul 2023 11:51:17 GMT Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 10:17:40 GMT, Thomas Stuefe wrote: > TL;DR This patch introduces a new reservation API to reserve memory in low address space; depending on the OS, it may use optimized placement techniques. That new API is used to optimize the placement of class space and CDS for zero-based encoding. > > A future RFE will use the same API to optimize the zero-based heap reservation and thereby consolidate a lot of coding. We also plan to use this API in other places, e.g. for Shenandoah CollectionSet reservation. > > ------- > > With CDS off or at dump time, we currently attempt to optimize class space location by reserving in low address ranges. > > We do this by examining the *java heap* end (which has been allocated at that point) and, if that had been allocated in lower address regions, attempt to allocate adjacent to it. Essentially, we piggyback on what we hope for is an optimized heap placement. If that fails, we attempt to map at HeapBaseMinAddress. > > This approach has many disadvantages: > > - it depends on the VM either using CompressedOops and getting a zero-based heap or the region around HeapBaseMinAddress being free. 
> > - HeapBaseMinAddress is an odd choice: it is 2G on all platforms, for reasons unknown to me, but that denies us half of the valuable low address range below 4G right away. > > - We only get 1 shot. It's either one of these two addresses. > > - And we only use this strategy for CDS=off or CDS=dumptime; we don't use it for the CDS-runtime-fallback-case when attaching to the primary attach point failed. > > - It assumes narrow Klass encoding uses the same geometry (bit size, shift) as narrow Oops, which is not guaranteed with future developments (lilliput). > > - It actually reduces the chance of getting a zero-based *java heap*. This is because when attempting to place the heap, we leave a gap for what we assume will be the later class space. That gap is `CompressedClassSpaceSize` bytes, which is often grossly over-dimensioned. A zero-based heap is more valuable than a zero-based class space. Therefore the heap should get the best chance of low-address heap reservation. > > - It introduces an unnecessary dependency between heap reservation, narrow Oop encoding, and class space reservation. That makes the code base brittle. > > - Getting the heap region to place class space adjacent to it is actually tricky. We lack a common get-heaprange-API because ZGC. This code misuses the CompressedOops interface. But CompressedOops is the encoding range and thus only loosely correlated to the heap range (the latter must cont... I am not familiar enough with these places of HotSpot (and the OSes, for that matter), but I have questions/comments. src/hotspot/os/linux/os_linux.cpp line 4245: > 4243: #define ALGNDWN(x) align_down(x, alignment) > 4244: > 4245: FILE* f = os::fopen("/proc/self/maps", "r"); How sane is it to parse /proc stuff, especially when done line-by-line? Is this not racy? I am not familiar with these things. I assume there is no sane(r) API for the purpose? 
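
To make the question above concrete, here is a minimal sketch of the kind of scan under discussion: read the `start-end` pair from each line of a `/proc/self/maps`-style listing and look for the highest free hole below a limit. It is written under assumptions and is not the patch's code: `Range`, `parse_maps` and `highest_hole_below` are hypothetical names, the input is taken as a string so the logic can be exercised without touching procfs, and a return value of 0 is used to mean "no hole found" (so `min` is assumed to be non-zero).

```cpp
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <sstream>
#include <string>
#include <vector>

struct Range { uint64_t start; uint64_t end; };  // half-open [start, end)

// Parse the leading "start-end" hex pair of each maps-style line; lines that
// do not match are skipped.
std::vector<Range> parse_maps(const std::string& text) {
  std::vector<Range> out;
  std::istringstream in(text);
  std::string line;
  while (std::getline(in, line)) {
    uint64_t s = 0, e = 0;
    if (std::sscanf(line.c_str(), "%" SCNx64 "-%" SCNx64, &s, &e) == 2 && s < e) {
      out.push_back(Range{s, e});
    }
  }
  return out;
}

// Hypothetical helper: highest aligned address A with min <= A and
// A + bytes <= max such that [A, A + bytes) overlaps no mapping.
// `maps` must be sorted ascending and non-overlapping, as procfs reports them.
// `alignment` must be a power of two. Returns 0 if no hole fits.
uint64_t highest_hole_below(const std::vector<Range>& maps, uint64_t min,
                            uint64_t max, uint64_t bytes, uint64_t alignment) {
  const uint64_t mask = ~(alignment - 1);
  uint64_t ceiling = max;  // upper end of the gap currently being examined
  for (size_t i = maps.size(); i-- > 0 && ceiling > min; ) {
    const Range& m = maps[i];
    if (m.start >= ceiling) continue;  // mapping lies entirely above the gap
    const uint64_t lowest = m.end > min ? m.end : min;
    if (ceiling >= bytes) {
      const uint64_t candidate = (ceiling - bytes) & mask;
      if (candidate >= lowest) return candidate;
    }
    ceiling = m.start;  // continue with the gap below this mapping
  }
  if (ceiling >= bytes) {  // gap between `min` and the lowest mapping
    const uint64_t candidate = (ceiling - bytes) & mask;
    if (candidate >= min) return candidate;
  }
  return 0;
}
```

Such a scan only sees a snapshot: another thread may map memory between the scan and the reservation, so the result is best treated as a hint and the actual reservation still has to be attempted and checked.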
src/hotspot/share/memory/metaspace.cpp line 602: > 600: constexpr int num_tries = 8; > 601: char* addr = nullptr; > 602: for (int i = 0; i < 2 && addr == nullptr; i ++) { I think it would be clearer to simply unroll this loop. ------------- PR Review: https://git.openjdk.org/jdk/pull/14867#pullrequestreview-1530114822 PR Review Comment: https://git.openjdk.org/jdk/pull/14867#discussion_r1263642120 PR Review Comment: https://git.openjdk.org/jdk/pull/14867#discussion_r1263615998 From egahlin at openjdk.org Fri Jul 14 12:41:16 2023 From: egahlin at openjdk.org (Erik Gahlin) Date: Fri, 14 Jul 2023 12:41:16 GMT Subject: [jdk21] RFR: 8303134: JFR: Missing stack trace during chunk rotation stress In-Reply-To: References: Message-ID: On Fri, 14 Jul 2023 11:38:32 GMT, Markus Grönlund wrote: > Backport. Marked as reviewed by egahlin (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk21/pull/129#pullrequestreview-1530228022 From luhenry at openjdk.org Fri Jul 14 12:43:13 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Fri, 14 Jul 2023 12:43:13 GMT Subject: RFR: 8309258: RISC-V: Add riscv_hwprobe syscall [v8] In-Reply-To: References: <89HFXISwtycccRZBgh11aTPm9S3ZwQudjonnIEQN2qU=.0dd93a20-0bc3-4ab5-84c6-a0a133ee00e6@github.com> Message-ID: On Fri, 14 Jul 2023 10:45:52 GMT, Dingli Zhang wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> removed q and zbc from rivos common features > > Hi, I noticed fastdebug building crashed when running `java -version` on SG2042 (64cores c910): > > > $ ./java -version > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGILL (0x4) at pc=0x0000003f73a70892, pid=347134, tid=347135 > # > # JRE version: (22.0) (fastdebug build ) > # Java VM: OpenJDK 64-Bit Server VM (fastdebug 22-internal-adhoc.zhangdingli.jdk, mixed mode, tiered, compressed oops, compressed class ptrs, g1 gc, linux-riscv64) > # Problematic frame: > # v
~StubRoutines::jint_disjoint_arraycopy 0x0000003f73a70892 > # > # Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/perfxlab01/jdkbuild/compare/jdk_after/bin/core.347134) > # > # An error report file with more information is saved as: > # /home/perfxlab01/jdkbuild/compare/jdk_after/bin/hs_err_pid347134.log > [0.727s][warning][os] Loading hsdis library failed > # > # > Aborted (core dumped) > > The instruct at the pc of crash is `020f6007`, which is `vle32.v v0,(t5)`. > Complete log: https://cr.openjdk.org/~dzhang/hs_err_pid347134.log > > System Information: > > $ lsb_release -a > No LSB modules are available. > Distributor ID: Ubuntu > Description: Ubuntu 22.10 > Release: 22.10 > Codename: kinetic > > $ uname -a > Linux riscv-02 6.1.31 #6 SMP Tue Jul 11 18:48:27 CST 2023 riscv64 riscv64 riscv64 GNU/Linux > > $ cat /proc/cpuinfo > processor : 0 > hart : 4 > isa : rv64imafdcv > mmu : sv39 > mvendorid : 0x5b7 > marchid : 0x0 > mimpid : 0x0 > ... > > > The release building `java -version` doesn't cause a crash, but when set it as JAVA_HOME and then executes `jtreg -version`, it will crash for similar reasons. We can see the flag under release building: > > $ ./java -XX:+PrintFlagsFinal -XX:+UnlockExperimentalVMOptions -version | grep UseRVV > bool UseRVV = true {ARCH experimental} {default} > bool UseRVVForBigIntegerShiftIntrinsics = true {ARCH product} {default} > > The problem looks like c910 only supports rvv0.7.1, while openjdk only supports rvv1.0. `UseRVV` is automatically enabled on SG2042, even though it's a experimental VMOption. Maybe we should set `UPDATE_DEFAULT(UseRVV)` to `NO_UPDATE_DEFAULT` for now or do something depending on the version of rvv? > > I would appreciate it if some advice is given, thank you! 
@DingliZhang I expect that it gets enabled because `hwcap` sets the `V` bit to 1, which sets `ext_V.enabled = true` which, when calling `update_flags`, sets `UseRVV` to true, correct? `riscv_hwprobe` does support versioning (checking the exact details). So if anything, we should indeed not set `UseRVV` when probing via hwcap, but we _should_ set `UseRVV` when probing via hwprobe (given we have RVV 1.0 of course). ------------- PR Comment: https://git.openjdk.org/jdk/pull/14445#issuecomment-1635806911 From stuefe at openjdk.org Fri Jul 14 12:46:06 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 12:46:06 GMT Subject: RFR: JDK-8310233: Fix THP detection on Linux [v4] In-Reply-To: References: Message-ID: > Today, if we use UseTransparentHugePages, we assume that the *static* hugepage detection we do is valid for THPs: > - that THPs use the page size (in hotspot used as "default large page size") found in `/proc/meminfo` ("Hugepagesize") > - that THPs are enabled if that page size is >0. > > Both assumptions are incorrect: > > - whether THPs are enabled should be checked at `/sys/kernel/mm/transparent_hugepage/enabled`, which is a tri-state value ("always", "madvise", "never"). THPs are available for the first two states. > - The page size employed by `khugepaged` is set in `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size`. It can differ from the default page size used for static hugepages. For example, we could configure a system such that it uses 1G static hugepages, but the THP page size would still be 2M. > > ------ > > About the patch: > > This is a limited, minimally invasive patch to fix THP detection. The patch aims to be easy to downport. There is more work to do, which I will do in subsequent RFEs. > > Functionally, for *static* (non-THP) pages nothing changes. THP-mode now correctly detects THP support in the OS, and uses the correct page size (see examples below).
>
> -------------
>
> Example 1: System has THPs disabled, but static hugepages (1g, 2m) configured:
>
>
> thomas at starfish $ cat /sys/kernel/mm/transparent_hugepage/enabled
> always madvise [never]
> thomas at starfish $ cat /proc/meminfo | grep Hugepage
> Hugepagesize: 1048576 kB
>
>
> Without patch, we incorrectly assume THPs are enabled, and that THP page size is 1G (!), which we then proceed to use as heap page size, causing the heap size to be rounded up from 512m -> 1G. But - even though it is printed as "1G page backed" in log output - in reality it will still be 4K-page-backed: the `madvise(2)` we use to set the THP page size will be ignored by the system, since THPs are disabled.
>
>
> thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version
> [0.001s][info][pagesize] Using the default large page size: 1G
> [0.001s][info][pagesize] Usable page sizes: 4k, 2M, 1G
> ...
> [0.016s][info][pagesize] Heap: min=1G max=1G base=0x00000000c0000000 size=1G page_size=1G
>
>
> With patch, we correctly refuse to use large pages (and we log more info):
>
>
> thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version
> [0.001s][info][pagesize] Static hugepage support: 2M, 1G (default)
> [0.001s][info][pagesize] default pagesize: 1G
> [0...
Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: Feedback David ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14739/files - new: https://git.openjdk.org/jdk/pull/14739/files/69c39c8a..365473c7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14739&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14739&range=02-03 Stats: 9 lines in 2 files changed: 2 ins; 1 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/14739.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14739/head:pull/14739 PR: https://git.openjdk.org/jdk/pull/14739 From stuefe at openjdk.org Fri Jul 14 12:46:09 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 12:46:09 GMT Subject: RFR: JDK-8310233: Fix THP detection on Linux [v3] In-Reply-To: References: <0pUGYbeAxaEq3O5k72jU-EFM-nQMbGEeNcM8hlqzXD8=.4b0e973c-1a97-4d6f-8fa4-b6ac89527a23@github.com> Message-ID: On Fri, 14 Jul 2023 05:37:56 GMT, David Holmes wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Feedback johan > > Not an area I am familiar with but scanning through the code this seems reasonable. A few minor comments below. > > Thanks. @dholmes-ora : I massaged in your feedback. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14739#issuecomment-1635804412 From duke at openjdk.org Fri Jul 14 12:50:31 2023 From: duke at openjdk.org (sid8606) Date: Fri, 14 Jul 2023 12:50:31 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure Message-ID: All faults on s390x give the address only on page granularity. e.g. 
if you use 0x123456 as fail address you get si_addr == 0x123000 ------------- Commit messages: - 8312014: [s390x] TestSigInfoInHsErrFile.java Failure Changes: https://git.openjdk.org/jdk/pull/14888/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8312014 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14888.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14888/head:pull/14888 PR: https://git.openjdk.org/jdk/pull/14888 From mgronlun at openjdk.org Fri Jul 14 12:52:17 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Fri, 14 Jul 2023 12:52:17 GMT Subject: [jdk21] Integrated: 8303134: JFR: Missing stack trace during chunk rotation stress In-Reply-To: References: Message-ID: On Fri, 14 Jul 2023 11:38:32 GMT, Markus Grönlund wrote: > Backport. This pull request has now been integrated. Changeset: c199b8c7 Author: Markus Grönlund URL: https://git.openjdk.org/jdk21/commit/c199b8c761c14542953a01c1efd6ccec95179234 Stats: 839 lines in 36 files changed: 700 ins; 76 del; 63 mod 8303134: JFR: Missing stack trace during chunk rotation stress Reviewed-by: egahlin Backport-of: 7539cc092d0a6b5604351d19e555101fcff75f58 ------------- PR: https://git.openjdk.org/jdk21/pull/129 From duke at openjdk.org Fri Jul 14 12:59:02 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Fri, 14 Jul 2023 12:59:02 GMT Subject: RFR: 8311976: Inconsistency in usage of CITimeVerbose to generate compilation logs In-Reply-To: <4qSH1d3mzgu1jtSa-O3r-MqBqzpzWpCYHK5Bn0lc14Q=.e2aae8a2-5639-4db9-9a25-1718d9f6bbf4@github.com> References: <4qSH1d3mzgu1jtSa-O3r-MqBqzpzWpCYHK5Bn0lc14Q=.e2aae8a2-5639-4db9-9a25-1718d9f6bbf4@github.com> Message-ID: On Thu, 13 Jul 2023 23:07:35 GMT, Vladimir Kozlov wrote: >> Please review this PR for controlling timing information of C1 compilation phases using CITimeVerbose option, same as for C2 compilations.
>> I also took this opportunity to fix some other minor issues with logging:
>> 1. The PhaseTraceTime object should be constructed after setting the compiler data, as the PhaseTraceTime constructor calls `Compilation::current()`. For this reason I moved the statement `PhaseTraceTime timeit(_t_compile)` after the call to `_env->set_compiler_data(this);`.
>> 2. The previous step also made it possible to remove the nullptr check for `Compilation::current()` in the PhaseTraceTime constructor.
>> 3. I noticed the call to CompileLog->done() only prints the `phase_done` tag and ignores all other parameters passed to it. This was due to a bug in `xmlStream::va_done`, which is also fixed here.
>> 4. Remove unnecessary statements in the TracePhase destructor, as the object already has the fields computed in the constructor.
>> 5. Some bikeshedding like TimerName -> TimerId and TracePhase::C -> TracePhase::_compile
>>
>> I felt these are all minor fixes so I clubbed them together here. If it feels inappropriate I can pull them into their own PRs.
>>
>> Testing: GHA testing passed
>
> @ashu-mehra, can you show output before and after these changes? It would be nice to see what you are fixing.

@vnkozlov thanks for reviewing it.

------------- PR Comment: https://git.openjdk.org/jdk/pull/14880#issuecomment-1635826005 From stuefe at openjdk.org Fri Jul 14 13:06:48 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 13:06:48 GMT Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space [v2] In-Reply-To: References: Message-ID: <0a7sc8pYCy_qYoAcFCqoFwJhiVxA6CSbMM16Ani_y6g=.2ee8936b-d207-4b99-b9b9-951ccab91646@github.com> On Fri, 14 Jul 2023 11:47:57 GMT, Roman Kennke wrote:

> I am not familiar enough with these places of HotSpot (and the OSes, for that matter), but I have questions/comments.

Thank you Roman. I worked in your feedback; while testing I found an off-by-one, and a minor flaw with tracing. About your procfs question, this should be safe.
We only do this once, at start, and have a reasonable fallback. Note that hotspot already reads from this file for other purposes, it seems to work well. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14867#issuecomment-1635831870 From stuefe at openjdk.org Fri Jul 14 13:06:46 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 13:06:46 GMT Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space [v2] In-Reply-To: References: Message-ID: > TL;DR This patch introduces a new reservation API to reserve memory in low address space; depending on the OS, it may use optimized placement techniques. That new API is used to optimize the placement of class space and CDS for zero-based encoding. > > A future RFE will use the same API to optimize the zero-based heap reservation and thereby consolidate a lot of coding. We also plan to use this API in other places, e.g. for Shenandoah CollectionSet reservation. > > ------- > > With CDS off or at dump time, we currently attempt to optimize class space location by reserving in low address ranges. > > We do this by examining the *java heap* end (which has been allocated at that point) and, if that had been allocated in lower address regions, attempt to allocate adjacent to it. Essentially, we piggyback on what we hope for is an optimized heap placement. If that fails, we attempt to map at HeapBaseMinAddress. > > This approach has many disadvantages: > > - it depends on the VM either using CompressedOops and getting a zero-based heap or the region around HeapBaseMinAddress being free. > > - HeapBaseMinAddress is an odd choice: it is 2G on all platforms, for reasons unknown to me, but that denies us half of the valuable low address range below 4G right away. > > - We only get 1 shot. It's either one of these two addresses. 
> > - And we only use this strategy for CDS=off or CDS=dumptime; we don't use it for the CDS-runtime-fallback-case when attaching to the primary attach point failed. > > - It assumes narrow Klass encoding uses the same geometry (bit size, shift) as narrow Oops, which is not guaranteed with future developments (lilliput). > > - It actually reduces the chance of getting a zero-based *java heap*. This is because when attempting to place the heap, we leave a gap for what we assume will be the later class space. That gap is `CompressedClassSpaceSize` bytes, which is often grossly over-dimensioned. A zero-based heap is more valuable than a zero-based class space. Therefore the heap should get the best chance of low-address heap reservation. > > - It introduces an unnecessary dependency between heap reservation, narrow Oop encoding, and class space reservation. That makes the code base brittle. > > - Getting the heap region to place class space adjacent to it is actually tricky. We lack a common get-heaprange-API because ZGC. This code misuses the CompressedOops interface. But CompressedOops is the encoding range and thus only loosely correlated to the heap range (the latter must cont... 
Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: Feedback Roman; fix off-by-1; fix tracing ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14867/files - new: https://git.openjdk.org/jdk/pull/14867/files/22fe1e40..8bb7d705 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14867&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14867&range=00-01 Stats: 25 lines in 2 files changed: 7 ins; 12 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/14867.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14867/head:pull/14867 PR: https://git.openjdk.org/jdk/pull/14867 From amitkumar at openjdk.org Fri Jul 14 13:31:16 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Fri, 14 Jul 2023 13:31:16 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure In-Reply-To: References: Message-ID: <9HwtHNPda3Utaf9OjVxuv-VO4jSaQnXoDNZeKeWKp54=.a913553b-0cc1-482a-b676-b3f477f5f4f2@github.com> On Fri, 14 Jul 2023 12:44:03 GMT, sid8606 wrote: > All faults on s390x give the address only on page granularity. > e.g. if you use 0x123456 as fail address you get si_addr == 0x123000 src/hotspot/share/utilities/vmError.hpp line 210: > 208: > 209: // Non-null address guaranteed to generate a SEGV mapping error on read, for test purposes. > 210: static constexpr intptr_t segfault_address = AIX_ONLY(-1) NOT_AIX(4 * K); Are we sure ARM & RISC-V will be happy with these changes? Maybe using `S390_ONLY` would be appropriate(?)
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14888#discussion_r1263737821 From vkempik at openjdk.org Fri Jul 14 13:31:15 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Fri, 14 Jul 2023 13:31:15 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v2] In-Reply-To: <5mpUd4sUz6TSOzToHZeJsNcB31XewH19f_sTWIq9Gxk=.f71c7250-c196-4c0a-8abe-9cdb2bc195cb@github.com> References: <5mpUd4sUz6TSOzToHZeJsNcB31XewH19f_sTWIq9Gxk=.f71c7250-c196-4c0a-8abe-9cdb2bc195cb@github.com> Message-ID: <6Jisk6dB60v93BY374m-QxU48ZrDknVbkRq7Q7iey4Q=.c0bf4e27-49e9-446a-8e80-1d7d17554d08@github.com> On Thu, 13 Jul 2023 09:39:05 GMT, Fei Yang wrote: > OK. I have gone through the changes in function `generate_compare_long_string_different_encoding` and here is what I am thinking. I see there are two purposes for these changes: > > 1. Make sure that the memory accesses in `SMALL_LOOP` are 8-byte aligned on strL; > 2. Avoid unaligned accesses in the loop tail (the if-else structure with the AvoidUnalignedAccesses check); > > I am fine with the changes for the first purpose. In fact, my previous comment was about the second one. Since it's the loop tail, making use of misaligned_load (and thus eliminating the if block) shouldn't affect much. Please consider that. Thank you for looking at this, I'll try simplifying the tail and will test performance as soon as I find some time.
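Editorial note: for readers unfamiliar with the idea behind a misaligned load, here is a minimal C++ sketch of reading a 64-bit value from an address with arbitrary alignment. It is an illustration only; it is not the RISC-V macro assembler routine discussed above, and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical illustration: read 8 bytes from any address. memcpy makes no
// alignment assumption, so the compiler must emit accesses that are safe on
// the target.
inline std::uint64_t load_u64_any_alignment(const void* p) {
  std::uint64_t v;
  std::memcpy(&v, p, sizeof v);
  return v;
}

// Hypothetical illustration: the same idea spelled out as eight byte loads,
// assembled in little-endian order (p[0] is the least significant byte).
inline std::uint64_t load_u64_bytewise_le(const unsigned char* p) {
  std::uint64_t v = 0;
  for (int i = 7; i >= 0; --i) {
    v = (v << 8) | p[i];
  }
  return v;
}
```

The byte-wise variant trades one wide access for eight narrow ones, which is why such a fallback is usually considered acceptable in a loop tail but not in a hot loop body.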
-------------

PR Comment: https://git.openjdk.org/jdk/pull/14534#issuecomment-1635866723

From fyang at openjdk.org  Fri Jul 14 13:49:59 2023
From: fyang at openjdk.org (Fei Yang)
Date: Fri, 14 Jul 2023 13:49:59 GMT
Subject: RFR: 8311862: RISC-V: small improvements to shift immediate instructions [v3]
In-Reply-To:
References:
Message-ID: <38UmQbnq0WRpINIjt6Y9R0d-rb60-k6Bl-DcAE83EIc=.8dc4256b-e3b0-4b2e-a08a-64be4e46ef9c@github.com>

On Fri, 14 Jul 2023 07:32:49 GMT, Ilya Gavrilin wrote:

>> Please review this small change for slli, srli and srai.
>> The slli change allows replacing slli Rd, Rs, 0 with addi Rd, Rs, 0 (and no instruction is emitted if Rd == Rs).
>> addi with 0 has a higher chance of being just a register renaming that does not use the ALU at all.
>> We have observed a small positive effect on hifive (and no change on thead).
>>
>> testing: tier1 and tier2 on hifive, also hotspot/jtreg/compiler/intrinsics/string tests on Qemu with UseZba
>>
>> performance on hifive, before:
>> | Benchmark                           | Mode | Cnt | Score    |   | Error   | Units |
>> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
>> | StringIndexOf.advancedWithShortSub1 | avgt | 25  | 4035.143 | ± | 191.262 | ns/op |
>> | StringIndexOf.advancedWithShortSub2 | avgt | 25  | 1145.807 | ± | 14.610  | ns/op |
>>
>> with patch:
>> | Benchmark                           | Mode | Cnt | Score    |   | Error   | Units |
>> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
>> | StringIndexOf.advancedWithShortSub1 | avgt | 25  | 3613.943 | ± | 178.153 | ns/op |
>> | StringIndexOf.advancedWithShortSub2 | avgt | 25  | 923.169  | ± | 47.123  | ns/op |
>
> Ilya Gavrilin has updated the pull request incrementally with one additional commit since the last revision:
>
>   Revert changes related with slli_uw

Two nits remain. Looks good otherwise.
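Editorial note: the selection logic described in the quoted pull request text can be modelled in a few lines. This is a hypothetical sketch for illustration, not the macro in `assembler_riscv.hpp`; the function names and the textual output format are assumptions.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: render register number n as "xN".
inline std::string reg(int n) { return "x" + std::to_string(n); }

// Hypothetical model of the described behaviour for a left shift by an
// immediate: a zero shift amount degenerates into a register move (addi with
// 0), and into no instruction at all if source and destination are the same.
inline std::string select_slli(int rd, int rs1, unsigned shamt) {
  assert(shamt < 64);  // RV64 shift amounts are 6 bits wide
  if (shamt != 0) {
    return "slli " + reg(rd) + ", " + reg(rs1) + ", " + std::to_string(shamt);
  }
  if (rd != rs1) {
    return "addi " + reg(rd) + ", " + reg(rs1) + ", 0";
  }
  return std::string();  // nothing to emit
}
```

For example, `select_slli(10, 11, 0)` yields `addi x10, x11, 0`, and `select_slli(10, 10, 0)` yields an empty string.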
src/hotspot/cpu/riscv/assembler_riscv.hpp line 2793: > 2791: _slli(Rd, Rs1, shamt); \ > 2792: } else { \ > 2793: if(Rd != Rs1) { \ Nit: s/if(Rd != Rs1) {/if (Rd != Rs1) {/ src/hotspot/cpu/riscv/assembler_riscv.hpp line 2814: > 2812: NORMAL_NAME(Rd, Rs1, shamt); \ > 2813: } else { \ > 2814: if(Rd != Rs1) { \ Nit: s/if(Rd != Rs1) {/if (Rd != Rs1) {/ ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14823#pullrequestreview-1530340814 PR Review Comment: https://git.openjdk.org/jdk/pull/14823#discussion_r1263757303 PR Review Comment: https://git.openjdk.org/jdk/pull/14823#discussion_r1263757495 From dnsimon at openjdk.org Fri Jul 14 14:13:18 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 14 Jul 2023 14:13:18 GMT Subject: RFR: 8311946: add support for libgraal specific jtreg tests [v5] In-Reply-To: References: Message-ID: <-40KArW-YDvWd7OPpdppUue_r_jSr31G7JuP9-xHeT4=.cbe1514a-3b41-4f7b-9794-397c06a6b155@github.com> On Fri, 14 Jul 2023 06:39:26 GMT, Doug Simon wrote: >> This PR adds support for `jdk.hasLibgraal` and `vm.libgraal.enabled` properties that can be used with the jtreg `@requires` tag to determine if a libgraal specific test should run. >> >> * `jdk.hasLibgraal`: the libgraal shared library file is present >> * `vm.libgraal.enabled`: libgraal is used as JIT compiler > > Doug Simon has updated the pull request incrementally with one additional commit since the last revision: > > fix spelling Thanks for the reviews. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14851#issuecomment-1635917931 From dnsimon at openjdk.org Fri Jul 14 14:13:20 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 14 Jul 2023 14:13:20 GMT Subject: Integrated: 8311946: add support for libgraal specific jtreg tests In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 13:48:23 GMT, Doug Simon wrote: > This PR adds support for `jdk.hasLibgraal` and `vm.libgraal.enabled` properties that can be used with the jtreg `@requires` tag to determine if a libgraal specific test should run. > > * `jdk.hasLibgraal`: the libgraal shared library file is present > * `vm.libgraal.enabled`: libgraal is used as JIT compiler This pull request has now been integrated. Changeset: a63f865f Author: Doug Simon URL: https://git.openjdk.org/jdk/commit/a63f865feba4cb82ec6e6529b9097bc709ace77a Stats: 107 lines in 8 files changed: 93 ins; 8 del; 6 mod 8311946: add support for libgraal specific jtreg tests Reviewed-by: kvn, thartmann ------------- PR: https://git.openjdk.org/jdk/pull/14851 From rkennke at openjdk.org Fri Jul 14 14:37:19 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Fri, 14 Jul 2023 14:37:19 GMT Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space [v2] In-Reply-To: References: Message-ID: On Fri, 14 Jul 2023 13:06:46 GMT, Thomas Stuefe wrote: >> TL;DR This patch introduces a new reservation API to reserve memory in low address space; depending on the OS, it may use optimized placement techniques. That new API is used to optimize the placement of class space and CDS for zero-based encoding. >> >> A future RFE will use the same API to optimize the zero-based heap reservation and thereby consolidate a lot of coding. We also plan to use this API in other places, e.g. for Shenandoah CollectionSet reservation. >> >> ------- >> >> With CDS off or at dump time, we currently attempt to optimize class space location by reserving in low address ranges. 
>> >> We do this by examining the *java heap* end (which has been allocated at that point) and, if that had been allocated in lower address regions, attempt to allocate adjacent to it. Essentially, we piggyback on what we hope for is an optimized heap placement. If that fails, we attempt to map at HeapBaseMinAddress. >> >> This approach has many disadvantages: >> >> - it depends on the VM either using CompressedOops and getting a zero-based heap or the region around HeapBaseMinAddress being free. >> >> - HeapBaseMinAddress is an odd choice: it is 2G on all platforms, for reasons unknown to me, but that denies us half of the valuable low address range below 4G right away. >> >> - We only get 1 shot. It's either one of these two addresses. >> >> - And we only use this strategy for CDS=off or CDS=dumptime; we don't use it for the CDS-runtime-fallback-case when attaching to the primary attach point failed. >> >> - It assumes narrow Klass encoding uses the same geometry (bit size, shift) as narrow Oops, which is not guaranteed with future developments (lilliput). >> >> - It actually reduces the chance of getting a zero-based *java heap*. This is because when attempting to place the heap, we leave a gap for what we assume will be the later class space. That gap is `CompressedClassSpaceSize` bytes, which is often grossly over-dimensioned. A zero-based heap is more valuable than a zero-based class space. Therefore the heap should get the best chance of low-address heap reservation. >> >> - It introduces an unnecessary dependency between heap reservation, narrow Oop encoding, and class space reservation. That makes the code base brittle. >> >> - Getting the heap region to place class space adjacent to it is actually tricky. We lack a common get-heaprange-API because ZGC. This code misuses the CompressedOops interface. But CompressedOops is the encoding range and thus only loos... 
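Editorial note: the reason low addresses matter for zero-based encoding can be shown with a small calculation. The sketch below is not HotSpot code; the function names are hypothetical, and the bit width and shift are passed as parameters because, as the description points out, the narrow Klass geometry need not match the narrow Oop geometry.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: first address that can no longer be reached with a
// base of zero, for narrow pointers of `bits` bits scaled by `shift`.
// Assumes bits + shift < 64.
constexpr std::uint64_t encoding_limit(unsigned bits, unsigned shift) {
  return (std::uint64_t{1} << bits) << shift;
}

// Hypothetical helper: true if [start, start + size) is fully addressable
// with a zero base.
constexpr bool fits_zero_based(std::uint64_t start, std::uint64_t size,
                               unsigned bits, unsigned shift) {
  return size <= encoding_limit(bits, shift) &&
         start <= encoding_limit(bits, shift) - size;
}

// With a zero base, decoding is a plain shift and needs no addition.
constexpr std::uint64_t decode_zero_based(std::uint32_t narrow, unsigned shift) {
  return static_cast<std::uint64_t>(narrow) << shift;
}
```

With 32 bits and no shift, the limit is 4G, so a 1G region placed at 2G still qualifies, while the same region placed at 4G qualifies only once a shift is applied.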
>
> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision:
>
>   Feedback Roman; fix off-by-1; fix tracing

Looks good to me now, but somebody else who is more familiar with these things should review it as well. Thank you!

-------------

Marked as reviewed by rkennke (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/14867#pullrequestreview-1530426524

From duke at openjdk.org  Fri Jul 14 14:38:45 2023
From: duke at openjdk.org (Ilya Gavrilin)
Date: Fri, 14 Jul 2023 14:38:45 GMT
Subject: RFR: 8311862: RISC-V: small improvements to shift immediate instructions [v4]
In-Reply-To:
References:
Message-ID:

> Please review this small change for slli, srli and srai.
> The slli change allows replacing slli Rd, Rs, 0 with addi Rd, Rs, 0 (and no instruction is emitted if Rd == Rs).
> addi with 0 has a higher chance of being just a register renaming that does not use the ALU at all.
> We have observed a small positive effect on hifive (and no change on thead).
>
> testing: tier1 and tier2 on hifive, also hotspot/jtreg/compiler/intrinsics/string tests on Qemu with UseZba
>
> performance on hifive, before:
> | Benchmark                           | Mode | Cnt | Score    |   | Error   | Units |
> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
> | StringIndexOf.advancedWithShortSub1 | avgt | 25  | 4035.143 | ± | 191.262 | ns/op |
> | StringIndexOf.advancedWithShortSub2 | avgt | 25  | 1145.807 | ± | 14.610  | ns/op |
>
> with patch:
> | Benchmark                           | Mode | Cnt | Score    |   | Error   | Units |
> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
> | StringIndexOf.advancedWithShortSub1 | avgt | 25  | 3613.943 | ± | 178.153 | ns/op |
> | StringIndexOf.advancedWithShortSub2 | avgt | 25  | 923.169  | ± | 47.123  | ns/op |

Ilya Gavrilin has updated the pull request incrementally with one additional commit since the last revision:

  Fix whitespaces in assembler_riscv.hpp

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/14823/files
  - new: https://git.openjdk.org/jdk/pull/14823/files/6a35ea6f..c7b04164

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=14823&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14823&range=02-03

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/14823.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14823/head:pull/14823

PR: https://git.openjdk.org/jdk/pull/14823

From duke at openjdk.org  Fri Jul 14 14:38:47 2023
From: duke at openjdk.org (Ilya Gavrilin)
Date: Fri, 14 Jul 2023 14:38:47 GMT
Subject: Integrated: 8311862: RISC-V: small improvements to shift immediate instructions
In-Reply-To:
References:
Message-ID:

On Tue, 11 Jul 2023 09:02:39 GMT, Ilya Gavrilin wrote:

> Please review this small change for slli, srli and srai.
> The slli change allows replacing slli Rd, Rs, 0 with addi Rd, Rs, 0 (and no instruction is emitted if Rd == Rs).
> addi with 0 has a higher chance of being just a register renaming that does not use the ALU at all.
> We have observed a small positive effect on hifive (and no change on thead).
>
> testing: tier1 and tier2 on hifive, also hotspot/jtreg/compiler/intrinsics/string tests on Qemu with UseZba
>
> performance on hifive, before:
> | Benchmark                           | Mode | Cnt | Score    |   | Error   | Units |
> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
> | StringIndexOf.advancedWithShortSub1 | avgt | 25  | 4035.143 | ± | 191.262 | ns/op |
> | StringIndexOf.advancedWithShortSub2 | avgt | 25  | 1145.807 | ± | 14.610  | ns/op |
>
> with patch:
> | Benchmark                           | Mode | Cnt | Score    |   | Error   | Units |
> |:-----------------------------------:|:----:|:---:|:--------:|:-:|:-------:|:-----:|
> | StringIndexOf.advancedWithShortSub1 | avgt | 25  | 3613.943 | ± | 178.153 | ns/op |
> | StringIndexOf.advancedWithShortSub2 | avgt | 25  | 923.169  | ± | 47.123  | ns/op |

This pull request has now been integrated.

Changeset: f3b96f69
Author:    Ilya Gavrilin
Committer: Vladimir Kempik
URL:       https://git.openjdk.org/jdk/commit/f3b96f6937395246f09ac2ef3dfca5854217a0da
Stats:     14 lines in 1 file changed: 12 ins; 0 del; 2 mod

8311862: RISC-V: small improvements to shift immediate instructions

Reviewed-by: luhenry, fjiang, fyang

-------------

PR: https://git.openjdk.org/jdk/pull/14823

From simonis at openjdk.org  Fri Jul 14 15:43:12 2023
From: simonis at openjdk.org (Volker Simonis)
Date: Fri, 14 Jul 2023 15:43:12 GMT
Subject: RFR: 8311500: StackWalker.getCallerClass() throws UOE if invoked reflectively [v3]
In-Reply-To:
References:
Message-ID:

On Thu, 13 Jul 2023 19:27:01 GMT, Mandy Chung wrote:

> I wonder if we should move the check to throw UOE if called by caller-sensitive method in Java as a general guidance to implement the runtime in Java if desirable. That means it requires the VM to fill not only the class in the buffer but also needs a flag to indicate whether the method is caller-sensitive or not. It's a tradeoff of the buffer size. The common case for `getCallerClass` is being invoked statically and should find the class from the first batch. Only if it's invoked reflectively and if filtered in Java, it'll fetch more batches and hence the performance would not be as fast as filtered in the VM but I think that's okay since it's uncommon.
>
> Would you have cycles to implement this alternative and determine any performance impact to common cases? Then evaluate this further.
>
> The benchmark is at `test/micro/org/openjdk/bench/java/lang/StackWalkBench.java`.
I'm open to investigate other approaches, but this is a real bug and I'd first like to fix it before thinking about implementation improvements. Notice also, that this issue impacts 17 and 21 as well, so the fix should be easily and quickly downportable. We already have [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447) for performance improvements of `getCallerClass()` (which I'm also working on). I'd therefore suggest to first fix this and postpone further refactorings and improvements to [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447) or another follow-up issue. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14773#issuecomment-1636039777 From mchung at openjdk.org Fri Jul 14 16:20:14 2023 From: mchung at openjdk.org (Mandy Chung) Date: Fri, 14 Jul 2023 16:20:14 GMT Subject: RFR: 8311500: StackWalker.getCallerClass() throws UOE if invoked reflectively [v3] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 15:47:35 GMT, Volker Simonis wrote: >> As the included jtreg test demonstrates, `StackWalker.getCallerClass()` can throw an `UnsupportedOperationException` if called reflectively. Currently this only happens if we invoke `StackWalker.getCallerClass()` recursively reflectively, but this issue will become more prominent once we fix [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447). The gory details follow below: >> >> The protocol between the Java API and the JVM for `StackWalker.getCallerClass()/walk()` is as follows: >> - On the Java side, `StackWalker` calls into `StackStreamFactory` for the real work. >> - For `StackWalker.getCallerClass()` `StackStreamFactory` basically creates a `Class[]` which will be passed down and filled in the JVM. For `StackWalker.walk()` it will normally be a `StackFrameInfo[]` (or a `LiveStackFrameInfo[]` if the internal `ExtendedOption.LOCALS_AND_OPERANDS` option was used). 
>> - The default size of these arrays is currently `StackStreamFactory.SMALL_BATCH` which is 8 (but see [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447)). >> - `StackStreamFactory` then calls `AbstractStackWalker.callStackWalk()` which is natively implemented in the VM by `JVM_CallStackWalk()`. >> - `JVM_CallStackWalk()` calls `StackWalk::walk()` which calls `StackWalk::fetchFirstBatch()` which calls `StackWalk::fill_in_frames()` which walks the stack and fills in the available class/stackframe slots in the passed in array until the array is full or there are no more stack frames. >> - Once `StackWalk::fill_in_frames()` returns, `StackWalk::fetchFirstBatch()` calls back to Java by invoking `AbstractStackWalker::doStackWalk()` to consume the result. >> - `AbstractStackWalker::doStackWalk()` calls `consumeFrames()` (which is overridden depending on whether we initially called `getCallerClass()` or `walk()`) which consumes the frames until it either finishes (e.g. finds the caller class) or until there are no more frames. >> - In the latter case `consumeFrames()` will call into the VM again by calling `AbstractStackWalker.fetchStackFrames()` to fetch additional frames from the stack. >> - `AbstractStackWalker.fetchStackFrames()` is implemented by `JVM_MoreStackWalk()` which calls `StackWalk::fetchNextBatch()` which calls `StackWalk::fill_in_frames()` (the same method that already fetched the initial batch of frames). >> >> Following is a stacktrace of what I've explained so far: >> >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x1... > > Volker Simonis has updated the pull request incrementally with one additional commit since the last revision: > > Fixed case when calling getCallerClass() from a @CallerSensitive method reflectively I expect `getCallerClass` is rarely called via reflection. JDK-8210375 was reported 5 years ago but no customer has reported this issue since then.
This bug is not critical to be fixed immediately and so we should look into a proper solution. I do have a question if filtering in the VM is the right thing to do. We need to consider the alternatives and understand why this solution is chosen. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14773#issuecomment-1636082537 From stuefe at openjdk.org Fri Jul 14 17:11:01 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 17:11:01 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure In-Reply-To: References: Message-ID: <3NM-Es2hN_XnwUtve6UwwpIvXGhda1TJF0HxHuQ8w-g=.9f2a88b8-3318-49d7-8391-682de77993c8@github.com> On Fri, 14 Jul 2023 12:44:03 GMT, sid8606 wrote: > All faults on s390x give the address only on page granularity. > e.g. if you use 0x123456 as fail address you get si_addr == 0x123000 I'm surprised that not much more breaks. We rely a lot on si_addr. Are we sure this is not a kernel bug on s390? Any chance to get this behavior changed on the kernel level? ------------- PR Review: https://git.openjdk.org/jdk/pull/14888#pullrequestreview-1530655474 From stuefe at openjdk.org Fri Jul 14 17:11:04 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 17:11:04 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure In-Reply-To: <9HwtHNPda3Utaf9OjVxuv-VO4jSaQnXoDNZeKeWKp54=.a913553b-0cc1-482a-b676-b3f477f5f4f2@github.com> References: <9HwtHNPda3Utaf9OjVxuv-VO4jSaQnXoDNZeKeWKp54=.a913553b-0cc1-482a-b676-b3f477f5f4f2@github.com> Message-ID: <_-3DnlZiSfWIdQeHYgTKDbAVNmbznTCff6lgHBaK-NM=.e9021c70-dc28-4d33-9b1e-b20610eb2033@github.com> On Fri, 14 Jul 2023 13:27:55 GMT, Amit Kumar wrote: >> All faults on s390x give the address only on page granularity. >> e.g. if you use 0x123456 as fail address you get si_addr == 0x123000 > > src/hotspot/share/utilities/vmError.hpp line 210: > >> 208: >> 209: // Non-null address guaranteed to generate a SEGV mapping error on read, for test purposes. 
>> 210:   static constexpr intptr_t segfault_address = AIX_ONLY(-1) NOT_AIX(4 * K);
>
> Are we sure ARM & RISC-V will be happy with these changes, Maybe using `S390_ONLY` will be appropriate(?)

Yes. And before we start cascading ifdefs here, please spread this definition out into the respective platform files. For the s390 version, could you please add a clear comment describing the reasoning? Do all s390 Linux variants use 4K pages - is it valid to hardcode that?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14888#discussion_r1263952812

From matsaave at openjdk.org  Fri Jul 14 18:14:41 2023
From: matsaave at openjdk.org (Matias Saavedra Silva)
Date: Fri, 14 Jul 2023 18:14:41 GMT
Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump()
Message-ID:

Currently we exit the VM after static dumping with `MetaspaceShared::exit_after_static_dump()`.

    // We have finished dumping the static archive. At this point, there may be pending VM
    // operations. We have changed some global states (such as vmClasses::_klasses) that
    // may cause these VM operations to fail. For safety, forget these operations and
    // exit the VM directly.
    void MetaspaceShared::exit_after_static_dump() {
      os::_exit(0);
    }

As the comment suggests, the VM state is altered when preparing and performing the static dump, so this change aims to prevent these state changes so the VM can exit normally after the static dump completes. There are three major aspects to this change:
1. Since the resolved references array in the Constant Pool is altered when preparing for a static dump, a "scratch copy" is created and archived instead.
2. Symbols are sorted by address and have their hash recalculated. Similarly to point 1, the copies of the symbols that are to be archived have their hashes updated as opposed to the originals.
3. The handling of -Xshare:dump during argument parsing is changed so that the VM can continue and exit normally with an exit code of 0.
------------- Commit messages: - Merge branch 'master' into remove_exit_after_dump_8306582 - Windows fix - 8306582: Remove MetaspaceShared::exit_after_static_dump() Changes: https://git.openjdk.org/jdk/pull/14879/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14879&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8306582 Stats: 109 lines in 10 files changed: 76 ins; 26 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/14879.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14879/head:pull/14879 PR: https://git.openjdk.org/jdk/pull/14879 From kvn at openjdk.org Fri Jul 14 18:20:00 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 14 Jul 2023 18:20:00 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v5] In-Reply-To: References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: On Fri, 14 Jul 2023 10:33:51 GMT, Dean Long wrote: >> This is hopefully the last set of integer overflow fixes for hotspot. Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > fix IndexSetWatch range Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14883#pullrequestreview-1530764752 From amenkov at openjdk.org Fri Jul 14 18:34:12 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 14 Jul 2023 18:34:12 GMT Subject: RFR: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads In-Reply-To: References: Message-ID: <5Go0n-I4fvK1kVngILYSuwhx0APHFlTFxF2us8R9m58=.15f4d1e0-38ea-41e7-a3ef-02e287625380@github.com> On Fri, 14 Jul 2023 03:54:42 GMT, Serguei Spitsyn wrote: > @alexmenkov Do you consider backporting this to 21? maybe it makes sense. 
> src/hotspot/share/prims/jvmtiEnvBase.cpp line 804: > >> 802: if (ext_suspended && ((state & JVMTI_THREAD_STATE_ALIVE) != 0)) { >> 803: state |= JVMTI_THREAD_STATE_SUSPENDED; >> 804: } > > One question unrelated to this bug and your fix. > I wonder if any check and handling is needed for the case: > `if (ext_suspended && ((state & JVMTI_THREAD_STATE_ALIVE) == 0))` > Not sure this condition is even possible. But do we need to add an assert here? AFAIU it's possible in the case when we have terminated VT and JvmtiVTSuspender is requested to suspend all virtual threads ------------- PR Comment: https://git.openjdk.org/jdk/pull/14878#issuecomment-1636234168 PR Review Comment: https://git.openjdk.org/jdk/pull/14878#discussion_r1264027796 From amenkov at openjdk.org Fri Jul 14 18:44:10 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 14 Jul 2023 18:44:10 GMT Subject: RFR: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads In-Reply-To: <96m94D7WjS-lCa9jfrxkz4MNFYisVBjTABH31Ba4roY=.3d35d95a-c253-4e2b-b2c0-7084ca6016f2@github.com> References: <96m94D7WjS-lCa9jfrxkz4MNFYisVBjTABH31Ba4roY=.3d35d95a-c253-4e2b-b2c0-7084ca6016f2@github.com> Message-ID: On Fri, 14 Jul 2023 06:11:55 GMT, David Holmes wrote: > The change seems consistent with the definition of `GetThreadState`. But I note that the interrupt bit should also only be set if the target is alive. we get interrupt bit from Thread object, so the value is consistent with terminated state. 
suspend bit is a bit different - see my reply to Serguei ------------- PR Comment: https://git.openjdk.org/jdk/pull/14878#issuecomment-1636252686 From shade at openjdk.org Fri Jul 14 18:52:17 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 14 Jul 2023 18:52:17 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v11] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Fri, 14 Jul 2023 09:52:04 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. 
That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: > > - Aleksey trim stats; state printerich; David better gtest; misc stuff > - David simple cosmetics I am okay with this, with only minor nits left. Good job! src/hotspot/share/runtime/trimNativeHeap.cpp line 133: > 131: } > 132: > 133: log_trace(trimnative)("Times %u suspended, %u timed, %u safepoint", `Times: `, I think. src/hotspot/share/runtime/trimNativeHeap.cpp line 204: > 202: } > 203: } > 204: log_debug(trimnative)("Trim resumed after %s (%u suspend requests)", reason, n); Can you do it like I did in https://github.com/openjdk/jdk/files/12043977/trimnative-shipilev-2.patch ? We don't say "resumed" if it really was not resumed due to non-zero suspend count. ------------- Marked as reviewed by shade (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14781#pullrequestreview-1530805781 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1264043485 PR Review Comment: https://git.openjdk.org/jdk/pull/14781#discussion_r1264037109 From dlong at openjdk.org Fri Jul 14 18:59:52 2023 From: dlong at openjdk.org (Dean Long) Date: Fri, 14 Jul 2023 18:59:52 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v6] In-Reply-To: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: <6TzL1D5xX6aIBystm7zomVWgnt61_4IQWzPdJDzdz-c=.0540949b-d730-4d74-8f41-5add520cd8d2@github.com> > This is hopefully the last set of integer overflow fixes for hotspot. Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390. Dean Long has updated the pull request incrementally with one additional commit since the last revision: add comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14883/files - new: https://git.openjdk.org/jdk/pull/14883/files/7ed08326..0c14f96c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14883&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14883&range=04-05 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14883.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14883/head:pull/14883 PR: https://git.openjdk.org/jdk/pull/14883 From dlong at openjdk.org Fri Jul 14 19:54:25 2023 From: dlong at openjdk.org (Dean Long) Date: Fri, 14 Jul 2023 19:54:25 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v7] In-Reply-To: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> References: 
<9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: > This is hopefully the last set of integer overflow fixes for hotspot. Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390. Dean Long has updated the pull request incrementally with one additional commit since the last revision: another counter overflow ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14883/files - new: https://git.openjdk.org/jdk/pull/14883/files/0c14f96c..1153ce96 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14883&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14883&range=05-06 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14883.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14883/head:pull/14883 PR: https://git.openjdk.org/jdk/pull/14883 From dlong at openjdk.org Fri Jul 14 19:54:25 2023 From: dlong at openjdk.org (Dean Long) Date: Fri, 14 Jul 2023 19:54:25 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v6] In-Reply-To: <6TzL1D5xX6aIBystm7zomVWgnt61_4IQWzPdJDzdz-c=.0540949b-d730-4d74-8f41-5add520cd8d2@github.com> References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> <6TzL1D5xX6aIBystm7zomVWgnt61_4IQWzPdJDzdz-c=.0540949b-d730-4d74-8f41-5add520cd8d2@github.com> Message-ID: On Fri, 14 Jul 2023 18:59:52 GMT, Dean Long wrote: >> This is hopefully the last set of integer overflow fixes for hotspot. Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > add comment My tier10 testing is still running, so I may add a few more fixes to this. I just added one in c1_Compilation.cpp. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14883#issuecomment-1636341783 From stuefe at openjdk.org Fri Jul 14 20:03:24 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 20:03:24 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v12] In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. 
That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... 
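For readers following the background section, here is a minimal sketch of what a manual trim after a malloc spike looks like, assuming a glibc-based Linux system. The helper names `trim_native_heap` and `malloc_spike` and the -1 fallback are hypothetical and are not taken from the PR; only `malloc_trim(3)` itself is the real glibc call.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>
#if defined(__GLIBC__)
#include <malloc.h>  // malloc_trim is a glibc extension
#endif

// Hypothetical helper (not from the PR): asks glibc to hand free C-heap
// memory back to the OS. Returns 1 if memory was released, 0 if nothing
// could be released, and -1 where glibc (and thus malloc_trim) is absent.
int trim_native_heap() {
#if defined(__GLIBC__)
  return malloc_trim(0);  // 0 = leave no extra padding at the top of the heap
#else
  return -1;
#endif
}

// Hypothetical helper: simulates a short-lived malloc spike by allocating
// many small blocks and freeing them all again.
void malloc_spike(std::size_t blocks, std::size_t block_size) {
  std::vector<void*> allocated;
  allocated.reserve(blocks);
  for (std::size_t i = 0; i < blocks; i++) {
    allocated.push_back(std::malloc(block_size));
  }
  for (void* p : allocated) {
    std::free(p);  // free(nullptr) is a no-op, so failed allocations are harmless
  }
}
```

Calling `malloc_spike(...)` followed by `trim_native_heap()` mirrors, very roughly, what customers did by hand with `jcmd System.trim_native_heap`; whether RSS actually shrinks depends on the allocator state.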
Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: - Display unsupported text with UL warning level, + test - Last Aleksey Feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14781/files - new: https://git.openjdk.org/jdk/pull/14781/files/34037c8f..d22248f1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=10-11 Stats: 9 lines in 2 files changed: 5 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/14781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781 PR: https://git.openjdk.org/jdk/pull/14781 From stuefe at openjdk.org Fri Jul 14 20:03:26 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 14 Jul 2023 20:03:26 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v10] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Fri, 14 Jul 2023 08:45:04 GMT, Aleksey Shipilev wrote: >>> Realized that in production, we would like to see why trimmer might be late. I think this would look even better: [trimnative-shipilev-2.patch](https://github.com/openjdk/jdk/files/12043977/trimnative-shipilev-2.patch) >> >> I thought about this too, but you don't really want to know if it was suspended for every wait interval, but for every trim interval. In other words, you want to know how many trims had been moved up because a safepoint had been happening, and how many trims had been skipped due to pause. Getting these infos is harder than just increasing a counter. Is it worth the added complexity? >> >> I'll think something up. > >> > Realized that in production, we would like to see why trimmer might be late. 
I think this would look even better: [trimnative-shipilev-2.patch](https://github.com/openjdk/jdk/files/12043977/trimnative-shipilev-2.patch) >> >> I thought about this too, but you don't really want to know if it was suspended for every wait interval, but for every trim interval. In other words, you want to know how many trims had been moved up because a safepoint had been happening, and how many trims had been skipped due to pause. > > Well, yes. Isn't that what my patch did? Thanks a lot @shipilev! Fixed the last nits, and fixed an issue where the "this platform does not support..." text was not displayed with warning level and hence not visible without Xlog. Plus, test for that. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1636349888 From duke at openjdk.org Fri Jul 14 20:25:14 2023 From: duke at openjdk.org (duke) Date: Fri, 14 Jul 2023 20:25:14 GMT Subject: Withdrawn: 8305898: Alternative self-forwarding mechanism In-Reply-To: References: Message-ID: On Wed, 3 May 2023 13:03:17 GMT, Roman Kennke wrote: > Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. This pull request has been closed without being integrated. 
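A minimal sketch of the idea in the withdrawn proposal, for illustration only. It assumes a 64-bit header word and assumes that "bit #3" means the third-lowest bit (mask 0b100), the position the biased-locking bit used to occupy. The constant and function names are hypothetical and do not come from the PR.

```cpp
#include <cstdint>

// Assumption: the self-forwarded flag is the third-lowest header bit.
constexpr uint64_t kSelfForwardedMask = uint64_t{1} << 2;

// Marks an object as self-forwarded without touching any other header
// bits, so class information in the upper bits stays readable.
inline uint64_t set_self_forwarded(uint64_t header) {
  return header | kSelfForwardedMask;
}

inline bool is_self_forwarded(uint64_t header) {
  return (header & kSelfForwardedMask) != 0;
}

// Restores the header once the failed evacuation has been handled.
inline uint64_t clear_self_forwarded(uint64_t header) {
  return header & ~kSelfForwardedMask;
}
```

The contrast with storing a pointer to self is that only one bit changes, so a heap walker can still read the upper header bits while the flag is set.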
------------- PR: https://git.openjdk.org/jdk/pull/13779 From dlong at openjdk.org Fri Jul 14 20:59:06 2023 From: dlong at openjdk.org (Dean Long) Date: Fri, 14 Jul 2023 20:59:06 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v7] In-Reply-To: References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: On Fri, 14 Jul 2023 19:54:25 GMT, Dean Long wrote: >> This is hopefully the last set of integer overflow fixes for hotspot. Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > another counter overflow @offamitkumar @TheRealMDoerr @RealFYang @bulasevich please check that I haven't broken the s390/ppc/riscv/arm32 ports. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14883#issuecomment-1636430204 From matsaave at openjdk.org Fri Jul 14 21:16:08 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Fri, 14 Jul 2023 21:16:08 GMT Subject: RFR: 8308903: Print detailed info for Java objects in -Xlog:cds+map In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 00:39:30 GMT, Ioi Lam wrote: > This PR adds detailed printing of oop information with `-Xlog:cds+map+oop=trace`, or simply `-Xlog:cds+map*=trace`. The information is useful for debugging contents of the CDS archived heap objects. > > The output looks like `oopDesc::print_on(tty)`, but we need to print the pointers using the locations of the objects at runtime. The examples below show how a `String` references its `value` array. 
> > Example with `-XX:-UseCompressedOops`: > > > 0x00000000100001f0: @@ Object java.lang.String > 0x00000000100001f0: 0000006ff6ab8d01 88d47c5b00010290 0000000000010000 0000000010000210 > - klass: 'java/lang/String' 0x0000000800010290 > - ---- fields (total size 4 words): > - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) > - private final 'coder' 'B' @16 0 (0x00) > - private 'hashIsZero' 'Z' @17 false (0x00) > - injected 'flags' 'B' @18 1 (0x01) > - private final 'value' '[B' @24 0x0000000010000210 [B length: 6 > 0x0000000010000210: @@ Object [B length: 6 > 0x0000000010000210: 000000693b708001 00000006000024c8 0000574f5252414e > - klass: {type array byte} 0x00000008000024c8 > - 0: 4e N > - 1: 41 A > - 2: 52 R > - 3: 52 R > - 4: 4f O > - 5: 57 W > > > Example with `-XX:+UseCompressedOops`. Note that the narrowOop is also printed: > > > 0x00000007ffc001b8: @@ Object (0xfff80037) java.lang.String > 0x00000007ffc001b8: f6ab8d01 0000006f 00010290 88d47c5b 00010000 fff8003a > - klass: 'java/lang/String' 0x0000000800010290 > - ---- fields (total size 3 words): > - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) > - private final 'coder' 'B' @16 0 (0x00) > - private 'hashIsZero' 'Z' @17 false (0x00) > - injected 'flags' 'B' @18 1 (0x01) > - private final 'value' '[B' @20 0x00000007ffc001d0 (0xfff8003a) [B length: 6 > 0x00000007ffc001d0: @@ Object (0xfff8003a) [B length: 6 > 0x00000007ffc001d0: 3b708001 00000069 000024c8 00000006 5252414e 0000574f > - klass: {type array byte} 0x00000008000024c8 > - 0: 4e N > - 1: 41 A > - 2: 52 R > - 3: 52 R > - 4: 4f O > - 5: 57 W I have two style suggestions but otherwise this looks good! src/hotspot/share/cds/archiveBuilder.cpp line 1100: > 1098: print_oop_with_requested_addr_cr(_st, _source_obj->obj_field(fd->offset())); > 1099: break; > 1100: default: The if block inside the default case leads to a lot of indentation. Maybe this could be its own method?
src/hotspot/share/cds/archiveHeapWriter.cpp line 546: > 544: > 545: BitMap::idx_t idx = requested_field_addr - (Metadata**) _requested_bottom; > 546: return idx < heap_info->ptrmap()->size() && heap_info->ptrmap()->at(idx); I believe this should be something like `return (idx < heap_info->ptrmap()->size()) && (heap_info->ptrmap()->at(idx) != nullptr);` ------------- Changes requested by matsaave (Committer). PR Review: https://git.openjdk.org/jdk/pull/14841#pullrequestreview-1531047633 PR Review Comment: https://git.openjdk.org/jdk/pull/14841#discussion_r1264175808 PR Review Comment: https://git.openjdk.org/jdk/pull/14841#discussion_r1264179028 From coleenp at openjdk.org Fri Jul 14 21:28:31 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 14 Jul 2023 21:28:31 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp Message-ID: Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug ------------- Commit messages: - 8312121: Fix -Wconversion warnings in tribool.hpp Changes: https://git.openjdk.org/jdk/pull/14892/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14892&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8312121 Stats: 13 lines in 5 files changed: 7 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/14892.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14892/head:pull/14892 PR: https://git.openjdk.org/jdk/pull/14892 From dholmes at openjdk.org Fri Jul 14 21:35:13 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 14 Jul 2023 21:35:13 GMT Subject: RFR: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads In-Reply-To: <5Go0n-I4fvK1kVngILYSuwhx0APHFlTFxF2us8R9m58=.15f4d1e0-38ea-41e7-a3ef-02e287625380@github.com> References: 
<5Go0n-I4fvK1kVngILYSuwhx0APHFlTFxF2us8R9m58=.15f4d1e0-38ea-41e7-a3ef-02e287625380@github.com> Message-ID: On Fri, 14 Jul 2023 18:30:21 GMT, Alex Menkov wrote: >> src/hotspot/share/prims/jvmtiEnvBase.cpp line 804: >> >>> 802: if (ext_suspended && ((state & JVMTI_THREAD_STATE_ALIVE) != 0)) { >>> 803: state |= JVMTI_THREAD_STATE_SUSPENDED; >>> 804: } >> >> One question unrelated to this bug and your fix. >> I wonder if any check and handling is needed for the case: >> `if (ext_suspended && ((state & JVMTI_THREAD_STATE_ALIVE) == 0))` >> Not sure this condition is even possible. But do we need to add an assert here? > > AFAIU it's possible in the case when we have terminated VT and JvmtiVTSuspender is requested to suspend all virtual threads So there is a window in which a VT is marked as terminated yet is still visible for actions like this? For regular threads we would always have filtered out thread in the process of exiting. Seeing terminated threads seems potentially problematic but perhaps all the VT code is prepared to handle this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14878#discussion_r1264190561 From amenkov at openjdk.org Fri Jul 14 22:29:04 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 14 Jul 2023 22:29:04 GMT Subject: RFR: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads In-Reply-To: References: <5Go0n-I4fvK1kVngILYSuwhx0APHFlTFxF2us8R9m58=.15f4d1e0-38ea-41e7-a3ef-02e287625380@github.com> Message-ID: On Fri, 14 Jul 2023 21:31:21 GMT, David Holmes wrote: >> AFAIU it's possible in the case when we have terminated VT and JvmtiVTSuspender is requested to suspend all virtual threads > > So there is a window in which a VT is marked as terminated yet is still visible for actions like this? For regular threads we would always have filtered out thread in the process of exiting. 
Seeing terminated threads seems potentially problematic but perhaps all the VT code is prepared to handle this. I hope there is no such window in GetThreadState() case, but get_vthread_state method is also called from MultipleStackTracesCollector::fill_frames and there is a comment there: // Note that either or both of thr and thread_oop // may be null if the thread is new or has exited. I keep this check for safety (though fill_frames does not care about suspend bit) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14878#discussion_r1264216410 From dlong at openjdk.org Sat Jul 15 01:34:12 2023 From: dlong at openjdk.org (Dean Long) Date: Sat, 15 Jul 2023 01:34:12 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp In-Reply-To: References: Message-ID: On Fri, 14 Jul 2023 19:47:39 GMT, Coleen Phillimore wrote: > Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. > Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug Marked as reviewed by dlong (Reviewer). src/hotspot/share/utilities/tribool.hpp line 44: > 42: TriBool() : _value(0) {} > 43: TriBool(bool value) : _value(value) { > 44: _value = _value | 2; You can also do like line 39 and use `(u1)(expr) & 3` or `((expr & 3u)` also seems to work. ------------- PR Review: https://git.openjdk.org/jdk/pull/14892#pullrequestreview-1531210148 PR Review Comment: https://git.openjdk.org/jdk/pull/14892#discussion_r1264277785 From amitkumar at openjdk.org Sat Jul 15 05:23:12 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Sat, 15 Jul 2023 05:23:12 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v7] In-Reply-To: References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: On Fri, 14 Jul 2023 19:54:25 GMT, Dean Long wrote: >> This is hopefully the last set of integer overflow fixes for hotspot. 
Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390. > > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > another counter overflow I couldn't see any s390-specific code change here. Still tested fastdebug & release builds. Looks Good. ------------- Marked as reviewed by amitkumar (Committer). PR Review: https://git.openjdk.org/jdk/pull/14883#pullrequestreview-1531285512 From stuefe at openjdk.org Sat Jul 15 06:40:26 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 15 Jul 2023 06:40:26 GMT Subject: RFR: 8308903: Print detailed info for Java objects in -Xlog:cds+map In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 00:39:30 GMT, Ioi Lam wrote: > This PR adds detailed printing of oop information with `-Xlog:cds+map+oop=trace`, or simply `-Xlog:cds+map*=trace`. The information is useful for debugging contents of the CDS archived heap objects. > > The output looks like `oopDesc::print_on(tty)`, but we need to print the pointers using the locations of the objects at runtime. The examples below show how a `String` references its `value` array. 
> > Example with `-XX:-UseCompressedOops`: > > > 0x00000000100001f0: @@ Object java.lang.String > 0x00000000100001f0: 0000006ff6ab8d01 88d47c5b00010290 0000000000010000 0000000010000210 > - klass: 'java/lang/String' 0x0000000800010290 > - ---- fields (total size 4 words): > - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) > - private final 'coder' 'B' @16 0 (0x00) > - private 'hashIsZero' 'Z' @17 false (0x00) > - injected 'flags' 'B' @18 1 (0x01) > - private final 'value' '[B' @24 0x0000000010000210 [B length: 6 > 0x0000000010000210: @@ Object [B length: 6 > 0x0000000010000210: 000000693b708001 00000006000024c8 0000574f5252414e > - klass: {type array byte} 0x00000008000024c8 > - 0: 4e N > - 1: 41 A > - 2: 52 R > - 3: 52 R > - 4: 4f O > - 5: 57 W > > > Example with `-XX:+UseCompressedOops`. Note that the narrowOop is also printed: > > > 0x00000007ffc001b8: @@ Object (0xfff80037) java.lang.String > 0x00000007ffc001b8: f6ab8d01 0000006f 00010290 88d47c5b 00010000 fff8003a > - klass: 'java/lang/String' 0x0000000800010290 > - ---- fields (total size 3 words): > - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) > - private final 'coder' 'B' @16 0 (0x00) > - private 'hashIsZero' 'Z' @17 false (0x00) > - injected 'flags' 'B' @18 1 (0x01) > - private final 'value' '[B' @20 0x00000007ffc001d0 (0xfff8003a) [B length: 6 > 0x00000007ffc001d0: @@ Object (0xfff8003a) [B length: 6 > 0x00000007ffc001d0: 3b708001 00000069 000024c8 00000006 5252414e 0000574f > - klass: {type array byte} 0x00000008000024c8 > - 0: 4e N > - 1: 41 A > - 2: 52 R > - 3: 52 R > - 4: 4f O > - 5: 57 W Looks useful. Found nothing to note but a small style nit. But a more extensive test would be nice, e.g. matching the output of this log against a known object layout.
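As a reading aid for the compressed-oops dump quoted above, the relation between the printed narrowOop values and the full addresses can be sketched as follows. The base of 0 and shift of 3 are inferred from the numbers in the log and are not stated in the PR; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Decodes a 32-bit narrowOop into a full address: base + (narrow << shift).
inline uint64_t decode_narrow_oop(uint32_t narrow, uint64_t base, unsigned shift) {
  assert(shift < 32);
  return base + (static_cast<uint64_t>(narrow) << shift);
}

// Inverse operation. The address must lie at or above the base, be aligned
// to (1 << shift), and fall inside the 32-bit encodable range.
inline uint32_t encode_narrow_oop(uint64_t addr, uint64_t base, unsigned shift) {
  assert(shift < 32);
  assert(addr >= base);
  assert(((addr - base) & ((uint64_t{1} << shift) - 1)) == 0);
  assert(((addr - base) >> shift) <= UINT32_MAX);
  return static_cast<uint32_t>((addr - base) >> shift);
}
```

Under those assumptions, `decode_narrow_oop(0xfff8003a, 0, 3)` gives 0x00000007ffc001d0, which is the address printed for the `value` array, and 0xfff80037 maps to 0x00000007ffc001b8, the address of the `String` itself.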
src/hotspot/share/cds/archiveBuilder.cpp line 1152: > 1150: } > 1151: } else { > 1152: st.print_cr(" - ---- fields (total size " SIZE_FORMAT " words):", source_oop->size()); Nit, proposal: "fields (xx words):" (no leading dashes, no "total size") for more density ------------- PR Review: https://git.openjdk.org/jdk/pull/14841#pullrequestreview-1531297755 PR Review Comment: https://git.openjdk.org/jdk/pull/14841#discussion_r1264342465 From luhenry at openjdk.org Sat Jul 15 06:53:11 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Sat, 15 Jul 2023 06:53:11 GMT Subject: Integrated: 8310949: RISC-V: Initialize UseUnalignedAccesses In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 12:45:14 GMT, Ludovic Henry wrote: > 8310949: RISC-V: Initialize UseUnalignedAccesses This pull request has now been integrated. Changeset: e8f66bf8 Author: Ludovic Henry URL: https://git.openjdk.org/jdk/commit/e8f66bf88ceb30383b50d1fac7a2583e3339ece0 Stats: 7 lines in 1 file changed: 7 ins; 0 del; 0 mod 8310949: RISC-V: Initialize UseUnalignedAccesses Reviewed-by: rehn, vkempik, fyang ------------- PR: https://git.openjdk.org/jdk/pull/14676 From rkennke at openjdk.org Sat Jul 15 07:57:26 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Sat, 15 Jul 2023 07:57:26 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v18] In-Reply-To: References: Message-ID: On Fri, 16 Jun 2023 15:01:45 GMT, Roman Kennke wrote: >> Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. 
That preserves the crucial class information in the upper bits of the header until the full header gets restored. > > Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 30 commits: > > - Merge branch 'JDK-8305896' into JDK-8305898 > - Merge branch 'JDK-8305896' into JDK-8305898 > - Update comment about mark-word layout > - Merge branch 'JDK-8305896' into JDK-8305898 > - Fix tests on 32bit builds > - Merge branch 'JDK-8305896' into JDK-8305898 > - Merge branch 'JDK-8305896' into JDK-8305898 > - wqRevert "Rename self-forwarded -> forward-failed" > > This reverts commit 4d9713ca239da8e294c63887426bfb97240d3130. > - Merge branch 'JDK-8305896' into JDK-8305898 > - Merge remote-tracking branch 'origin/JDK-8305898' into JDK-8305898 > - ... and 20 more: https://git.openjdk.org/jdk/compare/524f9c52...3838ac05 /Open ------------- PR Comment: https://git.openjdk.org/jdk/pull/13779#issuecomment-1636702592 From stuefe at openjdk.org Sat Jul 15 15:43:58 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 15 Jul 2023 15:43:58 GMT Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space [v3] In-Reply-To: References: Message-ID: > TL;DR This patch introduces a new reservation API to reserve memory in low address space; depending on the OS, it may use optimized placement techniques. That new API is used to optimize the placement of class space and CDS for zero-based encoding. > > A future RFE will use the same API to optimize the zero-based heap reservation and thereby consolidate a lot of coding. We also plan to use this API in other places, e.g. for Shenandoah CollectionSet reservation. > > ------- > > With CDS off or at dump time, we currently attempt to optimize class space location by reserving in low address ranges. 
> > We do this by examining the *java heap* end (which has been allocated at that point) and, if that had been allocated in lower address regions, attempt to allocate adjacent to it. Essentially, we piggyback on what we hope for is an optimized heap placement. If that fails, we attempt to map at HeapBaseMinAddress. > > This approach has many disadvantages: > > - it depends on the VM either using CompressedOops and getting a zero-based heap or the region around HeapBaseMinAddress being free. > > - HeapBaseMinAddress is an odd choice: it is 2G on all platforms, for reasons unknown to me, but that denies us half of the valuable low address range below 4G right away. > > - We only get 1 shot. It's either one of these two addresses. > > - And we only use this strategy for CDS=off or CDS=dumptime; we don't use it for the CDS-runtime-fallback-case when attaching to the primary attach point failed. > > - It assumes narrow Klass encoding uses the same geometry (bit size, shift) as narrow Oops, which is not guaranteed with future developments (lilliput). > > - It actually reduces the chance of getting a zero-based *java heap*. This is because when attempting to place the heap, we leave a gap for what we assume will be the later class space. That gap is `CompressedClassSpaceSize` bytes, which is often grossly over-dimensioned. A zero-based heap is more valuable than a zero-based class space. Therefore the heap should get the best chance of low-address heap reservation. > > - It introduces an unnecessary dependency between heap reservation, narrow Oop encoding, and class space reservation. That makes the code base brittle. > > - Getting the heap region to place class space adjacent to it is actually tricky. We lack a common get-heaprange-API because ZGC. This code misuses the CompressedOops interface. But CompressedOops is the encoding range and thus only loosely correlated to the heap range (the latter must cont... 
Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: Fix Windows ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14867/files - new: https://git.openjdk.org/jdk/pull/14867/files/8bb7d705..8d6a1ed4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14867&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14867&range=01-02 Stats: 4 lines in 1 file changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14867.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14867/head:pull/14867 PR: https://git.openjdk.org/jdk/pull/14867 From dholmes at openjdk.org Mon Jul 17 03:02:05 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 17 Jul 2023 03:02:05 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp In-Reply-To: References: Message-ID: <-HYDaQSWc6DgSZk1cm-MpRBw-vc8y1Kh42kTAeR73uo=.8f1d85ce-501e-4ac8-bdf6-6ce441c58d47@github.com> On Sat, 15 Jul 2023 01:30:49 GMT, Dean Long wrote: >> Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. >> Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug > > src/hotspot/share/utilities/tribool.hpp line 44: > >> 42: TriBool() : _value(0) {} >> 43: TriBool(bool value) : _value(value) { >> 44: _value = _value | 2; > > You can also do like line 39 and use `(u1)(expr) & 3` or `((expr & 3u)` also seems to work. Or cast the `2` to `(u1)2` ? 
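To make the variants discussed in this thread concrete, here is a simplified stand-in for the class, assuming a 2-bit field where bit 1 means "value has been set" and bit 0 holds the boolean. `TriBoolSketch` and `encode_masked` are hypothetical names; whether a given form triggers -Wconversion depends on the compiler and is not something this sketch demonstrates.

```cpp
#include <cstdint>

// Simplified approximation of a tri-state boolean stored in two bits.
class TriBoolSketch {
  uint8_t _value : 2;

 public:
  TriBoolSketch() : _value(0) {}  // default state: nothing set

  TriBoolSketch(bool value) : _value(value) {
    // Form used in the patch: initialize first, then OR in the "set" bit.
    _value = _value | 2;
  }

  bool is_default() const { return _value == 0; }
  bool value() const { return (_value & 1) != 0; }
  uint8_t raw() const { return _value; }
};

// Alternative along the lines of the review comment: build the expression
// with unsigned operands and mask it to two bits before narrowing.
inline uint8_t encode_masked(bool value) {
  return static_cast<uint8_t>((static_cast<unsigned>(value) | 2u) & 3u);
}
```

Both forms produce the same bit patterns (2 for false, 3 for true); the discussion is only about which spelling keeps the compiler quiet.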
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14892#discussion_r1264802303 From fyang at openjdk.org Mon Jul 17 03:06:19 2023 From: fyang at openjdk.org (Fei Yang) Date: Mon, 17 Jul 2023 03:06:19 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v8] In-Reply-To: References: Message-ID: <4fZ1UvR2P-V-1SdEAYH9PpJxy3DHHXjZ3-xNCeh6uMI=.30d65371-d006-4ba0-8ed5-cf5b23f9db88@github.com> On Wed, 12 Jul 2023 15:08:41 GMT, Matias Saavedra Silva wrote: >> 8301996: Move field resolution information out of the cpCache > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > RISCV port src/hotspot/cpu/riscv/templateTable_riscv.cpp line 2429: > 2427: InterpreterRuntime::post_field_access), > 2428: c_rarg1, c_rarg2); > 2429: __ load_field_entry(cache, index, 1); Nit: maybe remove the 3rd parameter as it is the same as the default value for `bcp_offset`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1264802108 From pli at openjdk.org Mon Jul 17 03:25:47 2023 From: pli at openjdk.org (Pengfei Li) Date: Mon, 17 Jul 2023 03:25:47 GMT Subject: RFR: 8311130: AArch64: Sync SVE related CPU features with VM options Message-ID: As discussed in PR #14533, keeping AArch64 flag `UseSVE` and its related CPU features in sync helps to simplify rules in IR tests. In this patch, we mask SVE related CPU features off if specified SVE level in VM option is lower than the hardware supported. Also, to support this change, we move the features string construction to the end of the `initialize()` function. We also revert IR rule changes in PR #14533 and fix some code styles. We tested almost full jtreg on SVE, SVE2 and non-SVE CPUs and no new issue is found after this patch. 
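A rough sketch of the masking described above, assuming a plain bitmask of CPU features and a `UseSVE` level of 0, 1 or 2. The enum values and the function name `sync_sve_features` are hypothetical; the actual code in the PR is organized differently.

```cpp
#include <cstdint>

// Hypothetical feature bits, for illustration only.
enum CpuFeature : uint32_t {
  CPU_ASIMD = 1u << 0,
  CPU_SVE   = 1u << 1,
  CPU_SVE2  = 1u << 2,
};

// Clears SVE-related feature bits that exceed the SVE level requested on
// the command line, so the reported features agree with the flag value.
inline uint32_t sync_sve_features(uint32_t hw_features, int use_sve) {
  uint32_t features = hw_features;
  if (use_sve < 2) {
    features &= ~static_cast<uint32_t>(CPU_SVE2);
  }
  if (use_sve < 1) {
    features &= ~static_cast<uint32_t>(CPU_SVE);
  }
  return features;
}
```

With this shape, a features string built after the masking step would no longer advertise SVE2 on SVE2 hardware when the VM runs with a lower `UseSVE` level, which is the property the IR test rules rely on.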
------------- Commit messages: - 8311130: AArch64: Sync SVE related CPU features with VM options Changes: https://git.openjdk.org/jdk/pull/14897/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14897&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311130 Stats: 151 lines in 3 files changed: 114 ins; 15 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/14897.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14897/head:pull/14897 PR: https://git.openjdk.org/jdk/pull/14897 From dholmes at openjdk.org Mon Jul 17 04:24:18 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 17 Jul 2023 04:24:18 GMT Subject: RFR: JDK-8310233: Fix THP detection on Linux [v4] In-Reply-To: References: Message-ID: <3bx5DK65cKaLh81dSAPrcirHNobLVQ3W62O9er4POLY=.0509afcb-10ba-41e9-b455-bb89175bc055@github.com> On Fri, 14 Jul 2023 12:46:06 GMT, Thomas Stuefe wrote: >> Today, if we use UseTransparentHugePages, we assume that the *static* hugepage detection we do is valid for THPs: >> - that THPs use the page size (in hotspot used as "default large page size") found in `/proc/meminfo` under "Hugepagesize" >> - that THPs are enabled if that page size is >0. >> >> Both assumptions are incorrect: >> >> - whether THPs are enabled should be checked at `/sys/kernel/mm/transparent_hugepage/enabled`, which is a tri-state value ("always", "madvise", "never"). THPs are available for the first two states. >> - The page size employed by `khugepaged` is set in `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size`. It can differ from the default page size used for static hugepages. For example, we could configure a system such that it uses 1G static hugepages, but the THP page size would still be 2M. >> >> ------ >> >> About the patch: >> >> This is a limited, minimally invasive patch to fix THP detection. The patch aims to be easy to downport. There is more work to do, which I will do in subsequent RFEs. >> >> Functionally, for *static* (non-THP) pages nothing changes.
THP-mode now correctly detects THP support in the OS, and uses the correct page size (see examples below). >> >> ------------- >> >> Example 1: System has THPs disabled, but static hugepages (1g, 2m) configured: >> >> >> thomas at starfish $ cat /sys/kernel/mm/transparent_hugepage/enabled >> always madvise [never] >> thomas at starfish $ cat /proc/meminfo | grep Hugepage >> Hugepagesize: 1048576 kB >> >> >> Without patch, we incorrectly assume THPs are enabled, and that THP page size is 1G (!), which we then proceed to use as heap page size, causing the heap size to be rounded up from 512m -> 1G. But - even though it is printed as "1G page backed" in log output - in reality it will still be 4K-page-backed: the `madvise(2)` we use to set the THP page size will be ignored by the system, since THPs are disabled. >> >> >> thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version >> [0.001s][info][pagesize] Using the default large page size: 1G >> [0.001s][info][pagesize] Usable page sizes: 4k, 2M, 1G >> ... >> [0.016s][info][pagesize] Heap: min=1G max=1G base=0x00000000c0000000 size=1G page_size=1G >> >> >> With patch, we correctly refuse to use large pages (and we log more info): >> >> >> thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version >> [0.001s][info][pagesize] Static... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Feedback David Nothing further from me. Thanks. ------------- Marked as reviewed by dholmes (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14739#pullrequestreview-1531913585 From dholmes at openjdk.org Mon Jul 17 04:40:05 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 17 Jul 2023 04:40:05 GMT Subject: RFR: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads In-Reply-To: References: <96m94D7WjS-lCa9jfrxkz4MNFYisVBjTABH31Ba4roY=.3d35d95a-c253-4e2b-b2c0-7084ca6016f2@github.com> Message-ID: On Fri, 14 Jul 2023 18:41:01 GMT, Alex Menkov wrote: > > The change seems consistent with the definition of `GetThreadState`. But I note that the interrupt bit should also only be set if the target is alive. > > we get interrupt bit from Thread object, so the value is consistent with terminated state. suspend bit is a bit different - see my reply to Serguei Sorry I don't follow. I don't see anything that prevents the target from terminating after you have read the interrupt bit from the thread object, but before you read the actual state. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14878#issuecomment-1637365223 From stuefe at openjdk.org Mon Jul 17 04:59:11 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 17 Jul 2023 04:59:11 GMT Subject: RFR: JDK-8310233: Fix THP detection on Linux [v4] In-Reply-To: <3bx5DK65cKaLh81dSAPrcirHNobLVQ3W62O9er4POLY=.0509afcb-10ba-41e9-b455-bb89175bc055@github.com> References: <3bx5DK65cKaLh81dSAPrcirHNobLVQ3W62O9er4POLY=.0509afcb-10ba-41e9-b455-bb89175bc055@github.com> Message-ID: On Mon, 17 Jul 2023 04:20:38 GMT, David Holmes wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Feedback David > > Nothing further from me. > > Thanks. Thanks, @dholmes-ora ! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14739#issuecomment-1637378365 From stuefe at openjdk.org Mon Jul 17 04:59:14 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 17 Jul 2023 04:59:14 GMT Subject: Integrated: JDK-8310233: Fix THP detection on Linux In-Reply-To: References: Message-ID: On Fri, 30 Jun 2023 16:26:43 GMT, Thomas Stuefe wrote: > Today, if we use UseTransparentHugePages, we assume that the *static* hugepage detection we do is valid for THPs: > - that THPs use the page size (in hotspot used as "default large page size") found in `/proc/memlimit` "Hugepagesize") > - that THPs are enabled if that page size is >0. > > Both assumptions are incorrect: > > - whether THPs are enabled should be checked at `/sys/kernel/mm/transparent_hugepage/enabled`, which is a tri-state value ("always", "madvise", "never"). THPs are available for the first two states. > - The page size employed by `khugepaged` is set in `/sys/kernel/mm/transparent_hugepage/hpage_pmd_size`. It can differ from the default page size used for static hugepages. For example, we could configure a system such that it uses 1G static hugepages, but the THP page size would still be 2M. > > ------ > > About the patch: > > This is a limited, minimally invasive patch to fix THP detection. The patch aims to be easy to downport. There is more work to do, which I will do in subsequent RFEs. > > Functionally, for *static* (non-THP) pages nothing changes. THP-mode now correctly detects THP support in the OS, and uses the correct page size (see examples below). 
> > ------------- > > Example 1: System has THPs disabled, but static hugepages (1g, 2m) configured: > > > thomas at starfish $ cat /sys/kernel/mm/transparent_hugepage/enabled > always madvise [never] > thomas at starfish $ cat /proc/meminfo | grep Hugepage > Hugepagesize: 1048576 kB > > > Without patch, we incorrectly assume THPs are enabled, and that THP page size is 1G (!), which we then proceed and use as heap page size, causing the heap size to be rounded up from 512m -> 1G. But - even though it is printed as "1G page backed" in log output - in reality it will still 4K-page-backed: the `madvise(2)` we use to set the THP page size will be ignored by the system, since THPs are disabled. > > > thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version > [0.001s][info][pagesize] Using the default large page size: 1G > [0.001s][info][pagesize] Usable page sizes: 4k, 2M, 1G > ... > [0.016s][info][pagesize] Heap: min=1G max=1G base=0x00000000c0000000 size=1G page_size=1G > > > With patch, we correctly refuse to use large pages (and we log more info): > > > thomas at starfish $ ./images/jdk/bin/java -Xmx512m -XX:+UseLargePages -XX:+UseTransparentHugePages -Xlog:pagesize -version > [0.001s][info][pagesize] Static hugepage support: 2M, 1G (default) > [0.001s][info][pagesize] default pagesize: 1G > [0... This pull request has now been integrated. 
Changeset: 37ca9024 Author: Thomas Stuefe URL: https://git.openjdk.org/jdk/commit/37ca9024ef59d99cae0bd7e25b2e6d3c1e085f97 Stats: 815 lines in 7 files changed: 684 ins; 99 del; 32 mod 8310233: Fix THP detection on Linux Reviewed-by: jsjolen, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/14739 From dholmes at openjdk.org Mon Jul 17 05:10:08 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 17 Jul 2023 05:10:08 GMT Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space [v3] In-Reply-To: References: Message-ID: On Sat, 15 Jul 2023 15:43:58 GMT, Thomas Stuefe wrote: >> TL;DR This patch introduces a new reservation API to reserve memory in low address space; depending on the OS, it may use optimized placement techniques. That new API is used to optimize the placement of class space and CDS for zero-based encoding. >> >> A future RFE will use the same API to optimize the zero-based heap reservation and thereby consolidate a lot of coding. We also plan to use this API in other places, e.g. for Shenandoah CollectionSet reservation. >> >> ------- >> >> With CDS off or at dump time, we currently attempt to optimize class space location by reserving in low address ranges. >> >> We do this by examining the *java heap* end (which has been allocated at that point) and, if that had been allocated in lower address regions, attempt to allocate adjacent to it. Essentially, we piggyback on what we hope for is an optimized heap placement. If that fails, we attempt to map at HeapBaseMinAddress. >> >> This approach has many disadvantages: >> >> - it depends on the VM either using CompressedOops and getting a zero-based heap or the region around HeapBaseMinAddress being free. >> >> - HeapBaseMinAddress is an odd choice: it is 2G on all platforms, for reasons unknown to me, but that denies us half of the valuable low address range below 4G right away. >> >> - We only get 1 shot. It's either one of these two addresses. 
>> >> - And we only use this strategy for CDS=off or CDS=dumptime; we don't use it for the CDS-runtime-fallback-case when attaching to the primary attach point failed. >> >> - It assumes narrow Klass encoding uses the same geometry (bit size, shift) as narrow Oops, which is not guaranteed with future developments (lilliput). >> >> - It actually reduces the chance of getting a zero-based *java heap*. This is because when attempting to place the heap, we leave a gap for what we assume will be the later class space. That gap is `CompressedClassSpaceSize` bytes, which is often grossly over-dimensioned. A zero-based heap is more valuable than a zero-based class space. Therefore the heap should get the best chance of low-address heap reservation. >> >> - It introduces an unnecessary dependency between heap reservation, narrow Oop encoding, and class space reservation. That makes the code base brittle. >> >> - Getting the heap region to place class space adjacent to it is actually tricky. We lack a common get-heaprange-API because ZGC. This code misuses the CompressedOops interface. But CompressedOops is the encoding range and thus only loos... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Fix Windows src/hotspot/share/memory/metaspace.cpp line 808: > 806: } > 807: > 808: // ...otherwise let JVM chose the best placing: s/chose/choose/ src/hotspot/share/utilities/globalDefinitions.hpp line 187: > 185: return (uintptr_t) p; > 186: } > 187: Seems overkill for one use. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14867#discussion_r1264867564 PR Review Comment: https://git.openjdk.org/jdk/pull/14867#discussion_r1264868462 From aph at openjdk.org Mon Jul 17 08:54:13 2023 From: aph at openjdk.org (Andrew Haley) Date: Mon, 17 Jul 2023 08:54:13 GMT Subject: RFR: 8311130: AArch64: Sync SVE related CPU features with VM options In-Reply-To: References: Message-ID: <49kVY3ZIX668UDgxn_sMGyQFT-Sx2brRkNN38xV4G-4=.898e951d-51f3-4df6-8cfb-a4fd34e035f8@github.com> On Mon, 17 Jul 2023 03:19:10 GMT, Pengfei Li wrote: > As discussed in PR #14533, keeping AArch64 flag `UseSVE` and its related CPU features in sync helps to simplify rules in IR tests. In this patch, we mask SVE related CPU features off if specified SVE level in VM option is lower than the hardware supported. Also, to support this change, we move the features string construction to the end of the `initialize()` function. > > We also revert IR rule changes in PR #14533 and fix some code styles. We tested almost full jtreg on SVE, SVE2 and non-SVE CPUs and no new issue is found after this patch. src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 211: > 209: if (_cpu == CPU_ARM && ((_model == 0xd0c || _model2 == 0xd0c) || > 210: (_model == 0xd49 || _model2 == 0xd49) || > 211: (_model == 0xd40 || _model2 == 0xd40))) { Perhaps we need a function to encapsulate the logic here. 
bool model_is(unsigned int cpu_model) { return _model == cpu_model || _model2 == cpu_model; } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14897#discussion_r1265058828 From alanb at openjdk.org Mon Jul 17 09:47:14 2023 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 17 Jul 2023 09:47:14 GMT Subject: RFR: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads In-Reply-To: References: <96m94D7WjS-lCa9jfrxkz4MNFYisVBjTABH31Ba4roY=.3d35d95a-c253-4e2b-b2c0-7084ca6016f2@github.com> Message-ID: On Mon, 17 Jul 2023 04:37:08 GMT, David Holmes wrote: > Sorry I don't follow. I don't see anything that prevents the target from terminating after you have read the interrupt bit from the thread object, but before you read the actual state. The virtual thread state and the interrupt status are separate. That's okay for the suspended case, assuming not resumed while JVMTI GetThreadState executes. If not suspended then it looks like it could give an inconsistent view of the state. I don't know why GetThreadState defined a state flag for interrupted. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14878#issuecomment-1637714962 From duke at openjdk.org Mon Jul 17 10:44:21 2023 From: duke at openjdk.org (sid8606) Date: Mon, 17 Jul 2023 10:44:21 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure In-Reply-To: <_-3DnlZiSfWIdQeHYgTKDbAVNmbznTCff6lgHBaK-NM=.e9021c70-dc28-4d33-9b1e-b20610eb2033@github.com> References: <9HwtHNPda3Utaf9OjVxuv-VO4jSaQnXoDNZeKeWKp54=.a913553b-0cc1-482a-b676-b3f477f5f4f2@github.com> <_-3DnlZiSfWIdQeHYgTKDbAVNmbznTCff6lgHBaK-NM=.e9021c70-dc28-4d33-9b1e-b20610eb2033@github.com> Message-ID: On Fri, 14 Jul 2023 17:04:56 GMT, Thomas Stuefe wrote: >> src/hotspot/share/utilities/vmError.hpp line 210: >> >>> 208: >>> 209: // Non-null address guaranteed to generate a SEGV mapping error on read, for test purposes.
>>> 210: static constexpr intptr_t segfault_address = AIX_ONLY(-1) NOT_AIX(4 * K); >> >> Are we sure ARM & RISC-V will be happy with these changes, Maybe using `S390_ONLY` will be appropriate(?) > > Yes. And before we start cascading ifdefs here, please spread this definition out into the respective platform files. For the s390 version, could you please add a clear comment describing the reasoning? Do all s390 linux variants using 4K pages - is it valid to hardcode that? Thank you for the reviews @tstuefe and @offamitkumar . The all linux variants on s390x uses 4K page setting. I am making changes to move segfault_address to platform files. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14888#discussion_r1265170247 From simonis at openjdk.org Mon Jul 17 11:06:24 2023 From: simonis at openjdk.org (Volker Simonis) Date: Mon, 17 Jul 2023 11:06:24 GMT Subject: RFR: 8311500: StackWalker.getCallerClass() throws UOE if invoked reflectively [v3] In-Reply-To: References: Message-ID: On Tue, 11 Jul 2023 15:47:35 GMT, Volker Simonis wrote: >> As the included jtreg test demonstrates, `StackWalker.getCallerClass()` can throw an `UnsupportedOperationException` if called reflectively. Currently this only happens if we invoke `StackWalker.getCallerClass()` recursively reflectively, but this issue will become more prominent once we fix [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447). The gory details follow below: >> >> The protocol between the Java API and the JVM for `StackWalker.getCallerClass()/walk()` is as follows: >> - On the Java side, `StackWalker` calls into `StackStreamFactory` for the real work. >> - For `StackWalker.getCallerClass()` `StackStreamFactory` basically creates a `Class[]` which will be passed down and filled in the JVM. For `StackWalker.walk()` it will normally be a `StackFrameInfo[]` (or a `LiveStackFrameInfo[]` if the internal `ExtendedOption.LOCALS_AND_OPERANDS` option was used). 
>> - The default size of this arrays is currently `StackStreamFactory.SMALL_BATCH` which is 8 (but see [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447)). >> - `StackStreamFactory` than calls `AbstractStackWalker.callStackWalk()` which is a natively implemented in the VM by `JVM_CallStackWalk()`. >> - `JVM_CallStackWalk()` calls `StackWalk::walk()` which calls `StackWalk::fetchFirstBatch()` which calls `StackWalk::fill_in_frames()` which walks the stack and fills in the available class/stackframe slots in the passed in array until the array is full or there are no more stack frames, >> - Once `StackWalk::fill_in_frames()` returns, `StackWalk::fetchFirstBatch()` calls back to Java by invoking `AbstractStackWalker::doStackWalk()` to consume the result. >> - `AbstractStackWalker::doStackWalk()` calls `consumeFrames()` (which is overridden depending on whether we initially called `getCallerClass()` or `walk()`) which consumes the frames until it either finishes (e.g. finds the caller class) or until there are no more frames. >> - In the latter case `consumeFrames()` will call into the the VM again by calling `AbstractStackWalker.fetchStackFrames()` to fetch additional frames from the stack. >> - `AbstractStackWalker.fetchStackFrames()` is implemented by `JVM_MoreStackWalk()` which calls `StackWalk::fetchNextBatch()` which calls `StackWalk::fill_in_frames()` (the same method that already fetched the initial batch of frames). >> >> Following is a stacktrace of what I've explained so far: >> >> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) >> V [libjvm.so+0x1... > > Volker Simonis has updated the pull request incrementally with one additional commit since the last revision: > > Fixed case when calling getCallerClass() from a @CallerSensitive method reflectively We actually have two problems here: 1. 
If called reflectively, `getCallerClass()` can throw a UOE even if it was **not** called from a `@CallerSensitive` method (see initial test case). 2. If called reflectively from a `@CallerSensitive` method, `getCallerClass()` will currently not throw a UOE as expected (see extended test in this PR). `getCallerClass()` **is** performance sensitive and we want to improve its performance rather than slow it down (see [JDK-8285447](https://bugs.openjdk.org/browse/JDK-8285447)). I think performance-wise it would be better to do all the filtering in the VM, where we have all the required information at hand and minimize the amount of data that needs to be passed between Java and the VM. Before starting to implement a new version of the fix which moves all the checks to Java, I'd like to hear some more opinions about whether we agree to move all the filtering and checks from the VM to Java (even at the cost of performance regressions) or whether we would rather go the other way and do all the filtering/checks in the JVM. @shipilev, @dholmes-ora, @dfuch, @AlanBateman - what do you think? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14773#issuecomment-1637900903 From amitkumar at openjdk.org Mon Jul 17 11:29:33 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 17 Jul 2023 11:29:33 GMT Subject: RFR: 8310596: [PPC/s390] Utilize existing method frame::interpreter_frame_monitor_size_in_bytes() Message-ID: [small cleanup/enhancement] Adds & updates some code to use the `interpreter_frame_monitor_size_in_bytes()` method. I wasn't sure whether we should remove it from the PPC & S390 implementations or add it to the other archs. For now I have added it, but I will be happy to revert that depending on review feedback. For s390, `interpreter_frame_interpreterstate_size_in_bytes()` was also removed as it is not being used. fastdebug build tested on **s390** & **apple m1-pro**.
------------- Commit messages: - updates header year - risc-v - arm - x86 - aarch - ppc - s390 Changes: https://git.openjdk.org/jdk/pull/14902/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14902&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8310596 Stats: 63 lines in 25 files changed: 24 ins; 5 del; 34 mod Patch: https://git.openjdk.org/jdk/pull/14902.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14902/head:pull/14902 PR: https://git.openjdk.org/jdk/pull/14902 From shade at openjdk.org Mon Jul 17 11:31:07 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 17 Jul 2023 11:31:07 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v12] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: <9l4RSCPEVhUIIuZQ3HFLdpvFh18iC4CJ16P-QNq4cYA=.3b53cce7-eae5-4703-89cf-f364e28a2f1e@github.com> On Fri, 14 Jul 2023 20:03:24 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. 
The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: > > - Display unsupported text with UL warning level, + test > - Last Aleksey Feedback Still okay with it. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14781#pullrequestreview-1532544914 From duke at openjdk.org Mon Jul 17 12:48:35 2023 From: duke at openjdk.org (sid8606) Date: Mon, 17 Jul 2023 12:48:35 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v2] In-Reply-To: References: Message-ID: > All faults on s390x give the address only on page granularity. > e.g. if you use 0x123456 as fail address you get si_addr = 0x123000 sid8606 has updated the pull request incrementally with one additional commit since the last revision: Move segfault_address definition out into the respective platform files ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14888/files - new: https://git.openjdk.org/jdk/pull/14888/files/0e79f195..ba81c1b0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=00-01 Stats: 18 lines in 9 files changed: 16 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14888.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14888/head:pull/14888 PR: https://git.openjdk.org/jdk/pull/14888 From mseledtsov at openjdk.org Mon Jul 17 17:18:12 2023 From: mseledtsov at openjdk.org (Mikhailo Seledtsov) Date: Mon, 17 Jul 2023 17:18:12 GMT Subject: RFR: 8310355: Move the stub test from initialize_final_stubs() to test/hotspot/gtest In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 04:45:31 GMT, Hao Sun wrote: > Three groups of runtime routines, i.e. arraycopy, copy and fill, are tested inside function `initialize_final_stubs()`. The test runs every time the debug VM is started. > > I think it's a usual convention that it's better not to run functional tests on startup. Hence, this patch proposes to move the stub test under `test/hotspot/gtest`. > > It's one copy-paste patch, except the following two minor changes: > > 1) Remove `ASSERT` condition check, and the gtest case will be run for release build as well. 
> > 2) Use the gtest helper `ASSERT_TRUE()` to replace the `assert()`. > > Note that the downside is that we won't catch stub implementation errors immediately on startup. > > Test: > > 1) Cross compilations on arm32/s390/ppc/riscv passed. > > 2) tier1 test passed on Linux/AArch64, Linux/x86_64 and macOS/Apple silicon. Note that hotspot/gtest is run in tier1. > > 3) VM builds with several different options (e.g., zero build, release build, fastdebug build, client variant, no C1/c2) passed on Linux/AArch64 and Linux/x86_64. > > 4) I manually injected errors in arraycopy/copy/fill stubs and verified `make test TEST="gtest:StubRoutines"` fails or not. But this check only worked for array fill stub. For the remaining two stubs, VM build (`make images`) failed firstly before running the gtest, because the built VM with these "erroneous" stubs were executed to do something such as loading Java core library code. Marked as reviewed by mseledtsov (Committer). Changes look good to me. Thank you for doing this change. ------------- PR Review: https://git.openjdk.org/jdk/pull/14765#pullrequestreview-1533291249 PR Comment: https://git.openjdk.org/jdk/pull/14765#issuecomment-1638553998 From iklam at openjdk.org Mon Jul 17 17:20:59 2023 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 17 Jul 2023 17:20:59 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 21:25:17 GMT, Matias Saavedra Silva wrote: > Currently we exit the VM after static dumping with `MetaspaceShared::exit_after_static_dump()`. > > > // We have finished dumping the static archive. At this point, there may be pending VM > // operations. We have changed some global states (such as vmClasses::_klasses) that > // may cause these VM operations to fail. For safety, forget these operations and > // exit the VM directly. 
> void MetaspaceShared::exit_after_static_dump() { > os::_exit(0); > } > > > As the comment suggests, the VM state is altered when preparing and performing the static dump, so this change aims to prevent these state changes so the VM can exit normally after the static dump completes. There are three major aspects to this change: > 1. Since the resolved references array in the Constant Pool is altered when preparing for a static dump, a "scratch copy" is created and archived instead > 2. Symbols are sorted by address and have their hash recalculated. Similarly to point 1, the copies of the symbols that are to be archived have their hashes updated as opposed to the originals. > 3. The handling of -Xshare:dump during argument parsing such that the VM can continue and exit normally with an exit code of 0. Changes requested by iklam (Reviewer). src/hotspot/share/cds/archiveBuilder.cpp line 262: > 260: // dynamic archive, we might need to sort the symbols alphabetically (also see > 261: // DynamicArchiveBuilder::sort_methods()). > 262: log_info(cds)("Sorting symbols and fixing identity hash ... "); "and fixing identity hash" should be removed, as the hash is no longer being fixed here. src/hotspot/share/cds/archiveBuilder.cpp line 638: > 636: memcpy(dest, src, bytes); > 637: > 638: // Update the hash of buffered sorted symbols for static dump Please append ` so that the symbols have deterministic contents` to the comment. src/hotspot/share/cds/heapShared.cpp line 345: > 343: void HeapShared::init_scratch_references() { > 344: if (_scratch_references_table == nullptr) > 345: _scratch_references_table = new (mtClass)ResolvedReferenceScratchTable(); These two lines are outside of a lock so you could run into a race condition. I think you can remove this function and move these two lines to just before calling `_scratch_references_table->put()` in `add_scratch_resolved_references`.
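For illustration only, the suggestion above (create the table lazily at the point of insertion, under the lock that already guards the insert) could look roughly like the following. This is a standalone analogy in standard C++, assuming the call to `put()` is already made under a lock as the comment implies. `ScratchTableSketch`, `add` and `lookup` are hypothetical names, and `std::mutex` and `std::unordered_map` merely stand in for whatever lock and hashtable types HotSpot actually uses here.

```cpp
#include <cassert>
#include <mutex>
#include <unordered_map>

// Standalone analogy, not HotSpot code. All names are hypothetical.
class ScratchTableSketch {
  std::mutex _lock;
  std::unordered_map<const void*, int>* _table = nullptr;  // created lazily

 public:
  ~ScratchTableSketch() { delete _table; }

  // The null check and the allocation happen under the same lock as the
  // insert, so two threads cannot both observe nullptr and allocate twice.
  void add(const void* key, int value) {
    std::lock_guard<std::mutex> guard(_lock);
    if (_table == nullptr) {
      _table = new std::unordered_map<const void*, int>();
    }
    (*_table)[key] = value;
  }

  // Returns true and stores the value in *out if the key is present.
  bool lookup(const void* key, int* out) {
    std::lock_guard<std::mutex> guard(_lock);
    if (_table == nullptr) return false;
    auto it = _table->find(key);
    if (it == _table->end()) return false;
    *out = it->second;
    return true;
  }
};

// Small usage example for the sketch above.
inline void scratch_table_sketch_demo() {
  ScratchTableSketch table;
  int key = 0;
  int out = -1;
  table.add(&key, 42);
  assert(table.lookup(&key, &out));
  assert(out == 42);
}
```

With this shape there is no separate init function left to call outside the lock, which is the point of the review comment.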
src/hotspot/share/cds/heapShared.hpp line 288: > 286: 36137, // prime number > 287: AnyObj::C_HEAP, > 288: mtClassShared> ResolvedReferenceScratchTable; You are using `oop->identity_hash()` as the key for this table. However, it's possible for two `resolved_references` arrays to have the exact same identity hash. It's better to use `ResourceHashtableidentity_hash()` and the `EQUALS` function can compare the values of `OopHandle::resolve()`. For the coding style, you can search for tables that use `HeapShared::oop_hash` for examples. src/hotspot/share/classfile/classLoaderData.cpp line 1085: > 1083: guarantee(this == class_loader_data(cl) || has_class_mirror_holder(), "Must be the same"); > 1084: guarantee(cl != nullptr || this == ClassLoaderData::the_null_class_loader_data() || has_class_mirror_holder(), "must be"); > 1085: } Why is this necessary? src/java.base/share/native/libjli/java.c line 1447: > 1445: /* > 1446: * Check for CDS option > 1447: */ Comments need to be indented. ------------- PR Review: https://git.openjdk.org/jdk/pull/14879#pullrequestreview-1533239235 PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1265645038 PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1265669208 PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1265653371 PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1265667818 PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1265648116 PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1265671279 From jiangli at openjdk.org Mon Jul 17 18:06:17 2023 From: jiangli at openjdk.org (Jiangli Zhou) Date: Mon, 17 Jul 2023 18:06:17 GMT Subject: RFR: 8311661: Resolve duplicate symbol of StringTable::StringTable with JDK static linking [v4] In-Reply-To: References: <3mXE9RE6cj-apUgKbUvy0yxneWeSXHmMURj0JMhZKxw=.0a532743-5258-47c2-84bf-590f0a2660cc@github.com> Message-ID: On Fri, 14 Jul 2023 02:46:47 GMT, David Holmes wrote: >
I am concerned that something that [JDK-8303796](https://bugs.openjdk.org/browse/JDK-8303796) claimed to be a small enhancement is in fact requiring a lot of changes in different areas of the codebase. @dholmes-ora, sorry for the slow response, I was OOO. The fixes and enhancements for JDK-8303796 include: - duplicate symbol fixes - Makefile changes to create the full set of static libraries - Runtime fixes open sourced at https://github.com/jianglizhou/jdk/tree/static-java - Fix runtime crashes - Enhance builtin libraries lookup - Use runtime checks to detect JDK static build. Allow removing STATIC_BUILD macro and building static and dynamic libraries without rebuilding .o files The overall changes are not small. Individually, each fix or enhancement, particularly the symbol fixes, is relatively small and uncomplicated (except the built-in libraries lookup changes). Handling them as separate bugs and enhancements seems to be reasonable. The built-in library lookup and runtime changes may be suitable for a new JEP, or for putting in a Leyden project repo and proposing later. It's built on top of the existing JEP 178: Statically-Linked JNI Libraries. To me, handling it with a new JEP as an enhancement on top of JEP 178 is reasonable. Any thoughts on that? > There are already a number of "duplicate symbol definition" issues that have been filed and fixed in an ad-hoc manner in the JDK, which is the kind of whack-a-mole approach that we should not be taking. If static linking requires symbol isolation then a generally applicable approach to doing that should be investigated (in a separate project? Leyden?) and then brought to mainline via a JEP (not just for the hotspot aspect as currently suggested). The duplicated symbol issues consist of three parts: 1. Symbol conflicts among the JDK native libraries and hotspot Each issue is different and has been fixed in an ad-hoc manner. All issues have been fixed by bugs linked with JDK-8303796.
With enabling static JDK build (not yet done currently), we will be able to detect and prevent introducing new symbol issues in the future. 2. Symbol conflicts between hotspot and user code This is discussed by the thread in this PR. Moving hotspot code to hotspot unique namespace as suggested by @JornVernee is a good general solution. 3. Symbol conflicts between JDK library and user code With our prototype we found about 5 of those. I think these are outliers. Either fixing them as individually bugs or putting them in a Leyden repo and proposing together seems ok. Since the number is small enough, fixing individually might be slightly cleaner. Please let me know your thoughts. The JNI code defines APIs with naming convention consisting Java package and class names. There is no symbol issue with those. To prevent future issues, following the same naming style for helper methods and variables in JDK native code would work. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14808#issuecomment-1638622132 From psandoz at openjdk.org Mon Jul 17 18:15:55 2023 From: psandoz at openjdk.org (Paul Sandoz) Date: Mon, 17 Jul 2023 18:15:55 GMT Subject: RFR: 8306136: [vectorapi] Intrinsics of VectorMask.laneIsSet() [v2] In-Reply-To: References: <74WpJFbQAX7TMMMr-qK9nUcS_lxHHbJEmzTuWbpahfk=.97501257-dd82-43ce-b334-fb6caab35118@github.com> Message-ID: On Tue, 4 Jul 2023 01:03:20 GMT, Eric Liu wrote: >> VectorMask.laneIsSet() [1] is implemented based on VectorMask.toLong() [2], and it's performance highly depends on the intrinsification of toLong(). However, if `toLong()` is failed to intrinsify, on some architectures or unsupported species, it's much more expensive than pure getBits(). Besides, some CPUs (e.g. with Arm Neon) may not have efficient instructions to implementation toLong(), so we propose to intrinsify VectorMask.laneIsSet separately. 
>> >> This patch optimize laneIsSet() by calling the existing intrinsic method VectorSupport.extract(), which actually does not introduce new intrinsic method. The C2 compiler intrinsification logic to support _VectorExtract has also been extended to better support laneIsSet(). It tries to extract the mask's lane value with an ExtractUB node if the hardware backend supports it. While on hardware without ExtractUB backend support , c2 will still try to generate toLong() related nodes, which behaves the same as before the patch. >> >> Key changes in this patch: >> >> 1. Reuse intrinsic `VectorSupport.extract()` in Java side. No new intrinsic method is introduced. >> 2. In compiler, `ExtractUBNode` is generated if backend support is. If not, the original "toLong" pattern is generated if it's implemented. Otherwise, it uses the default Java `getBits[i]` rather than the expensive and complicated toLong() based implementation. >> 3. Enable `ExtractUBNode` on AArch64 to extract the lane value for a vector mask in compiler, together with changing its bottom type to TypeInt::BOOL. This helps optimize the conditional selection generated by >> >> ``` >> >> public boolean laneIsSet(int i) { >> return VectorSupport.extract(..., defaultImpl) == 1L; >> } >> >> ``` >> >> [Test] >> hotspot:compiler/vectorapi and jdk/incubator/vector passed. >> >> [Performance] >> >> Below shows the performance gain on 128-bit vector size Neon machine. For 64 and 128 SPECIES, the improvment caused by this intrinsics. For other SPECIES which can not be intrinfied, performance gain comes from the default Java implementation changes, i.e. getBits[i] vs. toLong(). >> >> >> Benchmark Gain (after/before) >> microMaskLaneIsSetByte128_con 2.47 >> microMaskLaneIsSetByte128_var 1.82 >> microMaskLaneIsSetByte256_con 3.01 >> microMaskLaneIsSetByte256_var 3.04 >> microMaskLaneIsSetByte512_con 4.83 >> microMaskLaneIsSetByte512_var 4.86 >> microMaskLaneIsSetByt... 
> > Eric Liu has updated the pull request incrementally with one additional commit since the last revision: > > Bug fix > > Change-Id: Ib223c4048b29875a62a27d6081ad36a125dec144 Tests pass. Java changes look good, but HotSpot reviews are also required. ------------- Marked as reviewed by psandoz (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14200#pullrequestreview-1533400401 From stuefe at openjdk.org Mon Jul 17 18:38:26 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 17 Jul 2023 18:38:26 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v10] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Fri, 14 Jul 2023 06:43:15 GMT, David Holmes wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Bikeshed Trim log lines > > Test needs a fix for non-Linux. @dholmes-ora Are you okay with this final version? Did your CI report any problems? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1638668239 From stuefe at openjdk.org Mon Jul 17 18:40:05 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 17 Jul 2023 18:40:05 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v2] In-Reply-To: <_-3DnlZiSfWIdQeHYgTKDbAVNmbznTCff6lgHBaK-NM=.e9021c70-dc28-4d33-9b1e-b20610eb2033@github.com> References: <9HwtHNPda3Utaf9OjVxuv-VO4jSaQnXoDNZeKeWKp54=.a913553b-0cc1-482a-b676-b3f477f5f4f2@github.com> <_-3DnlZiSfWIdQeHYgTKDbAVNmbznTCff6lgHBaK-NM=.e9021c70-dc28-4d33-9b1e-b20610eb2033@github.com> Message-ID: <1Q6xmFxzz7wtCxvEpua98dlj-FdXx7HWocOSZ_TCdRM=.e9a2163f-9737-4e70-a4c9-fc87ce54ba6f@github.com> On Fri, 14 Jul 2023 17:04:56 GMT, Thomas Stuefe wrote: >> src/hotspot/share/utilities/vmError.hpp line 210: >> >>> 208: >>> 209: // Non-null address guaranteed to generate a SEGV mapping error on read, for test purposes. 
>>> 210: static constexpr intptr_t segfault_address = AIX_ONLY(-1) NOT_AIX(4 * K); >> >> Are we sure ARM & RISC-V will be happy with these changes, Maybe using `S390_ONLY` will be appropriate(?) > > Yes. And before we start cascading ifdefs here, please spread this definition out into the respective platform files. For the s390 version, could you please add a clear comment describing the reasoning? Do all s390 linux variants using 4K pages - is it valid to hardcode that? > Thank you for the reviews @tstuefe and @offamitkumar . The all linux variants on s390x uses 4K page setting. > I am making changes to move segfault_address to platform files. Thank you @sid8606 . ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14888#discussion_r1265761139 From dlong at openjdk.org Mon Jul 17 18:49:06 2023 From: dlong at openjdk.org (Dean Long) Date: Mon, 17 Jul 2023 18:49:06 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp In-Reply-To: <-HYDaQSWc6DgSZk1cm-MpRBw-vc8y1Kh42kTAeR73uo=.8f1d85ce-501e-4ac8-bdf6-6ce441c58d47@github.com> References: <-HYDaQSWc6DgSZk1cm-MpRBw-vc8y1Kh42kTAeR73uo=.8f1d85ce-501e-4ac8-bdf6-6ce441c58d47@github.com> Message-ID: <6NQeZaB-D8hr-onoolXrScRurp3aFgk5ew-Qi8QNVi8=.7a634414-4b4e-4cc1-ac4b-c50462ff2453@github.com> On Mon, 17 Jul 2023 02:59:08 GMT, David Holmes wrote: >> src/hotspot/share/utilities/tribool.hpp line 44: >> >>> 42: TriBool() : _value(0) {} >>> 43: TriBool(bool value) : _value(value) { >>> 44: _value = _value | 2; >> >> You can also do like line 39 and use `(u1)(expr) & 3` or `((expr & 3u)` also seems to work. > > Or cast the `2` to `(u1)2` ? It does seem strange that this code is using a bit-field, but still using & | and shift. Why not represent the high and low bit with two separate bit-fields of width 1? 
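As a rough illustration of the "two separate bit-fields of width 1" idea raised above, a minimal sketch might look like the following. The class name `TriBoolSketch` and its members `_is_set`, `_value`, `is_true()` and `is_false()` are hypothetical and are not taken from tribool.hpp; the sketch only shows that the mask, OR and shift arithmetic can be replaced by plain assignments to named bits. Whether this silences every -Wconversion diagnostic depends on the compiler and is an assumption here.

```cpp
#include <cassert>

// Hypothetical sketch, not the HotSpot TriBool class: a three-state boolean
// whose "has been set" flag and truth value live in two separate 1-bit fields
// instead of one 2-bit field manipulated with &, | and >>.
class TriBoolSketch {
  unsigned int _is_set : 1;  // 0 = default (never assigned), 1 = explicitly set
  unsigned int _value  : 1;  // truth value; meaningful only when _is_set == 1

 public:
  TriBoolSketch() : _is_set(0u), _value(0u) {}
  TriBoolSketch(bool value) : _is_set(1u), _value(value ? 1u : 0u) {}

  TriBoolSketch& operator=(bool value) {
    _is_set = 1u;               // no "| 2" needed: the flag has its own field
    _value = value ? 1u : 0u;   // no "& 1" needed: the value has its own field
    return *this;
  }

  bool is_default() const { return _is_set == 0u; }
  bool is_true() const { return _is_set == 1u && _value == 1u; }
  bool is_false() const { return _is_set == 1u && _value == 0u; }
};
```

Both fields would normally still pack into the same storage unit, so the footprint should match a single 2-bit field on common ABIs, although that layout is implementation-defined.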
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14892#discussion_r1265769859 From cslucas at openjdk.org Mon Jul 17 21:50:21 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 17 Jul 2023 21:50:21 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v18] In-Reply-To: <72OcyhmFKGyTwDy8LQ0blp5HG5dg5l9OsU5dh9osVxo=.73b3a79e-ff24-4f41-b39b-650a9036ee76@github.com> References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> <-A7bd8C0q5o1WuRSeSkYYnUoApV4s9uijPmiNB2Wteo=.c5bc944c-88a3-4228-bd41-091ac6c8fb1d@github.com> <72OcyhmFKGyTwDy8LQ0blp5HG5dg5l9OsU5dh9osVxo=.73b3a79e-ff24-4f41-b39b-650a9036ee76@github.com> Message-ID: On Tue, 20 Jun 2023 16:44:28 GMT, Vladimir Ivanov wrote: >> Thank you once more for the comments @iwanowww . I'll address them asap. >> >> Can I ask what requirements are there for a product flag? > >> Can I ask what requirements are there for a product flag? > > Product flags are treated as part of public API of the JVM. So, changes in behavior have to go through CSR process. Also, a product flag has to be deprecated/obsoleted first before it can be removed which takes multiple releases to happen. Better to avoid introducing new product flags unless it is well-justified or necessary. @iwanowww @vnkozlov - can I ask one of you to sponsor this PR? ------------- PR Comment: https://git.openjdk.org/jdk/pull/12897#issuecomment-1638931020 From dholmes at openjdk.org Mon Jul 17 21:52:53 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 17 Jul 2023 21:52:53 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 21:25:17 GMT, Matias Saavedra Silva wrote: > Currently we exit the VM after static dumping with `MetaspaceShared::exit_after_static_dump()`. > > > // We have finished dumping the static archive. 
At this point, there may be pending VM > // operations. We have changed some global states (such as vmClasses::_klasses) that > // may cause these VM operations to fail. For safety, forget these operations and > // exit the VM directly. > void MetaspaceShared::exit_after_static_dump() { > os::_exit(0); > } > > > As the comment suggests, the VM state is altered when preparing and performing the static dump, so this change aims to prevent these state changes so the VM can exit normally after the static dump completes. There are three major aspects to this change: > 1. Since the resolved references array in the Constant Pool is altered when preparing for a static dump, a "scratch copy" is created and archived instead > 2. Symbols are sorted by address and have their hash recalculated. Similarly to point 1, the copies of the symbols that are to be archived have their hashes updated as opposed to the originals. > 3. The handling of -Xshare:dump during argument parsing such that the VM can continue and exit normally with an exit code of 0. Hi Matias, I'm not keen on having to teach the launcher about `-Xshare:dump` and that it is a terminal operation. But after looking into it I have to admit it is a cleaner and more consistent termination strategy. That said you need to be aware that this will now do a clean shutdown of the VM after the dump, so shutdown hooks will now run. Lets see what core-libs folk think too. Thanks. src/hotspot/share/runtime/threads.cpp line 812: > 810: if (DumpSharedSpaces) { > 811: MetaspaceShared::preload_and_dump(); > 812: *canTryAgain = false; // don't let caller call JNI_CreateJavaVM again This is only set when create_vm fails. A successful creation of the VM prevents subsequent re-creation attempts. 
------------- PR Review: https://git.openjdk.org/jdk/pull/14879#pullrequestreview-1531828953 PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1264773183 From iklam at openjdk.org Mon Jul 17 21:59:13 2023 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 17 Jul 2023 21:59:13 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() In-Reply-To: References: Message-ID: <7hvK_YLrrijcBRv-7Bx1MPGr_MofMhUv-AmnHlQUzp4=.40d9fc01-7239-4a3d-a779-f900a08ad4b0@github.com> On Mon, 17 Jul 2023 17:14:01 GMT, Ioi Lam wrote: >> Currently we exit the VM after static dumping with `MetaspaceShared::exit_after_static_dump()`. >> >> >> // We have finished dumping the static archive. At this point, there may be pending VM >> // operations. We have changed some global states (such as vmClasses::_klasses) that >> // may cause these VM operations to fail. For safety, forget these operations and >> // exit the VM directly. >> void MetaspaceShared::exit_after_static_dump() { >> os::_exit(0); >> } >> >> >> As the comment suggests, the VM state is altered when preparing and performing the static dump, so this change aims to prevent these state changes so the VM can exit normally after the static dump completes. There are three major aspects to this change: >> 1. Since the resolved references array in the Constant Pool is altered when preparing for a static dump, a "scratch copy" is created and archived instead >> 2. Symbols are sorted by address and have their hash recalculated. Similarly to point 1, the copies of the symbols that are to be archived have their hashes updated as opposed to the originals. >> 3. The handling of -Xshare:dump during argument parsing such that the VM can continue and exit normally with an exit code of 0. > > src/hotspot/share/cds/heapShared.hpp line 288: > >> 286: 36137, // prime number >> 287: AnyObj::C_HEAP, >> 288: mtClassShared> ResolvedReferenceScratchTable; > > You are using `oop->identity_hash()` as the key for this table. 
However, it's possible for two `resolved_references` arrays to have the exact same identity. It's better to to use `ResourceHashtable > > unsigned (*HASH) (K const&), > bool (*EQUALS)(K const&, K const&) > > where the `K` type is `OopHandle`. > > The `HASH` function can return `OopHandle::resolve()->identity_hash()` and the `EQUALS` function can compare the values of `OopHandle::resolve()`. > > For the coding style, you can search for tables that use `HeapShared::oop_hash` for examples. Actually, using `Oophandle` is too complicated. You can just use the `ConstantPool*` as the key, since each original `resolved_references` array is associated with a unique `ConstantPool*`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1265941938 From amenkov at openjdk.org Mon Jul 17 22:33:04 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Mon, 17 Jul 2023 22:33:04 GMT Subject: RFR: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 19:18:38 GMT, Alex Menkov wrote: > The change fixes handling of "suspended" bit in VT state. > The code looks very strange. > java_lang_VirtualThread::RUNNING == 3, so line 803 clears JVMTI_THREAD_STATE_ALIVE(1) and JVMTI_THREAD_STATE_TERMINATED(2) > Per log this code came from loom repo with VT integration. > > Testing: tier1-4, updated GetThreadStateMountedTest.java So AFAIU GetThreadState for platform threads (get_thread_state_base) don't have similar issue because suspended/interrupted values are read after reading main thread state value. For virtual threads suspended/interrupted values are read before. There is a comment in the line 796: `// This call can trigger a safepoint, so thread_oop must not be used after it.` I suppose this is the reason to read them earlier. 
Then I think we need to ensure the thread is still alive before applying suspended/interrupted bits ------------- PR Comment: https://git.openjdk.org/jdk/pull/14878#issuecomment-1638975987 From cslucas at openjdk.org Mon Jul 17 23:05:29 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 17 Jul 2023 23:05:29 GMT Subject: Integrated: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges In-Reply-To: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: On Tue, 7 Mar 2023 01:40:48 GMT, Cesar Soares Lucas wrote: > Can I please get reviews for this PR? > > The most common and frequent use of NonEscaping Phis merging object allocations is for debugging information. The two graphs below show numbers for Renaissance and DaCapo benchmarks - similar results are obtained for all other applications that I tested. > > With what frequency does each IR node type occurs as an allocation merge user? I.e., if the same node type uses a Phi N times the counter is incremented by N: > > ![image](https://user-images.githubusercontent.com/2249648/222280517-4dcf5871-2564-4207-b49e-22aee47fa49d.png) > > What are the most common users of allocation merges? I.e., if the same node type uses a Phi N times the counter is incremented by 1: > > ![image](https://user-images.githubusercontent.com/2249648/222280608-ca742a4e-1622-4e69-a778-e4db6805ea02.png) > > This PR adds support scalar replacing allocations participating in merges used as debug information OR as a base for field loads. I plan to create subsequent PRs to enable scalar replacement of merges used by other node types (CmpP is next on the list) subsequently. > > The approach I used for _rematerialization_ is pretty straightforward. It consists basically of the following. 1) New IR node (suggested by V. 
Kozlov), named SafePointScalarMergeNode, to represent a set of SafePointScalarObjectNode; 2) Each scalar replaceable input participating in a merge will get a SafePointScalarObjectNode like if it weren't part of a merge. 3) Add a new Class to support the rematerialization of SR objects that are part of a merge; 4) Patch HotSpot to be able to serialize and deserialize debug information related to allocation merges; 5) Patch C2 to generate unique types for SR objects participating in some allocation merges. > > The approach I used for _enabling the scalar replacement of some of the inputs of the allocation merge_ is also pretty straightforward: call `MemNode::split_through_phi` to, well, split AddP->Load* through the merge which will render the Phi useless. > > I tested this with JTREG tests tier 1-4 (Windows, Linux, and Mac) and didn't see regression. I also experimented with several applications and didn't see any failure. I also ran tests with "-ea -esa -Xbatch -Xcomp -XX:+UnlockExperimentalVMOptions -XX:-TieredCompilation -server -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions -XX:+StressLCM -XX:+StressGCM -XX:+StressCCP" and didn't observe any related failures. This pull request has now been integrated. 
Changeset: a53345ad Author: Cesar Soares Lucas Committer: Vladimir Kozlov URL: https://git.openjdk.org/jdk/commit/a53345ad03e07ab2a990721a506ebc25eed0f7c9 Stats: 2733 lines in 26 files changed: 2485 ins; 108 del; 140 mod 8287061: Support for rematerializing scalar replaced objects participating in allocation merges Reviewed-by: kvn, vlivanov ------------- PR: https://git.openjdk.org/jdk/pull/12897 From dholmes at openjdk.org Mon Jul 17 23:16:11 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 17 Jul 2023 23:16:11 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v12] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Fri, 14 Jul 2023 20:03:24 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. 
With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: > > - Display unsupported text with UL warning level, + test > - Last Aleksey Feedback Nothing further from me. Thanks ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14781#pullrequestreview-1533828530 From dholmes at openjdk.org Tue Jul 18 00:15:01 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 18 Jul 2023 00:15:01 GMT Subject: RFR: 8311541: JavaThread::print_jni_stack doesn't support native stacks on all platforms In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 07:41:00 GMT, David Holmes wrote: > [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) modified print_jni_stack to print the native stack when there are no Java frames. To do that it used VMError::print_native_stack, however that function is only for platforms that support stack-walking by following frames, on other platforms (i.e. Windows and AIX) we need to use os::platform_print_native_stack. > > I'm not trying to consolidate the different versions of the stack printing code in this PR so that it is more easily backported to where [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) was. > > The test has been updated to work on Windows (taking advantage of two other recent enhancements - see JBS for details). > > Testing: > - tiers 1-3 (sanity) > - TestNativeStack regression test > > Thanks Ping! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14862#issuecomment-1639063434 From amenkov at openjdk.org Tue Jul 18 00:48:30 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 18 Jul 2023 00:48:30 GMT Subject: RFR: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads [v2] In-Reply-To: References: Message-ID: > The change fixes handling of "suspended" bit in VT state. > The code looks very strange. > java_lang_VirtualThread::RUNNING == 3, so line 803 clears JVMTI_THREAD_STATE_ALIVE(1) and JVMTI_THREAD_STATE_TERMINATED(2) > Per log this code came from loom repo with VT integration. 
> > Testing: tier1-4, updated GetThreadStateMountedTest.java Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: check that VT is alive before applying suspend/interrupt bits ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14878/files - new: https://git.openjdk.org/jdk/pull/14878/files/e2cf7c3c..084e9707 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14878&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14878&range=00-01 Stats: 8 lines in 1 file changed: 3 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/14878.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14878/head:pull/14878 PR: https://git.openjdk.org/jdk/pull/14878 From dholmes at openjdk.org Tue Jul 18 00:55:14 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 18 Jul 2023 00:55:14 GMT Subject: RFR: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads [v2] In-Reply-To: References: Message-ID: On Tue, 18 Jul 2023 00:48:30 GMT, Alex Menkov wrote: >> The change fixes handling of "suspended" bit in VT state. >> The code looks very strange. >> java_lang_VirtualThread::RUNNING == 3, so line 803 clears JVMTI_THREAD_STATE_ALIVE(1) and JVMTI_THREAD_STATE_TERMINATED(2) >> Per log this code came from loom repo with VT integration. >> >> Testing: tier1-4, updated GetThreadStateMountedTest.java > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > check that VT is alive before applying suspend/interrupt bits Seems fine to me, assuming all tests pass. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14878#pullrequestreview-1533917232 From fyang at openjdk.org Tue Jul 18 01:04:31 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 18 Jul 2023 01:04:31 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v7] In-Reply-To: References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: On Sat, 15 Jul 2023 05:19:47 GMT, Amit Kumar wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> another counter overflow > > I couldn't see any s390-specific code change here. Still tested fastdebug & release builds. > > Looks Good. > @offamitkumar @TheRealMDoerr @RealFYang @bulasevich please check that I haven't broken the s390/ppc/riscv/arm32 ports. FYI: Performed tier1, tier2 and tier3 tests on linux-riscv64 platform. Result is clean. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14883#issuecomment-1639117069 From lmesnik at openjdk.org Tue Jul 18 02:06:54 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 18 Jul 2023 02:06:54 GMT Subject: RFR: 8310355: Move the stub test from initialize_final_stubs() to test/hotspot/gtest In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 04:45:31 GMT, Hao Sun wrote: > Three groups of runtime routines, i.e. arraycopy, copy and fill, are tested inside function `initialize_final_stubs()`. The test runs every time the debug VM is started. > > I think it's a usual convention that it's better not to run functional tests on startup. Hence, this patch proposes to move the stub test under `test/hotspot/gtest`. > > It's one copy-paste patch, except the following two minor changes: > > 1) Remove `ASSERT` condition check, and the gtest case will be run for release build as well. > > 2) Use the gtest helper `ASSERT_TRUE()` to replace the `assert()`. 
> > Note that the downside is that we won't catch stub implementation errors immediately on startup. > > Test: > > 1) Cross compilations on arm32/s390/ppc/riscv passed. > > 2) tier1 test passed on Linux/AArch64, Linux/x86_64 and macOS/Apple silicon. Note that hotspot/gtest is run in tier1. > > 3) VM builds with several different options (e.g., zero build, release build, fastdebug build, client variant, no C1/c2) passed on Linux/AArch64 and Linux/x86_64. > > 4) I manually injected errors in arraycopy/copy/fill stubs and verified `make test TEST="gtest:StubRoutines"` fails or not. But this check only worked for array fill stub. For the remaining two stubs, VM build (`make images`) failed firstly before running the gtest, because the built VM with these "erroneous" stubs were executed to do something such as loading Java core library code. Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14765#pullrequestreview-1533979973 From sspitsyn at openjdk.org Tue Jul 18 02:47:52 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 18 Jul 2023 02:47:52 GMT Subject: RFR: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads [v2] In-Reply-To: References: Message-ID: On Tue, 18 Jul 2023 00:48:30 GMT, Alex Menkov wrote: >> The change fixes handling of "suspended" bit in VT state. >> The code looks very strange. >> java_lang_VirtualThread::RUNNING == 3, so line 803 clears JVMTI_THREAD_STATE_ALIVE(1) and JVMTI_THREAD_STATE_TERMINATED(2) >> Per log this code came from loom repo with VT integration. >> >> Testing: tier1-4, updated GetThreadStateMountedTest.java > > Alex Menkov has updated the pull request incrementally with one additional commit since the last revision: > > check that VT is alive before applying suspend/interrupt bits The update looks good. Need to make sure there are no new failures though. Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14878#pullrequestreview-1534021752 From dlong at openjdk.org Tue Jul 18 03:29:54 2023 From: dlong at openjdk.org (Dean Long) Date: Tue, 18 Jul 2023 03:29:54 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v7] In-Reply-To: References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: On Tue, 18 Jul 2023 01:01:22 GMT, Fei Yang wrote: >> I couldn't see any s390-specific code change here. Still tested fastdebug & release builds. >> >> Looks Good. > >> @offamitkumar @TheRealMDoerr @RealFYang @bulasevich please check that I haven't broken the s390/ppc/riscv/arm32 ports. > > FYI: Performed tier1, tier2 and tier3 tests on linux-riscv64 platform. Result is clean. Thanks Vladimir, Andrew, @RealFYang and @offamitkumar. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14883#issuecomment-1639281693 From djelinski at openjdk.org Tue Jul 18 04:39:17 2023 From: djelinski at openjdk.org (Daniel Jeliński) Date: Tue, 18 Jul 2023 04:39:17 GMT Subject: RFR: 8312190: Fix c++11-narrowing warnings in hotspot code Message-ID: This patch fixes compilation warnings produced by Clang when compiling on Windows. Clang emulates MSVC behavior and uses `int` for enumeration types that do not explicitly specify the underlying type. This patch sets an explicit underlying type for 3 enumerations to fix the warnings. See Microsoft's documentation of [Zc:enumTypes](https://learn.microsoft.com/en-us/cpp/build/reference/zc-enumtypes?view=msvc-170) for more information. 
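To make the underlying-type fix described above concrete, here is a small hedged sketch. The enum, its enumerators, `RecordSketch` and `make_record` are hypothetical names chosen for illustration and are not the three enumerations touched by the patch. The point it shows: once an unscoped enum has a fixed `unsigned int` underlying type, brace-initializing an `unsigned int` from it is not a narrowing conversion, whereas with an `int` underlying type (the MSVC-compatible default mentioned above) the same initialization from a non-constant value would be.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical enum for illustration only. The explicit ": unsigned int" fixes
// the underlying type on every compiler instead of leaving it to the ABI.
enum ErrorKindSketch : unsigned int {
  INTERNAL_ERROR_SKETCH = 0xe0000000u,
  OOM_ERROR_SKETCH      = 0xe0000001u
};

// The underlying type is now guaranteed, so this check holds everywhere.
static_assert(
    std::is_same<std::underlying_type<ErrorKindSketch>::type, unsigned int>::value,
    "underlying type should be unsigned int");

struct RecordSketch {
  unsigned int id;
};

// Brace initialization rejects narrowing conversions. Because the enum's
// underlying type is unsigned int, every enum value fits in 'id', so this
// is well-formed even though 'kind' is not a constant expression.
inline RecordSketch make_record(ErrorKindSketch kind) {
  return RecordSketch{ kind };
}
```

For example, `make_record(INTERNAL_ERROR_SKETCH).id` keeps the full unsigned value rather than passing through a signed `int` first.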
------------- Commit messages: - Update copyright - Fix clang narrowing warnings Changes: https://git.openjdk.org/jdk/pull/14907/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14907&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8312190 Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/14907.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14907/head:pull/14907 PR: https://git.openjdk.org/jdk/pull/14907 From dholmes at openjdk.org Tue Jul 18 05:23:10 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 18 Jul 2023 05:23:10 GMT Subject: RFR: 8312190: Fix c++11-narrowing warnings in hotspot code In-Reply-To: References: Message-ID: On Mon, 17 Jul 2023 18:31:41 GMT, Daniel Jeliński wrote: > This patch fixes compilation warnings produced by Clang when compiling on Windows. > > Clang emulates MSVC behavior and uses `int` for enumeration types that do not explicitly specify the underlying type. This patch sets an explicit underlying type for 3 enumerations to fix the warnings. > > See Microsoft's documentation of [Zc:enumTypes](https://learn.microsoft.com/en-us/cpp/build/reference/zc-enumtypes?view=msvc-170) for more information. Seems reasonable. Thanks. src/hotspot/share/utilities/debug.hpp line 248: > 246: > 247: // types of VM error - originally in vmError.hpp > 248: enum VMErrorType : unsigned int { Why 'unsigned int' when the other changes are `uint`? ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14907#pullrequestreview-1534191903 PR Review Comment: https://git.openjdk.org/jdk/pull/14907#discussion_r1266231566 From dholmes at openjdk.org Tue Jul 18 05:40:11 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 18 Jul 2023 05:40:11 GMT Subject: RFR: 8312190: Fix c++11-narrowing warnings in hotspot code In-Reply-To: References: Message-ID: On Mon, 17 Jul 2023 18:31:41 GMT, Daniel Jeliński wrote: > This patch fixes compilation warnings produced by Clang when compiling on Windows. > > Clang emulates MSVC behavior and uses `int` for enumeration types that do not explicitly specify the underlying type. This patch sets an explicit underlying type for 3 enumerations to fix the warnings. > > See Microsoft's documentation of [Zc:enumTypes](https://learn.microsoft.com/en-us/cpp/build/reference/zc-enumtypes?view=msvc-170) for more information. Of course please ensure all our CI builds are successful too! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14907#issuecomment-1639517134 From haosun at openjdk.org Tue Jul 18 06:02:20 2023 From: haosun at openjdk.org (Hao Sun) Date: Tue, 18 Jul 2023 06:02:20 GMT Subject: RFR: 8310355: Move the stub test from initialize_final_stubs() to test/hotspot/gtest In-Reply-To: References: Message-ID: On Mon, 17 Jul 2023 17:15:20 GMT, Mikhailo Seledtsov wrote: >> Three groups of runtime routines, i.e. arraycopy, copy and fill, are tested inside function `initialize_final_stubs()`. The test runs every time the debug VM is started. >> >> I think it's a usual convention that it's better not to run functional tests on startup. Hence, this patch proposes to move the stub test under `test/hotspot/gtest`. >> >> It's one copy-paste patch, except the following two minor changes: >> >> 1) Remove `ASSERT` condition check, and the gtest case will be run for release build as well. >> >> 2) Use the gtest helper `ASSERT_TRUE()` to replace the `assert()`. 
>> >> Note that the downside is that we won't catch stub implementation errors immediately on startup. >> >> Test: >> >> 1) Cross compilations on arm32/s390/ppc/riscv passed. >> >> 2) tier1 test passed on Linux/AArch64, Linux/x86_64 and macOS/Apple silicon. Note that hotspot/gtest is run in tier1. >> >> 3) VM builds with several different options (e.g., zero build, release build, fastdebug build, client variant, no C1/c2) passed on Linux/AArch64 and Linux/x86_64. >> >> 4) I manually injected errors in arraycopy/copy/fill stubs and verified `make test TEST="gtest:StubRoutines"` fails or not. But this check only worked for array fill stub. For the remaining two stubs, VM build (`make images`) failed firstly before running the gtest, because the built VM with these "erroneous" stubs were executed to do something such as loading Java core library code. > > Changes look good to me. Thank you for doing this change. Thanks for your reviews! @mseledts and @lmesnik The GHA tests are clean. Let me integrate it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14765#issuecomment-1639535291 From haosun at openjdk.org Tue Jul 18 06:02:21 2023 From: haosun at openjdk.org (Hao Sun) Date: Tue, 18 Jul 2023 06:02:21 GMT Subject: Integrated: 8310355: Move the stub test from initialize_final_stubs() to test/hotspot/gtest In-Reply-To: References: Message-ID: On Tue, 4 Jul 2023 04:45:31 GMT, Hao Sun wrote: > Three groups of runtime routines, i.e. arraycopy, copy and fill, are tested inside function `initialize_final_stubs()`. The test runs every time the debug VM is started. > > I think it's a usual convention that it's better not to run functional tests on startup. Hence, this patch proposes to move the stub test under `test/hotspot/gtest`. > > It's one copy-paste patch, except the following two minor changes: > > 1) Remove `ASSERT` condition check, and the gtest case will be run for release build as well. > > 2) Use the gtest helper `ASSERT_TRUE()` to replace the `assert()`. 
> > Note that the downside is that we won't catch stub implementation errors immediately on startup. > > Test: > > 1) Cross compilations on arm32/s390/ppc/riscv passed. > > 2) tier1 test passed on Linux/AArch64, Linux/x86_64 and macOS/Apple silicon. Note that hotspot/gtest is run in tier1. > > 3) VM builds with several different options (e.g., zero build, release build, fastdebug build, client variant, no C1/c2) passed on Linux/AArch64 and Linux/x86_64. > > 4) I manually injected errors in arraycopy/copy/fill stubs and verified `make test TEST="gtest:StubRoutines"` fails or not. But this check only worked for array fill stub. For the remaining two stubs, VM build (`make images`) failed firstly before running the gtest, because the built VM with these "erroneous" stubs were executed to do something such as loading Java core library code. This pull request has now been integrated. Changeset: 4b9ec824 Author: Hao Sun URL: https://git.openjdk.org/jdk/commit/4b9ec8245187a2eaccc711a6e5d3d4915dd022c9 Stats: 272 lines in 2 files changed: 153 ins; 119 del; 0 mod 8310355: Move the stub test from initialize_final_stubs() to test/hotspot/gtest Reviewed-by: mseledtsov, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/14765 From bulasevich at openjdk.org Tue Jul 18 06:53:53 2023 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Tue, 18 Jul 2023 06:53:53 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v7] In-Reply-To: References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: On Fri, 14 Jul 2023 19:54:25 GMT, Dean Long wrote: >> This is hopefully the last set of integer overflow fixes for hotspot. Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390. 
> > Dean Long has updated the pull request incrementally with one additional commit since the last revision: > > another counter overflow Hi, the change looks good to me. ARM32 tests are OK. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14883#issuecomment-1639604305 From djelinski at openjdk.org Tue Jul 18 07:07:14 2023 From: djelinski at openjdk.org (Daniel Jeliński) Date: Tue, 18 Jul 2023 07:07:14 GMT Subject: RFR: 8312190: Fix c++11-narrowing warnings in hotspot code In-Reply-To: References: Message-ID: On Tue, 18 Jul 2023 05:11:52 GMT, David Holmes wrote: >> This patch fixes compilation warnings produced by Clang when compiling on Windows. >> >> Clang emulates MSVC behavior and uses `int` for enumeration types that do not explicitly specify the underlying type. This patch sets an explicit underlying type for 3 enumerations to fix the warnings. >> >> See Microsoft's documentation of [Zc:enumTypes](https://learn.microsoft.com/en-us/cpp/build/reference/zc-enumtypes?view=msvc-170) for more information. > > src/hotspot/share/utilities/debug.hpp line 248: > >> 246: >> 247: // types of VM error - originally in vmError.hpp >> 248: enum VMErrorType : unsigned int { > > Why 'unsigned int' when the other changes are `uint`? `uint` is not available in this header file, and I didn't want to add includes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14907#discussion_r1266324719 From kbarrett at openjdk.org Tue Jul 18 08:11:03 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 18 Jul 2023 08:11:03 GMT Subject: RFR: 8312190: Fix c++11-narrowing warnings in hotspot code In-Reply-To: References: Message-ID: On Mon, 17 Jul 2023 18:31:41 GMT, Daniel Jeliński wrote: > This patch fixes compilation warnings produced by Clang when compiling on Windows. > > Clang emulates MSVC behavior and uses `int` for enumeration types that do not explicitly specify the underlying type.
This patch sets an explicit underlying type for 3 enumerations to fix the warnings. > > See Microsoft's documentation of [Zc:enumTypes](https://learn.microsoft.com/en-us/cpp/build/reference/zc-enumtypes?view=msvc-170) for more information. Looks good. ------------- Marked as reviewed by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14907#pullrequestreview-1534451544 From pli at openjdk.org Tue Jul 18 09:01:30 2023 From: pli at openjdk.org (Pengfei Li) Date: Tue, 18 Jul 2023 09:01:30 GMT Subject: RFR: 8311130: AArch64: Sync SVE related CPU features with VM options [v2] In-Reply-To: <49kVY3ZIX668UDgxn_sMGyQFT-Sx2brRkNN38xV4G-4=.898e951d-51f3-4df6-8cfb-a4fd34e035f8@github.com> References: <49kVY3ZIX668UDgxn_sMGyQFT-Sx2brRkNN38xV4G-4=.898e951d-51f3-4df6-8cfb-a4fd34e035f8@github.com> Message-ID: On Mon, 17 Jul 2023 08:51:19 GMT, Andrew Haley wrote: >> Pengfei Li has updated the pull request incrementally with two additional commits since the last revision: >> >> - Simplify some checks >> - Address aph's comment > > src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 211: > >> 209: if (_cpu == CPU_ARM && ((_model == 0xd0c || _model2 == 0xd0c) || >> 210: (_model == 0xd49 || _model2 == 0xd49) || >> 211: (_model == 0xd40 || _model2 == 0xd40))) { > > Perhaps we need a function to encapsulate the logic here. > > > bool model_is(unsigned int cpu_model) { > return _model == cpu_model || _model2 == cpu_model; > } I have addressed this in a new commit. Thanks for the suggestion.
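For readers following the thread, the helper proposed in the review can be sketched as below. This is a minimal standalone illustration, not the code that was committed: the struct `CpuModels`, its plain (non-static) fields, and the method `is_one_of_quoted_models` are hypothetical names introduced here, and the `_cpu == CPU_ARM` part of the quoted condition is left out for brevity.

```cpp
// Hypothetical stand-in for the two model fields discussed in the review.
// In HotSpot these live in VM_Version as _model and _model2.
struct CpuModels {
  unsigned int model;   // corresponds to _model in the quoted code
  unsigned int model2;  // corresponds to _model2 in the quoted code

  // True if either recorded model matches the given part number.
  bool model_is(unsigned int cpu_model) const {
    return model == cpu_model || model2 == cpu_model;
  }

  // The three-way comparison quoted above, rewritten with the helper.
  // (The _cpu == CPU_ARM check is intentionally omitted in this sketch.)
  bool is_one_of_quoted_models() const {
    return model_is(0xd0c) || model_is(0xd49) || model_is(0xd40);
  }
};
```

The point of the helper is that each part number appears once per check instead of twice, which makes it harder to mistype one half of a pair.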
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14897#discussion_r1266459146 From pli at openjdk.org Tue Jul 18 09:01:29 2023 From: pli at openjdk.org (Pengfei Li) Date: Tue, 18 Jul 2023 09:01:29 GMT Subject: RFR: 8311130: AArch64: Sync SVE related CPU features with VM options In-Reply-To: References: Message-ID: On Mon, 17 Jul 2023 03:19:10 GMT, Pengfei Li wrote: > As discussed in PR #14533, keeping AArch64 flag `UseSVE` and its related CPU features in sync helps to simplify rules in IR tests. In this patch, we mask SVE related CPU features off if specified SVE level in VM option is lower than the hardware supported. Also, to support this change, we move the features string construction to the end of the `initialize()` function. > > We also revert IR rule changes in PR #14533 and fix some code styles. We tested almost full jtreg on SVE, SVE2 and non-SVE CPUs and no new issue is found after this patch. Some checks are simplified in my latest commit as flags and features are in sync. May I have a review now? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14897#issuecomment-1639814434 From pli at openjdk.org Tue Jul 18 09:01:29 2023 From: pli at openjdk.org (Pengfei Li) Date: Tue, 18 Jul 2023 09:01:29 GMT Subject: RFR: 8311130: AArch64: Sync SVE related CPU features with VM options [v2] In-Reply-To: References: Message-ID: > As discussed in PR #14533, keeping AArch64 flag `UseSVE` and its related CPU features in sync helps to simplify rules in IR tests. In this patch, we mask SVE related CPU features off if specified SVE level in VM option is lower than the hardware supported. Also, to support this change, we move the features string construction to the end of the `initialize()` function. > > We also revert IR rule changes in PR #14533 and fix some code styles. We tested almost full jtreg on SVE, SVE2 and non-SVE CPUs and no new issue is found after this patch. 
Pengfei Li has updated the pull request incrementally with two additional commits since the last revision: - Simplify some checks - Address aph's comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14897/files - new: https://git.openjdk.org/jdk/pull/14897/files/379fa9ce..85a3c05d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14897&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14897&range=00-01 Stats: 21 lines in 8 files changed: 4 ins; 3 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/14897.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14897/head:pull/14897 PR: https://git.openjdk.org/jdk/pull/14897 From aph at openjdk.org Tue Jul 18 09:53:13 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 18 Jul 2023 09:53:13 GMT Subject: RFR: 8311130: AArch64: Sync SVE related CPU features with VM options [v2] In-Reply-To: References: Message-ID: On Tue, 18 Jul 2023 09:01:29 GMT, Pengfei Li wrote: >> As discussed in PR #14533, keeping AArch64 flag `UseSVE` and its related CPU features in sync helps to simplify rules in IR tests. In this patch, we mask SVE related CPU features off if specified SVE level in VM option is lower than the hardware supported. Also, to support this change, we move the features string construction to the end of the `initialize()` function. >> >> We also revert IR rule changes in PR #14533 and fix some code styles. We tested almost full jtreg on SVE, SVE2 and non-SVE CPUs and no new issue is found after this patch. > > Pengfei Li has updated the pull request incrementally with two additional commits since the last revision: > > - Simplify some checks > - Address aph's comment Looks good, but please fix the dangling else. 
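The dangling-else hazard mentioned here can be illustrated with a small standalone sketch. The macro names `APPEND_IF_BARE` and `APPEND_IF_WRAPPED` and the two functions are hypothetical and exist only for this illustration; they are not HotSpot code. Compilers may warn about the first function, which is the point of the example.

```cpp
#include <string>

// Unsafe form: a bare `if` hidden inside a macro.
#define APPEND_IF_BARE(cond, out, text) if (cond) (out) += (text)

// Safe form: the statement is wrapped in do { ... } while (0).
#define APPEND_IF_WRAPPED(cond, out, text) \
  do {                                     \
    if (cond) (out) += (text);             \
  } while (0)

// The `else` below binds to the `if` hidden in the macro, not to `if (outer)`.
std::string with_bare(bool outer, bool inner) {
  std::string out;
  if (outer)
    APPEND_IF_BARE(inner, out, "feature");
  else
    out += "outer-false";
  return out;
}

// Here the `else` binds to `if (outer)`, as the indentation suggests.
std::string with_wrapped(bool outer, bool inner) {
  std::string out;
  if (outer)
    APPEND_IF_WRAPPED(inner, out, "feature");
  else
    out += "outer-false";
  return out;
}
```

With the bare macro, `with_bare(false, true)` yields an empty string and `with_bare(true, false)` yields "outer-false", both contrary to what the layout implies. The wrapped form behaves as written. Note that the common form of this idiom leaves the trailing semicolon out of the macro definition so that the caller supplies it.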
src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 580: > 578: int buf_used_len = os::snprintf_checked(buf, sizeof(buf), "0x%02x:0x%x:0x%03x:%d", _cpu, _variant, _model, _revision); > 579: if (_model2) os::snprintf_checked(buf + buf_used_len, sizeof(buf) - buf_used_len, "(0x%03x)", _model2); > 580: #define ADD_FEATURE_IF_SUPPORTED(id, name, bit) if (VM_Version::supports_##name()) strcat(buf, ", " #name); ```suggestion #define ADD_FEATURE_IF_SUPPORTED(id, name, bit) \ do { \ if (VM_Version::supports_##name()) strcat(buf, ", " #name); \ } while(0); Reason: no dangling else in macros, ever. ------------- PR Review: https://git.openjdk.org/jdk/pull/14897#pullrequestreview-1534658549 PR Review Comment: https://git.openjdk.org/jdk/pull/14897#discussion_r1266528503 From jpbempel at openjdk.org Tue Jul 18 13:03:34 2023 From: jpbempel at openjdk.org (Jean-Philippe Bempel) Date: Tue, 18 Jul 2023 13:03:34 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform [v2] In-Reply-To: References: Message-ID: > Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in a unresolved state. So the constant pool merging process was considering the entry with pre-resolved class as different compared to the destination and create a new entry. We now try to consider it as equal specially for Methodref/Fieldref.
Jean-Philippe Bempel has updated the pull request incrementally with one additional commit since the last revision: add jtreg test for leak ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14780/files - new: https://git.openjdk.org/jdk/pull/14780/files/bf523df7..2ff1721b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14780&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14780&range=00-01 Stats: 128 lines in 2 files changed: 128 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14780.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14780/head:pull/14780 PR: https://git.openjdk.org/jdk/pull/14780 From duke at openjdk.org Tue Jul 18 13:39:13 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Tue, 18 Jul 2023 13:39:13 GMT Subject: RFR: 8311976: Inconsistency in usage of CITimeVerbose to generate compilation logs In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 22:35:43 GMT, Ashutosh Mehra wrote: > Please review this PR for controlling timing information of C1 compilation phases using CITimeVerbose option, same as for C2 compilations. > I also took this opportunity to fix some other minor issues with logging: > 1. The PhaseTraceTime object should be constructed after setting the compiler data as PhaseTraceTime constructor calls `Compilation::current()`. For this reason I moved the statement `PhaseTraceTime timeit(_t_compile)` after the call to `_env->set_compiler_data(this);`. > 2. Previous step also allowed to remove the nullptr check for `Compilation::current()` in PhaseTraceTime constructor. > 3. I noticed the call to ComileLog->done() only prints `phase_done` tag and ignores all other parameters passed to it. This was due to a bug in `xmlStream::va_done` which is also fixed in here. > 4. Remove unnecessary statements in TracePhase destructor as the object already has the fields computed in the constructor. > 5. 
Some bikeshedding like TimerName -> TimerId and TracePhase::C -> TracePhase::_compile > > I felt these are all minor fixes so I clubbed them together here. If this feels inappropriate, I can pull them into their own PRs. > > Testing: GHA testing passed Looking for one more reviewer for this patch. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14880#issuecomment-1640243716 From duke at openjdk.org Tue Jul 18 14:21:34 2023 From: duke at openjdk.org (sid8606) Date: Tue, 18 Jul 2023 14:21:34 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v3] In-Reply-To: References: Message-ID: > All faults on s390x give the address only on page granularity. > e.g. if you use 0x123456 as fail address you get si_addr = 0x123000 sid8606 has updated the pull request incrementally with one additional commit since the last revision: Fix build errors and rename variable ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14888/files - new: https://git.openjdk.org/jdk/pull/14888/files/ba81c1b0..fa8ea22b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=01-02 Stats: 11 lines in 8 files changed: 2 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/14888.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14888/head:pull/14888 PR: https://git.openjdk.org/jdk/pull/14888 From jpbempel at openjdk.org Tue Jul 18 16:43:37 2023 From: jpbempel at openjdk.org (Jean-Philippe Bempel) Date: Tue, 18 Jul 2023 16:43:37 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform [v3] In-Reply-To: References: Message-ID: > Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in a unresolved state.
So the constant pool merging process was considering the entry with pre-resolved class as different compared to the destination and create a new entry. We now try to consider it as equal specially for Methodref/Fieldref. Jean-Philippe Bempel has updated the pull request incrementally with one additional commit since the last revision: Revert resolved class to unresolved for comparison remove is_unresolved_class_mismatch ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14780/files - new: https://git.openjdk.org/jdk/pull/14780/files/2ff1721b..05071a56 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14780&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14780&range=01-02 Stats: 45 lines in 3 files changed: 7 ins; 38 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14780.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14780/head:pull/14780 PR: https://git.openjdk.org/jdk/pull/14780 From stuefe at openjdk.org Tue Jul 18 16:46:54 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 18 Jul 2023 16:46:54 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v3] In-Reply-To: References: Message-ID: <9fVP_lyTTosYFRBIrune1jZ0xfsT_alMGDjfFUJdEcQ=.ba4dfad4-a309-45df-9a60-af2f61a67b6d@github.com> On Tue, 18 Jul 2023 14:21:34 GMT, sid8606 wrote: >> All faults on s390x give the address only on page granularity. >> e.g. if you use 0x123456 as fail address you get si_addr = 0x123000 > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Fix build errors and rename variable I think this looks good. Cheers, Thomas ------------- Marked as reviewed by stuefe (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14888#pullrequestreview-1535481573 From stuefe at openjdk.org Tue Jul 18 16:48:09 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 18 Jul 2023 16:48:09 GMT Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space [v3] In-Reply-To: References: Message-ID: <5ZwKWOnzM422HYdUcSEb7EbtFCpmsczIWQJypi5rbF0=.93b8d479-1b12-46ed-b202-07b3816a2abb@github.com> On Sat, 15 Jul 2023 15:43:58 GMT, Thomas Stuefe wrote: >> TL;DR This patch introduces a new reservation API to reserve memory in low address space; depending on the OS, it may use optimized placement techniques. That new API is used to optimize the placement of class space and CDS for zero-based encoding. >> >> A future RFE will use the same API to optimize the zero-based heap reservation and thereby consolidate a lot of coding. We also plan to use this API in other places, e.g. for Shenandoah CollectionSet reservation. >> >> ------- >> >> With CDS off or at dump time, we currently attempt to optimize class space location by reserving in low address ranges. >> >> We do this by examining the *java heap* end (which has been allocated at that point) and, if that had been allocated in lower address regions, attempt to allocate adjacent to it. Essentially, we piggyback on what we hope for is an optimized heap placement. If that fails, we attempt to map at HeapBaseMinAddress. >> >> This approach has many disadvantages: >> >> - it depends on the VM either using CompressedOops and getting a zero-based heap or the region around HeapBaseMinAddress being free. >> >> - HeapBaseMinAddress is an odd choice: it is 2G on all platforms, for reasons unknown to me, but that denies us half of the valuable low address range below 4G right away. >> >> - We only get 1 shot. It's either one of these two addresses. 
>> >> - And we only use this strategy for CDS=off or CDS=dumptime; we don't use it for the CDS-runtime-fallback-case when attaching to the primary attach point failed. >> >> - It assumes narrow Klass encoding uses the same geometry (bit size, shift) as narrow Oops, which is not guaranteed with future developments (lilliput). >> >> - It actually reduces the chance of getting a zero-based *java heap*. This is because when attempting to place the heap, we leave a gap for what we assume will be the later class space. That gap is `CompressedClassSpaceSize` bytes, which is often grossly over-dimensioned. A zero-based heap is more valuable than a zero-based class space. Therefore the heap should get the best chance of low-address heap reservation. >> >> - It introduces an unnecessary dependency between heap reservation, narrow Oop encoding, and class space reservation. That makes the code base brittle. >> >> - Getting the heap region to place class space adjacent to it is actually tricky. We lack a common get-heaprange-API because ZGC. This code misuses the CompressedOops interface. But CompressedOops is the encoding range and thus only loos... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Fix Windows Thanks @dholmes-ora . I worked in your feedback. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14867#issuecomment-1640571612 From stuefe at openjdk.org Tue Jul 18 16:48:04 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 18 Jul 2023 16:48:04 GMT Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space [v4] In-Reply-To: References: Message-ID: > TL;DR This patch introduces a new reservation API to reserve memory in low address space; depending on the OS, it may use optimized placement techniques. That new API is used to optimize the placement of class space and CDS for zero-based encoding. 
> > A future RFE will use the same API to optimize the zero-based heap reservation and thereby consolidate a lot of coding. We also plan to use this API in other places, e.g. for Shenandoah CollectionSet reservation. > > ------- > > With CDS off or at dump time, we currently attempt to optimize class space location by reserving in low address ranges. > > We do this by examining the *java heap* end (which has been allocated at that point) and, if that had been allocated in lower address regions, attempt to allocate adjacent to it. Essentially, we piggyback on what we hope for is an optimized heap placement. If that fails, we attempt to map at HeapBaseMinAddress. > > This approach has many disadvantages: > > - it depends on the VM either using CompressedOops and getting a zero-based heap or the region around HeapBaseMinAddress being free. > > - HeapBaseMinAddress is an odd choice: it is 2G on all platforms, for reasons unknown to me, but that denies us half of the valuable low address range below 4G right away. > > - We only get 1 shot. It's either one of these two addresses. > > - And we only use this strategy for CDS=off or CDS=dumptime; we don't use it for the CDS-runtime-fallback-case when attaching to the primary attach point failed. > > - It assumes narrow Klass encoding uses the same geometry (bit size, shift) as narrow Oops, which is not guaranteed with future developments (lilliput). > > - It actually reduces the chance of getting a zero-based *java heap*. This is because when attempting to place the heap, we leave a gap for what we assume will be the later class space. That gap is `CompressedClassSpaceSize` bytes, which is often grossly over-dimensioned. A zero-based heap is more valuable than a zero-based class space. Therefore the heap should get the best chance of low-address heap reservation. > > - It introduces an unnecessary dependency between heap reservation, narrow Oop encoding, and class space reservation. That makes the code base brittle. 
> > - Getting the heap region to place class space adjacent to it is actually tricky. We lack a common get-heaprange-API because ZGC. This code misuses the CompressedOops interface. But CompressedOops is the encoding range and thus only loosely correlated to the heap range (the latter must cont... Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Feedback David - Merge branch 'master' into JDK-8312018-Improve-zero-base-optimized-reservation-of-class-space - Fix Windows - Feedback Roman; fix off-by-1; fix tracing - better zero-based reservation strategy ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14867/files - new: https://git.openjdk.org/jdk/pull/14867/files/8d6a1ed4..60923aad Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14867&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14867&range=02-03 Stats: 8259 lines in 261 files changed: 6693 ins; 758 del; 808 mod Patch: https://git.openjdk.org/jdk/pull/14867.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14867/head:pull/14867 PR: https://git.openjdk.org/jdk/pull/14867 From jpbempel at openjdk.org Tue Jul 18 16:48:55 2023 From: jpbempel at openjdk.org (Jean-Philippe Bempel) Date: Tue, 18 Jul 2023 16:48:55 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform [v4] In-Reply-To: References: Message-ID: > Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in a unresolved state. So the constant pool merging process was considering the entry with pre-resolved class as different compared to the destination and create a new entry. 
We now try to consider it as equal specially for Methodref/Fieldref. Jean-Philippe Bempel has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: Revert resolved class to unresolved for comparison remove is_unresolved_class_mismatch ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14780/files - new: https://git.openjdk.org/jdk/pull/14780/files/05071a56..c1a2d7c7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14780&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14780&range=02-03 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14780.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14780/head:pull/14780 PR: https://git.openjdk.org/jdk/pull/14780 From jpbempel at openjdk.org Tue Jul 18 16:52:08 2023 From: jpbempel at openjdk.org (Jean-Philippe Bempel) Date: Tue, 18 Jul 2023 16:52:08 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 14:34:38 GMT, Coleen Phillimore wrote: >> Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in a unresolved state. So the constant pool merging process was considering the entry with pre-resolved class as different compared to the destination and create a new entry. We now try to consider it as equal specially for Methodref/Fieldref. > > Also there is a nice test harness for class redefinition in the test/hotspot/jtreg/serviceability/jvmti/RedefineClasses tests that you might be able to use to add a test for this. Thanks @coleenp for the hints about fixing this issue. 
Please review again the new changes that include: - jtreg test that reproduces the issue - revert the unresolved class state if the current entry is resolved - remove `is_unresolved_class_mismatch` method as useless now. Thanks (sorry to the force push, old habit) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14780#issuecomment-1640580798 From duke at openjdk.org Tue Jul 18 17:21:46 2023 From: duke at openjdk.org (sid8606) Date: Tue, 18 Jul 2023 17:21:46 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v4] In-Reply-To: References: Message-ID: > All faults on s390x give the address only on page granularity. > e.g. if you use 0x123456 as fail address you get si_addr = 0x123000 sid8606 has updated the pull request incrementally with one additional commit since the last revision: Fix #ifdef AIX ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14888/files - new: https://git.openjdk.org/jdk/pull/14888/files/fa8ea22b..b4590e22 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14888.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14888/head:pull/14888 PR: https://git.openjdk.org/jdk/pull/14888 From amitkumar at openjdk.org Tue Jul 18 17:43:45 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Tue, 18 Jul 2023 17:43:45 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v4] In-Reply-To: References: Message-ID: On Tue, 18 Jul 2023 17:21:46 GMT, sid8606 wrote: >> All faults on s390x give the address only on page granularity. >> e.g. if you use 0x123456 as fail address you get si_addr = 0x123000 > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Fix #ifdef AIX Changes requested by amitkumar (Committer). 
src/hotspot/cpu/s390/globalDefinitions_s390.hpp line 34: > 32: > 33: //All faults on s390x give the address only on page granularity. > 34: //Set Pdsegfault_address to minumum one page address. Suggestion: // All faults on s390x give the address only on page granularity. // Set pd_segfault_address to minimum one page address. test/hotspot/jtreg/runtime/ErrorHandling/TestSigInfoInHsErrFile.java line 73: > 71: //All faults on s390x give the address only on page granularity. > 72: //Hence fault address is first page address. > 73: String crashAddress = Platform.isAix() ? "0xffffffffffffffff" : Platform.isS390x() ? "0x0*1000" : "0x0*400"; Probably we can create a if-else ladder here and move this comment inside that if/else block ? something like this: if (AIX) { crashAddress for AIX } else if (S390) { // your comment crashAddress for S390 } else { crashAddress for other archs } ------------- PR Review: https://git.openjdk.org/jdk/pull/14888#pullrequestreview-1535568302 PR Review Comment: https://git.openjdk.org/jdk/pull/14888#discussion_r1267111048 PR Review Comment: https://git.openjdk.org/jdk/pull/14888#discussion_r1267102169 From amitkumar at openjdk.org Tue Jul 18 17:56:44 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Tue, 18 Jul 2023 17:56:44 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v4] In-Reply-To: References: Message-ID: On Tue, 18 Jul 2023 17:21:46 GMT, sid8606 wrote: >> All faults on s390x give the address only on page granularity. >> e.g. 
if you use 0x123456 as fail address you get si_addr = 0x123000 > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Fix #ifdef AIX @RealFYang and @backwaterred, your approval would be highly appreciated for this PR :-) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14888#issuecomment-1640692293 From pchilanomate at openjdk.org Tue Jul 18 18:24:40 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Tue, 18 Jul 2023 18:24:40 GMT Subject: RFR: 8311541: JavaThread::print_jni_stack doesn't support native stacks on all platforms In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 07:41:00 GMT, David Holmes wrote: > [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) modified print_jni_stack to print the native stack when there are no Java frames. To do that it used VMError::print_native_stack, however that function is only for platforms that support stack-walking by following frames, on other platforms (i.e. Windows and AIX) we need to use os::platform_print_native_stack. > > I'm not trying to consolidate the different versions of the stack printing code in this PR so that it is more easily backported to where [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) was. > > The test has been updated to work on Windows (taking advantage of two other recent enhancements - see JBS for details). > > Testing: > - tiers 1-3 (sanity) > - TestNativeStack regression test > > Thanks Looks good to me. ------------- Marked as reviewed by pchilanomate (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14862#pullrequestreview-1535657063 From shade at openjdk.org Tue Jul 18 18:28:41 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 18 Jul 2023 18:28:41 GMT Subject: RFR: 8311541: JavaThread::print_jni_stack doesn't support native stacks on all platforms In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 07:41:00 GMT, David Holmes wrote: > [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) modified print_jni_stack to print the native stack when there are no Java frames. To do that it used VMError::print_native_stack, however that function is only for platforms that support stack-walking by following frames, on other platforms (i.e. Windows and AIX) we need to use os::platform_print_native_stack. > > I'm not trying to consolidate the different versions of the stack printing code in this PR so that it is more easily backported to where [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) was. > > The test has been updated to work on Windows (taking advantage of two other recent enhancements - see JBS for details). > > Testing: > - tiers 1-3 (sanity) > - TestNativeStack regression test > > Thanks Looks reasonable. Is `_WIN32` defined for Windows 64 as well? Consider a nit, if you have any other changes in the pipe for this PR. Ignore otherwise. test/hotspot/jtreg/runtime/jni/nativeStack/libnativeStack.c line 110: > 108: printf("Native thread terminating\n"); > 109: > 110: #ifndef _WIN32 I'd flip it over to have `#ifdef` without negation, but that's minor. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14862#pullrequestreview-1535661069 PR Review Comment: https://git.openjdk.org/jdk/pull/14862#discussion_r1267152619 From stuefe at openjdk.org Tue Jul 18 18:56:42 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 18 Jul 2023 18:56:42 GMT Subject: RFR: 8311541: JavaThread::print_jni_stack doesn't support native stacks on all platforms In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 07:41:00 GMT, David Holmes wrote: > [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) modified print_jni_stack to print the native stack when there are no Java frames. To do that it used VMError::print_native_stack, however that function is only for platforms that support stack-walking by following frames, on other platforms (i.e. Windows and AIX) we need to use os::platform_print_native_stack. > > I'm not trying to consolidate the different versions of the stack printing code in this PR so that it is more easily backported to where [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) was. > > The test has been updated to work on Windows (taking advantage of two other recent enhancements - see JBS for details). > > Testing: > - tiers 1-3 (sanity) > - TestNativeStack regression test > > Thanks Good. Small nit inline. Cheers, Thomas test/hotspot/jtreg/runtime/jni/nativeStack/libnativeStack.c line 75: > 73: > 74: #ifdef _WIN32 > 75: unsigned __stdcall `__stdcall` is only used for 32-bit windows. I'd probably skip it since we are about to remove 32-bit support anyway. ------------- Marked as reviewed by stuefe (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14862#pullrequestreview-1535703266 PR Review Comment: https://git.openjdk.org/jdk/pull/14862#discussion_r1267180181 From stuefe at openjdk.org Tue Jul 18 19:09:52 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 18 Jul 2023 19:09:52 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v12] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Fri, 14 Jul 2023 20:03:24 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. 
That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: > > - Display unsupported text with UL warning level, + test > - Last Aleksey Feedback I'll wait for any last remarks until tomorrow morning (CET). Barring any objections, I'll push. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1640822900 From iklam at openjdk.org Tue Jul 18 19:35:47 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 18 Jul 2023 19:35:47 GMT Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space [v4] In-Reply-To: References: Message-ID: On Tue, 18 Jul 2023 16:48:04 GMT, Thomas Stuefe wrote: >> TL;DR This patch introduces a new reservation API to reserve memory in low address space; depending on the OS, it may use optimized placement techniques. 
That new API is used to optimize the placement of class space and CDS for zero-based encoding. >> >> A future RFE will use the same API to optimize the zero-based heap reservation and thereby consolidate a lot of coding. We also plan to use this API in other places, e.g. for Shenandoah CollectionSet reservation. >> >> ------- >> >> With CDS off or at dump time, we currently attempt to optimize class space location by reserving in low address ranges. >> >> We do this by examining the *java heap* end (which has been allocated at that point) and, if that had been allocated in lower address regions, attempt to allocate adjacent to it. Essentially, we piggyback on what we hope for is an optimized heap placement. If that fails, we attempt to map at HeapBaseMinAddress. >> >> This approach has many disadvantages: >> >> - it depends on the VM either using CompressedOops and getting a zero-based heap or the region around HeapBaseMinAddress being free. >> >> - HeapBaseMinAddress is an odd choice: it is 2G on all platforms, for reasons unknown to me, but that denies us half of the valuable low address range below 4G right away. >> >> - We only get 1 shot. It's either one of these two addresses. >> >> - And we only use this strategy for CDS=off or CDS=dumptime; we don't use it for the CDS-runtime-fallback-case when attaching to the primary attach point failed. >> >> - It assumes narrow Klass encoding uses the same geometry (bit size, shift) as narrow Oops, which is not guaranteed with future developments (lilliput). >> >> - It actually reduces the chance of getting a zero-based *java heap*. This is because when attempting to place the heap, we leave a gap for what we assume will be the later class space. That gap is `CompressedClassSpaceSize` bytes, which is often grossly over-dimensioned. A zero-based heap is more valuable than a zero-based class space. Therefore the heap should get the best chance of low-address heap reservation. 
>>
>> - It introduces an unnecessary dependency between heap reservation, narrow Oop encoding, and class space reservation. That makes the code base brittle.
>>
>> - Getting the heap region to place class space adjacent to it is actually tricky. We lack a common get-heaprange-API because ZGC. This code misuses the CompressedOops interface. But CompressedOops is the encoding range and thus only loos...
>
> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision:
>
> - Feedback David
> - Merge branch 'master' into JDK-8312018-Improve-zero-base-optimized-reservation-of-class-space
> - Fix Windows
> - Feedback Roman; fix off-by-1; fix tracing
> - better zero-based reservation strategy

I think we should document the interaction with "-Xshare:dump".

Maybe we should add comments that the value of `CompressedKlass::base()` is irrelevant to the dumped CDS archive when running "java -Xshare:dump", because of this code. I.e., if CDS is enabled, we always use a non-zero based encoding.

    narrowKlass ArchiveBuilder::get_requested_narrow_klass(Klass* k) {
      assert(DumpSharedSpaces, "sanity");
      k = get_buffered_klass(k);
      Klass* requested_k = to_requested(k);
      address narrow_klass_base = _requested_static_archive_bottom; // runtime encoding base == runtime mapping start
      const int narrow_klass_shift = ArchiveHeapWriter::precomputed_narrow_klass_shift;
      return CompressedKlassPointers::encode_not_null(requested_k, narrow_klass_base, narrow_klass_shift);
    }

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14867#issuecomment-1640886673

From iklam at openjdk.org Tue Jul 18 19:47:47 2023
From: iklam at openjdk.org (Ioi Lam)
Date: Tue, 18 Jul 2023 19:47:47 GMT
Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space [v4]
In-Reply-To:
References:
Message-ID:

On Tue, 18 Jul 2023 16:48:04 GMT, Thomas Stuefe wrote:

>> TL;DR This patch introduces a new reservation API to reserve memory in low address space; depending on the OS, it may use optimized placement techniques. That new API is used to optimize the placement of class space and CDS for zero-based encoding.
>>
>> A future RFE will use the same API to optimize the zero-based heap reservation and thereby consolidate a lot of coding. We also plan to use this API in other places, e.g. for Shenandoah CollectionSet reservation.
>>
>> -------
>>
>> With CDS off or at dump time, we currently attempt to optimize class space location by reserving in low address ranges.
>>
>> We do this by examining the *java heap* end (which has been allocated at that point) and, if that had been allocated in lower address regions, attempt to allocate adjacent to it. Essentially, we piggyback on what we hope for is an optimized heap placement. If that fails, we attempt to map at HeapBaseMinAddress.
>>
>> This approach has many disadvantages:
>>
>> - it depends on the VM either using CompressedOops and getting a zero-based heap or the region around HeapBaseMinAddress being free.
>> >> - HeapBaseMinAddress is an odd choice: it is 2G on all platforms, for reasons unknown to me, but that denies us half of the valuable low address range below 4G right away. >> >> - We only get 1 shot. It's either one of these two addresses. >> >> - And we only use this strategy for CDS=off or CDS=dumptime; we don't use it for the CDS-runtime-fallback-case when attaching to the primary attach point failed. >> >> - It assumes narrow Klass encoding uses the same geometry (bit size, shift) as narrow Oops, which is not guaranteed with future developments (lilliput). >> >> - It actually reduces the chance of getting a zero-based *java heap*. This is because when attempting to place the heap, we leave a gap for what we assume will be the later class space. That gap is `CompressedClassSpaceSize` bytes, which is often grossly over-dimensioned. A zero-based heap is more valuable than a zero-based class space. Therefore the heap should get the best chance of low-address heap reservation. >> >> - It introduces an unnecessary dependency between heap reservation, narrow Oop encoding, and class space reservation. That makes the code base brittle. >> >> - Getting the heap region to place class space adjacent to it is actually tricky. We lack a common get-heaprange-API because ZGC. This code misuses the CompressedOops interface. But CompressedOops is the encoding range and thus only loos... > > Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains five additional commits since the last revision: > > - Feedback David > - Merge branch 'master' into JDK-8312018-Improve-zero-base-optimized-reservation-of-class-space > - Fix Windows > - Feedback Roman; fix off-by-1; fix tracing > - better zero-based reservation strategy src/hotspot/share/memory/metaspace.cpp line 597: > 595: { > 596: // First try for zero-base zero-shift (lower 4G); failing that, try for zero-based with max shift (lower 32G) > 597: constexpr int num_tries = 8; num_tries should be computed instead of hard-coded. src/hotspot/share/runtime/os.cpp line 1787: > 1785: #endif > 1786: - os::vm_page_size()); > 1787: } I am not sure what "attach" means in this sense. If it's the usable address range, wouldn't it need to be OS-specific? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14867#discussion_r1267238281 PR Review Comment: https://git.openjdk.org/jdk/pull/14867#discussion_r1267235883 From amenkov at openjdk.org Tue Jul 18 20:14:39 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 18 Jul 2023 20:14:39 GMT Subject: [jdk21] RFR: 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach In-Reply-To: References: Message-ID: <9DmCaQOV7zXsVP2mMuXus9QhRnn_iaRbCDMQqOWb5SQ=.94ff5efa-2781-46a0-9d41-caa3d08626bf@github.com> On Thu, 13 Jul 2023 03:41:28 GMT, Serguei Spitsyn wrote: > Clean backport from mainline jdk repo to jdk21 for the fix of: > [8311556](https://bugs.openjdk.org/browse/JDK-8311556): GetThreadLocalStorage not working for vthreads mounted during JVMTI attach > > Testing: > - TBD: mach5 tiers 1-5 Marked as reviewed by amenkov (Reviewer). 
-------------

PR Review: https://git.openjdk.org/jdk21/pull/117#pullrequestreview-1535835390

From amenkov at openjdk.org Tue Jul 18 20:19:49 2023
From: amenkov at openjdk.org (Alex Menkov)
Date: Tue, 18 Jul 2023 20:19:49 GMT
Subject: Integrated: JDK-8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads
In-Reply-To:
References:
Message-ID:

On Thu, 13 Jul 2023 19:18:38 GMT, Alex Menkov wrote:

> The change fixes handling of "suspended" bit in VT state.
> The code looks very strange.
> java_lang_VirtualThread::RUNNING == 3, so line 803 clears JVMTI_THREAD_STATE_ALIVE(1) and JVMTI_THREAD_STATE_TERMINATED(2)
> Per log this code came from loom repo with VT integration.
>
> Testing: tier1-4, updated GetThreadStateMountedTest.java

This pull request has now been integrated.

Changeset: af5bf817
Author: Alex Menkov
URL: https://git.openjdk.org/jdk/commit/af5bf81754072fa5879726cfacb7404892b553f0
Stats: 10 lines in 2 files changed: 2 ins; 2 del; 6 mod

8310584: GetThreadState reports blocked and runnable for pinned suspended virtual threads

Reviewed-by: sspitsyn, dholmes

-------------

PR: https://git.openjdk.org/jdk/pull/14878

From sspitsyn at openjdk.org Tue Jul 18 20:39:59 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Tue, 18 Jul 2023 20:39:59 GMT
Subject: [jdk21] Integrated: 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach
In-Reply-To:
References:
Message-ID:

On Thu, 13 Jul 2023 03:41:28 GMT, Serguei Spitsyn wrote:

> Clean backport from mainline jdk repo to jdk21 for the fix of:
> [8311556](https://bugs.openjdk.org/browse/JDK-8311556): GetThreadLocalStorage not working for vthreads mounted during JVMTI attach
>
> Testing:
> - TBD: mach5 tiers 1-5

This pull request has now been integrated.
Changeset: e3cfb56d Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk21/commit/e3cfb56d8fa3852f07d4f3af038955c98eee742c Stats: 196 lines in 3 files changed: 181 ins; 13 del; 2 mod 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach Reviewed-by: amenkov Backport-of: 11a5115caf179a1bbed5311e12ed3851e026c5c5 ------------- PR: https://git.openjdk.org/jdk21/pull/117 From sspitsyn at openjdk.org Tue Jul 18 20:39:57 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 18 Jul 2023 20:39:57 GMT Subject: [jdk21] RFR: 8311556: GetThreadLocalStorage not working for vthreads mounted during JVMTI attach In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 03:41:28 GMT, Serguei Spitsyn wrote: > Clean backport from mainline jdk repo to jdk21 for the fix of: > [8311556](https://bugs.openjdk.org/browse/JDK-8311556): GetThreadLocalStorage not working for vthreads mounted during JVMTI attach > > Testing: > - TBD: mach5 tiers 1-5 Thank you for review, Alex. ------------- PR Comment: https://git.openjdk.org/jdk21/pull/117#issuecomment-1640956299 From dlong at openjdk.org Tue Jul 18 21:27:44 2023 From: dlong at openjdk.org (Dean Long) Date: Tue, 18 Jul 2023 21:27:44 GMT Subject: RFR: 8312077: Fix signed integer overflow, final part [v7] In-Reply-To: References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: On Tue, 18 Jul 2023 06:50:40 GMT, Boris Ulasevich wrote: >> Dean Long has updated the pull request incrementally with one additional commit since the last revision: >> >> another counter overflow > > Hi, the change looks good for me. ARM32 tests are OK. Thanks @bulasevich. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/14883#issuecomment-1641011160

From dlong at openjdk.org Tue Jul 18 21:31:41 2023
From: dlong at openjdk.org (Dean Long)
Date: Tue, 18 Jul 2023 21:31:41 GMT
Subject: RFR: 8312077: Fix signed integer overflow, final part [v7]
In-Reply-To:
References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com>
Message-ID: <8KKRAJf74MABR4l7EXPN84iqZrWvzc0xKYYj8XWf7tw=.c73eeef9-e926-4bed-937b-529b15d34107@github.com>

On Fri, 14 Jul 2023 19:54:25 GMT, Dean Long wrote:

>> This is hopefully the last set of integer overflow fixes for hotspot. Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390.
>
> Dean Long has updated the pull request incrementally with one additional commit since the last revision:
>
> another counter overflow

I'm still waiting on some tier10 tests that take 7 days to complete.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14883#issuecomment-1641016106

From dholmes at openjdk.org Tue Jul 18 21:53:43 2023
From: dholmes at openjdk.org (David Holmes)
Date: Tue, 18 Jul 2023 21:53:43 GMT
Subject: RFR: 8311541: JavaThread::print_jni_stack doesn't support native stacks on all platforms
In-Reply-To:
References:
Message-ID: <_Ket2NFVTlnHL8J-PeWd1i2o2mwexZL6mnynKzAd5kk=.44ebd3b2-5d04-4517-a127-efe19647bda2@github.com>

On Tue, 18 Jul 2023 18:50:42 GMT, Thomas Stuefe wrote:

>> [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) modified print_jni_stack to print the native stack when there are no Java frames. To do that it used VMError::print_native_stack, however that function is only for platforms that support stack-walking by following frames, on other platforms (i.e. Windows and AIX) we need to use os::platform_print_native_stack.
>> >> I'm not trying to consolidate the different versions of the stack printing code in this PR so that it is more easily backported to where [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) was. >> >> The test has been updated to work on Windows (taking advantage of two other recent enhancements - see JBS for details). >> >> Testing: >> - tiers 1-3 (sanity) >> - TestNativeStack regression test >> >> Thanks > > test/hotspot/jtreg/runtime/jni/nativeStack/libnativeStack.c line 75: > >> 73: >> 74: #ifdef _WIN32 >> 75: unsigned __stdcall > > `__stdcall` is only used for 32-bit windows. I'd probably skip it since we are about to remove 32-bit support anyway. `__stdcall` is required for native/unmanaged code per: https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/beginthread-beginthreadex?view=msvc-170 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14862#discussion_r1267336563 From dholmes at openjdk.org Tue Jul 18 22:04:41 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 18 Jul 2023 22:04:41 GMT Subject: RFR: 8311541: JavaThread::print_jni_stack doesn't support native stacks on all platforms In-Reply-To: References: Message-ID: On Tue, 18 Jul 2023 18:25:56 GMT, Aleksey Shipilev wrote: > Is _WIN32 defined for Windows 64 as well? Yes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14862#issuecomment-1641049755 From dholmes at openjdk.org Tue Jul 18 22:04:42 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 18 Jul 2023 22:04:42 GMT Subject: RFR: 8311541: JavaThread::print_jni_stack doesn't support native stacks on all platforms In-Reply-To: References: Message-ID: On Tue, 18 Jul 2023 18:53:43 GMT, Thomas Stuefe wrote: >> [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) modified print_jni_stack to print the native stack when there are no Java frames. 
To do that it used VMError::print_native_stack, however that function is only for platforms that support stack-walking by following frames, on other platforms (i.e. Windows and AIX) we need to use os::platform_print_native_stack. >> >> I'm not trying to consolidate the different versions of the stack printing code in this PR so that it is more easily backported to where [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) was. >> >> The test has been updated to work on Windows (taking advantage of two other recent enhancements - see JBS for details). >> >> Testing: >> - tiers 1-3 (sanity) >> - TestNativeStack regression test >> >> Thanks > > Good. Small nit inline. > > Cheers, Thomas Thanks for the reviews @tstuefe and @shipilev ! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14862#issuecomment-1641050661 From dholmes at openjdk.org Tue Jul 18 22:07:50 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 18 Jul 2023 22:07:50 GMT Subject: Integrated: 8311541: JavaThread::print_jni_stack doesn't support native stacks on all platforms In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 07:41:00 GMT, David Holmes wrote: > [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) modified print_jni_stack to print the native stack when there are no Java frames. To do that it used VMError::print_native_stack, however that function is only for platforms that support stack-walking by following frames, on other platforms (i.e. Windows and AIX) we need to use os::platform_print_native_stack. > > I'm not trying to consolidate the different versions of the stack printing code in this PR so that it is more easily backported to where [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) was. > > The test has been updated to work on Windows (taking advantage of two other recent enhancements - see JBS for details). > > Testing: > - tiers 1-3 (sanity) > - TestNativeStack regression test > > Thanks This pull request has now been integrated. 
Changeset: c2f421b8 Author: David Holmes URL: https://git.openjdk.org/jdk/commit/c2f421b8bf920665e05bbbb56bc4d7f55430d5e1 Stats: 51 lines in 5 files changed: 43 ins; 1 del; 7 mod 8311541: JavaThread::print_jni_stack doesn't support native stacks on all platforms Reviewed-by: pchilanomate, shade, stuefe ------------- PR: https://git.openjdk.org/jdk/pull/14862 From iklam at openjdk.org Tue Jul 18 23:50:08 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 18 Jul 2023 23:50:08 GMT Subject: RFR: 8308903: Print detailed info for Java objects in -Xlog:cds+map [v2] In-Reply-To: References: Message-ID: <6BUeghcYvjW1602hxr1lVz7Aukpu1RT1jmiDHyQXIgU=.064da5c0-29fd-4110-b696-6268ca9c8454@github.com> > This PR adds detailed printing of oop information with `-Xlog:cds+map+oop=trace`, or simply `-Xlog:cds+map*=trace`. The information is useful for debugging contents of the CDS archived heap objects. > > The output looks like `oopDesc::print_on(tty)`, but we need to print the pointers using the locations of the objects at runtime. The examples below show how a `String` references its `value` array. > > Example with `-XX:-UseCompressedOops`: > > > 0x00000000100001f0: @@ Object java.lang.String > 0x00000000100001f0: 0000006ff6ab8d01 88d47c5b00010290 0000000000010000 0000000010000210 > - klass: 'java/lang/String' 0x0000000800010290 > - ---- fields (total size 4 words): > - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) > - private final 'coder' 'B' @16 0 (0x00) > - private 'hashIsZero' 'Z' @17 false (0x00) > - injected 'flags' 'B' @18 1 (0x01) > - private final 'value' '[B' @24 0x0000000010000210 [B length: 6 > 0x0000000010000210: @@ Object [B length: 6 > 0x0000000010000210: 000000693b708001 00000006000024c8 0000574f5252414e > - klass: {type array byte} 0x00000008000024c8 > - 0: 4e N > - 1: 41 A > - 2: 52 R > - 3: 52 R > - 4: 4f O > - 5: 57 W > > > Example with `-XX:+UseCompressedOops`. 
Note that the narrowOop is also printed:
>
>
> 0x00000007ffc001b8: @@ Object (0xfff80037) java.lang.String
> 0x00000007ffc001b8: f6ab8d01 0000006f 00010290 88d47c5b 00010000 fff8003a
> - klass: 'java/lang/String' 0x0000000800010290
> - ---- fields (total size 3 words):
> - private 'hash' 'I' @12 -1999340453 (0x88d47c5b)
> - private final 'coder' 'B' @16 0 (0x00)
> - private 'hashIsZero' 'Z' @17 false (0x00)
> - injected 'flags' 'B' @18 1 (0x01)
> - private final 'value' '[B' @20 0x00000007ffc001d0 (0xfff8003a) [B length: 6
> 0x00000007ffc001d0: @@ Object (0xfff8003a) [B length: 6
> 0x00000007ffc001d0: 3b708001 00000069 000024c8 00000006 5252414e 0000574f
> - klass: {type array byte} 0x00000008000024c8
> - 0: 4e N
> - 1: 41 A
> - 2: 52 R
> - 3: 52 R
> - 4: 4f O
> - 5: 57 W

Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:

 - Merge branch 'master' into 8308903-cds-map-detailed-info-for-oops
 - @tstuefe and @matias9927 comments
 - 8308903: Print detailed info for Java objects in -Xlog:cds+map

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/14841/files
  - new: https://git.openjdk.org/jdk/pull/14841/files/c379404d..14f3d15b

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=14841&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14841&range=00-01

  Stats: 21380 lines in 678 files changed: 13267 ins; 6867 del; 1246 mod
  Patch: https://git.openjdk.org/jdk/pull/14841.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14841/head:pull/14841

PR: https://git.openjdk.org/jdk/pull/14841

From lzhai at openjdk.org Wed Jul 19 03:09:40 2023
From: lzhai at openjdk.org (Leslie Zhai)
Date: Wed, 19 Jul 2023 03:09:40 GMT
Subject: RFR: 8310939: [c1] The visibility of write-volatile requires membar_volatile instead of membar
In-Reply-To:
References:
Message-ID:

On Tue, 27 Jun 2023 12:46:14 GMT, SUN Guoyun wrote:

> For c1 now, a volatile write case:
>
>     membar_release // LoadStore | StoreStore
>     write volatile
>     membar
>
> Just like c2, here `membar` should be defined `membar_volatile` clearly, then for risc-v, ppc and loongarch can use StoreLoad for `membar_volatile` for better performance.
>
> Testing:
> GHA testing
> jtreg tier1-3 for loongarch64

PING

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14677#issuecomment-1641314769

From dholmes at openjdk.org Wed Jul 19 03:46:00 2023
From: dholmes at openjdk.org (David Holmes)
Date: Wed, 19 Jul 2023 03:46:00 GMT
Subject: RFR: 8311541: JavaThread::print_jni_stack doesn't support native stacks on all platforms
In-Reply-To:
References:
Message-ID:

On Tue, 18 Jul 2023 18:21:54 GMT, Patricio Chilano Mateo wrote:

>> [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) modified print_jni_stack to print the native stack when there are no Java frames. To do that it used VMError::print_native_stack, however that function is only for platforms that support stack-walking by following frames, on other platforms (i.e. Windows and AIX) we need to use os::platform_print_native_stack.
>>
>> I'm not trying to consolidate the different versions of the stack printing code in this PR so that it is more easily backported to where [JDK-8295974](https://bugs.openjdk.org/browse/JDK-8295974) was.
>>
>> The test has been updated to work on Windows (taking advantage of two other recent enhancements - see JBS for details).
>>
>> Testing:
>> - tiers 1-3 (sanity)
>> - TestNativeStack regression test
>>
>> Thanks
>
> Looks good to me.

Oops! Thanks for the review @pchilano - sorry I initially missed seeing it.
-------------

PR Comment: https://git.openjdk.org/jdk/pull/14862#issuecomment-1641368386

From duke at openjdk.org Wed Jul 19 04:59:20 2023
From: duke at openjdk.org (sid8606)
Date: Wed, 19 Jul 2023 04:59:20 GMT
Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v5]
In-Reply-To:
References:
Message-ID:

> All faults on s390x give the address only on page granularity.
> e.g. if you use 0x123456 as fail address you get si_addr = 0x123000

sid8606 has updated the pull request incrementally with one additional commit since the last revision:

  Use if-else ladder

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/14888/files
  - new: https://git.openjdk.org/jdk/pull/14888/files/b4590e22..6a0447fe

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=04
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=03-04

  Stats: 8 lines in 1 file changed: 5 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/14888.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14888/head:pull/14888

PR: https://git.openjdk.org/jdk/pull/14888

From duke at openjdk.org Wed Jul 19 04:59:21 2023
From: duke at openjdk.org (sid8606)
Date: Wed, 19 Jul 2023 04:59:21 GMT
Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v4]
In-Reply-To:
References:
Message-ID: <0DNtYn6pTK213izIhgWrWqdzq74lW9aSOKnLwmrDLjw=.035231a5-a754-47b0-899b-6f7f8b07501f@github.com>

On Tue, 18 Jul 2023 17:39:28 GMT, Amit Kumar wrote:

>> sid8606 has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Fix #ifdef AIX
>
> src/hotspot/cpu/s390/globalDefinitions_s390.hpp line 34:
>
>> 32:
>> 33: //All faults on s390x give the address only on page granularity.
>> 34: //Set Pdsegfault_address to minumum one page address.
>
> Suggestion:
>
> // All faults on s390x give the address only on page granularity.
> // Set pd_segfault_address to minimum one page address.

Fixed, Thanks

> test/hotspot/jtreg/runtime/ErrorHandling/TestSigInfoInHsErrFile.java line 73:
>
>> 71: //All faults on s390x give the address only on page granularity.
>> 72: //Hence fault address is first page address.
>> 73: String crashAddress = Platform.isAix() ? "0xffffffffffffffff" : Platform.isS390x() ? "0x0*1000" : "0x0*400";
>
> Probably we can create an if-else ladder here and move this comment inside that if/else block? Something like this:
>
> if (AIX) {
>     crashAddress for AIX
> } else if (S390) {
>     // your comment
>     crashAddress for S390
> } else {
>     crashAddress for other archs
> }

I have adapted this.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14888#discussion_r1267533348
PR Review Comment: https://git.openjdk.org/jdk/pull/14888#discussion_r1267533250

From thartmann at openjdk.org Wed Jul 19 05:24:42 2023
From: thartmann at openjdk.org (Tobias Hartmann)
Date: Wed, 19 Jul 2023 05:24:42 GMT
Subject: RFR: 8311976: Inconsistency in usage of CITimeVerbose to generate compilation logs
In-Reply-To:
References:
Message-ID:

On Thu, 13 Jul 2023 22:35:43 GMT, Ashutosh Mehra wrote:

> Please review this PR for controlling timing information of C1 compilation phases using CITimeVerbose option, same as for C2 compilations.
> I also took this opportunity to fix some other minor issues with logging:
> 1. The PhaseTraceTime object should be constructed after setting the compiler data as PhaseTraceTime constructor calls `Compilation::current()`. For this reason I moved the statement `PhaseTraceTime timeit(_t_compile)` after the call to `_env->set_compiler_data(this);`.
> 2. Previous step also allowed to remove the nullptr check for `Compilation::current()` in PhaseTraceTime constructor.
> 3. I noticed the call to CompileLog->done() only prints `phase_done` tag and ignores all other parameters passed to it. This was due to a bug in `xmlStream::va_done` which is also fixed in here.
> 4.
Remove unnecessary statements in TracePhase destructor as the object already has the fields computed in the constructor. > 5. Some bikeshedding like TimerName -> TimerId and TracePhase::C -> TracePhase::_compile > > I felt these are all minor fixes so I clubbed them together here. If that feels inappropriate, I can pull them into their own PRs. > > Testing: GHA testing passed Looks good to me. ------------- Marked as reviewed by thartmann (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14880#pullrequestreview-1536268202 From myano at openjdk.org Wed Jul 19 05:28:43 2023 From: myano at openjdk.org (Masanori Yano) Date: Wed, 19 Jul 2023 05:28:43 GMT Subject: RFR: 8308751: Create new switch to print error reporting output to both hs_err_pid file and stdout/stderr [v3] In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 11:04:12 GMT, Masanori Yano wrote: >> I think it makes sense to add the ErrorFileWithStdout and ErrorFileWithStderr for troubleshooting. >> I would appreciate it if someone could review it. > > Masanori Yano has updated the pull request incrementally with one additional commit since the last revision: > > 8308751: Create new switch to print error reporting output to both hs_err_pid file and stdout/stderr I understand why you think this should not be added. I would like to withdraw this PR and CSR in order to reconsider. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14114#issuecomment-1641435649 From iklam at openjdk.org Wed Jul 19 05:31:19 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 19 Jul 2023 05:31:19 GMT Subject: RFR: 8308903: Print detailed info for Java objects in -Xlog:cds+map [v3] In-Reply-To: References: Message-ID: <5BqTfe26uwDHKB7w1CAF3Vyr2wvTmXbOqSL44AQUmu0=.90324000-dfd7-4657-b922-4d9491c34696@github.com> > This PR adds detailed printing of oop information with `-Xlog:cds+map+oop=trace`, or simply `-Xlog:cds+map*=trace`. The information is useful for debugging contents of the CDS archived heap objects.
> > The output looks like `oopDesc::print_on(tty)`, but we need to print the pointers using the locations of the objects at runtime. The examples below show how a `String` references its `value` array. > > Example with `-XX:-UseCompressedOops`: > > > 0x00000000100001f0: @@ Object java.lang.String > 0x00000000100001f0: 0000006ff6ab8d01 88d47c5b00010290 0000000000010000 0000000010000210 > - klass: 'java/lang/String' 0x0000000800010290 > - ---- fields (total size 4 words): > - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) > - private final 'coder' 'B' @16 0 (0x00) > - private 'hashIsZero' 'Z' @17 false (0x00) > - injected 'flags' 'B' @18 1 (0x01) > - private final 'value' '[B' @24 0x0000000010000210 [B length: 6 > 0x0000000010000210: @@ Object [B length: 6 > 0x0000000010000210: 000000693b708001 00000006000024c8 0000574f5252414e > - klass: {type array byte} 0x00000008000024c8 > - 0: 4e N > - 1: 41 A > - 2: 52 R > - 3: 52 R > - 4: 4f O > - 5: 57 W > > > Example with `-XX:+UseCompressedOops`. 
Note that the narrowOop is also printed: > > > 0x00000007ffc001b8: @@ Object (0xfff80037) java.lang.String > 0x00000007ffc001b8: f6ab8d01 0000006f 00010290 88d47c5b 00010000 fff8003a > - klass: 'java/lang/String' 0x0000000800010290 > - ---- fields (total size 3 words): > - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) > - private final 'coder' 'B' @16 0 (0x00) > - private 'hashIsZero' 'Z' @17 false (0x00) > - injected 'flags' 'B' @18 1 (0x01) > - private final 'value' '[B' @20 0x00000007ffc001d0 (0xfff8003a) [B length: 6 > 0x00000007ffc001d0: @@ Object (0xfff8003a) [B length: 6 > 0x00000007ffc001d0: 3b708001 00000069 000024c8 00000006 5252414e 0000574f > - klass: {type array byte} 0x00000008000024c8 > - 0: 4e N > - 1: 41 A > - 2: 52 R > - 3: 52 R > - 4: 4f O > - 5: 57 W Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: added test case ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14841/files - new: https://git.openjdk.org/jdk/pull/14841/files/14f3d15b..6a898268 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14841&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14841&range=01-02 Stats: 384 lines in 2 files changed: 384 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14841.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14841/head:pull/14841 PR: https://git.openjdk.org/jdk/pull/14841 From iklam at openjdk.org Wed Jul 19 05:36:05 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 19 Jul 2023 05:36:05 GMT Subject: RFR: 8308903: Print detailed info for Java objects in -Xlog:cds+map [v4] In-Reply-To: References: Message-ID: <5kkQK1D8Y2-7Wc2Taj-FNAU7xRwu8a8YdXIdlmSD-p0=.9fec646b-693e-48fd-8e0e-6227a3f176da@github.com> > This PR adds detailed printing of oop information with `-Xlog:cds+map+oop=trace`, or simply `-Xlog:cds+map*=trace`. The information is useful for debugging contents of the CDS archived heap objects.
> > The output looks like `oopDesc::print_on(tty)`, but we need to print the pointers using the locations of the objects at runtime. The examples below show how a `String` references its `value` array. > > Example with `-XX:-UseCompressedOops`: > > > 0x00000000100001f0: @@ Object java.lang.String > 0x00000000100001f0: 0000006ff6ab8d01 88d47c5b00010290 0000000000010000 0000000010000210 > - klass: 'java/lang/String' 0x0000000800010290 > - ---- fields (total size 4 words): > - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) > - private final 'coder' 'B' @16 0 (0x00) > - private 'hashIsZero' 'Z' @17 false (0x00) > - injected 'flags' 'B' @18 1 (0x01) > - private final 'value' '[B' @24 0x0000000010000210 [B length: 6 > 0x0000000010000210: @@ Object [B length: 6 > 0x0000000010000210: 000000693b708001 00000006000024c8 0000574f5252414e > - klass: {type array byte} 0x00000008000024c8 > - 0: 4e N > - 1: 41 A > - 2: 52 R > - 3: 52 R > - 4: 4f O > - 5: 57 W > > > Example with `-XX:+UseCompressedOops`. 
Note that the narrowOop is also printed: > > > 0x00000007ffc001b8: @@ Object (0xfff80037) java.lang.String > 0x00000007ffc001b8: f6ab8d01 0000006f 00010290 88d47c5b 00010000 fff8003a > - klass: 'java/lang/String' 0x0000000800010290 > - ---- fields (total size 3 words): > - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) > - private final 'coder' 'B' @16 0 (0x00) > - private 'hashIsZero' 'Z' @17 false (0x00) > - injected 'flags' 'B' @18 1 (0x01) > - private final 'value' '[B' @20 0x00000007ffc001d0 (0xfff8003a) [B length: 6 > 0x00000007ffc001d0: @@ Object (0xfff8003a) [B length: 6 > 0x00000007ffc001d0: 3b708001 00000069 000024c8 00000006 5252414e 0000574f > - klass: {type array byte} 0x00000008000024c8 > - 0: 4e N > - 1: 41 A > - 2: 52 R > - 3: 52 R > - 4: 4f O > - 5: 57 W Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: added hints in test case ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14841/files - new: https://git.openjdk.org/jdk/pull/14841/files/6a898268..2f916d4c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14841&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14841&range=02-03 Stats: 3 lines in 1 file changed: 3 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14841.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14841/head:pull/14841 PR: https://git.openjdk.org/jdk/pull/14841 From amitkumar at openjdk.org Wed Jul 19 05:46:42 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 19 Jul 2023 05:46:42 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v4] In-Reply-To: <0DNtYn6pTK213izIhgWrWqdzq74lW9aSOKnLwmrDLjw=.035231a5-a754-47b0-899b-6f7f8b07501f@github.com> References: <0DNtYn6pTK213izIhgWrWqdzq74lW9aSOKnLwmrDLjw=.035231a5-a754-47b0-899b-6f7f8b07501f@github.com> Message-ID: <99zMr2US18O0Bs8j1ev6NIQ_TBuY0LHlVwsKZn8DN7U=.f29cc8e2-7bac-4cb5-8701-6110554ab50c@github.com> On Wed, 19 Jul 2023 04:53:53 GMT, sid8606 wrote:
>> src/hotspot/cpu/s390/globalDefinitions_s390.hpp line 34: >> >>> 32: >>> 33: //All faults on s390x give the address only on page granularity. >>> 34: //Set Pdsegfault_address to minumum one page address. >> >> Suggestion: >> >> // All faults on s390x give the address only on page granularity. >> // Set pd_segfault_address to minimum one page address. > > Fixed, Thanks you need to commit these changes, They are not reflected as of now here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14888#discussion_r1267566207 From duke at openjdk.org Wed Jul 19 06:01:08 2023 From: duke at openjdk.org (sid8606) Date: Wed, 19 Jul 2023 06:01:08 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v6] In-Reply-To: References: Message-ID: > All faults on s390x give the address only on page granularity. > e.g. if you use 0x123456 as fail address you get si_addr = 0x123000 sid8606 has updated the pull request incrementally with two additional commits since the last revision: - Add a space - Fix typo ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14888/files - new: https://git.openjdk.org/jdk/pull/14888/files/6a0447fe..93065f41 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=04-05 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14888.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14888/head:pull/14888 PR: https://git.openjdk.org/jdk/pull/14888 From duke at openjdk.org Wed Jul 19 06:01:08 2023 From: duke at openjdk.org (sid8606) Date: Wed, 19 Jul 2023 06:01:08 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v4] In-Reply-To: <99zMr2US18O0Bs8j1ev6NIQ_TBuY0LHlVwsKZn8DN7U=.f29cc8e2-7bac-4cb5-8701-6110554ab50c@github.com> References: <0DNtYn6pTK213izIhgWrWqdzq74lW9aSOKnLwmrDLjw=.035231a5-a754-47b0-899b-6f7f8b07501f@github.com> 
<99zMr2US18O0Bs8j1ev6NIQ_TBuY0LHlVwsKZn8DN7U=.f29cc8e2-7bac-4cb5-8701-6110554ab50c@github.com> Message-ID: On Wed, 19 Jul 2023 05:44:22 GMT, Amit Kumar wrote: >> Fixed, Thanks > > you need to commit these changes, They are not reflected as of now here. ahh, I missed that. Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14888#discussion_r1267572762 From amitkumar at openjdk.org Wed Jul 19 06:08:45 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 19 Jul 2023 06:08:45 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v6] In-Reply-To: References: Message-ID: <9IIH2jz45MG5YGN4N47wyWHK08asYF54sAJ5bzF4zfg=.6bb9b592-4dbb-4573-b3f2-d88cfb0cc519@github.com> On Wed, 19 Jul 2023 06:01:08 GMT, sid8606 wrote: >> All faults on s390x give the address only on page granularity. >> e.g. if you use 0x123456 as fail address you get si_addr = 0x123000 > > sid8606 has updated the pull request incrementally with two additional commits since the last revision: > > - Add a space > - Fix typo src/hotspot/cpu/s390/globalDefinitions_s390.hpp line 34: > 32: > 33: // All faults on s390x give the address only on page granularity. > 34: // Set Pdsegfault_address to miniumum one page address. Suggestion: // Set Pdsegfault_address to minimum one page address. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14888#discussion_r1267579828 From duke at openjdk.org Wed Jul 19 06:27:17 2023 From: duke at openjdk.org (sid8606) Date: Wed, 19 Jul 2023 06:27:17 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v7] In-Reply-To: References: Message-ID: > All faults on s390x give the address only on page granularity. > e.g. 
if you use 0x123456 as fail address you get si_addr = 0x123000 sid8606 has updated the pull request incrementally with one additional commit since the last revision: Fix Typo ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14888/files - new: https://git.openjdk.org/jdk/pull/14888/files/93065f41..d23f02a6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=05-06 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14888.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14888/head:pull/14888 PR: https://git.openjdk.org/jdk/pull/14888 From duke at openjdk.org Wed Jul 19 06:27:17 2023 From: duke at openjdk.org (sid8606) Date: Wed, 19 Jul 2023 06:27:17 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v6] In-Reply-To: <9IIH2jz45MG5YGN4N47wyWHK08asYF54sAJ5bzF4zfg=.6bb9b592-4dbb-4573-b3f2-d88cfb0cc519@github.com> References: <9IIH2jz45MG5YGN4N47wyWHK08asYF54sAJ5bzF4zfg=.6bb9b592-4dbb-4573-b3f2-d88cfb0cc519@github.com> Message-ID: On Wed, 19 Jul 2023 06:05:44 GMT, Amit Kumar wrote: >> sid8606 has updated the pull request incrementally with two additional commits since the last revision: >> >> - Add a space >> - Fix typo > > src/hotspot/cpu/s390/globalDefinitions_s390.hpp line 34: > >> 32: >> 33: // All faults on s390x give the address only on page granularity. >> 34: // Set Pdsegfault_address to miniumum one page address. > > Suggestion: > > // Set Pdsegfault_address to minimum one page address. 
Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14888#discussion_r1267591104 From fyang at openjdk.org Wed Jul 19 06:29:43 2023 From: fyang at openjdk.org (Fei Yang) Date: Wed, 19 Jul 2023 06:29:43 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v4] In-Reply-To: References: Message-ID: <4xUworpfD3Q9DPO1xsf9ZWYM4VfAzhJDld6e8gqsV1E=.45c78e02-0bb8-4162-80f9-7c0f97c243fb@github.com> On Tue, 18 Jul 2023 17:53:46 GMT, Amit Kumar wrote: > @RealFYang and @backwaterred, your approval would be highly appreciated for this PR :-) Seems that this will only make a difference to zero and s390 (I am not familiar with s390). And I see the test will fail with zero build on linux-aarch64 platform. So you might want to distinguish s390 for zero too. I think it will be safer to set `pd_segfault_address` to 4096 only for s390 and keep the original setting for others. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14888#issuecomment-1641491458 From djelinski at openjdk.org Wed Jul 19 07:52:50 2023 From: djelinski at openjdk.org (Daniel Jeliński) Date: Wed, 19 Jul 2023 07:52:50 GMT Subject: Integrated: 8312190: Fix c++11-narrowing warnings in hotspot code In-Reply-To: References: Message-ID: On Mon, 17 Jul 2023 18:31:41 GMT, Daniel Jeliński wrote: > This patch fixes compilation warnings produced by Clang when compiling on Windows. > > Clang emulates MSVC behavior and uses `int` for enumeration types that do not explicitly specify the underlying type. This patch sets an explicit underlying type for 3 enumerations to fix the warnings. > > See Microsoft's documentation of [Zc:enumTypes](https://learn.microsoft.com/en-us/cpp/build/reference/zc-enumtypes?view=msvc-170) for more information. This pull request has now been integrated.
Changeset: f677793d Author: Daniel Jeliński URL: https://git.openjdk.org/jdk/commit/f677793d02a7aa5d01c06023000762b12b8cee91 Stats: 4 lines in 2 files changed: 0 ins; 0 del; 4 mod 8312190: Fix c++11-narrowing warnings in hotspot code Reviewed-by: dholmes, kbarrett ------------- PR: https://git.openjdk.org/jdk/pull/14907 From djelinski at openjdk.org Wed Jul 19 07:52:49 2023 From: djelinski at openjdk.org (Daniel Jeliński) Date: Wed, 19 Jul 2023 07:52:49 GMT Subject: RFR: 8312190: Fix c++11-narrowing warnings in hotspot code In-Reply-To: References: Message-ID: <7v2STKkQMw5UNgAjVYlFPMO6BpoTnsXSXkXMNXXSRMg=.fb773240-2143-44d0-819d-ab5db5d70213@github.com> On Mon, 17 Jul 2023 18:31:41 GMT, Daniel Jeliński wrote: > This patch fixes compilation warnings produced by Clang when compiling on Windows. > > Clang emulates MSVC behavior and uses `int` for enumeration types that do not explicitly specify the underlying type. This patch sets an explicit underlying type for 3 enumerations to fix the warnings. > > See Microsoft's documentation of [Zc:enumTypes](https://learn.microsoft.com/en-us/cpp/build/reference/zc-enumtypes?view=msvc-170) for more information. Mach5 came back clean. Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14907#issuecomment-1641592344 From eastigeevich at openjdk.org Wed Jul 19 09:13:47 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Wed, 19 Jul 2023 09:13:47 GMT Subject: RFR: 8307352: AARCH64: Improve itable_stub [v2] In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 05:37:35 GMT, Boris Ulasevich wrote: >> This is a change for AARCH similar to https://github.com/openjdk/jdk/pull/13460 >> >> The change replaces two separate iterations over the itable with a new algorithm consisting of two loops. First, we look for a match with resolved_klass, checking for a match with holder_klass along the way.
Then we continue iterating (not starting over) the itable using the second loop, checking only for a match with holder_klass. >> >> InterfaceCalls openjdk benchmark performance results on A53, A72, Neoverse N1 and V1 micro-architectures: >> >> >> Cortex-A53 (Pi 3 Model B Rev 1.2) >> >> test1stInt2Types 37.5 37.358 0.38 >> test1stInt3Types 160.166 148.04 8.19 >> test1stInt5Types 158.131 147.955 6.88 >> test2ndInt2Types 52.634 53.291 -1.23 >> test2ndInt3Types 201.39 181.603 10.90 >> test2ndInt5Types 195.722 176.707 10.76 >> testIfaceCall 157.453 140.498 12.07 >> testIfaceExtCall 175.46 154.351 13.68 >> testMonomorphic 32.052 32.039 0.04 >> AVG: 6.85 >> >> Cortex-A72 (Pi 4 Model B Rev 1.2) >> >> test1stInt2Types 27.4796 27.4738 0.02 >> test1stInt3Types 66.0085 64.9374 1.65 >> test1stInt5Types 67.9812 66.2316 2.64 >> test2ndInt2Types 32.0581 32.062 -0.01 >> test2ndInt3Types 68.2715 65.6643 3.97 >> test2ndInt5Types 68.1012 65.8024 3.49 >> testIfaceCall 64.0684 64.1811 -0.18 >> testIfaceExtCall 91.6226 81.5867 12.30 >> testMonomorphic 26.7161 26.7142 0.01 >> AVG: 2.66 >> >> Neoverse N1 (m6g.metal) >> >> test1stInt2Types 2.9104 2.9086 0.06 >> test1stInt3Types 10.9642 10.2909 6.54 >> test1stInt5Types 10.9607 10.2856 6.56 >> test2ndInt2Types 3.3410 3.3478 -0.20 >> test2ndInt3Types 12.3291 11.3089 9.02 >> test2ndInt5Types 12.328 11.2704 9.38 >> testIfaceCall 11.0598 10.3657 6.70 >> testIfaceExtCall 13.0692 11.2826 15.84 >> testMonomorphic 2.2354 2.2341 0.06 >> AVG: 6.00 >> >> Neoverse V1 (c7g.2xlarge) >> >> test1stInt2Types 2.2317 2.2320 -0.01 >> test1stInt3Types 6.6884 6.1911 8.03 >> test1stInt5Types 6.7334 6.2193 8.27 >> test2ndInt2Types 2.4002 2.4013 -0.04 >> test2ndInt3Types 7.9603 7.0372 13.12 >> test2ndInt5Types 7.9532 7.0474 12.85 >> testIfaceCall 6.7028 6.3272 5.94 >> testIfaceExtCall 8.3253 6.941... > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains one commit: > > 8307352: AARCH64: Improve itable_stub src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1217: > 1215: assert_different_registers(method_result, recv_klass, holder_klass, temp_itbl_klass, scan_temp, holder_offset); > 1216: > 1217: int vtable_base = in_bytes(Klass::vtable_start_offset()); Why not use `vtable_start_offset` to be explicit? src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1218: > 1216: > 1217: int vtable_base = in_bytes(Klass::vtable_start_offset()); > 1218: int itentry_off = in_bytes(itableMethodEntry::method_offset()); `itmentry_method_offset`? src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1219: > 1217: int vtable_base = in_bytes(Klass::vtable_start_offset()); > 1218: int itentry_off = in_bytes(itableMethodEntry::method_offset()); > 1219: int scan_step = itableOffsetEntry::size() * wordSize; `itoentry_size_in_bytes`? src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1221: > 1219: int scan_step = itableOffsetEntry::size() * wordSize; > 1220: int vte_size = vtableEntry::size_in_bytes(); > 1221: int ioffset = in_bytes(itableOffsetEntry::interface_offset()); `itoentry_interface_offset`? src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1222: > 1220: int vte_size = vtableEntry::size_in_bytes(); > 1221: int ioffset = in_bytes(itableOffsetEntry::interface_offset()); > 1222: int ooffset = in_bytes(itableOffsetEntry::offset_offset()); `itoentry_offset_offset`?
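As a rough illustration of the two-loop scan described in the PR summary quoted above, here is a minimal C++ sketch. It is not the stub's assembly: `ItableEntry`, `lookup_holder_offset`, the null-terminated array layout and the -1 failure value are all assumptions made for illustration, and offsets are assumed to be non-negative.

```cpp
// Simplified stand-in for an itable offset entry (hypothetical; the real
// HotSpot layout differs).
struct ItableEntry {
  const void* interface_klass;  // nullptr terminates the table
  int offset;                   // assumed non-negative
};

// Returns the offset recorded for holder_klass, or -1 if either
// resolved_klass or holder_klass is missing from the table.
int lookup_holder_offset(const ItableEntry* table,
                         const void* resolved_klass,
                         const void* holder_klass) {
  int holder_offset = -1;
  const ItableEntry* e = table;

  // Loop 1: search for resolved_klass, remembering holder_klass if we pass it.
  for (; e->interface_klass != nullptr; ++e) {
    if (e->interface_klass == holder_klass) holder_offset = e->offset;
    if (e->interface_klass == resolved_klass) break;
  }
  if (e->interface_klass == nullptr) return -1;  // resolved_klass not found
  if (holder_offset >= 0) return holder_offset;  // holder already seen

  // Loop 2: continue from the next entry (no restart), holder_klass only.
  for (++e; e->interface_klass != nullptr; ++e) {
    if (e->interface_klass == holder_klass) return e->offset;
  }
  return -1;  // holder_klass not found
}
```

The point of the sketch is only that the table is walked at most once in total; register allocation and the exact failure paths in the real stub are not modelled.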
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13792#discussion_r1267779597 PR Review Comment: https://git.openjdk.org/jdk/pull/13792#discussion_r1267782148 PR Review Comment: https://git.openjdk.org/jdk/pull/13792#discussion_r1267786493 PR Review Comment: https://git.openjdk.org/jdk/pull/13792#discussion_r1267788162 PR Review Comment: https://git.openjdk.org/jdk/pull/13792#discussion_r1267792916 From eastigeevich at openjdk.org Wed Jul 19 09:23:47 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Wed, 19 Jul 2023 09:23:47 GMT Subject: RFR: 8307352: AARCH64: Improve itable_stub [v2] In-Reply-To: References: Message-ID: <4Zkiwd4YBo1Z_PFWBV6RgSIu95OCQ0-gsZKQoi4q9BY=.4c293358-cf53-4984-a12f-3898d81799ed@github.com> On Tue, 27 Jun 2023 05:37:35 GMT, Boris Ulasevich wrote: >> This is a change for AARCH similar to https://github.com/openjdk/jdk/pull/13460 >> >> The change replaces two separate iterations over the itable with a new algorithm consisting of two loops. First, we look for a match with resolved_klass, checking for a match with holder_klass along the way. Then we continue iterating (not starting over) the itable using the second loop, checking only for a match with holder_klass. 
>> >> InterfaceCalls openjdk benchmark performance results on A53, A72, Neoverse N1 and V1 micro-architectures: >> >> >> Cortex-A53 (Pi 3 Model B Rev 1.2) >> >> test1stInt2Types 37.5 37.358 0.38 >> test1stInt3Types 160.166 148.04 8.19 >> test1stInt5Types 158.131 147.955 6.88 >> test2ndInt2Types 52.634 53.291 -1.23 >> test2ndInt3Types 201.39 181.603 10.90 >> test2ndInt5Types 195.722 176.707 10.76 >> testIfaceCall 157.453 140.498 12.07 >> testIfaceExtCall 175.46 154.351 13.68 >> testMonomorphic 32.052 32.039 0.04 >> AVG: 6.85 >> >> Cortex-A72 (Pi 4 Model B Rev 1.2) >> >> test1stInt2Types 27.4796 27.4738 0.02 >> test1stInt3Types 66.0085 64.9374 1.65 >> test1stInt5Types 67.9812 66.2316 2.64 >> test2ndInt2Types 32.0581 32.062 -0.01 >> test2ndInt3Types 68.2715 65.6643 3.97 >> test2ndInt5Types 68.1012 65.8024 3.49 >> testIfaceCall 64.0684 64.1811 -0.18 >> testIfaceExtCall 91.6226 81.5867 12.30 >> testMonomorphic 26.7161 26.7142 0.01 >> AVG: 2.66 >> >> Neoverse N1 (m6g.metal) >> >> test1stInt2Types 2.9104 2.9086 0.06 >> test1stInt3Types 10.9642 10.2909 6.54 >> test1stInt5Types 10.9607 10.2856 6.56 >> test2ndInt2Types 3.3410 3.3478 -0.20 >> test2ndInt3Types 12.3291 11.3089 9.02 >> test2ndInt5Types 12.328 11.2704 9.38 >> testIfaceCall 11.0598 10.3657 6.70 >> testIfaceExtCall 13.0692 11.2826 15.84 >> testMonomorphic 2.2354 2.2341 0.06 >> AVG: 6.00 >> >> Neoverse V1 (c7g.2xlarge) >> >> test1stInt2Types 2.2317 2.2320 -0.01 >> test1stInt3Types 6.6884 6.1911 8.03 >> test1stInt5Types 6.7334 6.2193 8.27 >> test2ndInt2Types 2.4002 2.4013 -0.04 >> test2ndInt3Types 7.9603 7.0372 13.12 >> test2ndInt5Types 7.9532 7.0474 12.85 >> testIfaceCall 6.7028 6.3272 5.94 >> testIfaceExtCall 8.3253 6.941... > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains one additional commit since the last revision: > > 8307352: AARCH64: Improve itable_stub src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1245: > 1243: cbz(temp_itbl_klass, L_no_such_interface); > 1244: > 1245: // Loop: Look for holder_klass record in itable `// Loop: Look for itableOffsetEntry containing holder_klass` src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1253: > 1251: // } while (temp_itbl_klass != 0); > 1252: // goto L_no_such_interface // Not found. > 1253: Label L_scan_holder; `L_search_holder_class`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13792#discussion_r1267808189 PR Review Comment: https://git.openjdk.org/jdk/pull/13792#discussion_r1267803691 From eastigeevich at openjdk.org Wed Jul 19 09:37:47 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Wed, 19 Jul 2023 09:37:47 GMT Subject: RFR: 8307352: AARCH64: Improve itable_stub [v2] In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 05:37:35 GMT, Boris Ulasevich wrote: >> This is a change for AARCH similar to https://github.com/openjdk/jdk/pull/13460 >> >> The change replaces two separate iterations over the itable with a new algorithm consisting of two loops. First, we look for a match with resolved_klass, checking for a match with holder_klass along the way. Then we continue iterating (not starting over) the itable using the second loop, checking only for a match with holder_klass. 
>> >> InterfaceCalls openjdk benchmark performance results on A53, A72, Neoverse N1 and V1 micro-architectures: >> >> >> Cortex-A53 (Pi 3 Model B Rev 1.2) >> >> test1stInt2Types 37.5 37.358 0.38 >> test1stInt3Types 160.166 148.04 8.19 >> test1stInt5Types 158.131 147.955 6.88 >> test2ndInt2Types 52.634 53.291 -1.23 >> test2ndInt3Types 201.39 181.603 10.90 >> test2ndInt5Types 195.722 176.707 10.76 >> testIfaceCall 157.453 140.498 12.07 >> testIfaceExtCall 175.46 154.351 13.68 >> testMonomorphic 32.052 32.039 0.04 >> AVG: 6.85 >> >> Cortex-A72 (Pi 4 Model B Rev 1.2) >> >> test1stInt2Types 27.4796 27.4738 0.02 >> test1stInt3Types 66.0085 64.9374 1.65 >> test1stInt5Types 67.9812 66.2316 2.64 >> test2ndInt2Types 32.0581 32.062 -0.01 >> test2ndInt3Types 68.2715 65.6643 3.97 >> test2ndInt5Types 68.1012 65.8024 3.49 >> testIfaceCall 64.0684 64.1811 -0.18 >> testIfaceExtCall 91.6226 81.5867 12.30 >> testMonomorphic 26.7161 26.7142 0.01 >> AVG: 2.66 >> >> Neoverse N1 (m6g.metal) >> >> test1stInt2Types 2.9104 2.9086 0.06 >> test1stInt3Types 10.9642 10.2909 6.54 >> test1stInt5Types 10.9607 10.2856 6.56 >> test2ndInt2Types 3.3410 3.3478 -0.20 >> test2ndInt3Types 12.3291 11.3089 9.02 >> test2ndInt5Types 12.328 11.2704 9.38 >> testIfaceCall 11.0598 10.3657 6.70 >> testIfaceExtCall 13.0692 11.2826 15.84 >> testMonomorphic 2.2354 2.2341 0.06 >> AVG: 6.00 >> >> Neoverse V1 (c7g.2xlarge) >> >> test1stInt2Types 2.2317 2.2320 -0.01 >> test1stInt3Types 6.6884 6.1911 8.03 >> test1stInt5Types 6.7334 6.2193 8.27 >> test2ndInt2Types 2.4002 2.4013 -0.04 >> test2ndInt3Types 7.9603 7.0372 13.12 >> test2ndInt5Types 7.9532 7.0474 12.85 >> testIfaceCall 6.7028 6.3272 5.94 >> testIfaceExtCall 8.3253 6.941... > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains one additional commit since the last revision: > > 8307352: AARCH64: Improve itable_stub src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1211: > 1209: Register temp_itbl_klass, > 1210: Register scan_temp, > 1211: Register holder_offset, `holder_klass_itoentry`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13792#discussion_r1267824521 From duke at openjdk.org Wed Jul 19 09:48:07 2023 From: duke at openjdk.org (sid8606) Date: Wed, 19 Jul 2023 09:48:07 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v8] In-Reply-To: References: Message-ID: > All faults on s390x give the address only on page granularity. > e.g. if you use 0x123456 as fail address you get si_addr = 0x123000 sid8606 has updated the pull request incrementally with one additional commit since the last revision: Set pd_segfault_address to 4096 oly for s390x for zero build ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14888/files - new: https://git.openjdk.org/jdk/pull/14888/files/d23f02a6..29d254a1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=06-07 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14888.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14888/head:pull/14888 PR: https://git.openjdk.org/jdk/pull/14888 From duke at openjdk.org Wed Jul 19 09:48:07 2023 From: duke at openjdk.org (sid8606) Date: Wed, 19 Jul 2023 09:48:07 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v7] In-Reply-To: References: Message-ID: On Wed, 19 Jul 2023 06:27:17 GMT, sid8606 wrote: >> All faults on s390x give the address only on page granularity. >> e.g. 
if you use 0x123456 as fail address you get si_addr = 0x123000 > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Fix Typo > With this commit, the test should now pass on architectures other than s390x. Thank you @RealFYang for the review and testing the PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14888#issuecomment-1641767127 From sspitsyn at openjdk.org Wed Jul 19 10:01:46 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 19 Jul 2023 10:01:46 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform [v4] In-Reply-To: References: Message-ID: <3caMuTqUS6CJ_NhT-o_G62gfy3FaDPCKy2A9ZOnbg3U=.670425ac-6e2a-4e6a-96a0-eec53c1ab96a@github.com> On Tue, 18 Jul 2023 16:48:55 GMT, Jean-Philippe Bempel wrote: >> Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in an unresolved state. So the constant pool merging process was considering the entry with pre-resolved class as different compared to the destination and created a new entry. We now try to consider it as equal, especially for Methodref/Fieldref. > > Jean-Philippe Bempel has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > Revert resolved class to unresolved for comparison > > remove is_unresolved_class_mismatch src/hotspot/share/oops/constantPool.cpp line 1295: > 1293: t1 = JVM_CONSTANT_UnresolvedClass; > 1294: } > 1295: Not all consequences of this change are clear to me yet. Lines 1307-1314 are no longer needed. Also, should the same be done for t2 as well?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14780#discussion_r1267851499 From amitkumar at openjdk.org Wed Jul 19 11:06:43 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Wed, 19 Jul 2023 11:06:43 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v8] In-Reply-To: References: Message-ID: On Wed, 19 Jul 2023 09:48:07 GMT, sid8606 wrote: >> All faults on s390x give the address only on page granularity. >> e.g. if you use 0x123456 as fail address you get si_addr = 0x123000 > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Set pd_segfault_address to 4096 oly for s390x for zero build Thanks Sid, LGTM ------------- Marked as reviewed by amitkumar (Committer). PR Review: https://git.openjdk.org/jdk/pull/14888#pullrequestreview-1536846653 From aph at openjdk.org Wed Jul 19 11:25:41 2023 From: aph at openjdk.org (Andrew Haley) Date: Wed, 19 Jul 2023 11:25:41 GMT Subject: RFR: 8310939: [c1] The visibility of write-volatile requires membar_volatile instead of membar In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 12:46:14 GMT, SUN Guoyun wrote: > In C1 today, a volatile write is emitted as: > > membar_release // LoadStore | StoreStore > write volatile > membar > > As in C2, the trailing `membar` here should be defined explicitly as `membar_volatile`, so that RISC-V, PPC and LoongArch can use StoreLoad for `membar_volatile` for better performance. > > Testing: > GHA testing > jtreg tier1-3 for loongarch64 `membar_volatile()` is a really bad name for this operation. It isn't any such thing: on some processors a volatile access requires more than just a storeLoad, and on some processors a storeLoad isn't needed for a volatile store. If you want a storeLoad you should call it storeLoad, please. Calling it `volatile` is just wrong.
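For reference, the barrier sequence under discussion can be approximated in portable C++ as below. This is only a sketch under stated assumptions: `field` and `volatile_store` are hypothetical names, and ISO C++ has no StoreLoad-only fence, so a sequentially consistent fence (a full barrier) stands in for the trailing `membar`.

```cpp
#include <atomic>

// Hypothetical field standing in for a Java volatile field.
std::atomic<int> field{0};

// Sketch of the C1 volatile-write shape quoted above:
//   membar_release; write volatile; membar
void volatile_store(int v) {
  // Orders earlier loads and stores before the store (LoadStore | StoreStore).
  std::atomic_thread_fence(std::memory_order_release);
  field.store(v, std::memory_order_relaxed);
  // Full fence; this is where a StoreLoad-only barrier would be weaker.
  std::atomic_thread_fence(std::memory_order_seq_cst);
}
```

Whether a weaker trailing barrier is sufficient, or measurably faster, depends on the target architecture, which is the question raised in the thread.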
------------- PR Comment: https://git.openjdk.org/jdk/pull/14677#issuecomment-1641902803 From aph at openjdk.org Wed Jul 19 11:25:43 2023 From: aph at openjdk.org (Andrew Haley) Date: Wed, 19 Jul 2023 11:25:43 GMT Subject: RFR: 8310939: [c1] The visibility of write-volatile requires membar_volatile instead of membar In-Reply-To: References: Message-ID: On Thu, 29 Jun 2023 20:47:19 GMT, Martin Doerr wrote: > I wonder if a performance improvement can be measured on any other platform. I guess StoreLoad is typically so slow that other orderings don't matter? I know of no other processor on which StoreLoad is not also a full barrier. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14677#issuecomment-1641903765 From aph at openjdk.org Wed Jul 19 11:28:40 2023 From: aph at openjdk.org (Andrew Haley) Date: Wed, 19 Jul 2023 11:28:40 GMT Subject: RFR: 8310939: [c1] The visibility of write-volatile requires membar_volatile instead of membar In-Reply-To: References: Message-ID: On Fri, 30 Jun 2023 01:28:32 GMT, SUN Guoyun wrote: > Maybe I should write a performance test case for this PR. Seriously? You never measured the speed problem before writing a patch for it? How did you know it was a problem? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14677#issuecomment-1641907421 From fyang at openjdk.org Wed Jul 19 12:23:42 2023 From: fyang at openjdk.org (Fei Yang) Date: Wed, 19 Jul 2023 12:23:42 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v8] In-Reply-To: References: Message-ID: On Wed, 19 Jul 2023 09:48:07 GMT, sid8606 wrote: >> All faults on s390x give the address only on page granularity. >> e.g. if you use 0x123456 as fail address you get si_addr = 0x123000 > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Set pd_segfault_address to 4096 oly for s390x for zero build Also note that the original `segfault_address` for zero build on AIX is -1 instead of 1024. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14888#issuecomment-1641984410 From stuefe at openjdk.org Wed Jul 19 12:41:43 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 19 Jul 2023 12:41:43 GMT Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space [v4] In-Reply-To: References: Message-ID: On Tue, 18 Jul 2023 19:32:30 GMT, Ioi Lam wrote: > I think we should document the interaction with "-Xshare:dump". Maybe we should add comments that the value of `CompressedKlass::base()` is irrelevant to the dumped CDS archive when running "java -Xshared:dump", because of this code. > Okay. Maybe in a different RFE? Since its a bit tangential to what this patch does. > I.e., if CDS is enabled, we always use a non-zero based encoding. Not necessarily. If CDS is enabled and we don't get the preferred mapping address, we will fallback to traditional Klass range reservation and potentially go zero based. With this patch, that path is optimized too. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14867#issuecomment-1642009430 From duke at openjdk.org Wed Jul 19 13:41:59 2023 From: duke at openjdk.org (sid8606) Date: Wed, 19 Jul 2023 13:41:59 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v9] In-Reply-To: References: Message-ID: > All faults on s390x give the address only on page granularity. > e.g. 
if you use 0x123456 as fail address you get si_addr = 0x123000 sid8606 has updated the pull request incrementally with one additional commit since the last revision: Add AIX pd_segfault_address to -1 on zero build ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14888/files - new: https://git.openjdk.org/jdk/pull/14888/files/29d254a1..ec8ed70b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14888&range=07-08 Stats: 7 lines in 1 file changed: 6 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14888.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14888/head:pull/14888 PR: https://git.openjdk.org/jdk/pull/14888 From duke at openjdk.org Wed Jul 19 15:05:49 2023 From: duke at openjdk.org (Wojciech Kudla) Date: Wed, 19 Jul 2023 15:05:49 GMT Subject: RFR: JDK-8305506: Add support for fractional values of SafepointTimeoutDelay [v2] In-Reply-To: References: Message-ID: On Thu, 27 Apr 2023 06:11:44 GMT, David Holmes wrote: >> Would be helpful if you could enable the Pre-submit test (GitHub actions). > > @TheRealMDoerr , @w-kudla I have filed the CSR request for this change and reviewed it. @dholmes-ora is there anything else required from me at this point? What are the next steps for this PR? ------------- PR Comment: https://git.openjdk.org/jdk/pull/13373#issuecomment-1642261729 From duke at openjdk.org Wed Jul 19 17:52:42 2023 From: duke at openjdk.org (sid8606) Date: Wed, 19 Jul 2023 17:52:42 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v9] In-Reply-To: References: Message-ID: <_HVk8njvDxGD-61LZGN2Uxo9jKksaG9Jgmzj7DM__4k=.d0a9fb53-0c31-4ba6-9b77-dd931693e3ba@github.com> On Wed, 19 Jul 2023 13:41:59 GMT, sid8606 wrote: >> All faults on s390x give the address only on page granularity. >> e.g. 
if you use 0x123456 as fail address you get si_addr = 0x123000 > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Add AIX pd_segfault_address to -1 on zero build Thank you for all the reviews which made this PR more mature. Tier1 tests are passing on fastdebug and slowdebug builds. Let's integrate this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14888#issuecomment-1642505544 From tsteele at openjdk.org Wed Jul 19 18:49:51 2023 From: tsteele at openjdk.org (Tyler Steele) Date: Wed, 19 Jul 2023 18:49:51 GMT Subject: RFR: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure [v9] In-Reply-To: References: Message-ID: On Wed, 19 Jul 2023 13:41:59 GMT, sid8606 wrote: >> All faults on s390x give the address only on page granularity. >> e.g. if you use 0x123456 as fail address you get si_addr = 0x123000 > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Add AIX pd_segfault_address to -1 on zero build Marked as reviewed by tsteele (Committer). Ran it on AIX just to be sure, and the test still passes. I have noted one cosmetic suggestion. LGTM ? Ha. I guess I should have refreshed the page before sending my review & comment. src/hotspot/cpu/zero/globalDefinitions_zero.hpp line 45: > 43: const size_t pd_segfault_address = 4096; > 44: #else > 45: const size_t pd_segfault_address = 1024; Format police! I find it's a bit more readable to indent the non-macro lines. #if defined(AIX) const size_t pd_segfault_address = -1; #elif ... 
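The page-granularity behaviour quoted from the PR description can be checked with a little arithmetic. The following is a minimal sketch, assuming 4 KiB pages (matching the 4096 chosen for s390x in this patch); `page_base` is a hypothetical helper written for illustration and is not a HotSpot function.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper for illustration (not part of HotSpot):
// round an address down to the start of its page.
// page_size is assumed to be a power of two.
constexpr std::uintptr_t page_base(std::uintptr_t addr, std::size_t page_size) {
  return addr & ~(static_cast<std::uintptr_t>(page_size) - 1);
}

// The example from the PR description, assuming 4 KiB pages:
// a fault at 0x123456 is reported as 0x123000.
static_assert(page_base(0x123456, 4096) == 0x123000,
              "fault address is reported at page granularity");
```

A test that expects the exact faulting address in si_addr can therefore only pass on such a platform if the chosen fault address is itself page-aligned, which is presumably why 4096 is used for s390x.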
------------- PR Review: https://git.openjdk.org/jdk/pull/14888#pullrequestreview-1537587323 PR Comment: https://git.openjdk.org/jdk/pull/14888#issuecomment-1642581574 PR Comment: https://git.openjdk.org/jdk/pull/14888#issuecomment-1642582731 PR Review Comment: https://git.openjdk.org/jdk/pull/14888#discussion_r1268385859 From duke at openjdk.org Wed Jul 19 18:52:49 2023 From: duke at openjdk.org (sid8606) Date: Wed, 19 Jul 2023 18:52:49 GMT Subject: Integrated: 8312014: [s390x] TestSigInfoInHsErrFile.java Failure In-Reply-To: References: Message-ID: On Fri, 14 Jul 2023 12:44:03 GMT, sid8606 wrote: > All faults on s390x give the address only on page granularity. > e.g. if you use 0x123456 as fail address you get si_addr = 0x123000 This pull request has now been integrated. Changeset: 6f662130 Author: Sidraya Committer: Tyler Steele URL: https://git.openjdk.org/jdk/commit/6f6621303ad54a7dfd880c9472a387706a4466ff Stats: 31 lines in 10 files changed: 29 ins; 0 del; 2 mod 8312014: [s390x] TestSigInfoInHsErrFile.java Failure Reviewed-by: stuefe, amitkumar, tsteele ------------- PR: https://git.openjdk.org/jdk/pull/14888 From dlong at openjdk.org Wed Jul 19 20:39:40 2023 From: dlong at openjdk.org (Dean Long) Date: Wed, 19 Jul 2023 20:39:40 GMT Subject: RFR: 8310939: [c1] The visibility of write-volatile requires membar_volatile instead of membar In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 12:46:14 GMT, SUN Guoyun wrote: > For c1 now, a volatile write case: > > membar_release // LoadStore | StoreStore > write volatile > membar > > Just like c2, here `membar` should be defined `membar_volatile` clearly, then for risc-v, ppc and loongarch can use StoreLoad for `membar_volatile` for better performance. > > Testing: > GHA testing > jtreg tier1-3 for loongarch64 It's confusing to me to rename the post-barrier but not the pre-barrier. 
Instead of changing membar_release + membar to membar_release + membar_volatile, something like volatile_write_pre_barrier + volatile_write_post_barrier in the shared code makes more sense to me. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14677#issuecomment-1642721198 From eastigeevich at openjdk.org Wed Jul 19 20:48:49 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Wed, 19 Jul 2023 20:48:49 GMT Subject: RFR: 8307352: AARCH64: Improve itable_stub [v2] In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 05:37:35 GMT, Boris Ulasevich wrote: >> This is a change for AARCH similar to https://github.com/openjdk/jdk/pull/13460 >> >> The change replaces two separate iterations over the itable with a new algorithm consisting of two loops. First, we look for a match with resolved_klass, checking for a match with holder_klass along the way. Then we continue iterating (not starting over) the itable using the second loop, checking only for a match with holder_klass. 
>> >> InterfaceCalls openjdk benchmark performance results on A53, A72, Neoverse N1 and V1 micro-architectures: >> >> >> Cortex-A53 (Pi 3 Model B Rev 1.2) >> >> test1stInt2Types 37.5 37.358 0.38 >> test1stInt3Types 160.166 148.04 8.19 >> test1stInt5Types 158.131 147.955 6.88 >> test2ndInt2Types 52.634 53.291 -1.23 >> test2ndInt3Types 201.39 181.603 10.90 >> test2ndInt5Types 195.722 176.707 10.76 >> testIfaceCall 157.453 140.498 12.07 >> testIfaceExtCall 175.46 154.351 13.68 >> testMonomorphic 32.052 32.039 0.04 >> AVG: 6.85 >> >> Cortex-A72 (Pi 4 Model B Rev 1.2) >> >> test1stInt2Types 27.4796 27.4738 0.02 >> test1stInt3Types 66.0085 64.9374 1.65 >> test1stInt5Types 67.9812 66.2316 2.64 >> test2ndInt2Types 32.0581 32.062 -0.01 >> test2ndInt3Types 68.2715 65.6643 3.97 >> test2ndInt5Types 68.1012 65.8024 3.49 >> testIfaceCall 64.0684 64.1811 -0.18 >> testIfaceExtCall 91.6226 81.5867 12.30 >> testMonomorphic 26.7161 26.7142 0.01 >> AVG: 2.66 >> >> Neoverse N1 (m6g.metal) >> >> test1stInt2Types 2.9104 2.9086 0.06 >> test1stInt3Types 10.9642 10.2909 6.54 >> test1stInt5Types 10.9607 10.2856 6.56 >> test2ndInt2Types 3.3410 3.3478 -0.20 >> test2ndInt3Types 12.3291 11.3089 9.02 >> test2ndInt5Types 12.328 11.2704 9.38 >> testIfaceCall 11.0598 10.3657 6.70 >> testIfaceExtCall 13.0692 11.2826 15.84 >> testMonomorphic 2.2354 2.2341 0.06 >> AVG: 6.00 >> >> Neoverse V1 (c7g.2xlarge) >> >> test1stInt2Types 2.2317 2.2320 -0.01 >> test1stInt3Types 6.6884 6.1911 8.03 >> test1stInt5Types 6.7334 6.2193 8.27 >> test2ndInt2Types 2.4002 2.4013 -0.04 >> test2ndInt3Types 7.9603 7.0372 13.12 >> test2ndInt5Types 7.9532 7.0474 12.85 >> testIfaceCall 6.7028 6.3272 5.94 >> testIfaceExtCall 8.3253 6.941... > > Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains one additional commit since the last revision: > > 8307352: AARCH64: Improve itable_stub src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 1274: > 1272: // } > 1273: // } while (temp_itbl_klass != 0); > 1274: // goto L_no_such_interface // Not found. How about the following variant, it will have back jumps to one point instead of two points: bind(L_loop_scan_resolved); ldr(temp_itbl_klass, Address(pre(scan_temp, scan_step))); bind(L_loop_scan_resolved_entry); cbz(temp_itbl_klass, L_no_such_interface); cmp(resolved_klass, temp_itbl_klass); br(Assembler::EQ, L_resolved_found); cmp(holder_klass, temp_itbl_klass); br(Assembler::NE, L_loop_scan_resolved); mov(holder_offset, scan_temp); b(L_loop_scan_resolved); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13792#discussion_r1268648723 From pli at openjdk.org Thu Jul 20 04:01:07 2023 From: pli at openjdk.org (Pengfei Li) Date: Thu, 20 Jul 2023 04:01:07 GMT Subject: RFR: 8311130: AArch64: Sync SVE related CPU features with VM options [v3] In-Reply-To: References: Message-ID: > As discussed in PR #14533, keeping AArch64 flag `UseSVE` and its related CPU features in sync helps to simplify rules in IR tests. In this patch, we mask SVE related CPU features off if specified SVE level in VM option is lower than the hardware supported. Also, to support this change, we move the features string construction to the end of the `initialize()` function. > > We also revert IR rule changes in PR #14533 and fix some code styles. We tested almost full jtreg on SVE, SVE2 and non-SVE CPUs and no new issue is found after this patch. 
Pengfei Li has updated the pull request incrementally with one additional commit since the last revision:

  Fix dangling else

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/14897/files
  - new: https://git.openjdk.org/jdk/pull/14897/files/85a3c05d..7b439467

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=14897&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14897&range=01-02

  Stats: 7 lines in 1 file changed: 5 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/14897.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14897/head:pull/14897

PR: https://git.openjdk.org/jdk/pull/14897

From pli at openjdk.org Thu Jul 20 04:01:07 2023
From: pli at openjdk.org (Pengfei Li)
Date: Thu, 20 Jul 2023 04:01:07 GMT
Subject: RFR: 8311130: AArch64: Sync SVE related CPU features with VM options [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 18 Jul 2023 09:48:35 GMT, Andrew Haley wrote:

>> Pengfei Li has updated the pull request incrementally with two additional commits since the last revision:
>>
>>  - Simplify some checks
>>  - Address aph's comment
>
> src/hotspot/cpu/aarch64/vm_version_aarch64.cpp line 580:
>
>> 578:   int buf_used_len = os::snprintf_checked(buf, sizeof(buf), "0x%02x:0x%x:0x%03x:%d", _cpu, _variant, _model, _revision);
>> 579:   if (_model2) os::snprintf_checked(buf + buf_used_len, sizeof(buf) - buf_used_len, "(0x%03x)", _model2);
>> 580: #define ADD_FEATURE_IF_SUPPORTED(id, name, bit) if (VM_Version::supports_##name()) strcat(buf, ", " #name);
>
> ```suggestion
> #define ADD_FEATURE_IF_SUPPORTED(id, name, bit)                 \
>   do {                                                          \
>     if (VM_Version::supports_##name()) strcat(buf, ", " #name); \
>   } while(0);
> ```
>
> Reason: no dangling else in macros, ever.
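To make the "dangling else" concern concrete, here is a small self-contained sketch. The macro and function names (`APPEND_IF_BARE`, `APPEND_IF_WRAPPED`, `with_bare`, `with_wrapped`) are invented for illustration and do not appear in HotSpot; only the do/while(0) pattern itself is taken from the suggestion above.

```cpp
#include <cassert>
#include <string>

// Unsafe: the expansion is a bare `if`, so an `else` written after the
// macro call attaches to this hidden `if`, not to the caller's `if`.
#define APPEND_IF_BARE(cond, out, text) if (cond) (out) += (text)

// Safe: the do/while(0) wrapper turns the expansion into one complete
// statement, so a following `else` cannot attach to the inner `if`.
#define APPEND_IF_WRAPPED(cond, out, text) \
  do {                                     \
    if (cond) (out) += (text);             \
  } while (0)

std::string with_bare(bool outer, bool inner) {
  std::string out;
  if (outer)
    APPEND_IF_BARE(inner, out, "feature");
  else
    out += "outer-false";  // actually pairs with the macro's `if (inner)`
  return out;
}

std::string with_wrapped(bool outer, bool inner) {
  std::string out;
  if (outer)
    APPEND_IF_WRAPPED(inner, out, "feature");
  else
    out += "outer-false";  // pairs with `if (outer)`, as the layout suggests
  return out;
}

// Shows the difference in behaviour between the two variants.
void check_macros() {
  assert(with_bare(true, false) == "outer-false");  // surprising result
  assert(with_bare(false, true).empty());           // else branch is skipped
  assert(with_wrapped(true, false).empty());
  assert(with_wrapped(false, true) == "outer-false");
}
```

Note that the sketch leaves the semicolon to the call site, which is the usual form of the idiom. The suggestion quoted above keeps a trailing semicolon after `while(0)`, presumably because the macro is expanded through a feature-list macro that does not supply one.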
Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14897#discussion_r1268905526 From dholmes at openjdk.org Thu Jul 20 05:13:52 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 20 Jul 2023 05:13:52 GMT Subject: RFR: JDK-8305506: Add support for fractional values of SafepointTimeoutDelay [v5] In-Reply-To: References: Message-ID: <0Ofmblrfv41fpfP7-9Tb-AF27XArCbyWumBjp9U2wHw=.3fed74dd-a0fd-4d21-97dd-787de6993c80@github.com> On Fri, 5 May 2023 11:07:26 GMT, Wojciech Kudla wrote: >> As stated in https://bugs.openjdk.org/browse/JDK-8305506 this change replaces SafepointTimeoutDelay as integer value with a floating point type to support sub-millisecond SafepointTimeout thresholds. >> This is immensely useful for investigating time-to-safepoint issues in low latency space. > > Wojciech Kudla has updated the pull request incrementally with one additional commit since the last revision: > > Adjusted test case to verify integer value This is good to go. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13373#issuecomment-1643204293 From duke at openjdk.org Thu Jul 20 05:13:53 2023 From: duke at openjdk.org (Wojciech Kudla) Date: Thu, 20 Jul 2023 05:13:53 GMT Subject: Integrated: JDK-8305506: Add support for fractional values of SafepointTimeoutDelay In-Reply-To: References: Message-ID: On Thu, 6 Apr 2023 13:23:40 GMT, Wojciech Kudla wrote: > As stated in https://bugs.openjdk.org/browse/JDK-8305506 this change replaces SafepointTimeoutDelay as integer value with a floating point type to support sub-millisecond SafepointTimeout thresholds. > This is immensely useful for investigating time-to-safepoint issues in low latency space. This pull request has now been integrated. 
Changeset: 37c756a7 Author: Wojciech Kudla Committer: David Holmes URL: https://git.openjdk.org/jdk/commit/37c756a7be87153693c919f22d55189f3108ea2e Stats: 18 lines in 3 files changed: 7 ins; 0 del; 11 mod 8305506: Add support for fractional values of SafepointTimeoutDelay Reviewed-by: mdoerr, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/13373 From sspitsyn at openjdk.org Thu Jul 20 06:59:46 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 20 Jul 2023 06:59:46 GMT Subject: RFR: 8300051: assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist Message-ID: This problem is encountered when a JVMTI agent is loaded into running VM. The JvmtiExport::get_jvmti_interface() is called from the agent's Agent_OnAttach entrypoint. To support virtual threads it enables JVMTI notifications from the VirtualThread class with a call to: `JvmtiEnvBase::enable_virtual_threads_notify_jvmti()`. The problem is that there is no JVMTI environments at this point yet. This assert is hit when a virtual thread is created concurrently after the JVMTI notifications have been enabled but the requested JVMTI environment has not been created yet. The fix is to create a JVMTI env first and only then to enable the JVMTI notifications. This issue is very hard to reproduce. I had to use some tricks with adding `os::naked_short_nanosleep()` and also by refactoring the test `VThreadTLSTest.java`. At least, I was able to verify this test does not fail with my fix anymore. 
Testing: - submitted hundreds of `VThreadTLSTest.java` mach5 runs on several platforms in the `fastdebug` mode - in progress: mach5 tiers 1-6 ------------- Commit messages: - 8300051: assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist Changes: https://git.openjdk.org/jdk/pull/14945/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14945&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8300051 Stats: 26 lines in 2 files changed: 14 ins; 12 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14945.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14945/head:pull/14945 PR: https://git.openjdk.org/jdk/pull/14945 From duke at openjdk.org Thu Jul 20 07:16:43 2023 From: duke at openjdk.org (SUN Guoyun) Date: Thu, 20 Jul 2023 07:16:43 GMT Subject: RFR: 8310939: [c1] The visibility of write-volatile requires membar_volatile instead of membar In-Reply-To: References: Message-ID: On Wed, 19 Jul 2023 11:26:12 GMT, Andrew Haley wrote: > > Maybe I should write a performance test case for this PR. > > Seriously? You never measured the speed problem before writing a patch for it? How did you know it was a problem? I've tested specjbb2015 and some related JMH tests, really haven't found a performance improvement. But for the LoongArch architecture, StoreLoad is not equal to full barrier. And compared to C2's volatile write `MembarRelease-StoreNode-MembarVolatile`implementation, it is argued that c1 here membar() should be membar_storeload(). I previously thought that PPC, RISC-V's StoreLoad was not equivalent to full barrier, so I submitted the PR. Now it seems that only LoongArch has this problem, so should I make the following changes or abandon this PR?

--- a/src/hotspot/share/gc/shared/c1/barrierSetC1.cpp
+++ b/src/hotspot/share/gc/shared/c1/barrierSetC1.cpp
@@ -161,7 +161,7 @@ void BarrierSetC1::store_at_resolved(LIRAccess& access, LIR_Opr value) {
   }
 
   if (is_volatile && !support_IRIW_for_not_multiple_copy_atomic_cpu) {
-    __ membar();
+    __ membar_storeload();
   }
 }
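
For readers less familiar with the barrier names, the volatile-write sequence under discussion can be approximated in portable C++. This is only an illustration under stated assumptions, not HotSpot code: `volatile_style_store` and `smoke_check` are made-up names, and because standard C++ has no fence that provides only StoreLoad ordering, the trailing fence below is a full `seq_cst` fence, which corresponds to the existing `membar()` rather than the proposed `membar_storeload()`.

```cpp
#include <atomic>
#include <cassert>

// Illustration only (not HotSpot code): a volatile-style store with a
// barrier before and after it, mirroring the sequence in this thread.
inline void volatile_style_store(std::atomic<int>& field, int value) {
  // Roughly membar_release: earlier loads and stores may not be
  // reordered after the store below (LoadStore | StoreStore).
  std::atomic_thread_fence(std::memory_order_release);

  field.store(value, std::memory_order_relaxed);

  // Trailing barrier. A seq_cst fence is a full fence, so it includes
  // StoreLoad ordering along with the other three orderings.
  std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Single-threaded smoke check: the fences do not alter the stored value.
inline void smoke_check() {
  std::atomic<int> field{0};
  volatile_style_store(field, 42);
  assert(field.load(std::memory_order_relaxed) == 42);
}
```

Whether a StoreLoad-only barrier is cheaper than the full one is architecture-specific, which is the point raised in the replies.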
------------- PR Comment: https://git.openjdk.org/jdk/pull/14677#issuecomment-1643400051 From dholmes at openjdk.org Thu Jul 20 07:30:42 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 20 Jul 2023 07:30:42 GMT Subject: RFR: 8300051: assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 06:53:26 GMT, Serguei Spitsyn wrote: > This problem is encountered when a JVMTI agent is loaded into running VM. The JvmtiExport::get_jvmti_interface() is called from the agent's Agent_OnAttach entrypoint. To support virtual threads it enables JVMTI notifications from the VirtualThread class with a call to: > `JvmtiEnvBase::enable_virtual_threads_notify_jvmti()`. > The problem is that there is no JVMTI environments at this point yet. This assert is hit when a virtual thread is created concurrently after the JVMTI notifications have been enabled but the requested JVMTI environment has not been created yet. > The fix is to create a JVMTI env first and only then to enable the JVMTI notifications. > > This issue is very hard to reproduce. I had to use some tricks with adding `os::naked_short_nanosleep()` and also by refactoring the test `VThreadTLSTest.java`. At least, I was able to verify this test does not fail with my fix anymore. > > Testing: > - submitted hundreds of `VThreadTLSTest.java` mach5 runs on several platforms in the `fastdebug` mode > - in progress: mach5 tiers 1-6 This seems a reasonable solution. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14945#pullrequestreview-1538610965

From mbaesken at openjdk.org Thu Jul 20 07:50:50 2023
From: mbaesken at openjdk.org (Matthias Baesken)
Date: Thu, 20 Jul 2023 07:50:50 GMT
Subject: RFR: JDK-8312395: Improve assertions in growableArray
Message-ID:

There are a number of assertions in growableArray , for example to check for a valid index to access/remove elements.
Those assertions can be improved, e.g. by showing the bad index value used in case of failure.

Example for an assertion in the 'at(index i)' access method
old assertion looks like
assert(0 <= i && i < _len) failed: illegal index

new assertion looks like
assert(0 <= i && i < _len) failed: illegal index -559030609, 1 accessible elements

-------------

Commit messages:
 - JDK-8312395

Changes: https://git.openjdk.org/jdk/pull/14946/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14946&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8312395
  Stats: 10 lines in 1 file changed: 0 ins; 0 del; 10 mod
  Patch: https://git.openjdk.org/jdk/pull/14946.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14946/head:pull/14946

PR: https://git.openjdk.org/jdk/pull/14946

From dholmes at openjdk.org Thu Jul 20 08:01:42 2023
From: dholmes at openjdk.org (David Holmes)
Date: Thu, 20 Jul 2023 08:01:42 GMT
Subject: RFR: JDK-8312395: Improve assertions in growableArray
In-Reply-To:
References:
Message-ID:

On Thu, 20 Jul 2023 07:43:51 GMT, Matthias Baesken wrote:

> There are a number of assertions in growableArray , for example to check for a valid index to access/remove elements.
> Those assertions can be improved, e.g. by showing the bad index value used in case of failure.
>
> Example for an assertion in the 'at(index i)' access method
> old assertion looks like
> assert(0 <= i && i < _len) failed: illegal index
>
> new assertion looks like
> assert(0 <= i && i < _len) failed: illegal index -559030609, 1 accessible elements

Approval in principle but I have a suggestion.

Thanks

src/hotspot/share/utilities/growableArray.hpp line 145:

> 143:
> 144:   E& at(int i) {
> 145:     assert(0 <= i && i < _len, "illegal index %d, %d accessible elements", i, _len);

How about:

"Illegal index %d for length %d"

?

-------------

Changes requested by dholmes (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/14946#pullrequestreview-1538665870
PR Review Comment: https://git.openjdk.org/jdk/pull/14946#discussion_r1269080964

From mbaesken at openjdk.org Thu Jul 20 08:06:40 2023
From: mbaesken at openjdk.org (Matthias Baesken)
Date: Thu, 20 Jul 2023 08:06:40 GMT
Subject: RFR: JDK-8312395: Improve assertions in growableArray
In-Reply-To:
References:
Message-ID:

On Thu, 20 Jul 2023 07:56:55 GMT, David Holmes wrote:

>> There are a number of assertions in growableArray , for example to check for a valid index to access/remove elements.
>> Those assertions can be improved, e.g. by showing the bad index value used in case of failure.
>>
>> Example for an assertion in the 'at(index i)' access method
>> old assertion looks like
>> assert(0 <= i && i < _len) failed: illegal index
>>
>> new assertion looks like
>> assert(0 <= i && i < _len) failed: illegal index -559030609, 1 accessible elements
>
> src/hotspot/share/utilities/growableArray.hpp line 145:
>
>> 143:
>> 144:   E& at(int i) {
>> 145:     assert(0 <= i && i < _len, "illegal index %d, %d accessible elements", i, _len);
>
> How about:
>
> "Illegal index %d for length %d"
>
> ?

That sounds like a good suggestion!
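
As a rough illustration of what the agreed wording produces, the sketch below formats the message outside of HotSpot. HotSpot's own `assert` macro accepts a printf-style format directly; standard C++ `assert` does not, so the helper names here (`illegal_index_message`, `index_in_bounds`) are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: builds the diagnostic text proposed in the review.
// 64 bytes is enough for the fixed text plus two decimal ints.
std::string illegal_index_message(int index, int length) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "Illegal index %d for length %d", index, length);
  return std::string(buf);
}

// The condition guarded by the assertion in at(int i).
bool index_in_bounds(int index, int length) {
  return 0 <= index && index < length;
}

// Reuses the bad index value from the example in the PR description.
void check_message() {
  assert(!index_in_bounds(-559030609, 1));
  assert(illegal_index_message(-559030609, 1) ==
         "Illegal index -559030609 for length 1");
}
```

Reporting both the index and the length makes it possible to tell an uninitialized index from an off-by-one error by reading the hs_err output alone.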
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14946#discussion_r1269093770 From rehn at openjdk.org Thu Jul 20 08:28:55 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 20 Jul 2023 08:28:55 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v12] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Fri, 14 Jul 2023 20:03:24 GMT, Thomas Stuefe wrote: >> This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. >> >> --------------- >> >> This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. >> >> ### Background: >> >> The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. >> >> This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. >> >> To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. 
That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. >> >> #### GLIBC internals >> >> The following information I took from the glibc source code and experimenting. >> >> ##### Why do we need to trim manually? Does the Glibc not trim on free? >> >> Upon `free()`, glibc may return memory to the OS if: >> - the returned block was mmap'ed >> - the returned block was not added to tcache or to fastbins >> - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: >> a) for the main arena, glibc attempts to lower the brk() >> b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. >> In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. >> >> So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. >> >> To increase the ... > > Thomas Stuefe has updated the pull request incrementally with two additional commits since the last revision: > > - Display unsupported text with UL warning level, + test > - Last Aleksey Feedback Marked as reviewed by rehn (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14781#pullrequestreview-1538731282 From vkempik at openjdk.org Thu Jul 20 08:31:56 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Thu, 20 Jul 2023 08:31:56 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v3] In-Reply-To: References: Message-ID: > Please review this fix. it eliminates misaligned loads in String.compare on risc-v > > for small compares ( <= 72 bytes), the instrinsic in c2_MacroAssembler.cpp is used, > it reads ( in case of UU/LL) 8 bytes per loop, and at then end, it reads tail - misaligned load of last 8 bytes from the string. 
>
> so if string length is not 8x bytes long then last load is misaligned, also it performs read/compare of some data which already was processed.
>
> I have changed that to compare only last length%8 bytes using SHORT_STRING part of intrinsic for UL/LU. But for UU/LL I have made an optimised version.
>
> Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947) I've got no perf drop on thead ( with +AvoidUnalignedAccesses) which supports misaligned access.
>
> Improvements to inflate_XX() methods gives 3-5% improvements for UL/LU cases on thead, almost no perf change on hifive.
>
> for large strings, the intrinsics from stubGenerator.cpp is used
> for UU/LL - generate_compare_long_string_same_encoding, I have just replaced misaligned ld with load_long_misaligned. Since this tail reading is not on hot path, this give some small penalty for thead when -XX:+AvoidUnalignedAccesses.
>
> large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp:
> These changes are partially based on feilongjiang's patch, but I have changed tail reading to prevent reading past the end of string. I have observed no perf difference between feilongjiang's and my version.
>
> This also enables regression test for string.Compare which previously was aarch64-only
>
> Testing: tier1 and tier2 clean on hifive.
>
> JMH testing, hifive:
> before:
>
> Benchmark                                   (delta)  (size)  Mode  Cnt    Score   Error  Units
> StringCompareToDifferentLength.compareToLL        2      24  avgt    9    6.474 ± 1.475  ms/op
> StringCompareToDifferentLength.compareToLL        2      36  avgt    9  125.823 ± 1.947  ms/op
> StringCompareToDifferentLength.compareToLL        2      72  avgt    9   10.512 ± 0.236  ms/op
> StringCompareToDifferentLength.compareToLL        2     128  avgt    9   13.032 ± 0.821  ms/op
> StringCompareToDifferentLength.compareToLL        2     256  avgt    9   18.983 ± 0.318  ms/op
> StringCompareToDifferentLength.compareToLL        2     512  avgt    9   29.925 ± 0.458  ms/op
> StringCompareToDifferentLength.compareToLL        2 ...

Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision:

  Simplify case for long LU UL compares

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/14534/files
  - new: https://git.openjdk.org/jdk/pull/14534/files/3784bae9..054a885f

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=14534&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14534&range=01-02

  Stats: 50 lines in 1 file changed: 0 ins; 30 del; 20 mod
  Patch: https://git.openjdk.org/jdk/pull/14534.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/14534/head:pull/14534

PR: https://git.openjdk.org/jdk/pull/14534

From mbaesken at openjdk.org Thu Jul 20 08:47:49 2023
From: mbaesken at openjdk.org (Matthias Baesken)
Date: Thu, 20 Jul 2023 08:47:49 GMT
Subject: RFR: JDK-8312395: Improve assertions in growableArray [v2]
In-Reply-To:
References:
Message-ID:

> There are a number of assertions in growableArray , for example to check for a valid index to access/remove elements.
> Those assertions can be improved, e.g. by showing the bad index value used in case of failure.
> > Example for an assertion in the 'at(index i)' access method > old assertion looks like > assert(0 <= i && i < _len) failed: illegal index > > new assertion looks like > assert(0 <= i && i < _len) failed: illegal index -559030609, 1 accessible elements Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: adjust some output and some asserts ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14946/files - new: https://git.openjdk.org/jdk/pull/14946/files/16d7fabe..d4e3ba68 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14946&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14946&range=00-01 Stats: 8 lines in 1 file changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/14946.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14946/head:pull/14946 PR: https://git.openjdk.org/jdk/pull/14946 From mbaesken at openjdk.org Thu Jul 20 08:48:40 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 20 Jul 2023 08:48:40 GMT Subject: RFR: JDK-8312395: Improve assertions in growableArray In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 07:43:51 GMT, Matthias Baesken wrote: > There are a number of assertions in growableArray , for example to check for a valid index to access/remove elements. > Those assertions can be improved, e.g. by showing the bad index value used in case of failure. > > Example for an assertion in the 'at(index i)' access method > old assertion looks like > assert(0 <= i && i < _len) failed: illegal index > > new assertion looks like > assert(0 <= i && i < _len) failed: illegal index -559030609, 1 accessible elements I adjusted the output following your suggestion and also added some output to a few more asserts. Btw any idea why so much 'this->_len' syntax is used at some places and not just _len or other field names ? 
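On the `this->_len` question: one reason C++ code ends up spelling members this way is two-phase name lookup. Inside a class template, unqualified names are not looked up in a base class that depends on the template parameter, so members inherited from such a base have to be reached through `this->` (or a qualified name). The following is a minimal sketch of that rule together with an index check that reports the offending index and the length. The class names (`ArrayBase`, `ArrayView`, `CheckedArray`) and the helper `index_message` are hypothetical stand-ins for illustration, not the actual growableArray.hpp code.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical stand-ins for illustration; these are not the HotSpot classes.
class ArrayBase {
 protected:
  int _len;
  explicit ArrayBase(int len) : _len(len) {}
};

template <typename E>
class ArrayView : public ArrayBase {
 protected:
  E* _data;
  ArrayView(E* data, int len) : ArrayBase(len), _data(data) {}

 public:
  // ArrayBase does not depend on E, so unqualified _len is found here.
  int length() const { return _len; }
};

template <typename E>
class CheckedArray : public ArrayView<E> {
 public:
  CheckedArray(E* data, int len) : ArrayView<E>(data, len) {}

  // ArrayView<E> is a dependent base: unqualified _len or _data would not be
  // looked up in it, so the members are reached through this->.
  bool is_valid_index(int i) const { return 0 <= i && i < this->_len; }

  // Builds a message in the style of the improved assertion text.
  std::string index_message(int i) const {
    char buf[96];
    std::snprintf(buf, sizeof(buf), "illegal index %d, %d accessible elements",
                  i, this->_len);
    return std::string(buf);
  }

  E& at(int i) {
    assert(is_valid_index(i) && "illegal index");
    return this->_data[i];
  }
};

// Small driver: sums three elements through the checked accessor.
int demo_sum() {
  int storage[3] = {10, 20, 30};
  CheckedArray<int> a(storage, 3);
  int sum = 0;
  for (int i = 0; i < a.length(); i++) sum += a.at(i);
  return sum;
}
```

In this sketch, replacing `this->_len` with plain `_len` inside `CheckedArray` is expected to be rejected by a conforming compiler, while `ArrayView` can use `_len` directly because its base is not a template.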
------------- PR Comment: https://git.openjdk.org/jdk/pull/14946#issuecomment-1643525521 From vkempik at openjdk.org Thu Jul 20 08:54:43 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Thu, 20 Jul 2023 08:54:43 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v3] In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 08:31:56 GMT, Vladimir Kempik wrote: >> Please review this fix. It eliminates misaligned loads in String.compare on risc-v >> >> for small compares ( <= 72 bytes), the intrinsic in c2_MacroAssembler.cpp is used, >> it reads (in case of UU/LL) 8 bytes per loop, and at the end, it reads tail - misaligned load of last 8 bytes from the string. >> >> so if string length is not 8x bytes long then last load is misaligned, also it performs read/compare of some data which already was processed. >> >> I have changed that to compare only last length%8 bytes using SHORT_STRING part of intrinsic for UL/LU. But for UU/LL I have made an optimised version. >> >> Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947) I've got no perf drop on thead (with +AvoidUnalignedAccesses) which supports misaligned access. >> >> Improvements to inflate_XX() methods give 3-5% improvements for UL/LU cases on thead, almost no perf change on hifive. >> >> for large strings, the intrinsics from stubGenerator.cpp are used >> for UU/LL - generate_compare_long_string_same_encoding, I have just replaced misaligned ld with load_long_misaligned. Since this tail reading is not on hot path, this gives some small penalty for thead when -XX:+AvoidUnalignedAccesses. >> >> large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp: >> These changes are partially based on feilongjiang's patch, but I have changed tail reading to prevent reading past the end of string.
I have observed no perf difference between feilongjiang's and my version.
>>
>> This also enables regression test for string.Compare which previously was aarch64-only
>>
>> Testing: tier1 and tier2 clean on hifive.
>>
>> JMH testing, hifive:
>> before:
>>
>> Benchmark                                   (delta)  (size)  Mode  Cnt    Score   Error  Units
>> StringCompareToDifferentLength.compareToLL        2      24  avgt    9    6.474 ± 1.475  ms/op
>> StringCompareToDifferentLength.compareToLL        2      36  avgt    9  125.823 ± 1.947  ms/op
>> StringCompareToDifferentLength.compareToLL        2      72  avgt    9   10.512 ± 0.236  ms/op
>> StringCompareToDifferentLength.compareToLL        2     128  avgt    9   13.032 ± 0.821  ms/op
>> StringCompareToDifferentLength.compareToLL        2     256  avgt    9   18.983 ± 0.318  ms/op
>> StringCompareToDifferentLength.compareToLL        2     512  avgt    9   29.925 ± ...
>
> Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision:
>
> Simplify case for long LU UL compares

Results for latest update, from thead

+AvoidUnaligned

Benchmark                                   (delta)  (size)  Mode  Cnt    Score   Error  Units
StringCompareToDifferentLength.compareToLL        2      24  avgt    9    4.000 ± 0.106  ms/op
StringCompareToDifferentLength.compareToLL        2      36  avgt    9    4.562 ± 0.089  ms/op
StringCompareToDifferentLength.compareToLL        2      72  avgt    9    7.536 ± 0.085  ms/op
StringCompareToDifferentLength.compareToLL        2     128  avgt    9   10.341 ± 0.287  ms/op
StringCompareToDifferentLength.compareToLL        2     256  avgt    9   15.275 ± 0.249  ms/op
StringCompareToDifferentLength.compareToLL        2     512  avgt    9   21.731 ± 0.413  ms/op
StringCompareToDifferentLength.compareToLL        2     520  avgt    9   20.255 ± 0.287  ms/op
StringCompareToDifferentLength.compareToLL        2     523  avgt    9   22.114 ± 0.641  ms/op
StringCompareToDifferentLength.compareToLU        2      24  avgt    9    7.615 ± 0.032  ms/op
StringCompareToDifferentLength.compareToLU        2      36  avgt    9   10.566 ± 0.096  ms/op
StringCompareToDifferentLength.compareToLU        2      72  avgt    9   21.975 ± 0.288  ms/op
StringCompareToDifferentLength.compareToLU        2     128  avgt    9   36.078 ± 0.419  ms/op
StringCompareToDifferentLength.compareToLU        2     256  avgt    9   65.567 ± 0.715  ms/op
StringCompareToDifferentLength.compareToLU        2     512  avgt    9  124.196 ± 0.636  ms/op
StringCompareToDifferentLength.compareToLU        2     520  avgt    9  126.580 ± 1.431  ms/op
StringCompareToDifferentLength.compareToLU        2     523  avgt    9  129.830 ± 1.857  ms/op
StringCompareToDifferentLength.compareToUL        2      24  avgt    9   10.386 ± 0.368  ms/op
StringCompareToDifferentLength.compareToUL        2      36  avgt    9   12.981 ± 0.271  ms/op
StringCompareToDifferentLength.compareToUL        2      72  avgt    9   23.726 ± 0.532  ms/op
StringCompareToDifferentLength.compareToUL        2     128  avgt    9   37.997 ± 0.482  ms/op
StringCompareToDifferentLength.compareToUL        2     256  avgt    9   67.834 ± 0.915  ms/op
StringCompareToDifferentLength.compareToUL        2     512  avgt    9  126.500 ± 0.771  ms/op
StringCompareToDifferentLength.compareToUL        2     520  avgt    9  128.853 ± 2.059  ms/op
StringCompareToDifferentLength.compareToUL        2     523  avgt    9  132.825 ± 3.318  ms/op
StringCompareToDifferentLength.compareToUU        2      24  avgt    9    4.013 ± 0.012  ms/op
StringCompareToDifferentLength.compareToUU        2      36  avgt    9    4.845 ± 0.148  ms/op
StringCompareToDifferentLength.compareToUU        2      72  avgt    9   10.276 ± 0.313  ms/op
StringCompareToDifferentLength.compareToUU        2     128  avgt    9   14.338 ± 0.201  ms/op
StringCompareToDifferentLength.compareToUU        2     256  avgt    9   20.912 ± 0.550  ms/op
StringCompareToDifferentLength.compareToUU        2     512  avgt    9   34.264 ± 0.660  ms/op
StringCompareToDifferentLength.compareToUU        2     520  avgt    9   34.557 ± 0.252  ms/op
StringCompareToDifferentLength.compareToUU        2     523  avgt    9   34.841 ± 0.380  ms/op

-AvoidUnaligned

Benchmark                                   (delta)  (size)  Mode  Cnt    Score   Error  Units
StringCompareToDifferentLength.compareToLL        2      24  avgt    9    2.557 ± 0.034  ms/op
StringCompareToDifferentLength.compareToLL        2      36  avgt    9    3.507 ± 0.035  ms/op
StringCompareToDifferentLength.compareToLL        2      72  avgt    9    7.513 ± 0.033  ms/op
StringCompareToDifferentLength.compareToLL        2     128  avgt    9    9.095 ± 0.210  ms/op
StringCompareToDifferentLength.compareToLL        2     256  avgt    9   13.666 ± 0.134  ms/op
StringCompareToDifferentLength.compareToLL        2     512  avgt    9   20.131 ± 0.234  ms/op
StringCompareToDifferentLength.compareToLL        2     520  avgt    9   20.115 ± 0.065  ms/op
StringCompareToDifferentLength.compareToLL        2     523  avgt    9   20.865 ± 0.224  ms/op
StringCompareToDifferentLength.compareToLU        2      24  avgt    9    7.091 ± 0.067  ms/op
StringCompareToDifferentLength.compareToLU        2      36  avgt    9    9.883 ± 0.109  ms/op
StringCompareToDifferentLength.compareToLU        2      72  avgt    9   22.037 ± 0.327  ms/op
StringCompareToDifferentLength.compareToLU        2     128  avgt    9   35.914 ± 0.307  ms/op
StringCompareToDifferentLength.compareToLU        2     256  avgt    9   65.673 ± 1.075  ms/op
StringCompareToDifferentLength.compareToLU        2     512  avgt    9  124.257 ± 0.722  ms/op
StringCompareToDifferentLength.compareToLU        2     520  avgt    9  126.128 ± 0.453  ms/op
StringCompareToDifferentLength.compareToLU        2     523  avgt    9  129.413 ± 1.567  ms/op
StringCompareToDifferentLength.compareToUL        2      24  avgt    9    9.661 ± 0.440  ms/op
StringCompareToDifferentLength.compareToUL        2      36  avgt    9   12.106 ± 0.290  ms/op
StringCompareToDifferentLength.compareToUL        2      72  avgt    9   23.903 ± 0.441  ms/op
StringCompareToDifferentLength.compareToUL        2     128  avgt    9   38.722 ± 1.049  ms/op
StringCompareToDifferentLength.compareToUL        2     256  avgt    9   67.640 ± 0.957  ms/op
StringCompareToDifferentLength.compareToUL        2     512  avgt    9  126.744 ± 1.904  ms/op
StringCompareToDifferentLength.compareToUL        2     520  avgt    9  129.400 ± 2.463  ms/op
StringCompareToDifferentLength.compareToUL        2     523  avgt    9  130.664 ± 1.380  ms/op
StringCompareToDifferentLength.compareToUU        2      24  avgt    9    3.662 ± 0.166  ms/op
StringCompareToDifferentLength.compareToUU        2      36  avgt    9    4.552 ± 0.217  ms/op
StringCompareToDifferentLength.compareToUU        2      72  avgt    9    9.399 ± 0.270  ms/op
StringCompareToDifferentLength.compareToUU        2     128  avgt    9   13.688 ± 0.294  ms/op
StringCompareToDifferentLength.compareToUU        2     256  avgt    9   20.033 ± 0.290  ms/op
StringCompareToDifferentLength.compareToUU        2     512  avgt    9   33.512 ± 0.433  ms/op
StringCompareToDifferentLength.compareToUU        2     520  avgt    9   33.796 ± 0.435  ms/op
StringCompareToDifferentLength.compareToUU        2     523  avgt    9   33.983 ± 0.152  ms/op

hifive results to follow soon

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14534#issuecomment-1643534917

From matthias.baesken at sap.com Thu Jul 20 08:56:21 2023
From: matthias.baesken at sap.com (Baesken, Matthias)
Date: Thu, 20 Jul 2023 08:56:21 +0000
Subject: growableArray.hpp remove_if_existing
Message-ID:

Hello, I noticed that the method remove_if_existing from growableArray.hpp

GrowableArrayView :

  bool remove_if_existing(const E& elem) {
    // Returns TRUE if elem is removed.
    for (int i = 0; i < _len; i++) {
      if (_data[i] == elem) {
        remove_at(i);
        return true;
      }
    }
    return false;
  }

Just removes the first element it finds and afterwards returns.
I found this a bit surprising - should the method better be named remove_first_if_existing to clarify that it removed no more than one element ?
Or is there a rule that no more than 1 element in the growableArray can match the '==' ?

Best regards, Matthias

From stuefe at openjdk.org Thu Jul 20 09:05:52 2023
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Thu, 20 Jul 2023 09:05:52 GMT
Subject: RFR: JDK-8312395: Improve assertions in growableArray [v2]
In-Reply-To:
References:
Message-ID:

On Thu, 20 Jul 2023 08:47:49 GMT, Matthias Baesken wrote:

>> There are a number of assertions in growableArray , for example to check for a valid index to access/remove elements.
>> Those assertions can be improved, e.g. by showing the bad index value used in case of failure.
>> >> Example for an assertion in the 'at(index i)' access method >> old assertion looks like >> assert(0 <= i && i < _len) failed: illegal index >> >> new assertion looks like >> assert(0 <= i && i < _len) failed: illegal index -559030609, 1 accessible elements > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > adjust some output and some asserts Looks good! looks > I adjusted the output following your suggestion and also added some output to a few more asserts. > > Btw any idea why so much 'this->_len' syntax is used at some places and not just _len or other field names ? Probably some IDE-code-completion-related artifacts. E.g. typing "this->" in CDT gives me all members, and maybe someone just hit Enter then and took the full completion ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14946#pullrequestreview-1538797067 PR Comment: https://git.openjdk.org/jdk/pull/14946#issuecomment-1643551305 From aph at openjdk.org Thu Jul 20 09:19:44 2023 From: aph at openjdk.org (Andrew Haley) Date: Thu, 20 Jul 2023 09:19:44 GMT Subject: RFR: 8311130: AArch64: Sync SVE related CPU features with VM options [v3] In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 04:01:07 GMT, Pengfei Li wrote: >> As discussed in PR #14533, keeping AArch64 flag `UseSVE` and its related CPU features in sync helps to simplify rules in IR tests. In this patch, we mask SVE related CPU features off if specified SVE level in VM option is lower than the hardware supported. Also, to support this change, we move the features string construction to the end of the `initialize()` function. >> >> We also revert IR rule changes in PR #14533 and fix some code styles. We tested almost full jtreg on SVE, SVE2 and non-SVE CPUs and no new issue is found after this patch. 
> > Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: > > Fix dangling else Marked as reviewed by aph (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14897#pullrequestreview-1538826239 From xgong at openjdk.org Thu Jul 20 09:25:44 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Thu, 20 Jul 2023 09:25:44 GMT Subject: RFR: 8311130: AArch64: Sync SVE related CPU features with VM options [v3] In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 04:01:07 GMT, Pengfei Li wrote: >> As discussed in PR #14533, keeping AArch64 flag `UseSVE` and its related CPU features in sync helps to simplify rules in IR tests. In this patch, we mask SVE related CPU features off if specified SVE level in VM option is lower than the hardware supported. Also, to support this change, we move the features string construction to the end of the `initialize()` function. >> >> We also revert IR rule changes in PR #14533 and fix some code styles. We tested almost full jtreg on SVE, SVE2 and non-SVE CPUs and no new issue is found after this patch. > > Pengfei Li has updated the pull request incrementally with one additional commit since the last revision: > > Fix dangling else LGTM! Thanks for the fix! ------------- Marked as reviewed by xgong (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/14897#pullrequestreview-1538836197 From xgong at openjdk.org Thu Jul 20 09:26:46 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Thu, 20 Jul 2023 09:26:46 GMT Subject: RFR: 8306136: [vectorapi] Intrinsics of VectorMask.laneIsSet() [v2] In-Reply-To: References: <74WpJFbQAX7TMMMr-qK9nUcS_lxHHbJEmzTuWbpahfk=.97501257-dd82-43ce-b334-fb6caab35118@github.com> Message-ID: <406_YGENnrONPwIbKsapI9AQR0_Il8qdszHXGo0DpP0=.6733371f-c079-4ecc-ae88-14156d276d41@github.com> On Tue, 4 Jul 2023 01:03:20 GMT, Eric Liu wrote: >> VectorMask.laneIsSet() [1] is implemented based on VectorMask.toLong() [2], and it's performance highly depends on the intrinsification of toLong(). However, if `toLong()` is failed to intrinsify, on some architectures or unsupported species, it's much more expensive than pure getBits(). Besides, some CPUs (e.g. with Arm Neon) may not have efficient instructions to implementation toLong(), so we propose to intrinsify VectorMask.laneIsSet separately. >> >> This patch optimize laneIsSet() by calling the existing intrinsic method VectorSupport.extract(), which actually does not introduce new intrinsic method. The C2 compiler intrinsification logic to support _VectorExtract has also been extended to better support laneIsSet(). It tries to extract the mask's lane value with an ExtractUB node if the hardware backend supports it. While on hardware without ExtractUB backend support , c2 will still try to generate toLong() related nodes, which behaves the same as before the patch. >> >> Key changes in this patch: >> >> 1. Reuse intrinsic `VectorSupport.extract()` in Java side. No new intrinsic method is introduced. >> 2. In compiler, `ExtractUBNode` is generated if backend support is. If not, the original "toLong" pattern is generated if it's implemented. Otherwise, it uses the default Java `getBits[i]` rather than the expensive and complicated toLong() based implementation. >> 3. 
Enable `ExtractUBNode` on AArch64 to extract the lane value for a vector mask in compiler, together with changing its bottom type to TypeInt::BOOL. This helps optimize the conditional selection generated by >> >> ``` >> >> public boolean laneIsSet(int i) { >> return VectorSupport.extract(..., defaultImpl) == 1L; >> } >> >> ``` >> >> [Test] >> hotspot:compiler/vectorapi and jdk/incubator/vector passed. >> >> [Performance] >> >> Below shows the performance gain on 128-bit vector size Neon machine. For 64 and 128 SPECIES, the improvment caused by this intrinsics. For other SPECIES which can not be intrinfied, performance gain comes from the default Java implementation changes, i.e. getBits[i] vs. toLong(). >> >> >> Benchmark Gain (after/before) >> microMaskLaneIsSetByte128_con 2.47 >> microMaskLaneIsSetByte128_var 1.82 >> microMaskLaneIsSetByte256_con 3.01 >> microMaskLaneIsSetByte256_var 3.04 >> microMaskLaneIsSetByte512_con 4.83 >> microMaskLaneIsSetByte512_var 4.86 >> microMaskLaneIsSetByt... > > Eric Liu has updated the pull request incrementally with one additional commit since the last revision: > > Bug fix > > Change-Id: Ib223c4048b29875a62a27d6081ad36a125dec144 LGTM? ------------- Marked as reviewed by xgong (Committer). PR Review: https://git.openjdk.org/jdk/pull/14200#pullrequestreview-1538839002 From pli at openjdk.org Thu Jul 20 09:38:58 2023 From: pli at openjdk.org (Pengfei Li) Date: Thu, 20 Jul 2023 09:38:58 GMT Subject: Integrated: 8311130: AArch64: Sync SVE related CPU features with VM options In-Reply-To: References: Message-ID: On Mon, 17 Jul 2023 03:19:10 GMT, Pengfei Li wrote: > As discussed in PR #14533, keeping AArch64 flag `UseSVE` and its related CPU features in sync helps to simplify rules in IR tests. In this patch, we mask SVE related CPU features off if specified SVE level in VM option is lower than the hardware supported. Also, to support this change, we move the features string construction to the end of the `initialize()` function. 
> > We also revert IR rule changes in PR #14533 and fix some code styles. We tested almost full jtreg on SVE, SVE2 and non-SVE CPUs and no new issue is found after this patch. This pull request has now been integrated. Changeset: 32833285 Author: Pengfei Li URL: https://git.openjdk.org/jdk/commit/32833285bf94a17989db9bdfa86f58777ab9187d Stats: 174 lines in 10 files changed: 123 ins; 18 del; 33 mod 8311130: AArch64: Sync SVE related CPU features with VM options Reviewed-by: aph, xgong ------------- PR: https://git.openjdk.org/jdk/pull/14897 From stuefe at openjdk.org Thu Jul 20 10:29:33 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 20 Jul 2023 10:29:33 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v13] In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. 
The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 41 additional commits since the last revision: - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap - Display unsupported text with UL warning level, + test - Last Aleksey Feedback - Aleksey trim stats; state printerich; David better gtest; misc stuff - David simple cosmetics - Bikeshed Trim log lines - Fix windows build - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap - Alekseys patch - Make test spikes more pronounced - ... and 31 more: https://git.openjdk.org/jdk/compare/1dcba1a8...74b1aacd ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14781/files - new: https://git.openjdk.org/jdk/pull/14781/files/d22248f1..74b1aacd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=11-12 Stats: 12470 lines in 365 files changed: 9578 ins; 1515 del; 1377 mod Patch: https://git.openjdk.org/jdk/pull/14781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781 PR: https://git.openjdk.org/jdk/pull/14781 From mdoerr at openjdk.org Thu Jul 20 10:30:51 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 20 Jul 2023 10:30:51 GMT Subject: RFR: 8310939: [c1] The visibility of write-volatile requires membar_volatile instead of membar In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 12:46:14 GMT, SUN Guoyun wrote: > For c1 now, a volatile write case: > > membar_release // LoadStore | StoreStore > write volatile > membar > > Just like c2, here `membar` should be defined `membar_volatile` clearly, then for risc-v, ppc and loongarch can use StoreLoad for `membar_volatile` for better performance. > > Testing: > GHA testing > jtreg tier1-3 for loongarch64 I'd abandon it unless it is really beneficial for LoongArch which I can't tell. (Probably not.) 
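For readers following the barrier discussion quoted in this message, the sequence `membar_release; write volatile; membar` can be approximated in portable C++ with fences. This is only an analogy under stated assumptions: the function and variable names below are hypothetical, the mapping of `membar_release` to a release fence and of the trailing barrier to a sequentially consistent fence is a rough correspondence rather than HotSpot's C1 code, and other architectures may implement a volatile write differently.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical names; a C++ analogy, not HotSpot's C1 implementation.
std::atomic<int> payload{0};
std::atomic<int> flag{0};

// Mirrors the three-step shape from the quoted description:
// pre-barrier, the store itself, post-barrier.
void volatile_style_write(std::atomic<int>& field, int value) {
  // Roughly LoadStore | StoreStore: earlier accesses stay before the store.
  std::atomic_thread_fence(std::memory_order_release);
  // The write itself carries no ordering of its own in this sketch.
  field.store(value, std::memory_order_relaxed);
  // Full fence; this is the part that also provides StoreLoad ordering.
  std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Single-threaded driver so the example runs without extra link flags.
int write_then_read() {
  payload.store(42, std::memory_order_relaxed);
  volatile_style_write(flag, 1);
  if (flag.load(std::memory_order_acquire) == 1) {
    return payload.load(std::memory_order_relaxed);
  }
  return -1;
}
```

The proposal in the quoted text concerns only the last of the three steps: giving that trailing barrier a distinct name so that a platform can emit a StoreLoad-only barrier there instead of a stronger one.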
------------- PR Comment: https://git.openjdk.org/jdk/pull/14677#issuecomment-1643671489 From dholmes at openjdk.org Thu Jul 20 10:32:46 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 20 Jul 2023 10:32:46 GMT Subject: RFR: JDK-8312395: Improve assertions in growableArray [v2] In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 08:47:49 GMT, Matthias Baesken wrote: >> There are a number of assertions in growableArray , for example to check for a valid index to access/remove elements. >> Those assertions can be improved, e.g. by showing the bad index value used in case of failure. >> >> Example for an assertion in the 'at(index i)' access method >> old assertion looks like >> assert(0 <= i && i < _len) failed: illegal index >> >> new assertion looks like >> assert(0 <= i && i < _len) failed: illegal index -559030609, 1 accessible elements > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > adjust some output and some asserts One further suggestion but approving anyway. I think the use of `this->xxx` is just personal style. I find it unnecessary especially given the other uses in this file. Thanks. src/hotspot/share/utilities/growableArray.hpp line 263: > 261: void remove_range(int start, int end) { > 262: assert(0 <= start, "illegal start index %d", start); > 263: assert(start < end && end <= _len, "erase called with invalid range %d, %d", start, end); I think you'd want to see `_len` here too - suggestion: "erase called with invalid range (%d, %d) for length %d", start, end, _len ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14946#pullrequestreview-1538947826 PR Review Comment: https://git.openjdk.org/jdk/pull/14946#discussion_r1269263868 From duke at openjdk.org Thu Jul 20 11:04:51 2023 From: duke at openjdk.org (SUN Guoyun) Date: Thu, 20 Jul 2023 11:04:51 GMT Subject: RFR: 8310939: [c1] The visibility of write-volatile requires membar_volatile instead of membar In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 10:28:09 GMT, Martin Doerr wrote: > I'd abandon it unless it is really beneficial for LoongArch which I can't tell. (Probably not.) ok ------------- PR Comment: https://git.openjdk.org/jdk/pull/14677#issuecomment-1643717994 From duke at openjdk.org Thu Jul 20 11:04:54 2023 From: duke at openjdk.org (SUN Guoyun) Date: Thu, 20 Jul 2023 11:04:54 GMT Subject: Withdrawn: 8310939: [c1] The visibility of write-volatile requires membar_volatile instead of membar In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 12:46:14 GMT, SUN Guoyun wrote: > For c1 now, a volatile write case: > > membar_release // LoadStore | StoreStore > write volatile > membar > > Just like c2, here `membar` should be defined `membar_volatile` clearly, then for risc-v, ppc and loongarch can use StoreLoad for `membar_volatile` for better performance. > > Testing: > GHA testing > jtreg tier1-3 for loongarch64 This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/14677 From stuefe at openjdk.org Thu Jul 20 11:25:40 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 20 Jul 2023 11:25:40 GMT Subject: RFR: JDK-8312395: Improve assertions in growableArray [v2] In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 10:29:54 GMT, David Holmes wrote: > > I think the use of `this->xxx` is just personal style. I find it unnecessary especially given the other uses in this file. > Interestingly enough I get compiler errors when leaving out "this->" (GCC 10.3). 
It may be because the member is part of a templatized base class. Odd, though. > Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14946#issuecomment-1643746053 From mbaesken at openjdk.org Thu Jul 20 12:04:36 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 20 Jul 2023 12:04:36 GMT Subject: RFR: JDK-8312395: Improve assertions in growableArray [v3] In-Reply-To: References: Message-ID: > There are a number of assertions in growableArray , for example to check for a valid index to access/remove elements. > Those assertions can be improved, e.g. by showing the bad index value used in case of failure. > > Example for an assertion in the 'at(index i)' access method > old assertion looks like > assert(0 <= i && i < _len) failed: illegal index > > new assertion looks like > assert(0 <= i && i < _len) failed: illegal index -559030609, 1 accessible elements Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: improve assert in remove_range ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14946/files - new: https://git.openjdk.org/jdk/pull/14946/files/d4e3ba68..b9e7e49d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14946&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14946&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14946.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14946/head:pull/14946 PR: https://git.openjdk.org/jdk/pull/14946 From mbaesken at openjdk.org Thu Jul 20 12:25:45 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 20 Jul 2023 12:25:45 GMT Subject: RFR: JDK-8312395: Improve assertions in growableArray [v2] In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 10:27:57 GMT, David Holmes wrote: >> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: >> >> adjust some output and some 
asserts > > src/hotspot/share/utilities/growableArray.hpp line 263: > >> 261: void remove_range(int start, int end) { >> 262: assert(0 <= start, "illegal start index %d", start); >> 263: assert(start < end && end <= _len, "erase called with invalid range %d, %d", start, end); > > I think you'd want to see `_len` here too - suggestion: > > "erase called with invalid range (%d, %d) for length %d", start, end, _len Hi David, I agree ! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14946#discussion_r1269383784 From aph-open at littlepinkcloud.com Thu Jul 20 12:43:10 2023 From: aph-open at littlepinkcloud.com (Andrew Haley) Date: Thu, 20 Jul 2023 13:43:10 +0100 Subject: RFR: 8310939: [c1] The visibility of write-volatile requires membar_volatile instead of membar In-Reply-To: References: Message-ID: <402a840e-bb3e-55af-344f-f0becf210eb2@littlepinkcloud.com> On 7/19/23 21:39, Dean Long wrote: > It's confusing to me to rename the post-barrier but not the pre-barrier. Instead of changing membar_release + membar to > membar_release + membar_volatile, something like volatile_write_pre_barrier + volatile_write_post_barrier in the shared code makes more sense to me. I know that this patch is now withdrawn, but I'd like to make a point. Please don't use terms like volatile_write_post_barrier for StoreLoad if you can possibly avoid it, because that's only one way to implement Java's volatile. The assumption that volatile write is implemented by release; write; StoreLoad or similar permeates HotSpot, and is a pain for architectures like AArch64 that don't do volatile that way. [ Comment for people reading this who don't know how AArch64 works. AArch64 uses an instruction that acts something like release; store (called STLR) for a volatile write, and load; acquire (called LDAR) for a volatile read. Note: this isn't the full story, and you have to look at the definition of things like barrier-ordered-before to know all the details.
] From mbaesken at openjdk.org Thu Jul 20 15:10:07 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 20 Jul 2023 15:10:07 GMT Subject: RFR: JDK-8312395: Improve assertions in growableArray [v3] In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 12:04:36 GMT, Matthias Baesken wrote: >> There are a number of assertions in growableArray , for example to check for a valid index to access/remove elements. >> Those assertions can be improved, e.g. by showing the bad index value used in case of failure. >> >> Example for an assertion in the 'at(index i)' access method >> old assertion looks like >> assert(0 <= i && i < _len) failed: illegal index >> >> new assertion looks like >> assert(0 <= i && i < _len) failed: illegal index -559030609, 1 accessible elements > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > improve assert in remove_range Hi David and Thomas, thanks for the reviews ! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14946#issuecomment-1644096824 From mbaesken at openjdk.org Thu Jul 20 15:10:08 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 20 Jul 2023 15:10:08 GMT Subject: Integrated: JDK-8312395: Improve assertions in growableArray In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 07:43:51 GMT, Matthias Baesken wrote: > There are a number of assertions in growableArray , for example to check for a valid index to access/remove elements. > Those assertions can be improved, e.g. by showing the bad index value used in case of failure. > > Example for an assertion in the 'at(index i)' access method > old assertion looks like > assert(0 <= i && i < _len) failed: illegal index > > new assertion looks like > assert(0 <= i && i < _len) failed: illegal index -559030609, 1 accessible elements This pull request has now been integrated. 
Changeset: b772e67e Author: Matthias Baesken URL: https://git.openjdk.org/jdk/commit/b772e67e2929afd9f9d6a4b08713e41f891667c0 Stats: 12 lines in 1 file changed: 0 ins; 0 del; 12 mod 8312395: Improve assertions in growableArray Reviewed-by: dholmes, stuefe ------------- PR: https://git.openjdk.org/jdk/pull/14946 From stuefe at openjdk.org Thu Jul 20 15:12:49 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 20 Jul 2023 15:12:49 GMT Subject: RFR: JDK-8312453: GrowableArray should assert for length overflow on append Message-ID: Trivial change to assert that we don't overflow on append. ------------- Commit messages: - JDK-8312453-GrowableArray-should-assert-for-length-overflow-on-append Changes: https://git.openjdk.org/jdk/pull/14951/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14951&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8312453 Stats: 4 lines in 1 file changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14951.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14951/head:pull/14951 PR: https://git.openjdk.org/jdk/pull/14951 From stuefe at openjdk.org Thu Jul 20 15:24:57 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 20 Jul 2023 15:24:57 GMT Subject: RFR: JDK-8310388: Shenandoah: Auxiliary bitmap is not madvised for THP Message-ID: <2KUO9EbhXSV-XrJFWWAY8KvDCDvkdQE7w8ew_MMzUEk=.3fc29a02-fa49-4b6a-8800-aab84525ba43@github.com> See details in the JBS issue. Note that there is no actual functional difference. The AUX bitmap did not use THPs before this patch and does not now either. The only difference is that before, the JVM *thought* the AUX bitmap used THPs when in fact it did not. That caused confusion. (All this illuminates how badly thought out the ReservedSpace API really is. We pass in the page size to the constructor, but then need to commit manually; THPs actually use madvise on commit, not on reservation, so we need to pass the page size in *again* to commit.
Ideally, ReservedSpace should handle committing for us with the page size it saved from reservation.) Note that this only takes care of preventing THP formation in "madvise" mode. In "always" mode, the kernel will always coalesce pages into THPs. We could prevent it from doing so by advising against it via madvise, but that would require extension of the platform-generic reservation APIs and is left for a different RFE. Ideally, nobody should use THP "always" mode. ------------- Commit messages: - JDK-8310388-Shenandoah-Auxiliary-bitmap-is-not-madvised-for-THP Changes: https://git.openjdk.org/jdk/pull/14953/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14953&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8310388 Stats: 6 lines in 1 file changed: 4 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14953.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14953/head:pull/14953 PR: https://git.openjdk.org/jdk/pull/14953 From pchilanomate at openjdk.org Thu Jul 20 15:26:41 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Thu, 20 Jul 2023 15:26:41 GMT Subject: RFR: 8300051: assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 06:53:26 GMT, Serguei Spitsyn wrote: > This problem is encountered when a JVMTI agent is loaded into a running VM. The JvmtiExport::get_jvmti_interface() is called from the agent's Agent_OnAttach entrypoint. To support virtual threads it enables JVMTI notifications from the VirtualThread class with a call to: > `JvmtiEnvBase::enable_virtual_threads_notify_jvmti()`. > The problem is that there are no JVMTI environments at this point yet. This assert is hit when a virtual thread is created concurrently after the JVMTI notifications have been enabled but the requested JVMTI environment has not been created yet.
> The fix is to create a JVMTI env first and only then to enable the JVMTI notifications. > > This issue is very hard to reproduce. I had to use some tricks with adding `os::naked_short_nanosleep()` and also by refactoring the test `VThreadTLSTest.java`. At least, I was able to verify this test does not fail with my fix anymore. > > Testing: > - submitted hundreds of `VThreadTLSTest.java` mach5 runs on several platforms in the `fastdebug` mode > - in progress: mach5 tiers 1-6 Looks good. ------------- Marked as reviewed by pchilanomate (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14945#pullrequestreview-1539544495 From kbarrett at openjdk.org Thu Jul 20 15:33:51 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 20 Jul 2023 15:33:51 GMT Subject: RFR: JDK-8312395: Improve assertions in growableArray [v2] In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 11:22:30 GMT, Thomas Stuefe wrote: > > I think the use of `this->xxx` is just personal style. I find it unnecessary especially given the other uses in this file. > > Interestingly enough I get compiler errors when leaving out "this->" (GCC 10.3). It may be because the member is part of a templatized base class. Odd, though. Qualification with `this->` is needed in a bunch of places here because of the C++ name lookup rules. Unqualified name lookup does not search dependent base classes. For example, in GrowableArrayWithAllocator an unqualified use of `_len` won't find that member in the base class, but will instead (probably) fail to find it in namespace scope during phase1 (pre-instantiation) template checking lookup (of 2 phase lookup), and so fail to compile. It isn't possible to resolve it in phase1 because the dependent base class definition isn't known until instantiation. 
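[ A minimal sketch of the lookup rule described above, for readers who want to try it. The class names `Base` and `Derived` and the helper `demo_append` are hypothetical stand-ins, not the actual GrowableArray hierarchy; only the member name `_len` is borrowed from the discussion. ]

```cpp
#include <cassert>

// Hypothetical stand-in for a templated base class that owns the length.
template <typename E>
class Base {
protected:
  int _len = 0;
};

// Hypothetical stand-in for a derived template whose base depends on E.
template <typename E>
class Derived : public Base<E> {
public:
  int append(const E&) {
    // Writing `return _len++;` here would be rejected by a conforming
    // compiler: Base<E> is a dependent base class, so unqualified lookup
    // of `_len` does not search it when the template is first parsed.
    // Qualifying with this-> makes the name dependent, which defers the
    // lookup until the template is instantiated.
    return this->_len++;
  }
  int length() const { return this->_len; }
};

// Small helper (also hypothetical) that exercises the template.
inline int demo_append() {
  Derived<int> d;
  d.append(7);
  d.append(8);
  return d.length();
}
```

Writing `Base<E>::_len` instead of `this->_len` should work as well, since it also makes the name dependent; which form to use is a matter of style.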
------------- PR Comment: https://git.openjdk.org/jdk/pull/14946#issuecomment-1644141370 From shade at openjdk.org Thu Jul 20 15:43:42 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 20 Jul 2023 15:43:42 GMT Subject: RFR: JDK-8310388: Shenandoah: Auxiliary bitmap is not madvised for THP In-Reply-To: <2KUO9EbhXSV-XrJFWWAY8KvDCDvkdQE7w8ew_MMzUEk=.3fc29a02-fa49-4b6a-8800-aab84525ba43@github.com> References: <2KUO9EbhXSV-XrJFWWAY8KvDCDvkdQE7w8ew_MMzUEk=.3fc29a02-fa49-4b6a-8800-aab84525ba43@github.com> Message-ID: On Thu, 20 Jul 2023 11:19:53 GMT, Thomas Stuefe wrote: > See details in JBS isse. > > Note that there is no actual functional difference. AUX bitmap did not use THPs before this patch and does not now either. The only difference is that before, the JVM *thought* the AUX bitmap uses THPs when in fact it did not. That caused confusion. > > (All this illuminates how badly thought out the ReservedSpace API really is. We pass in page size to the constructor, but then need to commit manually; THPs actually use madvise on commit, not on reservation, so we need to pass page size in *again* to commit. Ideally, ReservedSpace should handle committing for us with the page size it saved from reservation.) > > Note that this only takes care of preventing THP formation in "madvise" mode. In "always" mode, the kernel will do THP coalescation always. We could prevent it from doing so by advising against it via madvise, but that would require extension of the platform generic reservation APIs and is left for a different RFE. Ideally, nobody should use THP "always" mode. Looks fine. Does it make sense to enable `runtime/os/TestTracePageSizes.java` now? src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 287: > 285: // since we know these commits will be short lived. > 286: const size_t aux_bitmap_page_size = > 287: LINUX_ONLY(UseTransparentHugePages ? 
os::vm_page_size() :) bitmap_page_size; There is an `#ifdef LINUX` THP-related block in the same file, so better to match its style: size_t aux_bitmap_page_size = bitmap_page_size; #ifdef LINUX // In THP "advise" mode, we refrain from advising the system to use large pages // since we know these commits will be short lived, and there is no reason to trash // the THP area with this bitmap. if (UseTransparentHugePages) { aux_bitmap_page_size = os::vm_page_size(); } #endif ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14953#pullrequestreview-1539574286 PR Review Comment: https://git.openjdk.org/jdk/pull/14953#discussion_r1269649730 From kbarrett at openjdk.org Thu Jul 20 15:44:41 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 20 Jul 2023 15:44:41 GMT Subject: RFR: JDK-8312453: GrowableArray should assert for length overflow on append In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 10:20:36 GMT, Thomas Stuefe wrote: > Trivial change to assert that we don't overflow on append. I think these changes are not needed and should not be made. src/hotspot/share/utilities/growableArray.hpp line 390: > 388: public: > 389: int append(const E& elem) { > 390: assert(this->_len != INT_MAX, "Overflow"); This isn't needed. `grow` (via `next_power_of_2`) already does the appropriate overflow checking. src/hotspot/share/utilities/growableArray.hpp line 519: > 517: void GrowableArrayWithAllocator::grow(int j) { > 518: const size_t next_p2 = next_power_of_2((size_t)j); > 519: assert(next_p2 < INT_MAX, "GrowableArray overflow (current capacity: %d)", this->_capacity); This isn't needed. `next_power_of_2` has all the appropriate overflow checking assertions. Just let it do its thing with the original `int` argument. ------------- Changes requested by kbarrett (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14951#pullrequestreview-1539577342 PR Review Comment: https://git.openjdk.org/jdk/pull/14951#discussion_r1269652718 PR Review Comment: https://git.openjdk.org/jdk/pull/14951#discussion_r1269651817 From sspitsyn at openjdk.org Thu Jul 20 16:06:41 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 20 Jul 2023 16:06:41 GMT Subject: RFR: 8300051: assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 06:53:26 GMT, Serguei Spitsyn wrote: > This problem is encountered when a JVMTI agent is loaded into running VM. The JvmtiExport::get_jvmti_interface() is called from the agent's Agent_OnAttach entrypoint. To support virtual threads it enables JVMTI notifications from the VirtualThread class with a call to: > `JvmtiEnvBase::enable_virtual_threads_notify_jvmti()`. > The problem is that there is no JVMTI environments at this point yet. This assert is hit when a virtual thread is created concurrently after the JVMTI notifications have been enabled but the requested JVMTI environment has not been created yet. > The fix is to create a JVMTI env first and only then to enable the JVMTI notifications. > > This issue is very hard to reproduce. I had to use some tricks with adding `os::naked_short_nanosleep()` and also by refactoring the test `VThreadTLSTest.java`. At least, I was able to verify this test does not fail with my fix anymore. > > Testing: > - submitted hundreds of `VThreadTLSTest.java` mach5 runs on several platforms in the `fastdebug` mode > - in progress: mach5 tiers 1-6 David and Patricio, thank you for quick review! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14945#issuecomment-1644195730 From stuefe at openjdk.org Thu Jul 20 16:09:43 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 20 Jul 2023 16:09:43 GMT Subject: RFR: JDK-8312453: GrowableArray should assert for length overflow on append In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 15:41:56 GMT, Kim Barrett wrote: >> Trivial change to assert that we don't overflow on append. > > src/hotspot/share/utilities/growableArray.hpp line 390: > >> 388: public: >> 389: int append(const E& elem) { >> 390: assert(this->_len != INT_MAX, "Overflow"); > > This isn't needed. `grow` (via `next_power_of_2`) already does the appropriate overflow checking. Then we rely on the underlying growth algorithm to always work in power-of-2-steps? What if I want to plug in a different allocator with a different growth cadence? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14951#discussion_r1269679356 From stuefe at openjdk.org Thu Jul 20 16:09:43 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 20 Jul 2023 16:09:43 GMT Subject: RFR: JDK-8312453: GrowableArray should assert for length overflow on append In-Reply-To: References: Message-ID: <4s2YSE1gcRl7hC-8efTs09R_YHrhoJFVjnqE9EnGPrM=.43c19692-1089-481f-b1a7-9e5a43cd94c8@github.com> On Thu, 20 Jul 2023 16:04:56 GMT, Thomas Stuefe wrote: >> src/hotspot/share/utilities/growableArray.hpp line 390: >> >>> 388: public: >>> 389: int append(const E& elem) { >>> 390: assert(this->_len != INT_MAX, "Overflow"); >> >> This isn't needed. `grow` (via `next_power_of_2`) already does the appropriate overflow checking. > > Then we rely on the underlying growth algorithm to always work in power-of-2-steps? What if I want to plug in a different allocator with a different growth cadence? Or, what if we have a simple error? To just accept an overflow here seems dangerous. That said, I wonder whether we should make the length uintx-sized. 
As it is now, we are limited to 1G max number of elements. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14951#discussion_r1269681158 From stuefe at openjdk.org Thu Jul 20 16:22:40 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 20 Jul 2023 16:22:40 GMT Subject: RFR: JDK-8310388: Shenandoah: Auxiliary bitmap is not madvised for THP In-Reply-To: References: <2KUO9EbhXSV-XrJFWWAY8KvDCDvkdQE7w8ew_MMzUEk=.3fc29a02-fa49-4b6a-8800-aab84525ba43@github.com> Message-ID: On Thu, 20 Jul 2023 15:40:41 GMT, Aleksey Shipilev wrote: > Looks fine. > > Does it make sense to enable `runtime/os/TestTracePageSizes.java` now? Maybe. Possibly. I just found another bug in TestTracePageSizes where it does not correctly identify VMAs that are clearly backed by THPs, because it looks for the VM_xxx flags alone. I rather leave that test out for now to stabilize it a bit more; I don't want to start whacking the mole again. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14953#issuecomment-1644218665 From stuefe at openjdk.org Thu Jul 20 16:39:47 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 20 Jul 2023 16:39:47 GMT Subject: RFR: JDK-8310388: Shenandoah: Auxiliary bitmap is not madvised for THP [v2] In-Reply-To: <2KUO9EbhXSV-XrJFWWAY8KvDCDvkdQE7w8ew_MMzUEk=.3fc29a02-fa49-4b6a-8800-aab84525ba43@github.com> References: <2KUO9EbhXSV-XrJFWWAY8KvDCDvkdQE7w8ew_MMzUEk=.3fc29a02-fa49-4b6a-8800-aab84525ba43@github.com> Message-ID: > See details in JBS isse. > > Note that there is no actual functional difference. AUX bitmap did not use THPs before this patch and does not now either. The only difference is that before, the JVM *thought* the AUX bitmap uses THPs when in fact it did not. That caused confusion. > > (All this illuminates how badly thought out the ReservedSpace API really is. 
We pass in page size to the constructor, but then need to commit manually; THPs actually use madvise on commit, not on reservation, so we need to pass page size in *again* to commit. Ideally, ReservedSpace should handle committing for us with the page size it saved from reservation.) > > Note that this only takes care of preventing THP formation in "madvise" mode. In "always" mode, the kernel will do THP coalescation always. We could prevent it from doing so by advising against it via madvise, but that would require extension of the platform generic reservation APIs and is left for a different RFE. Ideally, nobody should use THP "always" mode. Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: Alekseys feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14953/files - new: https://git.openjdk.org/jdk/pull/14953/files/03ff47b3..5bc2b246 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14953&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14953&range=00-01 Stats: 9 lines in 1 file changed: 5 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/14953.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14953/head:pull/14953 PR: https://git.openjdk.org/jdk/pull/14953 From shade at openjdk.org Thu Jul 20 16:45:39 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 20 Jul 2023 16:45:39 GMT Subject: RFR: JDK-8310388: Shenandoah: Auxiliary bitmap is not madvised for THP [v2] In-Reply-To: References: <2KUO9EbhXSV-XrJFWWAY8KvDCDvkdQE7w8ew_MMzUEk=.3fc29a02-fa49-4b6a-8800-aab84525ba43@github.com> Message-ID: On Thu, 20 Jul 2023 16:39:47 GMT, Thomas Stuefe wrote: >> See details in JBS isse. >> >> Note that there is no actual functional difference. AUX bitmap did not use THPs before this patch and does not now either. The only difference is that before, the JVM *thought* the AUX bitmap uses THPs when in fact it did not. That caused confusion. 
>> >> (All this illuminates how badly thought out the ReservedSpace API really is. We pass in page size to the constructor, but then need to commit manually; THPs actually use madvise on commit, not on reservation, so we need to pass page size in *again* to commit. Ideally, ReservedSpace should handle committing for us with the page size it saved from reservation.) >> >> Note that this only takes care of preventing THP formation in "madvise" mode. In "always" mode, the kernel will do THP coalescation always. We could prevent it from doing so by advising against it via madvise, but that would require extension of the platform generic reservation APIs and is left for a different RFE. Ideally, nobody should use THP "always" mode. > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > Alekseys feedback Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14953#pullrequestreview-1539685989 From dholmes at openjdk.org Thu Jul 20 21:44:52 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 20 Jul 2023 21:44:52 GMT Subject: RFR: JDK-8312395: Improve assertions in growableArray [v2] In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 15:30:42 GMT, Kim Barrett wrote: >>> >>> I think the use of `this->xxx` is just personal style. I find it unnecessary especially given the other uses in this file. >>> >> >> Interestingly enough I get compiler errors when leaving out "this->" (GCC 10.3). It may be because the member is part of a templatized base class. Odd, though. >> >>> Thanks. > >> > I think the use of `this->xxx` is just personal style. I find it unnecessary especially given the other uses in this file. >> >> Interestingly enough I get compiler errors when leaving out "this->" (GCC 10.3). It may be because the member is part of a templatized base class. Odd, though. 
> > Qualification with `this->` is needed in a bunch of places here because of the > C++ name lookup rules. Unqualified name lookup does not search dependent base > classes. For example, in GrowableArrayWithAllocator an unqualified use of > `_len` won't find that member in the base class, but will instead (probably) > fail to find it in namespace scope during phase1 (pre-instantiation) template > checking lookup (of 2 phase lookup), and so fail to compile. It isn't possible > to resolve it in phase1 because the dependent base class definition isn't > known until instantiation. Thanks @kimbarrett ! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14946#issuecomment-1644642974 From sspitsyn at openjdk.org Thu Jul 20 22:43:46 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 20 Jul 2023 22:43:46 GMT Subject: Integrated: 8300051: assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist In-Reply-To: References: Message-ID: <0qJN-FYZqqzLShKgu85lfikKm55aOAlay3i_8zzf_h4=.60dd2467-677b-4e85-966e-ab8156dda2b7@github.com> On Thu, 20 Jul 2023 06:53:26 GMT, Serguei Spitsyn wrote: > This problem is encountered when a JVMTI agent is loaded into running VM. The JvmtiExport::get_jvmti_interface() is called from the agent's Agent_OnAttach entrypoint. To support virtual threads it enables JVMTI notifications from the VirtualThread class with a call to: > `JvmtiEnvBase::enable_virtual_threads_notify_jvmti()`. > The problem is that there is no JVMTI environments at this point yet. This assert is hit when a virtual thread is created concurrently after the JVMTI notifications have been enabled but the requested JVMTI environment has not been created yet. > The fix is to create a JVMTI env first and only then to enable the JVMTI notifications. > > This issue is very hard to reproduce. I had to use some tricks with adding `os::naked_short_nanosleep()` and also by refactoring the test `VThreadTLSTest.java`. 
At least, I was able to verify this test does not fail with my fix anymore. > > Testing: > - submitted hundreds of `VThreadTLSTest.java` mach5 runs on several platforms in the `fastdebug` mode > - in progress: mach5 tiers 1-6 This pull request has now been integrated. Changeset: 783de32b Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/783de32b6af4383b5ba71b91c307a5dddd0dae13 Stats: 26 lines in 2 files changed: 14 ins; 12 del; 0 mod 8300051: assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist Reviewed-by: dholmes, pchilanomate ------------- PR: https://git.openjdk.org/jdk/pull/14945 From david.holmes at oracle.com Fri Jul 21 02:36:05 2023 From: david.holmes at oracle.com (David Holmes) Date: Fri, 21 Jul 2023 12:36:05 +1000 Subject: RFR: 8310939: [c1] The visibility of write-volatile requires membar_volatile instead of membar In-Reply-To: <402a840e-bb3e-55af-344f-f0becf210eb2@littlepinkcloud.com> References: <402a840e-bb3e-55af-344f-f0becf210eb2@littlepinkcloud.com> Message-ID: <006faf1e-d9dc-2340-cbbd-8edc82eaafb8@oracle.com> On 20/07/2023 10:43 pm, Andrew Haley wrote: > On 7/19/23 21:39, Dean Long wrote: >> It's confusing to me to rename the post-barrier but not the >> pre-barrier. Instead of changing membar_release + membar to >> membar_release + membar_volatile, something like >> volatile_write_pre_barrier + volatile_write_post_barrier in the shared >> code makes more sense to me. > > I know that this patch is now withdrawn, but I'd like to make a point. > > Please don't use terms like volatile_write_post_barrier for StoreLoad > if you can possibly avoid it, because that's only one way to implement > Java's volatile. > > The assumption that volatile write is implemented by release; write; > StoreLoad > or similar permeates HotSpot, and is a pain for architectures like AArch64 > that don't do volatile that way.
My long-term understanding of the issue is that the requirements for volatile accesses are determined by the nature of the surrounding accesses - as per Doug Lea's JMM Cookbook [1]. But neither the interpreter nor C1 considers this; they only apply barriers to the actual volatile read/write in isolation - hence they (have to) assume the worst case and use the strongest pre/post barriers that may be needed. And of course the kinds of barriers available depend on architecture. David ----- [1] https://gee.cs.oswego.edu/dl/jmm/cookbook.html > [ Comment for people reading this who don't know how AArch64 does work. > > AArch64 uses an instruction that acts something like > > release; store (called STLR) > > for a volatile write, and > > load; acquire (called LDAR) > > for a volatile read. > > Note: this isn't the full story, and you have to look at the definition > of things like barrier-ordered-before to know all the details. ] > From eliu at openjdk.org Fri Jul 21 03:29:52 2023 From: eliu at openjdk.org (Eric Liu) Date: Fri, 21 Jul 2023 03:29:52 GMT Subject: Integrated: 8306136: [vectorapi] Intrinsics of VectorMask.laneIsSet() In-Reply-To: <74WpJFbQAX7TMMMr-qK9nUcS_lxHHbJEmzTuWbpahfk=.97501257-dd82-43ce-b334-fb6caab35118@github.com> References: <74WpJFbQAX7TMMMr-qK9nUcS_lxHHbJEmzTuWbpahfk=.97501257-dd82-43ce-b334-fb6caab35118@github.com> Message-ID: <0f8bTddA4GBzZNsLV9G53y-eXb0ODBfRCjzKwN-b-tA=.200bf354-05d2-4e19-9f99-da181102e5b6@github.com> On Mon, 29 May 2023 11:53:03 GMT, Eric Liu wrote: > VectorMask.laneIsSet() [1] is implemented based on VectorMask.toLong() [2], and its performance highly depends on the intrinsification of toLong(). However, if `toLong()` fails to intrinsify, on some architectures or unsupported species, it's much more expensive than pure getBits(). Besides, some CPUs (e.g. with Arm Neon) may not have efficient instructions to implement toLong(), so we propose to intrinsify VectorMask.laneIsSet separately.
> > This patch optimizes laneIsSet() by calling the existing intrinsic method VectorSupport.extract(), so no new intrinsic method is introduced. The C2 compiler intrinsification logic to support _VectorExtract has also been extended to better support laneIsSet(). It tries to extract the mask's lane value with an ExtractUB node if the hardware backend supports it. On hardware without ExtractUB backend support, C2 will still try to generate toLong() related nodes, which behave the same as before the patch. > > Key changes in this patch: > > 1. Reuse intrinsic `VectorSupport.extract()` on the Java side. No new intrinsic method is introduced. > 2. In the compiler, `ExtractUBNode` is generated if the backend supports it. If not, the original "toLong" pattern is generated if it's implemented. Otherwise, it uses the default Java `getBits[i]` rather than the expensive and complicated toLong() based implementation. > 3. Enable `ExtractUBNode` on AArch64 to extract the lane value for a vector mask in the compiler, together with changing its bottom type to TypeInt::BOOL. This helps optimize the conditional selection generated by > > ``` > > public boolean laneIsSet(int i) { > return VectorSupport.extract(..., defaultImpl) == 1L; > } > > ``` > > [Test] > hotspot:compiler/vectorapi and jdk/incubator/vector passed. > > [Performance] > > The table below shows the performance gain on a Neon machine with 128-bit vectors. For 64 and 128 SPECIES, the improvement comes from this intrinsic. For other SPECIES which cannot be intrinsified, the performance gain comes from the default Java implementation changes, i.e. getBits[i] vs. toLong(). > > > Benchmark Gain (after/before) > microMaskLaneIsSetByte128_con 2.47 > microMaskLaneIsSetByte128_var 1.82 > microMaskLaneIsSetByte256_con 3.01 > microMaskLaneIsSetByte256_var 3.04 > microMaskLaneIsSetByte512_con 4.83 > microMaskLaneIsSetByte512_var 4.86 > microMaskLaneIsSetByte64_con 1.57 > microMaskLaneIsSetByte64_var 1.18...
This pull request has now been integrated. Changeset: d4aacdb4 Author: Eric Liu URL: https://git.openjdk.org/jdk/commit/d4aacdb44665db9f787e0a408e6b1ba925ad1048 Stats: 1099 lines in 40 files changed: 1021 ins; 29 del; 49 mod 8306136: [vectorapi] Intrinsics of VectorMask.laneIsSet() Reviewed-by: psandoz, xgong ------------- PR: https://git.openjdk.org/jdk/pull/14200 From jwaters at openjdk.org Fri Jul 21 04:35:46 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 21 Jul 2023 04:35:46 GMT Subject: RFR: 8250269: Replace ATTRIBUTE_ALIGNED with alignas [v17] In-Reply-To: References: <9QKV9cYFTo_1D8R-mI80lnewNkA0ceJNKFPbrvICxl4=.d6736b76-8324-4084-bede-6e144b4f6c04@github.com> Message-ID: On Fri, 23 Jun 2023 02:31:11 GMT, Julian Waters wrote: >> C++11 added the alignas attribute, for the purpose of specifying alignment on types, much like compiler specific syntax such as gcc's __attribute__((aligned(x))) or Visual C++'s __declspec(align(x)). >> >> We can phase out the use of the macro in favor of the standard attribute. In the meantime, we can replace the compiler specific definitions of ATTRIBUTE_ALIGNED with a portable definition. We might deprecate the use of the macro but changing its implementation quickly and cleanly applies the feature where the macro is being used. >> >> Note: With certain parts of HotSpot using ATTRIBUTE_ALIGNED so indiscriminately, this commit will likely take some time to get right >> >> This will require adding the alignas attribute to the list of language features approved for use in HotSpot code. (Completed with [8297912](https://github.com/openjdk/jdk/pull/11446)) > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 17 commits: > > - Merge branch 'openjdk:master' into alignas > - Merge branch 'master' into alignas > - Merge branch 'openjdk:master' into alignas > - alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - ... and 7 more: https://git.openjdk.org/jdk/compare/5a82fa3b...bb9ae391 Don't do that... ------------- PR Comment: https://git.openjdk.org/jdk/pull/11431#issuecomment-1644967933 From duke at openjdk.org Fri Jul 21 05:28:59 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Fri, 21 Jul 2023 05:28:59 GMT Subject: Integrated: 8311976: Inconsistency in usage of CITimeVerbose to generate compilation logs In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 22:35:43 GMT, Ashutosh Mehra wrote: > Please review this PR for controlling timing information of C1 compilation phases using the CITimeVerbose option, same as for C2 compilations. > I also took this opportunity to fix some other minor issues with logging: > 1. The PhaseTraceTime object should be constructed after setting the compiler data as the PhaseTraceTime constructor calls `Compilation::current()`. For this reason I moved the statement `PhaseTraceTime timeit(_t_compile)` after the call to `_env->set_compiler_data(this);`. > 2. The previous step also allowed removing the nullptr check for `Compilation::current()` in the PhaseTraceTime constructor. > 3. I noticed the call to CompileLog->done() only prints the `phase_done` tag and ignores all other parameters passed to it. This was due to a bug in `xmlStream::va_done` which is also fixed here. > 4. Remove unnecessary statements in the TracePhase destructor as the object already has the fields computed in the constructor. > 5.
Some bikeshedding like TimerName -> TimerId and TracePhase::C -> TracePhase::_compile > > I felt these are all minor fixes so I clubbed them together here. If that feels inappropriate, I can split them into their own PRs. > > Testing: GHA testing passed This pull request has now been integrated. Changeset: 3e8f1eb8 Author: Ashutosh Mehra Committer: Thomas Stuefe URL: https://git.openjdk.org/jdk/commit/3e8f1eb82039d4943abf79380f35ad1ec1927b45 Stats: 39 lines in 4 files changed: 3 ins; 19 del; 17 mod 8311976: Inconsistency in usage of CITimeVerbose to generate compilation logs Reviewed-by: kvn, thartmann ------------- PR: https://git.openjdk.org/jdk/pull/14880 From dholmes at openjdk.org Fri Jul 21 05:31:45 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 21 Jul 2023 05:31:45 GMT Subject: RFR: JDK-8312453: GrowableArray should assert for length overflow on append In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 15:41:56 GMT, Kim Barrett wrote: >> Trivial change to assert that we don't overflow on append. > > src/hotspot/share/utilities/growableArray.hpp line 390: > >> 388: public: >> 389: int append(const E& elem) { >> 390: assert(this->_len != INT_MAX, "Overflow"); > > This isn't needed. `grow` (via `next_power_of_2`) already does the appropriate overflow checking. I'm inclined to agree with @kimbarrett. We check length against capacity and grow if needed. The responsibility for checking overflow lies in grow, not append.
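[ The division of responsibility argued for above can be sketched roughly as follows. This is a simplified illustration, not the HotSpot code: `TinyArray` and `next_capacity` are hypothetical names, and the power-of-two growth policy is an assumption modelled on the `next_power_of_2` call quoted in the review. ]

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical, simplified int-indexed array; not the HotSpot GrowableArray.
class TinyArray {
  int* _data = nullptr;
  int  _len = 0;
  int  _capacity = 0;

  // Returns the smallest power of two strictly greater than `needed`.
  // The overflow check lives here: anything at or above 2^30 would need
  // a capacity that no longer fits in an int.
  static int next_capacity(int needed) {
    assert(needed >= 0 && needed < (1 << 30) && "capacity overflow");
    int cap = 1;
    while (cap <= needed) cap <<= 1;
    return cap;
  }

  void grow(int needed) {
    int new_cap = next_capacity(needed);
    int* p = static_cast<int*>(std::realloc(_data, sizeof(int) * new_cap));
    assert(p != nullptr && "out of memory");
    _data = p;
    _capacity = new_cap;
  }

public:
  TinyArray() = default;
  TinyArray(const TinyArray&) = delete;
  TinyArray& operator=(const TinyArray&) = delete;
  ~TinyArray() { std::free(_data); }

  int append(int elem) {
    // append only compares length with capacity; it carries no overflow
    // assert of its own and relies on grow() for that.
    if (_len == _capacity) grow(_len);
    int idx = _len++;
    _data[idx] = elem;
    return idx;
  }

  int length() const { return _len; }
  int capacity() const { return _capacity; }
  int at(int i) const {
    assert(0 <= i && i < _len && "illegal index");
    return _data[i];
  }
};
```

In this sketch the length can never pass the capacity without going through grow(), so a single check on the capacity computation covers append as well. The counter-argument raised earlier in the thread still applies: this only holds as long as every growth path funnels through that one checked helper.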
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14951#discussion_r1270239346 From stuefe at openjdk.org Fri Jul 21 05:36:47 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 21 Jul 2023 05:36:47 GMT Subject: RFR: JDK-8312453: GrowableArray should assert for length overflow on append In-Reply-To: References: Message-ID: <-Mn83_KE_0RTlnJ7mQMhfndPVxVh_bb1460Q0TANDWQ=.c889e5bb-eec7-49e6-82ae-c91b6427d964@github.com> On Thu, 20 Jul 2023 10:20:36 GMT, Thomas Stuefe wrote: > Trivial change to assert that we don't overflow on append. Okay, @dholmes-ora and @kimbarrett convinced me to close this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14951#issuecomment-1645006268 From stuefe at openjdk.org Fri Jul 21 05:36:49 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 21 Jul 2023 05:36:49 GMT Subject: Withdrawn: JDK-8312453: GrowableArray should assert for length overflow on append In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 10:20:36 GMT, Thomas Stuefe wrote: > Trivial change to assert that we don't overflow on append. This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/14951 From stuefe at openjdk.org Fri Jul 21 05:48:43 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 21 Jul 2023 05:48:43 GMT Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space [v4] In-Reply-To: References: Message-ID: On Tue, 18 Jul 2023 19:40:39 GMT, Ioi Lam wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains five additional commits since the last revision: >> >> - Feedback David >> - Merge branch 'master' into JDK-8312018-Improve-zero-base-optimized-reservation-of-class-space >> - Fix Windows >> - Feedback Roman; fix off-by-1; fix tracing >> - better zero-based reservation strategy > > src/hotspot/share/runtime/os.cpp line 1787: > >> 1785: #endif >> 1786: - os::vm_page_size()); >> 1787: } > > I am not sure what "attach" means in this sense. If it's the usable address range, wouldn't it need to be OS-specific? Yes, highest and lowest usable address. You are right in that its OS specific. I'll move these to the respective OS files. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14867#discussion_r1270248045 From stuefe at openjdk.org Fri Jul 21 05:51:57 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 21 Jul 2023 05:51:57 GMT Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space [v4] In-Reply-To: References: Message-ID: On Tue, 18 Jul 2023 19:43:26 GMT, Ioi Lam wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Feedback David >> - Merge branch 'master' into JDK-8312018-Improve-zero-base-optimized-reservation-of-class-space >> - Fix Windows >> - Feedback Roman; fix off-by-1; fix tracing >> - better zero-based reservation strategy > > src/hotspot/share/memory/metaspace.cpp line 597: > >> 595: { >> 596: // First try for zero-base zero-shift (lower 4G); failing that, try for zero-based with max shift (lower 32G) >> 597: constexpr int num_tries = 8; > > num_tries should be computed instead of hard-coded. Why? In pre-existing code that does similar things, we always hardcode them implicitly (typically by attempt-mapping from A->B in hardcoded stride C). 
And how would I calculate it? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14867#discussion_r1270249352 From thomas.stuefe at gmail.com Fri Jul 21 06:29:59 2023 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Fri, 21 Jul 2023 08:29:59 +0200 Subject: Why is HeapBaseMinAddress by default so large? Message-ID: Hi, I am trying to understand why the default value for HeapBaseMinAddress is so large (2G on all of our platforms). When attempting to allocate heap or class space for unscaled encoding, this denies us half of the valuable space below 4G. I think it is unnecessary. Using lower address ranges is absolutely possible (e.g. Shenandoah reserves collection sets as low as 0x10000, and that works fine). One answer had been "because we don't want to obstruct sbrk() and cause malloc OOMs". But to my knowledge, the only platforms that had that particular problem were Solaris and AIX. Solaris is no more. AIX solves it differently and much smarter by declaring a configurable "no-reserve-zone" atop the sbrk() and preventing os::reserve_memory from mmap'ing there. That works wherever sbrk happens to be, and for every mapping the JVM reserves, not just for the heap. Looking through the history, I see that HeapBaseMinAddress was introduced by Vladimir in 2009 with 6791178: "Specialize for zero as the compressed oop vm heap base". From the start, the default values were mostly 2G. I could find no RFR thread for that change that discussed the patch. @Vladimir : do you maybe still know how you came up with the default? Thanks, Thomas [1] https://github.com/openjdk/jdk/commit/69f9ddee905f -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stuefe at openjdk.org Fri Jul 21 06:50:11 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 21 Jul 2023 06:50:11 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v14] In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. 
> > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. > > To increase the chance of auto-reclamation happening, one can do one or more t... Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 42 additional commits since the last revision: - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap - Display unsupported text with UL warning level, + test - Last Aleksey Feedback - Aleksey trim stats; state printerich; David better gtest; misc stuff - David simple cosmetics - Bikeshed Trim log lines - Fix windows build - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap - Alekseys patch - ... 
and 32 more: https://git.openjdk.org/jdk/compare/09d7ace0...9c5f6df3 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14781/files - new: https://git.openjdk.org/jdk/pull/14781/files/74b1aacd..9c5f6df3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14781&range=12-13 Stats: 1353 lines in 64 files changed: 1130 ins; 90 del; 133 mod Patch: https://git.openjdk.org/jdk/pull/14781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14781/head:pull/14781 PR: https://git.openjdk.org/jdk/pull/14781 From mdoerr at openjdk.org Fri Jul 21 10:24:48 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 21 Jul 2023 10:24:48 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v8] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 15:08:41 GMT, Matias Saavedra Silva wrote: >> 8301996: Move field resolution information out of the cpCache > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > RISCV port src/hotspot/cpu/x86/templateTable_x86.cpp line 2746: > 2744: > 2745: // Store TOS into index register in case it is needed later > 2746: __ load_unsigned_byte(index, Address(cache, in_bytes(ResolvedFieldEntry::type_offset()))); I can't see where "index" is used as index. I'd call it e.g. "tos_state" (it's not the actual top of stack, it's the state of it, see comments above `enum TosState`). Please don't confuse these terms. That makes the code hardly readable. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1270515919 From mdoerr at openjdk.org Fri Jul 21 10:50:43 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 21 Jul 2023 10:50:43 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Sat, 8 Jul 2023 10:48:15 GMT, sid8606 wrote: >> Implementation of "Foreign Function & Memory API" for s390x (Big Endian). > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Address suggestions from Jorn Vernee I've seen java/foreign/StdLibTest.java Total tests run: 41388, Passes: 39378, Failures: 2010, Skips: 0 (slowdebug build). Is this a known problem? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14801#issuecomment-1645384285 From duke at openjdk.org Fri Jul 21 11:46:47 2023 From: duke at openjdk.org (sid8606) Date: Fri, 21 Jul 2023 11:46:47 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Fri, 21 Jul 2023 10:46:14 GMT, Martin Doerr wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Address suggestions from Jorn Vernee > > I've seen > > java/foreign/StdLibTest.java > Total tests run: 41388, Passes: 39378, Failures: 2010, Skips: 0 > > (slowdebug build). Is this a known problem? @TheRealMDoerr Thank you looking into this PR. I have tested again java/foreign/StdLibTest.java on my end. I see that it passes with slowdebug build (rebased with master). 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14801#issuecomment-1645452677 From mdoerr at openjdk.org Fri Jul 21 12:20:47 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 21 Jul 2023 12:20:47 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: <5ifOngKE41_DMQw35KiJQzf8L1MP_smglYyyNCipf_k=.5c7aad97-0615-4ef4-a61a-98d3ecb282b3@github.com> On Sat, 8 Jul 2023 10:48:15 GMT, sid8606 wrote: >> Implementation of "Foreign Function & Memory API" for s390x (Big Endian). > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Address suggestions from Jorn Vernee I can see it failing with `make run-test TEST="jdk/java/foreign"` on g9-z15 (patch applied to jdk head). ------------- PR Comment: https://git.openjdk.org/jdk/pull/14801#issuecomment-1645492542 From stuefe at openjdk.org Fri Jul 21 12:25:11 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 21 Jul 2023 12:25:11 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v14] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Thu, 20 Jul 2023 08:26:21 GMT, Robbin Ehn wrote: >> Thomas Stuefe has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 42 additional commits since the last revision: >> >> - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap >> - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap >> - Display unsupported text with UL warning level, + test >> - Last Aleksey Feedback >> - Aleksey trim stats; state printerich; David better gtest; misc stuff >> - David simple cosmetics >> - Bikeshed Trim log lines >> - Fix windows build >> - Merge branch 'master' into JDK-8293114-JVM-should-trim-the-native-heap >> - Alekseys patch >> - ... and 32 more: https://git.openjdk.org/jdk/compare/4a503686...9c5f6df3 > > Marked as reviewed by rehn (Reviewer). Many thanks @robehn, @shipilev and @dholmes-ora ! Feels good to have this finally in. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1645495592 From stuefe at openjdk.org Fri Jul 21 12:25:14 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 21 Jul 2023 12:25:14 GMT Subject: Integrated: JDK-8293114: JVM should trim the native heap In-Reply-To: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: <55ndnfhY7FSVWdAFR78cFe3UlXystFq-YvcfdYNE_7o=.560b252e-d644-4482-bed8-a24347f5ed01@github.com> On Thu, 6 Jul 2023 06:54:22 GMT, Thomas Stuefe wrote: > This is a continuation of https://github.com/openjdk/jdk/pull/10085. I closed https://github.com/openjdk/jdk/pull/10085 because it had accumulated too much comment history and got confusing. For a history of this issue, see previous discussions [1] and the comment section of 10085. > > --------------- > > This RFE adds the option to trim the Glibc heap periodically. This can recover a significant memory footprint if the VM process suffers from high-but-rare malloc spikes. 
It does not matter who causes the spikes: the JDK or customer code running in the JVM process. > > ### Background: > > The Glibc is reluctant to return memory to the OS. Temporary malloc spikes often carry over as permanent RSS increase. Note that C-heap retention is difficult to observe. Since it is freed memory, it won't appear in NMT; it is just a part of RSS. > > This is, effectively, caching - a performance tradeoff by the glibc. It makes a lot of sense with applications that cause high traffic on the C-heap. The JVM, however, clusters allocations and often rolls its own memory management based on virtual memory for many of its use cases. > > To manually trim the C-heap, Glibc exposes `malloc_trim(3)`. With JDK 18 [2], we added a new jcmd command to *manually* trim the C-heap on Linux (`jcmd System.trim_native_heap`). We then observed customers running this command periodically to slim down process sizes of container-bound jvms. That is cumbersome, and the JVM can do this a lot better - among other things because it knows best when *not* to trim. > > #### GLIBC internals > > The following information I took from the glibc source code and experimenting. > > ##### Why do we need to trim manually? Does the Glibc not trim on free? > > Upon `free()`, glibc may return memory to the OS if: > - the returned block was mmap'ed > - the returned block was not added to tcache or to fastbins > - the returned block, possibly merged with its two immediate neighbors, had they been free, is larger than FASTBIN_CONSOLIDATION_THRESHOLD (64K) - in that case: > a) for the main arena, glibc attempts to lower the brk() > b) for mmap-ed heaps, glibc attempts to completely unmap or shrink the heap. > In both cases, (a) and (b), only the top portion of the heap is reclaimed. "Holes" in the middle of other in-use chunks are not reclaimed. > > So: glibc *may* automatically reclaim memory. In normal configurations, with typical C-heap allocation granularity, it is unlikely. 
>
> To increase the chance of auto-reclamation happening, one can do one or more t...

This pull request has now been integrated.

Changeset: 9e4fc568
Author: Thomas Stuefe
URL: https://git.openjdk.org/jdk/commit/9e4fc568a6f1a93c84a84d6cc5220c6eb4e546a5
Stats: 843 lines in 20 files changed: 835 ins; 0 del; 8 mod

8293114: JVM should trim the native heap

Reviewed-by: shade, rehn, dholmes

-------------

PR: https://git.openjdk.org/jdk/pull/14781

From vkempik at openjdk.org Fri Jul 21 12:27:41 2023
From: vkempik at openjdk.org (Vladimir Kempik)
Date: Fri, 21 Jul 2023 12:27:41 GMT
Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v3]
In-Reply-To:
References:
Message-ID:

On Thu, 20 Jul 2023 08:31:56 GMT, Vladimir Kempik wrote:

>> Please review this fix. It eliminates misaligned loads in String.compare on RISC-V.
>>
>> For small compares (<= 72 bytes), the intrinsic in c2_MacroAssembler.cpp is used.
>> It reads (in the UU/LL case) 8 bytes per loop iteration, and at the end it reads the tail with a misaligned load of the last 8 bytes of the string.
>>
>> So if the string length is not a multiple of 8 bytes, the last load is misaligned, and it also re-reads and re-compares some data that was already processed.
>>
>> I have changed that to compare only the last length%8 bytes, using the SHORT_STRING part of the intrinsic for UL/LU. For UU/LL I have made an optimised version.
>>
>> Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947), I've got no perf drop on thead (with +AvoidUnalignedAccesses), which supports misaligned access.
>>
>> Improvements to the inflate_XX() methods give 3-5% improvements for the UL/LU cases on thead, and almost no perf change on hifive.
>>
>> For large strings, the intrinsics from stubGenerator.cpp are used.
>> For UU/LL (generate_compare_long_string_same_encoding), I have just replaced the misaligned ld with load_long_misaligned.
>> Since this tail reading is not on the hot path, this gives a small penalty on thead with -XX:+AvoidUnalignedAccesses.
>>
>> Large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp:
>> These changes are partially based on feilongjiang's patch, but I have changed the tail reading to prevent reading past the end of the string. I have observed no perf difference between feilongjiang's and my version.
>>
>> This also enables the regression test for string.Compare, which was previously aarch64-only.
>>
>> Testing: tier1 and tier2 clean on hifive.
>>
>> JMH testing, hifive:
>> before:
>>
>> Benchmark                                   (delta)  (size)  Mode  Cnt    Score   Error  Units
>> StringCompareToDifferentLength.compareToLL        2      24  avgt    9    6.474 ± 1.475  ms/op
>> StringCompareToDifferentLength.compareToLL        2      36  avgt    9  125.823 ± 1.947  ms/op
>> StringCompareToDifferentLength.compareToLL        2      72  avgt    9   10.512 ± 0.236  ms/op
>> StringCompareToDifferentLength.compareToLL        2     128  avgt    9   13.032 ± 0.821  ms/op
>> StringCompareToDifferentLength.compareToLL        2     256  avgt    9   18.983 ± 0.318  ms/op
>> StringCompareToDifferentLength.compareToLL        2     512  avgt    9   29.925 ± ...
>
> Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision:
>
>   Simplify case for long LU UL compares

Updated results on hifive:

Benchmark                                   (delta)  (size)  Mode  Cnt    Score   Error  Units
StringCompareToDifferentLength.compareToLL        2      24  avgt    9    8.694 ± 0.920  ms/op
StringCompareToDifferentLength.compareToLL        2      36  avgt    9    9.583 ± 0.840  ms/op
StringCompareToDifferentLength.compareToLL        2      72  avgt    9   11.671 ± 0.710  ms/op
StringCompareToDifferentLength.compareToLL        2     128  avgt    9   15.860 ± 0.980  ms/op
StringCompareToDifferentLength.compareToLL        2     256  avgt    9   21.407 ± 0.509  ms/op
StringCompareToDifferentLength.compareToLL        2     512  avgt    9   32.435 ± 0.622  ms/op
StringCompareToDifferentLength.compareToLL        2     520  avgt    9   30.888 ± 0.402  ms/op
StringCompareToDifferentLength.compareToLL        2     523  avgt    9   33.265 ± 0.489  ms/op
StringCompareToDifferentLength.compareToLU        2      24  avgt    9   14.335 ± 0.182  ms/op
StringCompareToDifferentLength.compareToLU        2      36  avgt    9   18.220 ± 0.360  ms/op
StringCompareToDifferentLength.compareToLU        2      72  avgt    9   32.331 ± 1.166  ms/op
StringCompareToDifferentLength.compareToLU        2     128  avgt    9   50.462 ± 1.495  ms/op
StringCompareToDifferentLength.compareToLU        2     256  avgt    9   91.972 ± 1.635  ms/op
StringCompareToDifferentLength.compareToLU        2     512  avgt    9  174.093 ± 0.384  ms/op
StringCompareToDifferentLength.compareToLU        2     520  avgt    9  178.736 ± 2.627  ms/op
StringCompareToDifferentLength.compareToLU        2     523  avgt    9  183.112 ± 2.090  ms/op
StringCompareToDifferentLength.compareToUL        2      24  avgt    9   16.344 ± 0.486  ms/op
StringCompareToDifferentLength.compareToUL        2      36  avgt    9   19.724 ± 0.628  ms/op
StringCompareToDifferentLength.compareToUL        2      72  avgt    9   34.062 ± 0.513  ms/op
StringCompareToDifferentLength.compareToUL        2     128  avgt    9   52.019 ± 1.212  ms/op
StringCompareToDifferentLength.compareToUL        2     256  avgt    9   93.111 ± 1.267  ms/op
StringCompareToDifferentLength.compareToUL        2     512  avgt    9  176.734 ± 0.695  ms/op
StringCompareToDifferentLength.compareToUL        2     520  avgt    9  179.531 ± 1.215  ms/op
StringCompareToDifferentLength.compareToUL        2     523  avgt    9  184.424 ± 1.069  ms/op
StringCompareToDifferentLength.compareToUU        2      24  avgt    9    9.966 ± 0.648  ms/op
StringCompareToDifferentLength.compareToUU        2      36  avgt    9   11.915 ± 0.662  ms/op
StringCompareToDifferentLength.compareToUU        2      72  avgt    9   15.422 ± 0.925  ms/op
StringCompareToDifferentLength.compareToUU        2     128  avgt    9   19.726 ± 0.581  ms/op
StringCompareToDifferentLength.compareToUU        2     256  avgt    9   31.103 ± 0.738  ms/op
StringCompareToDifferentLength.compareToUU        2     512  avgt    9   51.847 ± 0.861  ms/op
StringCompareToDifferentLength.compareToUU        2     520  avgt    9   53.083 ± 0.326  ms/op
StringCompareToDifferentLength.compareToUU        2     523  avgt    9   53.209 ± 0.477  ms/op

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14534#issuecomment-1645501190

From duke at openjdk.org Fri Jul 21 12:41:48 2023
From: duke at openjdk.org (Eric Nothum)
Date: Fri, 21 Jul 2023 12:41:48 GMT
Subject: RFR: JDK-8310316: Failing HotSpot Compiler directives are too verbose
Message-ID:

Previously, jcmd printed the whole file if a compiler directive was added that was not in JSON format. This example illustrates the issue:

./jcmd 331311 Compiler.directives_add ./example.txt
331311:
Syntax error on line 1 byte 1: Json must start with an object or an array.
At 'This'.
This is my very interesting text, followed by some more exciting text.
Parsing of compiler directives failed
Could not load file: ./example.txt

The JSON error message is not printed if the silent field is set in the `DirectivesParser` object. The proposed change adds a boolean parameter `silent` that is propagated from `CompilerDirectivesAddDCmd::execute` to the `DirectivesParser` constructor. The default value of the new parameter is false, which preserves the original behavior. When a compiler directive is added via jcmd, the parameter is set to true and the error message is reduced.
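Editorial illustration of the propagation described above: a minimal sketch under stated assumptions, not the actual patch. `SketchDirectivesParser` and `sketch_directives_add` are hypothetical stand-ins for `DirectivesParser` and `CompilerDirectivesAddDCmd::execute`, and the "parser" only checks that the input starts with an object or an array.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for DirectivesParser. The constructor takes a
// 'silent' flag that defaults to false, i.e. the verbose behavior.
class SketchDirectivesParser {
  std::ostream& _out;
  bool _silent;

 public:
  explicit SketchDirectivesParser(std::ostream& out, bool silent = false)
      : _out(out), _silent(silent) {}

  // Accepts only text whose first non-blank character opens a JSON
  // object or array. On failure the detailed message, which echoes the
  // input, is printed only when the parser is not silent.
  bool parse(const std::string& text) {
    std::string::size_type pos = text.find_first_not_of(" \t\r\n");
    bool ok = pos != std::string::npos && (text[pos] == '{' || text[pos] == '[');
    if (!ok && !_silent) {
      _out << "Syntax error: Json must start with an object or an array.\n"
           << text << "\n";
    }
    return ok;
  }
};

// Hypothetical stand-in for the jcmd entry point: it forwards 'silent'
// to the parser and always prints the short summary line on failure.
inline bool sketch_directives_add(const std::string& text, std::ostream& out,
                                  bool silent) {
  SketchDirectivesParser parser(out, silent);
  bool ok = parser.parse(text);
  if (!ok) {
    out << "Parsing of compiler directives failed\n";
  }
  return ok;
}
```

Calling `sketch_directives_add(text, out, true)` on non-JSON input leaves only the summary line in `out`, while passing `false` also echoes the offending input, which mirrors the before and after outputs quoted in this mail.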
The proposed change reduces the error message to: ./jcmd 335703 Compiler.directives_add ./example.txt 335703: Parsing of compiler directives failed Could not load file: ./example.txt ------------- Commit messages: - Fixed verbose compiler directive, by propagating silent from CompilerDirectivesAddDCmd::execute Changes: https://git.openjdk.org/jdk/pull/14957/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14957&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8310316 Stats: 10 lines in 3 files changed: 0 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/14957.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14957/head:pull/14957 PR: https://git.openjdk.org/jdk/pull/14957 From mdoerr at openjdk.org Fri Jul 21 12:49:59 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 21 Jul 2023 12:49:59 GMT Subject: RFR: 8139457: Array bases are aligned at HeapWord granularity [v50] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 14:14:46 GMT, Roman Kennke wrote: >> See [JDK-8139457](https://bugs.openjdk.org/browse/JDK-8139457) for details. >> >> Basically, when running with -XX:-UseCompressedClassPointers, arrays will have a gap between the length field and the first array element, because array elements will only start at word-aligned offsets. This is not necessary for smaller-than-word elements. >> >> Also, while it is not very important now, it will become very important with Lilliput, which eliminates the Klass field and would always put the length field at offset 8, and leave a gap between offset 12 and 16. 
>> >> Testing: >> - [x] runtime/FieldLayout/ArrayBaseOffsets.java (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] bootcycle (x86_64, x86_32, aarch64, arm, riscv, s390) >> - [x] tier1 (x86_64, x86_32, aarch64, riscv) >> - [x] tier2 (x86_64, aarch64, riscv) >> - [x] tier3 (x86_64, riscv) > > Roman Kennke has updated the pull request incrementally with one additional commit since the last revision: > > RISCV fixes by @RealYFang PPC64 part looks ok. ------------- PR Comment: https://git.openjdk.org/jdk/pull/11044#issuecomment-1645529186 From duke at openjdk.org Fri Jul 21 13:39:43 2023 From: duke at openjdk.org (sid8606) Date: Fri, 21 Jul 2023 13:39:43 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Sat, 8 Jul 2023 10:48:15 GMT, sid8606 wrote: >> Implementation of "Foreign Function & Memory API" for s390x (Big Endian). > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Address suggestions from Jorn Vernee I am using below configure to build jdk --with-boot-jdk=~/jdk-20+36 \ --with-jtreg=../../jtreg/build/images/jtreg/ \ --with-gtest=../../googletest \ --with-debug-level=slowdebug \ --with-native-debug-symbols=internal \ With this build I see somehow all test cases are passing. In case any extra flags to set to jdk build let me know. 
`make run-test TEST="jdk/java/foreign" `on g9-z15, glibc 2.35 (patch applied to jdk head) TEST TOTAL PASS FAIL ERROR jtreg:test/jdk/java/foreign 88 88 0 0 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14801#issuecomment-1645601051 From jvernee at openjdk.org Fri Jul 21 15:08:46 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Fri, 21 Jul 2023 15:08:46 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Sat, 8 Jul 2023 10:48:15 GMT, sid8606 wrote: >> Implementation of "Foreign Function & Memory API" for s390x (Big Endian). > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Address suggestions from Jorn Vernee StdLibTest tests several standard library functions. I suggest comment out some of the test methods to narrow down which function is problematic. I suspect it's `printf`. If that's the case, make sure you have the fix for: https://bugs.openjdk.org/browse/JDK-8308031 as well ------------- PR Comment: https://git.openjdk.org/jdk/pull/14801#issuecomment-1645734481 From duke at openjdk.org Fri Jul 21 16:08:51 2023 From: duke at openjdk.org (duke) Date: Fri, 21 Jul 2023 16:08:51 GMT Subject: Withdrawn: 8308076: X86_64: make rheapbase register allocatable in zero based compressedOops mode In-Reply-To: References: Message-ID: On Mon, 15 May 2023 06:35:00 GMT, kuaiwei wrote: > In x86 64 mode, decode heap oop could use SIB without base if heap base is zero. like > > 0d1 movl R11, [,R9 << 3 + #72] (zero base compressed oop addressing) # compressed ptr ! Field: java/lang/ClassLoader.classAssertionStatus > > So rheapbase( r12 ) can be allocated as general register. > > Tier 1/2 tests are passed without new failure. This pull request has been closed without being integrated. 
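Editorial illustration of the addressing argument in the quoted PR description: a minimal sketch of compressed-oop decoding arithmetic, not HotSpot code. The function names are hypothetical, and null handling is deliberately ignored. With a zero heap base the decoded field address is just a scaled index plus a displacement, which is the shape of the `[,R9 << 3 + #72]` operand quoted above, so no register has to hold the base.

```cpp
#include <cassert>
#include <cstdint>

// General decode: address = base + (narrow << shift).
// Null checks are omitted for brevity.
inline std::uint64_t sketch_decode(std::uint32_t narrow, std::uint64_t base,
                                   unsigned shift) {
  return base + (static_cast<std::uint64_t>(narrow) << shift);
}

// Zero-based case: the field address needs only the narrow value, the
// shift, and the field offset, so it fits a base-less scaled-index operand.
inline std::uint64_t sketch_field_address_zero_based(std::uint32_t narrow,
                                                     unsigned shift,
                                                     std::uint64_t offset) {
  return (static_cast<std::uint64_t>(narrow) << shift) + offset;
}
```

For example, a narrow value of `0x10` with shift 3 and field offset 72 gives address `0x80 + 72` in the zero-based case, the same value `sketch_decode` produces with `base == 0` plus the offset.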
-------------

PR: https://git.openjdk.org/jdk/pull/13976

From jvernee at openjdk.org Fri Jul 21 17:24:45 2023
From: jvernee at openjdk.org (Jorn Vernee)
Date: Fri, 21 Jul 2023 17:24:45 GMT
Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3]
In-Reply-To:
References:
Message-ID:

On Sat, 8 Jul 2023 10:48:15 GMT, sid8606 wrote:

>> Implementation of "Foreign Function & Memory API" for s390x (Big Endian).
>
> sid8606 has updated the pull request incrementally with one additional commit since the last revision:
>
>   Address suggestions from Jorn Vernee

FWIW, I found a bug in StdLibTest while working on something else: the format string for `LONG` is `%d` while it should be `%lld`. This might mess up some printf implementation I think. This might fix your issue as well:

diff --git a/test/jdk/java/foreign/StdLibTest.java b/test/jdk/java/foreign/StdLibTest.java
index 0732f60525d9..c9ce60bfc09b 100644
--- a/test/jdk/java/foreign/StdLibTest.java
+++ b/test/jdk/java/foreign/StdLibTest.java
@@ -120,16 +119,20 @@ void test_rand() throws Throwable {
     @Test(dataProvider = "printfArgs")
     void test_printf(List args) throws Throwable {
-        String formatArgs = args.stream()
-                .map(a -> a.format)
+        String javaFormatArgs = args.stream()
+                .map(a -> a.javaFormat)
+                .collect(Collectors.joining(","));
+        String nativeFormatArgs = args.stream()
+                .map(a -> a.nativeFormat)
                 .collect(Collectors.joining(","));
-        String formatString = "hello(" + formatArgs + ")\n";
+        String javaFormatString = "hello(" + javaFormatArgs + ")\n";
+        String nativeFormatString = "hello(" + nativeFormatArgs + ")\n";
-        String expected = String.format(formatString, args.stream()
+        String expected = String.format(javaFormatString, args.stream()
                 .map(a -> a.javaValue).toArray());
-        int found = stdLibHelper.printf(formatString, args);
+        int found = stdLibHelper.printf(nativeFormatString, args);
         assertEquals(found, expected.length());
     }
@@ -377,21 +385,24 @@ public static Object[][] printfArgs() {
     }
     enum PrintfArg {
-        INT(int.class, C_INT, "%d", arena -> 42, 42),
-        LONG(long.class, C_LONG_LONG, "%d", arena -> 84L, 84L),
-        DOUBLE(double.class, C_DOUBLE, "%.4f", arena -> 1.2345d, 1.2345d),
-        STRING(MemorySegment.class, C_POINTER, "%s", arena -> arena.allocateFrom("str"), "str");
+        INT(int.class, C_INT, "%d", "%d", arena -> 42, 42),
+        LONG(long.class, C_LONG_LONG, "%lld", "%d", arena -> 84L, 84L),
+        DOUBLE(double.class, C_DOUBLE, "%.4f", "%.4f", arena -> 1.2345d, 1.2345d),
+        STRING(MemorySegment.class, C_POINTER, "%s", "%s", arena -> arena.allocateFrom("str"), "str");
         final Class carrier;
         final ValueLayout layout;
-        final String format;
+        final String nativeFormat;
+        final String javaFormat;
         final Function nativeValueFactory;
         final Object javaValue;
-        PrintfArg(Class carrier, L layout, String format, Function nativeValueFactory, Object javaValue) {
+        PrintfArg(Class carrier, L layout, String nativeFormat, String javaFormat,
+                  Function nativeValueFactory, Object javaValue) {
             this.carrier = carrier;
             this.layout = layout;
-            this.format = format;
+            this.nativeFormat = nativeFormat;
+            this.javaFormat = javaFormat;
             this.nativeValueFactory = nativeValueFactory;
             this.javaValue = javaValue;
         }

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14801#issuecomment-1646020324

From iklam at openjdk.org Fri Jul 21 19:57:12 2023
From: iklam at openjdk.org (Ioi Lam)
Date: Fri, 21 Jul 2023 19:57:12 GMT
Subject: RFR: 8308903: Print detailed info for Java objects in -Xlog:cds+map [v5]
In-Reply-To:
References:
Message-ID:

> This PR adds detailed printing of oop information with `-Xlog:cds+map+oop=trace`, or simply `-Xlog:cds+map*=trace`. The information is useful for debugging contents of the CDS archived heap objects.
>
> The output looks like `oopDesc::print_on(tty)`, but we need to print the pointers using the locations of the objects at runtime. The examples below show how a `String` references its `value` array.
>
> Example with `-XX:-UseCompressedOops`:
>
>
> 0x00000000100001f0: @@ Object java.lang.String
> 0x00000000100001f0: 0000006ff6ab8d01 88d47c5b00010290 0000000000010000 0000000010000210
> - klass: 'java/lang/String' 0x0000000800010290
> - ---- fields (total size 4 words):
> - private 'hash' 'I' @12 -1999340453 (0x88d47c5b)
> - private final 'coder' 'B' @16 0 (0x00)
> - private 'hashIsZero' 'Z' @17 false (0x00)
> - injected 'flags' 'B' @18 1 (0x01)
> - private final 'value' '[B' @24 0x0000000010000210 [B length: 6
> 0x0000000010000210: @@ Object [B length: 6
> 0x0000000010000210: 000000693b708001 00000006000024c8 0000574f5252414e
> - klass: {type array byte} 0x00000008000024c8
> - 0: 4e N
> - 1: 41 A
> - 2: 52 R
> - 3: 52 R
> - 4: 4f O
> - 5: 57 W
>
>
> Example with `-XX:+UseCompressedOops`. Note that the narrowOop is also printed:
>
>
> 0x00000007ffc001b8: @@ Object (0xfff80037) java.lang.String
> 0x00000007ffc001b8: f6ab8d01 0000006f 00010290 88d47c5b 00010000 fff8003a
> - klass: 'java/lang/String' 0x0000000800010290
> - ---- fields (total size 3 words):
> - private 'hash' 'I' @12 -1999340453 (0x88d47c5b)
> - private final 'coder' 'B' @16 0 (0x00)
> - private 'hashIsZero' 'Z' @17 false (0x00)
> - injected 'flags' 'B' @18 1 (0x01)
> - private final 'value' '[B' @20 0x00000007ffc001d0 (0xfff8003a) [B length: 6
> 0x00000007ffc001d0: @@ Object (0xfff8003a) [B length: 6
> 0x00000007ffc001d0: 3b708001 00000069 000024c8 00000006 5252414e 0000574f
> - klass: {type array byte} 0x00000008000024c8
> - 0: 4e N
> - 1: 41 A
> - 2: 52 R
> - 3: 52 R
> - 4: 4f O
> - 5: 57 W

Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains six additional commits since the last revision: - Merge branch 'master' into 8308903-cds-map-detailed-info-for-oops - added hints in test case - added test case - Merge branch 'master' into 8308903-cds-map-detailed-info-for-oops - @tstuefe and @matias9927 comments - 8308903: Print detailed info for Java objects in -Xlog:cds+map ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14841/files - new: https://git.openjdk.org/jdk/pull/14841/files/2f916d4c..dcd4c53b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14841&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14841&range=03-04 Stats: 21371 lines in 330 files changed: 10534 ins; 10070 del; 767 mod Patch: https://git.openjdk.org/jdk/pull/14841.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14841/head:pull/14841 PR: https://git.openjdk.org/jdk/pull/14841 From matsaave at openjdk.org Fri Jul 21 20:04:56 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Fri, 21 Jul 2023 20:04:56 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v9] In-Reply-To: References: Message-ID: > 8301996: Move field resolution information out of the cpCache Matias Saavedra Silva has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: - Martin and Fei comments - Merge branch 'master' into field_entry_8301996 - RISCV port - Interpreter fix and cleanup - Fred comments x86 - Frederic comments - Register naming improved - Fixed test and is_resolved - Merge branch 'master' into field_entry_8301996 - Coleen and Amit comments - ... 
and 1 more: https://git.openjdk.org/jdk/compare/af7f95e2...51e65f75 ------------- Changes: https://git.openjdk.org/jdk/pull/14129/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=08 Stats: 1313 lines in 42 files changed: 826 ins; 159 del; 328 mod Patch: https://git.openjdk.org/jdk/pull/14129.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14129/head:pull/14129 PR: https://git.openjdk.org/jdk/pull/14129 From matsaave at openjdk.org Fri Jul 21 21:14:17 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Fri, 21 Jul 2023 21:14:17 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v10] In-Reply-To: References: Message-ID: > 8301996: Move field resolution information out of the cpCache Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Missing semicolon ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14129/files - new: https://git.openjdk.org/jdk/pull/14129/files/51e65f75..8ae1e1c1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=08-09 Stats: 7 lines in 1 file changed: 0 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/14129.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14129/head:pull/14129 PR: https://git.openjdk.org/jdk/pull/14129 From dholmes at openjdk.org Fri Jul 21 21:20:59 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 21 Jul 2023 21:20:59 GMT Subject: RFR: JDK-8293114: JVM should trim the native heap [v14] In-Reply-To: References: <3m-t4CAcgNIvaWroKPD-k2qcTyiOAUppUDSAokuV9Fo=.41a94f1a-03df-43a5-8b91-c18a622a7526@github.com> Message-ID: On Fri, 21 Jul 2023 12:20:25 GMT, Thomas Stuefe wrote: >> Marked as reviewed by rehn (Reviewer). > > Many thanks @robehn, @shipilev and @dholmes-ora ! > > Feels good to have this finally in. @tstuefe unfortunately the test is failing intermittently in our CI. 
I will file a bug and assign it to you. Likely we will need a quick ProblemListing though. Thanks ------------- PR Comment: https://git.openjdk.org/jdk/pull/14781#issuecomment-1646247948 From sspitsyn at openjdk.org Fri Jul 21 22:40:02 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 21 Jul 2023 22:40:02 GMT Subject: [jdk21] RFR: 8300051: assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist Message-ID: This is a clean 21 backport of the 22 fix: [JDK-8300051](https://bugs.openjdk.org/browse/JDK-8300051): assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist Testing: - TBD: mach5 tiers 1-5 ------------- Commit messages: - Backport 783de32b6af4383b5ba71b91c307a5dddd0dae13 Changes: https://git.openjdk.org/jdk21/pull/143/files Webrev: https://webrevs.openjdk.org/?repo=jdk21&pr=143&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8300051 Stats: 26 lines in 2 files changed: 14 ins; 12 del; 0 mod Patch: https://git.openjdk.org/jdk21/pull/143.diff Fetch: git fetch https://git.openjdk.org/jdk21.git pull/143/head:pull/143 PR: https://git.openjdk.org/jdk21/pull/143 From jwaters at openjdk.org Sat Jul 22 02:57:11 2023 From: jwaters at openjdk.org (Julian Waters) Date: Sat, 22 Jul 2023 02:57:11 GMT Subject: RFR: 8312502: Mass migrate HotSpot attributes to the correct location Message-ID: Someone had to do it, so I did. Moves attributes to the correct place as specified in the HotSpot Style Guide once and for all ------------- Commit messages: - arguments.hpp - arguments.hpp - globalDefinitions_gcc.hpp - assembler_aarch64.hpp - macroAssembler_aarch64.cpp - vmError.cpp - vmError.cpp - macroAssembler_aarch64.cpp - assembler_aarch64.hpp - os_linux.cpp - ... 
and 29 more: https://git.openjdk.org/jdk/compare/8cd43bff...58b52fce Changes: https://git.openjdk.org/jdk/pull/14969/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14969&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8312502 Stats: 170 lines in 34 files changed: 60 ins; 0 del; 110 mod Patch: https://git.openjdk.org/jdk/pull/14969.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14969/head:pull/14969 PR: https://git.openjdk.org/jdk/pull/14969 From dzhang at openjdk.org Sat Jul 22 06:06:01 2023 From: dzhang at openjdk.org (Dingli Zhang) Date: Sat, 22 Jul 2023 06:06:01 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v10] In-Reply-To: References: Message-ID: On Fri, 21 Jul 2023 21:14:17 GMT, Matias Saavedra Silva wrote: >> 8301996: Move field resolution information out of the cpCache > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Missing semicolon Thank you for your update! Similar to aarch64, it would be better to add `tos_state` in `putfield_or_static` as well on RISC-V. Please help us to add the patch, thanks a lot! https://github.com/DingliZhang/jdk/commit/1fd29d6315c8bbca387b35c095e3dd0f4979fdb2 on branch https://github.com/DingliZhang/jdk/tree/pr-14129-putfield_or_static ------------- PR Comment: https://git.openjdk.org/jdk/pull/14129#issuecomment-1646499614 From duke at openjdk.org Sat Jul 22 06:15:44 2023 From: duke at openjdk.org (sid8606) Date: Sat, 22 Jul 2023 06:15:44 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: <4-SkKe526mi_69V5zOsgHsLNTp1_vHzNaPFJluPXppM=.4ed68247-ac70-41b1-a4b6-e0daf864ab9f@github.com> On Sat, 8 Jul 2023 10:48:15 GMT, sid8606 wrote: >> Implementation of "Foreign Function & Memory API" for s390x (Big Endian). 
> > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Address suggestions from Jorn Vernee I have narrowed down the issue failing on glibc 2.31 but passes on glibc 2.35 on s390x. Two tests methods are failing **test_time** and **test_qsort**. @JornVernee test_printf test method is passing as it is. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14801#issuecomment-1646500956 From stuefe at openjdk.org Sat Jul 22 07:02:53 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 22 Jul 2023 07:02:53 GMT Subject: RFR: 8308903: Print detailed info for Java objects in -Xlog:cds+map [v5] In-Reply-To: References: Message-ID: On Fri, 21 Jul 2023 19:57:12 GMT, Ioi Lam wrote: >> This PR adds detailed printing of oop information with `-Xlog:cds+map+oop=trace`, or simply `-Xlog:cds+map*=trace`. The information is useful for debugging contents of the CDS archived heap objects. >> >> The output looks like `oopDesc::print_on(tty)`, but we need to print the pointers using the locations of the objects at runtime. The examples below show how a `String` references its `value` array. >> >> Example with `-XX:-UseCompressedOops`: >> >> >> 0x00000000100001f0: @@ Object java.lang.String >> 0x00000000100001f0: 0000006ff6ab8d01 88d47c5b00010290 0000000000010000 0000000010000210 >> - klass: 'java/lang/String' 0x0000000800010290 >> - ---- fields (total size 4 words): >> - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) >> - private final 'coder' 'B' @16 0 (0x00) >> - private 'hashIsZero' 'Z' @17 false (0x00) >> - injected 'flags' 'B' @18 1 (0x01) >> - private final 'value' '[B' @24 0x0000000010000210 [B length: 6 >> 0x0000000010000210: @@ Object [B length: 6 >> 0x0000000010000210: 000000693b708001 00000006000024c8 0000574f5252414e >> - klass: {type array byte} 0x00000008000024c8 >> - 0: 4e N >> - 1: 41 A >> - 2: 52 R >> - 3: 52 R >> - 4: 4f O >> - 5: 57 W >> >> >> Example with `-XX:+UseCompressedOops`. 
Note that the narrorOop is also printed: >> >> >> 0x00000007ffc001b8: @@ Object (0xfff80037) java.lang.String >> 0x00000007ffc001b8: f6ab8d01 0000006f 00010290 88d47c5b 00010000 fff8003a >> - klass: 'java/lang/String' 0x0000000800010290 >> - ---- fields (total size 3 words): >> - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) >> - private final 'coder' 'B' @16 0 (0x00) >> - private 'hashIsZero' 'Z' @17 false (0x00) >> - injected 'flags' 'B' @18 1 (0x01) >> - private final 'value' '[B' @20 0x00000007ffc001d0 (0xfff8003a) [B length: 6 >> 0x00000007ffc001d0: @@ Object (0xfff8003a) [B length: 6 >> 0x00000007ffc001d0: 3b708001 00000069 000024c8 00000006 5252414e 0000574f >> - klass: {type array byte} 0x00000008000024c8 >> - 0: 4e N >> - 1: 41 A >> - 2: 52 R >> - 3: 52 R >> - 4: 4f O >> - 5: 57 W > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains six additional commits since the last revision: > > - Merge branch 'master' into 8308903-cds-map-detailed-info-for-oops > - added hints in test case > - added test case > - Merge branch 'master' into 8308903-cds-map-detailed-info-for-oops > - @tstuefe and @matias9927 comments > - 8308903: Print detailed info for Java objects in -Xlog:cds+map This looks good to me. Nice addition! ------------- Marked as reviewed by stuefe (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14841#pullrequestreview-1542060974 From mdoerr at openjdk.org Sat Jul 22 13:07:57 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Sat, 22 Jul 2023 13:07:57 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v10] In-Reply-To: References: Message-ID: On Fri, 21 Jul 2023 21:14:17 GMT, Matias Saavedra Silva wrote: >> 8301996: Move field resolution information out of the cpCache > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Missing semicolon src/hotspot/cpu/riscv/templateTable_riscv.cpp line 180: > 178: } > 179: // Load-acquire the bytecode to match store-release in ResolvedFieldEntry::fill_in() > 180: __ membar(MacroAssembler::AnyAny); Why do you need this membar? src/hotspot/cpu/riscv/templateTable_riscv.cpp line 2219: > 2217: } > 2218: // Load-acquire the bytecode to match store-release in ResolvedFieldEntry::fill_in() > 2219: __ membar(MacroAssembler::AnyAny); Same here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1271294236 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1271294300 From mdoerr at openjdk.org Sat Jul 22 13:19:46 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Sat, 22 Jul 2023 13:19:46 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v10] In-Reply-To: References: Message-ID: On Fri, 21 Jul 2023 21:14:17 GMT, Matias Saavedra Silva wrote: >> 8301996: Move field resolution information out of the cpCache > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Missing semicolon src/hotspot/share/oops/resolvedFieldEntry.hpp line 53: > 51: u2 _field_index; // Index into field information in holder InstanceKlass > 52: u2 _cpool_index; // Constant pool index > 53: u1 _tos; // TOS state `_tos_state` would avoid confusion. 
src/hotspot/share/oops/resolvedFieldEntry.hpp line 125: > 123: _tos = tos; > 124: > 125: // This has to be done last Maybe better use `OrderAccess::release()` here instead of 2x `Atomic::release_store` above? That would fit to the comment. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1271295154 PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1271294973 From swpalmer at gmail.com Sat Jul 22 15:15:48 2023 From: swpalmer at gmail.com (Scott Palmer) Date: Sat, 22 Jul 2023 11:15:48 -0400 Subject: Unexpected performance of operator % vs & Message-ID: I hope this is the appropriate list for this question. Given the following Java code to test if a number is even or odd I observed unexpected results. boolean evenA = ((i % 2) == 0); boolean evenB = ((i & 1) == 0); I expect the bitwise AND to be the fastest, as modulo operations are generally slower. The masking operation should never take more CPU cycles than the modulo operation. This in fact is true, but only until the code is JIT compiled, and then the performance flips and the modulo version is notably faster. This remains the case for checking if 'i' is evenly divisible by 4, using (i % 4) vs (i & 3). Only when I get to checking for divisibility by 8 using (i % 8) vs (i & 7) do I see the performance shift to masking's favour after JIT compiling. I suspect there is an optimization somewhere in the JIT compiler that sees the modulo 2 pattern and outputs optimized code that is not in fact doing a modulo calculation. What I don't understand is how it ends up faster than the bit-mask version. The JIT compiler appears to be undoing my attempted optimization. Am I making a mistake here (other than assuming what is faster before profiling)? Is this something that could be improved/fixed in the compiler? 
Regards,

Scott

My simple experiment:

public static void main(String [] args) {
    for (int i = 0; i < 10; i++) {
        long start = System.nanoTime();
        long maskCount = mask();
        var maskTime = Duration.ofNanos(System.nanoTime()-start);
        System.out.printf("%d mask method took: %s%n", maskCount, maskTime);
        start = System.nanoTime();
        long moduloCount = modulo();
        var moduloTime = Duration.ofNanos(System.nanoTime()-start);
        System.out.printf("%d modulo method took: %s%n", moduloCount, moduloTime);
        System.out.println("fastest: " + ((maskTime.compareTo(moduloTime) < 0) ? "MASK" : "MODULO"));
    }
}

static long modulo () {
    long count = 0;
    for (int i = 0; i < 2_000_000_000; i++) {
        if ((i % 2) == 0)
            count++;
    }
    return count;
}

static long mask() {
    long count = 0;
    for (int i = 0; i < 2_000_000_000; i++) {
        if ((i & 1) == 0)
            count++;
    }
    return count;
}

From forax at univ-mlv.fr Sat Jul 22 15:24:08 2023
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 22 Jul 2023 17:24:08 +0200 (CEST)
Subject: Unexpected performance of operator % vs &
In-Reply-To:
References:
Message-ID: <1553227735.1831501.1690039448490.JavaMail.zimbra@univ-eiffel.fr>

Please use JMH for your performance tests, otherwise what you see may just be an artifact of the way you have written the rest of the code, not the code you want to test.

regards,
Rémi

> From: "Scott Palmer"
> To: "hotspot-dev"
> Sent: Saturday, July 22, 2023 5:15:48 PM
> Subject: Unexpected performance of operator % vs &
> I hope this is the appropriate list for this question.
> Given the following Java code to test if a number is even or odd I observed
> unexpected results.
> boolean evenA = ((i % 2) == 0);
> boolean evenB = ((i & 1) == 0);
> I expect the bitwise AND to be the fastest, as modulo operations are generally
> slower. The masking operation should never take more CPU cycles than the modulo
> operation.
> This in fact is true, but only until the code is JIT compiled, and then the
> performance flips and the modulo version is notably faster.
> This remains the case for checking if 'i' is evenly divisible by 4, using (i %
> 4) vs (i & 3). Only when I get to checking for divisibility by 8 using (i % 8)
> vs (i & 7) do I see the performance shift to masking's favour after JIT
> compiling.
> I suspect there is an optimization somewhere in the JIT compiler that sees the
> modulo 2 pattern and outputs optimized code that is not in fact doing a modulo
> calculation. What I don't understand is how it ends up faster than the bit-mask
> version. The JIT compiler appears to be undoing my attempted optimization.
> Am I making a mistake here (other than assuming what is faster before
> profiling)?
> Is this something that could be improved/fixed in the compiler?
> Regards,
> Scott
> My simple experiment:
> public static void main(String [] args) {
>     for (int i = 0; i < 10; i++) {
>         long start = System.nanoTime();
>         long maskCount = mask();
>         var maskTime = Duration.ofNanos(System.nanoTime()-start);
>         System.out.printf("%d mask method took: %s%n", maskCount, maskTime);
>         start = System.nanoTime();
>         long moduloCount = modulo();
>         var moduloTime = Duration.ofNanos(System.nanoTime()-start);
>         System.out.printf("%d modulo method took: %s%n", moduloCount, moduloTime);
>         System.out.println("fastest: " + ((maskTime.compareTo(moduloTime) < 0) ? "MASK"
>         : "MODULO"));
>     }
> }
> static long modulo () {
>     long count = 0;
>     for (int i = 0; i < 2_000_000_000; i++) {
>         if ((i % 2) == 0)
>             count++;
>     }
>     return count;
> }
> static long mask() {
>     long count = 0;
>     for (int i = 0; i < 2_000_000_000; i++) {
>         if ((i & 1) == 0)
>             count++;
>     }
>     return count;
> }
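
For reference, a minimal sketch (not taken from any message in this thread) of why the two parity tests in the question are interchangeable for every `int`, and of one branch-free way `i % 2^k` could be computed without a division instruction. The class name `ParityCheck` and the helper `remPow2` are hypothetical, and nothing here is claimed to be what HotSpot's C1 or C2 actually emits; a JMH run, as recommended above, is still needed to see the real behaviour.

```java
// Hypothetical illustration; not code from the thread or from HotSpot.
class ParityCheck {
    // Parity test with the remainder operator, as in the original post.
    static boolean evenByModulo(int i) {
        return (i % 2) == 0;
    }

    // Parity test with a bit mask, as in the original post.
    static boolean evenByMask(int i) {
        return (i & 1) == 0;
    }

    // Assumed helper: computes i % (1 << k) for 1 <= k <= 30 without dividing.
    // Java's % truncates toward zero, so negative inputs need a bias before masking.
    static int remPow2(int i, int k) {
        int mask = (1 << k) - 1;
        int bias = (i >> 31) & mask;      // mask when i is negative, otherwise 0
        return ((i + bias) & mask) - bias;
    }

    public static void main(String[] args) {
        int[] samples = {Integer.MIN_VALUE, -5, -4, -1, 0, 1, 4, 5, Integer.MAX_VALUE};
        for (int i : samples) {
            System.out.printf("%d: modulo=%b mask=%b i%%8=%d remPow2=%d%n",
                    i, evenByModulo(i), evenByMask(i), i % 8, remPow2(i, 3));
        }
    }
}
```

Because a parity test only compares the remainder with zero, the sign fix-up in `remPow2` is not needed there, which is one plausible reason the two loops could end up compiled to very similar code.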
From aph-open at littlepinkcloud.com Sat Jul 22 17:16:44 2023
From: aph-open at littlepinkcloud.com (Andrew Haley)
Date: Sat, 22 Jul 2023 18:16:44 +0100
Subject: Unexpected performance of operator % vs &
In-Reply-To: <1553227735.1831501.1690039448490.JavaMail.zimbra@univ-eiffel.fr>
References: <1553227735.1831501.1690039448490.JavaMail.zimbra@univ-eiffel.fr>
Message-ID: <6551a16b-a793-1caa-5e2c-8b7c6a482ca5@littlepinkcloud.com>

On 7/22/23 16:24, Remi Forax wrote:
> Please use JMH for your performance tests, otherwise what you see may just be an artifact of the way you have written the rest of the code, not the code you want to test.

And with "-prof perfasm" you'll be able to see what the optimizer did.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From swpalmer at gmail.com Sat Jul 22 22:32:37 2023
From: swpalmer at gmail.com (Scott Palmer)
Date: Sat, 22 Jul 2023 18:32:37 -0400
Subject: Unexpected performance of operator % vs &
In-Reply-To: <1553227735.1831501.1690039448490.JavaMail.zimbra@univ-eiffel.fr>
References: <1553227735.1831501.1690039448490.JavaMail.zimbra@univ-eiffel.fr>
Message-ID:

Yes, I know. Though I provided "the rest of the code" in my email. But I should have just used JMH in the first place.

I tested with JMH on two machines and got very different results. To summarize:

macOS Intel i7 JDK 20.0.1   mask: 1.082 ops/s   modulo: 1.080 ops/s
macOS Intel i7 JDK 17.0.7   mask: 1.040 ops/s   modulo: 0.815 ops/s
Linux Intel i5 JDK 20.0.1   mask: 0.759 ops/s   modulo: 1.070 ops/s
Linux Intel i5 JDK 17.0.8   mask: 0.738 ops/s   modulo: 1.039 ops/s

macOS:
CPU: Quad-Core Intel Core i7-7700K
OpenJDK 64-Bit Server VM Zulu17.42+19-CA (build 17.0.7+7-LTS, mixed mode, sharing)
OpenJDK 64-Bit Server VM Zulu20.30+11-CA (build 20.0.1+9, mixed mode, sharing)

Linux:
CPU: 12th Gen Intel® Core™ i5-1245U × 12
OpenJDK 64-Bit Server VM Corretto-17.0.8.7.1 (build 17.0.8+7-LTS, mixed mode, sharing)
OpenJDK 64-Bit Server VM Temurin-20.0.1+9 (build 20.0.1+9, mixed mode, sharing)

On Linux the results are the opposite of what I would expect. Interestingly JDK 20.0.1 on macOS had no significant difference between the two methods, but on Linux it did.

Regards,

Scott

On Sat, Jul 22, 2023 at 11:24 AM Remi Forax wrote:

> Please use JMH for your performance tests, otherwise what you see may just
> be an artifact of the way you have written the rest of the code, not the
> code you want to test.
>
> regards,
> Rémi
>
> ------------------------------
>
> *From: *"Scott Palmer"
> *To: *"hotspot-dev"
> *Sent: *Saturday, July 22, 2023 5:15:48 PM
> *Subject: *Unexpected performance of operator % vs &
>
> I hope this is the appropriate list for this question.
>
> Given the following Java code to test if a number is even or odd I
> observed unexpected results.
>
> boolean evenA = ((i % 2) == 0);
>
> boolean evenB = ((i & 1) == 0);
>
> I expect the bitwise AND to be the fastest, as modulo operations are
> generally slower. The masking operation should never take more CPU cycles
> than the modulo operation.
> This in fact is true, but only until the code is JIT compiled, and then
> the performance flips and the modulo version is notably faster.
>
> This remains the case for checking if 'i' is evenly divisible by 4, using
> (i % 4) vs (i & 3). Only when I get to checking for divisibility by 8
> using (i % 8) vs (i & 7) do I see the performance shift to masking's favour
> after JIT compiling.
>
> I suspect there is an optimization somewhere in the JIT compiler that sees
> the modulo 2 pattern and outputs optimized code that is not in fact doing a
> modulo calculation. What I don't understand is how it ends up faster than
> the bit-mask version. The JIT compiler appears to be undoing my attempted
> optimization.
>
> Am I making a mistake here (other than assuming what is faster before
> profiling)?
> Is this something that could be improved/fixed in the compiler?
>
> Regards,
>
> Scott
>
> My simple experiment:
>
> public static void main(String [] args) {
>     for (int i = 0; i < 10; i++) {
>         long start = System.nanoTime();
>         long maskCount = mask();
>         var maskTime = Duration.ofNanos(System.nanoTime()-start);
>         System.out.printf("%d mask method took: %s%n", maskCount,
>         maskTime);
>         start = System.nanoTime();
>         long moduloCount = modulo();
>         var moduloTime = Duration.ofNanos(System.nanoTime()-start);
>         System.out.printf("%d modulo method took: %s%n", moduloCount,
>         moduloTime);
>         System.out.println("fastest: " + ((maskTime.compareTo(moduloTime) <
>         0) ? "MASK" : "MODULO"));
>     }
> }
> static long modulo () {
>     long count = 0;
>     for (int i = 0; i < 2_000_000_000; i++) {
>         if ((i % 2) == 0)
>             count++;
>     }
>     return count;
> }
> static long mask() {
>     long count = 0;
>     for (int i = 0; i < 2_000_000_000; i++) {
>         if ((i & 1) == 0)
>             count++;
>     }
>     return count;
> }
>
>

From iklam at openjdk.org Sun Jul 23 03:37:53 2023
From: iklam at openjdk.org (Ioi Lam)
Date: Sun, 23 Jul 2023 03:37:53 GMT
Subject: RFR: JDK-8312018: Improve zero-base-optimized reservation of class space [v4]
In-Reply-To:
References:
Message-ID:

On Fri, 21 Jul 2023 05:48:57 GMT, Thomas Stuefe wrote:

>> src/hotspot/share/memory/metaspace.cpp line 597:
>>
>>> 595: {
>>> 596: // First try for zero-base zero-shift (lower 4G); failing that, try for zero-based with max shift (lower 32G)
>>> 597: constexpr int num_tries = 8;
>>
>> num_tries should be computed instead of hard-coded.
>
> Why? In pre-existing code that does similar things, we always hardcode them implicitly (typically by attempt-mapping from A->B in hardcoded stride C). And how would I calculate it?

Hmm, I am a bit unsure about the meaning of the `num_tries` parameter.
It looks like Windows and Linux will scan the system's memory map and look for the first hole that's larger than `size`. However, if there are a lot of small holes at lower addresses, then doing 8 tries won't find a large enough block, even though such a block may exist below `unscaled_max`. The default implementation will try to find a free block with fixed steps. In this case, it seems like num_tries = (unscaled_max + size - 1) / size; would be more appropriate. Otherwise if `size` changes in the future (to be smaller), again you won't be able to find an appropriate block, even if one exists. So I am not sure if the caller passing in an arbitrary `num_tries` parameter is a good idea. Maybe the API needs to be redesigned. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14867#discussion_r1271375406 From jwaters at openjdk.org Sun Jul 23 05:50:58 2023 From: jwaters at openjdk.org (Julian Waters) Date: Sun, 23 Jul 2023 05:50:58 GMT Subject: RFR: 8312502: Mass migrate HotSpot attributes to the correct location [v2] In-Reply-To: References: Message-ID: > Someone had to do it, so I did. Moves attributes to the correct place as specified in the HotSpot Style Guide once and for all Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 40 additional commits since the last revision: - Merge branch 'openjdk:master' into patch-6 - arguments.hpp - arguments.hpp - globalDefinitions_gcc.hpp - assembler_aarch64.hpp - macroAssembler_aarch64.cpp - vmError.cpp - vmError.cpp - macroAssembler_aarch64.cpp - assembler_aarch64.hpp - ... 
and 30 more: https://git.openjdk.org/jdk/compare/c685c4b6...afff56f2 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14969/files - new: https://git.openjdk.org/jdk/pull/14969/files/58b52fce..afff56f2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14969&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14969&range=00-01 Stats: 18185 lines in 210 files changed: 7290 ins; 10195 del; 700 mod Patch: https://git.openjdk.org/jdk/pull/14969.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14969/head:pull/14969 PR: https://git.openjdk.org/jdk/pull/14969 From mdoerr at openjdk.org Sun Jul 23 10:14:46 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Sun, 23 Jul 2023 10:14:46 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v10] In-Reply-To: References: Message-ID: On Fri, 21 Jul 2023 21:14:17 GMT, Matias Saavedra Silva wrote: >> 8301996: Move field resolution information out of the cpCache > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Missing semicolon Thanks for improving readability! An initial PPC64 part is here: https://github.com/TheRealMDoerr/jdk/commit/612ee9dcd4a50c0f3eb7e1289789b96468acbb2c. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14129#issuecomment-1646800813 From kbarrett at openjdk.org Sun Jul 23 11:53:59 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Sun, 23 Jul 2023 11:53:59 GMT Subject: RFR: 8312502: Mass migrate HotSpot attributes to the correct location [v2] In-Reply-To: References: Message-ID: <6TM5USxRwOvQTy8muWykwuq_o0b-2nCM_8-fFoLGVIg=.ae329b8d-8781-407b-ba4a-fb8d8abe685c@github.com> On Sun, 23 Jul 2023 05:50:58 GMT, Julian Waters wrote: >> Someone had to do it, so I did. Moves attributes to the correct place as specified in the HotSpot Style Guide once and for all > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 40 additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-6 > - arguments.hpp > - arguments.hpp > - globalDefinitions_gcc.hpp > - assembler_aarch64.hpp > - macroAssembler_aarch64.cpp > - vmError.cpp > - vmError.cpp > - macroAssembler_aarch64.cpp > - assembler_aarch64.hpp > - ... and 30 more: https://git.openjdk.org/jdk/compare/65e01d8a...afff56f2 Why? What is the benefit from this that makes the resulting code churn worthwhile? We already discussed this kind of code churn a bit circa https://github.com/openjdk/jdk/pull/11081#issuecomment-1313274792 and didn't like it then. I don't see anything to change that. The style guide only talks about the new C++ `[[attribute...]]` syntax, which has a couple of valid locations. These are all gcc `__attribute__` and MSVC `__declspec`, and are often located in places where the new syntax isn't permitted. Moving the non-standard "attributes" around has the potential to change semantics. I don't know that it does, but this PR should contain discussion and references to documentation showing it doesn't. *If* it is to be done, there are some former one-liners that have been made multi-line by moving an attribute macro, where there were multiple in a cluster with no blank lines between them. (Unwritten) HotSpot style only elides whitespace between declarations when they are all one-liners. I think I prefer preceding attributes to be on their own line, rather than on the same line as the declaration, so I can mostly skip over them as I'm reading the code. But that might be just me; I don't think that's been discussed by the Group. It probably should have been brought up when permitting attributes was added to the style guide, but it doesn't look like that happened. 
src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 38: > 36: #ifdef __GNUC__ > 37: > 38: // ISO C++ asm is always implicitly volatile I can find no evidence for this claim, and it seems to me likely incorrect. This is also way outside the described scope of this PR. src/hotspot/share/c1/c1_CFGPrinter.hpp line 66: > 64: void dec_indent(); > 65: ATTRIBUTE_PRINTF(2, 3) > 66: void print(const char* format, ...); This is an example where rearranging the attributes is out of character with usual practice. And I think it makes it harder to read. src/hotspot/share/compiler/compileLog.hpp line 75: > 73: > 74: ATTRIBUTE_PRINTF(2, 3) > 75: void set_context(const char* format, ...); Whitespace between return type and function name is pretty pointless here. Also above. src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 161: > 159: #define NOINLINE [[gnu::noinline]] > 160: #define ALWAYSINLINE [[gnu::always_inline]] inline > 161: #define ATTRIBUTE_FLATTEN [[gnu::flatten]] This is way beyond the described scope of this PR. src/hotspot/share/utilities/xmlstream.hpp line 149: > 147: void text(const char* format, ...); > 148: ATTRIBUTE_PRINTF(2, 0) > 149: void va_text(const char* format, va_list ap) { This file is a particularly bad (to me) example of what happens without whitespace between a declaration and the attributes for the next declaration. I find this really hard to parse. And the extra whitespace following return types makes it even worse for me. ------------- Changes requested by kbarrett (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14969#pullrequestreview-1542234010 PR Review Comment: https://git.openjdk.org/jdk/pull/14969#discussion_r1271430790 PR Review Comment: https://git.openjdk.org/jdk/pull/14969#discussion_r1271431103 PR Review Comment: https://git.openjdk.org/jdk/pull/14969#discussion_r1271431994 PR Review Comment: https://git.openjdk.org/jdk/pull/14969#discussion_r1271432593 PR Review Comment: https://git.openjdk.org/jdk/pull/14969#discussion_r1271433180 From jwaters at openjdk.org Sun Jul 23 12:38:47 2023 From: jwaters at openjdk.org (Julian Waters) Date: Sun, 23 Jul 2023 12:38:47 GMT Subject: RFR: 8312502: Mass migrate HotSpot attributes to the correct location [v2] In-Reply-To: <6TM5USxRwOvQTy8muWykwuq_o0b-2nCM_8-fFoLGVIg=.ae329b8d-8781-407b-ba4a-fb8d8abe685c@github.com> References: <6TM5USxRwOvQTy8muWykwuq_o0b-2nCM_8-fFoLGVIg=.ae329b8d-8781-407b-ba4a-fb8d8abe685c@github.com> Message-ID: <2f0Rc29Pp42L2yJD43ynG-z0bZ65EyovRNU8D1IiC5o=.d9d98e2e-5405-4430-ab42-4570435a5468@github.com> On Sun, 23 Jul 2023 11:32:55 GMT, Kim Barrett wrote: >> Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 40 additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into patch-6 >> - arguments.hpp >> - arguments.hpp >> - globalDefinitions_gcc.hpp >> - assembler_aarch64.hpp >> - macroAssembler_aarch64.cpp >> - vmError.cpp >> - vmError.cpp >> - macroAssembler_aarch64.cpp >> - assembler_aarch64.hpp >> - ... and 30 more: https://git.openjdk.org/jdk/compare/67e2bb59...afff56f2 > > src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 38: > >> 36: #ifdef __GNUC__ >> 37: >> 38: // ISO C++ asm is always implicitly volatile > > I can find no evidence for this claim, and it seems to me likely incorrect. This is also way outside the > described scope of this PR. 
Hi Kim, it's actually listed under https://gcc.gnu.org/onlinedocs/gcc/Basic-Asm.html as follows: > Qualifiers volatile The optional volatile qualifier has no effect. All basic asm blocks are implicitly volatile. I'll take this outside of this PR though ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14969#discussion_r1271440784 From jwaters at openjdk.org Sun Jul 23 12:46:59 2023 From: jwaters at openjdk.org (Julian Waters) Date: Sun, 23 Jul 2023 12:46:59 GMT Subject: RFR: 8312502: Mass migrate HotSpot attributes to the correct location [v2] In-Reply-To: <6TM5USxRwOvQTy8muWykwuq_o0b-2nCM_8-fFoLGVIg=.ae329b8d-8781-407b-ba4a-fb8d8abe685c@github.com> References: <6TM5USxRwOvQTy8muWykwuq_o0b-2nCM_8-fFoLGVIg=.ae329b8d-8781-407b-ba4a-fb8d8abe685c@github.com> Message-ID: On Sun, 23 Jul 2023 11:45:25 GMT, Kim Barrett wrote: >> Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 40 additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into patch-6 >> - arguments.hpp >> - arguments.hpp >> - globalDefinitions_gcc.hpp >> - assembler_aarch64.hpp >> - macroAssembler_aarch64.cpp >> - vmError.cpp >> - vmError.cpp >> - macroAssembler_aarch64.cpp >> - assembler_aarch64.hpp >> - ... and 30 more: https://git.openjdk.org/jdk/compare/624faab5...afff56f2 > > src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 161: > >> 159: #define NOINLINE [[gnu::noinline]] >> 160: #define ALWAYSINLINE [[gnu::always_inline]] inline >> 161: #define ATTRIBUTE_FLATTEN [[gnu::flatten]] > > This is way beyond the described scope of this PR. 
I changed these to the standard attributes so the compilers would concretely enforce the checks of which areas the attributes appertained to, as opposed to the regular __attribute__ syntax which don't perform such checks (Same reasoning for the ones in compilerWarnings as well) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14969#discussion_r1271441981 From aph-open at littlepinkcloud.com Sun Jul 23 16:46:56 2023 From: aph-open at littlepinkcloud.com (Andrew Haley) Date: Sun, 23 Jul 2023 17:46:56 +0100 Subject: Unexpected performance of operator % vs & In-Reply-To: References: <1553227735.1831501.1690039448490.JavaMail.zimbra@univ-eiffel.fr> Message-ID: On 7/22/23 23:32, Scott Palmer wrote: > On Linux the results are the opposite of what I would expect. > Interestingly JDK 20.0.1 on macOS had no significant difference between the > two methods, but on Linux it did. Mmm, but look at those numbers. That's ~ two clock cycles per iteration. I think we can safely say that a modulo instruction is not being used. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at openjdk.org Sun Jul 23 16:54:59 2023 From: aph at openjdk.org (Andrew Haley) Date: Sun, 23 Jul 2023 16:54:59 GMT Subject: RFR: 8312502: Mass migrate HotSpot attributes to the correct location [v2] In-Reply-To: References: Message-ID: On Sun, 23 Jul 2023 05:50:58 GMT, Julian Waters wrote: >> Someone had to do it, so I did. Moves attributes to the correct place as specified in the HotSpot Style Guide once and for all > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 40 additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-6 > - arguments.hpp > - arguments.hpp > - globalDefinitions_gcc.hpp > - assembler_aarch64.hpp > - macroAssembler_aarch64.cpp > - vmError.cpp > - vmError.cpp > - macroAssembler_aarch64.cpp > - assembler_aarch64.hpp > - ... and 30 more: https://git.openjdk.org/jdk/compare/57f455e2...afff56f2 > Why? What is the benefit from this that makes the resulting code churn worthwhile? > > We already discussed this kind of code churn a bit circa [#11081 (comment)](https://github.com/openjdk/jdk/pull/11081#issuecomment-1313274792) and didn't like it then. I don't see anything to change that. I agree. Such changes don't much help maintainers, and speaking as the lead of both the 8u and 11u projects, really won't help backporting. I'd say no. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14969#issuecomment-1646888896 From dholmes at openjdk.org Sun Jul 23 21:44:59 2023 From: dholmes at openjdk.org (David Holmes) Date: Sun, 23 Jul 2023 21:44:59 GMT Subject: RFR: 8312502: Mass migrate HotSpot attributes to the correct location [v2] In-Reply-To: <6TM5USxRwOvQTy8muWykwuq_o0b-2nCM_8-fFoLGVIg=.ae329b8d-8781-407b-ba4a-fb8d8abe685c@github.com> References: <6TM5USxRwOvQTy8muWykwuq_o0b-2nCM_8-fFoLGVIg=.ae329b8d-8781-407b-ba4a-fb8d8abe685c@github.com> Message-ID: On Sun, 23 Jul 2023 11:35:19 GMT, Kim Barrett wrote: >> Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 40 additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into patch-6 >> - arguments.hpp >> - arguments.hpp >> - globalDefinitions_gcc.hpp >> - assembler_aarch64.hpp >> - macroAssembler_aarch64.cpp >> - vmError.cpp >> - vmError.cpp >> - macroAssembler_aarch64.cpp >> - assembler_aarch64.hpp >> - ... 
and 30 more: https://git.openjdk.org/jdk/compare/f058acfd...afff56f2 > > src/hotspot/share/c1/c1_CFGPrinter.hpp line 66: > >> 64: void dec_indent(); >> 65: ATTRIBUTE_PRINTF(2, 3) >> 66: void print(const char* format, ...); > > This is an example where rearranging the attributes is out of character with usual practice. > And I think it makes it harder to read. I agree with Kim, I do not like this style of using a new line for the attribute. I also prefer to see these attributes in their original location where I can generally ignore them while reading the code. > src/hotspot/share/utilities/xmlstream.hpp line 149: > >> 147: void text(const char* format, ...); >> 148: ATTRIBUTE_PRINTF(2, 0) >> 149: void va_text(const char* format, va_list ap) { > > This file is a particularly bad (to me) example of what happens without whitespace between a declaration > and the attributes for the next declaration. I find this really hard to parse. And the extra whitespace following > return types makes it even worse for me. +1 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14969#discussion_r1271561053 PR Review Comment: https://git.openjdk.org/jdk/pull/14969#discussion_r1271561299 From dholmes at openjdk.org Sun Jul 23 21:45:00 2023 From: dholmes at openjdk.org (David Holmes) Date: Sun, 23 Jul 2023 21:45:00 GMT Subject: RFR: 8312502: Mass migrate HotSpot attributes to the correct location [v2] In-Reply-To: References: <6TM5USxRwOvQTy8muWykwuq_o0b-2nCM_8-fFoLGVIg=.ae329b8d-8781-407b-ba4a-fb8d8abe685c@github.com> Message-ID: On Sun, 23 Jul 2023 12:43:51 GMT, Julian Waters wrote: >> src/hotspot/share/utilities/globalDefinitions_gcc.hpp line 161: >> >>> 159: #define NOINLINE [[gnu::noinline]] >>> 160: #define ALWAYSINLINE [[gnu::always_inline]] inline >>> 161: #define ATTRIBUTE_FLATTEN [[gnu::flatten]] >> >> This is way beyond the described scope of this PR. 
> > I changed these to the standard attributes so the compilers would concretely enforce the checks of which areas the attributes appertained to, as opposed to the regular __attribute__ syntax which don't perform such checks (Same reasoning for the ones in compilerWarnings as well) Again I agree with Kim, this is not simply moving an attribute to a new location. It is a change that has to be validated. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14969#discussion_r1271561284 From dholmes at openjdk.org Sun Jul 23 21:44:57 2023 From: dholmes at openjdk.org (David Holmes) Date: Sun, 23 Jul 2023 21:44:57 GMT Subject: RFR: 8312502: Mass migrate HotSpot attributes to the correct location [v2] In-Reply-To: References: Message-ID: On Sun, 23 Jul 2023 05:50:58 GMT, Julian Waters wrote: >> Someone had to do it, so I did. Moves attributes to the correct place as specified in the HotSpot Style Guide once and for all > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 40 additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-6 > - arguments.hpp > - arguments.hpp > - globalDefinitions_gcc.hpp > - assembler_aarch64.hpp > - macroAssembler_aarch64.cpp > - vmError.cpp > - vmError.cpp > - macroAssembler_aarch64.cpp > - assembler_aarch64.hpp > - ... and 30 more: https://git.openjdk.org/jdk/compare/f058acfd...afff56f2 > Someone had to do it, so I did. Why did someone _have_ to do it? Is it incorrect? Unless there is some semantic significance to this then it is just unnecessary churn and I really don't like the chosen style. Sorry. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14969#issuecomment-1646966208 From dholmes at openjdk.org Sun Jul 23 23:07:58 2023 From: dholmes at openjdk.org (David Holmes) Date: Sun, 23 Jul 2023 23:07:58 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v10] In-Reply-To: References: Message-ID: On Sat, 22 Jul 2023 13:11:53 GMT, Martin Doerr wrote: >> Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: >> >> Missing semicolon > > src/hotspot/share/oops/resolvedFieldEntry.hpp line 125: > >> 123: _tos = tos; >> 124: >> 125: // This has to be done last > > Maybe better use `OrderAccess::release()` here instead of 2x `Atomic::release_store` above? That would fit to the comment. Is the comment accurate? The semantics would be different. Does seeing the update to `_put_code` require that the update to `_get_code` is also seen? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1271573904 From dzhang at openjdk.org Mon Jul 24 02:11:47 2023 From: dzhang at openjdk.org (Dingli Zhang) Date: Mon, 24 Jul 2023 02:11:47 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v10] In-Reply-To: References: Message-ID: <0MQ1d57orKq75lSKIhByN-gJRb84V_qAnFpmictXt5Q=.3619e3e4-1a6d-466f-8a85-169067fc4a3c@github.com> On Sat, 22 Jul 2023 13:04:23 GMT, Martin Doerr wrote: >> Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: >> >> Missing semicolon > > src/hotspot/cpu/riscv/templateTable_riscv.cpp line 180: > >> 178: } >> 179: // Load-acquire the bytecode to match store-release in ResolvedFieldEntry::fill_in() >> 180: __ membar(MacroAssembler::AnyAny); > > Why do you need this membar? 
I am referencing the mapping from ARM memory operations onto RISC-V memory instructions in riscv-spec [1]: > Since RISC-V does not currently have plain load and store opcodes with aq or rl annotations, ARM load-acquire and store-release operations should be mapped using fences instead. Furthermore, in order to enforce store-release-to-load-acquire ordering, there must be a FENCE RW,RW between the store-release and load-acquire; So I am trying to be conservative here and added this membar to enforce store-release-to-load-acquire ordering. I can remove that if we are sure this is not necessary here. [1] https://github.com/riscv/riscv-isa-manual/blob/riscv-isa-release-1239329-2023-05-23/src/mm-eplan.adoc#armmappings ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1271653513 From kbarrett at openjdk.org Mon Jul 24 02:36:45 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 24 Jul 2023 02:36:45 GMT Subject: RFR: 8312502: Mass migrate HotSpot attributes to the correct location [v2] In-Reply-To: References: Message-ID: On Sun, 23 Jul 2023 05:50:58 GMT, Julian Waters wrote: >> Someone had to do it, so I did. Moves attributes to the correct place as specified in the HotSpot Style Guide once and for all > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 40 additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-6 > - arguments.hpp > - arguments.hpp > - globalDefinitions_gcc.hpp > - assembler_aarch64.hpp > - macroAssembler_aarch64.cpp > - vmError.cpp > - vmError.cpp > - macroAssembler_aarch64.cpp > - assembler_aarch64.hpp > - ... 
and 30 more: https://git.openjdk.org/jdk/compare/c63a77e5...afff56f2 See also: https://openjdk.org/guide/#things-to-consider-before-proposing-changes-to-openjdk-code This change looks like a case of pure "Modernizing", which isn't looked on particularly favorably for its own sake. There generally needs to be some additional benefits. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14969#issuecomment-1647108763 From jwaters at openjdk.org Mon Jul 24 05:34:04 2023 From: jwaters at openjdk.org (Julian Waters) Date: Mon, 24 Jul 2023 05:34:04 GMT Subject: RFR: 8312502: Mass migrate HotSpot attributes to the correct location [v3] In-Reply-To: References: Message-ID: > Someone had to do it, so I did. Moves attributes to the correct place as specified in the HotSpot Style Guide once and for all Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 41 additional commits since the last revision: - Merge branch 'openjdk:master' into patch-6 - Merge branch 'openjdk:master' into patch-6 - arguments.hpp - arguments.hpp - globalDefinitions_gcc.hpp - assembler_aarch64.hpp - macroAssembler_aarch64.cpp - vmError.cpp - vmError.cpp - macroAssembler_aarch64.cpp - ... 
and 31 more: https://git.openjdk.org/jdk/compare/1a8dd18b...d60d8923 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14969/files - new: https://git.openjdk.org/jdk/pull/14969/files/afff56f2..d60d8923 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14969&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14969&range=01-02 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14969.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14969/head:pull/14969 PR: https://git.openjdk.org/jdk/pull/14969 From fyang at openjdk.org Mon Jul 24 06:14:45 2023 From: fyang at openjdk.org (Fei Yang) Date: Mon, 24 Jul 2023 06:14:45 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v3] In-Reply-To: References: Message-ID: On Thu, 20 Jul 2023 08:31:56 GMT, Vladimir Kempik wrote: >> Please review this fix. It eliminates misaligned loads in String.compare on RISC-V. >> >> For small compares (<= 72 bytes), the intrinsic in c2_MacroAssembler.cpp is used. >> It reads (in the UU/LL case) 8 bytes per loop iteration, and at the end it reads the tail with a misaligned load of the last 8 bytes of the string. >> >> So if the string length is not a multiple of 8 bytes, the last load is misaligned, and it also re-reads and re-compares some data that was already processed. >> >> I have changed that to compare only the last length%8 bytes, using the SHORT_STRING part of the intrinsic for UL/LU. For UU/LL I have made an optimised version. >> >> Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947), I've got no perf drop on thead (with +AvoidUnalignedAccesses), which supports misaligned access. >> >> Improvements to the inflate_XX() methods give 3-5% improvements for UL/LU cases on thead, and almost no perf change on hifive. 
>> >> For large strings, the intrinsics from stubGenerator.cpp are used. >> For UU/LL (generate_compare_long_string_same_encoding), I have just replaced the misaligned ld with load_long_misaligned. Since this tail reading is not on the hot path, this gives a small penalty for thead with -XX:+AvoidUnalignedAccesses. >> >> Large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp. >> These changes are partially based on feilongjiang's patch, but I have changed the tail reading to prevent reading past the end of the string. I have observed no perf difference between feilongjiang's version and mine. >> >> This also enables the regression test for string.Compare, which previously was aarch64-only. >> >> Testing: tier1 and tier2 clean on hifive. >> >> JMH testing, hifive: >> before: >>
>> Benchmark                                    (delta)  (size)  Mode  Cnt    Score   Error  Units
>> StringCompareToDifferentLength.compareToLL         2      24  avgt    9    6.474 ± 1.475  ms/op
>> StringCompareToDifferentLength.compareToLL         2      36  avgt    9  125.823 ± 1.947  ms/op
>> StringCompareToDifferentLength.compareToLL         2      72  avgt    9   10.512 ± 0.236  ms/op
>> StringCompareToDifferentLength.compareToLL         2     128  avgt    9   13.032 ± 0.821  ms/op
>> StringCompareToDifferentLength.compareToLL         2     256  avgt    9   18.983 ± 0.318  ms/op
>> StringCompareToDifferentLength.compareToLL         2     512  avgt    9   29.925 ± ...
> > Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: > > Simplify case for long LU UL compares Hi, Thanks for the update. Taking another look. src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 2315: > 2313: void compare_string_8_x_LU(Register tmpL, Register tmpU, Register strL, Register strU, Label& DIFF) { > 2314: const Register tmp = x30; > 2315: __ ld(tmpL, Address(strL)); Could we make use of another tmp register (maybe `x7`) and use that as the destination register for `ld` instead? Then we could inflate_lo32/hi32 from this tmp register and put the result in `tmpL`. 
That way would help remove the two `mv` instructions here in this function and the one located at label `DIFF`. src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 2368: > 2366: > 2367: // make sure main loop is 8 byte-aligned, we should load another 4 bytes from strL > 2368: __ beqz(cnt2, DONE); // no characters left I don't quite understand this newly-added branch. If the intention was to ensure that we have at least 4 chars left, shouldn't we compare with 4 instead? But given that the const `STUB_THRESHOLD` is (64 + 8) in C2_MacroAssembler::string_compare, I think neither will be taken. src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 2379: > 2377: __ addi(cnt2, cnt2, -wordSize / 2); > 2378: > 2379: __ beqz(cnt2, DONE); // no character left Similar here. Will this newly-added branch ever be taken? src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 2540: > 2538: __ ld(tmp2, Address(str2)); > 2539: __ addi(str2, str2, 8); > 2540: __ beqz(cnt2, LAST_CHECK_AND_LENGTH_DIFF); I wonder when this branch instruction will be taken? ------------- Changes requested by fyang (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14534#pullrequestreview-1542726484 PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1271781757 PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1271786219 PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1271787329 PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1271776226 From vkempik at openjdk.org Mon Jul 24 07:38:47 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Mon, 24 Jul 2023 07:38:47 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v3] In-Reply-To: References: Message-ID: On Mon, 24 Jul 2023 06:02:44 GMT, Fei Yang wrote: >> Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: >> >> Simplify case for long LU UL compares > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 2368: > >> 2366: >> 2367: // make sure main loop is 8 byte-aligned, we should load another 4 bytes from strL >> 2368: __ beqz(cnt2, DONE); // no characters left > > I don't quite understand this newly-added branch. If the intention was to ensure that we have at least 4 chars left, shouldn't we compare with 4 instead? But given that the const `STUB_THRESHOLD` is (64 + 8) in C2_MacroAssembler::string_compare, I think neither will be taken. You're right, that looks like a sanity check whose branch will never be taken. I will remove both. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1271863906 From tanksherman27 at gmail.com Mon Jul 24 08:15:28 2023 From: tanksherman27 at gmail.com (Julian Waters) Date: Mon, 24 Jul 2023 16:15:28 +0800 Subject: os_thread is not handled by os::create_thread in os_windows.cpp Message-ID: Hi all, I just wanted to give a heads up that gcc catches a case inside os::create_thread in os_windows.cpp, in a switch statement that enumerates over all possible thread types. 
The warning states that threads of type os_thread are not handled:
src/hotspot/os/windows/os_windows.cpp: In static member function 'static bool os::create_thread(Thread*, ThreadType, size_t)':
d:/eclipse/workspace/jdk/src/hotspot/os/windows/os_windows.cpp:715:12: error: enumeration value 'os_thread' not handled in switch [-Werror=switch]
  715 |   switch (thr_type) {
      |          ^
cc1plus.exe: all warnings being treated as errors
I initially just silenced this warning, but since I don't know how threads of type os_thread should be handled, or whether this is intentional, I'm raising it with any HotSpot developers who know more about this than I do, in case it is a bug. best regards, Julian From duke at openjdk.org Mon Jul 24 08:28:49 2023 From: duke at openjdk.org (Ilya Gavrilin) Date: Mon, 24 Jul 2023 08:28:49 GMT Subject: RFR: 8312569: RISC-V: Missing intrinsics for Math.ceil, floor, rint Message-ID: Please review this change, which adds the missing RISC-V intrinsics for double rounding. On RISC-V, intrinsics for rounding doubles with a mode (Math.ceil/floor/rint) were missing. RISC-V has no dedicated instruction for this conversion, so a two-step conversion is used: double -> long int -> double (using fcvt.l.d, fcvt.d.l). We also have to supply a rounding mode to the fcvt.x.x instructions. Rounding mode selection for ceil (floor and rint are similar): according to the Math.ceil requirements, it "Returns the smallest (closest to negative infinity) value that is greater than or equal to the argument and is equal to a mathematical integer" (Math.java:475). So, for double -> long int we choose rup (round towards +inf) mode to get an integer that is greater than or equal to the input value, and for long int -> double we choose rdn (round towards -inf) mode to get the smallest (closest to -inf) representation of the integer that we got after the conversion. 
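For illustration only, the double -> long int -> double idea can be sketched in portable C++. This is not the code from the patch: `ceil_via_long` is a hypothetical name, the cast-and-adjust step merely stands in for fcvt.l.d with the rup rounding mode, and the guard and the sign copy follow the special cases listed in the remainder of this message.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper, not HotSpot code: emulates ceil through a
// double -> 64-bit integer -> double round trip.
double ceil_via_long(double x) {
  // NaN, infinities and values with magnitude >= 2^63 are returned unchanged.
  // They are either not finite or already integral, and they would not fit
  // in a signed 64-bit integer anyway. The negated comparison also catches NaN.
  const double two_pow_63 = 9223372036854775808.0;
  if (!(std::fabs(x) < two_pow_63)) {
    return x;
  }
  // Stand-in for a conversion that rounds towards +inf: a C++ cast truncates
  // towards zero, so add one when truncation ended up below x.
  std::int64_t as_long = static_cast<std::int64_t>(x);
  if (static_cast<double>(as_long) < x) {
    as_long += 1;
  }
  // Convert back and copy the sign of the input, so that inputs in
  // (-1.0, 0.0) and -0.0 itself produce -0.0 rather than +0.0.
  return std::copysign(static_cast<double>(as_long), x);
}
```

Under these assumptions, `ceil_via_long(1.2)` gives 2.0 and `ceil_via_long(-0.5)` gives -0.0, which is the sign behaviour Math.ceil requires.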
For inputs that are inf, NaN, or a value larger than 2^63, we return the input value unchanged (a double value larger than 2^63 is guaranteed to be an integer). Also, when we store the result we copy the sign from the input value (needed because for inputs in (-1.0, 0.0) ceil has to return -0.0). We have observed significant improvement on hifive and thead boards. Testing: tier1 and tier2 on hifive. Performance results on hifive (FpRoundingBenchmark.testceil/floor/rint):
Without intrinsic:
Benchmark                      (TESTSIZE)   Mode  Cnt   Score   Error   Units
FpRoundingBenchmark.testceil         1024  thrpt   25  39.297 ± 0.037  ops/ms
FpRoundingBenchmark.testfloor        1024  thrpt   25  39.398 ± 0.018  ops/ms
FpRoundingBenchmark.testrint         1024  thrpt   25  36.388 ± 0.844  ops/ms
With intrinsic:
Benchmark                      (TESTSIZE)   Mode  Cnt   Score   Error   Units
FpRoundingBenchmark.testceil         1024  thrpt   25  80.560 ± 0.053  ops/ms
FpRoundingBenchmark.testfloor        1024  thrpt   25  80.541 ± 0.081  ops/ms
FpRoundingBenchmark.testrint         1024  thrpt   25  80.603 ± 0.071  ops/ms
------------- Commit messages: - Fix comments style - Add missing intrinsic for double rounding Changes: https://git.openjdk.org/jdk/pull/14991/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14991&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8312569 Stats: 107 lines in 4 files changed: 105 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14991.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14991/head:pull/14991 PR: https://git.openjdk.org/jdk/pull/14991 From duke at openjdk.org Mon Jul 24 08:41:47 2023 From: duke at openjdk.org (mmyxym) Date: Mon, 24 Jul 2023 08:41:47 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v18] In-Reply-To: References: Message-ID: On Fri, 16 Jun 2023 15:01:45 GMT, Roman Kennke wrote: >> Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. 
This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. > > Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 30 commits: > > - Merge branch 'JDK-8305896' into JDK-8305898 > - Merge branch 'JDK-8305896' into JDK-8305898 > - Update comment about mark-word layout > - Merge branch 'JDK-8305896' into JDK-8305898 > - Fix tests on 32bit builds > - Merge branch 'JDK-8305896' into JDK-8305898 > - Merge branch 'JDK-8305896' into JDK-8305898 > - wqRevert "Rename self-forwarded -> forward-failed" > > This reverts commit 4d9713ca239da8e294c63887426bfb97240d3130. > - Merge branch 'JDK-8305896' into JDK-8305898 > - Merge remote-tracking branch 'origin/JDK-8305898' into JDK-8305898 > - ... and 20 more: https://git.openjdk.org/jdk/compare/524f9c52...3838ac05 src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 210: > 208: markWord m = obj->mark(); > 209: if (m.is_marked()) { > 210: obj = obj->forwardee(m); Shall we have a method "oop::forwardee_not_self" which is guaranteed not to be self-forwarded? Then we could remove the self-forward if-check in the GC critical path. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1271858310 From vkempik at openjdk.org Mon Jul 24 09:59:16 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Mon, 24 Jul 2023 09:59:16 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v4] In-Reply-To: References: Message-ID: > Please review this fix. 
It eliminates misaligned loads in String.compare on RISC-V. > > For small compares (<= 72 bytes), the intrinsic in c2_MacroAssembler.cpp is used. > It reads (in the UU/LL case) 8 bytes per loop iteration, and at the end it reads the tail with a misaligned load of the last 8 bytes of the string. > > So if the string length is not a multiple of 8 bytes, the last load is misaligned, and it also re-reads and re-compares some data that was already processed. > > I have changed that to compare only the last length%8 bytes, using the SHORT_STRING part of the intrinsic for UL/LU. For UU/LL I have made an optimised version. > > Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947), I've got no perf drop on thead (with +AvoidUnalignedAccesses), which supports misaligned access. > > Improvements to the inflate_XX() methods give 3-5% improvements for UL/LU cases on thead, and almost no perf change on hifive. > > For large strings, the intrinsics from stubGenerator.cpp are used. > For UU/LL (generate_compare_long_string_same_encoding), I have just replaced the misaligned ld with load_long_misaligned. Since this tail reading is not on the hot path, this gives a small penalty for thead with -XX:+AvoidUnalignedAccesses. > > Large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp. > These changes are partially based on feilongjiang's patch, but I have changed the tail reading to prevent reading past the end of the string. I have observed no perf difference between feilongjiang's version and mine. > > This also enables the regression test for string.Compare, which previously was aarch64-only. > > Testing: tier1 and tier2 clean on hifive. > > JMH testing, hifive: > before: >
> Benchmark                                    (delta)  (size)  Mode  Cnt    Score   Error  Units
> StringCompareToDifferentLength.compareToLL         2      24  avgt    9    6.474 ± 1.475  ms/op
> StringCompareToDifferentLength.compareToLL         2      36  avgt    9  125.823 ± 1.947  ms/op
> StringCompareToDifferentLength.compareToLL         2      72  avgt    9   10.512 ± 0.236  ms/op
> StringCompareToDifferentLength.compareToLL         2     128  avgt    9   13.032 ± 0.821  ms/op
> StringCompareToDifferentLength.compareToLL         2     256  avgt    9   18.983 ± 0.318  ms/op
> StringCompareToDifferentLength.compareToLL         2     512  avgt    9   29.925 ± 0.458  ms/op
> StringCompareToDifferentLength.compareToLL         2 ...
Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: remove some branches and moves ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14534/files - new: https://git.openjdk.org/jdk/pull/14534/files/054a885f..27a194c1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14534&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14534&range=02-03 Stats: 15 lines in 1 file changed: 0 ins; 5 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/14534.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14534/head:pull/14534 PR: https://git.openjdk.org/jdk/pull/14534 From vkempik at openjdk.org Mon Jul 24 09:59:16 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Mon, 24 Jul 2023 09:59:16 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v3] In-Reply-To: References: Message-ID: On Mon, 24 Jul 2023 05:54:43 GMT, Fei Yang wrote: >> Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: >> >> Simplify case for long LU UL compares > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 2315: > >> 2313: void compare_string_8_x_LU(Register tmpL, Register tmpU, Register strL, Register strU, Label& DIFF) { >> 2314: const Register tmp = x30; >> 2315: __ ld(tmpL, Address(strL)); > > Could we make use of another tmp register (maybe `x7`) and use that as the destination register for `ld` instead? Then we could inflate_lo32/hi32 from this tmp register and put the result in `tmpL`. 
That way would help remove the two `mv` instructions here in this function and the one located at label `DIFF`. I have done it in a slightly different way, still removing all the "mv"s you mentioned. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1272022286 From aph at openjdk.org Mon Jul 24 10:29:52 2023 From: aph at openjdk.org (Andrew Haley) Date: Mon, 24 Jul 2023 10:29:52 GMT Subject: RFR: 8310939: [c1] The visibility of write-volatile requires membar_volatile instead of membar In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 12:46:14 GMT, SUN Guoyun wrote: > For c1 now, a volatile write case: >
>     membar_release // LoadStore | StoreStore
>     write volatile
>     membar
> > Just like in c2, this `membar` should be explicitly defined as `membar_volatile`, so that risc-v, ppc and loongarch can use StoreLoad for `membar_volatile` for better performance. > > Testing: > GHA testing > jtreg tier1-3 for loongarch64 > _Mailing list message from [David Holmes](mailto:david.holmes at oracle.com) on [hotspot-dev](mailto:hotspot-dev at mail.openjdk.org):_ > > > > I know that this patch is now withdrawn, but I'd like to make a point. > > Please don't use terms like volatile_write_post_barrier for StoreLoad > > if you can possibly avoid it, because that's only one way to implement Java's volatile. > > The assumption that volatile write is implemented by release; write; StoreLoad > > or similar permeates HotSpot, and is a pain for architectures like AArch64 > > that don't do volatile that way. > > My long term understanding of the issue is that the requirements for volatile accesses are determined by the nature of the surrounding accesses - as per Doug Lea's JMM Cookbook [1]. That is to say, if you have to implement volatile using plain memory accesses and fences, then you are going to need a StoreLoad between a plain store and a plain load in the same thread. 
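As a minimal illustration of that StoreLoad requirement, here is a store-buffering sketch in portable C++. It is not HotSpot code: all names are hypothetical, relaxed atomic accesses stand in for the plain accesses, and a sequentially consistent fence stands in for the StoreLoad barrier. The two functions are meant to run concurrently on different threads.

```cpp
#include <atomic>

// Hypothetical store-buffering litmus test, not HotSpot code.
std::atomic<int> x{0};
std::atomic<int> y{0};
int r1 = -1;  // what thread A observed in y
int r2 = -1;  // what thread B observed in x

// Body for thread A: store to x, full fence, then load from y.
void thread_a() {
  x.store(1, std::memory_order_relaxed);                // stands in for the plain store
  std::atomic_thread_fence(std::memory_order_seq_cst);  // full fence, includes StoreLoad
  r1 = y.load(std::memory_order_relaxed);               // stands in for the plain load
}

// Body for thread B: the mirror image, store to y then load from x.
void thread_b() {
  y.store(1, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);
  r2 = x.load(std::memory_order_relaxed);
}

// With both fences in place, the C++ memory model forbids the outcome
// r1 == 0 && r2 == 0; without them that outcome is permitted.
bool outcome_is_allowed() {
  return !(r1 == 0 && r2 == 0);
}
```

If either fence is removed, both threads may read 0, which is exactly the reordering a StoreLoad barrier exists to prevent.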
But the JMM Cookbook is of historical interest only, because there are now alternative ways to do it. > But neither the interpreter nor C1 consider this and only apply barriers to the actual volatile read/write in isolation - hence they (have to) assume the worst-case and use the strongest pre/post barriers that may be needed. And of course the kinds of barriers available depend on architecture. They don't have to do that: they could simply have punted volatile-store/volatile-load to the back end. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14677#issuecomment-1647642736 From rkennke at openjdk.org Mon Jul 24 10:39:46 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 24 Jul 2023 10:39:46 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v18] In-Reply-To: References: Message-ID: On Mon, 24 Jul 2023 07:29:53 GMT, mmyxym wrote: >> Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 30 commits: >> >> - Merge branch 'JDK-8305896' into JDK-8305898 >> - Merge branch 'JDK-8305896' into JDK-8305898 >> - Update comment about mark-word layout >> - Merge branch 'JDK-8305896' into JDK-8305898 >> - Fix tests on 32bit builds >> - Merge branch 'JDK-8305896' into JDK-8305898 >> - Merge branch 'JDK-8305896' into JDK-8305898 >> - wqRevert "Rename self-forwarded -> forward-failed" >> >> This reverts commit 4d9713ca239da8e294c63887426bfb97240d3130. >> - Merge branch 'JDK-8305896' into JDK-8305898 >> - Merge remote-tracking branch 'origin/JDK-8305898' into JDK-8305898 >> - ... and 20 more: https://git.openjdk.org/jdk/compare/524f9c52...3838ac05 > > src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 210: > >> 208: markWord m = obj->mark(); >> 209: if (m.is_marked()) { >> 210: obj = obj->forwardee(m); > > Shall we have a method "oop::forwardee_not_self" which guarantee to be not self fowarded? So we can remove the self-forward if-check in GC critical path. 
Are there any paths that don't need to handle self-forwarded state? Also, it seems to me that the path would be dominated by the load of the mark-word. Testing-and-branching for the self-forwarded bit seems like the minor problem there. It would be nice if we could tell the C++ compiler that the branch is expected to be uncommon, so it could shape the emitted code accordingly, but, afaik, we can't. If we were to micro-optimize the forwarding-decoding, then it would be more useful to optimize the common pattern: if (o->is_forwarded()) { // Loads and tests the mark-word oop fwd = o->forwardee(); // Loads mark-word again, and decode forwardee. ... } To something like: oop fwd = o->forwardee(); // Return nullptr when not forwarded if (fwd != nullptr) { ... } There's a way to improve further on this, as I proposed back in the early versions of #5955, which also avoids the decoding altogether if not forwarded, but it's a little more clunky: OopForwarding fwd(obj); if (fwd.is_forwarded()) { oop forwardee = fwd.forwardee(); ... } where the scoped OopForwarding object would encapsulate the markWord and the testing and decoding of the fwd-ptr, but it has been rejected back then. But it would certainly help more than an oopDesc::forwardee_not_self() approach (if that were even possible, which I think it is not). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1272080209 From vkempik at openjdk.org Mon Jul 24 11:01:44 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Mon, 24 Jul 2023 11:01:44 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v4] In-Reply-To: References: Message-ID: On Mon, 24 Jul 2023 09:59:16 GMT, Vladimir Kempik wrote: >> Please review this fix. 
it eliminates misaligned loads in String.compare on risc-v >> >> for small compares ( <= 72 bytes), the instrinsic in c2_MacroAssembler.cpp is used, >> it reads ( in case of UU/LL) 8 bytes per loop, and at then end, it reads tail - misaligned load of last 8 bytes from the string. >> >> so if string length is not 8x bytes long then last load is misaligned, also it performs read/compare of some data which already was processed. >> >> I have changed that to compare only last length%8 bytes using SHORT_STRING part of intrinsic for UL/LU. But for UU/LL I have made an optimised version. >> >> Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947) I?ve got no perf drop on thead ( with +AvoidUnalignedAccesses) which supports misaligned access. >> >> Improvements to inflate_XX() methods gives 3-5% improvements for UL/LU cases on thead, almost no perf change on hifive. >> >> for large strings, the instrinsics from stubGenerator.cpp is used >> for UU/LL - generate_compare_long_string_same_encoding, I have just replaced misaligned ld with load_long_misaligned. Since this tail reading is not on hot path, this give some small penalty for thead when -XX:+AvoidUnalignedAccesses. >> >> large LU/UL comparision is done in compare_long_string_different_encoding in sutbGenerator.cpp: >> These changes are partially based on feilongjiang's patch, but I have changed tail reading to prevent reading past the end of string. I have observed no perf difference between feilongjiang's and my version. >> >> This also enables regression test for string.Compare which previously was aarch64-only >> >> Testing: tier1 and tier2 clean on hifive. >> >> JMH testing, hifive: >> before: >> >> Benchmark (delta) (size) Mode Cnt Score Error Units >> StringCompareToDifferentLength.compareToLL 2 24 avgt 9 6.474 ? 
1.475 ms/op
>> StringCompareToDifferentLength.compareToLL 2 36 avgt 9 125.823 ± 1.947 ms/op
>> StringCompareToDifferentLength.compareToLL 2 72 avgt 9 10.512 ± 0.236 ms/op
>> StringCompareToDifferentLength.compareToLL 2 128 avgt 9 13.032 ± 0.821 ms/op
>> StringCompareToDifferentLength.compareToLL 2 256 avgt 9 18.983 ± 0.318 ms/op
>> StringCompareToDifferentLength.compareToLL 2 512 avgt 9 29.925 ± ...
>
> Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision:
>
> remove some branches and moves

updated results from hifive: good perf improvements on compareToLU and compareToUL with sizes > 72

Benchmark                                   (delta)  (size)  Mode  Cnt    Score   Error  Units
StringCompareToDifferentLength.compareToLL        2      24  avgt    9    8.610 ± 0.524  ms/op
StringCompareToDifferentLength.compareToLL        2      36  avgt    9    9.623 ± 0.980  ms/op
StringCompareToDifferentLength.compareToLL        2      72  avgt    9   11.483 ± 0.607  ms/op
StringCompareToDifferentLength.compareToLL        2     128  avgt    9   15.931 ± 0.306  ms/op
StringCompareToDifferentLength.compareToLL        2     256  avgt    9   21.179 ± 0.179  ms/op
StringCompareToDifferentLength.compareToLL        2     512  avgt    9   32.687 ± 0.713  ms/op
StringCompareToDifferentLength.compareToLL        2     520  avgt    9   31.122 ± 0.580  ms/op
StringCompareToDifferentLength.compareToLL        2     523  avgt    9   33.225 ± 0.478  ms/op
StringCompareToDifferentLength.compareToLU        2      24  avgt    9   15.019 ± 0.631  ms/op
StringCompareToDifferentLength.compareToLU        2      36  avgt    9   18.538 ± 1.178  ms/op
StringCompareToDifferentLength.compareToLU        2      72  avgt    9   30.966 ± 1.096  ms/op
StringCompareToDifferentLength.compareToLU        2     128  avgt    9   48.397 ± 1.622  ms/op
StringCompareToDifferentLength.compareToLU        2     256  avgt    9   87.368 ± 1.432  ms/op
StringCompareToDifferentLength.compareToLU        2     512  avgt    9  164.575 ± 0.816  ms/op
StringCompareToDifferentLength.compareToLU        2     520  avgt    9  167.250 ± 1.221  ms/op
StringCompareToDifferentLength.compareToLU        2     523  avgt    9  172.279 ± 1.525  ms/op
StringCompareToDifferentLength.compareToUL        2      24  avgt    9   16.391 ± 0.456  ms/op
StringCompareToDifferentLength.compareToUL        2      36  avgt    9   19.760 ± 0.283  ms/op
StringCompareToDifferentLength.compareToUL        2      72  avgt    9   31.841 ± 0.888  ms/op
StringCompareToDifferentLength.compareToUL        2     128  avgt    9   49.545 ± 1.115  ms/op
StringCompareToDifferentLength.compareToUL        2     256  avgt    9   88.728 ± 0.877  ms/op
StringCompareToDifferentLength.compareToUL        2     512  avgt    9  166.147 ± 1.468  ms/op
StringCompareToDifferentLength.compareToUL        2     520  avgt    9  168.843 ± 1.251  ms/op
StringCompareToDifferentLength.compareToUL        2     523  avgt    9  173.655 ± 1.518  ms/op
StringCompareToDifferentLength.compareToUU        2      24  avgt    9    9.462 ± 0.572  ms/op
StringCompareToDifferentLength.compareToUU        2      36  avgt    9   11.976 ± 0.696  ms/op
StringCompareToDifferentLength.compareToUU        2      72  avgt    9   15.301 ± 0.673  ms/op
StringCompareToDifferentLength.compareToUU        2     128  avgt    9   19.836 ± 0.841  ms/op
StringCompareToDifferentLength.compareToUU        2     256  avgt    9   31.328 ± 0.619  ms/op
StringCompareToDifferentLength.compareToUU        2     512  avgt    9   52.381 ± 1.249  ms/op
StringCompareToDifferentLength.compareToUU        2     520  avgt    9   53.119 ± 1.195  ms/op
StringCompareToDifferentLength.compareToUU        2     523  avgt    9   53.588 ± 1.803  ms/op

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14534#issuecomment-1647689516

From david.holmes at oracle.com Mon Jul 24 12:05:59 2023
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 24 Jul 2023 22:05:59 +1000
Subject: os_thread is not handled by os::create_thread in os_windows.cpp
In-Reply-To: References: Message-ID:

Hi,

On 24/07/2023 6:15 pm, Julian Waters wrote:
> Hi all,
>
> I just wanted to give a heads up that gcc catches a case inside
> os::create_thread in os_windows.cpp, in a switch statement that
> enumerates over all possible thread types.
The warning states that
> threads of type os_thread are not handled:
>
> src/hotspot/os/windows/os_windows.cpp: In static member function 'static
> bool os::create_thread(Thread*, ThreadType, size_t)':
> d:/eclipse/workspace/jdk/src/hotspot/os/windows/os_windows.cpp:715:12:
> error: enumeration value 'os_thread' not handled in switch [-Werror=switch]
>   715 |     switch (thr_type) {
>       |            ^
> cc1plus.exe: all warnings being treated as errors
>
> I initially used to silence this warning, but since I don't know how
> threads of type os_thread should be handled or if this is intentional,
> I'm raising this to the attention of any HotSpot developers that know
> about this more than me in case this is a bug.

I didn't even realize it existed - it is only used as the type for the JfrThreadSampler thread. I'd say this is a bug/oversight. The missing case would do nothing anyway.

Cheers,
David

> best regards,
> Julian

From fyang at openjdk.org Mon Jul 24 12:11:41 2023
From: fyang at openjdk.org (Fei Yang)
Date: Mon, 24 Jul 2023 12:11:41 GMT
Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v3]
In-Reply-To: References: Message-ID:

On Mon, 24 Jul 2023 09:54:15 GMT, Vladimir Kempik wrote:

>> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 2315:
>>
>>> 2313: void compare_string_8_x_LU(Register tmpL, Register tmpU, Register strL, Register strU, Label& DIFF) {
>>> 2314: const Register tmp = x30;
>>> 2315: __ ld(tmpL, Address(strL));
>>
>> Could we make use of another tmp register (maybe `x7`) and use that as the destination register for `ld` instead? Then we could inflate_lo32/hi32 from this tmp register and put the result in `tmpL`. That way would help remove the two `mv` instructions here in this function and the one located at label `DIFF`.
>
> I have done it in a slightly different way, still removing all "mv"s you have mentioned

Ah, I see you used `t0` instead.
I once considered that, but I would still suggest `x7` which would be safer. `t0`, as a scratch register, is implicitly clobbered/used by so many assembler functions that it's risky to keep a live value in it across assemblers like `inflate_lo32`. `x7` should be a better candidate here as it is only updated at label `CALCULATE_DIFFERENCE` in function generate_compare_long_string_different_encoding.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1272175683

From coleenp at openjdk.org Mon Jul 24 12:12:50 2023
From: coleenp at openjdk.org (Coleen Phillimore)
Date: Mon, 24 Jul 2023 12:12:50 GMT
Subject: Integrated: 8311847: Fix -Wconversion for assembler.hpp emit_int8, 16 callers
In-Reply-To: References: Message-ID:

On Tue, 11 Jul 2023 01:26:44 GMT, Coleen Phillimore wrote:

> Please review changes to fix -Wconversion warnings that come from assembler_.cpp by adding narrow_casts to the emit_int8,16,24, and 32 functions. And some other fixups with checked_cast.
>
> Ran tier1 on Oracle platforms, and tier1-4 on linux-x64-debug, linux-aarch64-debug, windows-x64-debug.

This pull request has now been integrated.
Changeset: 7dd47998 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/7dd47998f00712515c25fb852b6c0cf958120508 Stats: 50 lines in 5 files changed: 22 ins; 2 del; 26 mod 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers Reviewed-by: dlong, aph ------------- PR: https://git.openjdk.org/jdk/pull/14822 From coleenp at openjdk.org Mon Jul 24 12:14:41 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 24 Jul 2023 12:14:41 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp In-Reply-To: <6NQeZaB-D8hr-onoolXrScRurp3aFgk5ew-Qi8QNVi8=.7a634414-4b4e-4cc1-ac4b-c50462ff2453@github.com> References: <-HYDaQSWc6DgSZk1cm-MpRBw-vc8y1Kh42kTAeR73uo=.8f1d85ce-501e-4ac8-bdf6-6ce441c58d47@github.com> <6NQeZaB-D8hr-onoolXrScRurp3aFgk5ew-Qi8QNVi8=.7a634414-4b4e-4cc1-ac4b-c50462ff2453@github.com> Message-ID: On Mon, 17 Jul 2023 18:46:13 GMT, Dean Long wrote: >> Or cast the `2` to `(u1)2` ? > > It does seem strange that this code is using a bit-field, but still using & | and shift. Why not represent the high and low bit with two separate bit-fields of width 1? I tried to change the class to be int value : 1, init: 1; the changes got a bit ugly when fixing the TriBoolArray functions. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14892#discussion_r1272179086 From coleenp at openjdk.org Mon Jul 24 12:35:41 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 24 Jul 2023 12:35:41 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp In-Reply-To: References: <-HYDaQSWc6DgSZk1cm-MpRBw-vc8y1Kh42kTAeR73uo=.8f1d85ce-501e-4ac8-bdf6-6ce441c58d47@github.com> <6NQeZaB-D8hr-onoolXrScRurp3aFgk5ew-Qi8QNVi8=.7a634414-4b4e-4cc1-ac4b-c50462ff2453@github.com> Message-ID: On Mon, 24 Jul 2023 12:12:15 GMT, Coleen Phillimore wrote: >> It does seem strange that this code is using a bit-field, but still using & | and shift. Why not represent the high and low bit with two separate bit-fields of width 1? 
>
> I tried to change the class to be int value : 1, init: 1; the changes got a bit ugly when fixing the TriBoolArray functions.

expr & 3 is the wrong expression. This wants to set the 'not-default' bit, not AND in the 'not-default' bit if it's set in the value. Thankfully, tribool has a gtest.

tribool.hpp:43:43: warning: conversion from 'unsigned int' to 'unsigned char:2' may change value [-Wconversion]
   43 |   TriBool(bool value) : _value((u1)value | (u1)2) {
      |                               ~~~~~~~~~~~^~~~~~~

Casting 2 to u1 doesn't work because | still promotes to unsigned int.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14892#discussion_r1272200460

From vkempik at openjdk.org Mon Jul 24 12:48:46 2023
From: vkempik at openjdk.org (Vladimir Kempik)
Date: Mon, 24 Jul 2023 12:48:46 GMT
Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v3]
In-Reply-To: References: Message-ID:

On Mon, 24 Jul 2023 12:09:07 GMT, Fei Yang wrote:

>> I have done it in a slightly different way, still removing all "mv"s you have mentioned
>
> Ah, I see you used `t0` instead. I once considered that, but I would still suggest `x7` which would be safer. `t0`, as a scratch register, is implicitly clobbered/used by so many assembler functions that it's risky to keep a live value in it across assemblers like `inflate_lo32`. `x7` should be a better candidate here as it is only updated at label `CALCULATE_DIFFERENCE` in function generate_compare_long_string_different_encoding.

Good points

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1272214995

From vkempik at openjdk.org Mon Jul 24 13:12:19 2023
From: vkempik at openjdk.org (Vladimir Kempik)
Date: Mon, 24 Jul 2023 13:12:19 GMT
Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v5]
In-Reply-To: References: Message-ID:

> Please review this fix.
it eliminates misaligned loads in String.compare on risc-v > > for small compares ( <= 72 bytes), the instrinsic in c2_MacroAssembler.cpp is used, > it reads ( in case of UU/LL) 8 bytes per loop, and at then end, it reads tail - misaligned load of last 8 bytes from the string. > > so if string length is not 8x bytes long then last load is misaligned, also it performs read/compare of some data which already was processed. > > I have changed that to compare only last length%8 bytes using SHORT_STRING part of intrinsic for UL/LU. But for UU/LL I have made an optimised version. > > Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947) I?ve got no perf drop on thead ( with +AvoidUnalignedAccesses) which supports misaligned access. > > Improvements to inflate_XX() methods gives 3-5% improvements for UL/LU cases on thead, almost no perf change on hifive. > > for large strings, the instrinsics from stubGenerator.cpp is used > for UU/LL - generate_compare_long_string_same_encoding, I have just replaced misaligned ld with load_long_misaligned. Since this tail reading is not on hot path, this give some small penalty for thead when -XX:+AvoidUnalignedAccesses. > > large LU/UL comparision is done in compare_long_string_different_encoding in sutbGenerator.cpp: > These changes are partially based on feilongjiang's patch, but I have changed tail reading to prevent reading past the end of string. I have observed no perf difference between feilongjiang's and my version. > > This also enables regression test for string.Compare which previously was aarch64-only > > Testing: tier1 and tier2 clean on hifive. > > JMH testing, hifive: > before: > > Benchmark (delta) (size) Mode Cnt Score Error Units > StringCompareToDifferentLength.compareToLL 2 24 avgt 9 6.474 ? 1.475 ms/op > StringCompareToDifferentLength.compareToLL 2 36 avgt 9 125.823 ? 
1.947 ms/op > StringCompareToDifferentLength.compareToLL 2 72 avgt 9 10.512 ? 0.236 ms/op > StringCompareToDifferentLength.compareToLL 2 128 avgt 9 13.032 ? 0.821 ms/op > StringCompareToDifferentLength.compareToLL 2 256 avgt 9 18.983 ? 0.318 ms/op > StringCompareToDifferentLength.compareToLL 2 512 avgt 9 29.925 ? 0.458 ms/op > StringCompareToDifferentLength.compareToLL 2 ... Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: change temp register to x7 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14534/files - new: https://git.openjdk.org/jdk/pull/14534/files/27a194c1..7126c414 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14534&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14534&range=03-04 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/14534.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14534/head:pull/14534 PR: https://git.openjdk.org/jdk/pull/14534 From vkempik at openjdk.org Mon Jul 24 13:22:41 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Mon, 24 Jul 2023 13:22:41 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v3] In-Reply-To: References: Message-ID: On Mon, 24 Jul 2023 12:45:58 GMT, Vladimir Kempik wrote: >> Ah, I see you used `t0` instead. I once considered that, but I would still suggest `x7` which would be safer. `t0` as a scratch register are implicitly clobbered/used by so many assembler functions that it's risky to keep a live value in it across assemblers like `inflate_lo32`. `x7` should be a better candidate here as it is only updated at label `CALCULATE_DIFFERENCE` in function generate_compare_long_string_different_encoding. 
> > Good points Updated the PR, this register renaming hasn't changed performance on hifive compared to previous version ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1272254708 From aph at openjdk.org Mon Jul 24 13:26:51 2023 From: aph at openjdk.org (Andrew Haley) Date: Mon, 24 Jul 2023 13:26:51 GMT Subject: RFR: 8310939: [c1] The visibility of write-volatile requires membar_volatile instead of membar In-Reply-To: References: Message-ID: <3gsU9ZWpjWzqPKiDeqIyl87ulxP6mEBma-c_wzcidww=.0ad4a924-cab2-49db-981f-41a9c8bdb715@github.com> On Tue, 27 Jun 2023 12:46:14 GMT, SUN Guoyun wrote: > For c1 now, a volatile write case: > > membar_release // LoadStore | StoreStore > write volatile > membar > > Just like c2, here `membar` should be defined `membar_volatile` clearly, then for risc-v, ppc and loongarch can use StoreLoad for `membar_volatile` for better performance. > > Testing: > GHA testing > jtreg tier1-3 for loongarch64 For a little more clarity: let's imagine that you wanted to have a sequentially-consistent set of stores and loads, using plain memory accesses and fences. This would work: str r1, [x1] ; fullFence; ldr r1, [x1]; fullFence; str r1, [x2]; fullFence; ldr r1, [x2] but this would not: str r1, [x1] ; storeLoad; ldr r1, [x1]; storeLoad; str r1, [x2]; storeLoad; ldr r1, [x2] This is why I wouldn't call storeLoad "membar_volatile". 
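
[Archive note] The need for a barrier between a store and a later load, which the discussion above takes for granted, can be sketched with the classic store-buffering test. This is an illustration in portable C++, not HotSpot code: `count_forbidden_outcomes` is a hypothetical helper name, and `std::atomic_thread_fence(std::memory_order_seq_cst)` is assumed to play the role of the fullFence in the sequence above. C++ has no StoreLoad-only fence, so the weaker variant is not shown.

```cpp
#include <atomic>
#include <thread>

// Store-buffering litmus test (illustrative sketch, not HotSpot code).
// Each thread stores to one variable, executes a full fence, then loads
// the other variable. With a seq_cst fence between the store and the load
// in both threads, the outcome r1 == 0 && r2 == 0 is forbidden.
// Returns the number of iterations in which that outcome was observed.
int count_forbidden_outcomes(int iterations) {
  int forbidden = 0;
  for (int i = 0; i < iterations; i++) {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread a([&] {
      x.store(1, std::memory_order_relaxed);                // store
      std::atomic_thread_fence(std::memory_order_seq_cst);  // full fence
      r1 = y.load(std::memory_order_relaxed);               // load
    });
    std::thread b([&] {
      y.store(1, std::memory_order_relaxed);                // store
      std::atomic_thread_fence(std::memory_order_seq_cst);  // full fence
      r2 = x.load(std::memory_order_relaxed);               // load
    });
    a.join();
    b.join();

    // Safe to read r1 and r2 here: join() synchronizes with thread exit.
    if (r1 == 0 && r2 == 0) forbidden++;
  }
  return forbidden;
}
```

Calling `count_forbidden_outcomes(1000)` from a test should return 0. Removing the two fences makes the both-zero outcome legal, although whether it is actually observed depends on the hardware and on timing.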
------------- PR Comment: https://git.openjdk.org/jdk/pull/14677#issuecomment-1647903628 From amitkumar at openjdk.org Mon Jul 24 13:31:47 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 24 Jul 2023 13:31:47 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <4-SkKe526mi_69V5zOsgHsLNTp1_vHzNaPFJluPXppM=.4ed68247-ac70-41b1-a4b6-e0daf864ab9f@github.com> References: <4-SkKe526mi_69V5zOsgHsLNTp1_vHzNaPFJluPXppM=.4ed68247-ac70-41b1-a4b6-e0daf864ab9f@github.com> Message-ID: On Sat, 22 Jul 2023 06:13:01 GMT, sid8606 wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Address suggestions from Jorn Vernee > > I have narrowed down the issue failing on glibc 2.31 but passes on glibc 2.35 on s390x. > **test_time** test method is failing in StdLibTest.java. > @JornVernee test_printf test method is passing as it is. @sid8606, there is separate PR #14994 submitted for TestLayouts.java for handling the endian-ness issue. You may want to restore it from here. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14801#issuecomment-1647910965 From vkempik at openjdk.org Mon Jul 24 13:51:42 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Mon, 24 Jul 2023 13:51:42 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v3] In-Reply-To: References: Message-ID: On Mon, 24 Jul 2023 05:50:19 GMT, Fei Yang wrote: >> Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: >> >> Simplify case for long LU UL compares > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 2540: > >> 2538: __ ld(tmp2, Address(str2)); >> 2539: __ addi(str2, str2, 8); >> 2540: __ beqz(cnt2, LAST_CHECK_AND_LENGTH_DIFF); > > I wonder when this branch instruction will taken? 
yeah, cnt2 possible values before beqz is [1..7], branch will never be taken ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1272292454 From vkempik at openjdk.org Mon Jul 24 14:16:02 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Mon, 24 Jul 2023 14:16:02 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v6] In-Reply-To: References: Message-ID: <8KFiP2c2bVaIpns8acfNX0dh_FMeGWq4GSHhFCeJ2o0=.a518e5ac-3213-431f-a42d-4968a16777f0@github.com> > Please review this fix. it eliminates misaligned loads in String.compare on risc-v > > for small compares ( <= 72 bytes), the instrinsic in c2_MacroAssembler.cpp is used, > it reads ( in case of UU/LL) 8 bytes per loop, and at then end, it reads tail - misaligned load of last 8 bytes from the string. > > so if string length is not 8x bytes long then last load is misaligned, also it performs read/compare of some data which already was processed. > > I have changed that to compare only last length%8 bytes using SHORT_STRING part of intrinsic for UL/LU. But for UU/LL I have made an optimised version. > > Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947) I?ve got no perf drop on thead ( with +AvoidUnalignedAccesses) which supports misaligned access. > > Improvements to inflate_XX() methods gives 3-5% improvements for UL/LU cases on thead, almost no perf change on hifive. > > for large strings, the instrinsics from stubGenerator.cpp is used > for UU/LL - generate_compare_long_string_same_encoding, I have just replaced misaligned ld with load_long_misaligned. Since this tail reading is not on hot path, this give some small penalty for thead when -XX:+AvoidUnalignedAccesses. 
>
> large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp:
> These changes are partially based on feilongjiang's patch, but I have changed tail reading to prevent reading past the end of string. I have observed no perf difference between feilongjiang's and my version.
>
> This also enables regression test for string.Compare which previously was aarch64-only
>
> Testing: tier1 and tier2 clean on hifive.
>
> JMH testing, hifive:
> before:
>
> Benchmark (delta) (size) Mode Cnt Score Error Units
> StringCompareToDifferentLength.compareToLL 2 24 avgt 9 6.474 ± 1.475 ms/op
> StringCompareToDifferentLength.compareToLL 2 36 avgt 9 125.823 ± 1.947 ms/op
> StringCompareToDifferentLength.compareToLL 2 72 avgt 9 10.512 ± 0.236 ms/op
> StringCompareToDifferentLength.compareToLL 2 128 avgt 9 13.032 ± 0.821 ms/op
> StringCompareToDifferentLength.compareToLL 2 256 avgt 9 18.983 ± 0.318 ms/op
> StringCompareToDifferentLength.compareToLL 2 512 avgt 9 29.925 ± 0.458 ms/op
> StringCompareToDifferentLength.compareToLL 2 ...
Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: remove unused branch ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14534/files - new: https://git.openjdk.org/jdk/pull/14534/files/7126c414..0783c477 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14534&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14534&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14534.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14534/head:pull/14534 PR: https://git.openjdk.org/jdk/pull/14534 From ccheung at openjdk.org Mon Jul 24 16:05:45 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Mon, 24 Jul 2023 16:05:45 GMT Subject: RFR: 8308903: Print detailed info for Java objects in -Xlog:cds+map [v5] In-Reply-To: References: Message-ID: On Fri, 21 Jul 2023 19:57:12 GMT, Ioi Lam wrote: >> This PR adds detailed printing of oop information with `-Xlog:cds+map+oop=trace`, or simply `-Xlog:cds+map*=trace`. The information is useful for debugging contents of the CDS archived heap objects. >> >> The output looks like `oopDesc::print_on(tty)`, but we need to print the pointers using the locations of the objects at runtime. The examples below show how a `String` references its `value` array. 
>> >> Example with `-XX:-UseCompressedOops`: >> >> >> 0x00000000100001f0: @@ Object java.lang.String >> 0x00000000100001f0: 0000006ff6ab8d01 88d47c5b00010290 0000000000010000 0000000010000210 >> - klass: 'java/lang/String' 0x0000000800010290 >> - ---- fields (total size 4 words): >> - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) >> - private final 'coder' 'B' @16 0 (0x00) >> - private 'hashIsZero' 'Z' @17 false (0x00) >> - injected 'flags' 'B' @18 1 (0x01) >> - private final 'value' '[B' @24 0x0000000010000210 [B length: 6 >> 0x0000000010000210: @@ Object [B length: 6 >> 0x0000000010000210: 000000693b708001 00000006000024c8 0000574f5252414e >> - klass: {type array byte} 0x00000008000024c8 >> - 0: 4e N >> - 1: 41 A >> - 2: 52 R >> - 3: 52 R >> - 4: 4f O >> - 5: 57 W >> >> >> Example with `-XX:+UseCompressedOops`. Note that the narrorOop is also printed: >> >> >> 0x00000007ffc001b8: @@ Object (0xfff80037) java.lang.String >> 0x00000007ffc001b8: f6ab8d01 0000006f 00010290 88d47c5b 00010000 fff8003a >> - klass: 'java/lang/String' 0x0000000800010290 >> - ---- fields (total size 3 words): >> - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) >> - private final 'coder' 'B' @16 0 (0x00) >> - private 'hashIsZero' 'Z' @17 false (0x00) >> - injected 'flags' 'B' @18 1 (0x01) >> - private final 'value' '[B' @20 0x00000007ffc001d0 (0xfff8003a) [B length: 6 >> 0x00000007ffc001d0: @@ Object (0xfff8003a) [B length: 6 >> 0x00000007ffc001d0: 3b708001 00000069 000024c8 00000006 5252414e 0000574f >> - klass: {type array byte} 0x00000008000024c8 >> - 0: 4e N >> - 1: 41 A >> - 2: 52 R >> - 3: 52 R >> - 4: 4f O >> - 5: 57 W > > Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: > > - Merge branch 'master' into 8308903-cds-map-detailed-info-for-oops > - added hints in test case > - added test case > - Merge branch 'master' into 8308903-cds-map-detailed-info-for-oops > - @tstuefe and @matias9927 comments > - 8308903: Print detailed info for Java objects in -Xlog:cds+map Just couple of nits. Looks good. src/hotspot/share/cds/archiveBuilder.cpp line 1069: > 1067: ShouldNotReachHere(); > 1068: } > 1069: Blank line added by accident? src/hotspot/share/cds/archiveHeapWriter.cpp line 546: > 544: > 545: BitMap::idx_t idx = requested_field_addr - (Metadata**) _requested_bottom; > 546: return (idx < heap_info->ptrmap()->size()) && (heap_info->ptrmap()->at(idx) == true); For the second condition, would it be clearer to check for non-null? `heap_info->ptrmap()->at(idx) != nullptr` ------------- Marked as reviewed by ccheung (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14841#pullrequestreview-1543778440 PR Review Comment: https://git.openjdk.org/jdk/pull/14841#discussion_r1272446560 PR Review Comment: https://git.openjdk.org/jdk/pull/14841#discussion_r1272448723 From iklam at openjdk.org Mon Jul 24 16:22:45 2023 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 24 Jul 2023 16:22:45 GMT Subject: RFR: 8308903: Print detailed info for Java objects in -Xlog:cds+map [v5] In-Reply-To: References: Message-ID: On Mon, 24 Jul 2023 15:46:22 GMT, Calvin Cheung wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: >> >> - Merge branch 'master' into 8308903-cds-map-detailed-info-for-oops >> - added hints in test case >> - added test case >> - Merge branch 'master' into 8308903-cds-map-detailed-info-for-oops >> - @tstuefe and @matias9927 comments >> - 8308903: Print detailed info for Java objects in -Xlog:cds+map > > src/hotspot/share/cds/archiveBuilder.cpp line 1069: > >> 1067: ShouldNotReachHere(); >> 1068: } >> 1069: > > Blank line added by accident? I added this blank line to separate the following lines from the previous loop, to make the code easier to read. > src/hotspot/share/cds/archiveHeapWriter.cpp line 546: > >> 544: >> 545: BitMap::idx_t idx = requested_field_addr - (Metadata**) _requested_bottom; >> 546: return (idx < heap_info->ptrmap()->size()) && (heap_info->ptrmap()->at(idx) == true); > > For the second condition, would it be clearer to check for non-null? > `heap_info->ptrmap()->at(idx) != nullptr` `heap_info->ptrmap()->at(idx)` return a `bool` to indicate whether this index is marked or not. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14841#discussion_r1272482639 PR Review Comment: https://git.openjdk.org/jdk/pull/14841#discussion_r1272480471 From ccheung at openjdk.org Mon Jul 24 16:22:46 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Mon, 24 Jul 2023 16:22:46 GMT Subject: RFR: 8308903: Print detailed info for Java objects in -Xlog:cds+map [v5] In-Reply-To: References: Message-ID: On Mon, 24 Jul 2023 16:17:30 GMT, Ioi Lam wrote: >> src/hotspot/share/cds/archiveBuilder.cpp line 1069: >> >>> 1067: ShouldNotReachHere(); >>> 1068: } >>> 1069: >> >> Blank line added by accident? > > I added this blank line to separate the following lines from the previous loop, to make the code easier to read. 
Ok >> src/hotspot/share/cds/archiveHeapWriter.cpp line 546: >> >>> 544: >>> 545: BitMap::idx_t idx = requested_field_addr - (Metadata**) _requested_bottom; >>> 546: return (idx < heap_info->ptrmap()->size()) && (heap_info->ptrmap()->at(idx) == true); >> >> For the second condition, would it be clearer to check for non-null? >> `heap_info->ptrmap()->at(idx) != nullptr` > > `heap_info->ptrmap()->at(idx)` returns a `bool` to indicate whether this index is marked or not. I see. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14841#discussion_r1272484271 PR Review Comment: https://git.openjdk.org/jdk/pull/14841#discussion_r1272485022 From iklam at openjdk.org Mon Jul 24 17:59:53 2023 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 24 Jul 2023 17:59:53 GMT Subject: RFR: 8308903: Print detailed info for Java objects in -Xlog:cds+map In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 05:51:58 GMT, Thomas Stuefe wrote: >> This PR adds detailed printing of oop information with `-Xlog:cds+map+oop=trace`, or simply `-Xlog:cds+map*=trace`. The information is useful for debugging contents of the CDS archived heap objects. >> >> The output looks like `oopDesc::print_on(tty)`, but we need to print the pointers using the locations of the objects at runtime. The examples below show how a `String` references its `value` array.
>> >> Example with `-XX:-UseCompressedOops`: >> >> >> 0x00000000100001f0: @@ Object java.lang.String >> 0x00000000100001f0: 0000006ff6ab8d01 88d47c5b00010290 0000000000010000 0000000010000210 >> - klass: 'java/lang/String' 0x0000000800010290 >> - ---- fields (total size 4 words): >> - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) >> - private final 'coder' 'B' @16 0 (0x00) >> - private 'hashIsZero' 'Z' @17 false (0x00) >> - injected 'flags' 'B' @18 1 (0x01) >> - private final 'value' '[B' @24 0x0000000010000210 [B length: 6 >> 0x0000000010000210: @@ Object [B length: 6 >> 0x0000000010000210: 000000693b708001 00000006000024c8 0000574f5252414e >> - klass: {type array byte} 0x00000008000024c8 >> - 0: 4e N >> - 1: 41 A >> - 2: 52 R >> - 3: 52 R >> - 4: 4f O >> - 5: 57 W >> >> >> Example with `-XX:+UseCompressedOops`. Note that the narrorOop is also printed: >> >> >> 0x00000007ffc001b8: @@ Object (0xfff80037) java.lang.String >> 0x00000007ffc001b8: f6ab8d01 0000006f 00010290 88d47c5b 00010000 fff8003a >> - klass: 'java/lang/String' 0x0000000800010290 >> - ---- fields (total size 3 words): >> - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) >> - private final 'coder' 'B' @16 0 (0x00) >> - private 'hashIsZero' 'Z' @17 false (0x00) >> - injected 'flags' 'B' @18 1 (0x01) >> - private final 'value' '[B' @20 0x00000007ffc001d0 (0xfff8003a) [B length: 6 >> 0x00000007ffc001d0: @@ Object (0xfff8003a) [B length: 6 >> 0x00000007ffc001d0: 3b708001 00000069 000024c8 00000006 5252414e 0000574f >> - klass: {type array byte} 0x00000008000024c8 >> - 0: 4e N >> - 1: 41 A >> - 2: 52 R >> - 3: 52 R >> - 4: 4f O >> - 5: 57 W > >> The output looks like oopDesc::print_on(tty), but we need to print the pointers using the locations of the objects at runtime. > > I don't understand. Are you not experimenting with a new allocation mode that would basically make the runtime address unpredictable? Or is this still for the old way of things where the heap archive is mapped as-is ? 
> > Other than that, this looks very useful. Thanks @tstuefe @calvinccheung @matias9927 for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14841#issuecomment-1648354099 From iklam at openjdk.org Mon Jul 24 17:59:55 2023 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 24 Jul 2023 17:59:55 GMT Subject: Integrated: 8308903: Print detailed info for Java objects in -Xlog:cds+map In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 00:39:30 GMT, Ioi Lam wrote: > This PR adds detailed printing of oop information with `-Xlog:cds+map+oop=trace`, or simply `-Xlog:cds+map*=trace`. The information is useful for debugging contents of the CDS archived heap objects. > > The output looks like `oopDesc::print_on(tty)`, but we need to print the pointers using the locations of the objects at runtime. The examples below show how a `String` references its `value` array. > > Example with `-XX:-UseCompressedOops`: > > > 0x00000000100001f0: @@ Object java.lang.String > 0x00000000100001f0: 0000006ff6ab8d01 88d47c5b00010290 0000000000010000 0000000010000210 > - klass: 'java/lang/String' 0x0000000800010290 > - ---- fields (total size 4 words): > - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) > - private final 'coder' 'B' @16 0 (0x00) > - private 'hashIsZero' 'Z' @17 false (0x00) > - injected 'flags' 'B' @18 1 (0x01) > - private final 'value' '[B' @24 0x0000000010000210 [B length: 6 > 0x0000000010000210: @@ Object [B length: 6 > 0x0000000010000210: 000000693b708001 00000006000024c8 0000574f5252414e > - klass: {type array byte} 0x00000008000024c8 > - 0: 4e N > - 1: 41 A > - 2: 52 R > - 3: 52 R > - 4: 4f O > - 5: 57 W > > > Example with `-XX:+UseCompressedOops`. 
Note that the narrowOop is also printed: > > > 0x00000007ffc001b8: @@ Object (0xfff80037) java.lang.String > 0x00000007ffc001b8: f6ab8d01 0000006f 00010290 88d47c5b 00010000 fff8003a > - klass: 'java/lang/String' 0x0000000800010290 > - ---- fields (total size 3 words): > - private 'hash' 'I' @12 -1999340453 (0x88d47c5b) > - private final 'coder' 'B' @16 0 (0x00) > - private 'hashIsZero' 'Z' @17 false (0x00) > - injected 'flags' 'B' @18 1 (0x01) > - private final 'value' '[B' @20 0x00000007ffc001d0 (0xfff8003a) [B length: 6 > 0x00000007ffc001d0: @@ Object (0xfff8003a) [B length: 6 > 0x00000007ffc001d0: 3b708001 00000069 000024c8 00000006 5252414e 0000574f > - klass: {type array byte} 0x00000008000024c8 > - 0: 4e N > - 1: 41 A > - 2: 52 R > - 3: 52 R > - 4: 4f O > - 5: 57 W This pull request has now been integrated. Changeset: 8008e27c Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/8008e27c55030b397e2040bc3cf8408e47edf412 Stats: 597 lines in 9 files changed: 556 ins; 12 del; 29 mod 8308903: Print detailed info for Java objects in -Xlog:cds+map Reviewed-by: stuefe, ccheung ------------- PR: https://git.openjdk.org/jdk/pull/14841 From cjplummer at openjdk.org Mon Jul 24 18:12:52 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 24 Jul 2023 18:12:52 GMT Subject: [jdk21] RFR: 8300051: assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist In-Reply-To: References: Message-ID: On Fri, 21 Jul 2023 22:32:29 GMT, Serguei Spitsyn wrote: > This is a clean 21 backport of the 22 fix: > [JDK-8300051](https://bugs.openjdk.org/browse/JDK-8300051): assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist > > Testing: > - TBD: mach5 tiers 1-5 Looks good. ------------- Marked as reviewed by cjplummer (Reviewer).
PR Review: https://git.openjdk.org/jdk21/pull/143#pullrequestreview-1544008750 From dlong at openjdk.org Mon Jul 24 20:42:53 2023 From: dlong at openjdk.org (Dean Long) Date: Mon, 24 Jul 2023 20:42:53 GMT Subject: Integrated: 8312077: Fix signed integer overflow, final part In-Reply-To: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> References: <9VJGvEdEQ9qjyNL_trN5Nx1XzKffBkFdI3Ktmo0Bcs4=.cc394713-d908-458e-82e0-c5c180a414d1@github.com> Message-ID: On Fri, 14 Jul 2023 08:01:02 GMT, Dean Long wrote: > This is hopefully the last set of integer overflow fixes for hotspot. Some of the counters I changed to unsigned are updated in platform-specific code, so I could use some help testing on arm, ppc, riscv, and s390. This pull request has now been integrated. Changeset: d0761c19 Author: Dean Long URL: https://git.openjdk.org/jdk/commit/d0761c19d1ddafbcb5ea97334335462e716de250 Stats: 321 lines in 31 files changed: 5 ins; 0 del; 316 mod 8312077: Fix signed integer overflow, final part Reviewed-by: kvn, amitkumar ------------- PR: https://git.openjdk.org/jdk/pull/14883 From matsaave at openjdk.org Mon Jul 24 21:20:42 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 24 Jul 2023 21:20:42 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() In-Reply-To: References: Message-ID: On Mon, 17 Jul 2023 16:55:03 GMT, Ioi Lam wrote: >> Currently we exit the VM after static dumping with `MetaspaceShared::exit_after_static_dump()`. >> >> >> // We have finished dumping the static archive. At this point, there may be pending VM >> // operations. We have changed some global states (such as vmClasses::_klasses) that >> // may cause these VM operations to fail. For safety, forget these operations and >> // exit the VM directly. 
>> void MetaspaceShared::exit_after_static_dump() { >> os::_exit(0); >> } >> >> >> As the comment suggests, the VM state is altered when preparing and performing the static dump, so this change aims to prevent these state changes so the VM can exit normally after the static dump completes. There are three major aspects to this change: >> 1. Since the resolved references array in the Constant Pool is altered when preparing for a static dump, a "scratch copy" is created and archived instead >> 2. Symbols are sorted by address and have their hash recalculated. Similarly to point 1, the copies of the symbols that are to be archived have their hashes updated as opposed to the originals. >> 3. The handling of -Xshare:dump during argument parsing such that the VM can continue and exit normally with an exit code of 0. > > src/hotspot/share/classfile/classLoaderData.cpp line 1085: > >> 1083: guarantee(this == class_loader_data(cl) || has_class_mirror_holder(), "Must be the same"); >> 1084: guarantee(cl != nullptr || this == ClassLoaderData::the_null_class_loader_data() || has_class_mirror_holder(), "must be"); >> 1085: } > > Why is this necessary? This seems to be a band-aid solution to a deeper problem: the java platform and system loaders are reset. 
I believe the correct solution is to restore these values after dumping completes ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1272766119 From dlong at openjdk.org Mon Jul 24 22:12:52 2023 From: dlong at openjdk.org (Dean Long) Date: Mon, 24 Jul 2023 22:12:52 GMT Subject: RFR: 8310939: [c1] The visibility of write-volatile requires membar_volatile instead of membar In-Reply-To: References: Message-ID: On Tue, 27 Jun 2023 12:46:14 GMT, SUN Guoyun wrote: > For c1 now, a volatile write case: > > membar_release // LoadStore | StoreStore > write volatile > membar > > Just like c2, here `membar` should be defined `membar_volatile` clearly, then for risc-v, ppc and loongarch can use StoreLoad for `membar_volatile` for better performance. > > Testing: > GHA testing > jtreg tier1-3 for loongarch64 Just to clarify, I was suggesting that the shared code should not try to impose which barriers, if any, are used by the cpu-specific implementation. So a better name for `volatile_write_post_barrier` might have been `volatile_write_post_hook`, where the "volatile_write" part is because that's the logical operation the shared code is doing. The implementation of the hook could be a barrier or even a no-op. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14677#issuecomment-1648693818 From duke at openjdk.org Mon Jul 24 22:17:52 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Mon, 24 Jul 2023 22:17:52 GMT Subject: RFR: 8312623: SA add NestHost and NestMembers attributes when dumping class Message-ID: This patch adds NestHost and NestMembers attributes to the class dumped by SA. Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` Manual testing by dumping `j.l.String` and `j.l.String$CaseInsensitiveComparator` classes. `j.l.String` shows one entry in `NestMembers` attribute for `j.l.String$CaseInsensitiveComparator` and `j.l.String$CaseInsensitiveComparator` has `j.l.String` as its `NestHost`. 
------------- Commit messages: - 8312623: SA add NestHost and NestMembers attributes when dumping class Changes: https://git.openjdk.org/jdk/pull/15005/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15005&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8312623 Stats: 50 lines in 3 files changed: 50 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/15005.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15005/head:pull/15005 PR: https://git.openjdk.org/jdk/pull/15005 From cjplummer at openjdk.org Mon Jul 24 23:32:40 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 24 Jul 2023 23:32:40 GMT Subject: RFR: 8312623: SA add NestHost and NestMembers attributes when dumping class In-Reply-To: References: Message-ID: On Mon, 24 Jul 2023 22:12:28 GMT, Ashutosh Mehra wrote: > This patch adds NestHost and NestMembers attributes to the class dumped by SA. > > Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` > Manual testing by dumping `j.l.String` and `j.l.String$CaseInsensitiveComparator` classes. > `j.l.String` shows one entry in `NestMembers` attribute for `j.l.String$CaseInsensitiveComparator` and `j.l.String$CaseInsensitiveComparator` has `j.l.String` as its `NestHost`. Looks good. ------------- Marked as reviewed by cjplummer (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/15005#pullrequestreview-1544419190 From sspitsyn at openjdk.org Tue Jul 25 00:14:41 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 25 Jul 2023 00:14:41 GMT Subject: [jdk21] RFR: 8300051: assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist In-Reply-To: References: Message-ID: <1dQDc_R7Z3eI73iYTg4kn_9Wnhjsxk2Qh0hlQJqlzf4=.01b56cd7-03c7-4c9c-887c-05ace048f954@github.com> On Fri, 21 Jul 2023 22:32:29 GMT, Serguei Spitsyn wrote: > This is a clean 21 backport of the 22 fix: > [JDK-8300051](https://bugs.openjdk.org/browse/JDK-8300051): assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist > > Testing: > - TBD: mach5 tiers 1-5 Thank you for review, Chris! ------------- PR Comment: https://git.openjdk.org/jdk21/pull/143#issuecomment-1648783712 From sspitsyn at openjdk.org Tue Jul 25 00:14:43 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 25 Jul 2023 00:14:43 GMT Subject: [jdk21] Integrated: 8300051: assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist In-Reply-To: References: Message-ID: On Fri, 21 Jul 2023 22:32:29 GMT, Serguei Spitsyn wrote: > This is a clean 21 backport of the 22 fix: > [JDK-8300051](https://bugs.openjdk.org/browse/JDK-8300051): assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist > > Testing: > - TBD: mach5 tiers 1-5 This pull request has now been integrated. 
Changeset: 817dc554 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk21/commit/817dc554e52bc612f752aedfa4ea9dc3626c4cd8 Stats: 26 lines in 2 files changed: 14 ins; 12 del; 0 mod 8300051: assert(JvmtiEnvBase::environments_might_exist()) failed: to enter event controller, JVM TI environments must exist Reviewed-by: cjplummer Backport-of: 783de32b6af4383b5ba71b91c307a5dddd0dae13 ------------- PR: https://git.openjdk.org/jdk21/pull/143 From jwaters at openjdk.org Tue Jul 25 01:45:50 2023 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 25 Jul 2023 01:45:50 GMT Subject: RFR: 8312502: Mass migrate HotSpot attributes to the correct location [v3] In-Reply-To: References: Message-ID: On Mon, 24 Jul 2023 05:34:04 GMT, Julian Waters wrote: >> Someone had to do it, so I did. Moves attributes to the correct place as specified in the HotSpot Style Guide once and for all > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 41 additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-6 > - Merge branch 'openjdk:master' into patch-6 > - arguments.hpp > - arguments.hpp > - globalDefinitions_gcc.hpp > - assembler_aarch64.hpp > - macroAssembler_aarch64.cpp > - vmError.cpp > - vmError.cpp > - macroAssembler_aarch64.cpp > - ... and 31 more: https://git.openjdk.org/jdk/compare/090c3cd9...d60d8923 Hmm... I was doing this so that what attribute specifiers applied to would be clearer since it's helpful to have the ability to make them apply to different aspects of a method as per C++11. Could I split this up and discard some of the areas in which the changes are less agreeable? 
(as advised by someone who reached out to me privately) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14969#issuecomment-1648839267 From kbarrett at openjdk.org Tue Jul 25 02:30:40 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 25 Jul 2023 02:30:40 GMT Subject: RFR: 8312502: Mass migrate HotSpot attributes to the correct location [v3] In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 01:41:24 GMT, Julian Waters wrote: > Hmm... I was doing this so that what attribute specifiers applied to would be clearer since it's helpful to have the ability to make them apply to different aspects of a method as per C++11. Could I split this up and discard some of the areas in which the changes are less agreeable? (as advised by someone who reached out to me privately) That can be done when other attributes are being added to a declaration. For example, various error reporting functions had the `[[noreturn]]` attribute added by JDK-8303805. Many of them also had a trailing ATTRIBUTE_PRINTF, which was moved to the front at the same time. There were also one or two functions there whose trailing ATTRIBUTE_PRINTF was moved to the front even though not noreturn, for local consistency (e.g. `warning`). (So yes, sometimes it's okay to normalize some outliers. That doesn't seem like what's going on in this PR though.) There currently aren't other standard attributes to be dealt with. That may change as we adopt newer versions of the standard. But even so, the scope of the adoption changes is going to be limited. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14969#issuecomment-1648898589 From coleenp at openjdk.org Tue Jul 25 02:33:57 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 25 Jul 2023 02:33:57 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v2] In-Reply-To: References: Message-ID: > Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. 
Also, reduced include file inclusion a little. > Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Dean's suggestion. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14892/files - new: https://git.openjdk.org/jdk/pull/14892/files/73c2075a..c0e12b5d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14892&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14892&range=00-01 Stats: 5 lines in 1 file changed: 0 ins; 3 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14892.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14892/head:pull/14892 PR: https://git.openjdk.org/jdk/pull/14892 From duke at openjdk.org Tue Jul 25 02:46:44 2023 From: duke at openjdk.org (mmyxym) Date: Tue, 25 Jul 2023 02:46:44 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v18] In-Reply-To: References: Message-ID: On Mon, 24 Jul 2023 10:36:24 GMT, Roman Kennke wrote: >> src/hotspot/share/gc/g1/g1ParScanThreadState.cpp line 210: >> >>> 208: markWord m = obj->mark(); >>> 209: if (m.is_marked()) { >>> 210: obj = obj->forwardee(m); >> >> Shall we have a method "oop::forwardee_not_self" which is guaranteed not to be self-forwarded? So we can remove the self-forward if-check in GC critical path. > > Are there any paths that don't need to handle self-forwarded state? > > Also, it seems to me that the path would be dominated by the load of the mark-word. Testing-and-branching for the self-forwarded bit seems like the minor problem there. It would be nice if we could tell the C++ compiler that the branch is expected to be uncommon, so it could shape the emitted code accordingly, but, afaik, we can't.
> > If we were to micro-optimize the forwarding-decoding, then it would be more useful to optimize the common pattern: > > > if (o->is_forwarded()) { // Loads and tests the mark-word > oop fwd = o->forwardee(); // Loads mark-word again, and decode forwardee. > ... > } > > > To something like: > > > oop fwd = o->forwardee(); // Return nullptr when not forwarded > if (fwd != nullptr) { > ... > } > > > There's a way to improve further on this, as I proposed back in the early versions of #5955, which also avoids the decoding altogether if not forwarded, but it's a little more clunky: > > > OopForwarding fwd(obj); > if (fwd.is_forwarded()) { > oop forwardee = fwd.forwardee(); > ... > } > > > where the scoped OopForwarding object would encapsulate the markWord and the testing and decoding of the fwd-ptr, but it has been rejected back then. But it would certainly help more than an oopDesc::forwardee_not_self() approach (if that were even possible, which I think it is not). Sorry. It's my misunderstanding. G1ParScanThreadState::do_oop_evac still needs to handle self forward. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13779#discussion_r1272936418 From sspitsyn at openjdk.org Tue Jul 25 03:01:40 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 25 Jul 2023 03:01:40 GMT Subject: RFR: 8312623: SA add NestHost and NestMembers attributes when dumping class In-Reply-To: References: Message-ID: On Mon, 24 Jul 2023 22:12:28 GMT, Ashutosh Mehra wrote: > This patch adds NestHost and NestMembers attributes to the class dumped by SA. > > Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` > Manual testing by dumping `j.l.String` and `j.l.String$CaseInsensitiveComparator` classes. > `j.l.String` shows one entry in `NestMembers` attribute for `j.l.String$CaseInsensitiveComparator` and `j.l.String$CaseInsensitiveComparator` has `j.l.String` as its `NestHost`. Looks good. 
Thanks, Serguei ------------- Marked as reviewed by sspitsyn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15005#pullrequestreview-1544557663 From coleenp at openjdk.org Tue Jul 25 03:05:49 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 25 Jul 2023 03:05:49 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v3] In-Reply-To: References: Message-ID: > Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. > Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Missed TriBoolAssigner case. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14892/files - new: https://git.openjdk.org/jdk/pull/14892/files/c0e12b5d..f4f32fb5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14892&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14892&range=01-02 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14892.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14892/head:pull/14892 PR: https://git.openjdk.org/jdk/pull/14892 From dholmes at openjdk.org Tue Jul 25 04:46:50 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 25 Jul 2023 04:46:50 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v3] In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 03:05:49 GMT, Coleen Phillimore wrote: >> Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. >> Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Missed TriBoolAssigner case. Okay. 
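For readers following the 8312121 thread, the two-bit bit-field assignment under discussion can be sketched roughly as follows. This is a hedged illustration only: `TwoBitFlag` and its members are hypothetical names, not the code in tribool.hpp. The only assumptions are that the flag keeps a value bit and an "assigned" bit in a 2-bit field, that the field is assigned first, and that the result of the OR is masked with `& 3` before it is stored back.

```cpp
#include <cassert>

// Hypothetical stand-in for a tri-state flag (not HotSpot's TriBool):
// bit 0 holds the boolean value, bit 1 records that a value was assigned.
class TwoBitFlag {
  unsigned int _value : 2;

 public:
  // Default state: nothing assigned yet.
  TwoBitFlag() : _value(0) {}

  // Assign the value bit first, then set the "assigned" bit.
  explicit TwoBitFlag(bool value) : _value(value ? 1u : 0u) {
    // The OR is computed at full integer width; masking with 3 makes it
    // explicit that only two bits are stored back into the bit-field.
    _value = (_value | 2u) & 3u;
  }

  bool is_default() const { return (_value & 2u) == 0; }
  bool value() const { return (_value & 1u) != 0; }
};
```

Constructing `TwoBitFlag(true)` leaves both bits set, while a default-constructed flag reports `is_default()`. Whether a particular compiler still warns without the mask is not something this sketch tries to establish.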
Beats me why the compiler can't tell that the result of the OR fits in two bits without having to do the `& 3`. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14892#pullrequestreview-1544627104 From dholmes at openjdk.org Tue Jul 25 05:38:39 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 25 Jul 2023 05:38:39 GMT Subject: RFR: 8312623: SA add NestHost and NestMembers attributes when dumping class In-Reply-To: References: Message-ID: On Mon, 24 Jul 2023 22:12:28 GMT, Ashutosh Mehra wrote: > This patch adds NestHost and NestMembers attributes to the class dumped by SA. > > Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` > Manual testing by dumping `j.l.String` and `j.l.String$CaseInsensitiveComparator` classes. > `j.l.String` shows one entry in `NestMembers` attribute for `j.l.String$CaseInsensitiveComparator` and `j.l.String$CaseInsensitiveComparator` has `j.l.String` as its `NestHost`. We need to be sure this works as expected for top-level classes that have no nest members, and deeply nested nest members, plus dynamically injected hidden classes that are nest members. I'm unclear if this is intended to only expose the same details as would be statically defined in the attribute in the classfile? ------------- PR Comment: https://git.openjdk.org/jdk/pull/15005#issuecomment-1649153163 From fyang at openjdk.org Tue Jul 25 06:42:42 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 25 Jul 2023 06:42:42 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v6] In-Reply-To: <8KFiP2c2bVaIpns8acfNX0dh_FMeGWq4GSHhFCeJ2o0=.a518e5ac-3213-431f-a42d-4968a16777f0@github.com> References: <8KFiP2c2bVaIpns8acfNX0dh_FMeGWq4GSHhFCeJ2o0=.a518e5ac-3213-431f-a42d-4968a16777f0@github.com> Message-ID: On Mon, 24 Jul 2023 14:16:02 GMT, Vladimir Kempik wrote: >> Please review this fix. 
it eliminates misaligned loads in String.compare on risc-v >> >> for small compares ( <= 72 bytes), the intrinsic in c2_MacroAssembler.cpp is used, >> it reads ( in case of UU/LL) 8 bytes per loop, and at the end, it reads tail - misaligned load of last 8 bytes from the string. >> >> so if string length is not 8x bytes long then last load is misaligned, also it performs read/compare of some data which already was processed. >> >> I have changed that to compare only last length%8 bytes using SHORT_STRING part of intrinsic for UL/LU. But for UU/LL I have made an optimised version. >> >> Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947) I've got no perf drop on thead ( with +AvoidUnalignedAccesses) which supports misaligned access. >> >> Improvements to inflate_XX() methods give 3-5% improvements for UL/LU cases on thead, almost no perf change on hifive. >> >> for large strings, the intrinsic from stubGenerator.cpp is used >> for UU/LL - generate_compare_long_string_same_encoding, I have just replaced misaligned ld with load_long_misaligned. Since this tail reading is not on hot path, this gives some small penalty for thead when -XX:+AvoidUnalignedAccesses. >> >> large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp: >> These changes are partially based on feilongjiang's patch, but I have changed tail reading to prevent reading past the end of string. I have observed no perf difference between feilongjiang's and my version. >> >> This also enables regression test for string.Compare which previously was aarch64-only >> >> Testing: tier1 and tier2 clean on hifive. >> >> JMH testing, hifive: >> before: >> >> Benchmark (delta) (size) Mode Cnt Score Error Units >> StringCompareToDifferentLength.compareToLL 2 24 avgt 9 6.474 ±
1.475 ms/op >> StringCompareToDifferentLength.compareToLL 2 36 avgt 9 125.823 ± 1.947 ms/op >> StringCompareToDifferentLength.compareToLL 2 72 avgt 9 10.512 ± 0.236 ms/op >> StringCompareToDifferentLength.compareToLL 2 128 avgt 9 13.032 ± 0.821 ms/op >> StringCompareToDifferentLength.compareToLL 2 256 avgt 9 18.983 ± 0.318 ms/op >> StringCompareToDifferentLength.compareToLL 2 512 avgt 9 29.925 ± ... > > Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: > > remove unused branch Hi, would you mind one more tweak? Thanks. src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 2364: > 2362: tmpU = isLU ? tmp5 : tmp1, // where to keep U for comparison > 2363: tmpL = isLU ? tmp1 : tmp5; // where to keep L for comparison > 2364: I see `cnt1`(x12) is unused now. We could make use of this to further eliminate use of both `tmp4` (x7) and `tmp5` (x31). Let: tmpU = isLU ? tmp2 : tmp1, // where to keep U for comparison tmpL = isLU ? tmp1 : tmp2; // where to keep L for comparison And use `cnt1` and `tmp2` instead where `tmp4` and `tmp5` are used respectively. This would help save one `mv` and saving & restoring of `tmp4` and `tmp5`. Also remember to change `tmpLval` from `tmp4` (x7) to `cnt1`(x12) in compare_string_8_x_LU. src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 2391: > 2389: > 2390: __ addi(t0, cnt2, wordSize); > 2391: __ addi(cnt2, cnt2, wordSize*2); // amount of characters left to process Indentation: s/__ addi(cnt2, cnt2, wordSize*2);/__ addi(cnt2, cnt2, wordSize * 2);/ src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 2410: > 2408: __ bnez(tmp3, CALCULATE_DIFFERENCE); > 2409: > 2410: __ addi(strL, strL, wordSize/2); // Address of last 4 bytes in Latin1 string Indentation: s/__ addi(strL, strL, wordSize/2);/__ addi(strL, strL, wordSize / 2);/ ------------- Changes requested by fyang (Reviewer).
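The tail handling described in the PR text quoted above can be illustrated in portable C++. This is only a sketch under stated assumptions: `compare_bytes_no_overread` is a hypothetical helper, not the RISC-V intrinsic, and it uses `memcpy` for each 8-byte load so that no misaligned word access is written in the source. It compares 8 bytes per loop iteration and then handles only the remaining `len % 8` bytes one at a time, instead of re-reading the last 8 bytes of the string.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: returns the difference of the first mismatching
// bytes (as unsigned values), or 0 if the first len bytes are equal.
int compare_bytes_no_overread(const unsigned char* a,
                              const unsigned char* b,
                              size_t len) {
  size_t i = 0;
  // Main loop: 8 bytes per iteration. memcpy lets the compiler choose a
  // load sequence that is legal for the target at any pointer alignment.
  for (; i + 8 <= len; i += 8) {
    uint64_t wa;
    uint64_t wb;
    std::memcpy(&wa, a + i, 8);
    std::memcpy(&wb, b + i, 8);
    if (wa != wb) {
      break;  // the byte loop below locates the exact mismatch
    }
  }
  // Remaining bytes (the tail, or the word that differed): byte by byte,
  // never reading at or beyond index len.
  for (; i < len; i++) {
    if (a[i] != b[i]) {
      return static_cast<int>(a[i]) - static_cast<int>(b[i]);
    }
  }
  return 0;
}
```

The byte-wise tail is the simplest way to stay inside the buffer; a real intrinsic would more likely use narrower aligned loads or shifts for the tail, which this sketch does not attempt to model.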
PR Review: https://git.openjdk.org/jdk/pull/14534#pullrequestreview-1544726633 PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1273057036 PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1273058345 PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1273059841 From eosterlund at openjdk.org Tue Jul 25 07:27:46 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Tue, 25 Jul 2023 07:27:46 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 16:12:13 GMT, Ioi Lam wrote: >>> I hope to implement a fast path for relocation that avoids using the hash tables at all. If we can get the total alloc + reloc time to be about 1.5ms, then it would be just as fast as before when relocation is enabled. >> >> I've implemented a fast relocation lookup. It currently uses a table of the same size as the archived heap objects, but I can reduce that to 1/2 the size. >> >> See https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc?expand=1 >> >> This is implemented by about 330 lines of code in archiveHeapLoader.cpp. The code is templatized to try out different approaches (like `-XX:+NahlRawAlloc` and `-XX:+NahlUseAccessAPI`), so it can be further simplified. >> >> There's only one thing that's not yet implemented -- the equivalent of `ArchiveHeapLoader::patch_native_pointers()`. I'll do that next. >> >> >> $ java -XX:+NewArchiveHeapLoading -Xmx128m -Xlog:cds+gc --version >> [0.004s][info][cds,gc] Delayed allocation records alloced: 640 >> [0.004s][info][cds,gc] Load Time: 1388458 >> >> >> The whole allocation + reloc takes about 1.4ms. It's about 1.25ms slower in the worst case (when the "old" code doesn't have to relocate -- see the `(**)` in the table below). It's 0.8ms slower when the "old" code has to relocate.
>> >> >> All times are in ms, for "java --version" >> >> ==================================== >> Dump: java -Xshare:dump -Xmx128m >> >> G1 old new diff >> 128m 14.476 15.754 +1.277 (**) >> 8192m 15.359 16.085 +0.726 >> >> >> Serial old new >> 128m 13.442 14.241 +0.798 >> 8192m 13.740 14.532 +0.791 >> >> ==================================== >> Dump: java -Xshare:dump -Xmx8192m >> >> G1 old new diff >> 128m 14.975 15.787 +0.812 >> 2048m 16.239 17.035 +0.796 >> 8192m 14.821 16.042 +1.221 (**) >> >> >> Serial old new >> 128m 13.444 14.167 +0.723 >> 8192m 13.717 14.502 +0.785 >> >> >> While the code is slower than before, it's a lot simpler. It works on all collectors. I tested on ZGC, but I think Shenandoah should work as well. >> >> The cost is about 1.3 ms per MB of archived heap objects. This may be acceptable as it's a small fraction of JVM bootstrap. We have about 1MB of archived objects now, and we don't expect this size to drastically increase in the near future. >> >> The extra memory cost is: >> >> - a temporary in-memory copy of the archived heap o... > >> @iklam can you please elaborate a bit on relocation optimizations being done by the patch. Without any background on the idea, it is difficult to infer it from the code. > > The algorithm tries to materialize all objects and relocate their oop fields in a single pass. (Each object has a "stream address" (its location in the input stream) and a "materialized address" (its location in the runtime heap)) > > - Materialize one object from the input stream > - Enter the materialized address of this object in the `reloc_table`. Since the input stream is contiguous, we can index `reloc_table` by computing the offset of the `stream` address of this object to the bottom of the input stream. 
> - For each non-null oop pointer in the materialized object: > - If the pointee's stream address is lower than that of the current object, update the pointer with the pointee's materialized address, which is already in `reloc_table` > - Otherwise, enter the location of this pointer into `reloc_table`, as a linked-list of the `Dst` type, at the `reloc_table` entry of the pointee. When the pointee is materialized, it walks its own `Dst` list, and relocate all pointers to itself. > > My branch contains a separate patch for [JDK-8251330](https://bugs.openjdk.org/browse/JDK-8251330) -- the input stream is ordered such that: > - the first 50% of the input stream contains no pointers, so relocation can be skipped altogether > - in the remaining input stream, about 90% of the 43225 pointers are pointing below the current object, so they can be relocated quickly. Less than 640 `Dst` are needed for the "delayed relocation". Excellent work, @iklam! This work looks very promising to me. Seems like the price we have to pay for the generality is very small indeed, and well worth it. Thank you for doing this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1649269538 From vkempik at openjdk.org Tue Jul 25 08:07:09 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Tue, 25 Jul 2023 08:07:09 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v7] In-Reply-To: References: Message-ID: <_xSyRKiDdm8aS0XzBLaID7PGJwmZWjXYN0ojaG06k-0=.d64f9527-0a34-4abc-b7da-b36e42c9e860@github.com> > Please review this fix. it eliminates misaligned loads in String.compare on risc-v > > for small compares ( <= 72 bytes), the instrinsic in c2_MacroAssembler.cpp is used, > it reads ( in case of UU/LL) 8 bytes per loop, and at then end, it reads tail - misaligned load of last 8 bytes from the string. 
> > So if the string length is not a multiple of 8 bytes, the last load is misaligned; it also reads and compares some data which was already processed. > > I have changed that to compare only the last length%8 bytes using the SHORT_STRING part of the intrinsic for UL/LU. But for UU/LL I have made an optimised version. > > Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947) I've got no perf drop on thead (with +AvoidUnalignedAccesses), which supports misaligned access. > > Improvements to the inflate_XX() methods give a 3-5% improvement for UL/LU cases on thead, almost no perf change on hifive. > > For large strings, the intrinsic from stubGenerator.cpp is used. > For UU/LL - generate_compare_long_string_same_encoding, I have just replaced the misaligned ld with load_long_misaligned. Since this tail reading is not on the hot path, this gives some small penalty for thead when -XX:+AvoidUnalignedAccesses. > > Large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp: > These changes are partially based on feilongjiang's patch, but I have changed tail reading to prevent reading past the end of the string. I have observed no perf difference between feilongjiang's and my version. > > This also enables the regression test for String.compare, which previously was aarch64-only. > > Testing: tier1 and tier2 clean on hifive. > > JMH testing, hifive: > before: > > Benchmark (delta) (size) Mode Cnt Score Error Units > StringCompareToDifferentLength.compareToLL 2 24 avgt 9 6.474 ± 1.475 ms/op > StringCompareToDifferentLength.compareToLL 2 36 avgt 9 125.823 ± 1.947 ms/op > StringCompareToDifferentLength.compareToLL 2 72 avgt 9 10.512 ± 0.236 ms/op > StringCompareToDifferentLength.compareToLL 2 128 avgt 9 13.032 ± 0.821 ms/op > StringCompareToDifferentLength.compareToLL 2 256 avgt 9 18.983 ±
0.318 ms/op > StringCompareToDifferentLength.compareToLL 2 512 avgt 9 29.925 ± 0.458 ms/op > StringCompareToDifferentLength.compareToLL 2 ... Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: reduce registers usage and nits ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14534/files - new: https://git.openjdk.org/jdk/pull/14534/files/0783c477..f6e066e5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14534&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14534&range=05-06 Stats: 14 lines in 1 file changed: 0 ins; 4 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/14534.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14534/head:pull/14534 PR: https://git.openjdk.org/jdk/pull/14534 From vkempik at openjdk.org Tue Jul 25 08:07:09 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Tue, 25 Jul 2023 08:07:09 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v6] In-Reply-To: References: <8KFiP2c2bVaIpns8acfNX0dh_FMeGWq4GSHhFCeJ2o0=.a518e5ac-3213-431f-a42d-4968a16777f0@github.com> Message-ID: On Tue, 25 Jul 2023 06:32:26 GMT, Fei Yang wrote: >> Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: >> >> remove unused branch > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 2364: > >> 2362: tmpU = isLU ? tmp5 : tmp1, // where to keep U for comparison >> 2363: tmpL = isLU ? tmp1 : tmp5; // where to keep L for comparison >> 2364: > > I see `cnt1`(x12) is unused now. We could make use of this to further eliminate use of both `tmp4` (x7) and `tmp5` (x31). > Let: > > tmpU = isLU ? tmp2 : tmp1, // where to keep U for comparison > tmpL = isLU ? tmp1 : tmp2; // where to keep L for comparison > > And use `cnt1` and `tmp2` instead where `tmp4` and `tmp5` are used respectively. This would help save one `mv` and saving & restoring of `tmp4` and `tmp5`.
Also remember to change `tmpLval` from `tmp4` (x7) to `cnt1`(x12) in compare_string_8_x_LU. It works, no difference in performance tho ( UL and LU cases) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1273154473 From jpai at openjdk.org Tue Jul 25 09:06:54 2023 From: jpai at openjdk.org (Jaikiran Pai) Date: Tue, 25 Jul 2023 09:06:54 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers [v4] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 12:17:31 GMT, Coleen Phillimore wrote: >> Please review changes to fix -Wconversion warnings that come from assembler_.cpp by adding narrow_casts to the emit_int8,16,24, and 32 functions. And some other fixups with checked_cast. >> >> Ran tier1 on Oracle platforms, and tier1-4 on linux-x64-debug, linux-aarch64-debug, windows-x64-debug. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix indentation, thanks for pointing that out @aph. Hello Coleen, looks like the github actions job failure in this PR for windows-aarch64 are genuine - I see it failing in some other recent unrelated PRs containing this commit, like here https://github.com/openjdk/jdk/pull/15012 === Output from failing command(s) repeated here === * For target hotspot_variant-server_libjvm_gtest_objs_BUILD_GTEST_LIBJVM_pch.obj: BUILD_GTEST_LIBJVM_pch.cpp d:\a\jdk\jdk\src\hotspot\cpu\aarch64\assembler_aarch64.hpp(657): error C2220: the following warning is treated as an error d:\a\jdk\jdk\src\hotspot\cpu\aarch64\assembler_aarch64.hpp(657): warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?) ... (rest of output omitted) * All command lines available in /d/a/jdk/jdk/build/windows-aarch64/make-support/failure-logs. === End of repeated output === No indication of failed target found. HELP: Try searching the build log for '] Error'. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14822#issuecomment-1649423711 From aph at openjdk.org Tue Jul 25 11:20:45 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 25 Jul 2023 11:20:45 GMT Subject: RFR: 8312502: Mass migrate HotSpot attributes to the correct location [v3] In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 02:28:02 GMT, Kim Barrett wrote: > Could I split this up and discard some of the areas in which the changes are less agreeable? (as advised by someone who reached out to me privately) I'm a little surprised that you're persisting with this. Kim said: > https://openjdk.org/guide/#things-to-consider-before-proposing-changes-to-openjdk-code > This change looks like a case of pure "Modernizing", which isn't looked on > particularly favorably for its own sake. There generally needs to be some > additional benefits. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14969#issuecomment-1649638229 From coleenp at openjdk.org Tue Jul 25 12:25:41 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 25 Jul 2023 12:25:41 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v3] In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 03:05:49 GMT, Coleen Phillimore wrote: >> Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. >> Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Missed TriBoolAssigner case. This doesn't compile on Windows, without a whole lot of other casts. I like my solution better now. 
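For readers following the `-Wconversion` discussion, here is a minimal sketch of why the casts come up at all. It is not the actual tribool.hpp code: `u1` is a stand-in alias for HotSpot's unsigned 8-bit type, and `TriBoolSketch` and its member names are hypothetical. In C++, `_value | 2` is computed in `int` after integer promotion, so storing it back into a narrow member is a narrowing conversion. Whether a particular compiler warns about it varies, and an explicit cast states the intent either way.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

using u1 = std::uint8_t;  // stand-in for HotSpot's u1 (assumption)

// Hypothetical two-bit tri-state: bit 0 holds the value, bit 1 marks "has been set".
class TriBoolSketch {
  u1 _value;
 public:
  TriBoolSketch() : _value(0) {}
  // The inner cast converts bool to u1; the outer cast narrows the promoted
  // int result of `|` back to u1 explicitly.
  explicit TriBoolSketch(bool b)
      : _value(static_cast<u1>(static_cast<u1>(b) | 2)) {}
  bool is_default() const { return (_value & 2) == 0; }
  bool value() const { return (_value & 1) != 0; }
};

// Both operands are promoted to int, so the result type of `|` is int.
static_assert(std::is_same<decltype(u1{1} | 2), int>::value,
              "u1 | int promotes to int");
```

A default-constructed `TriBoolSketch` reports `is_default()`, while `TriBoolSketch(true)` does not and reports `value()` as true.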
------------- PR Comment: https://git.openjdk.org/jdk/pull/14892#issuecomment-1649736188 From coleenp at openjdk.org Tue Jul 25 12:56:54 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 25 Jul 2023 12:56:54 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers [v4] In-Reply-To: References: Message-ID: <83Td5jRPH10B33fu1lem7ZMK7P0hm2DTQG00flWvkR4=.713a3a3c-ad85-4744-bc2f-60dbd330145f@github.com> On Tue, 25 Jul 2023 09:04:12 GMT, Jaikiran Pai wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix indentation, thanks for pointing that out @aph. > > Hello Coleen, looks like the github actions job failure in this PR for windows-aarch64 are genuine - I see it failing in some other recent unrelated PRs containing this commit, like here https://github.com/openjdk/jdk/pull/15012 > > > === Output from failing command(s) repeated here === > * For target hotspot_variant-server_libjvm_gtest_objs_BUILD_GTEST_LIBJVM_pch.obj: > BUILD_GTEST_LIBJVM_pch.cpp > d:\a\jdk\jdk\src\hotspot\cpu\aarch64\assembler_aarch64.hpp(657): error C2220: the following warning is treated as an error > d:\a\jdk\jdk\src\hotspot\cpu\aarch64\assembler_aarch64.hpp(657): warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?) > ... (rest of output omitted) > > * All command lines available in /d/a/jdk/jdk/build/windows-aarch64/make-support/failure-logs. > === End of repeated output === > > No indication of failed target found. > HELP: Try searching the build log for '] Error'. @jaikiran I don't think we have that platform. Do you? Can you fix it? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14822#issuecomment-1649786946 From coleenp at openjdk.org Tue Jul 25 15:16:06 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 25 Jul 2023 15:16:06 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v4] In-Reply-To: References: Message-ID: > Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. > Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add some more u1 casts to keep Windows compiler happy hopefully. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14892/files - new: https://git.openjdk.org/jdk/pull/14892/files/f4f32fb5..f135916d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14892&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14892&range=02-03 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14892.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14892/head:pull/14892 PR: https://git.openjdk.org/jdk/pull/14892 From matsaave at openjdk.org Tue Jul 25 15:28:56 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 25 Jul 2023 15:28:56 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v10] In-Reply-To: References: Message-ID: On Sun, 23 Jul 2023 23:04:32 GMT, David Holmes wrote: >> src/hotspot/share/oops/resolvedFieldEntry.hpp line 125: >> >>> 123: _tos = tos; >>> 124: >>> 125: // This has to be done last >> >> Maybe better use `OrderAccess::release()` here instead of 2x `Atomic::release_store` above? That would fit to the comment. > > Is the comment accurate? The semantics would be different. Does seeing the update to `_put_code` require that the update to `_get_code` is also seen? 
The comment seems to be unclear so I will update it. The fields `_get_code` and `_put_code` do not require the other to be seen; they just have to be set after all of the other fields. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14129#discussion_r1273726943 From matsaave at openjdk.org Tue Jul 25 15:38:11 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 25 Jul 2023 15:38:11 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v11] In-Reply-To: References: Message-ID: > 8301996: Move field resolution information out of the cpCache Matias Saavedra Silva has updated the pull request incrementally with two additional commits since the last revision: - RISCV Fix and PPC port - Rename tos to tos_state and comment improvement ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14129/files - new: https://git.openjdk.org/jdk/pull/14129/files/8ae1e1c1..14e8297d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=09-10 Stats: 166 lines in 8 files changed: 85 ins; 15 del; 66 mod Patch: https://git.openjdk.org/jdk/pull/14129.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14129/head:pull/14129 PR: https://git.openjdk.org/jdk/pull/14129 From matsaave at openjdk.org Tue Jul 25 15:49:10 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 25 Jul 2023 15:49:10 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v12] In-Reply-To: References: Message-ID: > The current structure used to store the resolution information for fields, ConstantPoolCacheEntry, is difficult to interpret due to its ambiguous fields f1 and f2. This structure can hold information for fields and methods and each of its fields can hold different types of values depending on the entry type.
> > This enhancement introduces a new data structure that stores the necessary resolution data in an intuitive and extensible manner. These resolved entries are stored in an array inside the constant pool cache in a very similar manner to invokedynamic entries in JDK-8301995. > > Instances of ConstantPoolCacheEntry related to field resolution have been replaced with the new ResolvedFieldEntry. Verified with tier 1-9 tests. > > This change supports the following platforms: x86, aarch64, PPC, and RISCV. Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Fix tos_state ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14129/files - new: https://git.openjdk.org/jdk/pull/14129/files/14e8297d..a090b482 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14129&range=10-11 Stats: 5 lines in 2 files changed: 0 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/14129.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14129/head:pull/14129 PR: https://git.openjdk.org/jdk/pull/14129 From matsaave at openjdk.org Tue Jul 25 18:48:58 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 25 Jul 2023 18:48:58 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() [v2] In-Reply-To: References: Message-ID: > Currently we exit the VM after static dumping with `MetaspaceShared::exit_after_static_dump()`. > > > // We have finished dumping the static archive. At this point, there may be pending VM > // operations. We have changed some global states (such as vmClasses::_klasses) that > // may cause these VM operations to fail. For safety, forget these operations and > // exit the VM directly.
> void MetaspaceShared::exit_after_static_dump() { > os::_exit(0); > } > > > As the comment suggests, the VM state is altered when preparing and performing the static dump, so this change aims to prevent these state changes so the VM can exit normally after the static dump completes. There are three major aspects to this change: > 1. Since the resolved references array in the Constant Pool is altered when preparing for a static dump, a "scratch copy" is created and archived instead > 2. Symbols are sorted by address and have their hash recalculated. Similarly to point 1, the copies of the symbols that are to be archived have their hashes updated as opposed to the originals. > 3. The handling of -Xshare:dump during argument parsing such that the VM can continue and exit normally with an exit code of 0. > > Verified with tier 1-9 tests. Matias Saavedra Silva has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: - Merge fix - Restores java loaders - Ioi and David comments - Windows fix - 8306582: Remove MetaspaceShared::exit_after_static_dump() ------------- Changes: https://git.openjdk.org/jdk/pull/14879/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14879&range=01 Stats: 127 lines in 9 files changed: 95 ins; 27 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/14879.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14879/head:pull/14879 PR: https://git.openjdk.org/jdk/pull/14879 From matsaave at openjdk.org Tue Jul 25 18:48:59 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 25 Jul 2023 18:48:59 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 21:25:17 GMT, Matias Saavedra Silva wrote: > Currently we exit the VM after static dumping with `MetaspaceShared::exit_after_static_dump()`. > > > // We have finished dumping the static archive. 
At this point, there may be pending VM > // operations. We have changed some global states (such as vmClasses::_klasses) that > // may cause these VM operations to fail. For safety, forget these operations and > // exit the VM directly. > void MetaspaceShared::exit_after_static_dump() { > os::_exit(0); > } > > > As the comment suggests, the VM state is altered when preparing and performing the static dump, so this change aims to prevent these state changes so the VM can exit normally after the static dump completes. There are three major aspects to this change: > 1. Since the resolved references array in the Constant Pool is altered when preparing for a static dump, a "scratch copy" is created and archived instead > 2. Symbols are sorted by address and have their hash recalculated. Similarly to point 1, the copies of the symbols that are to be archived have their hashes updated as opposed to the originals. > 3. The handling of -Xshare:dump during argument parsing such that the VM can continue and exit normally with an exit code of 0. > > Verified with tier 1-9 tests. I messed up a merge and pushed the incorrect contents to this PR. It will be temporarily reverted to a draft and force pushed to undo my mistake. The existing comments should be addressed in the most recent commits. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14879#issuecomment-1649978355 From dchuyko at openjdk.org Tue Jul 25 20:55:03 2023 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Tue, 25 Jul 2023 20:55:03 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v2] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. 
It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long-running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such a case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives. Methods in general are often inlined, and this information is hard to track down. > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Obviously there is a performance penalty, so it should be applied with care. Hot code will most likely be recompiled soon, as nothing happens to its hotness. > > A new flag '`-d`' has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and marks for deoptimization those methods that have any active non-default matching compiler directives. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be deoptimized.
On the other hand, if there are rules for a method and it was inlined, top-level methods won't be deoptimized, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives` has been added for convenience. It's like a combinatio... Dmitry Chuyko has updated the pull request incrementally with one additional commit since the last revision: Recompilation instead of deoptimization ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14111/files - new: https://git.openjdk.org/jdk/pull/14111/files/7f9badf8..d0bcf852 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=00-01 Stats: 87 lines in 9 files changed: 38 ins; 10 del; 39 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From serb at openjdk.org Tue Jul 25 21:02:56 2023 From: serb at openjdk.org (Sergey Bylokhov) Date: Tue, 25 Jul 2023 21:02:56 GMT Subject: RFR: 8295017: Remove Windows specific workaround in JLI_Snprintf [v3] In-Reply-To: <0wUuynDia128uyCaMmWi7BltH8HQcyI-CKcyGcP_Ucc=.89942c4d-b2a5-4fd2-8599-0c43745057a6@github.com> References: <0wUuynDia128uyCaMmWi7BltH8HQcyI-CKcyGcP_Ucc=.89942c4d-b2a5-4fd2-8599-0c43745057a6@github.com> Message-ID: On Thu, 13 Oct 2022 14:34:25 GMT, Julian Waters wrote: >> Looks good. Thanks. > > @dholmes-ora could I trouble you for a sponsor? Thanks! @TheShermanTanker I am working on a similar cleanup and wonder whether it is correct to assume that "snprintf" adds "nul" even in case of error. For example, this code was removed by this patch: if (rc < 0) { /* apply ansi semantics */ buffer[size - 1] = '\0'; return (int)size; } else if (rc == size) { /* force a null terminator */ buffer[size - 1] = '\0'; } If the result of "snprintf" was negative we always set the '\0'. But what about the default "snprintf"?
------------- PR Comment: https://git.openjdk.org/jdk/pull/10625#issuecomment-1650553077 From jvernee at openjdk.org Tue Jul 25 21:28:45 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Tue, 25 Jul 2023 21:28:45 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Sat, 8 Jul 2023 10:48:15 GMT, sid8606 wrote: >> Implementation of "Foreign Function & Memory API" for s390x (Big Endian). > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Address suggestions from Jorn Vernee src/hotspot/cpu/s390/downcallLinker_s390.cpp line 162: > 160: > 161: assert(!_needs_return_buffer, "unexpected needs_return_buffer"); > 162: bool should_save_return_value = _needs_transition;; This should always be `true`, so I don't think you need the `if` statements around the spill/fill code below. See: https://github.com/openjdk/jdk/pull/15025 (`should_save_return_value` being dependent on `_needs_transition` is a bug). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1274138222 From dchuyko at openjdk.org Tue Jul 25 22:02:56 2023 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Tue, 25 Jul 2023 22:02:56 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v3] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. 
If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long-running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such a case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives. Methods in general are often inlined, and this information is hard to track down. > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Obviously there is a performance penalty, so it should be applied with care. Hot code will most likely be recompiled soon, as nothing happens to its hotness. > > A new flag '`-d`' has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and marks for deoptimization those methods that have any active non-default matching compiler directives. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be deoptimized. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be deoptimized, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives` has been added for convenience.
It's like a combinatio... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: - Merge master - Recompilation instead of deoptimization - Merge branch 'openjdk:master' into compiler-directives-force-update - Formatting - Formatting - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Correct arguments info for new commands - Update through de-optimization ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=02 Stats: 267 lines in 14 files changed: 224 ins; 2 del; 41 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From dchuyko at openjdk.org Tue Jul 25 22:09:04 2023 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Tue, 25 Jul 2023 22:09:04 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v4] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. 
For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive, but this does not affect the application behavior. In such a case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives. Methods in general are often inlined, and this information is hard to track down. > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Obviously there is a performance penalty, so it should be applied with care. Hot code will most likely be recompiled soon, as nothing happens to its hotness. > > A new flag '`-d`' has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and marks for deoptimization those methods that have any active non-default matching compiler directives. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be deoptimized. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be deoptimized, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives` has been added for convenience. It's like a combinatio... 
Dmitry Chuyko has updated the pull request incrementally with one additional commit since the last revision: jcheck fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14111/files - new: https://git.openjdk.org/jdk/pull/14111/files/40de84af..2d8cf4e9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From jwaters at openjdk.org Tue Jul 25 23:30:56 2023 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 25 Jul 2023 23:30:56 GMT Subject: RFR: 8295017: Remove Windows specific workaround in JLI_Snprintf [v3] In-Reply-To: References: <0wUuynDia128uyCaMmWi7BltH8HQcyI-CKcyGcP_Ucc=.89942c4d-b2a5-4fd2-8599-0c43745057a6@github.com> Message-ID: On Tue, 25 Jul 2023 21:00:17 GMT, Sergey Bylokhov wrote: >> @dholmes-ora could I trouble you for a sponsor? Thanks! > > @TheShermanTanker Working on a similar cleanup, and wonder if is it correct to assume that the "snprintf" adds "nul" even in case of error. > For example, this code was removed by this patch: > > > if (rc < 0) { > /* apply ansi semantics */ > buffer[size - 1] = '\0'; > return (int)size; > } else if (rc == size) { > /* force a null terminator */ > buffer[size - 1] = '\0'; > } > > > If the result of "snprintf" was negative we always set the '\0'. But what about default "snprintf"? @mrserb snprintf in the UCRT does indeed null terminate the buffer, conforming to C99. 
The relevant behaviour can be found documented under https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/snprintf-snprintf-snprintf-l-snwprintf-snwprintf-l?view=msvc-170#behavior-summary ------------- PR Comment: https://git.openjdk.org/jdk/pull/10625#issuecomment-1650700451 From serb at openjdk.org Tue Jul 25 23:55:00 2023 From: serb at openjdk.org (Sergey Bylokhov) Date: Tue, 25 Jul 2023 23:55:00 GMT Subject: RFR: 8295017: Remove Windows specific workaround in JLI_Snprintf [v5] In-Reply-To: References: Message-ID: On Thu, 13 Oct 2022 14:48:29 GMT, Julian Waters wrote: >> The C99 snprintf is available with Visual Studio 2015 and above, alongside Windows 10 and the UCRT, and is no longer identical to the outdated Windows _snprintf. Since support for the Visual C++ 2017 compiler was removed a while ago, we can now safely remove the compatibility workaround on Windows and have JLI_Snprintf simply delegate to snprintf. > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-1 > - Merge branch 'openjdk:master' into patch-1 > - Comment documenting change isn't required > - Merge branch 'openjdk:master' into patch-1 > - Comment formatting > - Remove Windows specific JLI_Snprintf implementation > - Remove Windows JLI_Snprintf definition Thank you! >If processing string specifier s, S, or Z, format specification processing stops, a NULL is placed at the beginning of the buffer. I hope this is not an MS extension/implementation detail since I did not find this in any other places. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/10625#issuecomment-1650717499 From duke at openjdk.org Wed Jul 26 02:56:50 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 26 Jul 2023 02:56:50 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 16:12:13 GMT, Ioi Lam wrote: >>> I hope to implement a fast path for relocation that avoids using the hash tables at all. If we can get the total alloc + reloc time to be about 1.5ms, then it would be just as fast as before when relocation is enabled. >> >> I've implemented a fast relocation lookup. It currently uses a table of the same size as the archived heap objects, but I can reduce that to 1/2 the size. >> >> See https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc?expand=1 >> >> This is implemented by about 330 lines of code in archiveHeapLoader.cpp. The code is templatized to try out different approaches (like `-XX:+NahlRawAlloc` and `-XX:+NahlUseAccessAPI`), so it can be further simplified. >> >> There's only one thing that's not yet implemented -- the equivalent of `ArchiveHeapLoader::patch_native_pointers()`. I'll do that next. >> >> >> $ java -XX:+NewArchiveHeapLoading -Xmx128m -Xlog:cds+gc --version >> [0.004s][info][cds,gc] Delayed allocation records alloced: 640 >> [0.004s][info][cds,gc] Load Time: 1388458 >> >> >> The whole allocation + reloc takes about 1.4ms. It's about 1.25ms slower in the worst case (when the "old" code doesn't have to relocate -- see the `(**)` in the table below). It's 0.8ms slower when the "old" code has to relocate. 
>> >> >> All times are in ms, for "java --version" >> >> ==================================== >> Dump: java -Xshare:dump -Xmx128m >> >> G1 old new diff >> 128m 14.476 15.754 +1.277 (**) >> 8192m 15.359 16.085 +0.726 >> >> >> Serial old new >> 128m 13.442 14.241 +0.798 >> 8192m 13.740 14.532 +0.791 >> >> ==================================== >> Dump: java -Xshare:dump -Xmx8192m >> >> G1 old new diff >> 128m 14.975 15.787 +0.812 >> 2048m 16.239 17.035 +0.796 >> 8192m 14.821 16.042 +1.221 (**) >> >> >> Serial old new >> 128m 13.444 14.167 +0.723 >> 8192m 13.717 14.502 +0.785 >> >> >> While the code is slower than before, it's a lot simpler. It works on all collectors. I tested on ZGC, but I think Shenandoah should work as well. >> >> The cost is about 1.3 ms per MB of archived heap objects. This may be acceptable as it's a small fraction of JVM bootstrap. We have about 1MB of archived objects now, and we don't expect this size to drastically increase in the near future. >> >> The extra memory cost is: >> >> - a temporary in-memory copy of the archived heap o... > >> @iklam can you please elaborate a bit on relocation optimizations being done by the patch. Without any background on the idea, it is difficult to infer it from the code. > > The algorithm tries to materialize all objects and relocate their oop fields in a single pass. (Each object has a "stream address" (its location in the input stream) and a "materialized address" (its location in the runtime heap)) > > - Materialize one object from the input stream > - Enter the materialized address of this object in the `reloc_table`. Since the input stream is contiguous, we can index `reloc_table` by computing the offset of the `stream` address of this object to the bottom of the input stream. 
> - For each non-null oop pointer in the materialized object: > - If the pointee's stream address is lower than that of the current object, update the pointer with the pointee's materialized address, which is already in `reloc_table` > - Otherwise, enter the location of this pointer into `reloc_table`, as a linked-list of the `Dst` type, at the `reloc_table` entry of the pointee. When the pointee is materialized, it walks its own `Dst` list, and relocates all pointers to itself. > > My branch contains a separate patch for [JDK-8251330](https://bugs.openjdk.org/browse/JDK-8251330) -- the input stream is ordered such that: > - the first 50% of the input stream contains no pointers, so relocation can be skipped altogether > - in the remaining input stream, about 90% of the 43225 pointers are pointing below the current object, so they can be relocated quickly. Less than 640 `Dst` are needed for the "delayed relocation". @iklam I agree this is a much better approach and makes the whole process truly collector agnostic. Great work, especially the optimization to re-order the objects. Given that this has minimal impact on performance, are we good to go ahead with this approach now? One issue I noticed while doing some testing with the Shenandoah collector is probably worth pointing out here: When using `-XX:+NahlRawAlloc` with a very small heap size like -Xmx4m or -Xmx8m the java process freezes. This happens because the allocations for archive objects cause pacing to kick in and the main thread waits on `ShenandoahPacer::_wait_monitor` [0] to be notified by ShenandoahPeriodicPacerNotify [1]. But the WatcherThread which is responsible for executing the `ShenandoahPeriodicPacerNotify` task does not run the periodic tasks until VM init is done [2][3]. So the main thread is stuck now. I guess if we do the wait in `ShenandoahPacer::pace_for_alloc` only after VM init is completed, it would resolve this issue.
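For reference, the single-pass materialize-and-relocate scheme quoted earlier in this message can be sketched roughly as below. This is only a minimal model under stated assumptions, not the code in archiveHeapLoader.cpp: `StreamObject`, `HeapObject`, `RelocEntry` and `materialize_and_relocate` are hypothetical names, objects are identified by stream index rather than by stream address, and a `std::vector` of pending slots stands in for the linked list of `Dst` entries.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical model of one archived object: each field holds the stream
// index of its pointee, or -1 for a null pointer.
struct StreamObject {
  std::vector<int> fields;
};

// Hypothetical materialized object: fields are real pointers.
struct HeapObject {
  std::vector<HeapObject*> fields;
};

// One reloc_table slot per stream object.
struct RelocEntry {
  HeapObject* materialized = nullptr;  // set once the object exists
  std::vector<HeapObject**> pending;   // field slots waiting for it ("Dst" list)
};

std::vector<std::unique_ptr<HeapObject>>
materialize_and_relocate(const std::vector<StreamObject>& stream) {
  std::vector<std::unique_ptr<HeapObject>> heap;
  heap.reserve(stream.size());
  std::vector<RelocEntry> reloc_table(stream.size());

  for (std::size_t i = 0; i < stream.size(); ++i) {
    // Materialize object i and record its runtime address.
    auto obj = std::make_unique<HeapObject>();
    obj->fields.assign(stream[i].fields.size(), nullptr);
    reloc_table[i].materialized = obj.get();

    // Patch every earlier field that pointed forward to this object.
    for (HeapObject** slot : reloc_table[i].pending) {
      *slot = obj.get();
    }
    reloc_table[i].pending.clear();

    for (std::size_t k = 0; k < stream[i].fields.size(); ++k) {
      const int target = stream[i].fields[k];
      if (target < 0) continue;  // null pointer, nothing to relocate
      assert(static_cast<std::size_t>(target) < stream.size());
      RelocEntry& entry = reloc_table[static_cast<std::size_t>(target)];
      if (entry.materialized != nullptr) {
        // Pointee already exists (backward or self reference): fix up now.
        obj->fields[k] = entry.materialized;
      } else {
        // Forward reference: remember the slot until the pointee exists.
        entry.pending.push_back(&obj->fields[k]);
      }
    }
    heap.push_back(std::move(obj));
  }
  return heap;
}
```

With the stream ordered so that most pointers refer to lower stream positions, most fields take the immediate branch and only the forward references pay for a delayed fix-up.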
I haven't noticed this with `-XX:-NahlRawAlloc`, not sure why that should make any difference.

Here are the stack traces:

main thread:

#5 0x00007f5a1fafbafc in PlatformMonitor::wait (this=this@entry=0x7f5a180f6c78, millis=, millis@entry=10) at src/hotspot/os/posix/mutex_posix.hpp:124
#6 0x00007f5a1faa3f9c in Monitor::wait (this=0x7f5a180f6c70, timeout=timeout@entry=10) at src/hotspot/share/runtime/mutex.cpp:254
#7 0x00007f5a1fc2d3bd in ShenandoahPacer::wait (time_ms=10, this=0x7f5a180f6a20) at src/hotspot/share/gc/shenandoah/shenandoahPacer.cpp:286
#8 ShenandoahPacer::pace_for_alloc (this=0x7f5a180f6a20, words=) at src/hotspot/share/gc/shenandoah/shenandoahPacer.cpp:263
#9 0x00007f5a1fbfc7e1 in ShenandoahHeap::allocate_memory (this=0x7f5a180ca590, req=...) at src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp:855
#10 0x00007f5a1fbfcb5c in ShenandoahHeap::mem_allocate (this=, size=, gc_overhead_limit_was_exceeded=) at src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp:931
#11 0x00007f5a1f2402c2 in NewQuickLoader::mem_allocate_raw (size=6) at src/hotspot/share/cds/archiveHeapLoader.cpp:493
#12 NewQuickLoaderImpl::allocate (this=, __the_thread__=, size=: 6, stream=0x7f5a1d228850) at src/hotspot/share/cds/archiveHeapLoader.cpp:712
#13 NewQuickLoaderImpl::load_archive_heap_inner (__the_thread__=0x7f5a180b8810, stream_top=0x7f5a1d28e168, stream=0x7f5a1d228850, stream_bottom=0x7f5a1d204000, this=0x7f5a1edfe9c0) at src/hotspot/share/cds/archiveHeapLoader.cpp:634
#14 NewQuickLoaderImpl::load_archive_heap (this=this@entry=0x7f5a1edfe9c0, __the_thread__=__the_thread__@entry=0x7f5a180b8810) at src/hotspot/share/cds/archiveHeapLoader.cpp:603
#15 0x00007f5a1f22da2a in ArchiveHeapLoader::new_fixup_region (__the_thread__=__the_thread__@entry=0x7f5a180b8810) at src/hotspot/share/cds/archiveHeapLoader.cpp:806
#16 0x00007f5a1f22f00b in ArchiveHeapLoader::fixup_region () at src/hotspot/share/cds/archiveHeapLoader.cpp:90
#17 0x00007f5a1fde5bd5 in vmClasses::resolve_all (__the_thread__=__the_thread__@entry=0x7f5a180b8810) at src/hotspot/share/classfile/vmClasses.cpp:145
#18 0x00007f5a1fd186a9 in SystemDictionary::initialize (__the_thread__=__the_thread__@entry=0x7f5a180b8810) at src/hotspot/share/classfile/systemDictionary.cpp:1616
#19 0x00007f5a1fd8b960 in Universe::genesis (__the_thread__=__the_thread__@entry=0x7f5a180b8810) at src/hotspot/share/memory/universe.cpp:356
#20 0x00007f5a1fd8d093 in universe2_init () at src/hotspot/share/memory/universe.cpp:977
#21 0x00007f5a1f6e10f9 in init_globals2 () at src/hotspot/share/runtime/init.cpp:150
#22 0x00007f5a1fd6c729 in Threads::create_vm (args=, canTryAgain=canTryAgain@entry=0x7f5a1edfedaf) at src/hotspot/share/runtime/threads.cpp:568
#23 0x00007f5a1f7af69e in JNI_CreateJavaVM_inner (args=, penv=0x7f5a1edfee58, vm=0x7f5a1edfee50) at src/hotspot/share/prims/jni.cpp:3577
#24 JNI_CreateJavaVM (vm=0x7f5a1edfee50, penv=0x7f5a1edfee58, args=) at src/hotspot/share/prims/jni.cpp:3668
#25 0x00007f5a206d81ef in InitializeJVM (ifn=, penv=0x7f5a1edfee58, pvm=0x7f5a1edfee50) at src/java.base/share/native/libjli/java.c:1506
#26 JavaMain (_args=) at src/java.base/share/native/libjli/java.c:415
#27 0x00007f5a206dbe29 in ThreadJavaMain (args=) at src/java.base/unix/native/libjli/java_md.c:650
#28 0x00007f5a20582907 in start_thread (arg=) at pthread_create.c:444
#29 0x00007f5a20608870 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

Watcher thread:

#4 ___pthread_cond_timedwait64 (cond=0x7f5a202f0e30 , mutex=, abstime=0x7f5a1d126d20) at pthread_cond_wait.c:643
#5 0x00007f5a1fafbafc in PlatformMonitor::wait (this=this@entry=0x7f5a202f0e08 , millis=, millis@entry=100) at src/hotspot/os/posix/mutex_posix.hpp:124
#6 0x00007f5a1faa3f19 in Monitor::wait_without_safepoint_check (this=this@entry=0x7f5a202f0e00 , timeout=timeout@entry=100) at src/hotspot/share/runtime/mutex.cpp:226
#7 0x00007f5a1fabcded in MonitorLocker::wait (this=, timeout=100) at src/hotspot/share/runtime/mutexLocker.hpp:254
#8 WatcherThread::sleep (this=this@entry=0x7f5a1810a290) at src/hotspot/share/runtime/nonJavaThread.cpp:189
#9 0x00007f5a1fabce61 in WatcherThread::run (this=0x7f5a1810a290) at src/hotspot/share/runtime/nonJavaThread.cpp:249
#10 0x00007f5a1fd5f49f in Thread::call_run (this=0x7f5a1810a290) at src/hotspot/share/runtime/thread.cpp:217
#11 0x00007f5a1faf0a9a in thread_native_entry (thread=0x7f5a1810a290) at src/hotspot/os/linux/os_linux.cpp:779
#12 0x00007f5a20582907 in start_thread (arg=) at pthread_create.c:444
#13 0x00007f5a20608870 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

[0] https://github.com/openjdk/jdk/blob/2d05d3545c8fe4d9e5ad3cee673fc938f84d1901/src/hotspot/share/gc/shenandoah/shenandoahPacer.cpp#L263
[1] https://github.com/openjdk/jdk/blob/2d05d3545c8fe4d9e5ad3cee673fc938f84d1901/src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp#L75-L78
[2] https://github.com/openjdk/jdk/blob/2d05d3545c8fe4d9e5ad3cee673fc938f84d1901/src/hotspot/share/runtime/threads.cpp#L804-L808
[3] https://github.com/openjdk/jdk/blob/2d05d3545c8fe4d9e5ad3cee673fc938f84d1901/src/hotspot/share/runtime/nonJavaThread.cpp#L288

------------- PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1650896237 From dholmes at openjdk.org Wed Jul 26 03:36:48 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 26 Jul 2023 03:36:48 GMT Subject: RFR: 8295017: Remove Windows specific workaround in JLI_Snprintf [v5] In-Reply-To: References: Message-ID: <3hsPveYeVHuicgxUzShA-jeSXZFfkO_amBC8iAdOSt8=.db95b1f4-0b7c-4bd9-8989-3463c7058e0e@github.com> On Tue, 25 Jul 2023 23:51:34 GMT, Sergey Bylokhov wrote: > I hope this is not an MS extension/implementation detail since I did not find this in any other places. @mrserb this change was to a Windows specific file.
------------- PR Comment: https://git.openjdk.org/jdk/pull/10625#issuecomment-1650925566 From fyang at openjdk.org Wed Jul 26 03:45:44 2023 From: fyang at openjdk.org (Fei Yang) Date: Wed, 26 Jul 2023 03:45:44 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v7] In-Reply-To: <_xSyRKiDdm8aS0XzBLaID7PGJwmZWjXYN0ojaG06k-0=.d64f9527-0a34-4abc-b7da-b36e42c9e860@github.com> References: <_xSyRKiDdm8aS0XzBLaID7PGJwmZWjXYN0ojaG06k-0=.d64f9527-0a34-4abc-b7da-b36e42c9e860@github.com> Message-ID: On Tue, 25 Jul 2023 08:07:09 GMT, Vladimir Kempik wrote: >> Please review this fix. It eliminates misaligned loads in String.compare on risc-v >> >> for small compares ( <= 72 bytes), the intrinsic in c2_MacroAssembler.cpp is used, >> it reads ( in case of UU/LL) 8 bytes per loop, and at the end, it reads tail - misaligned load of last 8 bytes from the string. >> >> so if string length is not 8x bytes long then last load is misaligned, also it performs read/compare of some data which already was processed. >> >> I have changed that to compare only last length%8 bytes using SHORT_STRING part of intrinsic for UL/LU. But for UU/LL I have made an optimised version. >> >> Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947) I've got no perf drop on thead ( with +AvoidUnalignedAccesses) which supports misaligned access. >> >> Improvements to inflate_XX() methods give 3-5% improvements for UL/LU cases on thead, almost no perf change on hifive. >> >> for large strings, the intrinsic from stubGenerator.cpp is used >> for UU/LL - generate_compare_long_string_same_encoding, I have just replaced misaligned ld with load_long_misaligned. Since this tail reading is not on hot path, this gives some small penalty for thead when -XX:+AvoidUnalignedAccesses.
>>
>> large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp:
>> These changes are partially based on feilongjiang's patch, but I have changed tail reading to prevent reading past the end of string. I have observed no perf difference between feilongjiang's and my version.
>>
>> This also enables regression test for string.Compare which previously was aarch64-only
>>
>> Testing: tier1 and tier2 clean on hifive.
>>
>> JMH testing, hifive:
>> before:
>>
>> Benchmark                                   (delta)  (size)  Mode  Cnt    Score    Error  Units
>> StringCompareToDifferentLength.compareToLL        2      24  avgt    9    6.474 ±  1.475  ms/op
>> StringCompareToDifferentLength.compareToLL        2      36  avgt    9  125.823 ±  1.947  ms/op
>> StringCompareToDifferentLength.compareToLL        2      72  avgt    9   10.512 ±  0.236  ms/op
>> StringCompareToDifferentLength.compareToLL        2     128  avgt    9   13.032 ±  0.821  ms/op
>> StringCompareToDifferentLength.compareToLL        2     256  avgt    9   18.983 ±  0.318  ms/op
>> StringCompareToDifferentLength.compareToLL        2     512  avgt    9   29.925 ± ...
>
> Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision:
>
> reduce registers usage and nits

Marked as reviewed by fyang (Reviewer).

src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 917:

> 915: }
> 916: addi(cnt2, cnt2, isUL ? 4 : 8);
> 917: bne(tmp1, tmp2, DIFFERENCE);

You might want to place this `bne` before the preceding `addi(cnt2, cnt2, isUL ?
4 : 8);` src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 2396: > 2394: __ bind(LOAD_LAST); // cnt2 = 1..7 characters left > 2395: > 2396: __ addi(cnt2, cnt2, -wordSize); //cnt2 is now an offset in strL which points to last 8 bytes Indentation: // cnt2 is now an offset in strL which points to last 8 bytes ------------- PR Review: https://git.openjdk.org/jdk/pull/14534#pullrequestreview-1546775080 PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1274340284 PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1274340740 From dholmes at openjdk.org Wed Jul 26 03:54:53 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 26 Jul 2023 03:54:53 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v4] In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 15:16:06 GMT, Coleen Phillimore wrote: >> Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. >> Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add some more u1 casts to keep Windows compiler happy hopefully. To me the fact we have so much trouble placating the compilers suggests that we are doing something wrong/broken here! :( Is the `& 3` still needed with the extra u1 cast? I'm gonna guess yes as a u1 doesn't fit in a uchar:2 any better than a uint does. ------------- Marked as reviewed by dholmes (Reviewer). 
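To make the two-bit question concrete, here is a minimal sketch of the encoding being discussed, using the constant-only assignment that is suggested later in this thread (`newval ? 3 : 2`). `TriBoolSketch` is a hypothetical stand-in, not the class in tribool.hpp, and whether every supported compiler stays quiet under -Wconversion with this form is an assumption that has not been checked here.

```cpp
// Hypothetical stand-in for the class in tribool.hpp; only the two-bit
// encoding matters here. Bit 1 means "a value was set", bit 0 is the value.
class TriBoolSketch {
  unsigned int _value : 2;

 public:
  constexpr TriBoolSketch() : _value(0) {}
  // Both results are constants that fit in two bits, so the intent is that
  // no u1 cast and no "& 3" mask is required to store them.
  constexpr TriBoolSketch(bool newval) : _value(newval ? 3u : 2u) {}

  constexpr bool is_default() const { return (_value & 2u) == 0; }
  constexpr explicit operator bool() const { return (_value & 1u) != 0; }
};
```

The point of the sketch is only that the stored value is chosen from {0, 2, 3}, so nothing wider than two bits is ever computed before the assignment.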
PR Review: https://git.openjdk.org/jdk/pull/14892#pullrequestreview-1546780415 From iklam at openjdk.org Wed Jul 26 04:31:53 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 26 Jul 2023 04:31:53 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 02:53:25 GMT, Ashutosh Mehra wrote: >>> @iklam can you please elaborate a bit on relocation optimizations being done by the patch. Without any background on the idea, it is difficult to infer it from the code. >> >> The algorithm tries to materialize all objects and relocate their oop fields in a single pass. (Each object has a "stream address" (its location in the input stream) and a "materialized address" (its location in the runtime heap)) >> >> - Materialize one object from the input stream >> - Enter the materialized address of this object in the `reloc_table`. Since the input stream is contiguous, we can index `reloc_table` by computing the offset of the `stream` address of this object to the bottom of the input stream. >> - For each non-null oop pointer in the materialized object: >> - If the pointee's stream address is lower than that of the current object, update the pointer with the pointee's materialized address, which is already in `reloc_table` >> - Otherwise, enter the location of this pointer into `reloc_table`, as a linked-list of the `Dst` type, at the `reloc_table` entry of the pointee. When the pointee is materialized, it walks its own `Dst` list, and relocate all pointers to itself. >> >> My branch contains a separate patch for [JDK-8251330](https://bugs.openjdk.org/browse/JDK-8251330) -- the input stream is ordered such that: >> - the first 50% of the input stream contains no pointers, so relocation can be skipped altogether >> - in the remaining input stream, about 90% of the 43225 pointers are pointing below the current object, so they can be relocated quickly. 
Less than 640 `Dst` are needed for the "delayed relocation". > > @iklam I agree this is a much better approach and makes the whole process truly collector agnostic. Great work, especially the optimization to re-order the objects. > > Given that this has minimal impact on performance, are we good to go ahead with this approach now? > > One issue I noticed while doing some testing with the Shenandoah collector is probably worth pointing out here: > When using `-XX:+NahlRawAlloc` with a very small heap size like -Xmx4m or -Xmx8m the java process freezes. This happens because the allocations for archive objects cause pacing to kick in and the main thread waits on `ShenandoahPacer::_wait_monitor` [0] to be notified by ShenandoahPeriodicPacerNotify [1]. But the WatcherThread which is responsible for executing the `ShenandoahPeriodicPacerNotify` task does not run the periodic tasks until VM init is done [2][3]. So the main thread is stuck now. > > I guess if we do the wait in `ShenandoahPacer::pace_for_alloc` only after VM init is completed, it would resolve this issue. > > I haven't noticed this with `-XX:-NahlRawAlloc`, not sure why that should make any difference. > > Here are the stack traces: > > main thread: > > #5 0x00007f5a1fafbafc in PlatformMonitor::wait (this=this@entry=0x7f5a180f6c78, millis=, millis@entry=10) at src/hotspot/os/posix/mutex_posix.hpp:124 > #6 0x00007f5a1faa3f9c in Monitor::wait (this=0x7f5a180f6c70, timeout=timeout@entry=10) at src/hotspot/share/runtime/mutex.cpp:254 > #7 0x00007f5a1fc2d3bd in ShenandoahPacer::wait (time_ms=10, this=0x7f5a180f6a20) at src/hotspot/share/gc/shenandoah/shenandoahPacer.cpp:286 > #8 ShenandoahPacer::pace_for_alloc (this=0x7f5a180f6a20, words=) at src/hotspot/share/gc/shenandoah/shenandoahPacer.cpp:263 > #9 0x00007f5a1fbfc7e1 in ShenandoahHeap::allocate_memory (this=0x7f5a180ca590, req=...)
at src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp:855 > #10 0x00007f5a1fbfcb5c in ShenandoahHeap::mem_allocate (this=, size=, gc_overhead_limit_was_exceeded=) at src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp:931 > #11 0x00007f5a1f2402c2 in NewQuickLoader::mem_allocate_raw (size=6) at src/hotspot/share/cds/archiveHeapLoader.cpp:493 > #12 NewQuickLoaderImpl::allocate (this=, __the_thread__=, size=: 6, stream=0x7f5a1d228850) at src/hotspot/share/cds/archiveHeapLoader.cpp:712 > #13 NewQuickLoaderImpl::load_archive_heap_inner (__the_thread__=0x7f5a180b8810, str... Hi @ashu-mehra thanks for testing the patch. I think we all agree that the minor performance impact is acceptable because the code is simpler and more portable. I'll try to clean up my patch and start a PR. BTW, I have implemented a simpler relocation algorithm with similar performance. It uses less memory and hopefully will be easier to understand as well. The algorithm is described in comments inside archiveHeapLoader.cpp https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc.alternative-relocation?expand=1 As a prerequisite, I'll start a PR for [JDK-8251330: Reorder CDS archived heap to speed up relocation](https://bugs.openjdk.org/browse/JDK-8251330) Regarding raw allocation, it doesn't seem to be too much faster, so maybe we should disable it, at least for the initial integration. 
$ (for i in {1..6}; do perf stat -r 100 java -XX:+NewArchiveHeapLoading -XX:-NahlRawAlloc --version > /dev/null; done) 2>&1 | grep elapsed 0.0162332 +- 0.0000914 seconds time elapsed ( +- 0.56% ) 0.0161316 +- 0.0000228 seconds time elapsed ( +- 0.14% ) 0.0161171 +- 0.0000250 seconds time elapsed ( +- 0.16% ) 0.0161311 +- 0.0000231 seconds time elapsed ( +- 0.14% ) 0.0161433 +- 0.0000244 seconds time elapsed ( +- 0.15% ) 0.0161121 +- 0.0000271 seconds time elapsed ( +- 0.17% ) $ (for i in {1..6}; do perf stat -r 100 java -XX:+NewArchiveHeapLoading -XX:+NahlRawAlloc --version > /dev/null; done) 2>&1 | grep elapsed 0.0160640 +- 0.0000973 seconds time elapsed ( +- 0.61% ) 0.0159320 +- 0.0000221 seconds time elapsed ( +- 0.14% ) 0.0159910 +- 0.0000272 seconds time elapsed ( +- 0.17% ) 0.0159406 +- 0.0000230 seconds time elapsed ( +- 0.14% ) 0.0159930 +- 0.0000252 seconds time elapsed ( +- 0.16% ) 0.0159670 +- 0.0000296 seconds time elapsed ( +- 0.19% ) $ (for i in {1..6}; do perf stat -r 100 java -XX:-NewArchiveHeapLoading -XX:+NahlRawAlloc --version > /dev/null; done) 2>&1 | grep elapsed 0.0149069 +- 0.0000932 seconds time elapsed ( +- 0.63% ) 0.0148363 +- 0.0000259 seconds time elapsed ( +- 0.17% ) 0.0148077 +- 0.0000218 seconds time elapsed ( +- 0.15% ) 0.0148377 +- 0.0000212 seconds time elapsed ( +- 0.14% ) 0.0148411 +- 0.0000245 seconds time elapsed ( +- 0.17% ) 0.0148504 +- 0.0000258 seconds time elapsed ( +- 0.17% ) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1650961800 From dlong at openjdk.org Wed Jul 26 04:33:50 2023 From: dlong at openjdk.org (Dean Long) Date: Wed, 26 Jul 2023 04:33:50 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v4] In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 15:16:06 GMT, Coleen Phillimore wrote: >> Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. 
>> Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add some more u1 casts to keep Windows compiler happy hopefully. We don't need the & 3 if we do something like: `_value = newval ? 3 : 2;` ------------- PR Comment: https://git.openjdk.org/jdk/pull/14892#issuecomment-1650963686 From serb at openjdk.org Wed Jul 26 04:34:57 2023 From: serb at openjdk.org (Sergey Bylokhov) Date: Wed, 26 Jul 2023 04:34:57 GMT Subject: RFR: 8295017: Remove Windows specific workaround in JLI_Snprintf [v5] In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 23:51:34 GMT, Sergey Bylokhov wrote: >> Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into patch-1 >> - Merge branch 'openjdk:master' into patch-1 >> - Comment documenting change isn't required >> - Merge branch 'openjdk:master' into patch-1 >> - Comment formatting >> - Remove Windows specific JLI_Snprintf implementation >> - Remove Windows JLI_Snprintf definition > > Thank you! > >>If processing string specifier s, S, or Z, format specification processing stops, a NULL is placed at the beginning of the buffer. > > I hope this is not an MS extension/implementation detail since I did not find this in any other places. >@mrserb this change was to a Windows specific file. That change removed the windows specific version of the JLI_Snprintf, and now we use `#define JLI_Snprintf snprintf` on all platforms. And my question was about that "cross-platform" `snprintf`. As linked in the comment above on Windows it adds the null at the start of the buffer in case of error when a negative value is returned. But is that specified by the c99? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/10625#issuecomment-1650963715 From dlong at openjdk.org Wed Jul 26 04:51:49 2023 From: dlong at openjdk.org (Dean Long) Date: Wed, 26 Jul 2023 04:51:49 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v4] In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 03:51:26 GMT, David Holmes wrote: > Is the `& 3` still needed with the extra u1 cast? I'm gonna guess yes as a u1 doesn't fit in a uchar:2 any better than a uint does. Good question. I got this to work: `_value = ((u1)newval | 2u);` but it requires gcc 11.3. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14892#issuecomment-1650975096 From dholmes at openjdk.org Wed Jul 26 05:35:51 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 26 Jul 2023 05:35:51 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v4] In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 04:31:11 GMT, Dean Long wrote: > We don't need the & 3 if we do something like: > > `_value = newval ? 3 : 2;` That seems a lot better/clearer than casts and bit-wise operations. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14892#issuecomment-1651006722 From dholmes at openjdk.org Wed Jul 26 06:13:56 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 26 Jul 2023 06:13:56 GMT Subject: RFR: 8295017: Remove Windows specific workaround in JLI_Snprintf [v5] In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 04:31:13 GMT, Sergey Bylokhov wrote: >> Thank you! >> >>>If processing string specifier s, S, or Z, format specification processing stops, a NULL is placed at the beginning of the buffer. >> >> I hope this is not an MS extension/implementation detail since I did not find this in any other places. > >>@mrserb this change was to a Windows specific file. > > That change removed the windows specific version of the JLI_Snprintf, and now we use > `#define JLI_Snprintf snprintf` on all platforms. 
And my question was about that "cross-platform" `snprintf`. As linked in the comment above on Windows it adds the null at the start of the buffer in case of error when a negative value is returned. But is that specified by the c99? @mrserb we already had the define for non-Windows, this change just made it unconditional. This change only had an effect on Windows, where we removed the custom `JLI_Snprintf` and started using `snprintf` the same as all other platforms. ------------- PR Comment: https://git.openjdk.org/jdk/pull/10625#issuecomment-1651039136 From dholmes at openjdk.org Wed Jul 26 06:26:08 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 26 Jul 2023 06:26:08 GMT Subject: RFR: 8295017: Remove Windows specific workaround in JLI_Snprintf [v5] In-Reply-To: References: Message-ID: On Thu, 13 Oct 2022 14:48:29 GMT, Julian Waters wrote: >> The C99 snprintf is available with Visual Studio 2015 and above, alongside Windows 10 and the UCRT, and is no longer identical to the outdated Windows _snprintf. Since support for the Visual C++ 2017 compiler was removed a while ago, we can now safely remove the compatibility workaround on Windows and have JLI_Snprintf simply delegate to snprintf. > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-1 > - Merge branch 'openjdk:master' into patch-1 > - Comment documenting change isn't required > - Merge branch 'openjdk:master' into patch-1 > - Comment formatting > - Remove Windows specific JLI_Snprintf implementation > - Remove Windows JLI_Snprintf definition That said, C99 snprintf does not say anything about writing a NUL terminator in the case where nothing at all is written.
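Because the standard is silent on the buffer contents when nothing is written, a caller that needs a valid string after a failed call can set the terminator itself instead of relying on the library. A minimal sketch of that idea follows; `safe_format` is a hypothetical helper made up for illustration and is not part of the JDK or of `JLI_Snprintf`.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>

// Hypothetical helper (an assumption for illustration, not JDK code).
// Formats into buf and guarantees that buf holds a valid, possibly empty,
// C string afterwards. Returns true only if the complete output fit.
bool safe_format(char* buf, std::size_t size, const char* fmt, ...) {
  assert(fmt != nullptr);
  if (buf == nullptr || size == 0) {
    return false;                 // nowhere to write, not even a terminator
  }
  buf[0] = '\0';                  // do not depend on the library on failure

  va_list args;
  va_start(args, fmt);
  int n = std::vsnprintf(buf, size, fmt, args);
  va_end(args);

  if (n < 0) {                    // negative return value signals an error
    buf[0] = '\0';
    return false;
  }
  // n is the length the full output would have had; n >= size means truncation.
  return static_cast<std::size_t>(n) < size;
}
```

Calling `safe_format(b, sizeof b, "%d", 42)` on an 8-byte buffer stores `"42"` and returns true, while formatting a ten-character string into the same buffer returns false and leaves a truncated but terminated string.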
------------- PR Comment: https://git.openjdk.org/jdk/pull/10625#issuecomment-1651052351 From dlong at openjdk.org Wed Jul 26 07:36:54 2023 From: dlong at openjdk.org (Dean Long) Date: Wed, 26 Jul 2023 07:36:54 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v4] In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 15:16:06 GMT, Coleen Phillimore wrote: >> Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. >> Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add some more u1 casts to keep Windows compiler happy hopefully. I like this version: `_value = (newval ? 1 : 0) | 2;` ------------- PR Comment: https://git.openjdk.org/jdk/pull/14892#issuecomment-1651139335 From vkempik at openjdk.org Wed Jul 26 07:41:59 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Wed, 26 Jul 2023 07:41:59 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v7] In-Reply-To: References: <_xSyRKiDdm8aS0XzBLaID7PGJwmZWjXYN0ojaG06k-0=.d64f9527-0a34-4abc-b7da-b36e42c9e860@github.com> Message-ID: On Wed, 26 Jul 2023 03:41:47 GMT, Fei Yang wrote: >> Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: >> >> reduce registers usage and nits > > src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 917: > >> 915: } >> 916: addi(cnt2, cnt2, isUL ? 4 : 8); >> 917: bne(tmp1, tmp2, DIFFERENCE); > > You might want to place this `bne` before the preceding `addi(cnt2, cnt2, isUL ? 4 : 8);` Doubt so. Current code allows cnt2 to be calculated by the time the "bgez(cnt2, TAIL);" executed, if the cpu can execute two opcodes at once. 
So it benefits inorder dual-issue cpus (like hifive u74/unmatched) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1274501152 From vkempik at openjdk.org Wed Jul 26 07:45:56 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Wed, 26 Jul 2023 07:45:56 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v7] In-Reply-To: References: <_xSyRKiDdm8aS0XzBLaID7PGJwmZWjXYN0ojaG06k-0=.d64f9527-0a34-4abc-b7da-b36e42c9e860@github.com> Message-ID: On Wed, 26 Jul 2023 03:42:58 GMT, Fei Yang wrote: >> Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: >> >> reduce registers usage and nits > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 2396: > >> 2394: __ bind(LOAD_LAST); // cnt2 = 1..7 characters left >> 2395: >> 2396: __ addi(cnt2, cnt2, -wordSize); //cnt2 is now an offset in strL which points to last 8 bytes > > Indentation: // cnt2 is now an offset in strL which points to last 8 bytes fixed, now pending positive tier1/tier2 results after all recent changes ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14534#discussion_r1274508298 From jpai at openjdk.org Wed Jul 26 07:53:03 2023 From: jpai at openjdk.org (Jaikiran Pai) Date: Wed, 26 Jul 2023 07:53:03 GMT Subject: RFR: 8311847: Fix -Wconversion for assembler.hpp emit_int8,16 callers [v4] In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 12:17:31 GMT, Coleen Phillimore wrote: >> Please review changes to fix -Wconversion warnings that come from assembler_.cpp by adding narrow_casts to the emit_int8,16,24, and 32 functions. And some other fixups with checked_cast. >> >> Ran tier1 on Oracle platforms, and tier1-4 on linux-x64-debug, linux-aarch64-debug, windows-x64-debug. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Fix indentation, thanks for pointing that out @aph. 
Hello Coleen, I didn't have access to this platform other than Github actions. I see that you fixed this issue in https://bugs.openjdk.org/browse/JDK-8312979, so thank you for that. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14822#issuecomment-1651164444 From stuefe at openjdk.org Wed Jul 26 07:56:59 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 26 Jul 2023 07:56:59 GMT Subject: Withdrawn: JDK-8312018: Improve zero-base-optimized reservation of class space In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 10:17:40 GMT, Thomas Stuefe wrote: > TL;DR This patch introduces a new reservation API to reserve memory in low address space; depending on the OS, it may use optimized placement techniques. That new API is used to optimize the placement of class space and CDS for zero-based encoding. > > A future RFE will use the same API to optimize the zero-based heap reservation and thereby consolidate a lot of coding. We also plan to use this API in other places, e.g. for Shenandoah CollectionSet reservation. > > ------- > > With CDS off or at dump time, we currently attempt to optimize class space location by reserving in low address ranges. > > We do this by examining the *java heap* end (which has been allocated at that point) and, if that had been allocated in lower address regions, attempt to allocate adjacent to it. Essentially, we piggyback on what we hope for is an optimized heap placement. If that fails, we attempt to map at HeapBaseMinAddress. > > This approach has many disadvantages: > > - it depends on the VM either using CompressedOops and getting a zero-based heap or the region around HeapBaseMinAddress being free. > > - HeapBaseMinAddress is an odd choice: it is 2G on all platforms, for reasons unknown to me, but that denies us half of the valuable low address range below 4G right away. > > - We only get 1 shot. It's either one of these two addresses. 
> > - And we only use this strategy for CDS=off or CDS=dumptime; we don't use it for the CDS-runtime-fallback-case when attaching to the primary attach point failed. > > - It assumes narrow Klass encoding uses the same geometry (bit size, shift) as narrow Oops, which is not guaranteed with future developments (lilliput). > > - It actually reduces the chance of getting a zero-based *java heap*. This is because when attempting to place the heap, we leave a gap for what we assume will be the later class space. That gap is `CompressedClassSpaceSize` bytes, which is often grossly over-dimensioned. A zero-based heap is more valuable than a zero-based class space. Therefore the heap should get the best chance of low-address heap reservation. > > - It introduces an unnecessary dependency between heap reservation, narrow Oop encoding, and class space reservation. That makes the code base brittle. > > - Getting the heap region to place class space adjacent to it is actually tricky. We lack a common get-heaprange-API because ZGC. This code misuses the CompressedOops interface. But CompressedOops is the encoding range and thus only loosely correlated to the heap range (the latter must cont... This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/14867 From luhenry at openjdk.org Wed Jul 26 09:02:40 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Wed, 26 Jul 2023 09:02:40 GMT Subject: RFR: 8312569: RISC-V: Missing intrinsics for Math.ceil, floor, rint In-Reply-To: References: Message-ID: On Mon, 24 Jul 2023 08:22:52 GMT, Ilya Gavrilin wrote: > Please review this changes into risc-v double rounding intrinsic. > > On risc-v intrinsics for rounding doubles with mode (like Math.ceil/floor/rint) were missing. On risc-v we don`t have special instruction for such conversion, so two times conversion was used: double -> long int -> double (using fcvt.l.d, fcvt.d.l). 
> > Also, we should provide some rounding mode to fcvt.x.x instruction. > > Rounding mode selection on ceil (similar for floor and rint): according to Math.ceil requirements: > >> Returns the smallest (closest to negative infinity) value that is greater than or equal to the argument and is equal to a mathematical integer (Math.java:475). > > For double -> long int we choose rup (round towards +inf) mode to get the integer that more than or equal to the input value. > For long int -> double we choose rdn (rounds towards -inf) mode to get the smallest (closest to -inf) representation of integer that we got after conversion. > > For cases when we got inf, nan, or value more than 2^63 return input value (double value which more than 2^63 is guaranteed integer). > As well when we store result we copy sign from input value (need for cases when for (-1.0, 0.0) ceil need to return -0.0). > > We have observed significant improvement on hifive and thead boards. > > testing: tier1, tier2 and hotspot:tier3 on hifive > > Performance results on hifive (FpRoundingBenchmark.testceil/floor/rint): > > Without intrinsic: > > Benchmark (TESTSIZE) Mode Cnt Score Error Units > FpRoundingBenchmark.testceil 1024 thrpt 25 39.297 ? 0.037 ops/ms > FpRoundingBenchmark.testfloor 1024 thrpt 25 39.398 ? 0.018 ops/ms > FpRoundingBenchmark.testrint 1024 thrpt 25 36.388 ? 0.844 ops/ms > > With intrinsic: > > Benchmark (TESTSIZE) Mode Cnt Score Error Units > FpRoundingBenchmark.testceil 1024 thrpt 25 80.560 ? 0.053 ops/ms > FpRoundingBenchmark.testfloor 1024 thrpt 25 80.541 ? 0.081 ops/ms > FpRoundingBenchmark.testrint 1024 thrpt 25 80.603 ? 
0.071 ops/ms src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4283: > 4281: // generating constant (tmp2) > 4282: // tmp2 = 100...0000 > 4283: addi(mask, zr, 1); There are other ways to [implement these functions with less instructions](https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1AB9U8lJL6yAngGVG6AMKpaAVxYM9DgDJ4GmADl3ACNMYhAAVgBmUgAHVAVCWwZnNw89eMSbAV9/IJZQ8OiLTCtshiECJmICVPdPLhKy5MrqglzAkLDImIUqmrr0xr62jvzCnoBKC1RXYmR2DgBSACYov2Q3LABqJajHPvxBADoEPewljQBBVfWGTdcdvccWJgIEU/PLm%2BuCAE9YpgsFRtolgP50NtkAhqttgqgXHsAELfb7oWbBejfMyYOjbCDo1yYzDbVSTVEAdhR12220JxO2r2AuyiABFtlQmMEFPiycjUTTtngQRAmSzHPiuHJvN5xc9tqZWUjvMYALJXAIAFWMrIAkgBxUyTcmC2lLKnfWlWoWCABskmMBCF/NNVvp9G2/gA7sZVC6fldrbSmAoWNsAG4uN50EkQVYrKjIcMEY60Y6Q1YRDSkXYrCKNbbEVyxePbEC5lZ7VnEUsQPCTMsVqi1snkqLUwNBkNhyMGGweuMrBNJlPoVO5rM5zMFosloeN%2BNV5vziDe30N8vxmsr%2BttjuWoMAegAVFDcbQIABaDQNqtV7bXnOGSF%2BbbARhhMTbL2YMAcWi0NsYhekwfw8jC4Yku8JIKKwmACgAnKCeDgkBPKzMQNqxK4Ka0seh4HtaqgsuyaCxGBKEMPia6qDmfLtoRuwUqyAqdsQmAEHMVF%2Bgx1zmixfHXO68HXKYVC0AimEEhiHp8nxFqCsJjJMMy96ctyvJ7qxtLCviYryhAUoynK%2BwKjqypqhq2p6oaxjGox5odkGNoEP 
ajrOrxnbWkpNH%2BoxwahhGUb9rG8aJsmqbphO2YTjO6AMKWm5DlW24rPi9YLkOy5pRArb%2Bs53ZBX2Mb4mFI7puOmYxdOObEPFiUVkutY0RuFapel9mef52wnhyEkkFeN4kfej5AQwL5Ue%2B/jEF%2BP5/gBQG0CBYHbBBUEIDBcGIchqEhtsGFYThxx4QRrq0sR95kRRqGrpgPq0aSWmuvx2mFhxXGkn5zGsWiMkiVcpjELM434kpck3ApnZKfpbIclyPK5U9AY6SKMMSoZ0qys88qKhZ6pajqBpGiaXlMU5QZ%2BK5DpOng%2BVBj5d2%2Bn5Z1oT2wUlYOw4RWmGZ5tVeYziwLANYubLtXWrVhS2nXk9ahW9tGA5lRFY6AVVU4C7VQsi8lbLZdRjNkplKzi7uzOk7SvVA6442DbebIjTFz42m%2BH4zYBc3/oBwGgeBTCQds0GgltglIWCVF7QdfjYbh2z4d1F2w1d4cG/ddFI9aL2CWx73ENxX0CVcHDTLQnARLwngcFopCoJwEoKBhCy5lEPCkAQmjF9MADWkTZqXHCSLwLASBo2aV9XtccLwCggNm7dV8XpBwLASBoCwsQxmQFAQGvG/0OExDhval7IJshjAKQWDhngCwAGp4HdADygKV63NC0AQYQzxAwQd6QwR%2BGqH8TgrcAHMGIH8R%2BwRtCYGsCA3ga82CCEfgwWgwCF6X0wMEVwwBHBiFoDPbgvAsCvCMOIDB%2BB2LWDwJBQh1dMCqFgThRYrdKalD/rQPAwQZoQOcFgP%2BBBiB4GHkQ6Y4kVIKHvk/F%2B8CZCCBEGIdgUg5HyCUGoP%2BuhGgGCMCAUwxhzCcO5PAaYqBYjlEIZeR%2BUReCoEgsQIRWAZ6QGmJYWB5R7DjUGJ4KI2YfB%2BE6AUboXAVhxASEkAQXiMhhPKGMLoRQmhuJaP0WoLh6ggB8Qk6hAhWg1FiYE8IwSLDJMicMZJeSJg RBcY3JRJcy4Vz/pPbYAAlXUQhHCXlvoWI%2BkhgDIChNo5kEBBE2y7g2CAuBCAkGblwSYvB55aGNKQDaTAsDhAgN3Xu%2BhOCD1IOPGxnBp6zzbh3RZ/cVj1IwZPOZJzph2MSHYSQQA). Also, given you're not checking for NaN (and neither is it done for other architectures), I assume that this is done before this is called? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14991#discussion_r1274612539 From vkempik at openjdk.org Wed Jul 26 09:52:17 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Wed, 26 Jul 2023 09:52:17 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v8] In-Reply-To: References: Message-ID: <1kKdchjPCDUAUEbHwMTpkXajFjlEl0nGWn4YTLJscv8=.d49b5e9a-92e8-4017-a00c-b0ac9a040c71@github.com> > Please review this fix. 
it eliminates misaligned loads in String.compare on risc-v > > for small compares ( <= 72 bytes), the intrinsic in c2_MacroAssembler.cpp is used, > it reads ( in case of UU/LL) 8 bytes per loop, and at the end, it reads tail - misaligned load of last 8 bytes from the string. > > so if string length is not 8x bytes long then last load is misaligned, also it performs read/compare of some data which already was processed. > > I have changed that to compare only last length%8 bytes using SHORT_STRING part of intrinsic for UL/LU. But for UU/LL I have made an optimised version. > > Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947) I've got no perf drop on thead ( with +AvoidUnalignedAccesses) which supports misaligned access. > > Improvements to inflate_XX() methods give 3-5% improvements for UL/LU cases on thead, almost no perf change on hifive. > > for large strings, the intrinsics from stubGenerator.cpp are used > for UU/LL - generate_compare_long_string_same_encoding, I have just replaced misaligned ld with load_long_misaligned. Since this tail reading is not on hot path, this gives some small penalty for thead when -XX:+AvoidUnalignedAccesses. > > large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp: > These changes are partially based on feilongjiang's patch, but I have changed tail reading to prevent reading past the end of string. I have observed no perf difference between feilongjiang's and my version. > > This also enables regression test for string.Compare which previously was aarch64-only > > Testing: tier1 and tier2 clean on hifive. > > JMH testing, hifive: > before: > > Benchmark (delta) (size) Mode Cnt Score Error Units > StringCompareToDifferentLength.compareToLL 2 24 avgt 9 6.474 ± 1.475 ms/op > StringCompareToDifferentLength.compareToLL 2 36 avgt 9 125.823 ±
1.947 ms/op > StringCompareToDifferentLength.compareToLL 2 72 avgt 9 10.512 ± 0.236 ms/op > StringCompareToDifferentLength.compareToLL 2 128 avgt 9 13.032 ± 0.821 ms/op > StringCompareToDifferentLength.compareToLL 2 256 avgt 9 18.983 ± 0.318 ms/op > StringCompareToDifferentLength.compareToLL 2 512 avgt 9 29.925 ± 0.458 ms/op > StringCompareToDifferentLength.compareToLL 2 ... Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision: more nits ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14534/files - new: https://git.openjdk.org/jdk/pull/14534/files/f6e066e5..463847cf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14534&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14534&range=06-07 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14534.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14534/head:pull/14534 PR: https://git.openjdk.org/jdk/pull/14534 From duke at openjdk.org Wed Jul 26 09:58:52 2023 From: duke at openjdk.org (Ilya Gavrilin) Date: Wed, 26 Jul 2023 09:58:52 GMT Subject: RFR: 8312569: RISC-V: Missing intrinsics for Math.ceil, floor, rint In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 08:59:47 GMT, Ludovic Henry wrote: >> Please review this changes into risc-v double rounding intrinsic. >> >> On risc-v intrinsics for rounding doubles with mode (like Math.ceil/floor/rint) were missing. On risc-v we don`t have special instruction for such conversion, so two times conversion was used: double -> long int -> double (using fcvt.l.d, fcvt.d.l). >> >> Also, we should provide some rounding mode to fcvt.x.x instruction. >> >> Rounding mode selection on ceil (similar for floor and rint): according to Math.ceil requirements: >> >>> Returns the smallest (closest to negative infinity) value that is greater than or equal to the argument and is equal to a mathematical integer (Math.java:475).
>> >> For double -> long int we choose rup (round towards +inf) mode to get the integer that more than or equal to the input value. >> For long int -> double we choose rdn (rounds towards -inf) mode to get the smallest (closest to -inf) representation of integer that we got after conversion. >> >> For cases when we got inf, nan, or value more than 2^63 return input value (double value which more than 2^63 is guaranteed integer). >> As well when we store result we copy sign from input value (need for cases when for (-1.0, 0.0) ceil need to return -0.0). >> >> We have observed significant improvement on hifive and thead boards. >> >> testing: tier1, tier2 and hotspot:tier3 on hifive >> >> Performance results on hifive (FpRoundingBenchmark.testceil/floor/rint): >> >> Without intrinsic: >> >> Benchmark (TESTSIZE) Mode Cnt Score Error Units >> FpRoundingBenchmark.testceil 1024 thrpt 25 39.297 ± 0.037 ops/ms >> FpRoundingBenchmark.testfloor 1024 thrpt 25 39.398 ± 0.018 ops/ms >> FpRoundingBenchmark.testrint 1024 thrpt 25 36.388 ± 0.844 ops/ms >> >> With intrinsic: >> >> Benchmark (TESTSIZE) Mode Cnt Score Error Units >> FpRoundingBenchmark.testceil 1024 thrpt 25 80.560 ± 0.053 ops/ms >> FpRoundingBenchmark.testfloor 1024 thrpt 25 80.541 ± 0.081 ops/ms >> FpRoundingBenchmark.testrint 1024 thrpt 25 80.603 ±
0.071 ops/ms > > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4283: > >> 4281: // generating constant (tmp2) >> 4282: // tmp2 = 100...0000 >> 4283: addi(mask, zr, 1); > > There are other ways to [implement these functions with less instructions](https://godbolt.org/#z:OYLghAFBqd5QCxAYwPYBMCmBRdBLAF1QCcAaPECAMzwBtMA7AQwFtMQByARg9KtQYEAysib0QXACx8BBAKoBnTAAUAHpwAMvAFYTStJg1AB9U8lJL6yAngGVG6AMKpaAVxYM9DgDJ4GmADl3ACNMYhAAVgBmUgAHVAVCWwZnNw89eMSbAV9/IJZQ8OiLTCtshiECJmICVPdPLhKy5MrqglzAkLDImIUqmrr0xr62jvzCnoBKC1RXYmR2DgBSACYov2Q3LABqJajHPvxBADoEPewljQBBVfWGTdcdvccWJgIEU/PLm%2BuCAE9YpgsFRtolgP50NtkAhqttgqgXHsAELfb7oWbBejfMyYOjbCDo1yYzDbVSTVEAdhR12220JxO2r2AuyiABFtlQmMEFPiycjUTTtngQRAmSzHPiuHJvN5xc9tqZWUjvMYALJXAIAFWMrIAkgBxUyTcmC2lLKnfWlWoWCABskmMBCF/NNVvp9G2/gA7sZVC6fldrbSmAoWNsAG4uN50EkQVYrKjIcMEY60Y6Q1YRDSkXYrCKNbbEVyxePbEC5lZ7VnEUsQPCTMsVqi1snkqLUwNBkNhyMGGweuMrBNJlPoVO5rM5zMFosloeN%2BNV5vziDe30N8vxmsr%2BttjuWoMAegAVFDcbQIABaDQNqtV7bXnOGSF%2BbbARhhMTbL2YMAcWi0NsYhekwfw8jC4Yku8JIKKwmACgAnKCeDgkBPKzMQNqxK4Ka0seh4HtaqgsuyaCxGBKEMPia6qDmfLtoRuwUqyAqdsQmAEHMVF%2Bgx1zmixfHXO68HXKYVC0AimEEhiHp8nxFqCsJjJMMy96ctyvJ7qxtLCviYryhAUoynK%2BwKjqypqhq2p6oaxjGox5odkGNo 
EPajrOrxnbWkpNH%2BoxwahhGUb9rG8aJsmqbphO2YTjO6AMKWm5DlW24rPi9YLkOy5pRArb%2Bs53ZBX2Mb4mFI7puOmYxdOObEPFiUVkutY0RuFapel9mef52wnhyEkkFeN4kfej5AQwL5Ue%2B/jEF%2BP5/gBQG0CBYHbBBUEIDBcGIchqEhtsGFYThxx4QRrq0sR95kRRqGrpgPq0aSWmuvx2mFhxXGkn5zGsWiMkiVcpjELM434kpck3ApnZKfpbIclyPK5U9AY6SKMMSoZ0qys88qKhZ6pajqBpGiaXlMU5QZ%2BK5DpOng%2BVBj5d2%2Bn5Z1oT2wUlYOw4RWmGZ5tVeYziwLANYubLtXWrVhS2nXk9ahW9tGA5lRFY6AVVU4C7VQsi8lbLZdRjNkplKzi7uzOk7SvVA6442DbebIjTFz42m%2BH4zYBc3/oBwGgeBTCQds0GgltglIWCVF7QdfjYbh2z4d1F2w1d4cG/ddFI9aL2CWx73ENxX0CVcHDTLQnARLwngcFopCoJwEoKBhCy5lEPCkAQmjF9MADWkTZqXHCSLwLASBo2aV9XtccLwCggNm7dV8XpBwLASBoCwsQxmQFAQGvG/0OExDhval7IJshjAKQWDhngCwAGp4HdADygKV63NC0AQYQzxAwQd6QwR%2BGqH8TgrcAHMGIH8R%2BwRtCYGsCA3ga82CCEfgwWgwCF6X0wMEVwwBHBiFoDPbgvAsCvCMOIDB%2BB2LWDwJBQh1dMCqFgThRYrdKalD/rQPAwQZoQOcFgP%2BBBiB4GHkQ6Y4kVIKHvk/F%2B8CZCCBEGIdgUg5HyCUGoP%2BuhGgGCMCAUwxhzCcO5PAaYqBYjlEIZeR%2BUReCoEgsQIRWAZ6QGmJYWB5R7DjUGJ4KI2YfB%2BE6AUboXAVhxASEkAQXiMhhPKGMLoRQmhuJaP0WoLh6ggB8Qk6hAhWg1FiYE8IwSLDJMicMZJeS JgRBcY3JRJcy4Vz/pPbYAAlXUQhHCXlvoWI%2BkhgDIChNo5kEBBE2y7g2CAuBCAkGblwSYvB55aGNKQDaTAsDhAgN3Xu%2BhOCD1IOPGxnBp6zzbh3RZ/cVj1IwZPOZJzph2MSHYSQQA). Also, given you're not checking for NaN (and neither is it done for other architectures), I assume that this is done before this is called? Hi, thanks for your review. - About NaN, INF and other special values: According to RISC-V ISA paragraph 8.7 for fcvt.l instruction (Table 8.4) we have some special return values: 1) if we exceed minimum input value, or got -INF it returns (-2^63) 2) if we exceed maximum input value, or got +INF or NaN it returns (2^63-1) (Also, if we exceed maximum/minimum input values on input we have already integer value, because according to IEEE754 double-precision f.p. 
format all doubles more +/- 2^52 are already integer values) So we need to check if we got -2^63 or 2^63-1 after double->long int conversion, comment lines from [src/hotspot/cpu/riscv/macroAssembler_riscv.cpp:4288](https://github.com/openjdk/jdk/pull/14991/files#diff-7a5c3ed05b6f3f06ed1c59f5fc2a14ec566a6a5bd1d09606115767daa99115bdR4288-R4291) describes how do we change result (converted_dbl -> converted_dbl_masked) and constant (mask) to check this values. I have tried to check for NaN and +-inf with just one conditional branch. If we got -2^63 or 2^63-1 we return input value, therefore NaN -> NaN; +/- INF -> +/- INF; double that is already integer stays same as required by the ceil/floor/rint function descriptions. - About case when we can use less instructions: Of course, we can use a bit less instructions, but the main goal during intrinsic writing was minimizing count of expensive instructions (so instead of flt.d was used integer instructions on converted_dbl etc.) We already have some cases when less expensive instructions were chosen instead of reducing their number. For example: https://bugs.openjdk.org/browse/JDK-8297359 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14991#discussion_r1274701998 From rschmelter at openjdk.org Wed Jul 26 14:44:42 2023 From: rschmelter at openjdk.org (Ralf Schmelter) Date: Wed, 26 Jul 2023 14:44:42 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen In-Reply-To: References: Message-ID: On Wed, 12 Jul 2023 08:05:59 GMT, Richard Reingruber wrote: > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. 
> > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongues young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc theads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. 
I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The prize for parallelization is paid without actually doing it. Also ParallelGC will use at lea... src/hotspot/share/gc/parallel/psCardTable.cpp line 297: > 295: if (obj_sz >= large_obj_arr_min_words() && cast_to_oop(obj_addr)->is_objArray() && > 296: obj_addr >= cur_stripe_addr) { > 297: // the last condition is not redundant as we can reach here if an obj starts at space_top I think the comment here must be expanded. As far as I understand you, one can generally assume that the array found here must start in the stripe because either: - the array actually starts in the stripe. Since it is larger than a stripe it must be the rightmost object. - the array covers the whole stripe and doesn't start in the stripe. In this case we would never land here since the check !start_array->object_starts_in_range(cur_stripe_addr, cur_stripe_end_addr) in line 258 filters this out already (no object would start in the stripe) - the array ends in the stripe, but not on the stripe end. In this case the rightmost object is not the array itself.. But when the object start array overlaps with the space_top, the start_array->object_starts_in_range(cur_stripe_addr, cur_stripe_end_addr) could return true, if there is an object at space_top. In this case the array would not start in the stripe and the check ensures that. 
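The three cases listed above can be made concrete with a small model. This is a toy sketch under stated assumptions, not HotSpot code: addresses are plain word indices, a stripe is the half-open range `[stripe_start, stripe_end)`, and the names `ArrayPosition`, `classify` and `starts_in_stripe` are invented for illustration. It only shows how an array can relate to a stripe and why an explicit "starts in this stripe" test is a separate condition.

```cpp
#include <cassert>
#include <cstddef>

// Toy model (hypothetical names, not taken from psCardTable.cpp).
enum class ArrayPosition {
  StartsInStripe,  // array begins inside the stripe
  CoversStripe,    // array begins before the stripe and reaches its end or beyond
  EndsInStripe,    // array begins before the stripe and ends inside it
  Outside          // array and stripe do not overlap
};

ArrayPosition classify(std::size_t obj_start, std::size_t obj_words,
                       std::size_t stripe_start, std::size_t stripe_end) {
  assert(stripe_start < stripe_end && obj_words > 0);
  const std::size_t obj_end = obj_start + obj_words;  // exclusive end

  if (obj_start >= stripe_end || obj_end <= stripe_start) {
    return ArrayPosition::Outside;
  }
  if (obj_start >= stripe_start) {
    return ArrayPosition::StartsInStripe;
  }
  if (obj_end >= stripe_end) {
    return ArrayPosition::CoversStripe;
  }
  return ArrayPosition::EndsInStripe;
}

// Plays the role that the `obj_addr >= cur_stripe_addr` condition plays in the
// reviewed hunk: only an array that begins in the stripe passes.
bool starts_in_stripe(std::size_t obj_start, std::size_t obj_words,
                      std::size_t stripe_start, std::size_t stripe_end) {
  return classify(obj_start, obj_words, stripe_start, stripe_end) ==
         ArrayPosition::StartsInStripe;
}
```

With a stripe of `[100, 200)`, an array at word 150 of length 500 starts in the stripe, one at word 50 of length 500 covers it, and one at word 50 of length 100 ends in it.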
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1275082104 From rrich at openjdk.org Wed Jul 26 16:19:50 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 26 Jul 2023 16:19:50 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen In-Reply-To: References: Message-ID: <-0Wglb5yDJhAZ0I0uNab-PylFb5hLzYodYmnQjlmuTE=.f2aa059a-40ef-4968-84a6-569c61a4c55e@github.com> On Wed, 26 Jul 2023 14:42:16 GMT, Ralf Schmelter wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. >> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. 
In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongues young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc theads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The prize for parallelization is paid ... > > src/hotspot/share/gc/parallel/psCardTable.cpp line 297: > >> 295: if (obj_sz >= large_obj_arr_min_words() && cast_to_oop(obj_addr)->is_objArray() && >> 296: obj_addr >= cur_stripe_addr) { >> 297: // the last condition is not redundant as we can reach here if an obj starts at space_top > > I think the comment here must be expanded. As far as I understand you, one can generally assume that the array found here must start in the stripe because either: > - the array actually starts in the stripe. Since it is larger than a stripe it must be the rightmost object. > - the array covers the whole stripe and doesn't start in the stripe. In this case we would never land here since the check !start_array->object_starts_in_range(cur_stripe_addr, cur_stripe_end_addr) in line 258 filters this out already (no object would start in the stripe) > - the array ends in the stripe, but not on the stripe end. 
In this case the rightmost object is not the array itself.. > > But when the object start array overlaps with the space_top, the start_array->object_starts_in_range(cur_stripe_addr, cur_stripe_end_addr) could return true, if there is an object at space_top. In this case the array would not start in the stripe and the check ensures that. Thanks for looking at the PR! I agree that it's not easy to see what's intended. And it's not that easy to explain either. It might be possible to remove the condition and the comment if we check `first_obj_addr >= cur_stripe_end_addr` in L279 and just continue if true. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1275206185 From iklam at openjdk.org Wed Jul 26 17:22:41 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 26 Jul 2023 17:22:41 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() [v2] In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 18:48:58 GMT, Matias Saavedra Silva wrote: >> Currently we exit the VM after static dumping with `MetaspaceShared::exit_after_static_dump()`. >> >> >> // We have finished dumping the static archive. At this point, there may be pending VM >> // operations. We have changed some global states (such as vmClasses::_klasses) that >> // may cause these VM operations to fail. For safety, forget these operations and >> // exit the VM directly. >> void MetaspaceShared::exit_after_static_dump() { >> os::_exit(0); >> } >> >> >> As the comment suggests, the VM state is altered when preparing and performing the static dump, so this change aims to prevent these state changes so the VM can exit normally after the static dump completes. There are three major aspects to this change: >> 1. Since the resolved references array in the Constant Pool is altered when preparing for a static dump, a "scratch copy" is created and archived instead >> 2. Symbols are sorted by address and have their hash recalculated. 
Similarly to point 1, the copies of the symbols that are to be archived have their hashes updated as opposed to the originals. >> 3. The handling of -Xshare:dump during argument parsing such that the VM can continue and exit normally with an exit code of 0. >> >> Verified with tier 1-9 tests. > > Matias Saavedra Silva has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Merge fix > - Restores java loaders > - Ioi and David comments > - Windows fix > - 8306582: Remove MetaspaceShared::exit_after_static_dump() Looks good overall. Some suggestions for code refactoring. src/hotspot/share/cds/heapShared.cpp line 377: > 375: return *(_scratch_references_table->get(src)); > 376: } > 377: The logic of these two new functions is covered by the KlassToOopHandleTable class, so I would suggest refactoring KlassToOopHandleTable to class MetaspaceObjToOopHandleTable: public ResourceHashtableremove_oop(k); if (k->is_instance_klass()) { _scratch_resolved_references_table->remove(InstanceKlass::cast(k)->constants()); } } src/hotspot/share/oops/constantPool.cpp line 303: > 301: for (int i = 0; i < rr_len; i++) { > 302: oop obj = rr->obj_at(i); > 303: HeapShared::add_scratch_resolved_reference(orig_pool, i, nullptr); I think the logic can be more obvious if the management of the contents of the `scratch_rr` array is moved here. Something like: if (rr != nullptr) { scratch_rr = HeapShared::scratch_resolved_references(orig_pool); .... scratch_rr->obj_at_put(i, nullptr); .... return scratch_rr; } That way, you won't have two functions with similar sounding names `add_scratch_resolved_reference` and `add_scratch_resolved_references`. HeapShared is now solely responsible for keeping track of the scratch arrays, but ConstantPool will be responsible for the contents of the scratch arrays. Also, it's not necessary to put the `scratch_rr` in an `OopHandle` as we won't safepoint in this loop.
------------- Changes requested by iklam (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14879#pullrequestreview-1548221065 PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1275271843 PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1275259540 From vladimir.kozlov at oracle.com Wed Jul 26 17:23:58 2023 From: vladimir.kozlov at oracle.com (Vladimir Kozlov) Date: Wed, 26 Jul 2023 10:23:58 -0700 Subject: Result: New HotSpot Group Member: Fei Yang Message-ID: <6067fccf-9b90-7ea5-0dc3-39c7f617f1b8@oracle.com> The vote for Fei Yang [1] is now closed. Yes: 6 Veto: 0 Abstain: 0 According to the Bylaws definition of Lazy Consensus, this is sufficient to approve the nomination. Regards, Vladimir K [1] https://mail.openjdk.org/pipermail/hotspot-dev/2023-July/076691.html From alanb at openjdk.org Wed Jul 26 18:55:56 2023 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 26 Jul 2023 18:55:56 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() [v2] In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 18:48:58 GMT, Matias Saavedra Silva wrote: >> Currently we exit the VM after static dumping with `MetaspaceShared::exit_after_static_dump()`. >> >> >> // We have finished dumping the static archive. At this point, there may be pending VM >> // operations. We have changed some global states (such as vmClasses::_klasses) that >> // may cause these VM operations to fail. For safety, forget these operations and >> // exit the VM directly. >> void MetaspaceShared::exit_after_static_dump() { >> os::_exit(0); >> } >> >> >> As the comment suggests, the VM state is altered when preparing and performing the static dump, so this change aims to prevent these state changes so the VM can exit normally after the static dump completes. There are three major aspects to this change: >> 1. 
Since the resolved references array in the Constant Pool is altered when preparing for a static dump, a "scratch copy" is created and archived instead >> 2. Symbols are sorted by address and have their hash recalculated. Similarly to point 1, the copies of the symbols that are to be archived have their hashes updated as opposed to the originals. >> 3. The handling of -Xshare:dump during argument parsing such that the VM can continue and exit normally with an exit code of 0. >> >> Verified with tier 1-9 tests. > > Matias Saavedra Silva has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge fix > - Restores java loaders > - Ioi and David comments > - Windows fix > - 8306582: Remove MetaspaceShared::exit_after_static_dump() src/java.base/share/native/libjli/java.c line 464: > 462: if (dumpSharedSpaces) { > 463: CHECK_EXCEPTION_LEAVE(0); > 464: LEAVE(); What is exit status ($?) when -Xshare:dump fails. It looks like any pending exception will be printed and it will exit with 0 but maybe I've missed something. In passing, the java launcher uses 4-space indent rather than 2. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1275366634 From shade at openjdk.org Wed Jul 26 20:43:52 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 26 Jul 2023 20:43:52 GMT Subject: RFR: 8313202: MutexLocker should disallow null Mutexes Message-ID: As seen in [JDK-8313081](https://bugs.openjdk.org/browse/JDK-8313081), it is fairly easy to pass nullptr `Mutex` to `MutexLocker` by accident, which would just silently avoid the lock. There are a few places in Hotspot where we pass `nullptr` to simulate re-entrancy and/or conditionally take the lock. Those places can be more explicit, and the default `MutexLocker` can disallow nullptrs for extra safety. 
Open for some bikeshedding on the names of the new `MutexLockers`. Particularly `ReentrantMutexLocker` might lull readers into believing it does safepoint checks on re-entrant "lock", which it actually does not do. More thorough testing with different GC/JIT combinations is running now, we might find more issues there. Meanwhile, please comment on the approach. Additional testing: - [x] `grep -R "MutexLocker " src/hotspot | grep -i null`, no hits - [x] `grep -R "MutexLocker " src/hotspot | grep -i ?`, no hits - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` (re-run in progress) ------------- Commit messages: - Workaround for JDK-8313210 - Fixing CodeCache analytics - Initial work Changes: https://git.openjdk.org/jdk/pull/15043/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15043&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8313202 Stats: 104 lines in 15 files changed: 44 ins; 13 del; 47 mod Patch: https://git.openjdk.org/jdk/pull/15043.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15043/head:pull/15043 PR: https://git.openjdk.org/jdk/pull/15043 From rrich at openjdk.org Wed Jul 26 21:13:57 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 26 Jul 2023 21:13:57 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v2] In-Reply-To: References: Message-ID: > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. 
> > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. > > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea...
Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Make sure to skip stripes where no object starts ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/53c7b662..0eb924e1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=00-01 Stats: 15 lines in 1 file changed: 10 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Wed Jul 26 21:24:53 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Wed, 26 Jul 2023 21:24:53 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v2] In-Reply-To: <-0Wglb5yDJhAZ0I0uNab-PylFb5hLzYodYmnQjlmuTE=.f2aa059a-40ef-4968-84a6-569c61a4c55e@github.com> References: <-0Wglb5yDJhAZ0I0uNab-PylFb5hLzYodYmnQjlmuTE=.f2aa059a-40ef-4968-84a6-569c61a4c55e@github.com> Message-ID: On Wed, 26 Jul 2023 16:16:35 GMT, Richard Reingruber wrote: >> src/hotspot/share/gc/parallel/psCardTable.cpp line 297: >> >>> 295: if (obj_sz >= large_obj_arr_min_words() && cast_to_oop(obj_addr)->is_objArray() && >>> 296: obj_addr >= cur_stripe_addr) { >>> 297: // the last condition is not redundant as we can reach here if an obj starts at space_top >> >> I think the comment here must be expanded. As far as I understand you, one can generally assume that the array found here must start in the stripe because either: >> - the array actually starts in the stripe. Since it is larger than a stripe it must be the rightmost object. >> - the array covers the whole stripe and doesn't start in the stripe. 
In this case we would never land here since the check !start_array->object_starts_in_range(cur_stripe_addr, cur_stripe_end_addr) in line 258 filters this out already (no object would start in the stripe) >> - the array ends in the stripe, but not on the stripe end. In this case the rightmost object is not the array itself.. >> >> But when the object start array overlaps with the space_top, the start_array->object_starts_in_range(cur_stripe_addr, cur_stripe_end_addr) could return true, if there is an object at space_top. In this case the array would not start in the stripe and the check ensures that. > > Thanks for looking at the PR! > I agree that it's not easy to see what's intended. And it's not that easy to explain either. > It might be possible to remove the condition and the comment if we check `first_obj_addr >= cur_stripe_end_addr` in L279 and just continue if true. The guiding principle of `PSCardTable::scavenge_contents_parallel` is to scan only objects that start in the current stripe. There is a special case for the last stripe ending at `space_top` if `space_top` is not card-aligned and an object starts there. With https://github.com/openjdk/jdk/pull/14846/commits/0eb924e1d4dfa1f751a55038063af4f6092be8c2 this is handled explicitly and more clearly to make sure stripes without object starts are skipped right away. This is important for correct handling of large object arrays. IMHO also helpful aside from large arrays. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1275495494 From matsaave at openjdk.org Wed Jul 26 21:54:42 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 26 Jul 2023 21:54:42 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() [v2] In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 18:52:41 GMT, Alan Bateman wrote: >> Matias Saavedra Silva has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains five commits: >> >> - Merge fix >> - Restores java loaders >> - Ioi and David comments >> - Windows fix >> - 8306582: Remove MetaspaceShared::exit_after_static_dump() > > src/java.base/share/native/libjli/java.c line 464: > >> 462: if (dumpSharedSpaces) { >> 463: CHECK_EXCEPTION_LEAVE(0); >> 464: LEAVE(); > > What is exit status ($?) when -Xshare:dump fails. It looks like any pending exception will be printed and it will exit with 0 but maybe I've missed something. > > In passing, the java launcher uses 4-space indent rather than 2. I am not incredibly familiar with the java launcher but I believe what I have is a mistake. I will change it to `CHECK_EXCEPTION_LEAVE(1)` instead. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1275516774 From dholmes at openjdk.org Wed Jul 26 22:31:40 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 26 Jul 2023 22:31:40 GMT Subject: RFR: 8313202: MutexLocker should disallow null Mutexes In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 17:06:02 GMT, Aleksey Shipilev wrote: > As seen in [JDK-8313081](https://bugs.openjdk.org/browse/JDK-8313081), it is fairly easy to pass nullptr `Mutex` to `MutexLocker` by accident, which would just silently avoid the lock. > > There are a few places in Hotspot where we pass `nullptr` to simulate re-entrancy and/or conditionally take the lock. Those places can be more explicit, and the default `MutexLocker` can disallow nullptrs for extra safety. > > Open for some bikeshedding on the names of the new `MutexLockers`. Particularly `ReentrantMutexLocker` might lull readers into believing it does safepoint checks on re-entrant "lock", which it actually does not do. > > More thorough testing with different GC/JIT combinations is running now, we might find more issues there. Meanwhile, please comment on the approach. 
> > Additional testing: > - [x] `grep -R "MutexLocker " src/hotspot | grep -i null`, no hits > - [x] `grep -R "MutexLocker " src/hotspot | grep -i ?`, no hits > - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` (re-run in progress) Sorry the names are awful and confusing. If you want to special-case this then make an argument to re-instate MutexLockerEx. I don't think JDK-8313081 is sufficient justification for a change of this nature/scope. ------------- Changes requested by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15043#pullrequestreview-1548711153 From amitkumar at openjdk.org Thu Jul 27 04:47:52 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Thu, 27 Jul 2023 04:47:52 GMT Subject: RFR: 8310596: Utilize existing method frame::interpreter_frame_monitor_size_in_bytes() In-Reply-To: References: Message-ID: <8ThSz82Lkr9uFeCODyCAWaFHrnib66QKlcJSJ4XTbb8=.b79fe46f-370a-453f-9d10-86b11c16fb4e@github.com> On Mon, 17 Jul 2023 11:21:47 GMT, Amit Kumar wrote: > [small cleanup/enhancement] Adds & updates some code to use `interpreter_frame_monitor_size_in_bytes()` method. I wasn't sure whether we should remove it from PPC & S390 implementation or add it to other archs. For now I have added it but will be happy to revert as PR reviews. > > For s390 `interpreter_frame_interpreterstate_size_in_bytes()` was also cleaned up as it is not being used. > > fastdebug build tested on **s390** & **apple m1-pro**. Hi community, Can I get review on this. 
Thanks ------------- PR Comment: https://git.openjdk.org/jdk/pull/14902#issuecomment-1652905367 From duke at openjdk.org Thu Jul 27 05:08:54 2023 From: duke at openjdk.org (sid8606) Date: Thu, 27 Jul 2023 05:08:54 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 21:26:06 GMT, Jorn Vernee wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Address suggestions from Jorn Vernee > > src/hotspot/cpu/s390/downcallLinker_s390.cpp line 162: > >> 160: >> 161: assert(!_needs_return_buffer, "unexpected needs_return_buffer"); >> 162: bool should_save_return_value = _needs_transition;; > > This should always be `true`, so I don't think you need the `if` statements around the spill/fill code below. > > See: https://github.com/openjdk/jdk/pull/15025 (`should_save_return_value` being dependent on `_needs_transition` is a bug). Thank you, I'll make changes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1275736710 From stuefe at openjdk.org Thu Jul 27 05:31:39 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 27 Jul 2023 05:31:39 GMT Subject: RFR: 8313202: MutexLocker should disallow null Mutexes In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 17:06:02 GMT, Aleksey Shipilev wrote: > As seen in [JDK-8313081](https://bugs.openjdk.org/browse/JDK-8313081), it is fairly easy to pass nullptr `Mutex` to `MutexLocker` by accident, which would just silently avoid the lock. > > There are a few places in Hotspot where we pass `nullptr` to simulate re-entrancy and/or conditionally take the lock. Those places can be more explicit, and the default `MutexLocker` can disallow nullptrs for extra safety. > > Open for some bikeshedding on the names of the new `MutexLockers`. 
Particularly `ReentrantMutexLocker` might lull readers into believing it does safepoint checks on re-entrant "lock", which it actually does not do. > > More thorough testing with different GC/JIT combinations is running now, we might find more issues there. Meanwhile, please comment on the approach. > > Additional testing: > - [x] `grep -R "MutexLocker " src/hotspot | grep -i null`, no hits > - [x] `grep -R "MutexLocker " src/hotspot | grep -i ?`, no hits > - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` (re-run in progress) I think this change makes sense. Especially something like ReentrantLocker. Maybe we can make the syntax a bit more palatable? A variant would be to give the existing MutexLocker a boolean template parameter, default to false, "allowNull". Paths that allow nullptr must opt in by specifying the parameter directly. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15043#issuecomment-1652937383 From alanb at openjdk.org Thu Jul 27 05:48:42 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 27 Jul 2023 05:48:42 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() [v2] In-Reply-To: References: Message-ID: <6_vq-zO6qwPdKTwVAvyooF5dcVIQRadZEIj8QoVd8uk=.f37fb049-86a5-40e7-babd-e3ce8e402dda@github.com> On Wed, 26 Jul 2023 21:52:04 GMT, Matias Saavedra Silva wrote: >> src/java.base/share/native/libjli/java.c line 464: >> >>> 462: if (dumpSharedSpaces) { >>> 463: CHECK_EXCEPTION_LEAVE(0); >>> 464: LEAVE(); >> >> What is exit status ($?) when -Xshare:dump fails. It looks like any pending exception will be printed and it will exit with 0 but maybe I've missed something. >> >> In passing, the java launcher uses 4-space indent rather than 2. > > I am not incredibly familiar with the java launcher but I believe what I have is a mistake. I will change it to `CHECK_EXCEPTION_LEAVE(1)` instead. 
Yes, if you change it to `CHECK_EXCEPTION_LEAVE(1)` then any pending exception will be printed and it will exit with 1; if there is no pending exception then it exits with the value of `ret` as the exit status, which will be 0 here. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1275758802 From dholmes at openjdk.org Thu Jul 27 06:10:42 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 27 Jul 2023 06:10:42 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() [v2] In-Reply-To: <6_vq-zO6qwPdKTwVAvyooF5dcVIQRadZEIj8QoVd8uk=.f37fb049-86a5-40e7-babd-e3ce8e402dda@github.com> References: <6_vq-zO6qwPdKTwVAvyooF5dcVIQRadZEIj8QoVd8uk=.f37fb049-86a5-40e7-babd-e3ce8e402dda@github.com> Message-ID: On Thu, 27 Jul 2023 05:45:59 GMT, Alan Bateman wrote: >> I am not incredibly familiar with the java launcher but I believe what I have is a mistake. I will change it to `CHECK_EXCEPTION_LEAVE(1)` instead. > > Yes, if you change it to `CHECK_EXCEPTION_LEAVE(1)` then any pending exception will be printed and it will exit with 1; if there is no pending exception then it exits with the value of `ret` as the exit status, which will be 0 here.
> > Sorry I missed this. Unclear when `CHECK_EXCEPTION_LEAVE(0)` would ever be appropriate??? I can't think of any case where `CHECK_EXCEPTION_LEAVE(0)` would be useful, in which case we need to check the -version/-showversion handling in the unlikely event that VersionProps.print completes with an exception (nothing to do this PR of course). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1275784806 From rrich at openjdk.org Thu Jul 27 06:29:50 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 27 Jul 2023 06:29:50 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v2] In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 21:13:57 GMT, Richard Reingruber wrote: >> This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. >> >> The algorithm to share scanning large arrays is supposed to be a straight >> forward extension of the scheme implemented in >> `PSCardTable::scavenge_contents_parallel`. >> >> - A worker scans the part of a large array located in its stripe >> >> - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. >> >> - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) >> >> The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. 
>> >> #### Performance testing >> >> ##### BigArrayInOldGenRR.java >> >> [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). >> >> [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. >> >> Observations >> >> * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. >> >> * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. >> >> * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid ...
> > Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: > > Make sure to skip stripes where no object starts Test failure caused by https://bugs.openjdk.org/browse/JDK-8312534 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14846#issuecomment-1652985347 From rrich at openjdk.org Thu Jul 27 06:39:40 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 27 Jul 2023 06:39:40 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v2] In-Reply-To: References: <-0Wglb5yDJhAZ0I0uNab-PylFb5hLzYodYmnQjlmuTE=.f2aa059a-40ef-4968-84a6-569c61a4c55e@github.com> Message-ID: On Wed, 26 Jul 2023 21:21:30 GMT, Richard Reingruber wrote: > The guiding principle of PSCardTable::scavenge_contents_parallel is to scan only objects that start in the current stripe And with this pr also large array chunks. When we found that the last stripe before `space_top` has no object start then we know there is no array chunk to scan either. There might be the end of a large array but this is scanned together with the chunk in the previous stripe (see https://github.com/openjdk/jdk/pull/14846/files#diff-fc75bbdc27c5b981a8b66f77cbd83e6d86d8f7d3e313701207d880b7036c8b92R238-R239). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1275795987 From dholmes at openjdk.org Thu Jul 27 07:31:52 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 27 Jul 2023 07:31:52 GMT Subject: RFR: 8313202: MutexLocker should disallow null Mutexes In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 17:06:02 GMT, Aleksey Shipilev wrote: > As seen in [JDK-8313081](https://bugs.openjdk.org/browse/JDK-8313081), it is fairly easy to pass nullptr `Mutex` to `MutexLocker` by accident, which would just silently avoid the lock. > > There are a few places in Hotspot where we pass `nullptr` to simulate re-entrancy and/or conditionally take the lock. 
Those places can be more explicit, and the default `MutexLocker` can disallow nullptrs for extra safety. > > Open for some bikeshedding on the names of the new `MutexLockers`. Particularly `ReentrantMutexLocker` might lull readers into believing it does safepoint checks on re-entrant "lock", which it actually does not do. > > More thorough testing with different GC/JIT combinations is running now, we might find more issues there. Meanwhile, please comment on the approach. > > Additional testing: > - [x] `grep -R "MutexLocker " src/hotspot | grep -i null`, no hits > - [x] `grep -R "MutexLocker " src/hotspot | grep -i ?`, no hits > - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` (re-run in progress) Mutexes can be used in generic code that might be used before VM init reaches the point of being multi-threaded, as well as being used after that. The former case is transparently handled because the mutex ptr at that point is null and so no locking is attempted, nor is it needed. If we are going to go back in time then just reinstate MutexLockerEx - let's not go down this current path of overcomplicating the mutex code! In many (all?) cases where you may need a check for a non-null mutex you more generally want a subsystem initialization check - which can be done once. Adding null-checks to nearly every mutex use is not at all beneficial IMHO. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15043#issuecomment-1653054230 From shade at openjdk.org Thu Jul 27 08:20:49 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 27 Jul 2023 08:20:49 GMT Subject: RFR: 8313202: MutexLocker should disallow null Mutexes In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 22:28:31 GMT, David Holmes wrote: > Sorry the names are awful and confusing. ... So what would be a good name/convention to capture these three cases: 1. (default one) Mutex is not null, and the mutex is not already held. This is `MutexLocker` in this PR. 2.
(special case) Check condition before locking the mutex, mutex should not be null otherwise. This is `ConditionalMutexLocker` in this PR. 3. (special case) Check that mutex is not held, lock otherwise. This is `ReentrantMutexLocker` in this PR. Ideally, something that would make a casual reader understand that the code like this can actually _skip_ synchronization: MutexLockerEx ml(MyLock_lock, ...) > If you want to special-case this then make an argument to re-instate MutexLockerEx. I cannot see how `MutexLockerEx` conveys the null-accepting/lock-skipping property. Yes, I understand that before [JDK-8222811](https://bugs.openjdk.org/browse/JDK-8222811), we had that null-accepting form as `MutexLockerEx`, but then I think we can name it better if we are about to backtrack on it. > I don't think JDK-8313081 is sufficient justification for a change of this nature/scope. The justification seems to be discussed in the bug report itself, so I'll discuss it there. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15043#issuecomment-1653124603 From shade at openjdk.org Thu Jul 27 08:36:52 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 27 Jul 2023 08:36:52 GMT Subject: RFR: 8313202: MutexLocker should disallow null Mutexes In-Reply-To: References: Message-ID: <6HAEVRAvKprF7YsdUz-D-HA61sm6PIYyaT-lpAMdEfc=.f03f7373-639a-4962-9a78-f53482278b27@github.com> On Wed, 26 Jul 2023 17:06:02 GMT, Aleksey Shipilev wrote: > As seen in [JDK-8313081](https://bugs.openjdk.org/browse/JDK-8313081), it is fairly easy to pass nullptr `Mutex` to `MutexLocker` by accident, which would just silently avoid the lock. > > There are a few places in Hotspot where we pass `nullptr` to simulate re-entrancy and/or conditionally take the lock. Those places can be more explicit, and the default `MutexLocker` can disallow nullptrs for extra safety. > > Open for some bikeshedding on the names of the new `MutexLockers`. 
Particularly `ReentrantMutexLocker` might lull readers into believing it does safepoint checks on re-entrant "lock", which it actually does not do. > > More thorough testing with different GC/JIT combinations is running now, we might find more issues there. Meanwhile, please comment on the approach. > > Additional testing: > - [x] `grep -R "MutexLocker " src/hotspot | grep -i null`, no hits > - [x] `grep -R "MutexLocker " src/hotspot | grep -i ?`, no hits > - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` (re-run in progress) All right, the discussion is here then? To recap: JDK-8313081 shows that it is easy to pass a null Mutex by accident, which would skip synchronization. Reviewers (you, me, and others) did not catch this. No internal asserts caught this. No testing caught this. We got lucky it only produced an accounting error. So there are two cases: a) MutexLocker accepts null Mutex during VM initialization, and that is arguably correct behavior like JDK-8313210 b) MutexLocker accepts null Mutex during normal operation, and that is an error leading to cases like JDK-8313081 What JDK-8313081 tells me is that by chasing (a) we are missing the safety in (b). I would argue that detecting _missing synchronization_ in product code due to (b) is more important than asserting-and-special-casing on (a). I don't think we should wait for a more serious breakage to occur to start protecting from these kinds of bugs. Note that the way the current PR works, we still accept null Mutex in product bits for (a), we "just" assert it in debug. If we don't do these checks in Mutex code itself, then the paranoid thing would be to require `assert(MyLock_lock != nullptr)` before every (all?) critical mutex acquisition that is expected to always lock.
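For readers following the thread, the difference between a locker that silently skips a null mutex and one that flags it can be sketched in standalone C++. This is only an illustration under stated assumptions, not the code in the PR: `std::mutex` stands in for HotSpot's `Mutex`, the class names `StrictLocker` and `ConditionalLocker` are hypothetical, and safepoint checks, lock ranks and `Thread` arguments are left out entirely.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-ins for illustration only. std::mutex replaces HotSpot's
// Mutex; there are no safepoint checks, ranks, or Thread arguments here.

// Default locker: the caller promises a non-null mutex.
class StrictLocker {
  std::mutex* _mutex;
 public:
  explicit StrictLocker(std::mutex* mutex) : _mutex(mutex) {
    // Builds with assertions enabled flag the mistake. Builds with NDEBUG
    // fall through to the guarded lock below and skip locking on null.
    assert(_mutex != nullptr && "null mutex passed to StrictLocker");
    if (_mutex != nullptr) _mutex->lock();
  }
  ~StrictLocker() { if (_mutex != nullptr) _mutex->unlock(); }
  StrictLocker(const StrictLocker&) = delete;
  StrictLocker& operator=(const StrictLocker&) = delete;
};

// Opt-in locker: skipping the lock is spelled out at the call site through
// the condition argument instead of being hidden behind a null pointer.
class ConditionalLocker {
  std::mutex* _mutex;
  bool _locked;
 public:
  ConditionalLocker(std::mutex* mutex, bool condition)
      : _mutex(mutex), _locked(condition && mutex != nullptr) {
    assert(_mutex != nullptr && "null mutex passed to ConditionalLocker");
    if (_locked) _mutex->lock();
  }
  ~ConditionalLocker() { if (_locked) _mutex->unlock(); }
  ConditionalLocker(const ConditionalLocker&) = delete;
  ConditionalLocker& operator=(const ConditionalLocker&) = delete;

  // True if this object actually took the lock.
  bool locked() const { return _locked; }
};
```

With this shape, a call site such as `ConditionalLocker ml(&lock, needs_lock);` makes the possible skip visible to a reader, while `StrictLocker ml(&lock);` is expected to always lock.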
------------- PR Comment: https://git.openjdk.org/jdk/pull/15043#issuecomment-1653144754 From shade at openjdk.org Thu Jul 27 08:45:52 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 27 Jul 2023 08:45:52 GMT Subject: RFR: 8313202: MutexLocker should disallow null Mutexes In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 17:06:02 GMT, Aleksey Shipilev wrote: > As seen in [JDK-8313081](https://bugs.openjdk.org/browse/JDK-8313081), it is fairly easy to pass nullptr `Mutex` to `MutexLocker` by accident, which would just silently avoid the lock. > > There are a few places in Hotspot where we pass `nullptr` to simulate re-entrancy and/or conditionally take the lock. Those places can be more explicit, and the default `MutexLocker` can disallow nullptrs for extra safety. > > Open for some bikeshedding on the names of the new `MutexLockers`. Particularly `ReentrantMutexLocker` might lull readers into believing it does safepoint checks on re-entrant "lock", which it actually does not do. > > More thorough testing with different GC/JIT combinations is running now, we might find more issues there. Meanwhile, please comment on the approach. > > Additional testing: > - [x] `grep -R "MutexLocker " src/hotspot | grep -i null`, no hits > - [x] `grep -R "MutexLocker " src/hotspot | grep -i ?`, no hits > - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` (re-run in progress) There is a lateral thing we can do to mitigate future problems like JDK-8313081: make all named Mutexes/Monitors unconditionally initialized, regardless of runtime options and/or defines. It feels insufficient, as it still does not protect from misuse of private Mutexes/Monitors, but at least one could be reasonably sure the named Mutex/Monitor from `mutexLocker.hpp` is guaranteed to be initialized during the normal VM operation.
------------- PR Comment: https://git.openjdk.org/jdk/pull/15043#issuecomment-1653170061 From dholmes at openjdk.org Thu Jul 27 12:18:50 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 27 Jul 2023 12:18:50 GMT Subject: RFR: 8313202: MutexLocker should disallow null Mutexes In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 17:06:02 GMT, Aleksey Shipilev wrote: > As seen in [JDK-8313081](https://bugs.openjdk.org/browse/JDK-8313081), it is fairly easy to pass nullptr `Mutex` to `MutexLocker` by accident, which would just silently avoid the lock. > > There are a few places in Hotspot where we pass `nullptr` to simulate re-entrancy and/or conditionally take the lock. Those places can be more explicit, and the default `MutexLocker` can disallow nullptrs for extra safety. > > Open for some bikeshedding on the names of the new `MutexLockers`. Particularly `ReentrantMutexLocker` might lull readers into believing it does safepoint checks on re-entrant "lock", which it actually does not do. > > More thorough testing with different GC/JIT combinations is running now, we might find more issues there. Meanwhile, please comment on the approach. > > Additional testing: > - [x] `grep -R "MutexLocker " src/hotspot | grep -i null`, no hits > - [x] `grep -R "MutexLocker " src/hotspot | grep -i ?`, no hits > - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` (re-run in progress) The issue with JDK-8313081 was NOT due to null mutexes - that was the last line of defense if you like. With JDK-8313081 the real problem was that we took a mutex intended only for G1 and used it outside its intended domain without realizing it was conditionally created. Sure a null check would eventually have found it but as I said last line of defense. We screwed up and missed the problem. Having a null mutex mean "no locking" is a feature not a bug. Case (b) should be detected by proper sub-system initialization checks post VM init when the mutexes should all be initialized. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/15043#issuecomment-1653496319 PR Comment: https://git.openjdk.org/jdk/pull/15043#issuecomment-1653498807 From coleenp at openjdk.org Thu Jul 27 12:55:49 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 12:55:49 GMT Subject: RFR: 8313202: MutexLocker should disallow null Mutexes In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 17:06:02 GMT, Aleksey Shipilev wrote: > As seen in [JDK-8313081](https://bugs.openjdk.org/browse/JDK-8313081), it is fairly easy to pass nullptr `Mutex` to `MutexLocker` by accident, which would just silently avoid the lock. > > There are a few places in Hotspot where we pass `nullptr` to simulate re-entrancy and/or conditionally take the lock. Those places can be more explicit, and the default `MutexLocker` can disallow nullptrs for extra safety. > > Open for some bikeshedding on the names of the new `MutexLockers`. Particularly `ReentrantMutexLocker` might lull readers into believing it does safepoint checks on re-entrant "lock", which it actually does not do. > > More thorough testing with different GC/JIT combinations is running now, we might find more issues there. Meanwhile, please comment on the approach. > > Additional testing: > - [x] `grep -R "MutexLocker " src/hotspot | grep -i null`, no hits > - [x] `grep -R "MutexLocker " src/hotspot | grep -i ?`, no hits > - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` (re-run in progress) I don't hate this change at all and the names convey the meaning much better than MutexLockerEx ever did. The case of SystemDictionary, for example, is so much cleaner IMO. ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/15043#pullrequestreview-1549780945 From coleenp at openjdk.org Thu Jul 27 12:58:51 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 12:58:51 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v4] In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 15:16:06 GMT, Coleen Phillimore wrote: >> Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. >> Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add some more u1 casts to keep Windows compiler happy hopefully. My first version did the explicit | 2 separately and if you don't like all the casts, I think that's the most explicit version. It sets the value, then sets the top bit that says it's not the default value. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14892#issuecomment-1653568693 From matsaave at openjdk.org Thu Jul 27 13:40:56 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Thu, 27 Jul 2023 13:40:56 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() [v3] In-Reply-To: References: Message-ID: > Currently we exit the VM after static dumping with `MetaspaceShared::exit_after_static_dump()`. > > > // We have finished dumping the static archive. At this point, there may be pending VM > // operations. We have changed some global states (such as vmClasses::_klasses) that > // may cause these VM operations to fail. For safety, forget these operations and > // exit the VM directly. 
> void MetaspaceShared::exit_after_static_dump() { > os::_exit(0); > } > > > As the comment suggests, the VM state is altered when preparing and performing the static dump, so this change aims to prevent these state changes so the VM can exit normally after the static dump completes. There are three major aspects to this change: > 1. Since the resolved references array in the Constant Pool is altered when preparing for a static dump, a "scratch copy" is created and archived instead > 2. Symbols are sorted by address and have their hash recalculated. Similarly to point 1, the copies of the symbols that are to be archived have their hashes updated as opposed to the originals. > 3. The handling of -Xshare:dump during argument parsing such that the VM can continue and exit normally with an exit code of 0. > > Verified with tier 1-9 tests. Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Refactored KlassToOopHandleTable, Ioi and Alan comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14879/files - new: https://git.openjdk.org/jdk/pull/14879/files/900fa187..dc1915dc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14879&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14879&range=01-02 Stats: 56 lines in 4 files changed: 6 ins; 21 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/14879.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14879/head:pull/14879 PR: https://git.openjdk.org/jdk/pull/14879 From coleenp at openjdk.org Thu Jul 27 14:11:48 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 14:11:48 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v5] In-Reply-To: References: Message-ID: <3FqjCTXftM_LFFk7QPSfriRnvmjrAud3dhLN6uWRFr8=.1609d013-2684-40cb-a74f-a65011e4732a@github.com> > Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. 
Also, reduced include file inclusion a little. > Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Back to V1. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14892/files - new: https://git.openjdk.org/jdk/pull/14892/files/f135916d..d380e8ef Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14892&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14892&range=03-04 Stats: 7 lines in 1 file changed: 4 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14892.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14892/head:pull/14892 PR: https://git.openjdk.org/jdk/pull/14892 From duke at openjdk.org Thu Jul 27 16:19:41 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Thu, 27 Jul 2023 16:19:41 GMT Subject: RFR: 8312623: SA add NestHost and NestMembers attributes when dumping class In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 05:36:15 GMT, David Holmes wrote: >> This patch adds NestHost and NestMembers attributes to the class dumped by SA. >> >> Testing: `test/hotspot/jtreg/serviceability/sa` and `test/jdk/sun/tools/jhsdb` >> Manual testing by dumping `j.l.String` and `j.l.String$CaseInsensitiveComparator` classes. >> `j.l.String` shows one entry in `NestMembers` attribute for `j.l.String$CaseInsensitiveComparator` and `j.l.String$CaseInsensitiveComparator` has `j.l.String` as its `NestHost`. > > We need to be sure this works as expected for top-level classes that have no nest members, and deeply nested nest members, plus dynamically injected hidden classes that are nest members. I'm unclear if this is intended to only expose the same details as would be statically defined in the attribute in the classfile? @dholmes-ora sorry for responding late. I got sidetracked by some other work. 
> We need to be sure this works as expected for top-level classes that have no nest members, and deeply nested nest members, plus dynamically injected hidden classes that are nest members. I am not sure I understand this concern. We are getting nest-host and nest-members from the InstanceKlass. As long as this information is recorded in InstanceKlass, it would work. Can you please elaborate your concern about the cases you feel may not work. > I'm unclear if this is intended to only expose the same details as would be statically defined in the attribute in the classfile? It is to expose the details as the JVM sees, which may be different from what is statically defined in the classfile if agents are involved. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15005#issuecomment-1653934767 From rrich at openjdk.org Thu Jul 27 16:20:13 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 27 Jul 2023 16:20:13 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v3] In-Reply-To: References: Message-ID: > This pr introduces parallel scanning of large object arrays in the old generation containing roots for young collections of Parallel GC. This allows for better distribution of the actual work (following the array references) as opposed to "stealing" from other task queues which can lead to inverse scaling demonstrated by small tests (attached to JDK-8310031) and also observed in gerrit production systems. > > The algorithm to share scanning large arrays is supposed to be a straight > forward extension of the scheme implemented in > `PSCardTable::scavenge_contents_parallel`. > > - A worker scans the part of a large array located in its stripe > > - Except for the end of the large array reaching into a stripe which is scanned by the thread owning the previous stripe. This is just what the current implementation does: it skips objects crossing into the stripe. 
> > - For this it is necessary that large arrays cover at least 3 stripes (see `PSCardTable::large_obj_arr_min_words`) > > The implementation also makes use of the precise card marks for arrays. Only dirty regions are actually scanned. > > #### Performance testing > > ##### BigArrayInOldGenRR.java > > [BigArrayInOldGenRR.java](https://bugs.openjdk.org/secure/attachment/104422/BigArrayInOldGenRR.java) is a micro benchmark that assigns new objects to a large array in a loop. Creating new array elements triggers young collections. In each collection the large array is scanned because of its references to the new elements in the young generation. The benchmark score is the geometric mean of the duration of the last 5 young collections (lower is better). > > [BigArrayInOldGenRR.pdf](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.pdf)([BigArrayInOldGenRR.ods](https://cr.openjdk.org/~rrich/webrevs/8310031/BigArrayInOldGenRR.ods)) presents the benchmark results with 1 to 64 gc threads. > > Observations > > * JDK22 scales inversely. Adding gc threads prolongs young collections. With 32 threads young collections take ~15x longer than single threaded. > > * Fixed JDK22 scales well. Adding gc threads reduces the duration of young collections. With 32 threads young collections are 5x shorter than single threaded. > > * With just 1 gc thread there is a regression. Young collections are 1.5x longer with the fix. I assume the reason is that the iteration over the array elements is interrupted at the end of a stripe which makes it less efficient. The price for parallelization is paid without actually doing it. Also ParallelGC will use at lea...
Richard Reingruber has updated the pull request incrementally with one additional commit since the last revision: Limit effect of previous commit to large array handling ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14846/files - new: https://git.openjdk.org/jdk/pull/14846/files/0eb924e1..5b802ed3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14846&range=01-02 Stats: 23 lines in 1 file changed: 14 ins; 8 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14846.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14846/head:pull/14846 PR: https://git.openjdk.org/jdk/pull/14846 From rrich at openjdk.org Thu Jul 27 16:33:44 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Thu, 27 Jul 2023 16:33:44 GMT Subject: RFR: 8310031: Parallel: Implement better work distribution for large object arrays in old gen [v3] In-Reply-To: References: <-0Wglb5yDJhAZ0I0uNab-PylFb5hLzYodYmnQjlmuTE=.f2aa059a-40ef-4968-84a6-569c61a4c55e@github.com> Message-ID: On Thu, 27 Jul 2023 06:36:45 GMT, Richard Reingruber wrote: >> The guiding principle of `PSCardTable::scavenge_contents_parallel` is to scan only objects that start in the current stripe. There is a special case for the last stripe ending at `space_top` if `space_top` is not card-aligned and an object starts there. With https://github.com/openjdk/jdk/pull/14846/commits/0eb924e1d4dfa1f751a55038063af4f6092be8c2 this is handled explicitly and more clearly to make sure stripes without object starts are skipped right away. This is important for correct handling of large object arrays. IMHO also helpful aside from large arrays. > >> The guiding principle of PSCardTable::scavenge_contents_parallel is to scan only objects that start in the current stripe > > And with this pr also large array chunks. When we found that the last stripe before `space_top` has no object start then we know there is no array chunk to scan either. 
There might be the end of a large array but this is scanned together with the chunk in the previous stripe (see https://github.com/openjdk/jdk/pull/14846/files#diff-fc75bbdc27c5b981a8b66f77cbd83e6d86d8f7d3e313701207d880b7036c8b92R238-R239). I didn't like the last commit (https://github.com/openjdk/jdk/commit/0eb924e1d4dfa1f751a55038063af4f6092be8c2): it introduced a condition that is only needed for large array handling adding another hunk to the patch. Instead I've expanded the constraints
```c++
// 4. range of large array elements to be scanned: [first_obj_addr, cur_stripe_end_addr)
//    limited to dirty regions
```
It is enforced where large arrays are handled. Comments explain the assumptions and assertions check them. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14846#discussion_r1276540097 From iklam at openjdk.org Thu Jul 27 18:09:51 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 27 Jul 2023 18:09:51 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() [v3] In-Reply-To: References: Message-ID: On Thu, 27 Jul 2023 13:40:56 GMT, Matias Saavedra Silva wrote: >> Currently we exit the VM after static dumping with `MetaspaceShared::exit_after_static_dump()`. >> >> >> // We have finished dumping the static archive. At this point, there may be pending VM >> // operations. We have changed some global states (such as vmClasses::_klasses) that >> // may cause these VM operations to fail. For safety, forget these operations and >> // exit the VM directly. >> void MetaspaceShared::exit_after_static_dump() { >> os::_exit(0); >> } >> >> >> As the comment suggests, the VM state is altered when preparing and performing the static dump, so this change aims to prevent these state changes so the VM can exit normally after the static dump completes. There are three major aspects to this change: >> 1.
Since the resolved references array in the Constant Pool is altered when preparing for a static dump, a "scratch copy" is created and archived instead >> 2. Symbols are sorted by address and have their hash recalculated. Similarly to point 1, the copies of the symbols that are to be archived have their hashes updated as opposed to the originals. >> 3. The handling of -Xshare:dump during argument parsing such that the VM can continue and exit normally with an exit code of 0. >> >> Verified with tier 1-9 tests. > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Refactored KlassToOopHandleTable, Ioi and Alan comments Looks good. Just a couple of nits. src/hotspot/share/cds/heapShared.cpp line 331: > 329: mtClassShared> { > 330: public: > 331: oop get_oop(MetaspaceObj* obj) { The word `obj` may be confusing here as it may look like an oop. I think it's better to change it to `ptr`. A bit of background: CDS copies two types of objects: oops (archiveHeapWriter.cpp and heapShared.cpp) and MetaspaceObjs (in archiveBuilder.cpp) In the former two files, the convention is to use `obj` to refer to oops, and `ptr` to refer to MetaspaceObjs. In archiveBuilder.cpp, since we never use a mix of oops and MetaspaceObjs, we use `obj` to refer to MetaspaceObjs. This is mostly for historical purposes, and we might some day change it to use `ptr` to be more consistent. src/hotspot/share/oops/constantPool.cpp line 226: > 224: // Handle scratch_handle(THREAD, scratch_references); > 225: //HeapShared::add_scratch_resolved_references(this, loader_data->add_handle(scratch_handle)); > 226: HeapShared::add_scratch_resolved_references(this, scratch_references); Comments should be removed? ------------- Marked as reviewed by iklam (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14879#pullrequestreview-1550470579 PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1276637489 PR Review Comment: https://git.openjdk.org/jdk/pull/14879#discussion_r1276639485 From matsaave at openjdk.org Thu Jul 27 19:09:00 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Thu, 27 Jul 2023 19:09:00 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() [v4] In-Reply-To: References: Message-ID: > Currently we exit the VM after static dumping with `MetaspaceShared::exit_after_static_dump()`. > > > // We have finished dumping the static archive. At this point, there may be pending VM > // operations. We have changed some global states (such as vmClasses::_klasses) that > // may cause these VM operations to fail. For safety, forget these operations and > // exit the VM directly. > void MetaspaceShared::exit_after_static_dump() { > os::_exit(0); > } > > > As the comment suggests, the VM state is altered when preparing and performing the static dump, so this change aims to prevent these state changes so the VM can exit normally after the static dump completes. There are three major aspects to this change: > 1. Since the resolved references array in the Constant Pool is altered when preparing for a static dump, a "scratch copy" is created and archived instead > 2. Symbols are sorted by address and have their hash recalculated. Similarly to point 1, the copies of the symbols that are to be archived have their hashes updated as opposed to the originals. > 3. The handling of -Xshare:dump during argument parsing such that the VM can continue and exit normally with an exit code of 0. > > Verified with tier 1-9 tests. 
Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Ioi comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14879/files - new: https://git.openjdk.org/jdk/pull/14879/files/dc1915dc..64d8c638 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14879&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14879&range=02-03 Stats: 10 lines in 2 files changed: 0 ins; 3 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/14879.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14879/head:pull/14879 PR: https://git.openjdk.org/jdk/pull/14879 From dlong at openjdk.org Thu Jul 27 19:19:41 2023 From: dlong at openjdk.org (Dean Long) Date: Thu, 27 Jul 2023 19:19:41 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v5] In-Reply-To: <3FqjCTXftM_LFFk7QPSfriRnvmjrAud3dhLN6uWRFr8=.1609d013-2684-40cb-a74f-a65011e4732a@github.com> References: <3FqjCTXftM_LFFk7QPSfriRnvmjrAud3dhLN6uWRFr8=.1609d013-2684-40cb-a74f-a65011e4732a@github.com> Message-ID: On Thu, 27 Jul 2023 14:11:48 GMT, Coleen Phillimore wrote: >> Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. >> Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Back to V1. V1 looks good and seems to produce identical code. 
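A rough sketch of the "V1" shape referred to above, in which the value is assigned first and the marker bit is OR-ed in as a separate step. This is not copied from `tribool.hpp`: the class name `TriBoolSketch`, the `uint8_t` storage and the bit layout (bit 0 holds the boolean, bit 1 marks "not default") are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical three-state boolean: default (unset), false, or true.
class TriBoolSketch {
  uint8_t _value;  // assumed layout: bit 0 = boolean value, bit 1 = "was set" marker
 public:
  TriBoolSketch() : _value(0) {}

  // Assign the boolean first, then set the marker bit in its own statement.
  TriBoolSketch(bool value) : _value(value) {
    _value = _value | 2;
  }

  bool is_default() const { return (_value & 2) == 0; }

  // The boolean payload; a default-constructed instance reads as false.
  explicit operator bool() const { return (_value & 1) != 0; }
};
```

Whether a particular compiler warns on either form depends on its `-Wconversion` rules, so the sketch only shows the two-step structure and makes no claim about diagnostics.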
------------- PR Comment: https://git.openjdk.org/jdk/pull/14892#issuecomment-1654363335 From coleenp at openjdk.org Thu Jul 27 19:28:58 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 19:28:58 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer Message-ID: This is a simple change to make array_klass() return ArrayKlass (the first dimension of TypeArrayKlass is a TypeArrayKlass so can't use ObjArrayKlass), higher_dimension is always an ObjArrayKlass and lower_dimension can be a TypeArrayKlass. The change removes some casts. Tested with tier1 linux/macosx/windows on x86 and aarch64. ------------- Commit messages: - 8312262: Klass::array_klass() should return ArrayKlass pointer Changes: https://git.openjdk.org/jdk/pull/15059/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15059&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8312262 Stats: 61 lines in 11 files changed: 1 ins; 5 del; 55 mod Patch: https://git.openjdk.org/jdk/pull/15059.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15059/head:pull/15059 PR: https://git.openjdk.org/jdk/pull/15059 From coleenp at openjdk.org Thu Jul 27 19:52:53 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 19:52:53 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform [v4] In-Reply-To: References: Message-ID: On Tue, 18 Jul 2023 16:48:55 GMT, Jean-Philippe Bempel wrote: >> Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in a unresolved state. So the constant pool merging process was considering the entry with pre-resolved class as different compared to the destination and create a new entry. We now try to consider it as equal specially for Methodref/Fieldref. 
> > Jean-Philippe Bempel has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > Revert resolved class to unresolved for comparison > > remove is_unresolved_class_mismatch Sorry I didn't see this update. The change looks good to me. You could do the cleanup that Serguei suggests with t2 and removing JVM_CONSTANT_Class case. I thought maybe they should be left in case we want to generalize this compare_entry_to function someday, that's why I didn't suggest removing it. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14780#pullrequestreview-1550650275 From coleenp at openjdk.org Thu Jul 27 19:52:56 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 19:52:56 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform [v4] In-Reply-To: <3caMuTqUS6CJ_NhT-o_G62gfy3FaDPCKy2A9ZOnbg3U=.670425ac-6e2a-4e6a-96a0-eec53c1ab96a@github.com> References: <3caMuTqUS6CJ_NhT-o_G62gfy3FaDPCKy2A9ZOnbg3U=.670425ac-6e2a-4e6a-96a0-eec53c1ab96a@github.com> Message-ID: <50oMYkxOt3i9QIv-mwssUy7-qmPi7JksNAnszdeIj0c=.3499a1ad-d5e3-46fe-9a9d-ae37b982a44d@github.com> On Wed, 19 Jul 2023 09:59:04 GMT, Serguei Spitsyn wrote: >> Jean-Philippe Bempel has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: >> >> Revert resolved class to unresolved for comparison >> >> remove is_unresolved_class_mismatch > > src/hotspot/share/oops/constantPool.cpp line 1295: > >> 1293: t1 = JVM_CONSTANT_UnresolvedClass; >> 1294: } >> 1295: > > All consequences of this change are not clear to me yet. > The lines 1307-1314 become not needed anymore. 
> Also, should the same be done for t2 as well? t2 could be a resolved class, and can be compared with unresolved class because the function klass_name_at() works for both. It might be a good idea to change both of them though, although not necessary imo. You're right 1307-1314 are not reached anymore. Neither is the ClassIndex case but not to remove as part of this change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14780#discussion_r1276753305 From dlong at openjdk.org Thu Jul 27 20:15:42 2023 From: dlong at openjdk.org (Dean Long) Date: Thu, 27 Jul 2023 20:15:42 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer In-Reply-To: References: Message-ID: On Thu, 27 Jul 2023 19:22:52 GMT, Coleen Phillimore wrote: > This is a simple change to make array_klass() return ArrayKlass (the first dimension of TypeArrayKlass is a TypeArrayKlass so can't use ObjArrayKlass), higher_dimension is always an ObjArrayKlass and lower_dimension can be a TypeArrayKlass. The change removes some casts. > Tested with tier1 linux/macosx/windows on x86 and aarch64. Marked as reviewed by dlong (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15059#pullrequestreview-1550700621 From coleenp at openjdk.org Thu Jul 27 20:15:51 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 20:15:51 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v5] In-Reply-To: <3FqjCTXftM_LFFk7QPSfriRnvmjrAud3dhLN6uWRFr8=.1609d013-2684-40cb-a74f-a65011e4732a@github.com> References: <3FqjCTXftM_LFFk7QPSfriRnvmjrAud3dhLN6uWRFr8=.1609d013-2684-40cb-a74f-a65011e4732a@github.com> Message-ID: On Thu, 27 Jul 2023 14:11:48 GMT, Coleen Phillimore wrote: >> Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. 
>> Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Back to V1. Thanks, Dean. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14892#issuecomment-1654496116 From dlong at openjdk.org Thu Jul 27 20:18:51 2023 From: dlong at openjdk.org (Dean Long) Date: Thu, 27 Jul 2023 20:18:51 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer In-Reply-To: References: Message-ID: On Thu, 27 Jul 2023 19:22:52 GMT, Coleen Phillimore wrote: > This is a simple change to make array_klass() return ArrayKlass (the first dimension of TypeArrayKlass is a TypeArrayKlass so can't use ObjArrayKlass), higher_dimension is always an ObjArrayKlass and lower_dimension can be a TypeArrayKlass. The change removes some casts. > Tested with tier1 linux/macosx/windows on x86 and aarch64. src/hotspot/share/oops/arrayKlass.hpp line 62: > 60: ObjArrayKlass* higher_dimension() const { return _higher_dimension; } > 61: inline ObjArrayKlass* higher_dimension_acquire() const; // load with acquire semantics > 62: void set_higher_dimension(ObjArrayKlass* k) { _higher_dimension = k; } The set_higher_dimension() function seems to be unused. Should we remove it? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15059#discussion_r1276777259 From matsaave at openjdk.org Thu Jul 27 20:53:08 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Thu, 27 Jul 2023 20:53:08 GMT Subject: RFR: 8307312: Replace "int which" with "int cp_index" in constantPool Message-ID: Many accessors in the constant pool take an argument "int which" that is meant to represent an ambiguous index. Despite this, several methods in the API use "int which" when the argument is exclusively a constant pool index. 
This patch aims to rename all of these instances to "int cp_index" in order to be more clear and to distinguish methods that take constant pool indices and methods that use derived indices. The callers have been updated to use more clear naming as well. Verified with tier 1-5 tests. ------------- Commit messages: - 8307312: Replace "int which" with "int cp_index" in constantPool Changes: https://git.openjdk.org/jdk/pull/15027/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15027&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307312 Stats: 381 lines in 8 files changed: 0 ins; 0 del; 381 mod Patch: https://git.openjdk.org/jdk/pull/15027.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15027/head:pull/15027 PR: https://git.openjdk.org/jdk/pull/15027 From ccheung at openjdk.org Thu Jul 27 21:00:53 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Thu, 27 Jul 2023 21:00:53 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer In-Reply-To: References: Message-ID: On Thu, 27 Jul 2023 20:16:17 GMT, Dean Long wrote: >> This is a simple change to make array_klass() return ArrayKlass (the first dimension of TypeArrayKlass is a TypeArrayKlass so can't use ObjArrayKlass), higher_dimension is always an ObjArrayKlass and lower_dimension can be a TypeArrayKlass. The change removes some casts. >> Tested with tier1 linux/macosx/windows on x86 and aarch64. > > src/hotspot/share/oops/arrayKlass.hpp line 62: > >> 60: ObjArrayKlass* higher_dimension() const { return _higher_dimension; } >> 61: inline ObjArrayKlass* higher_dimension_acquire() const; // load with acquire semantics >> 62: void set_higher_dimension(ObjArrayKlass* k) { _higher_dimension = k; } > > The set_higher_dimension() function seems to be unused. Should we remove it? I'm using the function in the following PR https://github.com/openjdk/jdk/pull/14959. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15059#discussion_r1276813629 From coleenp at openjdk.org Thu Jul 27 21:48:51 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 21:48:51 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer In-Reply-To: References: Message-ID: <8XHqsZLQdnHQoxLCByQs0yJ84LmCJkHLpt0OkxoKLq4=.e4a307e1-2cbe-45dc-ac5a-50a287b8c7d6@github.com> On Thu, 27 Jul 2023 20:57:56 GMT, Calvin Cheung wrote: >> src/hotspot/share/oops/arrayKlass.hpp line 62: >> >>> 60: ObjArrayKlass* higher_dimension() const { return _higher_dimension; } >>> 61: inline ObjArrayKlass* higher_dimension_acquire() const; // load with acquire semantics >>> 62: void set_higher_dimension(ObjArrayKlass* k) { _higher_dimension = k; } >> >> The set_higher_dimension() function seems to be unused. Should we remove it? > > I'm using the function in the following PR https://github.com/openjdk/jdk/pull/14959. Should you be using release_set_higher_dimension in your change @calvinccheung ? It does seem better to not have unsafe alternatives. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15059#discussion_r1276850667 From ccheung at openjdk.org Thu Jul 27 21:56:53 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Thu, 27 Jul 2023 21:56:53 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer In-Reply-To: References: Message-ID: On Thu, 27 Jul 2023 19:22:52 GMT, Coleen Phillimore wrote: > This is a simple change to make array_klass() return ArrayKlass (the first dimension of TypeArrayKlass is a TypeArrayKlass so can't use ObjArrayKlass), higher_dimension is always an ObjArrayKlass and lower_dimension can be a TypeArrayKlass. The change removes some casts. > Tested with tier1 linux/macosx/windows on x86 and aarch64. Looks good. Just couple of nits. 
I think there are some changes that could be made under the `sun.jvm.hotspot.oops` package to reduce the amount of type casting. For example: `./sun/jvm/hotspot/oops/ArrayKlass.java: public Klass getHigherDimension() { return (Klass) higherDimension.getValue(this); }` The above method could return an ObjArrayKlass so the following typecasts are not needed: ./sun/jvm/hotspot/oops/ObjArrayKlass.java: ObjArrayKlass ak = (ObjArrayKlass) getHigherDimension(); ./sun/jvm/hotspot/oops/TypeArrayKlass.java: ObjArrayKlass ak = (ObjArrayKlass) getHigherDimension(); src/hotspot/share/oops/arrayKlass.cpp line 130: > 128: check_array_allocation_length(length, arrayOopDesc::max_array_length(T_ARRAY), CHECK_NULL); > 129: size_t size = objArrayOopDesc::object_size(length); > 130: ArrayKlass* ak = array_klass(n+dimension(), CHECK_NULL); Please add a blank space before and after the '+'. src/hotspot/share/oops/objArrayKlass.cpp line 179: > 177: if (length != 0) { > 178: for (int index = 0; index < length; index++) { > 179: oop sub_array = ld_klass->multi_allocate(rank-1, &sizes[1], CHECK_NULL); Please add a blank space before and after the '-'. ------------- PR Review: https://git.openjdk.org/jdk/pull/15059#pullrequestreview-1550859087 PR Review Comment: https://git.openjdk.org/jdk/pull/15059#discussion_r1276849391 PR Review Comment: https://git.openjdk.org/jdk/pull/15059#discussion_r1276850058 From ccheung at openjdk.org Thu Jul 27 21:59:40 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Thu, 27 Jul 2023 21:59:40 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer In-Reply-To: <8XHqsZLQdnHQoxLCByQs0yJ84LmCJkHLpt0OkxoKLq4=.e4a307e1-2cbe-45dc-ac5a-50a287b8c7d6@github.com> References: <8XHqsZLQdnHQoxLCByQs0yJ84LmCJkHLpt0OkxoKLq4=.e4a307e1-2cbe-45dc-ac5a-50a287b8c7d6@github.com> Message-ID: On Thu, 27 Jul 2023 21:45:54 GMT, Coleen Phillimore wrote: >> I'm using the function in the following PR https://github.com/openjdk/jdk/pull/14959.
> > Should you be using release_set_higher_dimension in your change @calvinccheung ? It does seem better to not have unsafe alternatives. Sure, I can use release_set_higher_dimension in my change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15059#discussion_r1276857531 From dholmes at openjdk.org Thu Jul 27 22:31:52 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 27 Jul 2023 22:31:52 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v5] In-Reply-To: <3FqjCTXftM_LFFk7QPSfriRnvmjrAud3dhLN6uWRFr8=.1609d013-2684-40cb-a74f-a65011e4732a@github.com> References: <3FqjCTXftM_LFFk7QPSfriRnvmjrAud3dhLN6uWRFr8=.1609d013-2684-40cb-a74f-a65011e4732a@github.com> Message-ID: On Thu, 27 Jul 2023 14:11:48 GMT, Coleen Phillimore wrote: >> Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. >> Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Back to V1. Maybe somewhere a comment could be added like: // Update _value in two steps to avoid awkward casts needed to silence conversion warnings otherwise someone is bound to rewrite those two lines as one at some point in the future. Thanks. src/hotspot/share/utilities/tribool.hpp line 44: > 42: TriBool() : _value(0) {} > 43: TriBool(bool value) : _value(value) { > 44: _value = _value | 2; // set to not-default Why not `|=` here? ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14892#pullrequestreview-1550935419 PR Review Comment: https://git.openjdk.org/jdk/pull/14892#discussion_r1276877688 From dholmes at openjdk.org Thu Jul 27 22:41:52 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 27 Jul 2023 22:41:52 GMT Subject: RFR: 8312623: SA add NestHost and NestMembers attributes when dumping class In-Reply-To: References: Message-ID: On Thu, 27 Jul 2023 16:16:37 GMT, Ashutosh Mehra wrote: >> We need to be sure this works as expected for top-level classes that have no nest members, and deeply nested nest members, plus dynamically injected hidden classes that are nest members. I'm unclear if this is intended to only expose the same details as would be statically defined in the attribute in the classfile? > > @dholmes-ora sorry for responding late. I got sidetracked by some other work. > >> We need to be sure this works as expected for top-level classes that have no nest members, and deeply nested nest members, plus dynamically injected hidden classes that are nest members. > > I am not sure I understand this concern. We are getting nest-host and nest-members from the InstanceKlass. As long as this information is recorded in InstanceKlass, it would work. Can you please elaborate your concern about the cases you feel may not work. > >> I'm unclear if this is intended to only expose the same details as would be statically defined in the attribute in the classfile? > > It is to expose the details as the JVM sees, which may be different from what is statically defined in the classfile if agents are involved. @ashu-mehra you indicated that you had only done two basic manual tests to check the output. You need to check it for the cases that I flagged too. In the VM every top-level class is its own nest-host, but that is not expressed in a classfile attribute (it is just the defined semantics) so displaying this as-if it were an explicit attribute may not be right. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/15005#issuecomment-1654673306 From coleenp at openjdk.org Thu Jul 27 22:46:15 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 22:46:15 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer [v2] In-Reply-To: References: Message-ID: <4k_1XcrxeulBnq4zVHnpSP3osYNC5FXz6UDrMh93PvI=.37f04514-2c00-43c3-ae12-9acf9ea3c473@github.com> On Thu, 27 Jul 2023 21:43:59 GMT, Calvin Cheung wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Calvin's suggestions and remove set_higher_dimension. > > src/hotspot/share/oops/arrayKlass.cpp line 130: > >> 128: check_array_allocation_length(length, arrayOopDesc::max_array_length(T_ARRAY), CHECK_NULL); >> 129: size_t size = objArrayOopDesc::object_size(length); >> 130: ArrayKlass* ak = array_klass(n+dimension(), CHECK_NULL); > > Please add a blank space before and after the '+'. ok > src/hotspot/share/oops/objArrayKlass.cpp line 179: > >> 177: if (length != 0) { >> 178: for (int index = 0; index < length; index++) { >> 179: oop sub_array = ld_klass->multi_allocate(rank-1, &sizes[1], CHECK_NULL); > > Please add a blank space before and after the '-'. 
ok ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15059#discussion_r1276887177 PR Review Comment: https://git.openjdk.org/jdk/pull/15059#discussion_r1276887382 From coleenp at openjdk.org Thu Jul 27 22:46:15 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 22:46:15 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer [v2] In-Reply-To: References: Message-ID: On Thu, 27 Jul 2023 22:41:00 GMT, Coleen Phillimore wrote: >> This is a simple change to make array_klass() return ArrayKlass (the first dimension of TypeArrayKlass is a TypeArrayKlass so can't use ObjArrayKlass), higher_dimension is always an ObjArrayKlass and lower_dimension can be a TypeArrayKlass. The change removes some casts. >> Tested with tier1 linux/macosx/windows on x86 and aarch64. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Calvin's suggestions and remove set_higher_dimension. Thanks for reviewing Calvin. I've rerun the SA tests with your SA suggested changes. ------------- PR Review: https://git.openjdk.org/jdk/pull/15059#pullrequestreview-1550943773 From coleenp at openjdk.org Thu Jul 27 22:46:15 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 22:46:15 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer [v2] In-Reply-To: References: <8XHqsZLQdnHQoxLCByQs0yJ84LmCJkHLpt0OkxoKLq4=.e4a307e1-2cbe-45dc-ac5a-50a287b8c7d6@github.com> Message-ID: On Thu, 27 Jul 2023 21:57:15 GMT, Calvin Cheung wrote: >> Should you be using release_set_higher_dimension in your change @calvinccheung ? It does seem better to not have unsafe alternatives. > > Sure, I can use release_set_higher_dimension in my change. That would be good. Not this change, but the higher and lower dimensions have to change to not be set inside a Mutex so it'll be safer to only have the acquire/release versions. 
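The acquire/release pairing mentioned above can be sketched as follows. This is only an illustration under stated assumptions: `SketchArrayKlass` and its `_higher` field are hypothetical stand-ins built on `std::atomic`, not the HotSpot `ArrayKlass` or its `Atomic` helpers; only the accessor names are borrowed from the thread.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for an array klass with a lazily published
// "next higher dimension" link. Not the real HotSpot class.
struct SketchArrayKlass {
  int dimension;
  std::atomic<SketchArrayKlass*> _higher{nullptr};

  explicit SketchArrayKlass(int d) : dimension(d) {}

  // Reader side: the acquire load pairs with the release store below, so a
  // reader that sees a non-null pointer also sees the pointee's earlier writes.
  SketchArrayKlass* higher_dimension_acquire() const {
    return _higher.load(std::memory_order_acquire);
  }

  // Writer side: publish a fully constructed klass with release semantics.
  void release_set_higher_dimension(SketchArrayKlass* k) {
    _higher.store(k, std::memory_order_release);
  }
};
```

With only this pair available, there is no plain setter that a caller could reach for by accident once the field is no longer protected by a lock.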
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15059#discussion_r1276886493 From coleenp at openjdk.org Thu Jul 27 22:46:15 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 22:46:15 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer [v2] In-Reply-To: References: <8XHqsZLQdnHQoxLCByQs0yJ84LmCJkHLpt0OkxoKLq4=.e4a307e1-2cbe-45dc-ac5a-50a287b8c7d6@github.com> Message-ID: On Thu, 27 Jul 2023 22:37:22 GMT, Coleen Phillimore wrote: >> Sure, I can use release_set_higher_dimension in my change. > > That would be good. Not this change, but the higher and lower dimensions have to change to not be set inside a Mutex so it'll be safer to only have the acquire/release versions. Thank you for noticing the set_higher_dimension function is unused, Dean. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15059#discussion_r1276888078 From coleenp at openjdk.org Thu Jul 27 22:46:15 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 22:46:15 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer [v2] In-Reply-To: References: Message-ID: > This is a simple change to make array_klass() return ArrayKlass (the first dimension of TypeArrayKlass is a TypeArrayKlass so can't use ObjArrayKlass), higher_dimension is always an ObjArrayKlass and lower_dimension can be a TypeArrayKlass. The change removes some casts. > Tested with tier1 linux/macosx/windows on x86 and aarch64. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Calvin's suggestions and remove set_higher_dimension. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/15059/files - new: https://git.openjdk.org/jdk/pull/15059/files/2edd5c48..a9a9eb50 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15059&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15059&range=00-01 Stats: 10 lines in 6 files changed: 0 ins; 1 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/15059.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15059/head:pull/15059 PR: https://git.openjdk.org/jdk/pull/15059 From coleenp at openjdk.org Thu Jul 27 22:50:16 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 22:50:16 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v6] In-Reply-To: References: Message-ID: > Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. > Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: - David suggestion - David suggestion ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14892/files - new: https://git.openjdk.org/jdk/pull/14892/files/d380e8ef..288b47c2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14892&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14892&range=04-05 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14892.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14892/head:pull/14892 PR: https://git.openjdk.org/jdk/pull/14892 From coleenp at openjdk.org Thu Jul 27 22:50:16 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Thu, 27 Jul 2023 22:50:16 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v5] In-Reply-To: References: <3FqjCTXftM_LFFk7QPSfriRnvmjrAud3dhLN6uWRFr8=.1609d013-2684-40cb-a74f-a65011e4732a@github.com> Message-ID: 
<5ZUYaViHUGcj3oIrwEp2K6U8yiGB6PASDxP1z7LuIqo=.867bd9e5-8c13-460d-95a7-851c6e143417@github.com> On Thu, 27 Jul 2023 22:26:13 GMT, David Holmes wrote: >> Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: >> >> Back to V1. > > src/hotspot/share/utilities/tribool.hpp line 44: > >> 42: TriBool() : _value(0) {} >> 43: TriBool(bool value) : _value(value) { >> 44: _value = _value | 2; // set to not-default > > Why not `|=` here? I was trying different versions to see if |= would give a warning. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14892#discussion_r1276891643 From dholmes at openjdk.org Thu Jul 27 22:55:50 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 27 Jul 2023 22:55:50 GMT Subject: RFR: 8313202: MutexLocker should disallow null Mutexes In-Reply-To: References: Message-ID: On Wed, 26 Jul 2023 17:06:02 GMT, Aleksey Shipilev wrote: > As seen in [JDK-8313081](https://bugs.openjdk.org/browse/JDK-8313081), it is fairly easy to pass nullptr `Mutex` to `MutexLocker` by accident, which would just silently avoid the lock. > > There are a few places in Hotspot where we pass `nullptr` to simulate re-entrancy and/or conditionally take the lock. Those places can be more explicit, and the default `MutexLocker` can disallow nullptrs for extra safety. > > Open for some bikeshedding on the names of the new `MutexLockers`. Particularly `ReentrantMutexLocker` might lull readers into believing it does safepoint checks on re-entrant "lock", which it actually does not do. > > More thorough testing with different GC/JIT combinations is running now, we might find more issues there. Meanwhile, please comment on the approach. 
> > Additional testing: > - [x] `grep -R "MutexLocker " src/hotspot | grep -i null`, no hits > - [x] `grep -R "MutexLocker " src/hotspot | grep -i ?`, no hits > - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` (re-run in progress) To convey intent we need to, IMO, change the naming style to something that relates to when we lock - the "FooLocker" naming doesn't work well in that regard: Conditional is okay except that "unless already locked" is also a condition - and Reentrant is totally wrong for that case. Something like: - MutexLockWhen for the general predicate case - MutexLockIfNeeded for the already locked (or safepoint?) case. To be clear I very strongly object to ReentrantMutexLocker as a name. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15043#issuecomment-1654682294 PR Comment: https://git.openjdk.org/jdk/pull/15043#issuecomment-1654683573 From dholmes at openjdk.org Thu Jul 27 23:20:40 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 27 Jul 2023 23:20:40 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v6] In-Reply-To: References: Message-ID: On Thu, 27 Jul 2023 22:50:16 GMT, Coleen Phillimore wrote: >> Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. >> Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - David suggestion > - David suggestion Marked as reviewed by dholmes (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/14892#pullrequestreview-1550988170 From dholmes at openjdk.org Thu Jul 27 23:24:51 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 27 Jul 2023 23:24:51 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer [v2] In-Reply-To: References: <8XHqsZLQdnHQoxLCByQs0yJ84LmCJkHLpt0OkxoKLq4=.e4a307e1-2cbe-45dc-ac5a-50a287b8c7d6@github.com> Message-ID: On Thu, 27 Jul 2023 22:39:10 GMT, Coleen Phillimore wrote: >> That would be good. Not this change, but the higher and lower dimensions have to change to not be set inside a Mutex so it'll be safer to only have the acquire/release versions. > > Thank you for noticing the set_higher_dimension function is unused, Dean. But Calvin wants it and doesn't need the release version. Using release where not needed just causes confusion because it screams out "something lock-free and concurrent is happening here", which is not the case in Calvin's code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15059#discussion_r1276921700 From duke at openjdk.org Fri Jul 28 03:43:33 2023 From: duke at openjdk.org (sid8606) Date: Fri, 28 Jul 2023 03:43:33 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v4] In-Reply-To: References: Message-ID: > Implementation of "Foreign Function & Memory API" for s390x (Big Endian). 
sid8606 has updated the pull request incrementally with one additional commit since the last revision: revert platform byte ordering changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14801/files - new: https://git.openjdk.org/jdk/pull/14801/files/f916864d..75229b4c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=02-03 Stats: 3 lines in 1 file changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14801.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14801/head:pull/14801 PR: https://git.openjdk.org/jdk/pull/14801 From duke at openjdk.org Fri Jul 28 03:50:18 2023 From: duke at openjdk.org (sid8606) Date: Fri, 28 Jul 2023 03:50:18 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v5] In-Reply-To: References: Message-ID: > Implementation of "Foreign Function & Memory API" for s390x (Big Endian). sid8606 has updated the pull request incrementally with one additional commit since the last revision: remove if statements around the spill/fill code See: #15025 (should_save_return_value being dependent on _needs_transition is a bug). 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/14801/files - new: https://git.openjdk.org/jdk/pull/14801/files/75229b4c..107b971f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=03-04 Stats: 15 lines in 1 file changed: 0 ins; 15 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14801.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14801/head:pull/14801 PR: https://git.openjdk.org/jdk/pull/14801 From dlong at openjdk.org Fri Jul 28 03:56:44 2023 From: dlong at openjdk.org (Dean Long) Date: Fri, 28 Jul 2023 03:56:44 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v6] In-Reply-To: References: Message-ID: On Thu, 27 Jul 2023 22:50:16 GMT, Coleen Phillimore wrote: >> Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. >> Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - David suggestion > - David suggestion This version still looks good. If you want to do the |= in only one place, you could do it in TriBool operator = and have the other two places use TriBool operator =. 
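A minimal sketch of that idea, assuming a simplified stand-in class: `SketchTriBool` is hypothetical and is not the actual tribool.hpp, and whether a particular compiler stays quiet under -Wconversion for it has not been verified here.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for a two-bit tri-state boolean.
// Bit 0 holds the boolean value, bit 1 is the "not default" flag.
class SketchTriBool {
  unsigned int _value : 2;

 public:
  SketchTriBool() : _value(0) {}

  // The constructor delegates to operator=, so the flag is set in one place.
  SketchTriBool(bool value) : _value(0) { *this = value; }

  SketchTriBool& operator=(bool value) {
    _value = value;  // step 1: store the boolean
    _value |= 2;     // step 2: mark as not-default, kept as a separate step
    return *this;
  }

  bool is_default() const { return (_value >> 1) == 0; }
  operator bool() const { return (_value & 1) != 0; }
};
```

A default-constructed `SketchTriBool` reports `is_default()`, while one built from `true` or `false` does not and converts back to the stored boolean.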
src/hotspot/share/utilities/tribool.hpp line 45: > 43: TriBool(bool value) : _value(value) { > 44: // set to not-default in separate step to avoid conversion warnings > 45: _value |= 2; Suggestion: TriBool(bool value) { *this = value; src/hotspot/share/utilities/tribool.hpp line 81: > 79: _slot ^= ((u1)_value) << _offset; // reset the tribool > 80: _value = newval; > 81: _value |= 2; // set to not-default Suggestion: TriBool::operator=(newval); ------------- PR Comment: https://git.openjdk.org/jdk/pull/14892#issuecomment-1654958860 PR Review Comment: https://git.openjdk.org/jdk/pull/14892#discussion_r1277063600 PR Review Comment: https://git.openjdk.org/jdk/pull/14892#discussion_r1277063977 From duke at openjdk.org Fri Jul 28 03:59:26 2023 From: duke at openjdk.org (sid8606) Date: Fri, 28 Jul 2023 03:59:26 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v6] In-Reply-To: References: Message-ID: > Implementation of "Foreign Function & Memory API" for s390x (Big Endian). 
sid8606 has updated the pull request incrementally with one additional commit since the last revision: Preserve and restore register Z_R6 Though Z_R6 is argument register it is a saved register so preserve and restore Z_R6 register ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14801/files - new: https://git.openjdk.org/jdk/pull/14801/files/107b971f..cc2292dd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=04-05 Stats: 13 lines in 2 files changed: 3 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/14801.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14801/head:pull/14801 PR: https://git.openjdk.org/jdk/pull/14801 From duke at openjdk.org Fri Jul 28 04:00:46 2023 From: duke at openjdk.org (sid8606) Date: Fri, 28 Jul 2023 04:00:46 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Thu, 27 Jul 2023 05:05:57 GMT, sid8606 wrote: >> src/hotspot/cpu/s390/downcallLinker_s390.cpp line 162: >> >>> 160: >>> 161: assert(!_needs_return_buffer, "unexpected needs_return_buffer"); >>> 162: bool should_save_return_value = _needs_transition;; >> >> This should always be `true`, so I don't think you need the `if` statements around the spill/fill code below. >> >> See: https://github.com/openjdk/jdk/pull/15025 (`should_save_return_value` being dependent on `_needs_transition` is a bug). > > Thank you, I'll make changes. 
Made the changes in a new commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1277065390 From duke at openjdk.org Fri Jul 28 04:04:02 2023 From: duke at openjdk.org (sid8606) Date: Fri, 28 Jul 2023 04:04:02 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: <5ifOngKE41_DMQw35KiJQzf8L1MP_smglYyyNCipf_k=.5c7aad97-0615-4ef4-a61a-98d3ecb282b3@github.com> References: <5ifOngKE41_DMQw35KiJQzf8L1MP_smglYyyNCipf_k=.5c7aad97-0615-4ef4-a61a-98d3ecb282b3@github.com> Message-ID: On Fri, 21 Jul 2023 12:17:55 GMT, Martin Doerr wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Address suggestions from Jorn Vernee > > I can see it failing with `make run-test TEST="jdk/java/foreign"` on g9-z15, glibc 2.31 (patch applied to jdk head). @TheRealMDoerr I have fixed the StdLibTest.java failures on glibc 2.31 in a new commit. I have tested on my end and it now passes.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14801#issuecomment-1654964500 From jwaters at openjdk.org Fri Jul 28 07:32:04 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 28 Jul 2023 07:32:04 GMT Subject: RFR: 8313302: Fix formatting errors on Windows Message-ID: Fix several formatting errors on Windows ------------- Commit messages: - 8313302 Changes: https://git.openjdk.org/jdk/pull/15063/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15063&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8313302 Stats: 128 lines in 14 files changed: 0 ins; 0 del; 128 mod Patch: https://git.openjdk.org/jdk/pull/15063.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15063/head:pull/15063 PR: https://git.openjdk.org/jdk/pull/15063 From jwaters at openjdk.org Fri Jul 28 07:43:04 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 28 Jul 2023 07:43:04 GMT Subject: RFR: 8313302: Fix formatting errors on Windows [v2] In-Reply-To: References: Message-ID: > Fix several formatting errors on Windows Julian Waters has updated the pull request incrementally with one additional commit since the last revision: Bug ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15063/files - new: https://git.openjdk.org/jdk/pull/15063/files/39548d5c..d6bff320 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15063&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15063&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/15063.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15063/head:pull/15063 PR: https://git.openjdk.org/jdk/pull/15063 From jwaters at openjdk.org Fri Jul 28 07:50:19 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 28 Jul 2023 07:50:19 GMT Subject: RFR: 8313302: Fix formatting errors on Windows [v3] In-Reply-To: References: Message-ID: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> > 
Fix several formatting errors on Windows Julian Waters has updated the pull request incrementally with one additional commit since the last revision: zPhysicalMemoryBacking_windows.cpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15063/files - new: https://git.openjdk.org/jdk/pull/15063/files/d6bff320..db9102a2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15063&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15063&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/15063.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15063/head:pull/15063 PR: https://git.openjdk.org/jdk/pull/15063 From mbaesken at openjdk.org Fri Jul 28 08:17:11 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 28 Jul 2023 08:17:11 GMT Subject: RFR: JDK-8313251: Add NativeLibraryLoad event Message-ID: Add a NativeLibraryLoad event that provides us more detail about shared lib/dll loads. This gives a time stamp and success + error details of the load operation. It enhances the already existing information we get from the existing NativeLibrary event (that periodically samples the native modules of the jvm process). ------------- Commit messages: - JDK-8313251 Changes: https://git.openjdk.org/jdk/pull/15065/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15065&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8313251 Stats: 186 lines in 9 files changed: 186 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/15065.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15065/head:pull/15065 PR: https://git.openjdk.org/jdk/pull/15065 From mbaesken at openjdk.org Fri Jul 28 09:38:13 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 28 Jul 2023 09:38:13 GMT Subject: RFR: JDK-8313251: Add NativeLibraryLoad event [v2] In-Reply-To: References: Message-ID: > Add a NativeLibraryLoad event that provides us more detail about shared lib/dll loads. 
This gives a time stamp and success + error details of the load operation. It enhances the already existing information we get from the existing NativeLibrary event (that periodically samples the native modules of the jvm process). Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: add macro guards because the build errors in zero build ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15065/files - new: https://git.openjdk.org/jdk/pull/15065/files/eb87bf2c..a934d3f7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15065&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15065&range=00-01 Stats: 38 lines in 4 files changed: 29 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/15065.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15065/head:pull/15065 PR: https://git.openjdk.org/jdk/pull/15065 From stuefe at openjdk.org Fri Jul 28 09:51:56 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 28 Jul 2023 09:51:56 GMT Subject: RFR: 8313302: Fix formatting errors on Windows [v3] In-Reply-To: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> References: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> Message-ID: On Fri, 28 Jul 2023 07:50:19 GMT, Julian Waters wrote: >> Fix several formatting errors on Windows > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > zPhysicalMemoryBacking_windows.cpp Hi Julian, I'm sorry, but I object to this change. It is another heap of aesthetic changes, with very very few changes actually needed (see inline remarks). Contrary to the title, it also touches shared code, and there are (almost) no errors to fix. And the one arguably incorrect place, in code heap, is not a "formatting error on windows". 
Please consider that your aesthetic taste is not shared by everyone, but changes like these cause a lot of work for others. Cheers, Thomas src/hotspot/os/windows/gc/x/xMapper_windows.cpp line 118: > 116: const uintptr_t addr = map_view_no_placeholder(file_handle, file_offset, size); > 117: if (addr == 0) { > 118: log_error(gc)("Failed to map view of paging file mapping (%ld)", GetLastError()); Here, and in many other places: All these `%d` -> `%ld` changes are questionable. Why do you think `l` is more correct? Strictly speaking, both the existing and your versions are incorrect, since GetLastError returns a DWORD, which is an unsigned 32-bit integer. Were I to change anything, I would print an unsigned integer. But I would not change anything, since all documented error codes are well below 0x8000_0000. I think we are good here. src/hotspot/os/windows/gc/x/xMapper_windows.cpp line 253: > 251: > 252: if (!res) { > 253: fatal("Failed to unreserve memory: " INTPTR_FORMAT " " SIZE_FORMAT "M (%ld)", Here, and in many other places: PTR_FORMAT -> INTPTR_FORMAT is pointless, since both macros are the same. We use PTR_FORMAT in many many places to print pointers. I prefer it to INTPTR_FORMAT, since it does not convey the assumption of a type. After all, the type does not matter: we just print the pointer as a raw hex value. Also, INTPTR suggests intptr_t which suggests signedness, which has no place here. *If* we want to change this, we should first agree on the INTPTR_FORMAT vs PTR_FORMAT difference. And why we use intptr_t in many places that actually call for a uintptr_t. But I would not change it. There is no need. src/hotspot/os/windows/gc/z/zMapper_windows.cpp line 267: > 265: > 266: if (!res) { > 267: fatal_error("Failed to split placeholder", untype(addr), size); Please don't. `untype` does an assertion check. We don't want that here, when constructing arguments for a fatal error message.
src/hotspot/os/windows/osThread_windows.hpp line 30: > 28: typedef void* HANDLE; > 29: public: > 30: typedef unsigned int thread_id_t; Since Windows is LLP64, this has no effect, even though beginthreadex returns an unsigned int. src/hotspot/os/windows/os_windows.cpp line 3382: > 3380: if (Verbose && PrintMiscellaneous) { > 3381: reserveTimer.stop(); > 3382: tty->print_cr("reserve_memory of %zx bytes took " JLONG_FORMAT " ms (" JLONG_FORMAT " ticks)", bytes, SIZE_FORMAT. src/hotspot/os/windows/perfMemory_windows.cpp line 259: > 257: if (PrintMiscellaneous && Verbose) { > 258: warning("%s is not a directory, file attributes = " > 259: "0x%08lx\n", path, fa); UINT32_FORMAT src/hotspot/os/windows/symbolengine.cpp line 635: > 633: } else { > 634: st->print("initialized successfully"); > 635: st->print(" - sym options: 0x%lX", WindowsDbgHelp::symGetOptions()); No. SymGetOptions returns a DWORD, X is correct here. src/hotspot/os_cpu/windows_x86/os_windows_x86.cpp line 422: > 420: st->print(", RBX=0x%016llx", uc->Rbx); > 421: st->print(", RCX=0x%016llx", uc->Rcx); > 422: st->print(", RDX=0x%016llx", uc->Rdx); Why? src/hotspot/share/code/codeHeapState.cpp line 1334: > 1332: if (SizeDistributionArray != nullptr) { > 1333: unsigned long total_count = 0; > 1334: uint64_t total_size = 0; This is a functional change (actually the only one I could spot), and it increases the size of this type to 64-bit on Windows. It may also be incorrect, since on 32-bit platforms you probably don't want 64-bit here. The correct type to use would be size_t. Since this change is hidden in a deluge of unnecessary cosmetic changes, if we feel this is needed, this should be addressed in a separate RFE. But then again, probably not: code cache size is limited to 32-bit, and I bet we have a lot more places to change if we wanted to change that. Also way out of scope for the RFE description. 
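The format-specifier remarks above (print an unsigned 32-bit value as unsigned, use size_t for sizes) can be sketched in portable C++ roughly as follows. This is only an illustration under stated assumptions: it uses the standard `<cinttypes>` macros and `%zu` as stand-ins for HotSpot's own UINT32_FORMAT and SIZE_FORMAT macros, and the helper names `format_error_code`, `format_size` and `format_usage` are hypothetical, not taken from the patch.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a 32-bit unsigned error code (the width of a
// Windows DWORD) without depending on whether the platform spells that type
// as unsigned int or unsigned long.
std::string format_error_code(std::uint32_t code) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "error %" PRIu32 " (0x%08" PRIx32 ")", code, code);
  return std::string(buf);
}

// Hypothetical helper: formats a byte count held in size_t, whose width
// follows the target (32-bit on 32-bit targets, 64-bit on 64-bit targets).
std::string format_size(std::size_t bytes) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%zu bytes", bytes);
  return std::string(buf);
}

// Small usage check: values with the top bit set still print as unsigned.
void format_usage() {
  assert(format_error_code(5) == "error 5 (0x00000005)");
  assert(format_error_code(0x80000000u) == "error 2147483648 (0x80000000)");
  assert(format_size(4096) == "4096 bytes");
}
```

Taking the value as a fixed-width parameter means the caller's type (unsigned int or unsigned long) is converted once at the call boundary, so the format string no longer needs a per-platform length modifier.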
src/hotspot/share/code/codeHeapState.cpp line 1387: > 1385: " occupied space) is used by the blocks in the given size range.\n" > 1386: " %ld characters are printed per percentage point.\n", pctFactor/100); > 1387: ast->print_cr("total size of all blocks: " INT64_FORMAT_W(7) "M", (total_size< References: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> Message-ID: <6B76iPop_OLgdGhVzafFiOMStv1CAVVQZ4z7AjxOPE0=.1585baa3-dd5d-4778-8e9a-772846ae6c3f@github.com> On Fri, 28 Jul 2023 09:46:47 GMT, Thomas Stuefe wrote: > Hi Julian, > > I'm sorry, but I object to this change. > > It is another heap of aesthetic changes, with very very few changes actually needed (see inline remarks). Contrary to the title, it also touches shared code, and there are (almost) no errors to fix. And the one arguably incorrect place, in code heap, is not a "formatting error on windows". > > Please consider that your aesthetic taste is not shared by everyone, but changes like these cause a lot of work for others. > > Cheers, Thomas Hi Thomas, Sigh, I was afraid of that. I've not been putting out good quality changes lately :( I will mention that these are not merely aesthetic changes however, but actually is an effort spawned out of JDK-8288293, since all of these are flagged as formatting errors. It may be the case that I'm putting a bit too much faith in gcc's format checking, but I do want to discuss the rest in the review comments since it is a bit easier > src/hotspot/os/windows/gc/x/xMapper_windows.cpp line 118: > >> 116: const uintptr_t addr = map_view_no_placeholder(file_handle, file_offset, size); >> 117: if (addr == 0) { >> 118: log_error(gc)("Failed to map view of paging file mapping (%ld)", GetLastError()); > > Here, and in many other places: > > All these `%d` -> `%ld` changes are questionable. Why do you think `l` is more correct? 
> > Strictly spoken both existing and your versions are incorrect since GetLastError returns a DWORD, which is an unsigned 32-bit integer. > > Were I to change anything, I would print an unsigned integer. But I would not change anything, since all documented error codes are well below 0x8000_0000. I think we are good here. DWORD is defined as an unsigned long on all Windows compilers, which more accurately maps to %ld under C++ rules. This was more for correctness, but may not be strictly needed ------------- PR Comment: https://git.openjdk.org/jdk/pull/15063#issuecomment-1655418084 PR Review Comment: https://git.openjdk.org/jdk/pull/15063#discussion_r1277359872 From jwaters at openjdk.org Fri Jul 28 10:11:44 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 28 Jul 2023 10:11:44 GMT Subject: RFR: 8313302: Fix formatting errors on Windows [v3] In-Reply-To: References: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> Message-ID: On Fri, 28 Jul 2023 09:24:33 GMT, Thomas Stuefe wrote: >> Julian Waters has updated the pull request incrementally with one additional commit since the last revision: >> >> zPhysicalMemoryBacking_windows.cpp > > src/hotspot/os/windows/osThread_windows.hpp line 30: > >> 28: typedef void* HANDLE; >> 29: public: >> 30: typedef unsigned int thread_id_t; > > Since Windows is LLP64, this has no effect, even though beginthreadex returns an unsigned int. It's also another formatting change, as this is printed as %d in shared code. This also matches the definition of int on other platforms. 
This should functionally be harmless, because ints are promoted to longs (DWORD) whenever thread_id_t is passed to a callee that expects a DWORD, and since longs are the same size as ints this is effectively an implicit cast with no effect in those cases > src/hotspot/os/windows/perfMemory_windows.cpp line 259: > >> 257: if (PrintMiscellaneous && Verbose) { >> 258: warning("%s is not a directory, file attributes = " >> 259: "0x%08lx\n", path, fa); > > UINT32_FORMAT Noted, if this is accepted > src/hotspot/os_cpu/windows_x86/os_windows_x86.cpp line 422: > >> 420: st->print(", RBX=0x%016llx", uc->Rbx); >> 421: st->print(", RCX=0x%016llx", uc->Rcx); >> 422: st->print(", RDX=0x%016llx", uc->Rdx); > > Why? All of these are either DWORD64 or DWORD, none are actually intptr_t's or uintptr_t's > src/hotspot/share/code/codeHeapState.cpp line 1334: > >> 1332: if (SizeDistributionArray != nullptr) { >> 1333: unsigned long total_count = 0; >> 1334: uint64_t total_size = 0; > > This is a functional change (actually the only one I could spot), and it increases the size of this type to 64-bit on Windows. > > It may also be incorrect, since on 32-bit platforms you probably don't want 64-bit here. The correct type to use would be size_t. > > Since this change is hidden in a deluge of unnecessary cosmetic changes, if we feel this is needed, this should be addressed in a separate RFE. > > But then again, probably not: code cache size is limited to 32-bit, and I bet we have a lot more places to change if we wanted to change that. > > Also way out of scope for the RFE description. For this one (and the one below) I couldn't figure out whether the intention was for a 32 or 64 bit number in this case. This would be 64 bit on every platform but on Windows, which is 32 bit, and the code directly below this also faces the same issue too, is this intentional? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15063#discussion_r1277362853 PR Review Comment: https://git.openjdk.org/jdk/pull/15063#discussion_r1277363449 PR Review Comment: https://git.openjdk.org/jdk/pull/15063#discussion_r1277364035 PR Review Comment: https://git.openjdk.org/jdk/pull/15063#discussion_r1277365355 From stuefe at openjdk.org Fri Jul 28 10:25:41 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 28 Jul 2023 10:25:41 GMT Subject: RFR: 8313302: Fix formatting errors on Windows [v3] In-Reply-To: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> References: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> Message-ID: On Fri, 28 Jul 2023 07:50:19 GMT, Julian Waters wrote: >> Fix several formatting errors on Windows > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > zPhysicalMemoryBacking_windows.cpp > > Hi Julian, > > I'm sorry, but I object to this change. > > It is another heap of aesthetic changes, with very very few changes actually needed (see inline remarks). Contrary to the title, it also touches shared code, and there are (almost) no errors to fix. And the one arguably incorrect place, in code heap, is not a "formatting error on windows". > > Please consider that your aesthetic taste is not shared by everyone, but changes like these cause a lot of work for others. > > Cheers, Thomas > > Hi Thomas, > > Sigh, I was afraid of that. I've not been putting out good quality changes lately :( I will mention that these are not merely aesthetic changes however, but actually is an effort spawned out of JDK-8288293, since all of these are flagged as formatting errors. 
It may be the case that I'm putting a bit too much faith in gcc's format checking, but I do want to discuss the rest in the review comments since it is a bit easier So the point of this change is to satisfy gcc on Windows? Accommodating a new build platform, making (and keeping!) it warning-free is a considerable effort. Even if you do it, it has a lot of side effects on others: reviewer churn, accidental bugs, makes backports more difficult... For smallish things it's okay, but if it keeps causing massive changes like this one we should discuss this first. In my eyes, this is similar to adding a new platform, for which the bar is very high (it is technically less complex than a new platform, but OTOH it is also less isolated). As a side note, I think I mentioned this before, but it would be very helpful if you would put more love into your JBS issue descriptions and texts. Cheers, Thomas ------------- PR Comment: https://git.openjdk.org/jdk/pull/15063#issuecomment-1655447802 From stuefe at openjdk.org Fri Jul 28 10:34:51 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 28 Jul 2023 10:34:51 GMT Subject: RFR: 8313302: Fix formatting errors on Windows [v3] In-Reply-To: References: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> Message-ID: On Fri, 28 Jul 2023 10:06:01 GMT, Julian Waters wrote: >> src/hotspot/os/windows/osThread_windows.hpp line 30: >> >>> 28: typedef void* HANDLE; >>> 29: public: >>> 30: typedef unsigned int thread_id_t; >> >> Since Windows is LLP64, this has no effect, even though beginthreadex returns an unsigned int. > > It's also another formatting change, as this is printed as %d in shared code. This also matches the definition of int on other platforms.
This should functionally be harmless, because ints are promoted to longs (DWORD) whenever thread_id_t is passed to a callee that expects a DWORD, and since longs are the same size as ints this is effectively an implicit cast with no effect in those cases I can live with this change. It's also small and isolated. >> src/hotspot/os_cpu/windows_x86/os_windows_x86.cpp line 422: >> >>> 420: st->print(", RBX=0x%016llx", uc->Rbx); >>> 421: st->print(", RCX=0x%016llx", uc->Rcx); >>> 422: st->print(", RDX=0x%016llx", uc->Rdx); >> >> Why? > > All of these are either DWORD64 or DWORD, none are actually intptr_t's or uintptr_t's In that case UINT64_FORMAT_X would be more correct. But again, unless there is an important reason to do so, I would leave changing this to when someone changes/moves around the code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15063#discussion_r1277382451 PR Review Comment: https://git.openjdk.org/jdk/pull/15063#discussion_r1277384588 From jwaters at openjdk.org Fri Jul 28 10:40:52 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 28 Jul 2023 10:40:52 GMT Subject: RFR: 8313302: Fix formatting errors on Windows [v3] In-Reply-To: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> References: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> Message-ID: On Fri, 28 Jul 2023 07:50:19 GMT, Julian Waters wrote: >> Fix several formatting errors on Windows > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > zPhysicalMemoryBacking_windows.cpp Hi Thomas, > So the point of this change is to satisfy gcc on Windows? Accommodating a new build platform, making (and keeping!) it warning-free is a considerable effort. Even if you do it, it has a lot of side effects on others: reviewer churn, accidental bugs, makes backports more difficult...
Not really, I was perfectly capable of solving these issues by silencing the error checker in compilerWarnings_gcc.hpp (see ATTRIBUTE_PRINTF and ATTRIBUTE_SCANF), I decided not to do so since I believed these were reasonable changes as the time to fix the formatting on Windows, I was not aware of the actual scale the finished change would be on. I can retract this change if need be, it's not critical to the Project > For smallish things its okay, but if it keeps causing massive changes like this one we should discuss this first. In my eyes, this is similar to adding a new platform, for which the bar is very high (it is technically less complex than a new platform, but OTOH it is also less isolated). Where would changes like these be discussed? As far as I know, I'm really the only one working on a Project like this. As a side note, the actual changes to HotSpot to get it compiling on gcc are actually minimal, this is technically not one of those changes, and a good chunk of them have already been integrated into mainline Should I take the rest of this to build-dev? Thanks for your time, Julian ------------- PR Comment: https://git.openjdk.org/jdk/pull/15063#issuecomment-1655466004 From stuefe at openjdk.org Fri Jul 28 10:40:54 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 28 Jul 2023 10:40:54 GMT Subject: RFR: 8313302: Fix formatting errors on Windows [v3] In-Reply-To: References: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> Message-ID: On Fri, 28 Jul 2023 10:08:56 GMT, Julian Waters wrote: >> src/hotspot/share/code/codeHeapState.cpp line 1334: >> >>> 1332: if (SizeDistributionArray != nullptr) { >>> 1333: unsigned long total_count = 0; >>> 1334: uint64_t total_size = 0; >> >> This is a functional change (actually the only one I could spot), and it increases the size of this type to 64-bit on Windows. >> >> It may also be incorrect, since on 32-bit platforms you probably don't want 64-bit here. 
The correct type to use would be size_t. >> >> Since this change is hidden in a deluge of unnecessary cosmetic changes, if we feel this is needed, this should be addressed in a separate RFE. >> >> But then again, probably not: code cache size is limited to 32-bit, and I bet we have a lot more places to change if we wanted to change that. >> >> Also way out of scope for the RFE description. > > For this one (and the one below) I couldn't figure out whether the intention was for a 32 or 64 bit number in this case. This would be 64 bit on every platform but on Windows, which is 32 bit, and the code directly below this also faces the same issue too, is this intentional? long is 32-bit on Linux x86 and arm32 too. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15063#discussion_r1277388936 From clanger at openjdk.org Fri Jul 28 10:52:15 2023 From: clanger at openjdk.org (Christoph Langer) Date: Fri, 28 Jul 2023 10:52:15 GMT Subject: RFR: 8313316: Disable runtime/ErrorHandling/MachCodeFramesInErrorFile.java on ppc64le Message-ID: Exclude the test on ppc64le due to failure. ------------- Commit messages: - JDK-8313316 Changes: https://git.openjdk.org/jdk/pull/15068/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15068&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8313316 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/15068.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15068/head:pull/15068 PR: https://git.openjdk.org/jdk/pull/15068 From mbaesken at openjdk.org Fri Jul 28 11:31:48 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 28 Jul 2023 11:31:48 GMT Subject: RFR: 8313316: Disable runtime/ErrorHandling/MachCodeFramesInErrorFile.java on ppc64le In-Reply-To: References: Message-ID: On Fri, 28 Jul 2023 10:42:10 GMT, Christoph Langer wrote: > Exclude the test on ppc64le due to failure. Marked as reviewed by mbaesken (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/15068#pullrequestreview-1551928293 From coleenp at openjdk.org Fri Jul 28 12:08:20 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 28 Jul 2023 12:08:20 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer [v3] In-Reply-To: References: Message-ID: > This is a simple change to make array_klass() return ArrayKlass (the first dimension of TypeArrayKlass is a TypeArrayKlass so can't use ObjArrayKlass), higher_dimension is always an ObjArrayKlass and lower_dimension can be a TypeArrayKlass. The change removes some casts. > Tested with tier1 linux/macosx/windows on x86 and aarch64. Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: Add back set_higher_dimension ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15059/files - new: https://git.openjdk.org/jdk/pull/15059/files/a9a9eb50..33a29f45 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15059&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15059&range=01-02 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/15059.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15059/head:pull/15059 PR: https://git.openjdk.org/jdk/pull/15059 From coleenp at openjdk.org Fri Jul 28 12:08:20 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 28 Jul 2023 12:08:20 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer [v3] In-Reply-To: References: <8XHqsZLQdnHQoxLCByQs0yJ84LmCJkHLpt0OkxoKLq4=.e4a307e1-2cbe-45dc-ac5a-50a287b8c7d6@github.com> Message-ID: On Thu, 27 Jul 2023 23:21:37 GMT, David Holmes wrote: >> Thank you for noticing the set_higher_dimension function is unused, Dean. > > But Calvin wants it and doesn't need the release version. 
Using release where not needed just causes confusion because it screams out "something lock-free and concurrent is happening here", which is not the case in Calvin's code. I can add it back and we can examine it when we fix JDK-8308745. Maybe add an assert or something that the vm is single threaded. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15059#discussion_r1277459290 From coleenp at openjdk.org Fri Jul 28 12:10:50 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 28 Jul 2023 12:10:50 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer [v2] In-Reply-To: References: Message-ID: On Thu, 27 Jul 2023 22:46:15 GMT, Coleen Phillimore wrote: >> This is a simple change to make array_klass() return ArrayKlass (the first dimension of TypeArrayKlass is a TypeArrayKlass so can't use ObjArrayKlass), higher_dimension is always an ObjArrayKlass and lower_dimension can be a TypeArrayKlass. The change removes some casts. >> Tested with tier1 linux/macosx/windows on x86 and aarch64. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Calvin's suggestions and remove set_higher_dimension. Thanks for reviewing Dean. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15059#issuecomment-1655572968 From coleenp at openjdk.org Fri Jul 28 12:11:59 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 28 Jul 2023 12:11:59 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v6] In-Reply-To: References: Message-ID: <9Mmh1HMBvK8thMrL88rLdpQAsy8FNUdAu3ZR_tdFh9k=.ec90f5cd-2a39-46ba-8304-11c7d0a1ec92@github.com> On Thu, 27 Jul 2023 22:50:16 GMT, Coleen Phillimore wrote: >> Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. 
>> Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug > > Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: > > - David suggestion > - David suggestion Thanks for all the work reviewing Dean and David. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14892#issuecomment-1655571656 From coleenp at openjdk.org Fri Jul 28 12:12:02 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 28 Jul 2023 12:12:02 GMT Subject: RFR: 8312121: Fix -Wconversion warnings in tribool.hpp [v6] In-Reply-To: References: Message-ID: On Fri, 28 Jul 2023 03:53:29 GMT, Dean Long wrote: >> Coleen Phillimore has updated the pull request incrementally with two additional commits since the last revision: >> >> - David suggestion >> - David suggestion > > src/hotspot/share/utilities/tribool.hpp line 45: > >> 43: TriBool(bool value) : _value(value) { >> 44: // set to not-default in separate step to avoid conversion warnings >> 45: _value |= 2; > > Suggestion: > > TriBool(bool value) { > *this = value; Thanks for the suggestion but this makes me squint to try to figure out what it does and why it's correct. > src/hotspot/share/utilities/tribool.hpp line 81: > >> 79: _slot ^= ((u1)_value) << _offset; // reset the tribool >> 80: _value = newval; >> 81: _value |= 2; // set to not-default > > Suggestion: > > TriBool::operator=(newval); I don't think this is more readable. I believe you that it doesn't give -Wconversion warnings though. 
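For readers following the TriBool exchange above, here is a minimal standalone sketch of the two-step pattern from the patch: initialize the bit-field from the bool first, then set the "not default" bit separately. It is a simplified stand-in, not the real tribool.hpp; the names `MiniTriBool`, `as_bool` and `tribool_usage` are hypothetical, and whether a given compiler stays quiet under -Wconversion is what the thread reports, not something this sketch proves.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the class discussed in the thread.
// Bit 0 holds the boolean value; bit 1 marks "explicitly set" (not default).
class MiniTriBool {
  unsigned int _value : 2;

 public:
  MiniTriBool() : _value(0) {}

  MiniTriBool(bool value) : _value(value) {
    // Set the not-default bit in a separate step, as in the reviewed patch.
    _value |= 2;
  }

  bool is_default() const { return _value == 0; }
  bool as_bool() const { return (_value & 1) != 0; }
};

// Small usage check covering the three states: default, false, true.
void tribool_usage() {
  MiniTriBool unset;
  MiniTriBool no(false);
  MiniTriBool yes(true);
  assert(unset.is_default());
  assert(!no.is_default() && !no.as_bool());
  assert(!yes.is_default() && yes.as_bool());
}
```

The point of the separate bit is that an explicit `false` remains distinguishable from "never set", which a plain bool cannot express.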
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14892#discussion_r1277466616 PR Review Comment: https://git.openjdk.org/jdk/pull/14892#discussion_r1277467452 From coleenp at openjdk.org Fri Jul 28 12:12:02 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 28 Jul 2023 12:12:02 GMT Subject: Integrated: 8312121: Fix -Wconversion warnings in tribool.hpp In-Reply-To: References: Message-ID: On Fri, 14 Jul 2023 19:47:39 GMT, Coleen Phillimore wrote: > Assigning _value first, and then doing _value | 2 doesn't get -Wconversion warnings. Also, reduced include file inclusion a little. > Tested with tier1 on linux-x64-debug, windows-x64-debug, macos-aarch64-debug This pull request has now been integrated. Changeset: 47c4b992 Author: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/47c4b992b44a5ce120aa4fe9e01279d4c52bca0a Stats: 14 lines in 5 files changed: 8 ins; 1 del; 5 mod 8312121: Fix -Wconversion warnings in tribool.hpp Reviewed-by: dlong, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/14892 From coleenp at openjdk.org Fri Jul 28 12:14:56 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 28 Jul 2023 12:14:56 GMT Subject: RFR: 8312190: Fix c++11-narrowing warnings in hotspot code In-Reply-To: <7v2STKkQMw5UNgAjVYlFPMO6BpoTnsXSXkXMNXXSRMg=.fb773240-2143-44d0-819d-ab5db5d70213@github.com> References: <7v2STKkQMw5UNgAjVYlFPMO6BpoTnsXSXkXMNXXSRMg=.fb773240-2143-44d0-819d-ab5db5d70213@github.com> Message-ID: On Wed, 19 Jul 2023 07:49:08 GMT, Daniel Jeliński wrote: >> This patch fixes compilation warnings produced by Clang when compiling on Windows. >> >> Clang emulates MSVC behavior and uses `int` for enumeration types that do not explicitly specify the underlying type. This patch sets an explicit underlying type for 3 enumerations to fix the warnings. >> >> See Microsoft's documentation of [Zc:enumTypes](https://learn.microsoft.com/en-us/cpp/build/reference/zc-enumtypes?view=msvc-170) for more information.
> > Mach5 came back clean. Thanks for the reviews! @djelinski Please review PR https://github.com/openjdk/jdk/pull/15056 and/or test with your compiler, please. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14907#issuecomment-1655578024 From coleenp at openjdk.org Fri Jul 28 12:23:54 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 28 Jul 2023 12:23:54 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform [v4] In-Reply-To: References: Message-ID: <6hoNYr4A3A_b-4Jrdtny-m-YjCn-ytF36NaZfQh4o98=.b58935e9-ee7f-4427-bf88-5117ddc65296@github.com> On Tue, 18 Jul 2023 16:48:55 GMT, Jean-Philippe Bempel wrote: >> Fix a small leak in constant pool merging during retransformation of a class. If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in a unresolved state. So the constant pool merging process was considering the entry with pre-resolved class as different compared to the destination and create a new entry. We now try to consider it as equal specially for Methodref/Fieldref. > > Jean-Philippe Bempel has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > Revert resolved class to unresolved for comparison > > remove is_unresolved_class_mismatch I have a suggestion for your test and copyright fix. test/hotspot/jtreg/serviceability/jvmti/RedefineClasses/RedefineLeakThrowable.java line 2: > 1: /* > 2: * Copyright (c) 2016, 2023, Oracle and/or its affiliates. All rights reserved. Please remove the 2016 copyright on the new test. 
test/hotspot/jtreg/serviceability/jvmti/RedefineClasses/RedefineLeakThrowable.java line 58: > 56: } > 57: } > 58: } If you look in that directory, there's an even simpler way to redefine classes without the Transformer and agent code. It looks like this: RedefineClassHelper.redefineClass(RedefineRunningMethods_B.class, evenNewerB); ------------- Changes requested by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14780#pullrequestreview-1552009914 PR Review Comment: https://git.openjdk.org/jdk/pull/14780#discussion_r1277479719 PR Review Comment: https://git.openjdk.org/jdk/pull/14780#discussion_r1277482646 From fbredberg at openjdk.org Fri Jul 28 12:30:15 2023 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Fri, 28 Jul 2023 12:30:15 GMT Subject: RFR: 8308984: Relativize last_sp (and top_frame_sp) in interpreter frames Message-ID: Implementation of relativized last_sp (top_frame_sp on PowerPC) in interpreter frames for x64, aarch64, ppc64le and riscv. Not relativized last_sp on arm, zero and s390 but done some changes to cope with the changed generic code. By changing the "last_sp" member in interpreter frames from being an absolute address into an offset that is relative to the frame pointer, we don't need to change the value as we freeze and thaw frames of virtual threads. This is since we might freeze and thaw from and to different worker threads, so the absolute address to locals might change, but the offset from the frame pointer will be constant. This subtask only handles "last_sp" (and its close equivalent "top_frame_sp" on PowerPC). The relativization of other interpreter frame members are handled in other subtasks to JDK-8289296. Tested tier1-tier7 on supported platforms. The rest was sanity tested using Qemu. 
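The idea in the description above, storing last_sp as an offset from the frame pointer so that the stored value stays valid when a frame is copied to a different address, can be illustrated with a toy model. This is only a sketch under loud assumptions: the frame is a plain array of words, and the constants and functions (`kFrameWords`, `kFpIndex`, `kLastSpSlot`, `set_last_sp`, `last_sp`, `relativized_usage`) are hypothetical names, not HotSpot's frame layout or API.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical toy frame: a block of words with fp pointing into it.
constexpr int kFrameWords = 8;
constexpr int kFpIndex    = 6;   // where fp points inside the block
constexpr int kLastSpSlot = -1;  // slot, relative to fp, that holds last_sp

// Store last_sp as a word offset from fp instead of an absolute address.
inline void set_last_sp(std::intptr_t* fp, std::intptr_t* sp) {
  fp[kLastSpSlot] = sp - fp;
}

// Recompute the absolute address from whatever fp the frame has right now.
inline std::intptr_t* last_sp(std::intptr_t* fp) {
  return fp + fp[kLastSpSlot];
}

// Usage check: copy the frame elsewhere (a stand-in for freeze/thaw) and
// confirm the stored value needs no patching.
void relativized_usage() {
  std::intptr_t original[kFrameWords] = {};
  std::intptr_t* fp = original + kFpIndex;
  set_last_sp(fp, original + 2);
  assert(last_sp(fp) == original + 2);

  std::intptr_t copy[kFrameWords];
  std::memcpy(copy, original, sizeof(original));
  std::intptr_t* new_fp = copy + kFpIndex;
  assert(last_sp(new_fp) == copy + 2);
}
```

With an absolute address in the slot, the copy would still point into the original block and would have to be rewritten on every move; with the offset, the copy is usable as-is.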
------------- Commit messages: - Fixed an assert failure on linux-x86 - 8308984: Relativize last_sp in interpreter frames Changes: https://git.openjdk.org/jdk/pull/14545/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14545&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308984 Stats: 148 lines in 28 files changed: 103 ins; 1 del; 44 mod Patch: https://git.openjdk.org/jdk/pull/14545.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14545/head:pull/14545 PR: https://git.openjdk.org/jdk/pull/14545 From aph at openjdk.org Fri Jul 28 12:38:56 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 28 Jul 2023 12:38:56 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v6] In-Reply-To: References: Message-ID: On Fri, 28 Jul 2023 03:59:26 GMT, sid8606 wrote: >> Implementation of "Foreign Function & Memory API" for s390x (Big Endian). > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Preserve and restore register Z_R6 > > Though Z_R6 is argument register it is a saved register > so preserve and restore Z_R6 register src/hotspot/cpu/s390/upcallLinker_s390.cpp line 72: > 70: // Z_SP saved/restored by prologue/epilogue > 71: if (reg == Z_SP) continue; > 72: // though Z_R6 is argument register it is a saved register Suggestion: ` // although Z_R6 is used for parameter passing, it must be saved and restored by a called function.` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1277496026 From stuefe at openjdk.org Fri Jul 28 12:42:50 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 28 Jul 2023 12:42:50 GMT Subject: RFR: 8313302: Fix formatting errors on Windows [v3] In-Reply-To: References: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> Message-ID: On Fri, 28 Jul 2023 10:38:03 GMT, Julian Waters wrote: > > Hi Thomas, > > > So the point of this change is to 
satisfy gcc on Windows? Accomodating a new build platform, making (and keeping!) it warning-free is a considerable effort. Even if you do it, it has a lot of side effects on others: reviewer churn, accidental bugs, makes backports more difficult... > > Not really, I was perfectly capable of solving these issues by silencing the error checker in compilerWarnings_gcc.hpp (see ATTRIBUTE_PRINTF and ATTRIBUTE_SCANF), I decided not to do so since I believed these were reasonable changes as the time to fix the formatting on Windows, I was not aware of the actual scale the finished change would be on. I can retract this change if need be, it's not critical to the Project > > > For smallish things its okay, but if it keeps causing massive changes like this one we should discuss this first. In my eyes, this is similar to adding a new platform, for which the bar is very high (it is technically less complex than a new platform, but OTOH it is also less isolated). > > Where would changes like these be discussed? As far as I know, I'm really the only one working on a Project like this. As a side note, the actual changes to HotSpot to get it compiling on gcc are actually minimal, this is technically not one of those changes, and a good chunk of them have already been integrated into mainline > > Should I take the rest of this to build-dev? > For discussing things that relate to hotspot coding, e.g. INTPTR_FORMAT, hotspot-dev would be better. > Thanks for your time, Julian No problem. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/15063#issuecomment-1655623178 From fbredberg at openjdk.org Fri Jul 28 12:44:50 2023 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Fri, 28 Jul 2023 12:44:50 GMT Subject: RFR: 8308984: Relativize last_sp (and top_frame_sp) in interpreter frames In-Reply-To: References: Message-ID: On Mon, 19 Jun 2023 15:59:15 GMT, Fredrik Bredberg wrote: > Implementation of relativized last_sp (top_frame_sp on PowerPC) in interpreter frames for x64, aarch64, ppc64le and riscv. > Not relativized last_sp on arm, zero and s390 but done some changes to cope with the changed generic code. > > By changing the "last_sp" member in interpreter frames from being an absolute address into an offset that is relative to the frame pointer, we don't need to change the value as we freeze and thaw frames of virtual threads. This is since we might freeze and thaw from and to different worker threads, so the absolute address to locals might change, but the offset from the frame pointer will be constant. > > This subtask only handles "last_sp" (and its close equivalent "top_frame_sp" on PowerPC). The relativization of other interpreter frame members are handled in other subtasks to JDK-8289296. > > Tested tier1-tier7 on supported platforms. The rest was sanity tested using Qemu. I've done basic loom testing on ppc64le and riscv64 using Qemu, but would appreciate if @reinrich and @RealFYang could take it for a real test drive. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14545#issuecomment-1655625280 From coleenp at openjdk.org Fri Jul 28 12:49:46 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 28 Jul 2023 12:49:46 GMT Subject: RFR: 8308762: Metaspace leak with Instrumentation.retransform [v4] In-Reply-To: References: Message-ID: On Tue, 18 Jul 2023 16:48:55 GMT, Jean-Philippe Bempel wrote: >> Fix a small leak in constant pool merging during retransformation of a class. 
If this class has a catch block with `Throwable`, the class `Throwable` is pre-resolved in the constant pool, while all the other classes are in a unresolved state. So the constant pool merging process was considering the entry with pre-resolved class as different compared to the destination and create a new entry. We now try to consider it as equal specially for Methodref/Fieldref. > > Jean-Philippe Bempel has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision: > > Revert resolved class to unresolved for comparison > > remove is_unresolved_class_mismatch src/hotspot/share/prims/jvmtiRedefineClasses.cpp line 1710: > 1708: *merge_cp_p, scratch_i)) { > 1709: // The mismatch in compare_entry_to() above is because of a > 1710: // resolved versus unresolved class entry at the same index I'm sorry for the piecemeal review. There's another comment that mentions this comment that should be removed. It starts with this: - // The find_matching_entry() call above could fail to find a match ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14780#discussion_r1277505986 From rrich at openjdk.org Fri Jul 28 14:45:48 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 28 Jul 2023 14:45:48 GMT Subject: RFR: 8308984: Relativize last_sp (and top_frame_sp) in interpreter frames In-Reply-To: References: Message-ID: On Fri, 28 Jul 2023 12:41:40 GMT, Fredrik Bredberg wrote: > I've done basic loom testing on ppc64le and riscv64 using Qemu, but would appreciate if @reinrich and @RealFYang could take it for a real test drive. PPC code looks good. I've tested `jtreg:test/jdk/jdk/internal/vm/Continuation` successfully. Will do more testing until monday. 
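
As an aside on the relativized last_sp idea in this thread, the following is a toy sketch, not HotSpot code, of why an offset from the frame pointer survives copying a frame to a different address while an absolute pointer would not. The slot index `kLastSpSlot` and the helper names are assumptions made up for the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical layout: the word at fp[kLastSpSlot] holds last_sp, stored as a
// word offset relative to fp instead of as an absolute address.
constexpr int kLastSpSlot = -2;

void set_last_sp(std::intptr_t* fp, std::intptr_t* sp) {
  assert(fp != nullptr && sp != nullptr);
  fp[kLastSpSlot] = sp - fp;       // offset in words, usually negative
}

std::intptr_t* last_sp(std::intptr_t* fp) {
  return fp + fp[kLastSpSlot];     // rebuild the absolute address on demand
}

// Copies a toy "stack" to a new location, standing in for freeze and thaw.
void copy_stack(std::intptr_t* to, const std::intptr_t* from, int words) {
  std::memcpy(to, from, sizeof(std::intptr_t) * static_cast<std::size_t>(words));
}
```

After `copy_stack`, calling `last_sp` with the frame pointer of the copy yields the matching slot in the copy, with no fix-up of the stored value.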
------------- PR Comment: https://git.openjdk.org/jdk/pull/14545#issuecomment-1655808028 From ccheung at openjdk.org Fri Jul 28 15:54:43 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Fri, 28 Jul 2023 15:54:43 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer [v3] In-Reply-To: References: Message-ID: On Fri, 28 Jul 2023 12:08:20 GMT, Coleen Phillimore wrote: >> This is a simple change to make array_klass() return ArrayKlass (the first dimension of TypeArrayKlass is a TypeArrayKlass so can't use ObjArrayKlass), higher_dimension is always an ObjArrayKlass and lower_dimension can be a TypeArrayKlass. The change removes some casts. >> Tested with tier1 linux/macosx/windows on x86 and aarch64. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add back set_higher_dimension Thanks for adding back the set_higher_dimension function. ------------- Marked as reviewed by ccheung (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15059#pullrequestreview-1552407141 From coleenp at openjdk.org Fri Jul 28 16:26:48 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 28 Jul 2023 16:26:48 GMT Subject: RFR: 8307312: Replace "int which" with "int cp_index" in constantPool In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 20:20:33 GMT, Matias Saavedra Silva wrote: > Many accessors in the constant pool take an argument "int which" that is meant to represent an ambiguous index. Despite this, several methods in the API use "int which" when the argument is exclusively a constant pool index. This patch aims to rename all of these instances to "int cp_index" in order to be more clear and to distinguish methods that take constant pool indices and methods that use derived indices. > > The callers have been updated to use more clear naming as well. Verified with tier 1-5 tests. Looks good. ------------- Marked as reviewed by coleenp (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/15027#pullrequestreview-1552457115 From coleenp at openjdk.org Fri Jul 28 16:35:00 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 28 Jul 2023 16:35:00 GMT Subject: RFR: 8312262: Klass::array_klass() should return ArrayKlass pointer [v3] In-Reply-To: References: Message-ID: <-knObe-qDwUk6XX7rvbKHrYUO3kafhsv00q-bMy7cOA=.6f9cc49c-ec86-4a9f-96a3-9c26986c2399@github.com> On Fri, 28 Jul 2023 12:08:20 GMT, Coleen Phillimore wrote: >> This is a simple change to make array_klass() return ArrayKlass (the first dimension of TypeArrayKlass is a TypeArrayKlass so can't use ObjArrayKlass), higher_dimension is always an ObjArrayKlass and lower_dimension can be a TypeArrayKlass. The change removes some casts. >> Tested with tier1 linux/macosx/windows on x86 and aarch64. > > Coleen Phillimore has updated the pull request incrementally with one additional commit since the last revision: > > Add back set_higher_dimension Thanks for reviewing, Calvin. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15059#issuecomment-1655973452 From coleenp at openjdk.org Fri Jul 28 16:35:01 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Fri, 28 Jul 2023 16:35:01 GMT Subject: Integrated: 8312262: Klass::array_klass() should return ArrayKlass pointer In-Reply-To: References: Message-ID: <8lP7jmZOGRIIZKTgDflV5DWu6wetodor0kEYptCfeQo=.584c34a4-92bc-4c63-aa6f-21bce0fb1956@github.com> On Thu, 27 Jul 2023 19:22:52 GMT, Coleen Phillimore wrote: > This is a simple change to make array_klass() return ArrayKlass (the first dimension of TypeArrayKlass is a TypeArrayKlass so can't use ObjArrayKlass), higher_dimension is always an ObjArrayKlass and lower_dimension can be a TypeArrayKlass. The change removes some casts. > Tested with tier1 linux/macosx/windows on x86 and aarch64. This pull request has now been integrated. 
Changeset: e8970417
Author: Coleen Phillimore
URL: https://git.openjdk.org/jdk/commit/e897041770f9e321cd8526c6a29c5e19bbecaa55
Stats: 68 lines in 14 files changed: 1 ins; 5 del; 62 mod

8312262: Klass::array_klass() should return ArrayKlass pointer

Reviewed-by: dlong, ccheung

-------------

PR: https://git.openjdk.org/jdk/pull/15059

From dcubed at openjdk.org Fri Jul 28 16:37:51 2023
From: dcubed at openjdk.org (Daniel D. Daugherty)
Date: Fri, 28 Jul 2023 16:37:51 GMT
Subject: RFR: 8313202: MutexLocker should disallow null Mutexes
In-Reply-To:
References:
Message-ID: <9p47J4tN0_ULOjMRQQbXXxauFpjKStySU1Bgn-SoKfg=.9ca71dba-0eea-45b5-9fdb-b95df244a5cb@github.com>

On Thu, 27 Jul 2023 22:52:53 GMT, David Holmes wrote:

>> As seen in [JDK-8313081](https://bugs.openjdk.org/browse/JDK-8313081), it is fairly easy to pass nullptr `Mutex` to `MutexLocker` by accident, which would just silently avoid the lock.
>>
>> There are a few places in Hotspot where we pass `nullptr` to simulate re-entrancy and/or conditionally take the lock. Those places can be more explicit, and the default `MutexLocker` can disallow nullptrs for extra safety.
>>
>> Open for some bikeshedding on the names of the new `MutexLockers`. Particularly `ReentrantMutexLocker` might lull readers into believing it does safepoint checks on re-entrant "lock", which it actually does not do.
>>
>> More thorough testing with different GC/JIT combinations is running now, we might find more issues there. Meanwhile, please comment on the approach.
>>
>> Additional testing:
>> - [x] `grep -R "MutexLocker " src/hotspot | grep -i null`, no hits
>> - [x] `grep -R "MutexLocker " src/hotspot | grep -i ?`, no hits
>> - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` (re-run in progress)
>
> To be clear I very strongly object to ReentrantMutexLocker as a name.

@dholmes-ora - Thanks for your comments about ReentrantMutexLocker. I couldn't figure out a good way to write why that name was bugging me since Mutexes are not reentrant.
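
The approach quoted above (a default locker that rejects null, plus an explicit variant for the conditional case) can be sketched roughly as follows. This is an illustration under stated assumptions, not the HotSpot implementation: `ToyMutex`, `StrictLocker` and `ConditionalLocker` are hypothetical names, the toy mutex is single-threaded, and the sketch throws where HotSpot would more likely assert.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical single-threaded stand-in for a real mutex. It only records
// whether it is held, which keeps the example deterministic.
class ToyMutex {
  bool _held = false;
 public:
  void lock()   { assert(!_held && "toy mutex is not reentrant"); _held = true; }
  void unlock() { assert(_held); _held = false; }
  bool held() const { return _held; }
};

// Default scoped locker: a null mutex is a programming error, never a
// silent "skip the lock".
class StrictLocker {
  ToyMutex* _mutex;
 public:
  explicit StrictLocker(ToyMutex* mutex) : _mutex(mutex) {
    if (_mutex == nullptr) throw std::invalid_argument("null mutex passed to StrictLocker");
    _mutex->lock();
  }
  ~StrictLocker() { _mutex->unlock(); }
  StrictLocker(const StrictLocker&) = delete;
  StrictLocker& operator=(const StrictLocker&) = delete;
};

// Explicit variant: the caller states the condition instead of encoding
// "do not lock" as a null pointer.
class ConditionalLocker {
  ToyMutex* _mutex;
  bool _locked;
 public:
  ConditionalLocker(ToyMutex* mutex, bool condition) : _mutex(mutex), _locked(false) {
    if (_mutex == nullptr) throw std::invalid_argument("null mutex passed to ConditionalLocker");
    if (condition) { _mutex->lock(); _locked = true; }
  }
  ~ConditionalLocker() { if (_locked) _mutex->unlock(); }
  ConditionalLocker(const ConditionalLocker&) = delete;
  ConditionalLocker& operator=(const ConditionalLocker&) = delete;
};
```

The point of the split is that an accidental null now fails loudly at the call site, while intentional skipping stays visible in the code as a boolean argument.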
------------- PR Comment: https://git.openjdk.org/jdk/pull/15043#issuecomment-1655978790 From shade at openjdk.org Fri Jul 28 18:57:15 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 28 Jul 2023 18:57:15 GMT Subject: RFR: 8313202: MutexLocker should disallow null Mutexes [v2] In-Reply-To: References: Message-ID: > As seen in [JDK-8313081](https://bugs.openjdk.org/browse/JDK-8313081), it is fairly easy to pass nullptr `Mutex` to `MutexLocker` by accident, which would just silently avoid the lock. > > There are a few places in Hotspot where we pass `nullptr` to simulate re-entrancy and/or conditionally take the lock. Those places can be more explicit, and the default `MutexLocker` can disallow nullptrs for extra safety. > > Open for some bikeshedding on the names of the new `MutexLockers`. Particularly `ReentrantMutexLocker` might lull readers into believing it does safepoint checks on re-entrant "lock", which it actually does not do. > > More thorough testing with different GC/JIT combinations is running now, we might find more issues there. Meanwhile, please comment on the approach. 
> > Additional testing: > - [x] `grep -R "MutexLocker " src/hotspot | grep -i null`, no hits > - [x] `grep -R "MutexLocker " src/hotspot | grep -i ?`, no hits > - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` (re-run in progress) Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: Replace ReentrantMutexLocker with ConditionalMutexLocker ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15043/files - new: https://git.openjdk.org/jdk/pull/15043/files/5962871f..4b140819 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15043&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15043&range=00-01 Stats: 37 lines in 10 files changed: 6 ins; 11 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/15043.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15043/head:pull/15043 PR: https://git.openjdk.org/jdk/pull/15043 From shade at openjdk.org Fri Jul 28 19:04:55 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 28 Jul 2023 19:04:55 GMT Subject: RFR: 8313202: MutexLocker should disallow null Mutexes [v2] In-Reply-To: References: Message-ID: On Fri, 28 Jul 2023 18:57:15 GMT, Aleksey Shipilev wrote: >> As seen in [JDK-8313081](https://bugs.openjdk.org/browse/JDK-8313081), it is fairly easy to pass nullptr `Mutex` to `MutexLocker` by accident, which would just silently avoid the lock. >> >> There are a few places in Hotspot where we pass `nullptr` to simulate re-entrancy and/or conditionally take the lock. Those places can be more explicit, and the default `MutexLocker` can disallow nullptrs for extra safety. >> >> Open for some bikeshedding on the names of the new `MutexLockers`. Particularly `ReentrantMutexLocker` might lull readers into believing it does safepoint checks on re-entrant "lock", which it actually does not do. >> >> More thorough testing with different GC/JIT combinations is running now, we might find more issues there. 
Meanwhile, please comment on the approach. >> >> Additional testing: >> - [x] `grep -R "MutexLocker " src/hotspot | grep -i null`, no hits >> - [x] `grep -R "MutexLocker " src/hotspot | grep -i ?`, no hits >> - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` (re-run in progress) > > Aleksey Shipilev has updated the pull request incrementally with one additional commit since the last revision: > > Replace ReentrantMutexLocker with ConditionalMutexLocker All right, I did not like `ReentrantMutexLocker` anyways. Replaced it with a bit more verbose `ConditionalMutexLocker` uses. Still cannot find a better name than `ConditionalMutexLocker`. Let me sleep on `MutexLockerWhen`... ------------- PR Comment: https://git.openjdk.org/jdk/pull/15043#issuecomment-1656191363 From vkempik at openjdk.org Fri Jul 28 21:58:56 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Fri, 28 Jul 2023 21:58:56 GMT Subject: RFR: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic [v8] In-Reply-To: <1kKdchjPCDUAUEbHwMTpkXajFjlEl0nGWn4YTLJscv8=.d49b5e9a-92e8-4017-a00c-b0ac9a040c71@github.com> References: <1kKdchjPCDUAUEbHwMTpkXajFjlEl0nGWn4YTLJscv8=.d49b5e9a-92e8-4017-a00c-b0ac9a040c71@github.com> Message-ID: On Wed, 26 Jul 2023 09:52:17 GMT, Vladimir Kempik wrote: >> Please review this fix. it eliminates misaligned loads in String.compare on risc-v >> >> for small compares ( <= 72 bytes), the instrinsic in c2_MacroAssembler.cpp is used, >> it reads ( in case of UU/LL) 8 bytes per loop, and at then end, it reads tail - misaligned load of last 8 bytes from the string. >> >> so if string length is not 8x bytes long then last load is misaligned, also it performs read/compare of some data which already was processed. >> >> I have changed that to compare only last length%8 bytes using SHORT_STRING part of intrinsic for UL/LU. But for UU/LL I have made an optimised version. 
>>
>> Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947) I've got no perf drop on thead (with +AvoidUnalignedAccesses) which supports misaligned access.
>>
>> Improvements to inflate_XX() methods give 3-5% improvements for UL/LU cases on thead, almost no perf change on hifive.
>>
>> for large strings, the intrinsics from stubGenerator.cpp are used
>> for UU/LL - generate_compare_long_string_same_encoding, I have just replaced misaligned ld with load_long_misaligned. Since this tail reading is not on hot path, this gives some small penalty for thead when -XX:+AvoidUnalignedAccesses.
>>
>> large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp:
>> These changes are partially based on feilongjiang's patch, but I have changed tail reading to prevent reading past the end of string. I have observed no perf difference between feilongjiang's and my version.
>>
>> This also enables regression test for string.Compare which previously was aarch64-only
>>
>> Testing: tier1 and tier2 clean on hifive.
>>
>> JMH testing, hifive:
>> before:
>>
>> Benchmark                                   (delta)  (size)  Mode  Cnt    Score   Error  Units
>> StringCompareToDifferentLength.compareToLL        2      24  avgt    9    6.474 ± 1.475  ms/op
>> StringCompareToDifferentLength.compareToLL        2      36  avgt    9  125.823 ± 1.947  ms/op
>> StringCompareToDifferentLength.compareToLL        2      72  avgt    9   10.512 ± 0.236  ms/op
>> StringCompareToDifferentLength.compareToLL        2     128  avgt    9   13.032 ± 0.821  ms/op
>> StringCompareToDifferentLength.compareToLL        2     256  avgt    9   18.983 ± 0.318  ms/op
>> StringCompareToDifferentLength.compareToLL        2     512  avgt    9   29.925 ± ...
>
> Vladimir Kempik has updated the pull request incrementally with one additional commit since the last revision:
>
>   more nits

tier1/tier2 are fine

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14534#issuecomment-1656369827

From vkempik at openjdk.org Fri Jul 28 21:58:56 2023
From: vkempik at openjdk.org (Vladimir Kempik)
Date: Fri, 28 Jul 2023 21:58:56 GMT
Subject: Integrated: 8310268: RISC-V: misaligned memory access in String.Compare intrinsic
In-Reply-To:
References:
Message-ID:

On Mon, 19 Jun 2023 05:57:05 GMT, Vladimir Kempik wrote:

> Please review this fix. It eliminates misaligned loads in String.compare on risc-v
>
> for small compares ( <= 72 bytes), the intrinsic in c2_MacroAssembler.cpp is used,
> it reads (in case of UU/LL) 8 bytes per loop, and at the end, it reads tail - misaligned load of last 8 bytes from the string.
>
> so if string length is not 8x bytes long then last load is misaligned, also it performs read/compare of some data which already was processed.
>
> I have changed that to compare only last length%8 bytes using SHORT_STRING part of intrinsic for UL/LU. But for UU/LL I have made an optimised version.
>
> Thanks to optimisations for conditional branching at line [947](https://github.com/openjdk/jdk/pull/14534/files#diff-35eb1d2f1e2f0514dd46bd7fbad49ff2c87703d5a3041a6433956df00a3fe6e6R947) I've got no perf drop on thead (with +AvoidUnalignedAccesses) which supports misaligned access.
>
> Improvements to inflate_XX() methods give 3-5% improvements for UL/LU cases on thead, almost no perf change on hifive.
>
> for large strings, the intrinsics from stubGenerator.cpp are used
> for UU/LL - generate_compare_long_string_same_encoding, I have just replaced misaligned ld with load_long_misaligned. Since this tail reading is not on hot path, this gives some small penalty for thead when -XX:+AvoidUnalignedAccesses.
>
> large LU/UL comparison is done in compare_long_string_different_encoding in stubGenerator.cpp:
> These changes are partially based on feilongjiang's patch, but I have changed tail reading to prevent reading past the end of string. I have observed no perf difference between feilongjiang's and my version.
>
> This also enables regression test for string.Compare which previously was aarch64-only
>
> Testing: tier1 and tier2 clean on hifive.
>
> JMH testing, hifive:
> before:
>
> Benchmark                                   (delta)  (size)  Mode  Cnt    Score   Error  Units
> StringCompareToDifferentLength.compareToLL        2      24  avgt    9    6.474 ± 1.475  ms/op
> StringCompareToDifferentLength.compareToLL        2      36  avgt    9  125.823 ± 1.947  ms/op
> StringCompareToDifferentLength.compareToLL        2      72  avgt    9   10.512 ± 0.236  ms/op
> StringCompareToDifferentLength.compareToLL        2     128  avgt    9   13.032 ± 0.821  ms/op
> StringCompareToDifferentLength.compareToLL        2     256  avgt    9   18.983 ± 0.318  ms/op
> StringCompareToDifferentLength.compareToLL        2     512  avgt    9   29.925 ± 0.458  ms/op
> StringCompareToDifferentLength.compareToLL        2  ...

This pull request has now been integrated.
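
The tail handling described in the quoted text (compare 8 bytes per iteration, then only the remaining length % 8 bytes, rather than re-reading an overlapping and possibly misaligned final word) can be sketched in portable C++ as an analogy. This is not the RISC-V assembler intrinsic; the function name is hypothetical, and `memcpy` stands in for an alignment-safe load.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Compares two byte buffers of equal length. Returns -1, 0 or 1.
int compare_equal_length(const char* a, const char* b, std::size_t len) {
  assert(a != nullptr && b != nullptr);
  std::size_t i = 0;
  // Main loop: one 8-byte word per iteration. memcpy makes no alignment
  // assumption about a + i or b + i.
  for (; i + 8 <= len; i += 8) {
    std::uint64_t wa, wb;
    std::memcpy(&wa, a + i, sizeof(wa));
    std::memcpy(&wb, b + i, sizeof(wb));
    if (wa != wb) break;  // the byte loop below locates the differing byte
  }
  // Tail: byte by byte. When the main loop ran to completion, this touches
  // only the last len % 8 bytes and never reads past the end of the buffers.
  for (; i < len; i++) {
    unsigned char ca = static_cast<unsigned char>(a[i]);
    unsigned char cb = static_cast<unsigned char>(b[i]);
    if (ca != cb) return ca < cb ? -1 : 1;
  }
  return 0;
}
```

For a 10-byte input the main loop runs once and the tail handles the last 2 bytes, which mirrors the "compare only last length%8 bytes" change at a much coarser level.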
Changeset: d6245b68 Author: Vladimir Kempik URL: https://git.openjdk.org/jdk/commit/d6245b6832ccd1da04616e8ba4b90321b2551971 Stats: 145 lines in 4 files changed: 25 ins; 44 del; 76 mod 8310268: RISC-V: misaligned memory access in String.Compare intrinsic Co-authored-by: Feilong Jiang Reviewed-by: fyang ------------- PR: https://git.openjdk.org/jdk/pull/14534 From kbarrett at openjdk.org Sat Jul 29 00:41:02 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Sat, 29 Jul 2023 00:41:02 GMT Subject: RFR: 8313302: Fix formatting errors on Windows [v3] In-Reply-To: References: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> Message-ID: <-_YAKPEFAUT7HCxmF4iFxdyyyYfK6uK7sMNY044C0HE=.be47a1e4-ff65-4efd-b897-c35262203d1b@github.com> On Fri, 28 Jul 2023 10:36:12 GMT, Thomas Stuefe wrote: >> For this one (and the one below) I couldn't figure out whether the intention was for a 32 or 64 bit number in this case. This would be 64 bit on every platform but on Windows, which is 32 bit, and the code directly below this also faces the same issue too, is this intentional? > > long is 32-bit on Linux x86 and arm32 too. [not a review, just a drive-by comment] The type `long` really shouldn't be used in shared code. There have been discussions and maybe some efforts to nuke uses, but having just done the grep, there are still quite a few. The one in question might actually be an appropriate place to use `uintx`. 
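
To make the point about `long` concrete: it is 64 bits wide on LP64 targets such as 64-bit Linux, but 32 bits on 64-bit Windows (LLP64) and on 32-bit targets, so a format string written for one breaks on the other. A minimal sketch, assuming only the standard `<cinttypes>` macros, of the width-independent alternative follows. The function names are hypothetical, and `std::uintptr_t` is used here as a stand-in for a pointer-sized unsigned type such as the `uintx` mentioned above.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// A fixed-width value has the same size everywhere, and PRIu64 expands to
// whichever conversion specifier the current toolchain needs for it.
std::string format_u64(std::uint64_t value) {
  char buf[32];
  int n = std::snprintf(buf, sizeof(buf), "%" PRIu64, value);
  assert(n > 0 && n < static_cast<int>(sizeof(buf)));
  return std::string(buf, static_cast<std::size_t>(n));
}

// A pointer-sized unsigned value, printed in hex through PRIxPTR.
std::string format_word(std::uintptr_t value) {
  char buf[32];
  int n = std::snprintf(buf, sizeof(buf), "0x%" PRIxPTR, value);
  assert(n > 0 && n < static_cast<int>(sizeof(buf)));
  return std::string(buf, static_cast<std::size_t>(n));
}
```

Neither helper mentions `long`, `%ld` or `%lu`, so the same source passes a format checker on both LP64 and LLP64 platforms.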
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15063#discussion_r1278191849 From jwaters at openjdk.org Sat Jul 29 04:49:56 2023 From: jwaters at openjdk.org (Julian Waters) Date: Sat, 29 Jul 2023 04:49:56 GMT Subject: RFR: 8250269: Replace ATTRIBUTE_ALIGNED with alignas [v17] In-Reply-To: References: <9QKV9cYFTo_1D8R-mI80lnewNkA0ceJNKFPbrvICxl4=.d6736b76-8324-4084-bede-6e144b4f6c04@github.com> Message-ID: On Fri, 23 Jun 2023 02:31:11 GMT, Julian Waters wrote: >> C++11 added the alignas attribute, for the purpose of specifying alignment on types, much like compiler specific syntax such as gcc's __attribute__((aligned(x))) or Visual C++'s __declspec(align(x)). >> >> We can phase out the use of the macro in favor of the standard attribute. In the meantime, we can replace the compiler specific definitions of ATTRIBUTE_ALIGNED with a portable definition. We might deprecate the use of the macro but changing its implementation quickly and cleanly applies the feature where the macro is being used. >> >> Note: With certain parts of HotSpot using ATTRIBUTE_ALIGNED so indiscriminately, this commit will likely take some time to get right >> >> This will require adding the alignas attribute to the list of language features approved for use in HotSpot code. (Completed with [8297912](https://github.com/openjdk/jdk/pull/11446)) > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 17 commits: > > - Merge branch 'openjdk:master' into alignas > - Merge branch 'master' into alignas > - Merge branch 'openjdk:master' into alignas > - alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - ... 
and 7 more: https://git.openjdk.org/jdk/compare/5a82fa3b...bb9ae391 Bumping ------------- PR Comment: https://git.openjdk.org/jdk/pull/11431#issuecomment-1656554688 From jwaters at openjdk.org Sat Jul 29 04:59:57 2023 From: jwaters at openjdk.org (Julian Waters) Date: Sat, 29 Jul 2023 04:59:57 GMT Subject: RFR: 8313302: Fix formatting errors on Windows [v3] In-Reply-To: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> References: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> Message-ID: On Fri, 28 Jul 2023 07:50:19 GMT, Julian Waters wrote: >> Fix several formatting errors on Windows > > Julian Waters has updated the pull request incrementally with one additional commit since the last revision: > > zPhysicalMemoryBacking_windows.cpp Will close and address the issues discovered in this change individually ------------- PR Comment: https://git.openjdk.org/jdk/pull/15063#issuecomment-1656556048 From jwaters at openjdk.org Sat Jul 29 04:59:58 2023 From: jwaters at openjdk.org (Julian Waters) Date: Sat, 29 Jul 2023 04:59:58 GMT Subject: RFR: 8313302: Fix formatting errors on Windows [v3] In-Reply-To: <-_YAKPEFAUT7HCxmF4iFxdyyyYfK6uK7sMNY044C0HE=.be47a1e4-ff65-4efd-b897-c35262203d1b@github.com> References: <1W3XYadOwbC1ENELwpfdDpowgOmwnSRLnFd0F4W74Xo=.7d3f4b91-7f1f-41ba-9e97-4533d21693e6@github.com> <-_YAKPEFAUT7HCxmF4iFxdyyyYfK6uK7sMNY044C0HE=.be47a1e4-ff65-4efd-b897-c35262203d1b@github.com> Message-ID: On Sat, 29 Jul 2023 00:38:18 GMT, Kim Barrett wrote: >> long is 32-bit on Linux x86 and arm32 too. > > [not a review, just a drive-by comment] > The type `long` really shouldn't be used in shared code. There have been discussions and maybe some efforts > to nuke uses, but having just done the grep, there are still quite a few. The one in question might actually be an > appropriate place to use `uintx`. Hi Kim, mind if you share the grep command for that? 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15063#discussion_r1278246025

From jwaters at openjdk.org Sat Jul 29 04:59:59 2023
From: jwaters at openjdk.org (Julian Waters)
Date: Sat, 29 Jul 2023 04:59:59 GMT
Subject: Withdrawn: 8313302: Fix formatting errors on Windows
In-Reply-To:
References:
Message-ID:

On Fri, 28 Jul 2023 07:25:26 GMT, Julian Waters wrote:

> Fix several formatting errors on Windows

This pull request has been closed without being integrated.

-------------

PR: https://git.openjdk.org/jdk/pull/15063

From tanksherman27 at gmail.com Sat Jul 29 05:06:21 2023
From: tanksherman27 at gmail.com (Julian Waters)
Date: Sat, 29 Jul 2023 13:06:21 +0800
Subject: Formatting on Windows
Message-ID:

As discussed in https://github.com/openjdk/jdk/pull/15063, how many of these are formatting errors worth fixing, and how should they be fixed if they have to be? gcc has flagged all of them as formatting errors, and while I could simply silence the checker in compilerWarnings_gcc.hpp on Windows, I'm unsure if that's the right course of action to take

best regards,
Julian

From david.holmes at oracle.com Sat Jul 29 12:45:27 2023
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 29 Jul 2023 22:45:27 +1000
Subject: Formatting on Windows
In-Reply-To:
References:
Message-ID: <9209d59c-3f5d-5c39-924c-e3345c52e463@oracle.com>

Hi Julian,

On 29/07/2023 3:06 pm, Julian Waters wrote:
> As discussed in https://github.com/openjdk/jdk/pull/15063, how many of these are
> formatting errors worth fixing, and how should they be fixed if they
> have to be? gcc has flagged all of them as formatting errors, and while
> I could simply silence the checker in compilerWarnings_gcc.hpp on
> Windows, I'm unsure if that's the right course of action to take

Where was it determined that building for Windows using gcc was a goal of OpenJDK?
This is (as Thomas tried to point out) effectively another port, so who is testing and supporting it? I can understand it is an interesting side project for you, but we all end up with the burden of having to deal with this - first by reviewing these PR's and then by dealing with future breakage. AFAIK no one delivering OpenJDK is using gcc for a Windows build. You can't expect everyone to check whether a change that is fine with VS on Windows is also fine with gcc, so this will be forever breaking with ensuing PRs (that others have to expend effort reviewing) to fix it again.

Sorry to dampen your enthusiasm on contributing here but there has to be a strong reason to make these kinds of changes. Who wants/needs this?

Regards,
David

> best regards,
> Julian

From thomas.stuefe at gmail.com Sat Jul 29 13:43:12 2023
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Sat, 29 Jul 2023 15:43:12 +0200
Subject: Formatting on Windows
In-Reply-To: <9209d59c-3f5d-5c39-924c-e3345c52e463@oracle.com>
References: <9209d59c-3f5d-5c39-924c-e3345c52e463@oracle.com>
Message-ID:

On Sat, Jul 29, 2023 at 2:45 PM David Holmes wrote:

> Hi Julian,
>
> On 29/07/2023 3:06 pm, Julian Waters wrote:
> > As discussed in https://github.com/openjdk/jdk/pull/15063, how many of these are
> > formatting errors worth fixing, and how should they be fixed if they
> > have to be? gcc has flagged all of them as formatting errors, and while
> > I could simply silence the checker in compilerWarnings_gcc.hpp on
> > Windows, I'm unsure if that's the right course of action to take
>
> Where was it determined that building for Windows using gcc was a goal
> of OpenJDK? This is (as Thomas tried to point out) effectively another
> port, so who is testing and supporting it? I can understand it is an
> interesting side project for you, but we all end up with the burden of
> having to deal with this - first by reviewing these PR's and then by
> dealing with future breakage.
AFAIK no one delivering OpenJDK is using > gcc for a Windows build. You can't expect everyone to check whether a > change that is fine with VS on Windows is also fine with gcc, so this > will be forever breaking with ensuing PRs (that others have to expend > effort reviewing) to fix it again. > > Sorry to dampen your enthusiasm on contributing here but there has to be > a strong reason to make these kinds of changes. Who wants/needs this? > > Adding my 5 cents to this: While I understand the theoretical allure of being able to use gcc on Windows, I am very skeptical of the practical relevance. I have worked on weird Operating Systems with odd C++ toolchains. My experience is that you never want to base an important infrastructure project like OpenJDK on a niche Compiler. This we call the "Little Red Riding Hood Path" in German, where you are off the trodden paths, all alone by yourself. You will run into many problems nobody else runs into. The OpenJDK itself is complex enough without having to deal with an unreliable build toolchain. Therefore, a hypothetical "gcc on Windows" port will cause constant maintenance efforts and be a source of fun long after you have written your patches. Those are the immediate costs. Add to that the disturbance you cause by invasive "shotgun spread" changes. Add to that the cost of opportunity: your massive changes have to be reviewed by someone, and that someone cannot review something else, possibly more important. Add to that the effort of maintaining another build environment, which will never replace VS, so it comes atop. Those are the cost. If there were an entity - a corporation or foundation - with engineers on their payroll and a good track record for keeping things running, and that entity wanted to sponsor the "gcc on Windows" work and shoulder at least the immediate costs, this would be worth considering. But I don't see that. Nor do I see a need. As long as we have a VS toolchain that works well. 
Should MS ever screw that up, the arguments for a gcc port for Windows might become more compelling. But even then, it would be worth considering alternatives first, e.g. using Intel compilers or other products.

Lastly, I hate discouraging contributors. You are obviously very enthusiastic and driven. If you just want to dig your teeth into OpenJDK, there are plenty of things you could do. In general, the more focused (few lines of code) and useful (fixing actual issues) a patch is, the better.

Cheers, Thomas

From tanksherman27 at gmail.com Sat Jul 29 14:13:06 2023
From: tanksherman27 at gmail.com (Julian Waters)
Date: Sat, 29 Jul 2023 22:13:06 +0800
Subject: Formatting on Windows
In-Reply-To:
References: <9209d59c-3f5d-5c39-924c-e3345c52e463@oracle.com>
Message-ID:

Hi Thomas and David,

No worries, no discouragement taken. I don't mind the criticism, my only worry is that I haven't angered any reviewers recently these past few days. I actually started the efforts on behalf of the MSYS2 project (which primarily uses the gcc compiler and is in desperate need and want of a working JDK), so I'll put a break to the changes I've been making lately and let them decide on what to do going forward. I'll admit I've not been happy with how my changes have declined from high quality ones to being hard to review either, so hopefully I can return to properly helping out soon. Till then, thanks for the advice to both

best regards,
Julian

On Sat, Jul 29, 2023 at 9:43 PM Thomas Stüfe wrote:

>
>
> On Sat, Jul 29, 2023 at 2:45 PM David Holmes wrote:
>
>> Hi Julian,
>>
>> On 29/07/2023 3:06 pm, Julian Waters wrote:
>> > As discussed in https://github.com/openjdk/jdk/pull/15063, how many of these are
>> > formatting errors worth fixing, and how should they be fixed if they
>> > have to be?
gcc has flagged all of them as formatting errors, and while >> > I could simply silence the checker in compilerWarnings_gcc.hpp on >> > Windows, I'm unsure if that's the right course of action to take >> >> Where was it determined that building for Windows using gcc was a goal >> of OpenJDK? This is (as Thomas tried to point out) effectively another >> port, so who is testing and supporting it? I can understand it is an >> interesting side project for you, but we all end up with the burden of >> having to deal with this - first by reviewing these PR's and then by >> dealing with future breakage. AFAIK no one delivering OpenJDK is using >> gcc for a Windows build. You can't expect everyone to check whether a >> change that is fine with VS on Windows is also fine with gcc, so this >> will be forever breaking with ensuing PRs (that others have to expend >> effort reviewing) to fix it again. >> >> Sorry to dampen your enthusiasm on contributing here but there has to be >> a strong reason to make these kinds of changes. Who wants/needs this? >> >> > Adding my 5 cents to this: > > While I understand the theoretical allure of being able to use gcc on > Windows, I am very skeptical of the practical relevance. > > I have worked on weird Operating Systems with odd C++ toolchains. My > experience is that you never want to base an important infrastructure > project like OpenJDK on a niche Compiler. This we call the "Little Red > Riding Hood Path" in German, where you are off the trodden paths, all alone > by yourself. You will run into many problems nobody else runs into. The > OpenJDK itself is complex enough without having to deal with an unreliable > build toolchain. > > Therefore, a hypothetical "gcc on Windows" port will cause constant > maintenance efforts and be a source of fun long after you have written your > patches. Those are the immediate costs. Add to that the disturbance you > cause by invasive "shotgun spread" changes. 
Add to that the cost of > opportunity: your massive changes have to be reviewed by someone, and that > someone cannot review something else, possibly more important. Add to that > the effort of maintaining another build environment, which will never > replace VS, so it comes atop. > > Those are the cost. If there were an entity - a corporation or foundation > - with engineers on their payroll and a good track record for keeping > things running, and that entity wanted to sponsor the "gcc on Windows" work > and shoulder at least the immediate costs, this would be worth considering. > But I don't see that. > > Nor do I see a need. As long as we have a VS toolchain that works well. > Should MS ever screw that up, the arguments for a gcc port for Windows > might become more compelling. But even then, it would be worth considering > alternatives first, e.g. using Intel compilers or other products. > > Lastly, I hate discouraging contributors. You are obviously very > enthusiastic and driven. If you just want to dig your teeth into OpenJDK, > there are plenty of things you could do. In general, the more focused (few > lines of code) and useful (fixing actual issues) a patch is, the better. > > Cheers, Thomas > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From duke at openjdk.org Sat Jul 29 16:40:02 2023 From: duke at openjdk.org (duke) Date: Sat, 29 Jul 2023 16:40:02 GMT Subject: Withdrawn: 8297967: Make frame::safe_for_sender safer In-Reply-To: References: Message-ID: On Thu, 1 Dec 2022 16:47:48 GMT, Johannes Bechberger wrote: > Makes `frame::safe_for_sender` safer by checking that the location of the return address, sender stack pointer, and link address is accessible. This makes the method safer in the case of broken frames. This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/jdk/pull/11461 From kim.barrett at oracle.com Sun Jul 30 12:49:06 2023 From: kim.barrett at oracle.com (Kim Barrett) Date: Sun, 30 Jul 2023 12:49:06 +0000 Subject: Formatting on Windows In-Reply-To: References: <9209d59c-3f5d-5c39-924c-e3345c52e463@oracle.com> Message-ID: <11A3AAF6-7E52-41EE-B846-F39E1E8F0EF0@oracle.com> > On Jul 29, 2023, at 10:13 AM, Julian Waters wrote: > > Hi Thomas and David, > > No worries, no discouragement taken. I don't mind the criticism, my only worry is that I haven't angered any reviewers recently these past few days. I actually started the efforts on behalf of the MSYS2 project (which primarily uses the gcc compiler and is in desperate need and want of a working JDK), so I'll put a break to the changes I've been making lately and let them decide on what to do going forward. I'll admit I've not been happy with how my changes have declined from high quality ones to being hard to review either, so hopefully I can return to properly helping out soon. Till then, thanks for the advice to both Just because MSYS2 (or some other project) "needs" an OpenJDK port doesn't mean the OpenJDK project should be on the hook to provide it. A port needs a maintainer organization that commits to providing the necessary resources and can be trusted to follow through on that commitment. (There are probably other requirements, but that's the first one.) -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From dholmes at openjdk.org Sun Jul 30 22:25:53 2023 From: dholmes at openjdk.org (David Holmes) Date: Sun, 30 Jul 2023 22:25:53 GMT Subject: RFR: 8307312: Replace "int which" with "int cp_index" in constantPool In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 20:20:33 GMT, Matias Saavedra Silva wrote: > Many accessors in the constant pool take an argument "int which" that is meant to represent an ambiguous index. Despite this, several methods in the API use "int which" when the argument is exclusively a constant pool index. This patch aims to rename all of these instances to "int cp_index" in order to be more clear and to distinguish methods that take constant pool indices and methods that use derived indices. > > The callers have been updated to use more clear naming as well. Verified with tier 1-5 tests. The basic `which` change is good and useful, but I don't think most of the other renamings were really necessary. It is hard to maintain a convention that an index into to the CP must always be called `cp_index` - the `cp` is often redundant given the context. Please double-check all comments where parameter names have changed to ensure they refer to the new names. Thanks. src/hotspot/share/oops/constantPool.cpp line 237: > 235: int len = length(); > 236: int num_klasses = 0; > 237: for (int cp_index = 1; cp_index 454: } > 455: > 456: void ConstantPool::string_at_put(int obj_index, oop str) { Should `obj_index` be `cp_index` here? src/hotspot/share/oops/constantPool.cpp line 945: > 943: // called to create oops from constants to use in arguments for invokedynamic > 944: oop ConstantPool::resolve_constant_at_impl(const constantPoolHandle& this_cp, > 945: int cp_index, int cache_index, Not sure that we really need to always say `cp_index` given the context. Changing `which` is good, but `index` was perfectly fine IMO. 
src/hotspot/share/oops/constantPool.cpp line 1615: > 1613: // to the constant pool to_cp's entries starting at to_i. A total of > 1614: // (end_i - start_i) + 1 entries are copied. > 1615: void ConstantPool::copy_cp_to_impl(const constantPoolHandle& from_cp, int start_cpi, int end_cpi, The comment needs updating now the parameter names have changed. Again I don't think there was any need to change the names in these cases. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15027#pullrequestreview-1553712856 PR Review Comment: https://git.openjdk.org/jdk/pull/15027#discussion_r1278624771 PR Review Comment: https://git.openjdk.org/jdk/pull/15027#discussion_r1278624805 PR Review Comment: https://git.openjdk.org/jdk/pull/15027#discussion_r1278625974 PR Review Comment: https://git.openjdk.org/jdk/pull/15027#discussion_r1278625517 From dholmes at openjdk.org Sun Jul 30 22:59:49 2023 From: dholmes at openjdk.org (David Holmes) Date: Sun, 30 Jul 2023 22:59:49 GMT Subject: RFR: JDK-8313251: Add NativeLibraryLoad event [v2] In-Reply-To: References: Message-ID: <5GxVuSiKl0IRkZa3BUBkwV3qbgE6XwQzjZPiLhWq078=.c7f3ae5f-3860-4dde-ba44-64324aee9bdb@github.com> On Fri, 28 Jul 2023 09:38:13 GMT, Matthias Baesken wrote: >> Add a NativeLibraryLoad event that provides us more detail about shared lib/dll loads. This gives a time stamp and success + error details of the load operation. It enhances the already existing information we get from the existing NativeLibrary event (that periodically samples the native modules of the jvm process). > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > add macro guards because the build errors in zero build test/jdk/jdk/jfr/event/runtime/TestNativeLibraryLoadEvent.java line 2: > 1: /* > 2: * Copyright (c) 2013, 2023, Oracle and/or its affiliates. All rights reserved. 
New test should have single copyright year, even if copied from existing test. Thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15065#discussion_r1278631089 From fyang at openjdk.org Mon Jul 31 01:04:00 2023 From: fyang at openjdk.org (Fei Yang) Date: Mon, 31 Jul 2023 01:04:00 GMT Subject: RFR: 8308984: Relativize last_sp (and top_frame_sp) in interpreter frames In-Reply-To: References: Message-ID: On Mon, 19 Jun 2023 15:59:15 GMT, Fredrik Bredberg wrote: > Implementation of relativized last_sp (top_frame_sp on PowerPC) in interpreter frames for x64, aarch64, ppc64le and riscv. > Not relativized last_sp on arm, zero and s390 but done some changes to cope with the changed generic code. > > By changing the "last_sp" member in interpreter frames from being an absolute address into an offset that is relative to the frame pointer, we don't need to change the value as we freeze and thaw frames of virtual threads. This is since we might freeze and thaw from and to different worker threads, so the absolute address to locals might change, but the offset from the frame pointer will be constant. > > This subtask only handles "last_sp" (and its close equivalent "top_frame_sp" on PowerPC). The relativization of other interpreter frame members are handled in other subtasks to JDK-8289296. > > Tested tier1-tier7 on supported platforms. The rest was sanity tested using Qemu. This has passed hotspot_loom, jdk_loom and tier1-3 tests on linux-riscv64 platform. Minor suggestions for riscv code. Thanks. 
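The relativization described in the quoted PR text can be sketched in plain C++. This is only an illustration under stated assumptions, not code from the patch: `kLastSpSlot`, `store_relative_last_sp`, `load_last_sp` and `survives_copy` are hypothetical names, and the slot position is made up (HotSpot uses `frame::interpreter_frame_last_sp_offset`). The load helper mirrors what the `ld` + `shadd` pairs in the review comments appear to compute, namely fp plus the stored word offset scaled by the word size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical slot (in words, relative to fp) that holds last_sp.
// The real constant in HotSpot is frame::interpreter_frame_last_sp_offset.
constexpr std::ptrdiff_t kLastSpSlot = -1;

// Store last_sp as a word offset from fp instead of an absolute address.
inline void store_relative_last_sp(intptr_t* fp, const intptr_t* sp) {
  fp[kLastSpSlot] = sp - fp;  // negative when sp lies below fp
}

// Recover the absolute address: fp + offset words. Pointer arithmetic on
// intptr_t* does the scaling that LogBytesPerWord does in the assembly.
inline intptr_t* load_last_sp(intptr_t* fp) {
  return fp + fp[kLastSpSlot];
}

// Copy a "frame" to a different stack, as freeze/thaw would, and check
// that the stored value is still valid without being rewritten.
inline bool survives_copy() {
  intptr_t stack_a[16] = {};
  intptr_t stack_b[16] = {};
  intptr_t* fp_a = stack_a + 12;
  store_relative_last_sp(fp_a, stack_a + 4);
  std::memcpy(stack_b, stack_a, sizeof(stack_a));
  intptr_t* fp_b = stack_b + 12;
  return load_last_sp(fp_b) == stack_b + 4;
}
```

With an absolute address in the slot, the copied frame would still point into the old stack; with the offset, nothing in the slot has to change when the frame moves.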
src/hotspot/cpu/riscv/templateInterpreterGenerator_riscv.cpp line 430: > 428: // Restore stack bottom in case i2c adjusted stack > 429: __ ld(t0, Address(fp, frame::interpreter_frame_last_sp_offset * wordSize)); > 430: __ shadd(esp, t0, fp, t1, LogBytesPerWord); Suggestion: `__ shadd(esp, t0, fp, t0, LogBytesPerWord);` src/hotspot/cpu/riscv/templateInterpreterGenerator_riscv.cpp line 488: > 486: // Restore expression stack pointer > 487: __ ld(t0, Address(fp, frame::interpreter_frame_last_sp_offset * wordSize)); > 488: __ shadd(esp, t0, fp, t1, LogBytesPerWord); Suggestion: `__ shadd(esp, t0, fp, t0, LogBytesPerWord);` src/hotspot/cpu/riscv/templateInterpreterGenerator_riscv.cpp line 1610: > 1608: // Restore the last_sp and null it out > 1609: __ ld(t0, Address(fp, frame::interpreter_frame_last_sp_offset * wordSize)); > 1610: __ shadd(esp, t0, fp, t1, LogBytesPerWord); Suggestion: `__ shadd(esp, t0, fp, t0, LogBytesPerWord);` ------------- Changes requested by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14545#pullrequestreview-1553749311 PR Review Comment: https://git.openjdk.org/jdk/pull/14545#discussion_r1278649581 PR Review Comment: https://git.openjdk.org/jdk/pull/14545#discussion_r1278649627 PR Review Comment: https://git.openjdk.org/jdk/pull/14545#discussion_r1278649642 From iklam at openjdk.org Mon Jul 31 04:42:02 2023 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 31 Jul 2023 04:42:02 GMT Subject: RFR: 8307312: Replace "int which" with "int cp_index" in constantPool In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 20:20:33 GMT, Matias Saavedra Silva wrote: > Many accessors in the constant pool take an argument "int which" that is meant to represent an ambiguous index. Despite this, several methods in the API use "int which" when the argument is exclusively a constant pool index. 
This patch aims to rename all of these instances to "int cp_index" in order to be more clear and to distinguish methods that take constant pool indices and methods that use derived indices. > > The callers have been updated to use more clear naming as well. Verified with tier 1-5 tests. Marked as reviewed by iklam (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15027#pullrequestreview-1553948524 From mbaesken at openjdk.org Mon Jul 31 07:12:21 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 31 Jul 2023 07:12:21 GMT Subject: RFR: JDK-8313251: Add NativeLibraryLoad event [v3] In-Reply-To: References: Message-ID: > Add a NativeLibraryLoad event that provides us more detail about shared lib/dll loads. This gives a time stamp and success + error details of the load operation. It enhances the already existing information we get from the existing NativeLibrary event (that periodically samples the native modules of the jvm process). Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Adjust COPYRIGHT year info ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15065/files - new: https://git.openjdk.org/jdk/pull/15065/files/a934d3f7..da6b9861 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15065&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15065&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/15065.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15065/head:pull/15065 PR: https://git.openjdk.org/jdk/pull/15065 From rrich at openjdk.org Mon Jul 31 07:42:51 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Mon, 31 Jul 2023 07:42:51 GMT Subject: RFR: 8308984: Relativize last_sp (and top_frame_sp) in interpreter frames In-Reply-To: References: Message-ID: On Fri, 28 Jul 2023 14:42:37 GMT, Richard Reingruber wrote: > > I've done basic loom testing on ppc64le and riscv64 
using Qemu, but would appreciate if @reinrich and @RealFYang could take it for a real test drive. > > PPC code looks good. I've tested `jtreg:test/jdk/jdk/internal/vm/Continuation` successfully. Will do more testing until monday. So there are crashes in our tests with this pr applied. E.g. `make test TEST=runtime/CommandLine/OptionsValidation/TestOptionsWithRangesDynamic.java` crashes always. To me it seems that the issue is unrelated to virtual threads. It has to be caused by relativization of `top_frame_sp` Unfortunately I cannot look into it since I'm actually on vacation. @TheRealMDoerr might want to look into it. He's coming back from vacation in a couple of days. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14545#issuecomment-1657838880 From clanger at openjdk.org Mon Jul 31 07:45:59 2023 From: clanger at openjdk.org (Christoph Langer) Date: Mon, 31 Jul 2023 07:45:59 GMT Subject: Integrated: 8313316: Disable runtime/ErrorHandling/MachCodeFramesInErrorFile.java on ppc64le In-Reply-To: References: Message-ID: On Fri, 28 Jul 2023 10:42:10 GMT, Christoph Langer wrote: > Exclude the test on ppc64le due to failure. This pull request has now been integrated. Changeset: 807ca2d3 Author: Christoph Langer URL: https://git.openjdk.org/jdk/commit/807ca2d3a1d498f8d51a33b062a003c96344d9b7 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod 8313316: Disable runtime/ErrorHandling/MachCodeFramesInErrorFile.java on ppc64le Reviewed-by: mbaesken ------------- PR: https://git.openjdk.org/jdk/pull/15068 From duke at openjdk.org Mon Jul 31 08:04:25 2023 From: duke at openjdk.org (sid8606) Date: Mon, 31 Jul 2023 08:04:25 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v7] In-Reply-To: References: Message-ID: > Implementation of "Foreign Function & Memory API" for s390x (Big Endian). 
sid8606 has updated the pull request incrementally with one additional commit since the last revision: Restructure comment sentence ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14801/files - new: https://git.openjdk.org/jdk/pull/14801/files/cc2292dd..12d1a397 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=05-06 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14801.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14801/head:pull/14801 PR: https://git.openjdk.org/jdk/pull/14801 From duke at openjdk.org Mon Jul 31 08:04:30 2023 From: duke at openjdk.org (sid8606) Date: Mon, 31 Jul 2023 08:04:30 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v6] In-Reply-To: References: Message-ID: On Fri, 28 Jul 2023 12:36:13 GMT, Andrew Haley wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Preserve and restore register Z_R6 >> >> Though Z_R6 is argument register it is a saved register >> so preserve and restore Z_R6 register > > src/hotspot/cpu/s390/upcallLinker_s390.cpp line 72: > >> 70: // Z_SP saved/restored by prologue/epilogue >> 71: if (reg == Z_SP) continue; >> 72: // though Z_R6 is argument register it is a saved register > > Suggestion: > ` // although Z_R6 is used for parameter passing, it must be saved and restored by a called function.` Thanks for the review. Fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1278922263 From jvernee at openjdk.org Mon Jul 31 08:12:01 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Mon, 31 Jul 2023 08:12:01 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v6] In-Reply-To: References: Message-ID: On Fri, 28 Jul 2023 03:59:26 GMT, sid8606 wrote: >> Implementation of "Foreign Function & Memory API" for s390x (Big Endian). > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Preserve and restore register Z_R6 > > Though Z_R6 is argument register it is a saved register > so preserve and restore Z_R6 register src/hotspot/cpu/s390/upcallLinker_s390.cpp line 45: > 43: if (reg == Z_SP) continue; > 44: // though Z_R6 is argument register it is a saved register > 45: if (!abi.is_volatile_reg(reg) || reg == Z_R6) { So, is the prior assumption that all argument registers are also volatile incorrect? If so, I think it would be better to change `ABIDescriptor::is_volatile_reg` ([1]) to only look at the `_integer_additional_volatile_registers` list (maybe rename it too), and then list all the volatile regs in LinuxS390CallArranger explicitly, excluding R6 ([2]). That way all the information about which regs are volatile is in one place (LinuxS390CallArranger). 
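A minimal C++ sketch of the shape suggested above, where volatility is an explicit list and being an argument register implies nothing about it. All names here (`AbiDescriptorSketch`, `s390_sketch`, `regs_to_preserve`) are hypothetical, and the register sets are illustrative assumptions rather than values taken from the patch; the s390x ELF ABI supplement is the authoritative source for which registers are call-clobbered.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical descriptor: argument registers and volatile registers are
// two independent lists, so R6 can be in the first without being in the second.
struct AbiDescriptorSketch {
  std::vector<int> argument_regs;
  std::vector<int> volatile_regs;

  // Only the explicit volatile list is consulted.
  bool is_volatile_reg(int reg) const {
    return std::find(volatile_regs.begin(), volatile_regs.end(), reg) !=
           volatile_regs.end();
  }
};

constexpr int kStackPointer = 15;  // r15, handled by prologue/epilogue

// Illustrative register sets (assumed, not copied from the patch):
// r2..r6 carry arguments; r0..r5 and r14 are treated as call-clobbered.
inline AbiDescriptorSketch s390_sketch() {
  return AbiDescriptorSketch{{2, 3, 4, 5, 6}, {0, 1, 2, 3, 4, 5, 14}};
}

// Registers an upcall stub would save and restore: every non-volatile
// general-purpose register except the stack pointer. No special case for R6.
inline std::vector<int> regs_to_preserve(const AbiDescriptorSketch& abi) {
  std::vector<int> saved;
  for (int reg = 0; reg < 16; ++reg) {
    if (reg == kStackPointer) continue;
    if (!abi.is_volatile_reg(reg)) saved.push_back(reg);
  }
  return saved;
}
```

The point of this layout is that the `|| reg == Z_R6` special case disappears from the stub code, because the descriptor alone decides what is preserved.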
[1]: https://github.com/openjdk/jdk/pull/14801/files#diff-7096e1975de20baa3219d616506f26ba2b4500bf5ad28c331e3d6049a32a461eR39-R42 [2]: https://github.com/openjdk/jdk/pull/14801/files#diff-09da9016992b04bab73b6cc6aad8ca719e86c90c398af4e000583c1f8220a99bR73 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1278932461 From eosterlund at openjdk.org Mon Jul 31 08:32:52 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 31 Jul 2023 08:32:52 GMT Subject: RFR: 8310239: Add missing cross modifying fence in nmethod entry barriers [v2] In-Reply-To: References: Message-ID: On Tue, 20 Jun 2023 08:26:08 GMT, Erik Österlund wrote: >> In fact, there is a current race in the nmethod entry barriers, where what we are doing violates the AMD APM (cf. APM volume 2 section 7.6.1 https://www.amd.com/system/files/TechDocs/24593.pdf). >> In particular, if the compare instruction of the nmethod entry barrier is not yet patched and we call a slow path on thread 1, then before taking the nmethod entry lock, another thread 2 could fix and disarm the nmethod. Then thread 1 will observe *data* suggesting the nmethod has been patched, but never re-executes the patched compare (which might indeed still be stale), hence not qualifying for asynchronous cross modifying code, and neither do we run a serializing cpuid instruction, qualifying for synchronous cross modifying code. In this scenario, we can indeed start executing the nmethod instructions, while observing inconsistent concurrent patching effects, where some instructions will be updated and some not. >> >> The following patch ensures that x86 nmethod entry barriers execute cross modifying fence after calling into the VM, where another thread could have disarmed the nmethod. I also ensured the other platforms perform their fencing after the VM call, instead of before - including a cross_modify_fence in the shared code for OSR nmethod entries.
While fencing before will flush out the instruction pipeline, and it shouldn't be populated with problematic instructions until after we start executing the nmethod again, it feels unnecessary to fence on the wrong side of the modifications it wishes to guard, and hence not strictly following the synchronous cross modifying fence recipe. >> >> I'm currently running tier1-5 and running performance testing in aurora. In the interest of time, I'm opening this PR before getting the final result, and will report the results when they come in. > > Erik Österlund has updated the pull request incrementally with one additional commit since the last revision: > > Typo in comment So I survived the deer. Any takers on this one? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14543#issuecomment-1657912007 From aph at openjdk.org Mon Jul 31 08:39:54 2023 From: aph at openjdk.org (Andrew Haley) Date: Mon, 31 Jul 2023 08:39:54 GMT Subject: RFR: 8308984: Relativize last_sp (and top_frame_sp) in interpreter frames In-Reply-To: References: Message-ID: On Mon, 19 Jun 2023 15:59:15 GMT, Fredrik Bredberg wrote: > Implementation of relativized last_sp (top_frame_sp on PowerPC) in interpreter frames for x64, aarch64, ppc64le and riscv. > Not relativized last_sp on arm, zero and s390 but done some changes to cope with the changed generic code. > > By changing the "last_sp" member in interpreter frames from being an absolute address into an offset that is relative to the frame pointer, we don't need to change the value as we freeze and thaw frames of virtual threads. This is since we might freeze and thaw from and to different worker threads, so the absolute address to locals might change, but the offset from the frame pointer will be constant. > > This subtask only handles "last_sp" (and its close equivalent "top_frame_sp" on PowerPC). The relativization of other interpreter frame members are handled in other subtasks to JDK-8289296.
> > Tested tier1-tier7 on supported platforms. The rest was sanity tested using Qemu. src/hotspot/cpu/aarch64/templateInterpreterGenerator_aarch64.cpp line 469: > 467: // Restore stack bottom in case i2c adjusted stack > 468: __ ldr(rscratch1, Address(rfp, frame::interpreter_frame_last_sp_offset * wordSize)); > 469: __ lea(esp, Address(rfp, rscratch1, Address::lsl(3))); Suggestion: __ lea(esp, Address(rfp, rscratch1, Address::lsl(Interpreter::logStackElementSize))); I'm not sure that `logStackElementSize` buys us anything, but we might as well be consistent in this patch. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14545#discussion_r1278964118 From aph at openjdk.org Mon Jul 31 08:46:53 2023 From: aph at openjdk.org (Andrew Haley) Date: Mon, 31 Jul 2023 08:46:53 GMT Subject: RFR: 8308984: Relativize last_sp (and top_frame_sp) in interpreter frames In-Reply-To: References: Message-ID: On Mon, 19 Jun 2023 15:59:15 GMT, Fredrik Bredberg wrote: > Implementation of relativized last_sp (top_frame_sp on PowerPC) in interpreter frames for x64, aarch64, ppc64le and riscv. > Not relativized last_sp on arm, zero and s390 but done some changes to cope with the changed generic code. > > By changing the "last_sp" member in interpreter frames from being an absolute address into an offset that is relative to the frame pointer, we don't need to change the value as we freeze and thaw frames of virtual threads. This is since we might freeze and thaw from and to different worker threads, so the absolute address to locals might change, but the offset from the frame pointer will be constant. > > This subtask only handles "last_sp" (and its close equivalent "top_frame_sp" on PowerPC). The relativization of other interpreter frame members are handled in other subtasks to JDK-8289296. > > Tested tier1-tier7 on supported platforms. The rest was sanity tested using Qemu. 
I think you need to correct `frame::describe` to make it print the relativized SP in an easy-to-understand form. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14545#issuecomment-1657932001 From duke at openjdk.org Mon Jul 31 09:54:53 2023 From: duke at openjdk.org (sid8606) Date: Mon, 31 Jul 2023 09:54:53 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v6] In-Reply-To: References: Message-ID: On Mon, 31 Jul 2023 08:08:37 GMT, Jorn Vernee wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Preserve and restore register Z_R6 >> >> Though Z_R6 is argument register it is a saved register >> so preserve and restore Z_R6 register > > src/hotspot/cpu/s390/upcallLinker_s390.cpp line 45: > >> 43: if (reg == Z_SP) continue; >> 44: // though Z_R6 is argument register it is a saved register >> 45: if (!abi.is_volatile_reg(reg) || reg == Z_R6) { > > So, is the prior assumption that all argument registers are also volatile incorrect? > > If so, I think it would be better to change `ABIDescriptor::is_volatile_reg` ([1]) to only look at the `_integer_additional_volatile_registers` list (maybe rename it too), and then list all the volatile regs in LinuxS390CallArranger explicitly, excluding R6 ([2]). That way all the information about which regs are volatile is in one place (LinuxS390CallArranger). > > [1]: https://github.com/openjdk/jdk/pull/14801/files#diff-7096e1975de20baa3219d616506f26ba2b4500bf5ad28c331e3d6049a32a461eR39-R42 > [2]: https://github.com/openjdk/jdk/pull/14801/files#diff-09da9016992b04bab73b6cc6aad8ca719e86c90c398af4e000583c1f8220a99bR73 Yes, the s390x ABI says R6 is an argument register as well as a non-volatile register. That makes sense; I'll make the suggested changes.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1279064461 From clanger at openjdk.org Mon Jul 31 10:50:13 2023 From: clanger at openjdk.org (Christoph Langer) Date: Mon, 31 Jul 2023 10:50:13 GMT Subject: [jdk21] RFR: 8313316: Disable runtime/ErrorHandling/MachCodeFramesInErrorFile.java on ppc64le Message-ID: Hi all, This pull request contains a backport of [JDK-8313316](https://bugs.openjdk.org/browse/JDK-8313316), commit [807ca2d3](https://github.com/openjdk/jdk/commit/807ca2d3a1d498f8d51a33b062a003c96344d9b7) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Christoph Langer on 31 Jul 2023 and was reviewed by Matthias Baesken. Thanks! ------------- Commit messages: - Backport 807ca2d3a1d498f8d51a33b062a003c96344d9b7 Changes: https://git.openjdk.org/jdk21/pull/152/files Webrev: https://webrevs.openjdk.org/?repo=jdk21&pr=152&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8313316 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk21/pull/152.diff Fetch: git fetch https://git.openjdk.org/jdk21.git pull/152/head:pull/152 PR: https://git.openjdk.org/jdk21/pull/152 From mbaesken at openjdk.org Mon Jul 31 10:50:13 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 31 Jul 2023 10:50:13 GMT Subject: [jdk21] RFR: 8313316: Disable runtime/ErrorHandling/MachCodeFramesInErrorFile.java on ppc64le In-Reply-To: References: Message-ID: On Mon, 31 Jul 2023 10:38:44 GMT, Christoph Langer wrote: > Hi all, > > This pull request contains a backport of [JDK-8313316](https://bugs.openjdk.org/browse/JDK-8313316), commit [807ca2d3](https://github.com/openjdk/jdk/commit/807ca2d3a1d498f8d51a33b062a003c96344d9b7) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Christoph Langer on 31 Jul 2023 and was reviewed by Matthias Baesken. > > Thanks! Marked as reviewed by mbaesken (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk21/pull/152#pullrequestreview-1554482320 From shade at openjdk.org Mon Jul 31 13:23:15 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 31 Jul 2023 13:23:15 GMT Subject: RFR: 8313202: MutexLocker should disallow null Mutexes [v3] In-Reply-To: References: Message-ID: > As seen in [JDK-8313081](https://bugs.openjdk.org/browse/JDK-8313081), it is fairly easy to pass nullptr `Mutex` to `MutexLocker` by accident, which would just silently avoid the lock. > > There are a few places in Hotspot where we pass `nullptr` to simulate re-entrancy and/or conditionally take the lock. Those places can be more explicit, and the default `MutexLocker` can disallow nullptrs for extra safety. > > Open for some bikeshedding on the names of the new `MutexLockers`. Particularly `ReentrantMutexLocker` might lull readers into believing it does safepoint checks on re-entrant "lock", which it actually does not do. > > More thorough testing with different GC/JIT combinations is running now, we might find more issues there. Meanwhile, please comment on the approach. > > Additional testing: > - [x] `grep -R "MutexLocker " src/hotspot | grep -i null`, no hits > - [x] `grep -R "MutexLocker " src/hotspot | grep -i ?`, no hits > - [x] Linux AArch64 fastdebug, `tier1 tier2 tier3` (re-run in progress) Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: - Accept one more potentially nullptr mutex - Merge branch 'master' into JDK-8313202-mutexlocker-nulls - Replace ReentrantMutexLocker with ConditionalMutexLocker - Workaround for JDK-8313210 - Fixing CodeCache analytics - Initial work ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15043/files - new: https://git.openjdk.org/jdk/pull/15043/files/4b140819..770c95af Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15043&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15043&range=01-02 Stats: 6253 lines in 270 files changed: 3612 ins; 1039 del; 1602 mod Patch: https://git.openjdk.org/jdk/pull/15043.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15043/head:pull/15043 PR: https://git.openjdk.org/jdk/pull/15043 From tonyp at openjdk.org Mon Jul 31 13:56:56 2023 From: tonyp at openjdk.org (Antonios Printezis) Date: Mon, 31 Jul 2023 13:56:56 GMT Subject: RFR: 8313322: RISC-V: implement MD5 intrinsic Message-ID: What the title says. I started with the aarch64 version but changed it quite heavily. I haven't done anything with the macro assembler before, so detailed / picky feedback is very welcome! 
------------- Commit messages: - 8313322: RISC-V: implement MD5 intrinsic Changes: https://git.openjdk.org/jdk/pull/15090/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15090&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8313322 Stats: 385 lines in 3 files changed: 381 ins; 1 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/15090.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15090/head:pull/15090 PR: https://git.openjdk.org/jdk/pull/15090 From duke at openjdk.org Mon Jul 31 14:04:23 2023 From: duke at openjdk.org (sid8606) Date: Mon, 31 Jul 2023 14:04:23 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v8] In-Reply-To: References: Message-ID: <-1Z9zN0Ljsm8JremgZrieIxTqr36QN2WrsBeDLelJpo=.4015c349-9a5a-45e6-9f71-76737e2154d7@github.com> > Implementation of "Foreign Function & Memory API" for s390x (Big Endian). sid8606 has updated the pull request incrementally with one additional commit since the last revision: List all the volatile regs in LinuxS390CallArranger explicitly Signed-off-by: Sidraya ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14801/files - new: https://git.openjdk.org/jdk/pull/14801/files/12d1a397..fd2c6701 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=06-07 Stats: 11 lines in 4 files changed: 0 ins; 4 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/14801.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14801/head:pull/14801 PR: https://git.openjdk.org/jdk/pull/14801 From duke at openjdk.org Mon Jul 31 14:12:00 2023 From: duke at openjdk.org (sid8606) Date: Mon, 31 Jul 2023 14:12:00 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v6] In-Reply-To: References: Message-ID: On Mon, 31 Jul 2023 09:51:44 GMT, sid8606 wrote: >> src/hotspot/cpu/s390/upcallLinker_s390.cpp line 45: >> >>> 43: if (reg == Z_SP) continue; >>> 44: 
// though Z_R6 is argument register it is a saved register >>> 45: if (!abi.is_volatile_reg(reg) || reg == Z_R6) { >> >> So, is the prior assumption that all argument registers are also volatile incorrect? >> >> If so, I think it would be better to change `ABIDescriptor::is_volatile_reg` ([1]) to only look at the `_integer_additional_volatile_registers` list (maybe rename it too), and then list all the volatile regs in LinuxS390CallArranger explicitly, excluding R6 ([2]). That way all the information about which regs are volatile is in one place (LinuxS390CallArranger). >> >> [1]: https://github.com/openjdk/jdk/pull/14801/files#diff-7096e1975de20baa3219d616506f26ba2b4500bf5ad28c331e3d6049a32a461eR39-R42 >> [2]: https://github.com/openjdk/jdk/pull/14801/files#diff-09da9016992b04bab73b6cc6aad8ca719e86c90c398af4e000583c1f8220a99bR73 > > Yes, s390x ABI says R6 is an argument register as we as non-volatile register. > It make sense, I'll make suggested changes. Made the changes in new commit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1279354543 From amitkumar at openjdk.org Mon Jul 31 14:18:09 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 31 Jul 2023 14:18:09 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v8] In-Reply-To: <-1Z9zN0Ljsm8JremgZrieIxTqr36QN2WrsBeDLelJpo=.4015c349-9a5a-45e6-9f71-76737e2154d7@github.com> References: <-1Z9zN0Ljsm8JremgZrieIxTqr36QN2WrsBeDLelJpo=.4015c349-9a5a-45e6-9f71-76737e2154d7@github.com> Message-ID: On Mon, 31 Jul 2023 14:04:23 GMT, sid8606 wrote: >> Implementation of "Foreign Function & Memory API" for s390x (Big Endian). > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > List all the volatile regs in LinuxS390CallArranger explicitly > > Signed-off-by: Sidraya Some nits :-) ------------- Marked as reviewed by amitkumar (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/14801#pullrequestreview-1527808341 From amitkumar at openjdk.org Mon Jul 31 14:18:12 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 31 Jul 2023 14:18:12 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Sat, 8 Jul 2023 10:48:15 GMT, sid8606 wrote: >> Implementation of "Foreign Function & Memory API" for s390x (Big Endian). > > sid8606 has updated the pull request incrementally with one additional commit since the last revision: > > Address suggestions from Jorn Vernee src/hotspot/cpu/s390/frame_s390.cpp line 228: > 226: > 227: bool frame::upcall_stub_frame_is_first() const { > 228: assert(is_upcall_stub_frame(), "must be optimzed entry frame"); typo: s/optimzed/optimized src/java.base/share/classes/jdk/internal/foreign/abi/s390/linux/LinuxS390CallArranger.java line 70: > 68: private static final ABIDescriptor CLinux = abiFor( > 69: new VMStorage[] { r2, r3, r4, r5, r6, }, // GP input > 70: new VMStorage[] { f0, f2, f4, f6 }, // FP intput typo: s/intput/input src/java.base/share/classes/jdk/internal/foreign/abi/s390/linux/LinuxS390CallArranger.java line 205: > 203: } > 204: > 205: // Compute recipe for transfering arguments / return values to C from Java. typo: s/transfering/transferring src/java.base/share/classes/jdk/internal/foreign/abi/s390/linux/LinuxS390CallArranger.java line 257: > 255: } > 256: > 257: // Compute recipe for transfering arguments / return values from C to Java. 
typo: s/transfering/transferring ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1262149143 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1262152073 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1262153723 PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1262154919 From duke at openjdk.org Mon Jul 31 14:33:19 2023 From: duke at openjdk.org (sid8606) Date: Mon, 31 Jul 2023 14:33:19 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v9] In-Reply-To: References: Message-ID: > Implementation of "Foreign Function & Memory API" for s390x (Big Endian). sid8606 has updated the pull request incrementally with one additional commit since the last revision: Fix typos Signed-off-by: Sidraya ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14801/files - new: https://git.openjdk.org/jdk/pull/14801/files/fd2c6701..b81d5bb7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14801&range=07-08 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14801.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14801/head:pull/14801 PR: https://git.openjdk.org/jdk/pull/14801 From amitkumar at openjdk.org Mon Jul 31 14:33:20 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Mon, 31 Jul 2023 14:33:20 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 07:41:11 GMT, Amit Kumar wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Address suggestions from Jorn Vernee > > src/hotspot/cpu/s390/frame_s390.cpp line 228: > >> 226: >> 227: bool frame::upcall_stub_frame_is_first() const { >> 228: assert(is_upcall_stub_frame(), "must be optimzed entry frame"); 
> > typo: s/optimzed/optimized @sid8606 you left this one. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1279382122 From duke at openjdk.org Mon Jul 31 14:33:20 2023 From: duke at openjdk.org (sid8606) Date: Mon, 31 Jul 2023 14:33:20 GMT Subject: RFR: 8311630: [s390] Implementation of Foreign Function & Memory API (Preview) [v3] In-Reply-To: References: Message-ID: On Thu, 13 Jul 2023 07:46:00 GMT, Amit Kumar wrote: >> sid8606 has updated the pull request incrementally with one additional commit since the last revision: >> >> Address suggestions from Jorn Vernee > > src/java.base/share/classes/jdk/internal/foreign/abi/s390/linux/LinuxS390CallArranger.java line 257: > >> 255: } >> 256: >> 257: // Compute recipe for transfering arguments / return values from C to Java. > > typo: s/transfering/transferring Fixed, Thanks @offamitkumar for pointing typos. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14801#discussion_r1279378873 From luhenry at openjdk.org Mon Jul 31 14:58:55 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Mon, 31 Jul 2023 14:58:55 GMT Subject: RFR: 8313322: RISC-V: implement MD5 intrinsic In-Reply-To: References: Message-ID: On Mon, 31 Jul 2023 13:50:10 GMT, Antonios Printezis wrote: > What the title says. I started with the aarch64 version but changed it quite heavily. > > I haven't done anything with the macro assembler before, so detailed / picky feedback is very welcome! Could you also indicate which test suite you've run to validate the change? src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3917: > 3915: > 3916: // Set of L registers that correspond to a contiguous memory area. > 3917: // Each 64-byte register typically corresponds to 2 32-byte integers. `64-byte` -> `64bits`, same for `32-byte`. src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3940: > 3938: } > 3939: > 3940: // Generate code extracting i-th unsigned word (4 bytes) from cached 64 bytes. 
`64 bytes` -> `64 bits` src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3954: > 3952: typedef RegCache<8> BufRegCache; > 3953: > 3954: void rotate_left_32(Register rd, Register rs, uint bits, Register rtmp1, Register rtmp2) { That could be in `macroAssembler_riscv.hpp` ------------- PR Review: https://git.openjdk.org/jdk/pull/15090#pullrequestreview-1554988952 PR Review Comment: https://git.openjdk.org/jdk/pull/15090#discussion_r1279426146 PR Review Comment: https://git.openjdk.org/jdk/pull/15090#discussion_r1279427048 PR Review Comment: https://git.openjdk.org/jdk/pull/15090#discussion_r1279427681 From matsaave at openjdk.org Mon Jul 31 15:05:55 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 31 Jul 2023 15:05:55 GMT Subject: RFR: 8307312: Replace "int which" with "int cp_index" in constantPool In-Reply-To: References: Message-ID: On Sun, 30 Jul 2023 22:12:42 GMT, David Holmes wrote: >> Many accessors in the constant pool take an argument "int which" that is meant to represent an ambiguous index. Despite this, several methods in the API use "int which" when the argument is exclusively a constant pool index. This patch aims to rename all of these instances to "int cp_index" in order to be more clear and to distinguish methods that take constant pool indices and methods that use derived indices. >> >> The callers have been updated to use more clear naming as well. Verified with tier 1-5 tests. > > src/hotspot/share/oops/constantPool.cpp line 456: > >> 454: } >> 455: >> 456: void ConstantPool::string_at_put(int obj_index, oop str) { > > Should `obj_index` be `cp_index` here? In this case `obj_index` refers to an index into the resolved references array. The argument `which` was supposed to refer to a constant pool index but it was not actually used. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15027#discussion_r1279441912 From tonyp at openjdk.org Mon Jul 31 15:12:53 2023 From: tonyp at openjdk.org (Antonios Printezis) Date: Mon, 31 Jul 2023 15:12:53 GMT Subject: RFR: 8313322: RISC-V: implement MD5 intrinsic In-Reply-To: References: Message-ID: On Mon, 31 Jul 2023 14:52:30 GMT, Ludovic Henry wrote: >> What the title says. I started with the aarch64 version but changed it quite heavily. >> >> I haven't done anything with the macro assembler before, so detailed / picky feedback is very welcome! > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3917: > >> 3915: >> 3916: // Set of L registers that correspond to a contiguous memory area. >> 3917: // Each 64-byte register typically corresponds to 2 32-byte integers. > > `64-byte` -> `64bits`, same for `32-byte`. 64-byte register! Will fix. > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3940: > >> 3938: } >> 3939: >> 3940: // Generate code extracting i-th unsigned word (4 bytes) from cached 64 bytes. > > `64 bytes` -> `64 bits` Actually, this was (almost) correct. The original class from aarch64 cached 8 64-bit values, so 64 bytes. But I generalized it, so I'll rephrase it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15090#discussion_r1279447499 PR Review Comment: https://git.openjdk.org/jdk/pull/15090#discussion_r1279451218 From tonyp at openjdk.org Mon Jul 31 15:27:03 2023 From: tonyp at openjdk.org (Antonios Printezis) Date: Mon, 31 Jul 2023 15:27:03 GMT Subject: RFR: 8313322: RISC-V: implement MD5 intrinsic In-Reply-To: References: Message-ID: On Mon, 31 Jul 2023 13:50:10 GMT, Antonios Printezis wrote: > What the title says. I started with the aarch64 version but changed it quite heavily. > > I haven't done anything with the macro assembler before, so detailed / picky feedback is very welcome! I mainly used `jtreg:test/jdk/java/security/MessageDigest` to test correctness. 
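For readers following the `RegCache` and `rotate_left_32` comments above, here is a minimal sketch in plain C++ (not HotSpot macro assembler code) of the two ideas under discussion: eight 64-bit values together hold 64 bytes, i.e. sixteen 32-bit words, and a 32-bit left rotation by a constant amount. The names `BufCache`, `extract_u32` and `rotl32` are made up for illustration, and the low-half-first word order is an assumption (it corresponds to a little-endian 64-bit load).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Eight 64-bit "registers" caching one 64-byte block (sixteen 32-bit words).
using BufCache = std::array<uint64_t, 8>;

// Return the i-th 32-bit word of the cached block.
// Assumption: word 2k sits in the low half of register k (little-endian load).
inline uint32_t extract_u32(const BufCache& regs, std::size_t i) {
  assert(i < 2 * regs.size());
  uint64_t r = regs[i / 2];
  return static_cast<uint32_t>((i % 2 == 0) ? r : (r >> 32));
}

// Rotate a 32-bit value left by a constant amount, 0 < bits < 32.
inline uint32_t rotl32(uint32_t x, unsigned bits) {
  assert(bits > 0 && bits < 32);
  return (x << bits) | (x >> (32 - bits));
}
```

The sketch only shows why "64 bits" is right for a single register while "64 bytes" is right for the whole cache of eight; how the generated RISC-V code performs the shifts is up to the patch itself.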
------------- PR Comment: https://git.openjdk.org/jdk/pull/15090#issuecomment-1658593564 From tonyp at openjdk.org Mon Jul 31 15:27:05 2023 From: tonyp at openjdk.org (Antonios Printezis) Date: Mon, 31 Jul 2023 15:27:05 GMT Subject: RFR: 8313322: RISC-V: implement MD5 intrinsic In-Reply-To: References: Message-ID: On Mon, 31 Jul 2023 14:55:29 GMT, Ludovic Henry wrote: > Could you also indicate which test suite you've run to validate the change? Added a comment for that. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15090#issuecomment-1658594547 From duke at openjdk.org Mon Jul 31 15:30:55 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Mon, 31 Jul 2023 15:30:55 GMT Subject: RFR: 8312623: SA add NestHost and NestMembers attributes when dumping class In-Reply-To: References: Message-ID: On Thu, 27 Jul 2023 22:39:03 GMT, David Holmes wrote: >> @dholmes-ora sorry for responding late. I got sidetracked by some other work. >> >>> We need to be sure this works as expected for top-level classes that have no nest members, and deeply nested nest members, plus dynamically injected hidden classes that are nest members. >> >> I am not sure I understand this concern. We are getting nest-host and nest-members from the InstanceKlass. As long as this information is recorded in InstanceKlass, it would work. Can you please elaborate your concern about the cases you feel may not work. >> >>> I'm unclear if this is intended to only expose the same details as would be statically defined in the attribute in the classfile? >> >> It is to expose the details as the JVM sees, which may be different from what is statically defined in the classfile if agents are involved. > > @ashu-mehra you indicated that you had only done two basic manual tests to check the output. You need to check it for the cases that I flagged too. 
In the VM every top-level class is its own nest-host, but that is not expressed in a classfile attribute (it is just the defined semantics) so displaying this as-if it were an explicit attribute may not be right. @dholmes-ora I confirmed there is no nest-host or nest-members attributes generated by this patch for a top level class which doesn't have any nest-members. Is that what you wanted to verify? ------------- PR Comment: https://git.openjdk.org/jdk/pull/15005#issuecomment-1658599841 From tonyp at openjdk.org Mon Jul 31 15:37:13 2023 From: tonyp at openjdk.org (Antonios Printezis) Date: Mon, 31 Jul 2023 15:37:13 GMT Subject: RFR: 8313322: RISC-V: implement MD5 intrinsic In-Reply-To: References: Message-ID: On Mon, 31 Jul 2023 14:53:38 GMT, Ludovic Henry wrote: >> What the title says. I started with the aarch64 version but changed it quite heavily. >> >> I haven't done anything with the macro assembler before, so detailed / picky feedback is very welcome! > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3954: > >> 3952: typedef RegCache<8> BufRegCache; >> 3953: >> 3954: void rotate_left_32(Register rd, Register rs, uint bits, Register rtmp1, Register rtmp2) { > > That could be in `macroAssembler_riscv.hpp` I was thinking that maybe the 32-bit version might not be as helpful in the macro assembler. But, sure. There's already `ror_imm`. Do I call it `rol32_imm`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15090#discussion_r1279494067 From matsaave at openjdk.org Mon Jul 31 16:04:23 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 31 Jul 2023 16:04:23 GMT Subject: RFR: 8307312: Replace "int which" with "int cp_index" in constantPool [v2] In-Reply-To: References: Message-ID: > Many accessors in the constant pool take an argument "int which" that is meant to represent an ambiguous index. Despite this, several methods in the API use "int which" when the argument is exclusively a constant pool index. 
This patch aims to rename all of these instances to "int cp_index" in order to be more clear and to distinguish methods that take constant pool indices and methods that use derived indices. > > The callers have been updated to use more clear naming as well. Verified with tier 1-5 tests. Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: David comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15027/files - new: https://git.openjdk.org/jdk/pull/15027/files/eae4e38a..8c7aff22 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15027&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15027&range=00-01 Stats: 88 lines in 2 files changed: 0 ins; 0 del; 88 mod Patch: https://git.openjdk.org/jdk/pull/15027.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15027/head:pull/15027 PR: https://git.openjdk.org/jdk/pull/15027 From tonyp at openjdk.org Mon Jul 31 16:05:13 2023 From: tonyp at openjdk.org (Antonios Printezis) Date: Mon, 31 Jul 2023 16:05:13 GMT Subject: RFR: 8313322: RISC-V: implement MD5 intrinsic [v2] In-Reply-To: References: Message-ID: <6zXzQDEH7fxazbf7vwFL4AebesePv4uPofa62bcpQDU=.91c008b8-d743-4a08-a5e3-c89259756023@github.com> > What the title says. I started with the aarch64 version but changed it quite heavily. > > I haven't done anything with the macro assembler before, so detailed / picky feedback is very welcome! 
Antonios Printezis has updated the pull request incrementally with one additional commit since the last revision: changes based on code review feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15090/files - new: https://git.openjdk.org/jdk/pull/15090/files/c56bee39..1a840b39 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15090&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15090&range=00-01 Stats: 21 lines in 3 files changed: 12 ins; 6 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/15090.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15090/head:pull/15090 PR: https://git.openjdk.org/jdk/pull/15090 From coleenp at openjdk.org Mon Jul 31 16:17:51 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 31 Jul 2023 16:17:51 GMT Subject: RFR: 8307312: Replace "int which" with "int cp_index" in constantPool [v2] In-Reply-To: References: Message-ID: On Mon, 31 Jul 2023 16:04:23 GMT, Matias Saavedra Silva wrote: >> Many accessors in the constant pool take an argument "int which" that is meant to represent an ambiguous index. Despite this, several methods in the API use "int which" when the argument is exclusively a constant pool index. This patch aims to rename all of these instances to "int cp_index" in order to be more clear and to distinguish methods that take constant pool indices and methods that use derived indices. >> >> The callers have been updated to use more clear naming as well. Verified with tier 1-5 tests. > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > David comments I hate to be disagreeable but the change to cp_index in resolve_constant_at_impl was good because cache_index is also a parameter in that function. Knowing which index is very helpful there. Can you change back just that one? 
------------- PR Review: https://git.openjdk.org/jdk/pull/15027#pullrequestreview-1555249099 From matsaave at openjdk.org Mon Jul 31 16:46:19 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 31 Jul 2023 16:46:19 GMT Subject: RFR: 8307312: Replace "int which" with "int cp_index" in constantPool [v3] In-Reply-To: References: Message-ID: > Many accessors in the constant pool take an argument "int which" that is meant to represent an ambiguous index. Despite this, several methods in the API use "int which" when the argument is exclusively a constant pool index. This patch aims to rename all of these instances to "int cp_index" in order to be more clear and to distinguish methods that take constant pool indices and methods that use derived indices. > > The callers have been updated to use more clear naming as well. Verified with tier 1-5 tests. Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Reverted resolved_constant_at ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15027/files - new: https://git.openjdk.org/jdk/pull/15027/files/8c7aff22..57939fe9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15027&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15027&range=01-02 Stats: 34 lines in 2 files changed: 0 ins; 0 del; 34 mod Patch: https://git.openjdk.org/jdk/pull/15027.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15027/head:pull/15027 PR: https://git.openjdk.org/jdk/pull/15027 From coleenp at openjdk.org Mon Jul 31 17:13:49 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 31 Jul 2023 17:13:49 GMT Subject: RFR: 8307312: Replace "int which" with "int cp_index" in constantPool [v3] In-Reply-To: References: Message-ID: On Mon, 31 Jul 2023 16:46:19 GMT, Matias Saavedra Silva wrote: >> Many accessors in the constant pool take an argument "int which" that is meant to represent an ambiguous index. 
Despite this, several methods in the API use "int which" when the argument is exclusively a constant pool index. This patch aims to rename all of these instances to "int cp_index" in order to be more clear and to distinguish methods that take constant pool indices and methods that use derived indices. >> >> The callers have been updated to use more clear naming as well. Verified with tier 1-5 tests. > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Reverted resolved_constant_at Thanks for reverting resolve_constant_at_impl. ------------- Marked as reviewed by coleenp (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15027#pullrequestreview-1555386484 From iklam at openjdk.org Mon Jul 31 17:13:50 2023 From: iklam at openjdk.org (Ioi Lam) Date: Mon, 31 Jul 2023 17:13:50 GMT Subject: RFR: 8307312: Replace "int which" with "int cp_index" in constantPool [v3] In-Reply-To: References: Message-ID: On Mon, 31 Jul 2023 16:46:19 GMT, Matias Saavedra Silva wrote: >> Many accessors in the constant pool take an argument "int which" that is meant to represent an ambiguous index. Despite this, several methods in the API use "int which" when the argument is exclusively a constant pool index. This patch aims to rename all of these instances to "int cp_index" in order to be more clear and to distinguish methods that take constant pool indices and methods that use derived indices. >> >> The callers have been updated to use more clear naming as well. Verified with tier 1-5 tests. > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Reverted resolved_constant_at New update looks good. ------------- Marked as reviewed by iklam (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/15027#pullrequestreview-1555387824 From matsaave at openjdk.org Mon Jul 31 18:44:11 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 31 Jul 2023 18:44:11 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v12] In-Reply-To: References: Message-ID: On Tue, 25 Jul 2023 15:49:10 GMT, Matias Saavedra Silva wrote: >> The current structure used to store the resolution information for fields, ConstantPoolCacheEntry, is difficult to interpret due to its ambiguous fields f1 and f2. This structure can hold information for fields and methods and each of its fields can hold different types of values depending on the entry type. >> >> This enhancement introduces a new data structure that stores the necessary resolution data in an intuitive and extensible manner. These resolved entries are stored in an array inside the constant pool cache in a very similar manner to invokedynamic entries in JDK-8301995. >> >> Instances of ConstantPoolCache entry related to field resolution have been replaced with the new ResolvedFieldEntry. Verified with tier 1-9 tests. >> >> This change supports the following platforms: x86, aarch64, PPC, and RISCV > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Fix tos_state Thank you for the PPC port! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14129#issuecomment-1658943586 From matsaave at openjdk.org Mon Jul 31 18:44:12 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 31 Jul 2023 18:44:12 GMT Subject: RFR: 8301996: Move field resolution information out of the cpCache [v7] In-Reply-To: References: Message-ID: <7Vo_DX2DdY7ft3xOqfPnBYHL3DUTrzyaLl0nHK80xrs=.472d6480-312b-4478-a2c4-92b54bbc667d@github.com> On Tue, 11 Jul 2023 19:31:25 GMT, Frederic Parain wrote: >> Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: >> >> Interpreter fix and cleanup > > Thank you for addressing the issues that were spotted. > Looks good to me now. Thank you for the excellent reviews, comments, and suggestions @fparain, @coleenp, @offamitkumar, and @dholmes-ora! Once again, thank you to @DingliZhang, @RealFYang, and @TheRealMDoerr for contributing the ports for this change. GitHub Actions and tier tests show no issues so I believe this change is safe to integrate. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14129#issuecomment-1658947099 From matsaave at openjdk.org Mon Jul 31 18:44:14 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 31 Jul 2023 18:44:14 GMT Subject: Integrated: 8301996: Move field resolution information out of the cpCache In-Reply-To: References: Message-ID: On Wed, 24 May 2023 16:55:47 GMT, Matias Saavedra Silva wrote: > The current structure used to store the resolution information for fields, ConstantPoolCacheEntry, is difficult to interpret due to its ambiguous fields f1 and f2. This structure can hold information for fields and methods and each of its fields can hold different types of values depending on the entry type. > > This enhancement introduces a new data structure that stores the necessary resolution data in an intuitive and extensible manner. 
These resolved entries are stored in an array inside the constant pool cache in a very similar manner to invokedynamic entries in JDK-8301995. > > Instances of ConstantPoolCache entry related to field resolution have been replaced with the new ResolvedFieldEntry. Verified with tier 1-9 tests. > > This change supports the following platforms: x86, aarch64, PPC, and RISCV This pull request has now been integrated. Changeset: 86783b98 Author: Matias Saavedra Silva URL: https://git.openjdk.org/jdk/commit/86783b985175de3a0c02215a862b2a2749d8b408 Stats: 1454 lines in 45 files changed: 908 ins; 171 del; 375 mod 8301996: Move field resolution information out of the cpCache Co-authored-by: Gui Cao Co-authored-by: Dingli Zhang Co-authored-by: Martin Doerr Reviewed-by: coleenp, fparain ------------- PR: https://git.openjdk.org/jdk/pull/14129 From matsaave at openjdk.org Mon Jul 31 20:27:03 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 31 Jul 2023 20:27:03 GMT Subject: RFR: 8307312: Replace "int which" with "int cp_index" in constantPool [v3] In-Reply-To: References: Message-ID: <8Kcif6_IYHEtmJPwpphUdQFUZD2mqB2yZgCANxQWQFo=.dcfea12b-35ae-4ff5-9ed8-e4cd332fb792@github.com> On Mon, 31 Jul 2023 17:10:10 GMT, Coleen Phillimore wrote: >> Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: >> >> Reverted resolved_constant_at > > Thanks for reverting resolve_constant_at_impl. Thank you for the reviews @coleenp, @dholmes-ora, and @iklam! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/15027#issuecomment-1659088913 From matsaave at openjdk.org Mon Jul 31 20:27:04 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Mon, 31 Jul 2023 20:27:04 GMT Subject: Integrated: 8307312: Replace "int which" with "int cp_index" in constantPool In-Reply-To: References: Message-ID: <4gt_-8T2rIgEhWM0rJ_0Kvym_euR6TXR-RtrX_P3Afw=.c9a14079-d17c-4c21-822e-7cbb8bb69350@github.com> On Tue, 25 Jul 2023 20:20:33 GMT, Matias Saavedra Silva wrote: > Many accessors in the constant pool take an argument "int which" that is meant to represent an ambiguous index. Despite this, several methods in the API use "int which" when the argument is exclusively a constant pool index. This patch aims to rename all of these instances to "int cp_index" in order to be more clear and to distinguish methods that take constant pool indices and methods that use derived indices. > > The callers have been updated to use more clear naming as well. Verified with tier 1-5 tests. This pull request has now been integrated. Changeset: c91a3002 Author: Matias Saavedra Silva URL: https://git.openjdk.org/jdk/commit/c91a3002fb4304b6184d1d8d5611873c4e028af2 Stats: 329 lines in 8 files changed: 0 ins; 0 del; 329 mod 8307312: Replace "int which" with "int cp_index" in constantPool Reviewed-by: coleenp, dholmes, iklam ------------- PR: https://git.openjdk.org/jdk/pull/15027 From heidinga at openjdk.org Mon Jul 31 20:32:58 2023 From: heidinga at openjdk.org (Dan Heidinga) Date: Mon, 31 Jul 2023 20:32:58 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Mon, 10 Jul 2023 05:35:53 GMT, Ioi Lam wrote: > The cost is about 1.3 ms per MB of archived heap objects. This may be acceptable as it's a small fraction of JVM bootstrap. We have about 1MB of archived objects now, and we don't expect this size to drastically increase in the near future. 
Looking ahead to Project Leyden, I wouldn't be surprised if the current 1MB of archived heap became much larger. Demo apps on Graal Native Image are often ~4MB of image heap, and while it's not an apples-to-apples comparison, it suggests that somewhere between 5MB & 10MB isn't unreasonable for Leyden. Using 10MB as a baseline for easy math, 1.3ms/MB * 10MB = 13 ms for the new code? And (1.3ms-0.8ms) = 0.5ms/MB * 10MB = 5ms for the old code? Assuming I've interpreted the numbers correctly and importantly that they scale linearly, it seems worth preserving the mmap approach for collectors that can support it. Does that seem reasonable? And justify preserving the mmap approach? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1659118383 From alanb at openjdk.org Mon Jul 31 20:48:45 2023 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 31 Jul 2023 20:48:45 GMT Subject: RFR: 8306582: Remove MetaspaceShared::exit_after_static_dump() [v4] In-Reply-To: References: Message-ID: On Thu, 27 Jul 2023 19:09:00 GMT, Matias Saavedra Silva wrote: >> Currently we exit the VM after static dumping with `MetaspaceShared::exit_after_static_dump()`. >> >> >> // We have finished dumping the static archive. At this point, there may be pending VM >> // operations. We have changed some global states (such as vmClasses::_klasses) that >> // may cause these VM operations to fail. For safety, forget these operations and >> // exit the VM directly. >> void MetaspaceShared::exit_after_static_dump() { >> os::_exit(0); >> } >> >> >> As the comment suggests, the VM state is altered when preparing and performing the static dump, so this change aims to prevent these state changes so the VM can exit normally after the static dump completes. There are three major aspects to this change: >> 1. Since the resolved references array in the Constant Pool is altered when preparing for a static dump, a "scratch copy" is created and archived instead >> 2. 
Symbols are sorted by address and have their hash recalculated. Similarly to point 1, the copies of the symbols that are to be archived have their hashes updated as opposed to the originals. >> 3. The handling of -Xshare:dump during argument parsing such that the VM can continue and exit normally with an exit code of 0. >> >> Verified with tier 1-9 tests. > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Ioi comments Updated launcher change looks fine. ------------- Marked as reviewed by alanb (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14879#pullrequestreview-1555751362 From eosterlund at openjdk.org Mon Jul 31 20:53:52 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Mon, 31 Jul 2023 20:53:52 GMT Subject: RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2] In-Reply-To: References: Message-ID: On Mon, 31 Jul 2023 20:29:46 GMT, Dan Heidinga wrote: > > The cost is about 1.3 ms per MB of archived heap objects. This may be acceptable as it's a small fraction of JVM bootstrap. We have about 1MB of archived objects now, and we don't expect this size to drastically increase in the near future. > > > > Looking ahead to Project Leyden, I wouldn't be surprised if the current 1MB of archived heap became much larger. Demo apps on Graal Native Image are often ~4MB of image heap, and while it's not an apples-to-apples comparison, it suggests that somewhere between 5MB & 10MB isn't unreasonable for Leyden. > > > > Using 10MB as a baseline for easy math, 1.3ms/MB * 10MB = 13 ms for the new code? And (1.3ms-0.8ms) = 0.5ms/MB * 10MB = 5ms for the old code? Assuming I've interpreted the numbers correctly and importantly that they scale linearly, it seems worth preserving the mmap approach for collectors that can support it. > > > > Does that seem reasonable? And justify preserving the mmap approach? 
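The back-of-the-envelope estimate quoted above is a linear model: cost = rate x size. A minimal sketch follows, using the per-MB rates exactly as they appear in the quoted message (1.3 ms/MB and 0.5 ms/MB, expressed in microseconds to keep the arithmetic in integers) and treating linear scaling as the stated assumption rather than a measured fact. `load_cost_us` and the constant names are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Estimated load time in microseconds, assuming cost grows linearly with size.
inline int64_t load_cost_us(int64_t us_per_mb, int64_t archive_mb) {
  assert(us_per_mb >= 0 && archive_mb >= 0);
  return us_per_mb * archive_mb;
}

// Figures copied from the quoted message (1.3 ms/MB, 0.5 ms/MB, 10 MB);
// they are not measurements made here.
constexpr int64_t kNewCodeUsPerMb = 1300;
constexpr int64_t kOldCodeUsPerMb = 500;
constexpr int64_t kArchiveMb = 10;
```

With these inputs the model gives 13 ms versus 5 ms, i.e. a difference of 8 ms for a 10 MB archive, which is the gap the question is about.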
I think Leyden has bigger fish to fry. If we start seeing this being costly, then there are still more cards we can play to optimize the GC-agnostic approach, e.g. exploiting the mostly contiguous nature of our allocations. But I'd like to wait and see if it's worth it or not, compared to potentially bigger fish we can spend time on instead. I'm happy with the current numbers, and see a path forward should we need them to improve. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1659142695 From pchilanomate at openjdk.org Mon Jul 31 21:43:57 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Mon, 31 Jul 2023 21:43:57 GMT Subject: RFR: 8308984: Relativize last_sp (and top_frame_sp) in interpreter frames In-Reply-To: References: Message-ID: <6hcwKJHv2mKfEalFYo7PW4I4f5xytgxO5Xi4VC7VV5M=.1bf4fa3f-7f98-4624-9c02-c20372c0643f@github.com> On Mon, 19 Jun 2023 15:59:15 GMT, Fredrik Bredberg wrote: > Implementation of relativized last_sp (top_frame_sp on PowerPC) in interpreter frames for x64, aarch64, ppc64le and riscv. > last_sp is not relativized on arm, zero and s390, but some changes were made there to cope with the changed generic code. > > By changing the "last_sp" member in interpreter frames from being an absolute address into an offset that is relative to the frame pointer, we don't need to change the value as we freeze and thaw frames of virtual threads. This is because we might freeze and thaw from and to different worker threads, so the absolute address to locals might change, but the offset from the frame pointer will be constant. > > This subtask only handles "last_sp" (and its close equivalent "top_frame_sp" on PowerPC). The relativization of other interpreter frame members is handled in other subtasks of JDK-8289296. > > Tested tier1-tier7 on supported platforms. The rest was sanity tested using Qemu. 
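The idea in the description quoted above can be illustrated with a small, self-contained sketch. It is not HotSpot code: `ToyFrame`, `set_last_sp`, `last_sp`, `copy_frame` and the slot layout are all invented for illustration. A frame stored as a run of machine words is copied to a new location, as happens when a frame is frozen and later thawed onto a different stack. A slot holding an offset from the frame pointer stays valid after the copy without being patched, whereas an absolute address would still point into the old location.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy frame: a fixed run of words. Slot 0 stands in for the "last_sp" slot.
constexpr std::size_t kFrameWords = 8;
constexpr std::size_t kLastSpSlot = 0;

struct ToyFrame {
  intptr_t words[kFrameWords];
  intptr_t* fp() { return &words[0]; }  // hypothetical frame pointer
};

// Store last_sp relativized: as a word offset from the frame pointer.
inline void set_last_sp(ToyFrame& f, intptr_t* sp) {
  f.words[kLastSpSlot] = sp - f.fp();
}

// Derelativize on read: frame pointer plus the stored offset.
inline intptr_t* last_sp(ToyFrame& f) {
  return f.fp() + f.words[kLastSpSlot];
}

// "Freeze"/"thaw" modelled as a plain word copy; no slot is patched.
inline void copy_frame(ToyFrame& dst, const ToyFrame& src) {
  std::memcpy(dst.words, src.words, sizeof(src.words));
}
```

Had slot 0 stored the raw pointer instead, the copied frame would still refer to a word inside the source frame, which is the kind of fix-up the relativized form avoids.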
src/hotspot/share/runtime/continuationFreezeThaw.cpp line 817: > 815: return freeze_pinned_native; > 816: } > 817: prepare_freeze_interpreted_top_frame(f); Probably better to handle setting last_sp for this special case in freeze_start_frame_safepoint_stub(). But in any case the preemption code is not there yet so I would just remove these changes and prepare_freeze_interpreted_top_frame(). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14545#discussion_r1279896849 From fbredberg at openjdk.org Mon Jul 31 22:13:54 2023 From: fbredberg at openjdk.org (Fredrik Bredberg) Date: Mon, 31 Jul 2023 22:13:54 GMT Subject: RFR: 8308984: Relativize last_sp (and top_frame_sp) in interpreter frames In-Reply-To: References: Message-ID: On Mon, 31 Jul 2023 07:40:16 GMT, Richard Reingruber wrote: >>> I've done basic loom testing on ppc64le and riscv64 using Qemu, but would appreciate if @reinrich and @RealFYang could take it for a real test drive. >> >> PPC code looks good. I've tested `jtreg:test/jdk/jdk/internal/vm/Continuation` successfully. Will do more testing until monday. > >> > I've done basic loom testing on ppc64le and riscv64 using Qemu, but would appreciate if @reinrich and @RealFYang could take it for a real test drive. >> >> PPC code looks good. I've tested `jtreg:test/jdk/jdk/internal/vm/Continuation` successfully. Will do more testing until monday. > > So there are crashes in our tests with this pr applied. > E.g. `make test TEST=runtime/CommandLine/OptionsValidation/TestOptionsWithRangesDynamic.java` crashes always. To me it seems that the issue is unrelated to virtual threads. It has to be caused by relativization of `top_frame_sp` > Unfortunately I cannot look into it since I'm actually on vacation. @TheRealMDoerr might want to look into it. He's coming back from vacation in a couple of days. Thank you @reinrich for pointing out the failing `runtime/CommandLine/OptionsValidation/TestOptionsWithRangesDynamic.java` test. 
I used `R11_scratch1` inside `InterpreterMacroAssembler::add_monitor_to_stack()`, which turns out to be a problem because it is the same register that is used for the local `esp` register. Since `R12_scratch2` is also in use, I looked around the code and figured I should be able to use `R23_tmp3` as a scratch register instead. That seems to work: the test now passes. Do you or @TheRealMDoerr have any objections to using `R23_tmp3` as a scratch register in `InterpreterMacroAssembler::add_monitor_to_stack()`? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14545#issuecomment-1659258467