From kvn at openjdk.org Thu Jun 1 00:36:05 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 1 Jun 2023 00:36:05 GMT Subject: RFR: 8309136: [JVMCI] add -XX:+UseGraalJIT flag [v3] In-Reply-To: References: Message-ID: On Wed, 31 May 2023 23:26:17 GMT, Doug Simon wrote: >> Use of the Graal-based JIT in OpenJDK currently requires the following flag: `-XX:+EnableJVMCIProduct` >> >> This has no direct association with Graal. If the JDK image happens to include a non-Graal JVMCI implementation, it will be automatically selected. This would come as a surprise to users who equate JVMCI with Graal. >> >> This PR introduces a new flag, `-XX:+UseGraalJIT` to address these shortcomings. It is an alias for `-XX:+EnableJVMCIProduct -Djvmci.Compiler=graal`. >> >> When `-XX:+UseGraalJIT` is specified, the VM fails fast at startup if there is a non-Graal JVMCI implementation or no JVMCI implementation in the JDK image. > > Doug Simon has updated the pull request incrementally with five additional commits since the last revision: > > - improve error message when UseGraalJIT is used without -XX:+UnlockExperimentalVMOptions > - use strncmp instead of strcmp > - fix date in copyright header > - set UseGraalJIT value in enable_jvmci_product_mode > - added missing test of UseJVMCICompiler when adjusting JVMCI flags under -Xint Looks good now. You need second approval. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14231#pullrequestreview-1454368722 From amenkov at openjdk.org Thu Jun 1 00:57:10 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Thu, 1 Jun 2023 00:57:10 GMT Subject: RFR: 8308978: regression with a deadlock involving FollowReferences [v3] In-Reply-To: References: <2J1qItzUgmfjRPS0xUbHgXZQ-b12JBxe8XPRftU2GyA=.025e7855-5df4-413a-bea7-585a53832025@github.com> <6xcKqU3mLr9TocEUpoXXzcNWSnKijSlhcyxIfXXrFD0=.2987b8de-827e-43d0-907b-7ef2016ddab4@github.com> Message-ID: On Wed, 31 May 2023 22:57:08 GMT, Serguei Spitsyn wrote: > > Something went wrong after 1st merge, testing failed with OOMEInAQS (which is problem-listed) > > How do you run tests? Do you run tiers 1-5 or something else as well? Please, remember that the tier-5 runs the needed SVC tests with `main.wrapper=virtual`. tier1-tier5 as per David's request initial fix (before the test were problem-listed) was tested with JTREG_TEST_THREAD_FACTORY=Virtual ------------- PR Comment: https://git.openjdk.org/jdk/pull/14233#issuecomment-1571154252 From sviswanathan at openjdk.org Thu Jun 1 01:10:16 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Thu, 1 Jun 2023 01:10:16 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v2] In-Reply-To: <2NOPy1QG4rGLMmXNTv_6E6WCKdRCLg466z_tGqo3xeE=.183282f8-8068-4bc0-941b-81b9a29138be@github.com> References: <2NOPy1QG4rGLMmXNTv_6E6WCKdRCLg466z_tGqo3xeE=.183282f8-8068-4bc0-941b-81b9a29138be@github.com> Message-ID: On Tue, 30 May 2023 23:31:09 GMT, Scott Gibbons wrote: >> Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem. >> >> Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). 
>>
>> Old:
>> gcc-12.2.1-4.fc36.x86_64
>> 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix
>> JVM version: 21-internal
>> Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68
>> Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67
>> Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70
>> Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69
>> Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44
>> Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40
>> Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40
>> Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40
>> Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41
>> Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40
>> New:
>> JVM version: 21-internal (float)
>> Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42
>> Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27
>> Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17
>> Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16
>> Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17
>> Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17
>> Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17
>> Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16
>> Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16
>> Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
>   Address review comments
src/hotspot/cpu/x86/assembler_x86.cpp line 3559: > 3557: > 3558: void Assembler::vmovsd(XMMRegister dst, XMMRegister src, XMMRegister src2) { > 3559: assert(UseAVX > 0, "Requires some form ov AVX"); Typo "Requires some form **of** AVX" src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 125: > 123: __ vmovsd(xmm5, xmm18, xmm20); > 124: __ movq(xmm17, rax); > 125: __ vandpd(xmm0, xmm5, xmm17, Assembler::AVX_512bit); This and others below all should be Assembler::AVX_128bit. No need for AVX_512bit here. src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 134: > 132: > 133: // q = DP_DIV_RZ(a, b); > 134: __ vmovsd(xmm5, xmm18, xmm1); This and other usage of vmovsd with blending two registers could be avoided. src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 164: > 162: __ mov64(rax, 0x7FEFFFFFFFFFFFFF); > 163: __ movq(Address(rsp, 0x20), rax); > 164: __ movsd(xmm2, Address(rsp, 0x20)); You could directly do: __ movsd(xmm2, ExternalAddress((address)CONST_MAX), rax); src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 171: > 169: __ mov64(rax, 0x7FF0000000000000); > 170: __ movq(Address(rsp, 0x20), rax); > 171: __ movsd(xmm2, Address(rsp, 0x20)); You could directly do: __ movsd(xmm2, ExternalAddress((address)CONST_INF), rax); src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 179: > 177: __ mov64(rax, 0x7FE0000000000000); > 178: __ movq(Address(rsp, 0x20), rax); > 179: __ movsd(xmm21, Address(rsp, 0x20)); You could directly do: __ movsd(xmm2, ExternalAddress((address)CONST_e307), rax); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1210963286 PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1212455393 PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1212458387 PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1212457127 PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1212457314 PR Review Comment: 
https://git.openjdk.org/jdk/pull/14224#discussion_r1212457651 From dholmes at openjdk.org Thu Jun 1 03:08:07 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 1 Jun 2023 03:08:07 GMT Subject: RFR: 8309136: [JVMCI] add -XX:+UseGraalJIT flag [v3] In-Reply-To: References: Message-ID: On Wed, 31 May 2023 23:26:17 GMT, Doug Simon wrote: >> Use of the Graal-based JIT in OpenJDK currently requires the following flag: `-XX:+EnableJVMCIProduct` >> >> This has no direct association with Graal. If the JDK image happens to include a non-Graal JVMCI implementation, it will be automatically selected. This would come as a surprise to users who equate JVMCI with Graal. >> >> This PR introduces a new flag, `-XX:+UseGraalJIT` to address these shortcomings. It is an alias for `-XX:+EnableJVMCIProduct -Djvmci.Compiler=graal`. >> >> When `-XX:+UseGraalJIT` is specified, the VM fails fast at startup if there is a non-Graal JVMCI implementation or no JVMCI implementation in the JDK image. > > Doug Simon has updated the pull request incrementally with five additional commits since the last revision: > > - improve error message when UseGraalJIT is used without -XX:+UnlockExperimentalVMOptions > - use strncmp instead of strcmp > - fix date in copyright header > - set UseGraalJIT value in enable_jvmci_product_mode > - added missing test of UseJVMCICompiler when adjusting JVMCI flags under -Xint The changes here seem fine, but I remain concerned about the compiler detection logic that presently exists. I can run trivial "programs" using `-XX:+UnlockExperimentalVMOptions -XX:+EnableJVMCIProduct -Djvmci.Compiler=graal` and there is no error. That means that JIT compilation is not kicking in during VM init and that application code is being executed prior to the VM aborting! That is very wrong behaviour IMO, but it is pre-existing so would need to be addressed in separate bug. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14231#pullrequestreview-1454472386 From dholmes at openjdk.org Thu Jun 1 03:23:05 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 1 Jun 2023 03:23:05 GMT Subject: RFR: 8309210: Extend VM Operations hs_err logging In-Reply-To: References: Message-ID: On Wed, 31 May 2023 14:35:03 GMT, Stefan Karlsson wrote: > We have a section in the hs_err file, which prints the most recently run VM operations. Sometimes a VM operation type is used from multiple places in our code and it's not obvious why the VM operation was run. For example, HandshakeAllThreads doesn't tell us why we are running the handshake. I propose that we add an option for the VM operations to tell more about why they are used. > > The proposed patch enhances the Handshake VM operation and the ZGC pause VM Operations. Seems quite reasonable. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14248#pullrequestreview-1454483811 From thartmann at openjdk.org Thu Jun 1 05:26:21 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Thu, 1 Jun 2023 05:26:21 GMT Subject: RFR: 8309044: Replace NULL with nullptr, final sweep of hotspot code [v2] In-Reply-To: References: <3FoMnGeBp8DqkpVb6YGXKxdPsgGz6ej-jrf2U2stVfU=.56a11e19-38dd-420a-a07d-3b025120f194@github.com> Message-ID: <9RnqD_bbEcutUNcfw9elCwYbPS_7-JiHMn_9cVAmDxQ=.830877a3-eac8-4af9-9ac0-5244c37989fb@github.com> On Tue, 30 May 2023 19:15:38 GMT, Johan Sjölen wrote: >> A final sweep of Hotspot to remove all re-added NULLs. With only 110 changes I'd appreciate if this was considered trivial. > > Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: > > - Align > - Suggestions What's the plan now to prevent re-introducing `NULL`? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14198#issuecomment-1571360076 From stuefe at openjdk.org Thu Jun 1 05:42:04 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 1 Jun 2023 05:42:04 GMT Subject: RFR: 8309210: Extend VM Operations hs_err logging In-Reply-To: References: Message-ID: <6fZXjfH4rX_t8LdTd6q5yA87X6J5SFrP94p2-6Zt9dk=.56473ed6-3188-4833-a4ea-dd97f01a59d5@github.com> On Wed, 31 May 2023 14:35:03 GMT, Stefan Karlsson wrote: > We have a section in the hs_err file, which prints the most recently run VM operations. Sometimes a VM operation type is used from multiple places in our code and it's not obvious why the VM operation was run. For example, HandshakeAllThreads doesn't tell us why we are running the handshake. I propose that we add an option for the VM operations to tell more about why they are used. > > The proposed patch enhances the Handshake VM operation and the ZGC pause VM Operations. Looks good. ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14248#pullrequestreview-1454607578 From dnsimon at openjdk.org Thu Jun 1 08:18:07 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Thu, 1 Jun 2023 08:18:07 GMT Subject: RFR: 8309136: [JVMCI] add -XX:+UseGraalJIT flag [v3] In-Reply-To: References: Message-ID: On Wed, 31 May 2023 23:26:17 GMT, Doug Simon wrote: >> Use of the Graal-based JIT in OpenJDK currently requires the following flag: `-XX:+EnableJVMCIProduct` >> >> This has no direct association with Graal. If the JDK image happens to include a non-Graal JVMCI implementation, it will be automatically selected. This would come as a surprise to users who equate JVMCI with Graal. >> >> This PR introduces a new flag, `-XX:+UseGraalJIT` to address these shortcomings. It is an alias for `-XX:+EnableJVMCIProduct -Djvmci.Compiler=graal`. 
>> >> When `-XX:+UseGraalJIT` is specified, the VM fails fast at startup if there is a non-Graal JVMCI implementation or no JVMCI implementation in the JDK image. > > Doug Simon has updated the pull request incrementally with five additional commits since the last revision: > > - improve error message when UseGraalJIT is used without -XX:+UnlockExperimentalVMOptions > - use strncmp instead of strcmp > - fix date in copyright header > - set UseGraalJIT value in enable_jvmci_product_mode > - added missing test of UseJVMCICompiler when adjusting JVMCI flags under -Xint Thanks for input David. I agree that it's best to open a new JBS issue to discuss concerns about lazy JVMCI compiler initialization. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14231#issuecomment-1571577103 From tschatzl at openjdk.org Thu Jun 1 08:57:08 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 1 Jun 2023 08:57:08 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v2] In-Reply-To: References: Message-ID: On Wed, 31 May 2023 14:49:23 GMT, Ashutosh Mehra wrote: >> This patch is the first step towards having a single set of GC APIs for allocating heap space for the archived objects (See https://bugs.openjdk.org/browse/JDK-8296263). >> It moves some of the G1 specific logic from CDS to G1 gc without changing the functionality. >> >> Changes that add/update GC APIs for handling archive heap would be introduced in upcoming patches. > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Remove unused method FileMapInfo::heap_region_mapped_address > > Signed-off-by: Ashutosh Mehra src/hotspot/share/cds/filemap.cpp line 2129: > 2127: > 2128: // allocate from java heap > 2129: HeapWord* start = G1CollectedHeap::heap()->alloc_archive_regions(word_size); I'm not convinced not giving a preferred location is a good idea. 
That seems to reduce the opportunity to directly map archives significantly. Previously, with only heap size changes, the archive could be mapped in still. How that reduces the API to the GC seems unclear, this call is still embedded in G1 specific code. Since this change is an intermediate step, could you provide an overview of the final API/change too? It is hard to comment on this without knowing where you are going with that. Without knowing where this ends up, I would prefer if this method passed in the full suggested memory range (or like memory reservation works, pass in a `requested_addr`/`requested_start` that is null in case it's not used if you prefer) instead of just the size. The method may still return the actual location. Other collectors may simply ignore that hint (which is better than the caller not even bothering to give the hint). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1212820720 From dchuyko at openjdk.org Thu Jun 1 09:30:41 2023 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 1 Jun 2023 09:30:41 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives Message-ID: Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. 
For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug and issues such a directive, but this does not affect the application's behavior. In such a case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives. Methods in general are often inlined, and this information is hard to track down. A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Obviously there is a performance penalty, so it should be applied with care. Hot code will most likely be recompiled soon, as nothing happens to its hotness. A new flag '`-d`' has been introduced for some commands related to compiler directives: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and marks for deoptimization those methods that have any active non-default matching compiler directives. There is currently no distinction as to which directives are found. In particular, this means that if there are rules for inlining into some method, it will be deoptimized. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be deoptimized, but this can be achieved by having rules for them. In addition, a new diagnostic command, `Compiler.replace_directives`, has been added for convenience. It's like a combination of `Compiler.clear_directives` and `Compiler.add_directives`. It supports the same new optional '-d' flag that marks both cleared and added methods for deoptimization. 
The behavior of the '-d' flag is implemented in the new `CodeCache::mark_for_deoptimization_directives_matches` and `DirectivesStack::hasMatchingDirectives` methods. `CompilerDirectivesDCMDTest` now checks add, remove and replace commands in two modes (default and '-d') and checks that '-d' flag causes deoptimization. An alternative approach to the '-d' flag could be to have a special diagnostic command for deoptimization. It will get a list of method patterns and reuse the matcher, however this is not so trivial. Overall usage and effects will be similar but this is one more file format. The user will also need to monitor or query active directives in advance, e.g. to deoptimize all mentioned methods after clearing all directives. An alternative approach for selection of deoptimized methods could be to track down all inlining dependencies. This may be similar to searching references to old methods, but it requires scanning all code blobs, which looks too expensive. An alternative naming for the flag is welcome. The obvious '-f' ('force') unfortunately has a conflict. Other verbs can be 'update', 'refresh' or 'apply'. Deoptimization is just what's done to reconcile the state. It could be something else, like first compiling with a different compiler and then switching to that version. Although in the latter case, triggered compilation would be an essential detail. 
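To make the selection rule described above concrete, here is a minimal sketch in standalone C++. It is a toy model under stated assumptions, not the code from this PR: `CompiledMethod`, `Directive`, `matches` and `mark_matching_for_deopt` are hypothetical names, and the pattern matcher only understands exact names and a trailing `*`, which is much simpler than the real directive matcher.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an nmethod in the code cache.
struct CompiledMethod {
  std::string name;       // e.g. "Foo.bar"
  bool marked_for_deopt;  // set when the method should be deoptimized
};

// Hypothetical stand-in for one entry on the directives stack.
struct Directive {
  std::string pattern;    // exact name, or a prefix followed by '*'
  bool is_default;        // the default directive never triggers deoptimization
};

// Returns true if 'name' equals 'pattern', or starts with the part of
// 'pattern' that precedes a trailing '*'.
inline bool matches(const std::string& pattern, const std::string& name) {
  if (!pattern.empty() && pattern.back() == '*') {
    const std::size_t n = pattern.size() - 1;
    return name.compare(0, n, pattern, 0, n) == 0;
  }
  return pattern == name;
}

// Scans the (toy) code cache, marks every method that matches at least one
// non-default directive, and returns how many methods were newly marked.
inline std::size_t mark_matching_for_deopt(std::vector<CompiledMethod>& code_cache,
                                           const std::vector<Directive>& stack) {
  std::size_t marked = 0;
  for (CompiledMethod& m : code_cache) {
    if (m.marked_for_deopt) continue;  // already scheduled
    for (const Directive& d : stack) {
      if (!d.is_default && matches(d.pattern, m.name)) {
        m.marked_for_deopt = true;
        ++marked;
        break;
      }
    }
  }
  return marked;
}
```

With a stack holding the default directive plus one rule for `Foo.*`, only the `Foo` methods would be marked; as in the description, the sketch does not look at which options the matching directive actually sets, and it knows nothing about inlined callees.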
------------- Commit messages: - Merge branch 'openjdk:master' into compiler-directives-force-update - Formatting - Formatting - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Correct arguments info for new commands - Update through de-optimization Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8309271 Stats: 214 lines in 9 files changed: 194 ins; 0 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From aph at openjdk.org Thu Jun 1 09:46:25 2023 From: aph at openjdk.org (Andrew Haley) Date: Thu, 1 Jun 2023 09:46:25 GMT Subject: RFR: 8309044: Replace NULL with nullptr, final sweep of hotspot code [v2] In-Reply-To: References: <3FoMnGeBp8DqkpVb6YGXKxdPsgGz6ej-jrf2U2stVfU=.56a11e19-38dd-420a-a07d-3b025120f194@github.com> Message-ID: On Tue, 30 May 2023 00:57:40 GMT, David Holmes wrote: > Can we now poison NULL so it can't get reintroduced? Or would that potentially break standard headers? I'm sure it would. Maybe some changes to Skara? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14198#issuecomment-1571706648 From jsjolen at openjdk.org Thu Jun 1 09:54:26 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 1 Jun 2023 09:54:26 GMT Subject: RFR: 8309044: Replace NULL with nullptr, final sweep of hotspot code [v2] In-Reply-To: <9RnqD_bbEcutUNcfw9elCwYbPS_7-JiHMn_9cVAmDxQ=.830877a3-eac8-4af9-9ac0-5244c37989fb@github.com> References: <3FoMnGeBp8DqkpVb6YGXKxdPsgGz6ej-jrf2U2stVfU=.56a11e19-38dd-420a-a07d-3b025120f194@github.com> <9RnqD_bbEcutUNcfw9elCwYbPS_7-JiHMn_9cVAmDxQ=.830877a3-eac8-4af9-9ac0-5244c37989fb@github.com> Message-ID: <2WGxhOHFo3u8TGyhklFPc-Ipml5HL8PO9aakXX9cgFA=.d8e6a4c6-2da5-4173-ac96-fa4088f25502@github.com> On Thu, 1 Jun 2023 05:23:25 GMT, Tobias Hartmann wrote: > What's the plan now to prevent re-introducing `NULL`? Hi Tobias. The only plan in place is social: the reviewers have to look out for it. I am, however, researching ways of preventing re-introductions by machine. These include poisoning the NULL macro by re-defining it and finding a tool capable of parsing C++ code that has yet to go through the pre-processor. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14198#issuecomment-1571722147 From dnsimon at openjdk.org Thu Jun 1 10:16:24 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Thu, 1 Jun 2023 10:16:24 GMT Subject: RFR: 8309044: Replace NULL with nullptr, final sweep of hotspot code [v2] In-Reply-To: References: <3FoMnGeBp8DqkpVb6YGXKxdPsgGz6ej-jrf2U2stVfU=.56a11e19-38dd-420a-a07d-3b025120f194@github.com> Message-ID: On Tue, 30 May 2023 19:15:38 GMT, Johan Sjölen wrote: >> A final sweep of Hotspot to remove all re-added NULLs. With only 110 changes I'd appreciate if this was considered trivial. 
> > Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: > > - Align > - Suggestions It may be simpler to use simple grepping + an allow list. For example using [`ag`](https://github.com/ggreer/the_silver_searcher) and `grep` seems to catch the few remaining offenders:
> ag NULL src/hotspot/ --cpp | grep -v _NULL | grep -v NULL_ | grep -v -E '[A-Z]NULL' | grep -v -E '//.*NULL' | grep -v '"NULL"'
src/hotspot/cpu/ppc/macroAssembler_ppc.hpp:735: void load_klass_check_null(Register dst, Register src, Label* is_null = NULL);
src/hotspot/cpu/ppc/stubGenerator_ppc.cpp:4700: if (UnsafeCopyMemory::_table == NULL) {
src/hotspot/cpu/x86/jvmciCodeInstaller_x86.cpp:191: if (nop == NULL) {
src/hotspot/cpu/riscv/codeBuffer_riscv.cpp:74: if (cb->stubs()->maybe_expand_to_ensure_remaining(total_requested_size) && cb->blob() == NULL) {
src/hotspot/cpu/riscv/stubGenerator_riscv.cpp:4019: if (UnsafeCopyMemory::_table == NULL) {
src/hotspot/cpu/riscv/stubGenerator_riscv.cpp:4077: if (bs_nm != NULL) {
src/hotspot/cpu/aarch64/jvmciCodeInstaller_aarch64.cpp:125: NativeCall* call = NULL;
src/hotspot/cpu/aarch64/jvmciCodeInstaller_aarch64.cpp:158: if (nop == NULL) {
src/hotspot/share/jfr/dcmd/jfrDcmds.hpp:162: JavaPermission p = {"java.lang.management.ManagementPermission", "monitor", NULL};
src/hotspot/share/jfr/dcmd/jfrDcmds.hpp:187: JavaPermission p = {"java.lang.management.ManagementPermission", "monitor", NULL};
src/hotspot/share/include/jvm.h:423: * Find a class from a boot class loader. Returns NULL if class not found.
src/hotspot/share/prims/jvmtiAgent.cpp:375: const jint err = (*on_load_entry)(&main_vm, const_cast(agent->options()), NULL);
src/hotspot/share/prims/whitebox.cpp:1885: if (cp->cache() == NULL) {
src/hotspot/share/prims/whitebox.cpp:1894: if (cp->cache() == NULL) {
src/hotspot/share/classfile/stringTable.hpp:150: static oop init_shared_table(const DumpedInternedStrings* dumped_interned_strings) NOT_CDS_JAVA_HEAP_RETURN_(NULL);
src/hotspot/share/utilities/globalDefinitions_xlc.hpp:95: #undef NULL
src/hotspot/share/utilities/globalDefinitions_xlc.hpp:96: #define NULL 0L
src/hotspot/share/utilities/globalDefinitions_xlc.hpp:98: #ifndef NULL
src/hotspot/share/utilities/globalDefinitions_xlc.hpp:99: #define NULL 0
src/hotspot/share/cds/filemap.cpp:363: assert(ent != NULL, "sanity");
src/hotspot/share/utilities/globalDefinitions_visCPP.hpp:65:#undef NULL
src/hotspot/share/utilities/globalDefinitions_visCPP.hpp:69:#define NULL 0LL
src/hotspot/share/utilities/globalDefinitions_visCPP.hpp:71:#ifndef NULL
src/hotspot/share/utilities/globalDefinitions_visCPP.hpp:72:#define NULL 0
src/hotspot/share/utilities/globalDefinitions.cpp:162: static_assert(sizeof(NULL) == sizeof(char*), "NULL must be same size as pointer");
src/hotspot/share/adlc/output_c.cpp:279: for (pipeline->_reslist.reset(); (resource = pipeline->_reslist.iter()) != NULL;) {
src/hotspot/share/adlc/output_c.cpp:305: for (pipeline->_reslist.reset(); (resource = pipeline->_reslist.iter()) != NULL;) {
src/hotspot/share/adlc/output_c.cpp:368: for (pipeline->_reslist.reset(); (resource = pipeline->_reslist.iter()) != NULL;) {
src/hotspot/share/adlc/output_c.cpp:393: for (pipeline->_reslist.reset(); (resource = pipeline->_reslist.iter()) != NULL;) {
src/hotspot/share/adlc/output_c.cpp:1009: for (_pipeline->_reslist.reset(); (resource = _pipeline->_reslist.iter()) != NULL;) {
src/hotspot/share/gc/x/xBarrierSet.inline.hpp:187: return Raw::oop_arraycopy_in_heap(nullptr, 0, src, NULL, 0, dst, length);
src/hotspot/share/gc/x/xPageTable.inline.hpp:43: if (entry != NULL && entry != _prev) {
src/hotspot/share/gc/x/xBarrier.cpp:242: return NULL;
src/hotspot/share/oops/cpCache.cpp:888: LogStream* log_stream = NULL;
src/hotspot/share/oops/cpCache.cpp:906: assert(resolved_references->obj_at(appendix_index) == NULL, "init just once");
src/hotspot/share/oops/cpCache.cpp:914: if (log_stream != NULL) {
src/hotspot/share/opto/runtime.cpp:491: fields[TypeFunc::Parms+0] = NULL; // void
src/hotspot/share/jvmci/jvmciEnv.cpp:366: if (ex != NULL) {
------------- PR Comment: https://git.openjdk.org/jdk/pull/14198#issuecomment-1571756690 From aph at openjdk.org Thu Jun 1 10:17:19 2023 From: aph at openjdk.org (Andrew Haley) Date: Thu, 1 Jun 2023 10:17:19 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) In-Reply-To: References: Message-ID: On Fri, 26 May 2023 20:46:29 GMT, Kelvin Nilsen wrote: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. 
Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp line 147: > 145: } > 146: #endif > 147: } This logic is so gnarly that it's very hard to review and maintain, and IMO it's dangerous. The problem is that its correctness depends on exactly how registers are allocated in its caller. This needs restructuring so that the register allocation is defined in a single place then passed down to everyone who needs it. 
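One possible reading of the restructuring suggested above, sketched in standalone C++ purely for illustration. Everything here is an assumption: `Reg`, `BarrierRegs`, `choose_barrier_regs` and `scratch_for_slow_path` are hypothetical names, the register set is arbitrary, and none of this is taken from the Shenandoah barrier code. The idea shown is only that one function decides which register plays which role, and helpers receive that decision instead of re-deriving it from what the caller happened to use.

```cpp
#include <cassert>

// Hypothetical register names; HotSpot's real Register type is different.
enum class Reg { rax, rbx, rcx, rdx, rsi, rdi, none };

// The complete register assignment for one barrier, held in one place.
struct BarrierRegs {
  Reg dst;
  Reg addr;
  Reg tmp1;
  Reg tmp2;

  // True when no two roles share a register.
  bool distinct() const {
    const Reg r[] = {dst, addr, tmp1, tmp2};
    for (int i = 0; i < 4; ++i) {
      for (int j = i + 1; j < 4; ++j) {
        if (r[i] == r[j]) return false;
      }
    }
    return true;
  }
};

// The single place where temporaries are chosen: the first two candidates
// that collide with neither 'dst' nor 'addr'. Assumes dst != addr.
inline BarrierRegs choose_barrier_regs(Reg dst, Reg addr) {
  const Reg candidates[] = {Reg::rax, Reg::rbx, Reg::rcx,
                            Reg::rdx, Reg::rsi, Reg::rdi};
  BarrierRegs regs{dst, addr, Reg::none, Reg::none};
  for (Reg c : candidates) {
    if (c == dst || c == addr) continue;
    if (regs.tmp1 == Reg::none) {
      regs.tmp1 = c;
    } else {
      regs.tmp2 = c;
      break;
    }
  }
  return regs;
}

// A callee takes the assignment as a parameter rather than guessing it.
inline Reg scratch_for_slow_path(const BarrierRegs& regs) {
  return regs.tmp2;
}
```

The design point is that a check such as `distinct()` can then be asserted once, where the assignment is made, rather than being implied by conditionals scattered through the emitted code.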
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1212922771 From stefank at openjdk.org Thu Jun 1 10:25:21 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 1 Jun 2023 10:25:21 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) In-Reply-To: References: Message-ID: On Fri, 26 May 2023 20:46:29 GMT, Kelvin Nilsen wrote: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. 
If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. I've looked at the shared code and it's really nice that you've managed to keep them to a minimum. I have one tiny nit that would be nice to fix. src/hotspot/share/gc/shared/gcConfiguration.cpp line 88: > 86: } > 87: #endif > 88: return NA; You moved the order between Shenandoah and ZGC in `young_collector()`, so you should probably do the same here. ------------- PR Review: https://git.openjdk.org/jdk/pull/14185#pullrequestreview-1455087651 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1212919666 From jbhateja at openjdk.org Thu Jun 1 11:43:12 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Thu, 1 Jun 2023 11:43:12 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v2] In-Reply-To: <2NOPy1QG4rGLMmXNTv_6E6WCKdRCLg466z_tGqo3xeE=.183282f8-8068-4bc0-941b-81b9a29138be@github.com> References: <2NOPy1QG4rGLMmXNTv_6E6WCKdRCLg466z_tGqo3xeE=.183282f8-8068-4bc0-941b-81b9a29138be@github.com> Message-ID: On Tue, 30 May 2023 23:31:09 GMT, Scott Gibbons wrote: >> Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem.
>> >> Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). >> >> Old: >> gcc-12.2.1-4.fc36.x86_64 >> 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix >> JVM version: 21-internal >> Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 >> Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 >> Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 >> Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 >> Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 >> Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 >> Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 >> Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 >> Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 >> Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 >> New: >> JVM version: 21-internal (float) >> Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42 >> Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 >> Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17 >> Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 >> Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 >> Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 >> Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 > > Scott Gibbons has 
updated the pull request incrementally with one additional commit since the last revision: > > Address review comments Hi @asgibbons , Kindly also include the results for the following benchmark: test/micro/org/openjdk/bench/vm/floatingpoint/DremFrem.java Best Regards, Jatin src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 234: > 232: // { > 233: // q = DP_DIV_RZ(a, bs); > 234: __ bind(L_1237); It should be OK to add loop alignment padding here, though this may be a low trip-count loop. src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 306: > 304: > 305: Label L_104a, L_11bd, L_10c1, L_1090, L_11b9, L_10e7, L_11af, L_111c, L_10f3, L_116e, L_112a; > 306: Label L_1173, L_1157, L_117f, L_11a0; For the sake of clarity, can we segregate the AVX2 functionality into a separate routine and indent the block? src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 321: > 319: __ movl(rcx, rax); > 320: __ orl(rcx, 0x7f80); > 321: __ movl(Address(rsp, 0x04), rcx); The scope of an MXCSR change rarely extends beyond a couple of instructions, hence we simply load the needed settings and later re-load the standard MXCSR settings from the default location `ldmxcsr(ExternalAddress(StubRoutines::x86::addr_mxcsr_std()));` src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 396: > 394: __ vdivpd(xmm0, xmm4, xmm3, Assembler::AVX_128bit); > 395: // q = DP_TRUNC(q); > 396: __ vroundsd(xmm0, xmm0, xmm0, 3); vroundsd can be removed if we defer MXCSR reinitialization beyond it.
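For readers less familiar with the `q = DP_TRUNC(q)` step quoted above: the remainder under discussion is the truncated-quotient one that Java's `%` applies to floating-point operands. The following is a minimal sketch in plain Java, not the stub's algorithm. `truncRem` is a hypothetical helper, and it is only exact when the quotient and the product are exactly representable, as with the small values used here.

```java
final class FmodSketch {
    // Naive truncated-quotient remainder: r = a - trunc(a / b) * b.
    // Hypothetical helper for illustration only; unlike a real fmod it
    // loses accuracy once a / b or trunc(a / b) * b has to be rounded.
    static double truncRem(double a, double b) {
        double q = a / b;
        // Round the quotient toward zero (the DP_TRUNC step).
        double t = (q < 0) ? Math.ceil(q) : Math.floor(q);
        return a - t * b;
    }

    public static void main(String[] args) {
        // Java's % on doubles keeps the sign of the dividend.
        System.out.println(5.5 % 2.0);
        System.out.println(truncRem(5.5, 2.0));
        System.out.println(-5.5 % 2.0);
        // IEEEremainder rounds the quotient to nearest instead of truncating.
        System.out.println(Math.IEEEremainder(5.5, 2.0));
    }
}
```

The contrast with `Math.IEEEremainder` is the reason the rounding mode matters in the stub: rounding the quotient to nearest instead of toward zero yields a different result.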
------------- PR Review: https://git.openjdk.org/jdk/pull/14224#pullrequestreview-1455184476 PR Review: https://git.openjdk.org/jdk/pull/14224#pullrequestreview-1455256066 PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1212979736 PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1212982077 PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1212995674 PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1213009324 From shade at openjdk.org Thu Jun 1 11:57:15 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Thu, 1 Jun 2023 11:57:15 GMT Subject: RFR: 8305959: x86: Improve itable_stub [v2] In-Reply-To: References: Message-ID: On Fri, 26 May 2023 08:10:09 GMT, Boris Ulasevich wrote: >> Async profiler shows that applications spend up to 10% in itable_stubs. >> >> The current inefficiency of itable stubs is as follows. The generated itable_stub scans itable twice: first it checks if the object class is a subtype of the resolved_class, and then it finds the holder_class that implements the method. I suggest doing this in one pass: with a first loop over itable, check pointer equality to both holder_class and resolved_class. Once we have finished searching for resolved_class, continue searching for holder_class in a separate loop if it has not yet been found. >> >> This approach gives 1-10% improvement on the synthetic benchmarks and 3% improvement on Naive Bayes benchmark from the Renaissance Benchmark Suite (Intel Xeon X5675). > > Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains three new commits since the last revision: > > - readability rework > - cleanup > - 8305959: x86: Improve itable_stub Nice. I have only minor comments left. 
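The one-pass scan described in the quoted PR text can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the generated stub: the itable is modelled as a plain array of class references, `twoPass` and `onePass` are hypothetical names, and a return value of -1 stands in for the failure path taken when the resolved class is absent.

```java
final class ItableScanSketch {
    // Two scans: first a subtype check against resolved, then a search for holder.
    static int twoPass(Object[] itable, Object resolved, Object holder) {
        boolean resolvedFound = false;
        for (Object entry : itable) {
            if (entry == resolved) {
                resolvedFound = true;
                break;
            }
        }
        if (!resolvedFound) {
            return -1; // receiver does not implement the resolved interface
        }
        for (int i = 0; i < itable.length; i++) {
            if (itable[i] == holder) {
                return i;
            }
        }
        return -1;
    }

    // One scan: compare each entry against both classes, then keep looking
    // for holder only if it has not been seen by the time resolved is found.
    static int onePass(Object[] itable, Object resolved, Object holder) {
        int holderIndex = -1;
        boolean resolvedFound = false;
        int i = 0;
        for (; i < itable.length; i++) {
            Object entry = itable[i];
            if (holderIndex < 0 && entry == holder) {
                holderIndex = i;
            }
            if (entry == resolved) {
                resolvedFound = true;
                i++;
                break;
            }
        }
        if (!resolvedFound) {
            return -1;
        }
        // Continuation loop, skipped entirely when holder was already found.
        for (; holderIndex < 0 && i < itable.length; i++) {
            if (itable[i] == holder) {
                holderIndex = i;
            }
        }
        return holderIndex;
    }
}
```

Both methods return the same index for the same inputs; the second simply avoids restarting the scan from the beginning of the table.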
src/hotspot/cpu/x86/vtableStubs_x86_32.cpp line 203: > 201: > 202: start_pc = __ pc(); > 203: __ push(temp_reg); Why do we need to save this one? Do we care if this "tmp" is clobbered? src/hotspot/cpu/x86/vtableStubs_x86_32.cpp line 204: > 202: start_pc = __ pc(); > 203: __ push(temp_reg); > 204: __ lookup_interface_method_stub(recv_klass_reg, Let's not lose the original comments here. `// Receiver subtype check against REFC` and others. src/hotspot/cpu/x86/vtableStubs_x86_64.cpp line 197: > 195: start_pc = __ pc(); > 196: > 197: __ lookup_interface_method_stub(recv_klass_reg, Same, let's not lose the original comments here: `// Receiver subtype check against REFC` and others. test/micro/org/openjdk/bench/vm/compiler/InterfaceCalls.java line 62: > 60: interface FirstInterfaceExtExt extends FirstInterfaceExt { > 61: default int getIntFirst() {return 45;} > 62: } Style: Suggestion: interface FirstInterfaceExt extends FirstInterface { default int getIntFirst() { return 44; } } interface FirstInterfaceExtExt extends FirstInterfaceExt { default int getIntFirst() { return 45; } } test/micro/org/openjdk/bench/vm/compiler/InterfaceCalls.java line 187: > 185: public FirstInterface[] as = new FirstInterface[asLength]; > 186: public FirstInterface[] noninlined = new FirstInterface[5]; > 187: public FirstInterfaceExtExt[] noninlinedextext = new FirstInterfaceExtExt[5]; Suggestion: public FirstInterface[] noninlined = new FirstInterface[asLength]; public FirstInterfaceExtExt[] noninlinedextext = new FirstInterfaceExtExt[asLength]; test/micro/org/openjdk/bench/vm/compiler/InterfaceCalls.java line 203: > 201: noninlined[3] = new FourthClassDontInline(); > 202: noninlined[4] = new FifthClassDontInline(); > 203: noninlinedextext[0] = new FirstClassDontInlineExtExt(); Suggestion: noninlined[4] = new FifthClassDontInline(); noninlinedextext[0] = new FirstClassDontInlineExtExt(); test/micro/org/openjdk/bench/vm/compiler/InterfaceCalls.java line 221: > 219: /** Tests single base 
interface method call */ > 220: @Benchmark > 221: public void testIfaceCall(Blackhole bh) { Why are these tests not in the form of the other benchmarks? @Benchmark public int test1stInt5Types() { FirstInterface ai = as[l]; l = ++ l % asLength; return ai.getIntFirst(); } I suspect those are carefully written so that a single call-site would be used for all receivers, thus limiting the profile-guided optimizations. ------------- PR Review: https://git.openjdk.org/jdk/pull/13460#pullrequestreview-1455266666 PR Review Comment: https://git.openjdk.org/jdk/pull/13460#discussion_r1213030376 PR Review Comment: https://git.openjdk.org/jdk/pull/13460#discussion_r1213027973 PR Review Comment: https://git.openjdk.org/jdk/pull/13460#discussion_r1213029621 PR Review Comment: https://git.openjdk.org/jdk/pull/13460#discussion_r1213031451 PR Review Comment: https://git.openjdk.org/jdk/pull/13460#discussion_r1213032105 PR Review Comment: https://git.openjdk.org/jdk/pull/13460#discussion_r1213032310 PR Review Comment: https://git.openjdk.org/jdk/pull/13460#discussion_r1213036770 From mdoerr at openjdk.org Thu Jun 1 12:18:19 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 1 Jun 2023 12:18:19 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) In-Reply-To: References: Message-ID: On Fri, 26 May 2023 20:46:29 GMT, Kelvin Nilsen wrote: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1.
Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. Issues already reported to GenShen engineers: gc/shenandoah/TestElasticTLAB.java#generational # Internal Error (src\hotspot\share\gc\shenandoah\shenandoahFreeSet.cpp:695), pid=23288, tid=23784 #
assert(size % CardTable::card_size_in_words() == 0) failed: size must be multiple of card table size, was 258 gc/stress/gcold/TestGCOldWithShenandoah.java#generational # Internal Error (src\hotspot\share\gc\shenandoah\heuristics\shenandoahOldHeuristics.cpp:82), pid=20828, tid=5836 # assert(_old_generation->available() > old_evacuation_budget) failed: Cannot budget more than is available gc/shenandoah/oom/TestAllocOutOfMemory.java#large Execution failed: `main' threw exception: java.lang.RuntimeException: 'java.lang.OutOfMemoryError: Java heap space' missing from stdout/stderr (Issue with 64k Pages) gc/shenandoah/TestRetainObjects.java#no-tlab gc/shenandoah/TestSieveObjects.java#no-tlab Timeouts. gc/shenandoah/TestAllocObjects.java#generational gc/shenandoah/TestDynamicSoftMaxHeapSize.java#generational # Internal Error (src/hotspot/share/gc/shenandoah/shenandoahGeneration.cpp:664), pid=18434, tid=29955 # assert(is_global() || ShenandoahHeap::heap()->is_full_gc_in_progress() || (_used + _humongous_waste <= _affiliated_region_count * ShenandoahHeapRegion::region_size_bytes())) failed: used cannot exceed regions ------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1571947174 From adinn at openjdk.org Thu Jun 1 12:21:10 2023 From: adinn at openjdk.org (Andrew Dinn) Date: Thu, 1 Jun 2023 12:21:10 GMT Subject: RFR: 8296411: AArch64: Accelerated Poly1305 intrinsics [v4] In-Reply-To: References: Message-ID: On Wed, 24 May 2023 16:17:14 GMT, Andrew Haley wrote: >> This provides a solid speedup of about 3-4x over the Java implementation. >> >> I have a vectorized version of this which uses a bunch of tricks to speed it up, but it's complex and can still be improved. We're getting close to ramp down, so I'm submitting this simple intrinsic so that we can get it reviewed in time.
>> >> Benchmarks: >> >> >> ThunderX (2, I think): >> >> Benchmark (dataSize) (provider) Mode Cnt Score Error Units >> Poly1305DigestBench.updateBytes 64 thrpt 3 14078352.014 ± 4201407.966 ops/s >> Poly1305DigestBench.updateBytes 256 thrpt 3 5154958.794 ± 1717146.980 ops/s >> Poly1305DigestBench.updateBytes 1024 thrpt 3 1416563.273 ± 1311809.454 ops/s >> Poly1305DigestBench.updateBytes 16384 thrpt 3 94059.570 ± 2913.021 ops/s >> Poly1305DigestBench.updateBytes 1048576 thrpt 3 1441.024 ± 164.443 ops/s >> >> Benchmark (dataSize) (provider) Mode Cnt Score Error Units >> Poly1305DigestBench.updateBytes 64 thrpt 3 4516486.795 ± 419624.224 ops/s >> Poly1305DigestBench.updateBytes 256 thrpt 3 1228542.774 ± 202815.694 ops/s >> Poly1305DigestBench.updateBytes 1024 thrpt 3 316051.912 ± 23066.449 ops/s >> Poly1305DigestBench.updateBytes 16384 thrpt 3 20649.561 ± 1094.687 ops/s >> Poly1305DigestBench.updateBytes 1048576 thrpt 3 310.564 ± 31.053 ops/s >> >> Apple M1: >> >> Benchmark (dataSize) (provider) Mode Cnt Score Error Units >> Poly1305DigestBench.updateBytes 64 thrpt 3 33551968.946 ± 849843.905 ops/s >> Poly1305DigestBench.updateBytes 256 thrpt 3 9911637.214 ± 63417.224 ops/s >> Poly1305DigestBench.updateBytes 1024 thrpt 3 2604370.740 ± 29208.265 ops/s >> Poly1305DigestBench.updateBytes 16384 thrpt 3 165183.633 ± 1975.998 ops/s >> Poly1305DigestBench.updateBytes 1048576 thrpt 3 2587.132 ± 40.240 ops/s >> >> Benchmark (dataSize) (provider) Mode Cnt Score Error Units >> Poly1305DigestBench.updateBytes 64 thrpt 3 12373649.589 ± 184757.721 ops/s >> Poly1305DigestBench.upd... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > Review comments src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 7135: > 7133: regs = (regs.remaining() + U_0HI + U_1HI).begin(); > 7134: > 7135: // U_2:U_1:U_0 += (U_1HI >> 2) This comment and the next one both need correcting.
They mention U_0HI and U_1HI and, as the previous comment says, those registers are dead. What actually happens here is best summarized as // U_2:U_1:U_0 += (U2 >> 2) * 5 or, if we actually want to be clearer about the current encoding which does it in several steps // rscratch1 = (U2 >> 2) // U2 = U2[1:0] // U_2:U_1:U_0 += rscratch1 // U_2:U_1:U_0 += (rscratch1 << 2) i.e. any bits that are set from 130 upwards are masked off, treated as an integer in their own right, multiplied by 5 and the result added back in at the bottom to update the 130 bit result U2[1:0]:U1[63:0]:U0[63:0]. I'm not sure whether this provides an opportunity for you to optimize this by doing the multiply by five earlier i.e. replace the code with this version // rscratch1 = (U2 >> 2) * 5 __ lsr(rscratch1, U_2, 2); __ add(rscratch1, rscratch1, rscratch1, Assembler::LSL, 2); // U2 = U2[1:0] __ andr(U_2, U_2, (u8)3); // U2:U1:U0 += rscratch1 __ adds(U_0, U_0, rscratch1); __ adcs(U_1, U_1, zr); __ adc(U_2, U_2, zr); The obvious concern is that the multiply of rscratch1 by 5 might overflow 64 bits. Is that why you have implemented two add and carry steps? If so then why is it legitimate to do the multiply by 5 up front in the final reduction that follows the loop? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14085#discussion_r1213062966 From stuefe at openjdk.org Thu Jun 1 13:01:21 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 1 Jun 2023 13:01:21 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) In-Reply-To: References: Message-ID: On Fri, 26 May 2023 20:46:29 GMT, Kelvin Nilsen wrote: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`.
The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. Hi Kelvin, First off, kudos.
This is impressive work by you and your Amazon colleagues! I have one particular worry, though: how to verify that this experimental feature does not cause regressions with traditional Shenandoah? The PR is massive (+18kloc) and targeted for JDK 21. Rampdown P1 is in a week. By all accounts, JDK 21 will be a massive release, so we will all have our hands full, fixing stuff and plugging holes. Oracle did put the sources for the Generational ZGC beside the old sources, thereby somewhat guaranteeing traditional ZGC does not regress. Could we follow the same cautionary process here? I am not a Shenandoah expert, but to me the new feature seems intertwined with normal code paths. It's difficult to ensure, via review, that traditional Shenandoah will not suffer regressions. So close to rampdown this is a bit scary. The JEP mentions several "Risks and Assumptions", but it is unclear whether these risks also affect traditional Shenandoah. Cheers, Thomas ------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1572008074 From duke at openjdk.org Thu Jun 1 13:11:26 2023 From: duke at openjdk.org (JoKern65) Date: Thu, 1 Jun 2023 13:11:26 GMT Subject: Withdrawn: JDK-8308288: Fix xlc17 clang warnings in shared code In-Reply-To: References: Message-ID: On Thu, 25 May 2023 09:14:14 GMT, JoKern65 wrote: > When using the new xlc17 compiler (based on a recent clang) to build OpenJDK on AIX, we run into various "warnings as errors". > Some of those are in shared codebase and could be addressed by small adjustments. > A lot of those changes are in hotspot, some might be somewhere else in the OpenJDK C/C++ code. This pull request has been closed without being integrated.
------------- PR: https://git.openjdk.org/jdk/pull/14146 From duke at openjdk.org Thu Jun 1 13:11:24 2023 From: duke at openjdk.org (JoKern65) Date: Thu, 1 Jun 2023 13:11:24 GMT Subject: RFR: JDK-8308288: Fix xlc17 clang warnings in shared code [v2] In-Reply-To: References: Message-ID: On Fri, 26 May 2023 08:31:46 GMT, JoKern65 wrote: >> When using the new xlc17 compiler (based on a recent clang) to build OpenJDK on AIX, we run into various "warnings as errors". >> Some of those are in shared codebase and could be addressed by small adjustments. >> A lot of those changes are in hotspot, some might be somewhere else in the OpenJDK C/C++ code. > > JoKern65 has updated the pull request incrementally with one additional commit since the last revision: > > forgotton _ Hi, As this PR is big and spans several components I split off the java.base, java.desktop and the serviceability/security issues into extra JBS issues. https://bugs.openjdk.org/browse/JDK-8309219 Fix xlc17 clang 1.5 warnings in java.base https://bugs.openjdk.org/browse/JDK-8309224 Fix xlc17 clang 1.5 warnings in java.desktop https://bugs.openjdk.org/browse/JDK-8309225 Fix xlc17 clang 1.5 warnings in security and servicability I'll move the changes from this pull request into new pull requests. I will incorporate the requested changes right in the new PRs. I will reuse this issue 8308288 for the hotspot changes but come up with a new, smaller PR. @colleenp, I will move alloca.h to the globalDefinitions_xlc.hpp. @prrace, I will come up with an identical PR for the client files (java.desktop), but improve the comment as @kimbarrett proposed @mbaesken, I will use AIX and take up some of the other fixes you proposed. I guess we need to find a way to fix the issue with the malloc in globalDefinitions_xlc.hpp in the upcoming PR for hotspot. Thanks for your help so far!
------------- PR Comment: https://git.openjdk.org/jdk/pull/14146#issuecomment-1572023812 PR Comment: https://git.openjdk.org/jdk/pull/14146#issuecomment-1572024628 From sgibbons at openjdk.org Thu Jun 1 13:43:11 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 1 Jun 2023 13:43:11 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v2] In-Reply-To: References: <2NOPy1QG4rGLMmXNTv_6E6WCKdRCLg466z_tGqo3xeE=.183282f8-8068-4bc0-941b-81b9a29138be@github.com> Message-ID: On Thu, 1 Jun 2023 01:05:52 GMT, Sandhya Viswanathan wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Address review comments > > src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 134: > >> 132: >> 133: // q = DP_DIV_RZ(a, b); >> 134: __ vmovsd(xmm5, xmm18, xmm1); > > This and other usage of vmovsd with blending two registers could be avoided. I don't know what you mean. Can you elaborate please? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1213174220 From alanb at openjdk.org Thu Jun 1 13:43:33 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 1 Jun 2023 13:43:33 GMT Subject: RFR: 8306647: Implementation of Structured Concurrency (Preview) [v4] In-Reply-To: <6gZZEoP1WXdBcZUiL5890eNsgaRFzZNY_rBItZdXtNc=.5d8f7bd9-44d5-4074-8a5c-35f8203263b2@github.com> References: <6gZZEoP1WXdBcZUiL5890eNsgaRFzZNY_rBItZdXtNc=.5d8f7bd9-44d5-4074-8a5c-35f8203263b2@github.com> Message-ID: > This is the implementation of: > > - JEP 453: Structured Concurrency (Preview) > - JEP 446: Scoped Values (Preview) > > For the most part, this is just moving code and tests. StructuredTaskScope moves to j.u.concurrent as a preview API, ScopedValue moves to j.lang as a preview API, and module jdk.incubator.concurrent has been removed. 
The significant API changes since incubator are: > > - StructuredTaskScope.fork returns Subtask instead of Future (JEP 453 has a section on this) > - ScopedValue.where methods are replaced with runWhere, callWhere and getWhere Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: - Sync up from loom repo - Merge - Sync with loom repo, re-work ScopedValue class description - Sync up from loom repo - Remove csm.Threads - Merge - Test should not be in update for main line - Sync with loom repo - Sync up tests frmo loom repo - Sync up with loom repo - ... and 5 more: https://git.openjdk.org/jdk/compare/a46b5acc...cc902ce6 ------------- Changes: https://git.openjdk.org/jdk/pull/13932/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13932&range=03 Stats: 9267 lines in 40 files changed: 4880 ins; 4325 del; 62 mod Patch: https://git.openjdk.org/jdk/pull/13932.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13932/head:pull/13932 PR: https://git.openjdk.org/jdk/pull/13932 From stuefe at openjdk.org Thu Jun 1 14:33:27 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 1 Jun 2023 14:33:27 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) In-Reply-To: References: Message-ID: On Fri, 26 May 2023 20:46:29 GMT, Kelvin Nilsen wrote: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. 
Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. I did a first read through the tests to check whether any test changes affect traditional Shenandoah, that is, whether the regression tests for non-generational mode are unchanged. All good; I did not find anything noteworthy.
test/hotspot/jtreg/gc/shenandoah/TestEvilSyncBug.java line 33: > 31: * @modules java.base/jdk.internal.misc > 32: * java.management > 33: * @run driver/timeout=480 TestEvilSyncBug -XX:ShenandoahGCHeuristics=aggressive Probably fine, but why this change to non-generational testing? Will the aggressive heuristic sharpen the test? test/hotspot/jtreg/gc/shenandoah/mxbeans/TestChurnNotifications.java line 169: > 167: > 168: MemoryUsage before = getUsage(mapBefore); > 169: MemoryUsage after = getUsage(mapAfter); This also changes test logic for traditional Shenandoah, but it's harmless. Nit: it would be more precise to require the "Young Gen" pool to exist only for -XX:ShenandoahGCMode=generational. test/hotspot/jtreg/gc/shenandoah/oom/TestAllocOutOfMemory.java line 23: > 21: * questions. > 22: * > 23: */ Three tests folded into one, but it does not look like functionality changed for testing traditional Shenandoah. Okay. test/hotspot/jtreg/gc/shenandoah/oom/TestAllocOutOfMemory.java line 92: > 90: expectFailure("-Xmx16m", > 91: "-XX:+UnlockExperimentalVMOptions", > 92: "-XX:+UseShenandoahGC", Nit: should not need UnlockExperimentalVMOptions anymore. test/hotspot/jtreg/gc/shenandoah/oom/TestClassLoaderLeak.java line 132: > 130: {{"iu"}, {"adaptive", "aggressive"}}, > 131: {{"passive"}, {"passive"}}, > 132: {{"generational"}, {"adaptive"}} Curious, here and in similar places, why only test the adaptive heuristic for generational, if we test satb with all variants?
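The TestChurnNotifications nit above, requiring the young pool to exist only in generational mode, could be expressed along the following lines. This is a rough sketch, not the test's code: the class and method names are hypothetical, and matching on the substring "Young Gen" is an assumption taken from the wording of the comment rather than from the patch.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

final class YoungPoolCheckSketch {
    // Assumed naming: a generational pool name contains "Young Gen".
    static boolean hasYoungPool(List<String> poolNames) {
        for (String name : poolNames) {
            if (name.contains("Young Gen")) {
                return true;
            }
        }
        return false;
    }

    // True when the presence of a young pool matches the expected GC mode.
    static boolean poolsMatchMode(List<String> poolNames, boolean generational) {
        return hasYoungPool(poolNames) == generational;
    }

    // Names of the memory pools reported by the running JVM.
    static List<String> currentPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean bean : ManagementFactory.getMemoryPoolMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        boolean generational = args.length > 0 && args[0].equals("generational");
        List<String> names = currentPoolNames();
        System.out.println(names);
        System.out.println(poolsMatchMode(names, generational));
    }
}
```

A test would pass the expected mode explicitly, as `main` does here through its first argument, so that a missing or unexpected pool fails the check instead of being silently tolerated.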
------------- PR Review: https://git.openjdk.org/jdk/pull/14185#pullrequestreview-1455490699 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1213164085 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1213232252 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1213243280 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1213242053 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1213244749 From sgibbons at openjdk.org Thu Jun 1 14:40:10 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 1 Jun 2023 14:40:10 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v2] In-Reply-To: References: <2NOPy1QG4rGLMmXNTv_6E6WCKdRCLg466z_tGqo3xeE=.183282f8-8068-4bc0-941b-81b9a29138be@github.com> Message-ID: On Thu, 1 Jun 2023 11:40:21 GMT, Jatin Bhateja wrote: > Hi @asgibbons , Kindly also include the results for following benchmark test/micro/org/openjdk/bench/vm/floatingpoint/DremFrem.java > > Best Regards, Jatin Benchmark Mode Cnt Score Error Units DremFrem.calcDoubleJava avgt 25 16.551 ± 0.025 ns/op DremFrem.calcFloatJava avgt 25 17.197 ± 0.166 ns/op DremFrem.cornercaseDoubleJava avgt 25 5.469 ± 0.005 ns/op DremFrem.cornercaseFloatJava avgt 25 5.472 ± 0.004 ns/op ------------- PR Comment: https://git.openjdk.org/jdk/pull/14224#issuecomment-1572179085 From bulasevich at openjdk.org Thu Jun 1 14:51:42 2023 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Thu, 1 Jun 2023 14:51:42 GMT Subject: RFR: 8305959: x86: Improve itable_stub [v3] In-Reply-To: References: Message-ID: > Async profiler shows that applications spend up to 10% in itable_stubs. > > The current inefficiency of itable stubs is as follows. The generated itable_stub scans itable twice: first it checks if the object class is a subtype of the resolved_class, and then it finds the holder_class that implements the method. 
I suggest doing this in one pass: with a first loop over itable, check pointer equality to both holder_class and resolved_class. Once we have finished searching for resolved_class, continue searching for holder_class in a separate loop if it has not yet been found. > > This approach gives 1-10% improvement on the synthetic benchmarks and 3% improvement on Naive Bayes benchmark from the Renaissance Benchmark Suite (Intel Xeon X5675). Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: Apply suggestions from code review Co-authored-by: Aleksey Shipilëv ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13460/files - new: https://git.openjdk.org/jdk/pull/13460/files/7a259831..0ef7fd9c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13460&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13460&range=01-02 Stats: 5 lines in 1 file changed: 1 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/13460.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13460/head:pull/13460 PR: https://git.openjdk.org/jdk/pull/13460 From aph at openjdk.org Thu Jun 1 15:03:13 2023 From: aph at openjdk.org (Andrew Haley) Date: Thu, 1 Jun 2023 15:03:13 GMT Subject: RFR: 8296411: AArch64: Accelerated Poly1305 intrinsics [v4] In-Reply-To: References: Message-ID: On Thu, 1 Jun 2023 12:16:45 GMT, Andrew Dinn wrote: > This comment and the next one both need correcting. They mention U_0HI and U_1HI and, as the previous comment says, those registers are dead. > > What actually happens here is best summarized as > > // U_2:U_1:U_0 += (U2 >> 2) * 5 > > or, if we actually want to be clearer about the current encoding which does it in several steps > > // rscratch1 = (U2 >> 2) // U2 = U2[1:0] // U_2:U_1:U_0 += rscratch1 // U_2:U_1:U_0 += (rscratch1 << 2) > > i.e. 
any bits that are set from 130 upwards are masked off, treated as an integer in their own right, multiplied by 5 and the result added back in at the bottom to update the 130 bit result U2[1:0]:U1[63:0]:U0[63:0]. OK. > I'm not sure whether this provides an opportunity for you to optimize this by doing the multiply by five earlier i.e. replace the code with this version I'm not sure either, which is why it's done in two separate steps. I think you may be right, but it's a bit late to be optimizing this version any further. That would require careful analysis and a redo of all the testing. > The obvious concern is that the multiply of rscratch1 by 5 might overflow 64 bits. Is that why you have implemented two add and carry steps? Indeed. > If so then why is it legitimate to do the multiply by 5 up front in the final reduction that follows the loop? I assume that you're referring to the multiply by 5 in // Further reduce modulo 2^130 - 5 __ lsr(rscratch1, U_2, 2); __ add(rscratch1, rscratch1, rscratch1, Assembler::LSL, 2); // rscratch1 = U_2 * 5 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14085#discussion_r1213294906 From gziemski at openjdk.org Thu Jun 1 15:45:22 2023 From: gziemski at openjdk.org (Gerard Ziemski) Date: Thu, 1 Jun 2023 15:45:22 GMT Subject: RFR: 8308341: JNI_GetCreatedJavaVMs returns a partially initialized JVM [v2] In-Reply-To: References: <70EdOsdu-XZJwsuEg2Paw_AztjzAlaxJnJ1KX7QOh_s=.a55430d0-ce46-40e5-92e4-23f305006d01@github.com> Message-ID: On Wed, 31 May 2023 23:01:54 GMT, David Holmes wrote: >> > You have chosen to use an OS independent mechanism, at the cost of exposing the implementation to the outside world, by introducing a new stage (needs CSR). >> >> That is not an accurate characterisation of this change. The CSR request is needed because of the change in behaviour it introduces (by only returning a fully initialized VM), and has nothing at all to do with the implementation. 
We still need to be able to distinguish when VM creation has started (to prevent concurrent attempt) and when it has completed (so GetCreatedJavaVMs can return a valid VM) - the mechanism by which that is achieved is immaterial. Sorry, I should have read https://bugs.openjdk.org/browse/JDK-8308816 instead of assuming the impact of this change on the outside. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14139#issuecomment-1572295716 From sgibbons at openjdk.org Thu Jun 1 16:03:13 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 1 Jun 2023 16:03:13 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v2] In-Reply-To: References: <2NOPy1QG4rGLMmXNTv_6E6WCKdRCLg466z_tGqo3xeE=.183282f8-8068-4bc0-941b-81b9a29138be@github.com> Message-ID: <2m8KCrlZkRTg4pBAwASL4FKc_MFtL-POWmhG0ebAiwQ=.36cdb946-abb1-4938-aaa3-327775b26d65@github.com> On Thu, 1 Jun 2023 11:40:21 GMT, Jatin Bhateja wrote: > Hi @asgibbons , Kindly also include the results for following benchmark test/micro/org/openjdk/bench/vm/floatingpoint/DremFrem.java > > Best Regards, Jatin Current top-of-tree results: Benchmark Mode Cnt Score Error Units DremFrem.calcDoubleJava avgt 25 7.034 ± 0.001 ns/op DremFrem.calcFloatJava avgt 25 7.011 ± 0.001 ns/op DremFrem.cornercaseDoubleJava avgt 25 5.514 ± 0.006 ns/op DremFrem.cornercaseFloatJava avgt 25 5.510 ± 0.003 ns/op My changes: Benchmark Mode Cnt Score Error Units DremFrem.calcDoubleJava avgt 25 3.165 ± 0.001 ns/op DremFrem.calcFloatJava avgt 25 4.381 ± 0.001 ns/op DremFrem.cornercaseDoubleJava avgt 25 5.512 ± 0.002 ns/op DremFrem.cornercaseFloatJava avgt 25 5.524 ±
0.009 ns/op ------------- PR Comment: https://git.openjdk.org/jdk/pull/14224#issuecomment-1572324290 From lucy at openjdk.org Thu Jun 1 16:04:07 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Thu, 1 Jun 2023 16:04:07 GMT Subject: RFR: 8308469: [PPC64] Implement alternative fast-locking scheme [v3] In-Reply-To: References: Message-ID: On Fri, 26 May 2023 11:02:32 GMT, Martin Doerr wrote: >> New alternative fast-locking scheme for PPC64. Mostly implemented like on other platforms. >> Differences (also explained by comments in code): >> - Not using C2HandleAnonOMOwnerStub because the C2 code is reused for native wrappers. >> - Implemented a helper function `MacroAssembler::atomically_flip_locked_state` which makes it much easier to implement fast_lock/unlock for PPC64 (mainly because of register constraints in C1). >> - Using acquire/release barriers only for locking/unlocking. >> >> I have changed the C2 code to use ConditionRegister CR0 which fits better to the new locking code. Therefore, I have adapted the other modes to work with that, too. >> Note that we don't support RTM with new locking modes. That feature will probably get removed in a future JDK version. (Already unsupported with Power10.) > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove ObjectMonitor::ANONYMOUS_OWNER handling from fast path. We take the slow path instead. LGTM. I tried hard but could not find anything to complain about. ------------- Marked as reviewed by lucy (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14069#pullrequestreview-1455826263 From aph at openjdk.org Thu Jun 1 16:09:07 2023 From: aph at openjdk.org (Andrew Haley) Date: Thu, 1 Jun 2023 16:09:07 GMT Subject: RFR: 8296411: AArch64: Accelerated Poly1305 intrinsics [v4] In-Reply-To: References: Message-ID: On Thu, 1 Jun 2023 15:00:26 GMT, Andrew Haley wrote: > This comment and the next one both need correcting. 
They mention U_0HI and U_1HI and, as the previous comment says, those registers are dead. > > What actually happens here is best summarized as > > // U_2:U_1:U_0 += (U2 >> 2) * 5 > > or, if we actually want to be clearer about the current encoding which does it in several steps > > // rscratch1 = (U2 >> 2) // U2 = U2[1:0] // U_2:U_1:U_0 += rscratch1 // U_2:U_1:U_0 += (rscratch1 << 2) > > i.e. any bits that are set from 130 upwards are masked off, treated as an integer in their own right, multiplied by 5 and the result added back in at the bottom to update the 130 bit result U2[1:0]:U1[63:0]:U0[63:0]. OK. > I'm not sure whether this provides an opportunity for you to optimize this by doing the multiply by five earlier i.e. replace the code with this version I'm not sure either, which is why it's done in two separate steps. I think you may be right, but it's a bit late to be optimizing this version any further. That would require careful analysis and a redo of all the testing. > The obvious concern is that the multiply of rscratch1 by 5 might overflow 64 bits. Is that why you have implemented two add and carry steps? Indeed. > If so then why is it legitimate to do the multiply by 5 up front in the final reduction that follows the loop? I assume that you're referring to the multiply by 5 in // Further reduce modulo 2^130 - 5 __ lsr(rscratch1, U_2, 2); __ add(rscratch1, rscratch1, rscratch1, Assembler::LSL, 2); // rscratch1 = U_2 * 5 `U_2`, at this point, has only a few lower set bits. This is because `U_2` was previously ANDed with 3, and subsequently was the target of adc(U_2, U_2, zr). 
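The two-step fold discussed above can be checked numerically. What follows is only a sketch in Python, not the AArch64 stub: the names `add_to_limbs`, `partial_reduce` and `limbs_to_int` are hypothetical, and the three 64-bit limbs stand in for `U_0`, `U_1` and `U_2`. It relies on 2^130 being congruent to 5 modulo 2^130 - 5, and it adds `high` and `high << 2` separately because `high * 5` may need 65 bits while `high << 2` always fits in 64.

```python
# Hypothetical model of the partial reduction modulo p = 2^130 - 5.
P = (1 << 130) - 5
MASK64 = (1 << 64) - 1


def limbs_to_int(u0, u1, u2):
    """Combine three 64-bit limbs into one integer U_2:U_1:U_0."""
    return u0 | (u1 << 64) | (u2 << 128)


def add_to_limbs(u0, u1, u2, addend):
    """Add a value below 2^64 to the limbs, propagating carries."""
    total = u0 + addend
    u0, carry = total & MASK64, total >> 64
    total = u1 + carry
    u1, carry = total & MASK64, total >> 64
    return u0, u1, u2 + carry


def partial_reduce(u0, u1, u2):
    """Fold bits 130 and above back into the low limbs."""
    high = u2 >> 2          # rscratch1 = U_2 >> 2 (at most 62 bits)
    u2 &= 3                 # U_2 = U_2[1:0]
    # Two add-with-carry passes: += high, then += high << 2 (5 * high total).
    u0, u1, u2 = add_to_limbs(u0, u1, u2, high)
    u0, u1, u2 = add_to_limbs(u0, u1, u2, high << 2)
    return u0, u1, u2
```

The result is congruent to the input modulo 2^130 - 5 but is not fully reduced; `U_2` ends up holding only a few low bits, which is consistent with the explanation given for the final reduction.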
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14085#discussion_r1213382344 From aph at openjdk.org Thu Jun 1 16:16:32 2023 From: aph at openjdk.org (Andrew Haley) Date: Thu, 1 Jun 2023 16:16:32 GMT Subject: RFR: 8296411: AArch64: Accelerated Poly1305 intrinsics [v5] In-Reply-To: References: Message-ID: > This provides a solid speedup of about 3-4x over the Java implementation. > > I have a vectorized version of this which uses a bunch of tricks to speed it up, but it's complex and can still be improved. We're getting close to ramp down, so I'm submitting this simple intrinsic so that we can get it reviewed in time. > > Benchmarks: > > > ThunderX (2, I think): > > Benchmark (dataSize) (provider) Mode Cnt Score Error Units > Poly1305DigestBench.updateBytes 64 thrpt 3 14078352.014 ± 4201407.966 ops/s > Poly1305DigestBench.updateBytes 256 thrpt 3 5154958.794 ± 1717146.980 ops/s > Poly1305DigestBench.updateBytes 1024 thrpt 3 1416563.273 ± 1311809.454 ops/s > Poly1305DigestBench.updateBytes 16384 thrpt 3 94059.570 ± 2913.021 ops/s > Poly1305DigestBench.updateBytes 1048576 thrpt 3 1441.024 ± 164.443 ops/s > > Benchmark (dataSize) (provider) Mode Cnt Score Error Units > Poly1305DigestBench.updateBytes 64 thrpt 3 4516486.795 ± 419624.224 ops/s > Poly1305DigestBench.updateBytes 256 thrpt 3 1228542.774 ± 202815.694 ops/s > Poly1305DigestBench.updateBytes 1024 thrpt 3 316051.912 ± 23066.449 ops/s > Poly1305DigestBench.updateBytes 16384 thrpt 3 20649.561 ± 1094.687 ops/s > Poly1305DigestBench.updateBytes 1048576 thrpt 3 310.564 ± 31.053 ops/s > > Apple M1: > > Benchmark (dataSize) (provider) Mode Cnt Score Error Units > Poly1305DigestBench.updateBytes 64 thrpt 3 33551968.946 ± 849843.905 ops/s > Poly1305DigestBench.updateBytes 256 thrpt 3 9911637.214 ± 63417.224 ops/s > Poly1305DigestBench.updateBytes 1024 thrpt 3 2604370.740 ± 29208.265 ops/s > Poly1305DigestBench.updateBytes 16384 thrpt 3 165183.633 ±
1975.998 ops/s > Poly1305DigestBench.updateBytes 1048576 thrpt 3 2587.132 ± 40.240 ops/s > > Benchmark (dataSize) (provider) Mode Cnt Score Error Units > Poly1305DigestBench.updateBytes 64 thrpt 3 12373649.589 ± 184757.721 ops/s > Poly1305DigestBench.updateBytes 256 th... Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: Review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14085/files - new: https://git.openjdk.org/jdk/pull/14085/files/93a03c62..87c1eff7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14085&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14085&range=03-04 Stats: 6 lines in 1 file changed: 2 ins; 1 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14085.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14085/head:pull/14085 PR: https://git.openjdk.org/jdk/pull/14085 From mdoerr at openjdk.org Thu Jun 1 17:28:21 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 1 Jun 2023 17:28:21 GMT Subject: RFR: 8308469: [PPC64] Implement alternative fast-locking scheme [v3] In-Reply-To: References: Message-ID: On Fri, 26 May 2023 11:02:32 GMT, Martin Doerr wrote: >> New alternative fast-locking scheme for PPC64. Mostly implemented like on other platforms. >> Differences (also explained by comments in code): >> - Not using C2HandleAnonOMOwnerStub because the C2 code is reused for native wrappers. >> - Implemented a helper function `MacroAssembler::atomically_flip_locked_state` which makes it much easier to implement fast_lock/unlock for PPC64 (mainly because of register constraints in C1). >> - Using acquire/release barriers only for locking/unlocking. >> >> I have changed the C2 code to use ConditionRegister CR0 which fits better to the new locking code. Therefore, I have adapted the other modes to work with that, too. >> Note that we don't support RTM with new locking modes. That feature will probably get removed in a future JDK version. 
(Already unsupported with Power10.) > > Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: > > Remove ObjectMonitor::ANONYMOUS_OWNER handling from fast path. We take the slow path instead. Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14069#issuecomment-1572486916 From mdoerr at openjdk.org Thu Jun 1 17:28:23 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 1 Jun 2023 17:28:23 GMT Subject: Integrated: 8308469: [PPC64] Implement alternative fast-locking scheme In-Reply-To: References: Message-ID: On Sat, 20 May 2023 15:32:29 GMT, Martin Doerr wrote: > New alternative fast-locking scheme for PPC64. Mostly implemented like on other platforms. > Differences (also explained by comments in code): > - Not using C2HandleAnonOMOwnerStub because the C2 code is reused for native wrappers. > - Implemented a helper function `MacroAssembler::atomically_flip_locked_state` which makes it much easier to implement fast_lock/unlock for PPC64 (mainly because of register constraints in C1). > - Using acquire/release barriers only for locking/unlocking. > > I have changed the C2 code to use ConditionRegister CR0 which fits better to the new locking code. Therefore, I have adapted the other modes to work with that, too. > Note that we don't support RTM with new locking modes. That feature will probably get removed in a future JDK version. (Already unsupported with Power10.) This pull request has now been integrated. Changeset: 0ab09630 Author: Martin Doerr URL: https://git.openjdk.org/jdk/commit/0ab09630c6af42cb4d65a79a2ddd7799443e73ee Stats: 390 lines in 7 files changed: 233 ins; 33 del; 124 mod 8308469: [PPC64] Implement alternative fast-locking scheme Reviewed-by: rrich, lucy ------------- PR: https://git.openjdk.org/jdk/pull/14069 From ysr at openjdk.org Thu Jun 1 17:55:20 2023 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Thu, 1 Jun 2023 17:55:20 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) In-Reply-To: References: Message-ID: On Thu, 1 Jun 2023 12:15:37 GMT, Martin Doerr wrote: > Issues already reported to GenShen engineers: > > gc/shenandoah/TestElasticTLAB.java#generational # Internal Error (src\hotspot\share\gc\shenandoah\shenandoahFreeSet.cpp:695), pid=23288, tid=23784 # assert(size % CardTable::card_size_in_words() == 0) failed: size must be multiple of card table size, was 258 > > gc/stress/gcold/TestGCOldWithShenandoah.java#generational # Internal Error (src\hotspot\share\gc\shenandoah\heuristics\shenandoahOldHeuristics.cpp:82), pid=20828, tid=5836 # assert(_old_generation->available() > old_evacuation_budget) failed: Cannot budget more than is available > > gc/shenandoah/oom/TestAllocOutOfMemory.java#large Execution failed: `main' threw exception: java.lang.RuntimeException: 'java.lang.OutOfMemoryError: Java heap space' missing from stdout/stderr (Issue with 64k Pages) > > gc/shenandoah/TestRetainObjects.java#no-tlab gc/shenandoah/TestSieveObjects.java#no-tlab Timeouts. > > gc/shenandoah/TestAllocObjects.java#generational gc/shenandoah/TestDynamicSoftMaxHeapSize.java#generational # Internal Error (src/hotspot/share/gc/shenandoah/shenandoahGeneration.cpp:664), pid=18434, tid=29955 # assert(is_global() || ShenandoahHeap::heap()->is_full_gc_in_progress() || (_used + _humongous_waste <= _affiliated_region_count * ShenandoahHeapRegion::region_size_bytes())) failed: used cannot exceed regions Thanks @TheRealMDoerr ; could you specify the platforms on which you see these failures, so we have a better chance at reproducing them? Thanks! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1572528277 From kdnilsen at openjdk.org Thu Jun 1 18:07:53 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Thu, 1 Jun 2023 18:07:53 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v2] In-Reply-To: References: Message-ID: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. 
If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Make the order of young/old collector checks consistent (#1) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/aa85a907..eb656ec2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=00-01 Stats: 16 lines in 1 file changed: 7 ins; 8 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From wkemper at openjdk.org Thu Jun 1 18:42:25 2023 From: wkemper at openjdk.org (William Kemper) Date: Thu, 1 Jun 2023 18:42:25 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v2] In-Reply-To: References: Message-ID: On Thu, 1 Jun 2023 10:12:02 GMT, Stefan Karlsson wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Make the order of young/old collector checks consistent (#1) > > src/hotspot/share/gc/shared/gcConfiguration.cpp line 88: > >> 86: } >> 87: #endif >> 88: return NA; > > You moved
the order between Shenandoah and ZGC in `young_collector()`, so you should probably do the same here. Fixed. Thank you for the review. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1213538282 From duke at openjdk.org Thu Jun 1 20:05:19 2023 From: duke at openjdk.org (duke) Date: Thu, 1 Jun 2023 20:05:19 GMT Subject: Withdrawn: 8284196: RISC-V: Detect supported ISA extensions over cpuinfo In-Reply-To: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> References: <5epXxFbcRxLAHDle_rbwDntEcXBZYs4U5hjDCoCIqBg=.fc56c175-f28d-45a9-962f-f9120e96a8a4@github.com> Message-ID: On Tue, 31 Jan 2023 13:51:59 GMT, Feilong Jiang wrote: > Currently, `elf_hwcap` for RISC-V only sets single-letter extension bit (e.g. IMAFD). > As many standard multi-letter ISA extensions are ratified (e.g. Zba/Zbb/Zbc/Zbs), > we should find a stable way to detect these supported ISA extensions in JVM. > [1] has proposed a way to parse supported extensions through /proc/cpuinfo > or "riscv,isa" string of /sys/firmware/devicetree, we could detect supported extensions > in the same way. > > Here is an example of /proc/cpuinfo with multi-letter extensions from Ubuntu 20.04 in QEMU-SYSTEM: > > > ubuntu at ubuntu:~$ uname -a > Linux ubuntu 5.8.0-14-generic #16~20.04.3-Ubuntu SMP Mon Feb 1 16:33:19 UTC 2021 riscv64 riscv64 riscv64 GNU/Linux > ubuntu at ubuntu:~$ cat /proc/cpuinfo > processor : 0 > hart : 2 > isa : rv64imafdch_zicsr_zifencei_zihintpause_zba_zbb_zbc_zbs_sstc > mmu : sv48 > > > 1: http://lists.infradead.org/pipermail/linux-riscv/2021-November/010252.html > > > Testing: > - [x] `jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseZihintpause -XX:+UseRVV -XX:+UseZicbop -XX:+UseZba -XX:+UseZbb -XX:+UseZbs -XX:+PrintFlagsFinal -version` with release build This pull request has been closed without being integrated. 
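The `/proc/cpuinfo` approach described in the withdrawn RISC-V PR above can be sketched as follows. This is an illustration in Python, not the HotSpot implementation: the helper names `isa_from_cpuinfo` and `parse_riscv_isa` are hypothetical, and the parser assumes the simple layout shown in the quoted example (an `rv<xlen>` prefix, then single-letter extensions, then underscore-separated multi-letter extensions). It does not handle version suffixes or other variations a real kernel might emit.

```python
def isa_from_cpuinfo(text):
    """Return the value of the first 'isa' field in /proc/cpuinfo text, or None."""
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "isa":
            return value.strip()
    return None


def parse_riscv_isa(isa):
    """Split an isa string into (xlen, single-letter exts, multi-letter exts)."""
    isa = isa.strip().lower()
    if not isa.startswith("rv"):
        raise ValueError("not a RISC-V isa string: %r" % isa)
    parts = isa.split("_")
    base = parts[0]
    # Collect the register width digits that follow "rv" (for example 64).
    pos = 2
    while pos < len(base) and base[pos].isdigit():
        pos += 1
    if pos == 2:
        raise ValueError("missing xlen in isa string: %r" % isa)
    xlen = int(base[2:pos])
    single = list(base[pos:])              # e.g. i, m, a, f, d, c, h
    multi = [p for p in parts[1:] if p]    # e.g. zba, zbb, zbc, zbs
    return xlen, single, multi
```

A caller would read `/proc/cpuinfo`, pass its contents to `isa_from_cpuinfo`, and then look up names such as `zba` or `zbb` in the multi-letter list before enabling the corresponding feature.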
------------- PR: https://git.openjdk.org/jdk/pull/12343 From sgibbons at openjdk.org Thu Jun 1 21:18:52 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 1 Jun 2023 21:18:52 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v3] In-Reply-To: References: Message-ID: <3-C-x5eRi42jFZRHfL4euEAMoZowoqGtkU7E1DOIc2Q=.007976ae-2010-4ae7-aadb-ec14d1d0930a@github.com> > Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem. > > Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). > > Old: > gcc-12.2.1-4.fc36.x86_64 > 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix > JVM version: 21-internal > Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 > Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 > Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 > Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 > Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 > Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 > Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 > Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 > Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 > Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 > New: > JVM version: 21-internal (float) > Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42 > Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 > Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17 > 
Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 > Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Change to more efficient algorithm for AVX512 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14224/files - new: https://git.openjdk.org/jdk/pull/14224/files/351afa38..e1131955 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=01-02 Stats: 257 lines in 2 files changed: 240 ins; 6 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/14224.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14224/head:pull/14224 PR: https://git.openjdk.org/jdk/pull/14224 From amenkov at openjdk.org Fri Jun 2 00:01:07 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 2 Jun 2023 00:01:07 GMT Subject: RFR: 8308978: regression with a deadlock involving FollowReferences [v3] In-Reply-To: <6xcKqU3mLr9TocEUpoXXzcNWSnKijSlhcyxIfXXrFD0=.2987b8de-827e-43d0-907b-7ef2016ddab4@github.com> References: <2J1qItzUgmfjRPS0xUbHgXZQ-b12JBxe8XPRftU2GyA=.025e7855-5df4-413a-bea7-585a53832025@github.com> <6xcKqU3mLr9TocEUpoXXzcNWSnKijSlhcyxIfXXrFD0=.2987b8de-827e-43d0-907b-7ef2016ddab4@github.com> Message-ID: On Wed, 31 May 2023 21:33:13 GMT, Alex Menkov wrote: >> The change fixes regression from JDK-8299414. 
>> There is a deadlock between JvmtiVTMSTransitionDisabler and EscapeBarrier when virtual threads are in mount/unmount transition: >> EscapeBarrier requests deoptimization which requires thread suspension. >> JvmtiVTMSTransitionDisabler ctor waits until all in progress VTMS transitions complete, but they cannot be completed as thread is suspended. >> To avoid the deadlock mount/unmount transitions should be completed before EscapeBarrier stuff. > > Alex Menkov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - Merge branch 'follow_ref_deadlock' of github.com:alexmenkov/jdk into follow_ref_deadlock > - Merge branch 'openjdk:master' into follow_ref_deadlock > - Merge branch 'follow_ref_deadlock' of github.com:alexmenkov/jdk into follow_ref_deadlock > - fix > - unproblem-list tests > - fix > - unproblem-list tests > - fix tier 1-5 passed (there are 3 failed tests, but they are not related to FollowReferences) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14233#issuecomment-1572934722 From kdnilsen at openjdk.org Fri Jun 2 02:30:21 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 2 Jun 2023 02:30:21 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v3] In-Reply-To: References: Message-ID: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. 
Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. 
Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Assert bounds only when allocations succeed, increase test timeouts (#2) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/eb656ec2..5bf6e7e0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=01-02 Stats: 3 lines in 3 files changed: 0 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From kdnilsen at openjdk.org Fri Jun 2 02:49:25 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 2 Jun 2023 02:49:25 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4] In-Reply-To: References: Message-ID: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. 
Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.
Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Force PLAB sizes to align on card-table size ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/5bf6e7e0..d4d2f1cf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=02-03 Stats: 8 lines in 1 file changed: 8 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From kdnilsen at openjdk.org Fri Jun 2 02:56:23 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 2 Jun 2023 02:56:23 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4] In-Reply-To: References: Message-ID: On Thu, 1 Jun 2023 18:39:05 GMT, William Kemper wrote: >> src/hotspot/share/gc/shared/gcConfiguration.cpp line 88: >> >>> 86: } >>> 87: #endif >>> 88: return NA; >> >> You moved the order between Shenandoah and ZGC in `young_collector()`, so you should probably do the same here. > > Fixed. Thank you for the review. Thanks. We've made your suggested change. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1213862862 From kdnilsen at openjdk.org Fri Jun 2 02:56:25 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Fri, 2 Jun 2023 02:56:25 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4] In-Reply-To: References: Message-ID: <3lkkOfu3WxdRPQ4Y0uiOl6znrpRNDrcMn4ecDhQAcuU=.d742a62e-6ec7-444f-a925-f70dfeaf7df9@github.com> On Thu, 1 Jun 2023 14:27:12 GMT, Thomas Stuefe wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Force PLAB sizes to align on card-table size > > test/hotspot/jtreg/gc/shenandoah/oom/TestClassLoaderLeak.java line 132: > >> 130: {{"iu"}, {"adaptive", "aggressive"}}, >> 131: {{"passive"}, {"passive"}}, >> 132: {{"generational"}, {"adaptive"}} > > Curious, here and in similar places, why only test adaptive heuristic for generational, if we test satb with all variants? Generational mode only works with adaptive heuristic. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1213863860 From duke at openjdk.org Fri Jun 2 03:14:06 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Fri, 2 Jun 2023 03:14:06 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v2] In-Reply-To: References: Message-ID: On Wed, 31 May 2023 14:49:23 GMT, Ashutosh Mehra wrote: >> This patch is the first step towards having a single set of GC APIs for allocating heap space for the archived objects (See https://bugs.openjdk.org/browse/JDK-8296263). >> It moves some of the G1 specific logic from CDS to G1 gc without changing the functionality. >> >> Changes that add/update GC APIs for handling archive heap would be introduced in upcoming patches. 
> > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Remove unused method FileMapInfo::heap_region_mapped_address > > Signed-off-by: Ashutosh Mehra @tschatzl > I'm not convinced not giving a preferred location is a good idea. That seems to reduce the opportunity to directly map archives significantly. Previously, with only heap size changes, the archive could be mapped in still. I am not sure I get this. The patch does not change the ability to map the archive. It just moved the calculation to map the archive region from CDS to G1. Before this patch CDS code would determine the address for mapping the archive space towards the top of the heap and pass that address to G1. This patch just moves that calculation to G1. So it should be at-par with the current state. If it is not, please point out and I can work on fixing that. > Since this change is an intermediate step, could you provide an overview of the final API/change too? It is hard to comment on this without knowing where you are going with that. Ok, I have updated the description of the main issue [JDK-8296263](https://bugs.openjdk.org/browse/JDK-8296263) with some details on the expected changes in the future patches. My main aim with this work is mainly code reorganization to avoid using different GC APIs in CDS depending on the GC policy in use. When this is completed I expect it to provide same functionality as today. Any enhancement, like passing preferred location to map archive heap, can be built on top of this. Hope this helps. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14208#issuecomment-1573075190 From iklam at openjdk.org Fri Jun 2 03:39:16 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 2 Jun 2023 03:39:16 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v2] In-Reply-To: References: Message-ID: <_wIgfWGtjNLtTm6s9_WsLWWEbLlBsW-xEr1CwBBoM0M=.3ce944c1-23ed-4376-8896-e3a437de17b0@github.com> On Fri, 2 Jun 2023 03:11:03 GMT, Ashutosh Mehra wrote: > @tschatzl > > > I'm not convinced not giving a preferred location is a good idea. That seems to reduce the opportunity to directly map archives significantly. Previously, with only heap size changes, the archive could be mapped in still. > > I am not sure I get this. The patch does not change the ability to map the archive. It just moved the calculation to map the archive region from CDS to G1. Before this patch CDS code would determine the address for mapping the archive space towards the top of the heap and pass that address to G1. This patch just moves that calculation to G1. So it should be at-par with the current state. If it is not, please point out and I can work on fixing that. > > > Since this change is an intermediate step, could you provide an overview of the final API/change too? It is hard to comment on this without knowing where you are going with that. > > Ok, I have updated the description of the main issue [JDK-8296263](https://bugs.openjdk.org/browse/JDK-8296263) with some details on the expected changes in the future patches. My main aim with this work is mainly code reorganization to avoid using different GC APIs in CDS depending on the GC policy in use. When this is completed I expect it to provide same functionality as today. Any enhancement, like passing preferred location to map archive heap, can be built on top of this. > > Hope this helps. 
Hi Ashutosh, You are right that in the existing code, although filemap.cpp finds out where the requested range is, it doesn't actually pass that to G1. It just requests G1 to reserve a range at the end of the runtime heap. So your PR preserves this behavior. I think Thomas's point is, the requested range should be passed to the `alloc_archive_regions()` API, even though the collector may simply ignore it. In the past, the archive G1 regions were not movable, so it was preferable to put them at the end of the heap, even though that might cause relocation. Now that the archive regions are just regular "old" regions, which can move, it may be preferable to reserve them at the requested range. BTW, perhaps `alloc_archive_regions()` should be renamed to `alloc_archive_range()` going forward. The plural form of "regions" sounds odd for non-region based collectors. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14208#issuecomment-1573096066 From thartmann at openjdk.org Fri Jun 2 05:23:23 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 2 Jun 2023 05:23:23 GMT Subject: RFR: 8309044: Replace NULL with nullptr, final sweep of hotspot code [v2] In-Reply-To: References: <3FoMnGeBp8DqkpVb6YGXKxdPsgGz6ej-jrf2U2stVfU=.56a11e19-38dd-420a-a07d-3b025120f194@github.com> Message-ID: On Tue, 30 May 2023 19:15:38 GMT, Johan Sjölen wrote: >> A final sweep of Hotspot to remove all re-added NULLs. With only 110 changes I'd appreciate if this was considered trivial. > > Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: > > - Align > - Suggestions I think if we just rely on reviews, NULLs will slip through again and we would need to have regular cleanup PRs. Doug's idea seems simple enough to implement in Skara/jcheck. An alternative to whitelisting would be a warning in the offending PR or a requirement for "special approval" of such changes (for example, via a Skara command).
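For illustration only, a check of the kind discussed here could scan the added lines of a unified diff for the `NULL` token. The sketch below is a hypothetical stand-alone helper in C++. The name `added_lines_with_null` and the diff handling are assumptions made for this example and do not reflect how Skara or jcheck are actually implemented.

```cpp
#include <cassert>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not a Skara/jcheck API): returns every added line of a
// unified diff that contains NULL as a whole word. Identifiers such as
// NULL_WORD do not match, because '_' counts as a word character.
std::vector<std::string> added_lines_with_null(const std::string& unified_diff) {
  static const std::regex null_token("\\bNULL\\b");
  std::vector<std::string> hits;
  std::istringstream in(unified_diff);
  std::string line;
  while (std::getline(in, line)) {
    // Added lines start with '+'. Lines starting with "+++" are treated as
    // file headers, a simplification that would also skip added content
    // beginning with "++".
    bool added = !line.empty() && line[0] == '+' &&
                 line.compare(0, 3, "+++") != 0;
    if (added && std::regex_search(line, null_token)) {
      hits.push_back(line);
    }
  }
  return hits;
}
```

A checker built on something like this could either reject the change or only emit a warning on the PR, depending on which of the two policies above is preferred.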
------------- PR Comment: https://git.openjdk.org/jdk/pull/14198#issuecomment-1573169963 From thartmann at openjdk.org Fri Jun 2 05:53:08 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Fri, 2 Jun 2023 05:53:08 GMT Subject: RFR: 8305959: x86: Improve itable_stub [v3] In-Reply-To: References: Message-ID: On Thu, 1 Jun 2023 14:51:42 GMT, Boris Ulasevich wrote: >> Async profiler shows that applications spend up to 10% in itable_stubs. >> >> The current inefficiency of itable stubs is as follows. The generated itable_stub scans itable twice: first it checks if the object class is a subtype of the resolved_class, and then it finds the holder_class that implements the method. I suggest doing this in one pass: with a first loop over itable, check pointer equality to both holder_class and resolved_class. Once we have finished searching for resolved_class, continue searching for holder_class in a separate loop if it has not yet been found. >> >> This approach gives 1-10% improvement on the synthetic benchmarks and 3% improvement on Naive Bayes benchmark from the Renaissance Benchmark Suite (Intel Xeon X5675). 
> > Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: > > Apply suggestions from code review > > Co-authored-by: Aleksey Shipilëv I'm seeing build failures:
[2023-06-02T05:46:54,104Z] /opt/mach5/mesos/work_dir/slaves/741e9afd-8c02-45c3-b2e2-9db1450d0832-S20155/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/c5ef4fb5-87e1-424f-aecb-edeeb47e527a/runs/036a923e-0df3-49ab-81a9-2ab6e104728d/workspace/open/src/hotspot/cpu/x86/macroAssembler_x86.cpp: In member function 'void MacroAssembler::lookup_interface_method_stub(Register, Register, Register, Register, Register, Register, Register, int, Label&)':
[2023-06-02T05:46:54,104Z] /opt/mach5/mesos/work_dir/slaves/741e9afd-8c02-45c3-b2e2-9db1450d0832-S20155/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/c5ef4fb5-87e1-424f-aecb-edeeb47e527a/runs/036a923e-0df3-49ab-81a9-2ab6e104728d/workspace/open/src/hotspot/cpu/x86/macroAssembler_x86.cpp:4324:40: error: 'method_offset_in_bytes' is not a member of 'itableMethodEntry'
[2023-06-02T05:46:54,104Z] 4324 | int itentry_off = itableMethodEntry::method_offset_in_bytes();
[2023-06-02T05:46:54,104Z] | ^~~~~~~~~~~~~~~~~~~~~~
[2023-06-02T05:46:54,104Z] /opt/mach5/mesos/work_dir/slaves/741e9afd-8c02-45c3-b2e2-9db1450d0832-S20155/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/c5ef4fb5-87e1-424f-aecb-edeeb47e527a/runs/036a923e-0df3-49ab-81a9-2ab6e104728d/workspace/open/src/hotspot/cpu/x86/macroAssembler_x86.cpp:4327:36: error: 'interface_offset_in_bytes' is not a member of 'itableOffsetEntry'
[2023-06-02T05:46:54,105Z] 4327 | int ioffset = itableOffsetEntry::interface_offset_in_bytes();
[2023-06-02T05:46:54,105Z] | ^~~~~~~~~~~~~~~~~~~~~~~~~
[2023-06-02T05:46:54,105Z] /opt/mach5/mesos/work_dir/slaves/741e9afd-8c02-45c3-b2e2-9db1450d0832-S20155/frameworks/1735e8a2-a1db-478c-8104-60c8b0af87dd-0196/executors/c5ef4fb5-87e1-424f-aecb-edeeb47e527a/runs/036a923e-0df3-49ab-81a9-2ab6e104728d/workspace/open/src/hotspot/cpu/x86/macroAssembler_x86.cpp:4328:36: error: 'offset_offset_in_bytes' is not a member of 'itableOffsetEntry'
[2023-06-02T05:46:54,105Z] 4328 | int ooffset = itableOffsetEntry::offset_offset_in_bytes();
[2023-06-02T05:46:54,105Z] | ^~~~~~~~~~~~~~~~~~~~~~
------------- PR Comment: https://git.openjdk.org/jdk/pull/13460#issuecomment-1573189154 From sspitsyn at openjdk.org Fri Jun 2 07:14:23 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 2 Jun 2023 07:14:23 GMT Subject: RFR: 8309044: Replace NULL with nullptr, final sweep of hotspot code [v2] In-Reply-To: References: <3FoMnGeBp8DqkpVb6YGXKxdPsgGz6ej-jrf2U2stVfU=.56a11e19-38dd-420a-a07d-3b025120f194@github.com> Message-ID: On Tue, 30 May 2023 19:15:38 GMT, Johan Sjölen wrote: >> A final sweep of Hotspot to remove all re-added NULLs. With only 110 changes I'd appreciate if this was considered trivial. > > Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision: > > - Align > - Suggestions Looks good.
Thanks, Serguei ------------- PR Review: https://git.openjdk.org/jdk/pull/14198#pullrequestreview-1456781527 From tschatzl at openjdk.org Fri Jun 2 07:30:10 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 2 Jun 2023 07:30:10 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v2] In-Reply-To: <_wIgfWGtjNLtTm6s9_WsLWWEbLlBsW-xEr1CwBBoM0M=.3ce944c1-23ed-4376-8896-e3a437de17b0@github.com> References: <_wIgfWGtjNLtTm6s9_WsLWWEbLlBsW-xEr1CwBBoM0M=.3ce944c1-23ed-4376-8896-e3a437de17b0@github.com> Message-ID: On Fri, 2 Jun 2023 03:36:38 GMT, Ioi Lam wrote: > Hi Ashutosh, > > You are right that in the existing code, although filemap.cpp finds out where the requested range is, it doesn't actually pass that to G1. It just requests G1 to reserve a range at the end of the runtime heap. So your PR preserves this behavior. > > I think Thomas's point is, the requested range should be passed to the `alloc_archive_regions()` API, even though the collector may simply ignore it. Exactly, sorry if I wasn't clear enough. > > In the past, the archive G1 regions were not movable, so it was preferable to put them at the end of the heap, even though that might cause relocation. Now that the archive regions are just regular "old" regions, which can move, it may be preferable to reserve them at the requested range. > > BTW, perhaps `alloc_archive_regions()` should be renamed to `alloc_archive_range()` going forward. The plural form of "regions" sounds odd for non-region based collectors. 
+1 Thanks, Thomas ------------- PR Comment: https://git.openjdk.org/jdk/pull/14208#issuecomment-1573282235 From stefank at openjdk.org Fri Jun 2 07:50:21 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 2 Jun 2023 07:50:21 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4] In-Reply-To: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> References: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> Message-ID: On Fri, 2 Jun 2023 02:49:25 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. 
Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Force PLAB sizes to align on card-table size I've reviewed the shared code and think that looks good. ------------- Marked as reviewed by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14185#pullrequestreview-1456831456 From eosterlund at openjdk.org Fri Jun 2 08:58:06 2023 From: eosterlund at openjdk.org (Erik Österlund) Date: Fri, 2 Jun 2023 08:58:06 GMT Subject: RFR: 8309210: Extend VM Operations hs_err logging In-Reply-To: References: Message-ID: On Wed, 31 May 2023 14:35:03 GMT, Stefan Karlsson wrote: > We have a section in the hs_err file, which prints the most recently run VM operations. Sometimes a VM operation type is used from multiple places in our code and it's not obvious why the VM operation was run. For example, HandshakeAllThreads doesn't tell us why we are running the handshake. I propose that we add an option for the VM operations to tell more about why they are used.
> > The proposed patch enhances the Handshake VM operation and the ZGC pause VM Operations. This will make debugging much easier. Thank you. Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14248#pullrequestreview-1456938967 From fjiang at openjdk.org Fri Jun 2 09:35:22 2023 From: fjiang at openjdk.org (Feilong Jiang) Date: Fri, 2 Jun 2023 09:35:22 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4] In-Reply-To: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> References: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> Message-ID: <9zx2L2htHAAmC0dInZaMaMESDiueZJAn7Tv9XzznJJc=.c3aadf60-97f6-46fe-8bee-a55baeb9bc67@github.com> On Fri, 2 Jun 2023 02:49:25 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. 
As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.
> > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Force PLAB sizes to align on card-table size Hi, I have built this pr based on aa85a90, Tier1 tests failed on `gc/TestAllocHumongousFragment.java#generational` on Linux/RISC-V with the following output: # # A fatal error has been detected by the Java Runtime Environment: # # Internal Error (shenandoahVerifier.cpp:1244), pid=2951116, tid=2951124 # Error: Verify init-mark remembered set violation; clean card should be dirty # # JRE version: OpenJDK Runtime Environment (21.0) (build 21-internal-adhoc.ubuntu.jdk) # Java VM: OpenJDK 64-Bit Server VM (21-internal-adhoc.ubuntu.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, shenandoah gc, linux-riscv64) Looks like Generational Shenandoah does not fully support RISC-V port, should we disable this test on RISC-V port for now? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1573436275 From bulasevich at openjdk.org Fri Jun 2 09:40:14 2023 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Fri, 2 Jun 2023 09:40:14 GMT Subject: RFR: 8305959: x86: Improve itable_stub [v4] In-Reply-To: References: Message-ID: > Async profiler shows that applications spend up to 10% in itable_stubs. > > The current inefficiency of itable stubs is as follows. The generated itable_stub scans itable twice: first it checks if the object class is a subtype of the resolved_class, and then it finds the holder_class that implements the method. I suggest doing this in one pass: with a first loop over itable, check pointer equality to both holder_class and resolved_class. Once we have finished searching for resolved_class, continue searching for holder_class in a separate loop if it has not yet been found. > > This approach gives 1-10% improvement on the synthetic benchmarks and 3% improvement on Naive Bayes benchmark from the Renaissance Benchmark Suite (Intel Xeon X5675). 
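The difference between the two-pass and the one-pass lookup described above can be modelled in plain C++. This is only a sketch under simplifying assumptions: the `ItableEntry` struct, the function names and the `-1` "not found" result are hypothetical, offsets are assumed to be non-negative, and the real stub is generated assembly that walks the HotSpot itable layout rather than a `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified itable entry: an interface identity plus the
// offset of that interface's method table. Offsets are assumed to be >= 0.
struct ItableEntry {
  const void* interface_klass;
  int offset;
};

// Two-pass model: scan once for resolved_klass (the subtype check), then
// scan again from the start for holder_klass.
int lookup_two_pass(const std::vector<ItableEntry>& itable,
                    const void* resolved_klass, const void* holder_klass) {
  bool is_subtype = false;
  for (const ItableEntry& e : itable) {
    if (e.interface_klass == resolved_klass) { is_subtype = true; break; }
  }
  if (!is_subtype) return -1;
  for (const ItableEntry& e : itable) {
    if (e.interface_klass == holder_klass) return e.offset;
  }
  return -1;
}

// One-pass model: the first loop compares each entry against both classes.
// Once resolved_klass has been seen, a second loop only runs over the
// remaining entries, and only if holder_klass has not been found yet.
int lookup_one_pass(const std::vector<ItableEntry>& itable,
                    const void* resolved_klass, const void* holder_klass) {
  int holder_offset = -1;
  bool found_resolved = false;
  std::size_t i = 0;
  for (; i < itable.size(); ++i) {
    const ItableEntry& e = itable[i];
    if (e.interface_klass == holder_klass) holder_offset = e.offset;
    if (e.interface_klass == resolved_klass) { found_resolved = true; ++i; break; }
  }
  if (!found_resolved) return -1;               // receiver is not a subtype
  if (holder_offset != -1) return holder_offset;
  for (; i < itable.size(); ++i) {
    if (itable[i].interface_klass == holder_klass) return itable[i].offset;
  }
  return -1;
}
```

In this model both functions return the same result for every input. The one-pass variant simply never visits an entry twice, which is the saving the patch aims for.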
Boris Ulasevich has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge branch 'openjdk:master' into improve_itable_stub - Apply suggestions from code review Co-authored-by: Aleksey Shipilëv - readability rework - cleanup - 8305959: x86: Improve itable_stub ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13460/files - new: https://git.openjdk.org/jdk/pull/13460/files/0ef7fd9c..b06688cb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13460&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13460&range=02-03 Stats: 227335 lines in 3780 files changed: 166899 ins; 30675 del; 29761 mod Patch: https://git.openjdk.org/jdk/pull/13460.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13460/head:pull/13460 PR: https://git.openjdk.org/jdk/pull/13460 From sjohanss at openjdk.org Fri Jun 2 09:48:05 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Fri, 2 Jun 2023 09:48:05 GMT Subject: RFR: 8309210: Extend VM Operations hs_err logging In-Reply-To: References: Message-ID: On Wed, 31 May 2023 14:35:03 GMT, Stefan Karlsson wrote: > We have a section in the hs_err file, which prints the most recently run VM operations. Sometimes a VM operation type is used from multiple places in our code and it's not obvious why the VM operation was run. For example, HandshakeAllThreads doesn't tell us why we are running the handshake. I propose that we add an option for the VM operations to tell more about why they are used. > > The proposed patch enhances the Handshake VM operation and the ZGC pause VM Operations. Looks good! File an enhancement to add this for all `VM_GC_Operations`. ------------- Marked as reviewed by sjohanss (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14248#pullrequestreview-1457016997 From adinn at openjdk.org Fri Jun 2 09:57:06 2023 From: adinn at openjdk.org (Andrew Dinn) Date: Fri, 2 Jun 2023 09:57:06 GMT Subject: RFR: 8296411: AArch64: Accelerated Poly1305 intrinsics [v5] In-Reply-To: References: Message-ID: On Thu, 1 Jun 2023 16:16:32 GMT, Andrew Haley wrote: >> This provides a solid speedup of about 3-4x over the Java implementation. >> >> I have a vectorized version of this which uses a bunch of tricks to speed it up, but it's complex and can still be improved. We're getting close to ramp down, so I'm submitting this simple intrinsic so that we can get it reviewed in time. >> >> Benchmarks:
>>
>> ThunderX (2, I think):
>>
>> Benchmark                        (dataSize)  (provider)   Mode  Cnt         Score         Error  Units
>> Poly1305DigestBench.updateBytes          64              thrpt    3  14078352.014 ± 4201407.966  ops/s
>> Poly1305DigestBench.updateBytes         256              thrpt    3   5154958.794 ± 1717146.980  ops/s
>> Poly1305DigestBench.updateBytes        1024              thrpt    3   1416563.273 ± 1311809.454  ops/s
>> Poly1305DigestBench.updateBytes       16384              thrpt    3     94059.570 ±    2913.021  ops/s
>> Poly1305DigestBench.updateBytes     1048576              thrpt    3      1441.024 ±     164.443  ops/s
>>
>> Benchmark                        (dataSize)  (provider)   Mode  Cnt         Score         Error  Units
>> Poly1305DigestBench.updateBytes          64              thrpt    3   4516486.795 ±  419624.224  ops/s
>> Poly1305DigestBench.updateBytes         256              thrpt    3   1228542.774 ±  202815.694  ops/s
>> Poly1305DigestBench.updateBytes        1024              thrpt    3    316051.912 ±   23066.449  ops/s
>> Poly1305DigestBench.updateBytes       16384              thrpt    3     20649.561 ±    1094.687  ops/s
>> Poly1305DigestBench.updateBytes     1048576              thrpt    3       310.564 ±      31.053  ops/s
>>
>> Apple M1:
>>
>> Benchmark                        (dataSize)  (provider)   Mode  Cnt         Score         Error  Units
>> Poly1305DigestBench.updateBytes          64              thrpt    3  33551968.946 ±  849843.905  ops/s
>> Poly1305DigestBench.updateBytes         256              thrpt    3   9911637.214 ±   63417.224  ops/s
>> Poly1305DigestBench.updateBytes        1024              thrpt    3   2604370.740 ±   29208.265  ops/s
>> Poly1305DigestBench.updateBytes       16384              thrpt    3    165183.633 ±    1975.998  ops/s
>> Poly1305DigestBench.updateBytes     1048576              thrpt    3      2587.132 ±      40.240  ops/s
>>
>> Benchmark                        (dataSize)  (provider)   Mode  Cnt         Score         Error  Units
>> Poly1305DigestBench.updateBytes          64              thrpt    3  12373649.589 ±  184757.721  ops/s
>> Poly1305DigestBench.upd...
> > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > Review comments This is ok to push now the comments have been corrected. ------------- Marked as reviewed by adinn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14085#pullrequestreview-1457031979 From adinn at openjdk.org Fri Jun 2 09:57:09 2023 From: adinn at openjdk.org (Andrew Dinn) Date: Fri, 2 Jun 2023 09:57:09 GMT Subject: RFR: 8296411: AArch64: Accelerated Poly1305 intrinsics [v4] In-Reply-To: References: Message-ID: On Thu, 1 Jun 2023 16:06:40 GMT, Andrew Haley wrote: >> src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp line 7135: >> >>> 7133: regs = (regs.remaining() + U_0HI + U_1HI).begin(); >>> 7134: >>> 7135: // U_2:U_1:U_0 += (U_1HI >> 2) >> >> This comment and the next one both need correcting. They mention U_0HI and U_1HI and, as the previous comment says, those registers are dead. >> >> What actually happens here is best summarized as >> >> // U_2:U_1:U_0 += (U2 >> 2) * 5 >> >> or, if we actually want to be clearer about the current encoding which does it in several steps >> >> // rscratch1 = (U2 >> 2) >> // U2 = U2[1:0] >> // U_2:U_1:U_0 += rscratch1 >> // U_2:U_1:U_0 += (rscratch1 << 2) >> >> i.e. any bits that are set from 130 upwards are masked off, treated as an integer in their own right, multiplied by 5 and the result added back in at the bottom to update the 130 bit result U2[1:0]:U1[63:0]:U0[63:0]. >> >> I'm not sure whether this provides an opportunity for you to optimize this by doing the multiply by five earlier i.e.
replace the code with this version >> >> // rscratch1 = (U2 >> 2) * 5 >> __ lsr(rscratch1, U_2, 2); >> __ add(rscratch1, rscratch1, scratch1, Assembler::LSL, 2); >> // U2 = U2[1:0] >> __ andr(U_2, U_2, (u8)3); >> // U2:U1:U0 += rscratch1 >> __ adds(U_0, U_0, rscratch1); >> __ adcs(U_1, U_1, zr); >> __ adc(U_2, U_2, zr); >> >> The obvious concern is that the multiply of rscratch1 by 5 might overflow 64 bits. Is that why you have implemented two add and carry steps? If so then why is it legitimate to do the multiply by 5 up front in the final reduction that follows the loop? > >> This comment and the next one both need correcting. They mention U_0HI and U_1HI and, as the previous comment says, those registers are dead. >> >> What actually happens here is best summarized as >> >> // U_2:U_1:U_0 += (U2 >> 2) * 5 >> >> or, if we actually want to be clearer about the current encoding which does it in several steps >> >> // rscratch1 = (U2 >> 2) // U2 = U2[1:0] // U_2:U_1:U_0 += rscratch1 // U_2:U_1:U_0 += (rscratch1 << 2) >> >> i.e. any bits that are set from 130 upwards are masked off, treated as an integer in their own right, multiplied by 5 and the result added back in at the bottom to update the 130 bit result U2[1:0]:U1[63:0]:U0[63:0]. > > OK. > >> I'm not sure whether this provides an opportunity for you to optimize this by doing the multiply by five earlier i.e. replace the code with this version > > I'm not sure either, which is why it's done in two separate steps. I think you may be right, but it's a bit late to be optimizing this version any further. That would require careful analysis and a redo of all the testing. > >> The obvious concern is that the multiply of rscratch1 by 5 might overflow 64 bits. Is that why you have implemented two add and carry steps? > > Indeed. > >> If so then why is it legitimate to do the multiply by 5 up front in the final reduction that follows the loop? 
> > I assume that you're referring to the multiply by 5 in > > > // Further reduce modulo 2^130 - 5 > __ lsr(rscratch1, U_2, 2); > __ add(rscratch1, rscratch1, rscratch1, Assembler::LSL, 2); // rscratch1 = U_2 * 5 > > > `U_2`, at this point, has only a few lower set bits. This is because `U_2` was previously ANDed with 3, and subsequently twice was the target of adc(U_2, U_2, zr). So I think that `U_2 <= 6`. Yes, of course, you are right that 0<= U_2 < 6 at the point where that second multiply by 5 occurs (i.e. after the loop). I believe it is safe to use the same optimization inside the loop for reasons given below. Of course it is a bit late to change this now and retest but if my reasoning is correct then we could consider updating this post release and, maybe, a backport. The only thing that needs to be determined is what value could sit in U2 when we enter the loop. That's the only important case because we already agreed that at the loop back edge that 0 <= U2 < 6. The incoming value for U2 at loop entry is derived by the following subsequence of the instruction stream __ adcs(S_1, U_1, S_1); __ adc(S_2, U_2, zr); // A.1 __ add(S_2, S_2, 1); // A.2 . . . wide_mul(U_1, U_1HI, S_0, R_1); wide_madd(U_1, U_1HI, S_1, R_0); wide_madd(U_1, U_1HI, S_2, RR_1); // B . . . __ andr(U_2, R_0, 3); // C __ mul(U_2, S_2, U_2); // D . . . __ adc(U_2, U_1HI, U_2); // E At A.1 we know that 0 <= U_2 <= 3 (since it was initialized by unpack26) So, at A.2 we know that 0 <= S2 <= 5 At B we know that 0 <= RR_1 <= (2^60 - 2^2) = FFFFFFF_FFFFFFFC (top 4 and bottom 2 bits of RR_1 are clear) So 0 <= U1_HI < 5 * FFFFFFF_FFFFFFFC = 4FFFFFFF_FFFFFFEC At C we know 0 <= U_2 <= 3 At D we know 0 <= U_2 <= 15 So at E we know that 0 <= U_2 <= 4FFFFFFF_FFFFFFEC + 15 + 1 So, the highest possible value for U_2 at loop entry is 50000000_00000002. 
Clearly we can shift this down by two and add without any danger of overflowing 50000000_00000002 >> 2 + 50000000_00000002 = 64000000_00000002 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14085#discussion_r1214167215 From adinn at openjdk.org Fri Jun 2 10:02:09 2023 From: adinn at openjdk.org (Andrew Dinn) Date: Fri, 2 Jun 2023 10:02:09 GMT Subject: RFR: 8296411: AArch64: Accelerated Poly1305 intrinsics [v4] In-Reply-To: References: Message-ID: On Fri, 2 Jun 2023 09:51:57 GMT, Andrew Dinn wrote: >>> This comment and the next one both need correcting. They mention U_0HI and U_1HI and, as the previous comment says, those registers are dead. >>> >>> What actually happens here is best summarized as >>> >>> // U_2:U_1:U_0 += (U2 >> 2) * 5 >>> >>> or, if we actually want to be clearer about the current encoding which does it in several steps >>> >>> // rscratch1 = (U2 >> 2) // U2 = U2[1:0] // U_2:U_1:U_0 += rscratch1 // U_2:U_1:U_0 += (rscratch1 << 2) >>> >>> i.e. any bits that are set from 130 upwards are masked off, treated as an integer in their own right, multiplied by 5 and the result added back in at the bottom to update the 130 bit result U2[1:0]:U1[63:0]:U0[63:0]. >> >> OK. >> >>> I'm not sure whether this provides an opportunity for you to optimize this by doing the multiply by five earlier i.e. replace the code with this version >> >> I'm not sure either, which is why it's done in two separate steps. I think you may be right, but it's a bit late to be optimizing this version any further. That would require careful analysis and a redo of all the testing. >> >>> The obvious concern is that the multiply of rscratch1 by 5 might overflow 64 bits. Is that why you have implemented two add and carry steps? >> >> Indeed. >> >>> If so then why is it legitimate to do the multiply by 5 up front in the final reduction that follows the loop? 
>> >> I assume that you're referring to the multiply by 5 in >> >> >> // Further reduce modulo 2^130 - 5 >> __ lsr(rscratch1, U_2, 2); >> __ add(rscratch1, rscratch1, rscratch1, Assembler::LSL, 2); // rscratch1 = U_2 * 5 >> >> >> `U_2`, at this point, has only a few lower set bits. This is because `U_2` was previously ANDed with 3, and subsequently twice was the target of adc(U_2, U_2, zr). So I think that `U_2 <= 6`. > > Yes, of course, you are right that 0<= U_2 < 6 at the point where that second multiply by 5 occurs (i.e. after the loop). > > I believe it is safe to use the same optimization inside the loop for reasons given below. Of course it is a bit late to change this now and retest but if my reasoning is correct then we could consider updating this post release and, maybe, a backport. > > The only thing that needs to be determined is what value could sit in U2 when we enter the loop. That's the only important case because we already agreed that at the loop back edge that 0 <= U2 < 6. > > The incoming value for U2 at loop entry is derived by the following subsequence of the instruction stream > > __ adcs(S_1, U_1, S_1); > __ adc(S_2, U_2, zr); // A.1 > __ add(S_2, S_2, 1); // A.2 > . . . > wide_mul(U_1, U_1HI, S_0, R_1); wide_madd(U_1, U_1HI, S_1, R_0); wide_madd(U_1, U_1HI, S_2, RR_1); // B > . . . > __ andr(U_2, R_0, 3); // C > __ mul(U_2, S_2, U_2); // D > . . . > > __ adc(U_2, U_1HI, U_2); // E > > At A.1 we know that 0 <= U_2 <= 3 (since it was initialized by unpack26) > So, at A.2 we know that 0 <= S2 <= 5 > > At B we know that 0 <= RR_1 <= (2^60 - 2^2) = FFFFFFF_FFFFFFFC (top 4 and bottom 2 bits of RR_1 are clear) > So 0 <= U1_HI < 5 * FFFFFFF_FFFFFFFC = 4FFFFFFF_FFFFFFEC > > At C we know 0 <= U_2 <= 3 > > At D we know 0 <= U_2 <= 15 > > So at E we know that 0 <= U_2 <= 4FFFFFFF_FFFFFFEC + 15 + 1 > > So, the highest possible value for U_2 at loop entry is 50000000_00000002. 
> > Clearly we can shift this down by two and add without any danger of overflowing > > 50000000_00000002 >> 2 + 50000000_00000002 = 64000000_00000002 Ah, no scratch that. I have made a wrong assumption at B. The value of U1_HI is bounded by the sum of the 3 multiplies. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14085#discussion_r1214174121 From bulasevich at openjdk.org Fri Jun 2 10:09:41 2023 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Fri, 2 Jun 2023 10:09:41 GMT Subject: RFR: 8305959: x86: Improve itable_stub [v5] In-Reply-To: References: Message-ID: > Async profiler shows that applications spend up to 10% in itable_stubs. > > The current inefficiency of itable stubs is as follows. The generated itable_stub scans itable twice: first it checks if the object class is a subtype of the resolved_class, and then it finds the holder_class that implements the method. I suggest doing this in one pass: with a first loop over itable, check pointer equality to both holder_class and resolved_class. Once we have finished searching for resolved_class, continue searching for holder_class in a separate loop if it has not yet been found. > > This approach gives 1-10% improvement on the synthetic benchmarks and 3% improvement on Naive Bayes benchmark from the Renaissance Benchmark Suite (Intel Xeon X5675). 
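The one-pass idea described above can be sketched in plain C++. This is only an illustration under assumptions: `ItableEntry`, `lookup_one_pass` and the null-terminated table layout are hypothetical stand-ins, not the HotSpot data structures or the generated stub code.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for an itable offset entry; the real
// HotSpot layout and the generated stub differ.
struct ItableEntry {
  const void* interface_klass;  // nullptr terminates the table
  int method_table_offset;      // assumed non-negative in this sketch
};

// Returns the holder's offset, or -1 if the resolved interface or the holder
// is not present in the table.
int lookup_one_pass(const ItableEntry* table,
                    const void* resolved_klass,
                    const void* holder_klass) {
  assert(table != nullptr);
  int holder_offset = -1;
  bool resolved_found = false;
  const ItableEntry* e = table;

  // First loop: compare each entry against both classes at once and stop as
  // soon as the resolved class has been seen.
  for (; e->interface_klass != nullptr; ++e) {
    if (e->interface_klass == holder_klass) {
      holder_offset = e->method_table_offset;
    }
    if (e->interface_klass == resolved_klass) {
      resolved_found = true;
      ++e;  // continue after this entry in the second loop
      break;
    }
  }
  if (!resolved_found) {
    return -1;  // the receiver does not implement the resolved interface
  }

  // Second loop: only needed when the holder was not met before the resolved
  // class; it scans the remainder of the table.
  for (; holder_offset < 0 && e->interface_klass != nullptr; ++e) {
    if (e->interface_klass == holder_klass) {
      holder_offset = e->method_table_offset;
    }
  }
  return holder_offset;
}
```

In the common case where the holder precedes or equals the resolved interface, the table is walked once and the second loop does not execute.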
Boris Ulasevich has updated the pull request incrementally with two additional commits since the last revision: - update benchmark, restore comments - merge issue fix: in_bytes, 8308396 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13460/files - new: https://git.openjdk.org/jdk/pull/13460/files/b06688cb..8f36e437 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13460&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13460&range=03-04 Stats: 31 lines in 4 files changed: 7 ins; 3 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/13460.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13460/head:pull/13460 PR: https://git.openjdk.org/jdk/pull/13460 From adinn at openjdk.org Fri Jun 2 11:05:06 2023 From: adinn at openjdk.org (Andrew Dinn) Date: Fri, 2 Jun 2023 11:05:06 GMT Subject: RFR: 8296411: AArch64: Accelerated Poly1305 intrinsics [v4] In-Reply-To: References: Message-ID: On Fri, 2 Jun 2023 09:58:59 GMT, Andrew Dinn wrote: >> Yes, of course, you are right that 0<= U_2 < 6 at the point where that second multiply by 5 occurs (i.e. after the loop). >> >> I believe it is safe to use the same optimization inside the loop for reasons given below. Of course it is a bit late to change this now and retest but if my reasoning is correct then we could consider updating this post release and, maybe, a backport. >> >> The only thing that needs to be determined is what value could sit in U2 when we enter the loop. That's the only important case because we already agreed that at the loop back edge that 0 <= U2 < 6. >> >> The incoming value for U2 at loop entry is derived by the following subsequence of the instruction stream >> >> __ adcs(S_1, U_1, S_1); >> __ adc(S_2, U_2, zr); // A.1 >> __ add(S_2, S_2, 1); // A.2 >> . . . >> wide_mul(U_1, U_1HI, S_0, R_1); wide_madd(U_1, U_1HI, S_1, R_0); wide_madd(U_1, U_1HI, S_2, RR_1); // B >> . . . >> __ andr(U_2, R_0, 3); // C >> __ mul(U_2, S_2, U_2); // D >> . . . 
>> >> __ adc(U_2, U_1HI, U_2); // E >> >> At A.1 we know that 0 <= U_2 <= 3 (since it was initialized by unpack26) >> So, at A.2 we know that 0 <= S2 <= 5 >> >> At B we know that 0 <= RR_1 <= (2^60 - 2^2) = FFFFFFF_FFFFFFFC (top 4 and bottom 2 bits of RR_1 are clear) >> So 0 <= U1_HI < 5 * FFFFFFF_FFFFFFFC = 4FFFFFFF_FFFFFFEC >> >> At C we know 0 <= U_2 <= 3 >> >> At D we know 0 <= U_2 <= 15 >> >> So at E we know that 0 <= U_2 <= 4FFFFFFF_FFFFFFEC + 15 + 1 >> >> So, the highest possible value for U_2 at loop entry is 50000000_00000002. >> >> Clearly we can shift this down by two and add without any danger of overflowing >> >> 50000000_00000002 >> 2 + 50000000_00000002 = 64000000_00000002 > > Ah, no scratch that. I have made a wrong assumption at B. The value of U1_HI is bounded by the sum of the 3 64 bit * 64 bit multiplies. I think there is still a proof of validity to be salvaged though. We compute a 128-bit sum of products: U_1HI:U_1 = S_0 * R_1 + S_1 * R_0 + S_2 * RR_1 We know that R_0 and R_1 have their four top bits clear and S_2 <= 5. So, I think we can guarantee that the top word of the 128-bit sum is small enough not to overflow when we do the shift and add. Even if we assume S_0, S_1 and RR_1 have the maximum possible value we have S_0 * R_1 <= (2^64 - 1) * (2^60 - 1) S_1 * R_0 <= (2^64 - 1) * (2^60 - 1) S_2 * RR_1 <= 5 * (2^64 - 1) So, the top word U_1HI is at most (2 * (2^60 - 1)) + 5. That leaves more than enough room to guarantee that the shift and add will not overflow.
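The bound above can be checked mechanically. The following is a minimal sketch, assuming the operand limits stated in this thread (R_0 and R_1 below 2^60, S_2 <= 5, everything else possibly all ones, and at most 15 + 1 added at step E); the helper names are illustrative only and `unsigned __int128` is a GCC/Clang extension.

```cpp
#include <cassert>
#include <cstdint>

using u128 = unsigned __int128;  // GCC/Clang extension, assumed available

constexpr uint64_t kAllOnes = ~uint64_t{0};             // 2^64 - 1
constexpr uint64_t kMaxR    = (uint64_t{1} << 60) - 1;  // top four bits clear

// High 64 bits of S_0*R_1 + S_1*R_0 + S_2*RR_1 with every operand at its
// assumed maximum (S_2 = 5, the others all ones or kMaxR).
constexpr uint64_t worst_case_top_word() {
  return (uint64_t)(((u128)kAllOnes * kMaxR +
                     (u128)kAllOnes * kMaxR +
                     (u128)5 * kAllOnes) >> 64);
}

// Worst-case U_2 after step E: the top word, plus the value from step D
// (<= 15), plus one carry.
constexpr uint64_t worst_case_u2() {
  return worst_case_top_word() + 15 + 1;
}

// x * 5 computed as x + (x << 2), mirroring the add with an LSL #2 operand.
// The assert documents the precondition for staying within 64 bits.
inline uint64_t times5_shift_add(uint64_t x) {
  assert(x <= kAllOnes / 5);
  return x + (x << 2);
}
```

With these definitions, `worst_case_top_word() <= 2 * kMaxR + 5` holds, and `times5_shift_add(worst_case_u2() >> 2)` stays within its precondition, which is the property the argument relies on.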
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14085#discussion_r1214230004 From duke at openjdk.org Fri Jun 2 11:35:14 2023 From: duke at openjdk.org (JoKern65) Date: Fri, 2 Jun 2023 11:35:14 GMT Subject: RFR: JDK-8308288: Fix xlc17 clang warnings and build errors in hotspot Message-ID: <_mG48I6TlpqcdrS5N6DOIpRPpw6ZTrwUgMWzPzDjZ4o=.1eeaffd1-54ed-4344-b66e-a4a4a0583c4d@github.com> This pr is a split off from JDK-8308288 : Fix xlc17 clang warnings in shared code https://github.com/openjdk/jdk/pull/14146 It handles the part in hotspot. It handles the error introduced by a redefine of malloc in stdlib.h resulting in the following build error: /data/d042520/pr/jdk/src/hotspot/share/runtime/os.cpp:616:5: error: no member named '_vec_malloc' in 'LogTag'; did you mean 'vec_malloc'? log_warning(malloc, free)("ptr caught: " PTR_FORMAT, p2i(ptr)); ^~~~~~~~~~~~~~~~~~~~~~~~~ /data/d042520/pr/jdk/src/hotspot/share/logging/log.hpp:46:28: note: expanded from macro 'log_warning' #define log_warning(...) (!log_is_enabled(Warning, __VA_ARGS__)) ? (void)0 : LogImpl::write ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /data/d042520/pr/jdk/src/hotspot/share/logging/log.hpp:68:45: note: expanded from macro 'log_is_enabled' #define log_is_enabled(level, ...) (LogImpl::is_level(LogLevel::level)) ^~~~~~~~~~~~~~~~~~~~~ /data/d042520/pr/jdk/src/hotspot/share/logging/logTag.hpp:221:38: note: expanded from macro 'LOG_TAGS' #define LOG_TAGS(...) EXPAND_VARARGS(LOG_TAGS_EXPANDED(__VA_ARGS__, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ /data/d042520/pr/jdk/src/hotspot/share/logging/logTag.hpp:217:57: note: expanded from macro 'LOG_TAGS_EXPANDED' #define LOG_TAGS_EXPANDED(T0, T1, T2, T3, T4, T5, ...) PREFIX_LOG_TAG(T0), PREFIX_LOG_TAG(T1), PREFIX_LOG_TAG(T2), \ ^~~~~~~~~~~~~~~~~~ ... 
(rest of output omitted) Additionally it solves the need for an #include on AIX for any usage of the alloca function, by adding the include to globalDefinitions_xlc.hpp ------------- Commit messages: - JDK-8308288 Changes: https://git.openjdk.org/jdk/pull/14283/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14283&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8308288 Stats: 13 lines in 2 files changed: 10 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14283.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14283/head:pull/14283 PR: https://git.openjdk.org/jdk/pull/14283 From stefank at openjdk.org Fri Jun 2 12:10:19 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 2 Jun 2023 12:10:19 GMT Subject: RFR: 8309210: Extend VM Operations hs_err logging In-Reply-To: References: Message-ID: On Wed, 31 May 2023 14:35:03 GMT, Stefan Karlsson wrote: > We have a section in the hs_err file, which prints the most recently run VM operations. Sometimes a VM operation type is used from multiple places in our code and it's not obvious why the VM operation was run. For example, HandshakeAllThreads doesn't tell us why we are running the handshake. I propose that we add an option for the VM operations to tell more about why they are used. > > The proposed patch enhances the Handshake VM operation and the ZGC pause VM Operations. Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14248#issuecomment-1573629924 From stefank at openjdk.org Fri Jun 2 12:10:20 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 2 Jun 2023 12:10:20 GMT Subject: Integrated: 8309210: Extend VM Operations hs_err logging In-Reply-To: References: Message-ID: On Wed, 31 May 2023 14:35:03 GMT, Stefan Karlsson wrote: > We have a section in the hs_err file, which prints the most recently run VM operations. Sometimes a VM operation type is used from multiple places in our code and it's not obvious why the VM operation was run. 
For example, HandshakeAllThreads doesn't tell us why we are running the handshake. I propose that we add an option for the VM operations to tell more about why they are used. > > The proposed patch enhances the Handshake VM operation and the ZGC pause VM Operations. This pull request has now been integrated. Changeset: e8268d91 Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/e8268d916340e0ab2fe78a67c73b6b26713c0109 Stats: 49 lines in 4 files changed: 42 ins; 0 del; 7 mod 8309210: Extend VM Operations hs_err logging Reviewed-by: dholmes, stuefe, eosterlund, sjohanss ------------- PR: https://git.openjdk.org/jdk/pull/14248 From duke at openjdk.org Fri Jun 2 13:21:24 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Fri, 2 Jun 2023 13:21:24 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v2] In-Reply-To: References: <_wIgfWGtjNLtTm6s9_WsLWWEbLlBsW-xEr1CwBBoM0M=.3ce944c1-23ed-4376-8896-e3a437de17b0@github.com> Message-ID: On Fri, 2 Jun 2023 07:27:17 GMT, Thomas Schatzl wrote: > I think Thomas's point is, the requested range should be passed to the alloc_archive_regions() API, even though the collector may simply ignore it. Ok. I have initially planned to do that after completing this work. But lets add it in this patch. > BTW, perhaps alloc_archive_regions() should be renamed to alloc_archive_range() going forward. That's a typo. It should be alloc_archive_space(). I hope that is acceptable. Feel free to suggest changes to the APIs if it doesn't sound correct. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14208#issuecomment-1573719653 From duke at openjdk.org Fri Jun 2 13:21:25 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Fri, 2 Jun 2023 13:21:25 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v2] In-Reply-To: References: <_wIgfWGtjNLtTm6s9_WsLWWEbLlBsW-xEr1CwBBoM0M=.3ce944c1-23ed-4376-8896-e3a437de17b0@github.com> Message-ID: On Fri, 2 Jun 2023 13:16:04 GMT, Ashutosh Mehra wrote: > That's a typo. It should be alloc_archive_space(). Oh, I realized it's not a typo but the change to alloc_archive_space() would happen in the next patch. So I didn't bother changing the name of the existing G1 API. I will update it to alloc_archive_region() as suggested. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14208#issuecomment-1573722188 From stefank at openjdk.org Fri Jun 2 13:52:33 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 2 Jun 2023 13:52:33 GMT Subject: RFR: 8307374: Add a JFR event for tracking RSS Message-ID: Add a JFR event to track the resident set size (RSS) of the running process. This is a good complement to the new Native Memory Tracking events that were added for JDK 20 ([JDK-8157023](https://bugs.openjdk.org/browse/JDK-8157023)). You can use the JDK Mission Control tool to extract this data.
Or, you can use the new [JFR Views](https://egahlin.github.io/2023/05/30/views.html) tool to get a textual representation of the values: # Create a JFR recording $ jdk/bin/java -XX:StartFlightRecording=dumponexit=true JavaApp # Extract the data from that recording $ jdk/bin/jfr view ResidentSetSize hotspot-pid-204767-id-1-2023_06_02_11_56_19.jfr Resident Set Size Time Resident Set Size Resident Set Size Peak Value ---------------- ------------------------- ------------------------------------ 11:56:07 1.1 GB 1.2 GB 11:56:08 333.7 MB 1.2 GB 11:56:09 432.4 MB 1.2 GB 11:56:10 695.9 MB 1.2 GB 11:56:11 1.0 GB 1.2 GB 11:56:12 1.3 GB 1.3 GB 11:56:13 1.3 GB 1.3 GB 11:56:14 1.3 GB 1.3 GB 11:56:15 1.3 GB 1.3 GB 11:56:16 1.3 GB 1.3 GB 11:56:17 1.4 GB 1.4 GB 11:56:18 1.8 GB 1.8 GB 11:56:19 2.0 GB 2.0 GB The event has been implemented for Linux, MacOS, and Windows. The name ResidentSetSize isn't a perfect fit for the values extracted from MacOS and Windows, but I think it is better to name this after something that many people are familiar with instead of trying to find a generic name that fits all platforms. Do you agree with that, or should we change it to something else? I've manually sanity checked that we get reasonable values on all OS:es. I've also added a jtreg test. 
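For readers unfamiliar with where such a value can come from on Linux, here is a minimal sketch that derives resident bytes from a line in `/proc/<pid>/statm` format (fields in pages: size, resident, shared, ...). It is an assumption-laden illustration, not necessarily how this patch obtains the value; `rss_bytes_from_statm` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Parses one line in /proc/<pid>/statm format and returns the resident set
// size in bytes, or -1 if the first two fields cannot be read.
// page_size is supplied by the caller (e.g. from sysconf(_SC_PAGESIZE)).
int64_t rss_bytes_from_statm(const std::string& statm_line, int64_t page_size) {
  assert(page_size > 0);
  std::istringstream in(statm_line);
  int64_t total_pages = 0;
  int64_t resident_pages = 0;
  if (!(in >> total_pages >> resident_pages)) {
    return -1;  // malformed input
  }
  return resident_pages * page_size;
}
```

A caller on Linux would read the single line of `/proc/self/statm` and pass it in together with the system page size; the MacOS and Windows values mentioned above come from different OS interfaces and are not covered by this sketch.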
------------- Commit messages: - Update comments in test - Fix ifdef INCLUDE_JFR - Expose RSS in JFR recordings (windows) and tweaks - Expose RSS in JFR recordings (macos) - Expose RSS in JFR recordings (linux) Changes: https://git.openjdk.org/jdk/pull/14285/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14285&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307374 Stats: 214 lines in 11 files changed: 214 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14285.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14285/head:pull/14285 PR: https://git.openjdk.org/jdk/pull/14285 From duke at openjdk.org Fri Jun 2 13:54:08 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Fri, 2 Jun 2023 13:54:08 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v2] In-Reply-To: References: <_wIgfWGtjNLtTm6s9_WsLWWEbLlBsW-xEr1CwBBoM0M=.3ce944c1-23ed-4376-8896-e3a437de17b0@github.com> Message-ID: On Fri, 2 Jun 2023 13:16:04 GMT, Ashutosh Mehra wrote: > I think Thomas's point is, the requested range should be passed to the alloc_archive_regions() API, even though the collector may simply ignore it. @tschatzl @iklam Other collectors may ignore the requested addr, but G1 can just allocate the regions at the requested addr, right? Do we also need the fallback to allocating at the top of heap if for some reason allocation at requested addr fails? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14208#issuecomment-1573773258 From aph at openjdk.org Fri Jun 2 14:32:18 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 2 Jun 2023 14:32:18 GMT Subject: Integrated: 8296411: AArch64: Accelerated Poly1305 intrinsics In-Reply-To: References: Message-ID: On Mon, 22 May 2023 14:23:15 GMT, Andrew Haley wrote: > This provides a solid speedup of about 3-4x over the Java implementation. > > I have a vectorized version of this which uses a bunch of tricks to speed it up, but it's complex and can still be improved. 
We're getting close to ramp down, so I'm submitting this simple intrinsic so that we can get it reviewed in time. > > Benchmarks: > > > ThunderX (2, I think): > > Benchmark (dataSize) (provider) Mode Cnt Score Error Units > Poly1305DigestBench.updateBytes 64 thrpt 3 14078352.014 ± 4201407.966 ops/s > Poly1305DigestBench.updateBytes 256 thrpt 3 5154958.794 ± 1717146.980 ops/s > Poly1305DigestBench.updateBytes 1024 thrpt 3 1416563.273 ± 1311809.454 ops/s > Poly1305DigestBench.updateBytes 16384 thrpt 3 94059.570 ± 2913.021 ops/s > Poly1305DigestBench.updateBytes 1048576 thrpt 3 1441.024 ± 164.443 ops/s > > Benchmark (dataSize) (provider) Mode Cnt Score Error Units > Poly1305DigestBench.updateBytes 64 thrpt 3 4516486.795 ± 419624.224 ops/s > Poly1305DigestBench.updateBytes 256 thrpt 3 1228542.774 ± 202815.694 ops/s > Poly1305DigestBench.updateBytes 1024 thrpt 3 316051.912 ± 23066.449 ops/s > Poly1305DigestBench.updateBytes 16384 thrpt 3 20649.561 ± 1094.687 ops/s > Poly1305DigestBench.updateBytes 1048576 thrpt 3 310.564 ± 31.053 ops/s > > Apple M1: > > Benchmark (dataSize) (provider) Mode Cnt Score Error Units > Poly1305DigestBench.updateBytes 64 thrpt 3 33551968.946 ± 849843.905 ops/s > Poly1305DigestBench.updateBytes 256 thrpt 3 9911637.214 ± 63417.224 ops/s > Poly1305DigestBench.updateBytes 1024 thrpt 3 2604370.740 ± 29208.265 ops/s > Poly1305DigestBench.updateBytes 16384 thrpt 3 165183.633 ± 1975.998 ops/s > Poly1305DigestBench.updateBytes 1048576 thrpt 3 2587.132 ± 40.240 ops/s > > Benchmark (dataSize) (provider) Mode Cnt Score Error Units > Poly1305DigestBench.updateBytes 64 thrpt 3 12373649.589 ± 184757.721 ops/s > Poly1305DigestBench.updateBytes 256 th... This pull request has now been integrated.
Changeset: dc21e8aa Author: Andrew Haley URL: https://git.openjdk.org/jdk/commit/dc21e8aa8321abb161bbbc02ca379eda27a4984c Stats: 195 lines in 4 files changed: 194 ins; 0 del; 1 mod 8296411: AArch64: Accelerated Poly1305 intrinsics Reviewed-by: redestad, adinn ------------- PR: https://git.openjdk.org/jdk/pull/14085 From rcastanedalo at openjdk.org Fri Jun 2 15:08:05 2023 From: rcastanedalo at openjdk.org (Roberto =?UTF-8?B?Q2FzdGHDsWVkYQ==?= Lozano) Date: Fri, 2 Jun 2023 15:08:05 GMT Subject: RFR: 8307374: Add a JFR event for tracking RSS In-Reply-To: References: Message-ID: <895T2Y56Ehfr8AK5tN6SerO9-BSTbk2TkCFHd0Kc9dQ=.a24de58e-fe91-44ff-a906-9f1af7d9144f@github.com> On Fri, 2 Jun 2023 13:41:18 GMT, Stefan Karlsson wrote: > Add A JFR event to track the resident set size (RSS) of the running process. This is a good complement to the new Native Memory Tracking events that were added for JDK 20 ([JDK-8157023](https://bugs.openjdk.org/browse/JDK-8157023)) > > You can use the JDK Mission Control tool to extract this data. Or, you can use the new [JFR Views](https://egahlin.github.io/2023/05/30/views.html) tool to get a textual representation of the values: > > > # Create a JFR recording > $ jdk/bin/java -XX:StartFlightRecording=dumponexit=true JavaApp > > # Extract the data from that recording > $ jdk/bin/jfr view ResidentSetSize hotspot-pid-204767-id-1-2023_06_02_11_56_19.jfr > > Resident Set Size > > Time Resident Set Size Resident Set Size Peak Value > ---------------- ------------------------- ------------------------------------ > 11:56:07 1.1 GB 1.2 GB > 11:56:08 333.7 MB 1.2 GB > 11:56:09 432.4 MB 1.2 GB > 11:56:10 695.9 MB 1.2 GB > 11:56:11 1.0 GB 1.2 GB > 11:56:12 1.3 GB 1.3 GB > 11:56:13 1.3 GB 1.3 GB > 11:56:14 1.3 GB 1.3 GB > 11:56:15 1.3 GB 1.3 GB > 11:56:16 1.3 GB 1.3 GB > 11:56:17 1.4 GB 1.4 GB > 11:56:18 1.8 GB 1.8 GB > 11:56:19 2.0 GB 2.0 GB > > > The event has been implemented for Linux, MacOS, and Windows. 
The name ResidentSetSize isn't a perfect fit for the values extracted from MacOS and Windows, but I think it is better to name this after something that many people are familiar with instead of trying to find a generic name that fits all platforms. Do you agree with that, or should we change it to something else? > > I've manually sanity checked that we get reasonable values on all OS:es. I've also added a jtreg test. Thanks for proposing this enhancement, Stefan! I tried it out (on Linux) and it worked just fine. > The event has been implemented for Linux, MacOS, and Windows. The name ResidentSetSize isn't a perfect fit for the values extracted from MacOS and Windows, but I think it is better to name this after something that many people are familiar with instead of trying to find a generic name that fits all platforms. Do you agree with that, or should we change it to something else? I agree with naming it ResidentSetSize, even some popular operating system textbooks such as "the dinosaur book" use this term as a general concept. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14285#issuecomment-1573885783 From dnsimon at openjdk.org Fri Jun 2 16:20:49 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Fri, 2 Jun 2023 16:20:49 GMT Subject: RFR: 8309136: [JVMCI] add -XX:+UseGraalJIT flag [v4] In-Reply-To: References: Message-ID: > Use of the Graal-based JIT in OpenJDK currently requires the following flag: `-XX:+EnableJVMCIProduct` > > This has no direct association with Graal. If the JDK image happens to include a non-Graal JVMCI implementation, it will be automatically selected. This would come as a surprise to users who equate JVMCI with Graal. > > This PR introduces a new flag, `-XX:+UseGraalJIT` to address these shortcomings. It is an alias for `-XX:+EnableJVMCIProduct -Djvmci.Compiler=graal`. 
> > When `-XX:+UseGraalJIT` is specified, the VM fails fast at startup if there is a non-Graal JVMCI implementation or no JVMCI implementation in the JDK image. Doug Simon has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: - [skip-ci] Merge remote-tracking branch 'openjdk-jdk/master' into JDK-8309136 - improve error message when UseGraalJIT is used without -XX:+UnlockExperimentalVMOptions - use strncmp instead of strcmp - fix date in copyright header - set UseGraalJIT value in enable_jvmci_product_mode - added missing test of UseJVMCICompiler when adjusting JVMCI flags under -Xint - review based fixes - add UseGraalJIT VM flag ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14231/files - new: https://git.openjdk.org/jdk/pull/14231/files/28f0a8b8..bcbab075 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14231&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14231&range=02-03 Stats: 12358 lines in 279 files changed: 10127 ins; 1114 del; 1117 mod Patch: https://git.openjdk.org/jdk/pull/14231.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14231/head:pull/14231 PR: https://git.openjdk.org/jdk/pull/14231 From shade at openjdk.org Fri Jun 2 16:34:25 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 2 Jun 2023 16:34:25 GMT Subject: RFR: 8297967: Make frame::safe_for_sender safer [v6] In-Reply-To: References: Message-ID: <4BjensS14MJdq2qsJ3ajbPeU-xCoJpWRAQ_Uu8kSdio=.ee48bb54-b406-486e-92fd-8f5cfa618c70@github.com> On Mon, 24 Apr 2023 09:52:05 GMT, Johannes Bechberger wrote: >> Makes `frame::safe_for_sender` safer by checking that the location of the return address, sender stack pointer, and link address is accessible. This makes the method safer in the case of broken frames. 
> > Johannes Bechberger has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: > > - Remove errorneously added check > - Remove check for value that might be null > - More SafeFetch > - Make frame::safe_for_sender safer with SafeFetch Some drive-by comments here. src/hotspot/cpu/aarch64/frame_aarch64.cpp line 139: > 137: if (SafeFetchN(this->fp() + return_addr_offset, 0) == 0 || > 138: SafeFetchN(this->fp() + interpreter_frame_sender_sp_offset, 0) == 0 || > 139: SafeFetchN(this->fp() + link_offset, 0) == 0 So these three are `sender_sp`, `sender_unextended_sp` and `saved_fp` a few lines below, right? So we can just SafeFetch-check those vars after we got them? This also highlights we want to check that those _pointers_ are safe to fetch from. I.e.: // for interpreted frames, the value below is the sender "raw" sp, // which can be different from the sender unextended sp (the sp seen // by the sender) because of current frame local variables sender_sp = (intptr_t*) addr_at(sender_sp_offset); sender_unextended_sp = (intptr_t*) this->fp()[interpreter_frame_sender_sp_offset]; saved_fp = (intptr_t*) this->fp()[link_offset]; if ((SafeFetchN(sender_sp, 0) == 0) || (SafeFetchN(sender_unextended_sp, 0) == 0) || (SafeFetchN(saved_fp, 0) == 0)) { return false; } src/hotspot/cpu/arm/frame_arm.cpp line 219: > 217: // Will the pc we fetch be non-zero (which we'll find at the oldest frame) > 218: > 219: if ((address) this->fp()[return_addr_offset] == NULL) return false; Accidental revert: "nullptr" -> "NULL". src/hotspot/cpu/riscv/frame_riscv.cpp line 260: > 258: // Will the pc we fetch be non-zero (which we'll find at the oldest frame) > 259: > 260: if ((address) this->fp()[return_addr_offset] == NULL) return false; Excess newline. Accidental revert: "nullptr" -> "NULL".
src/hotspot/cpu/x86/frame_x86.cpp line 265: > 263: // Will the pc we fetch be non-zero (which we'll find at the oldest frame) > 264: > 265: if ((address) this->fp()[return_addr_offset] == NULL) return false; Accidental revert: "nullptr" -> "NULL". ------------- PR Review: https://git.openjdk.org/jdk/pull/11461#pullrequestreview-1457965340 PR Review Comment: https://git.openjdk.org/jdk/pull/11461#discussion_r1214583007 PR Review Comment: https://git.openjdk.org/jdk/pull/11461#discussion_r1214584013 PR Review Comment: https://git.openjdk.org/jdk/pull/11461#discussion_r1214584272 PR Review Comment: https://git.openjdk.org/jdk/pull/11461#discussion_r1214584344 From jbechberger at openjdk.org Fri Jun 2 16:50:16 2023 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Fri, 2 Jun 2023 16:50:16 GMT Subject: RFR: 8297967: Make frame::safe_for_sender safer [v6] In-Reply-To: <4BjensS14MJdq2qsJ3ajbPeU-xCoJpWRAQ_Uu8kSdio=.ee48bb54-b406-486e-92fd-8f5cfa618c70@github.com> References: <4BjensS14MJdq2qsJ3ajbPeU-xCoJpWRAQ_Uu8kSdio=.ee48bb54-b406-486e-92fd-8f5cfa618c70@github.com> Message-ID: On Fri, 2 Jun 2023 16:30:17 GMT, Aleksey Shipilev wrote: >> Johannes Bechberger has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: >> >> - Remove errorneously added check >> - Remove check for value that might be null >> - More SafeFetch >> - Make frame::safe_for_sender safer with SafeFetch > > Some drive-by comments here. @shipilev do you think that bringing this PR is worthwhile? It only hardens the API when it is slightly misused (being passed in broken ucontexts). 
> src/hotspot/cpu/aarch64/frame_aarch64.cpp line 139: > >> 137: if (SafeFetchN(this->fp() + return_addr_offset, 0) == 0 || >> 138: SafeFetchN(this->fp() + interpreter_frame_sender_sp_offset, 0) == 0 || >> 139: SafeFetchN(this->fp() + link_offset, 0) == 0 > > So these three are `sender_sp`, `sender_unextended_sp` and `saved_fp` a few lines below, right? So we can just SafeFetch-check those vars after we got them? This also highlights we want to check that those _pointers_ are safe to fetch from. > > I.e.: > > > // for interpreted frames, the value below is the sender "raw" sp, > // which can be different from the sender unextended sp (the sp seen > // by the sender) because of current frame local variables > sender_sp = (intptr_t*) addr_at(sender_sp_offset); > sender_unextended_sp = (intptr_t*) this->fp()[interpreter_frame_sender_sp_offset]; > saved_fp = (intptr_t*) this->fp()[link_offset]; > > if ((SafeFetchN(sender_sp, 0) == 0) || > (SafeFetchN(sender_unextended_sp, 0) == 0) || > (SafeFetchN(saved_fp, 0) == 0)) { > return false; > } You're right, that looks much better. > src/hotspot/cpu/arm/frame_arm.cpp line 219: > >> 217: // Will the pc we fetch be non-zero (which we'll find at the oldest frame) >> 218: >> 219: if ((address) this->fp()[return_addr_offset] == NULL) return false; > > Accidental revert: "nullptr" -> "NULL".
I started the PR before the whole move, thanks for noticing this :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/11461#issuecomment-1574031359 PR Review Comment: https://git.openjdk.org/jdk/pull/11461#discussion_r1214598440 PR Review Comment: https://git.openjdk.org/jdk/pull/11461#discussion_r1214598756 From sgibbons at openjdk.org Fri Jun 2 17:20:48 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 2 Jun 2023 17:20:48 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v4] In-Reply-To: References: Message-ID: > Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem. > > Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). > > Old: > gcc-12.2.1-4.fc36.x86_64 > 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix > JVM version: 21-internal > Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 > Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 > Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 > Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 > Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 > Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 > Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 > Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 > Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 > Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 > New: > JVM version: 21-internal (float) > Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42 > 
Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 > Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17 > Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 > Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Code cleanup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14224/files - new: https://git.openjdk.org/jdk/pull/14224/files/e1131955..904d6d94 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=02-03 Stats: 427 lines in 1 file changed: 92 ins; 326 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/14224.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14224/head:pull/14224 PR: https://git.openjdk.org/jdk/pull/14224 From pchilanomate at openjdk.org Fri Jun 2 17:42:16 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 2 Jun 2023 17:42:16 GMT Subject: RFR: 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite [v2] In-Reply-To: References: Message-ID: > Please review the following fix. 
Runtime methods called through the SharedRuntime::generate_resolve_blob() stub always return the value stored in _from_compiled_entry as the entry point to the callee method. This will either be the entry point to the compiled version of callee if there is one or the c2i adapter entry point. But this doesn't consider the case where an EnterInterpOnlyModeClosure handshake catches the JavaThread in the transition back to Java on those methods. In that case we should return the c2i adapter entry point even if there is a compiled entry point. Otherwise the JavaThread will continue calling the compiled versions of methods without noticing it's in interpreted only mode until it either calls a method that hasn't been compiled yet or it returns to the caller of that resolved callee where the change to interpreter only mode happened (since the EnterInterpOnlyModeClosure handshake marked all the frames on the stack for deoptimization). > > This is a long standing bug but has been made visible with the assert added as part of 8288949 where a related issue was fixed. There are more details in the bug comments about how this specific crash happens and its relation with 8288949. I also attached a reproducer. > > These runtime methods are already using JRT_BLOCK_ENTRY/JRT_BLOCK so that the entry point to the callee is fetched only after the last possible safepoint in JRT_BLOCK_END. This guarantees that we will not return an entry point to compiled code that has already been removed. So the fix just adds a check to verify if the JavaThread entered into interpreted only mode in that transition back to Java and to return the c2i entry point instead. > > I tested the patch in mach5 tiers 1-6. I also verified it with the reproducer I attached to the bug. I didn't include it as an additional test but I can do that if desired. 
> > Thanks, > Patricio Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: refactor code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14108/files - new: https://git.openjdk.org/jdk/pull/14108/files/54dee960..e2209ac1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14108&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14108&range=00-01 Stats: 51 lines in 3 files changed: 7 ins; 20 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/14108.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14108/head:pull/14108 PR: https://git.openjdk.org/jdk/pull/14108 From pchilanomate at openjdk.org Fri Jun 2 17:42:34 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 2 Jun 2023 17:42:34 GMT Subject: RFR: 8302351: "assert(!JavaThread::current()->is_interp_only_mode() || !nm->method()->is_continuation_enter_intrinsic() || ContinuationEntry::is_interpreted_call(return_pc)) failed: interp_only_mode but not in enterSpecial interpreted entry" in fixup_callers_callsite In-Reply-To: References: Message-ID: <3VBIxsnt4YQkOjzXx0tAO1canyk5tMjdgTdNpl738qw=.c54916ae-2f98-4232-8fc0-21968b87c6b9@github.com> On Wed, 31 May 2023 07:49:13 GMT, Dean Long wrote: > This patch seems like a step in the right direction, so let's keep the assert and special-case _linkToNative. It is the only special case I know of. > I have updated the PR with the suggested refactoring. Now, I have stress tested the patch by simulating always switching to interpreter only mode when going through these resolve methods in SharedRuntime and I found there is also an issue with the other method handle intrinsics, besides _linkToNative. If we return the c2i entry when resolving any of these methods we will end up jumping to the code generated by generate_method_handle_interpreter_entry(), but that code has an implicit null check for the MH receiver in generate_method_handle_dispatch().
If that is indeed null, the signal handler will redirect us to generate_throw_exception() where we will call SharedRuntime::throw_NullPointerException_at_call() to create the exception. But the stack has already been changed in the c2i adapter to pass the arguments to the callee and also in that MH interpreter entry [1]. So when walking the stack we will crash when getting the sender of this compiled method that went through the c2i + MH interpreter entry. I found this issue with test compiler/jsr292/NullConstantReceiver.java. The bottom line issue is that we jumped to the interpreter target from the c2i entry but we never actually created an interpreter frame because we throw an exception before that, and since we modified the stack already we cannot walk it anymore. Today we never hit this issue because it seems the _from_compiled_entry for method handle intrinsics never points to the c2i, but always to the compiled entry generated by gen_special_dispatch() (there has been an interesting discussion about this in 8302320). One way to keep this behavior and still catch the switch to interpreted only mode when resolving these methods could be to still return the _from_compiled_entry for method handle intrinsics, but modify jump_from_method_handle() to check for interpreted only mode in both cases, not just when coming from interpreted code [2]. So if we came from compiled code and we are in interpreted only mode, jump to the c2i instead. I have tested that and it seems to work, but I'm open to suggestions on this. I'm going away on vacation for a few days though, so I'll resume work on this once I'm back. [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/methodHandles_x86.cpp#L311 [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/methodHandles_x86.cpp#L138 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14108#issuecomment-1574091628 From ysr at openjdk.org Fri Jun 2 17:59:26 2023 From: ysr at openjdk.org (Y.
Srinivas Ramakrishna) Date: Fri, 2 Jun 2023 17:59:26 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4] In-Reply-To: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> References: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> Message-ID: On Fri, 2 Jun 2023 02:49:25 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. 
These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Force PLAB sizes to align on card-table size src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1285: > 1283: if (unalignment != 0) { > 1284: word_size = word_size - unalignment + CardTable::card_size_in_words(); > 1285: } Probably not a big deal since this is only used when refilling a PLAB, which is an infrequent operation, but `mod` is an expensive operation, in general, and best to avoid in our code except in assertion checks (or even there given recent experiences with debug tests timing out). Since card size is a power of 2, maybe we could use addition and masking instead. Something like defining the following inline in the CardTable class and using it everywhere where card alignment granularity is sought. There may even be a macro or method defined for this already perhaps: `(FOO + CardSize - 1) & ~((1 << LogCardSize) - 1)` One could even store the mask to avoid the arithmetic to produce the mask although it's pretty cheap. This may turn out to be less expensive than mod, test, and branch, but as I said probably not a big deal here.
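For illustration, the add-and-mask rounding suggested above can be sketched next to the mod/test/branch form from the patch. This is a minimal standalone sketch, not HotSpot code: `kLogCardSizeInWords`, `kCardSizeInWords`, `align_up_to_card` and `align_up_to_card_mod` are hypothetical names, and the value of 64 words per card is an assumption chosen only to make the example concrete. It relies on the card size being a power of two.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the card-table constants (assumed values,
// not taken from HotSpot): 2^6 = 64 words per card.
constexpr std::size_t kLogCardSizeInWords = 6;
constexpr std::size_t kCardSizeInWords = std::size_t{1} << kLogCardSizeInWords;

// Round word_size up to the next multiple of the card size with add-and-mask.
// Correct only because kCardSizeInWords is a power of two.
constexpr std::size_t align_up_to_card(std::size_t word_size) {
  return (word_size + kCardSizeInWords - 1) & ~(kCardSizeInWords - 1);
}

// The mod/test/branch shape of the code under review, kept for comparison.
inline std::size_t align_up_to_card_mod(std::size_t word_size) {
  std::size_t unalignment = word_size % kCardSizeInWords;
  if (unalignment != 0) {
    word_size = word_size - unalignment + kCardSizeInWords;
  }
  return word_size;
}
```

Both functions should agree for every input that does not overflow; an existing alignment helper in the code base, if one fits, would be preferable to a new one.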
We should make sure we don't overuse mods in our allocation paths much. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1214658808 From mdoerr at openjdk.org Fri Jun 2 18:13:24 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 2 Jun 2023 18:13:24 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4] In-Reply-To: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> References: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> Message-ID: On Fri, 2 Jun 2023 02:49:25 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. 
In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Force PLAB sizes to align on card-table size > # Internal Error (shenandoahVerifier.cpp:1244), pid=2951116, tid=2951124 > # Error: Verify init-mark remembered set violation; clean card should be dirty I've seen the same issue on PPC64: https://bugs.openjdk.org/browse/JDK-8309371 ------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1574125489 From ysr at openjdk.org Fri Jun 2 18:27:21 2023 From: ysr at openjdk.org (Y.
Srinivas Ramakrishna) Date: Fri, 2 Jun 2023 18:27:21 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4] In-Reply-To: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> References: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> Message-ID: <7uARcGDHOuSUugc2zRg7JQgC2dSPBDOjeWGPjBPO2qs=.a09479b8-ba9b-4596-bc5a-7ace0968fe31@github.com> On Fri, 2 Jun 2023 02:49:25 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. 
Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Force PLAB sizes to align on card-table size src/hotspot/cpu/riscv/gc/shenandoah/c1/shenandoahBarrierSetC1_riscv.cpp line 4: > 2: * Copyright (c) 2018, 2019, Red Hat, Inc. All rights reserved. > 3: * Copyright (c) 2020, 2021, Huawei Technologies Co., Ltd. All rights reserved. > 4: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. This should be backed out, since it seems that there is no (other) change to this file. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1214682175 From sviswanathan at openjdk.org Fri Jun 2 18:34:13 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Fri, 2 Jun 2023 18:34:13 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v4] In-Reply-To: References: Message-ID: On Fri, 2 Jun 2023 17:20:48 GMT, Scott Gibbons wrote: >> Add an intrinsic for x86 AVX and AVX512 fmod.
This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem. >> >> Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). >> >> Old: >> gcc-12.2.1-4.fc36.x86_64 >> 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix >> JVM version: 21-internal >> Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 >> Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 >> Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 >> Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 >> Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 >> Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 >> Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 >> Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 >> Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 >> Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 >> New: >> JVM version: 21-internal (float) >> Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42 >> Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 >> Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17 >> Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 >> Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 >> Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 >> Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 8 regression case Took : 
17 noMod case took: 3 noPower case took: 16 >> Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Code cleanup src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 430: > 428: __ jcc(Assembler::aboveEqual, L_112a); > 429: // res = y + y; > 430: __ vaddsd(xmm0, xmm0, xmm1); Should this be: __ vaddsd(xmm0, xmm1, xmm1); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1214687840 From shade at openjdk.org Fri Jun 2 18:36:17 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 2 Jun 2023 18:36:17 GMT Subject: RFR: 8297967: Make frame::safe_for_sender safer [v6] In-Reply-To: <4BjensS14MJdq2qsJ3ajbPeU-xCoJpWRAQ_Uu8kSdio=.ee48bb54-b406-486e-92fd-8f5cfa618c70@github.com> References: <4BjensS14MJdq2qsJ3ajbPeU-xCoJpWRAQ_Uu8kSdio=.ee48bb54-b406-486e-92fd-8f5cfa618c70@github.com> Message-ID: On Fri, 2 Jun 2023 16:30:17 GMT, Aleksey Shipilev wrote: >> Johannes Bechberger has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: >> >> - Remove errorneously added check >> - Remove check for value that might be null >> - More SafeFetch >> - Make frame::safe_for_sender safer with SafeFetch > > Some drive-by comments here. > @shipilev do you think that bringing this PR is worthwhile? It only hardens the API when it is slightly misused (being passed in broken ucontexts). In my mind, if this has a non-zero probability of happening during the normal non-fuzzing execution, then we should be on the safe side and check. A profiler that crashes the app is never a good thing. I got here through the [bug report here](https://github.com/corretto/corretto-11/issues/328) -- so it might be a signal that problems like these happen IRL.
------------- PR Comment: https://git.openjdk.org/jdk/pull/11461#issuecomment-1574152626 From mdoerr at openjdk.org Fri Jun 2 18:41:25 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 2 Jun 2023 18:41:25 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4] In-Reply-To: <7uARcGDHOuSUugc2zRg7JQgC2dSPBDOjeWGPjBPO2qs=.a09479b8-ba9b-4596-bc5a-7ace0968fe31@github.com> References: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> <7uARcGDHOuSUugc2zRg7JQgC2dSPBDOjeWGPjBPO2qs=.a09479b8-ba9b-4596-bc5a-7ace0968fe31@github.com> Message-ID: <95yaqYTYoGnlkrDMbvZ-NTyVbGmHrL4DUYYIlV3wkwQ=.b6d787d2-3858-4486-b3dc-428a87969109@github.com> On Fri, 2 Jun 2023 18:24:16 GMT, Y. Srinivas Ramakrishna wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Force PLAB sizes to align on card-table size > > src/hotspot/cpu/riscv/gc/shenandoah/c1/shenandoahBarrierSetC1_riscv.cpp line 4: > >> 2: * Copyright (c) 2018, 2019, Red Hat, Inc. All rights reserved. >> 3: * Copyright (c) 2020, 2021, Huawei Technologies Co., Ltd. All rights reserved. >> 4: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > This should be backed out, since it seems that there is no (other) change to this file. Yes. And also from files which were changed by non-Amazon employees only, please.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1214693296 From sgibbons at openjdk.org Fri Jun 2 18:43:10 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 2 Jun 2023 18:43:10 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v4] In-Reply-To: References: Message-ID: On Fri, 2 Jun 2023 18:31:24 GMT, Sandhya Viswanathan wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Code cleanup > > src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 430: > >> 428: __ jcc(Assembler::aboveEqual, L_112a); >> 429: // res = y + y; >> 430: __ vaddsd(xmm0, xmm0, xmm1); > > Should this be: __ vaddsd(xmm0, xmm1, xmm1); This is correct as written according to the disassembly. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1214694297 From shade at openjdk.org Fri Jun 2 18:44:10 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Fri, 2 Jun 2023 18:44:10 GMT Subject: RFR: 8305959: x86: Improve itable_stub [v2] In-Reply-To: References: Message-ID: On Thu, 1 Jun 2023 11:47:58 GMT, Aleksey Shipilev wrote: >> Boris Ulasevich has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains three new commits since the last revision: >> >> - readability rework >> - cleanup >> - 8305959: x86: Improve itable_stub > > src/hotspot/cpu/x86/vtableStubs_x86_32.cpp line 203: > >> 201: >> 202: start_pc = __ pc(); >> 203: __ push(temp_reg); > > Why do we need to save this one? Do we care if this "tmp" is clobbered? This one is still not addressed, in case you missed it, @bulasevich. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13460#discussion_r1214695501 From iklam at openjdk.org Fri Jun 2 19:02:05 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 2 Jun 2023 19:02:05 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v2] In-Reply-To: References: <_wIgfWGtjNLtTm6s9_WsLWWEbLlBsW-xEr1CwBBoM0M=.3ce944c1-23ed-4376-8896-e3a437de17b0@github.com> Message-ID: On Fri, 2 Jun 2023 13:50:51 GMT, Ashutosh Mehra wrote: > > I think Thomas's point is, the requested range should be passed to the alloc_archive_regions() API, even though the collector may simply ignore it. > > @tschatzl @iklam Other collectors may ignore the requested addr, but G1 can just allocate the regions at the requested addr, right? Do we also need the fallback to allocating at the top of heap if for some reason allocation at requested addr fails? Yes, I think we need the fallback. However, I think we can keep the behavior exactly the same in this PR (only allocate at the heap end), and implement the more optimal allocation in a subsequent PR. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14208#issuecomment-1574179898 From sgibbons at openjdk.org Fri Jun 2 19:27:58 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 2 Jun 2023 19:27:58 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v5] In-Reply-To: References: Message-ID: > Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem. > > Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). 
> > Old: > gcc-12.2.1-4.fc36.x86_64 > 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix > JVM version: 21-internal > Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 > Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 > Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 > Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 > Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 > Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 > Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 > Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 > Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 > Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 > New: > JVM version: 21-internal (float) > Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42 > Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 > Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17 > Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 > Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Correct transliteration issue ------------- Changes: - all: 
https://git.openjdk.org/jdk/pull/14224/files - new: https://git.openjdk.org/jdk/pull/14224/files/904d6d94..26a821f9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=03-04 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14224.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14224/head:pull/14224 PR: https://git.openjdk.org/jdk/pull/14224 From sgibbons at openjdk.org Fri Jun 2 19:28:00 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 2 Jun 2023 19:28:00 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v4] In-Reply-To: References: Message-ID: <4LrL31I4nHlqJ1Ab3tWxYveKMx-eQJm_nyymf3Kz_f0=.b4f9ad46-1781-4cf2-afc2-00810b9aa630@github.com> On Fri, 2 Jun 2023 18:39:59 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 430: >> >>> 428: __ jcc(Assembler::aboveEqual, L_112a); >>> 429: // res = y + y; >>> 430: __ vaddsd(xmm0, xmm0, xmm1); >> >> Should this be: __ vaddsd(xmm0, xmm1, xmm1); > > This is correct as written according to the disassembly. After further investigation, you are correct. Changed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1214738035 From duke at openjdk.org Fri Jun 2 19:46:31 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Fri, 2 Jun 2023 19:46:31 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v3] In-Reply-To: References: Message-ID: > This patch is the first step towards having a single set of GC APIs for allocating heap space for the archived objects (See https://bugs.openjdk.org/browse/JDK-8296263). > It moves some of the G1 specific logic from CDS to G1 gc without changing the functionality. > > Changes that add/update GC APIs for handling archive heap would be introduced in upcoming patches. 
Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: Review comments - updates to alloc_archive_regions() api Signed-off-by: Ashutosh Mehra ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14208/files - new: https://git.openjdk.org/jdk/pull/14208/files/caccaa12..82ddb97c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14208&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14208&range=01-02 Stats: 6 lines in 3 files changed: 2 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/14208.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14208/head:pull/14208 PR: https://git.openjdk.org/jdk/pull/14208 From duke at openjdk.org Fri Jun 2 19:46:32 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Fri, 2 Jun 2023 19:46:32 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v2] In-Reply-To: References: Message-ID: On Wed, 31 May 2023 14:49:23 GMT, Ashutosh Mehra wrote: >> This patch is the first step towards having a single set of GC APIs for allocating heap space for the archived objects (See https://bugs.openjdk.org/browse/JDK-8296263). >> It moves some of the G1 specific logic from CDS to G1 gc without changing the functionality. >> >> Changes that add/update GC APIs for handling archive heap would be introduced in upcoming patches. > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Remove unused method FileMapInfo::heap_region_mapped_address > > Signed-off-by: Ashutosh Mehra Pushed a commit with suggested changes. 
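To make the contract discussed in this thread concrete, here is a minimal sketch in Java. It is a toy model, not HotSpot code. It assumes a heap split into fixed-size regions, treats the preferred address purely as a hint, and falls back to the highest free run of regions when the hint cannot be honored. The class `ArchiveRangeSketch`, the method `allocArchiveRange`, and the region size are hypothetical names and values chosen only for illustration.

```java
import java.util.BitSet;

/** Toy model of "size is mandatory, address is a hint"; not HotSpot code. */
public class ArchiveRangeSketch {
    /** Hypothetical region size, in words. */
    public static final long REGION_WORDS = 1024;

    private final int regionCount;
    private final BitSet used; // one bit per region, set when the region is taken

    public ArchiveRangeSketch(int regionCount) {
        this.regionCount = regionCount;
        this.used = new BitSet(regionCount);
    }

    /**
     * Returns the start address (in words) of the allocated range, or -1 if
     * no contiguous run of regions is large enough. The returned address may
     * or may not equal preferredAddr.
     */
    public long allocArchiveRange(long wordSize, long preferredAddr) {
        if (wordSize <= 0 || wordSize > regionCount * REGION_WORDS) {
            return -1;
        }
        int needed = (int) ((wordSize + REGION_WORDS - 1) / REGION_WORDS);

        // 1. Try the hint, but only if it is region-aligned and the run is free.
        if (preferredAddr >= 0 && preferredAddr % REGION_WORDS == 0) {
            long first = preferredAddr / REGION_WORDS;
            if (first + needed <= regionCount && isFree((int) first, needed)) {
                used.set((int) first, (int) first + needed);
                return preferredAddr;
            }
        }

        // 2. Fallback: scan downwards from the end of the heap.
        for (int first = regionCount - needed; first >= 0; first--) {
            if (isFree(first, needed)) {
                used.set(first, first + needed);
                return first * REGION_WORDS;
            }
        }
        return -1;
    }

    /** True if regions [first, first + count) are all unused. */
    private boolean isFree(int first, int count) {
        int next = used.nextSetBit(first);
        return next < 0 || next >= first + count;
    }

    public static void main(String[] args) {
        ArchiveRangeSketch heap = new ArchiveRangeSketch(8);
        System.out.println(heap.allocArchiveRange(2048, 2048)); // hint is free
        System.out.println(heap.allocArchiveRange(1024, 2048)); // hint is taken
    }
}
```

Keeping the size and the address as separate parameters in the sketch mirrors the argument made in the thread: the size must be satisfied, while the address may be ignored by the collector.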
------------- PR Comment: https://git.openjdk.org/jdk/pull/14208#issuecomment-1574225609 From iklam at openjdk.org Fri Jun 2 20:13:15 2023 From: iklam at openjdk.org (Ioi Lam) Date: Fri, 2 Jun 2023 20:13:15 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v3] In-Reply-To: References: Message-ID: <0_uC7OzsWkpKayYwBA4Q3h1ZJuyD3GNukoB8B_6jSdE=.a549624f-75b7-4c5f-abc2-ccdf4954d013@github.com> On Fri, 2 Jun 2023 19:46:31 GMT, Ashutosh Mehra wrote: >> This patch is the first step towards having a single set of GC APIs for allocating heap space for the archived objects (See https://bugs.openjdk.org/browse/JDK-8296263). >> It moves some of the G1 specific logic from CDS to G1 gc without changing the functionality. >> >> Changes that add/update GC APIs for handling archive heap would be introduced in upcoming patches. > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Review comments - updates to alloc_archive_regions() api > > Signed-off-by: Ashutosh Mehra src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 712: > 710: // the location of the archive space in the heap. The returned address may or may > 711: // not be same as the preferred address. > 712: HeapWord* alloc_archive_region(size_t word_size, HeapWord* preferred_addr); Sorry to be picky, but I think `region` implies a single region, but G1 could allocate one or more regions to satisfy the request. I think it's better to use `allocate_archive_range(MemRegion requested_range)` to be more neutral. Passing the range in a `MemRegion` will also look similar to the API right above this one. Maybe we should also change `populate_archive_regions_bot_part` and `dealloc_archive_regions` to use `_range` as well. What do you think, @tschatzl ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1214790412 From ysr at openjdk.org Fri Jun 2 20:15:29 2023 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Fri, 2 Jun 2023 20:15:29 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4] In-Reply-To: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> References: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> Message-ID: <9O5Q9i_7sOIZLqYlRGDMhGOkhCS2aZ2_gFObV9LbjdY=.310b660d-a48e-4ba8-b61a-126ef483f5bc@github.com> On Fri, 2 Jun 2023 02:49:25 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. 
Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Force PLAB sizes to align on card-table size Thanks for issue reports, for which we are filing JBS tickets as we receive reports, and will be resolving them as we go. The issues reported to us have been filed as issues linked to the JBS ticket for this PR. Reviewers and testers should please feel free to file tickets as they find other issues, given testing notes in the PR, and platforms on which we have inadequate or no testing or coverage. We appreciate the help! We will be incrementally resolving as many of the reported issues as we are able to, including in follow up PRs as bug fixes as appropriate. There were some issues reported with copyright headers, and those will be fixed soon in this PR. Please follow the tickets for further updates. Many thanks to all the reviewers so far.
I have been a reviewer at the project level for several of the commits that comprise this PR, so I am happy to review and approve this pull request as a reviewer, as well as a project participant and partial implementor for some fixes. Onwards to the brave new generation of Shenandoah! ;-) ------------- Marked as reviewed by ysr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14185#pullrequestreview-1458425985 From egahlin at openjdk.org Fri Jun 2 20:42:05 2023 From: egahlin at openjdk.org (Erik Gahlin) Date: Fri, 2 Jun 2023 20:42:05 GMT Subject: RFR: 8307374: Add a JFR event for tracking RSS In-Reply-To: <895T2Y56Ehfr8AK5tN6SerO9-BSTbk2TkCFHd0Kc9dQ=.a24de58e-fe91-44ff-a906-9f1af7d9144f@github.com> References: <895T2Y56Ehfr8AK5tN6SerO9-BSTbk2TkCFHd0Kc9dQ=.a24de58e-fe91-44ff-a906-9f1af7d9144f@github.com> Message-ID: On Fri, 2 Jun 2023 15:05:44 GMT, Roberto Castañeda Lozano wrote: > Thanks for proposing this enhancement, Stefan! I tried it out (on Linux) and it worked just fine. > > > The event has been implemented for Linux, MacOS, and Windows. The name ResidentSetSize isn't a perfect fit for the values extracted from MacOS and Windows, but I think it is better to name this after something that many people are familiar with instead of trying to find a generic name that fits all platforms. Do you agree with that, or should we change it to something else? > I agree. What's important is that end-users understand the concept. If there are differences between OS:es (worth mentioning) they can be stated in the description of the event, i.e.
"On MacOs X is included, but not Y" ------------- PR Comment: https://git.openjdk.org/jdk/pull/14285#issuecomment-1574289669 From duke at openjdk.org Fri Jun 2 20:45:07 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Fri, 2 Jun 2023 20:45:07 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v3] In-Reply-To: <0_uC7OzsWkpKayYwBA4Q3h1ZJuyD3GNukoB8B_6jSdE=.a549624f-75b7-4c5f-abc2-ccdf4954d013@github.com> References: <0_uC7OzsWkpKayYwBA4Q3h1ZJuyD3GNukoB8B_6jSdE=.a549624f-75b7-4c5f-abc2-ccdf4954d013@github.com> Message-ID: On Fri, 2 Jun 2023 20:10:06 GMT, Ioi Lam wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> Review comments - updates to alloc_archive_regions() api >> >> Signed-off-by: Ashutosh Mehra > > src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 712: > >> 710: // the location of the archive space in the heap. The returned address may or may >> 711: // not be same as the preferred address. >> 712: HeapWord* alloc_archive_region(size_t word_size, HeapWord* preferred_addr); > > Sorry to be picky, but I think `region` implies a single region, but G1 could allocate one or more regions to satisfy the request. I think it's better to use `allocate_archive_range(MemRegion requested_range)` to be more neutral. Passing the range in a `MemRegion` will also look similar to the API right above this one. > > Maybe we should also change `populate_archive_regions_bot_part` and `dealloc_archive_regions` to use `_range` as well. What do you think, @tschatzl How about replacing `allocate_archive_range` with `allocate_archive_space` which is actually planned for the next patch? I am more inclined to keep size and address as separate parameters because the address is just a hint for the collectors, while size is not. With MemRegion as the type this information is not conveyed to the reader without reading the comments/code. 
This is also the reason why I prefer using "preferred_addr" rather than "requested_addr". But if you feel otherwise I will update the API to use MemRegion. > Maybe we should also change populate_archive_regions_bot_part and dealloc_archive_regions to use _range as well I am not sure if it is necessary to update these APIs, as I would anyway be replacing them with fixup_archive_space() and handle_archive_space_failure() as mentioned in the description of main issue [JDK-8296263](https://bugs.openjdk.org/browse/JDK-8296263). IMO these names are more generic. If you want I can update the code to use these new names in this patch. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1214819384 From amenkov at openjdk.org Fri Jun 2 22:05:16 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 2 Jun 2023 22:05:16 GMT Subject: Integrated: 8308978: regression with a deadlock involving FollowReferences In-Reply-To: <2J1qItzUgmfjRPS0xUbHgXZQ-b12JBxe8XPRftU2GyA=.025e7855-5df4-413a-bea7-585a53832025@github.com> References: <2J1qItzUgmfjRPS0xUbHgXZQ-b12JBxe8XPRftU2GyA=.025e7855-5df4-413a-bea7-585a53832025@github.com> Message-ID: On Tue, 30 May 2023 22:58:58 GMT, Alex Menkov wrote: > The change fixes regression from JDK-8299414. > There is a deadlock between JvmtiVTMSTransitionDisabler and EscapeBarrier when virtual threads are in mount/unmount transition: > EscapeBarrier requests deoptimization which requires thread suspension. > JvmtiVTMSTransitionDisabler ctor waits until all in progress VTMS transitions complete, but they cannot be completed as thread is suspended. > To avoid the deadlock mount/unmount transitions should be completed before EscapeBarrier stuff. This pull request has now been integrated. 
Changeset: 62c935d4 Author: Alex Menkov URL: https://git.openjdk.org/jdk/commit/62c935d4fa09ed557d301bc28d9bf1480b344989 Stats: 22 lines in 2 files changed: 6 ins; 16 del; 0 mod 8308978: regression with a deadlock involving FollowReferences Reviewed-by: sspitsyn, lmesnik ------------- PR: https://git.openjdk.org/jdk/pull/14233 From sgibbons at openjdk.org Fri Jun 2 22:52:48 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Fri, 2 Jun 2023 22:52:48 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v6] In-Reply-To: References: Message-ID: > Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem. > > Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). > > Old: > gcc-12.2.1-4.fc36.x86_64 > 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix > JVM version: 21-internal > Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 > Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 > Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 > Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 > Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 > Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 > Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 > Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 > Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 > Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 > New: > JVM version: 21-internal (float) > Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 
42 > Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 > Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17 > Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 > Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Indentation; spread source into assembly ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14224/files - new: https://git.openjdk.org/jdk/pull/14224/files/26a821f9..85999cd1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=04-05 Stats: 447 lines in 1 file changed: 0 ins; 26 del; 421 mod Patch: https://git.openjdk.org/jdk/pull/14224.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14224/head:pull/14224 PR: https://git.openjdk.org/jdk/pull/14224 From sviswanathan at openjdk.org Fri Jun 2 23:25:10 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Fri, 2 Jun 2023 23:25:10 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v6] In-Reply-To: References: Message-ID: On Fri, 2 Jun 2023 22:52:48 GMT, Scott Gibbons wrote: >> Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem. 
>> >> Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). >> >> Old: >> gcc-12.2.1-4.fc36.x86_64 >> 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix >> JVM version: 21-internal >> Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 >> Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 >> Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 >> Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 >> Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 >> Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 >> Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 >> Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 >> Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 >> Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 >> New: >> JVM version: 21-internal (float) >> Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42 >> Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 >> Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17 >> Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 >> Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 >> Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 >> Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 > > Scott Gibbons has 
updated the pull request incrementally with one additional commit since the last revision: > > Indentation; spread source into assembly @asgibbons Thanks for taking care of all the review comments. The PR looks good to me now. ------------- Marked as reviewed by sviswanathan (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14224#pullrequestreview-1458591764 From jbechberger at openjdk.org Sat Jun 3 08:46:24 2023 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Sat, 3 Jun 2023 08:46:24 GMT Subject: RFR: 8297967: Make frame::safe_for_sender safer [v6] In-Reply-To: References: Message-ID: On Mon, 24 Apr 2023 09:52:05 GMT, Johannes Bechberger wrote: >> Makes `frame::safe_for_sender` safer by checking that the location of the return address, sender stack pointer, and link address is accessible. This makes the method safer in the case of broken frames. > > Johannes Bechberger has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: > > - Remove errorneously added check > - Remove check for value that might be null > - More SafeFetch > - Make frame::safe_for_sender safer with SafeFetch This might be possible. I'll work on it next week. ------------- PR Comment: https://git.openjdk.org/jdk/pull/11461#issuecomment-1574791022 From dfuchs at openjdk.org Sat Jun 3 10:21:08 2023 From: dfuchs at openjdk.org (Daniel Fuchs) Date: Sat, 3 Jun 2023 10:21:08 GMT Subject: RFR: 8306647: Implementation of Structured Concurrency (Preview) [v4] In-Reply-To: References: <6gZZEoP1WXdBcZUiL5890eNsgaRFzZNY_rBItZdXtNc=.5d8f7bd9-44d5-4074-8a5c-35f8203263b2@github.com> Message-ID: On Thu, 1 Jun 2023 13:43:33 GMT, Alan Bateman wrote: >> This is the implementation of: >> >> - JEP 453: Structured Concurrency (Preview) >> - JEP 446: Scoped Values (Preview) >> >> For the most part, this is just moving code and tests.
StructuredTaskScope moves to j.u.concurrent as a preview API, ScopedValue moves to j.lang as a preview API, and module jdk.incubator.concurrent has been removed. The significant API changes since incubator are: >> >> - StructuredTaskScope.fork returns Subtask instead of Future (JEP 453 has a section on this) >> - ScopedValue.where methods are replaced with runWhere, callWhere and getWhere > > Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: > > - Sync up from loom repo > - Merge > - Sync with loom repo, re-work ScopedValue class description > - Sync up from loom repo > - Remove csm.Threads > - Merge > - Test should not be in update for main line > - Sync with loom repo > - Sync up tests frmo loom repo > - Sync up with loom repo > - ... and 5 more: https://git.openjdk.org/jdk/compare/a46b5acc...cc902ce6 I reviewed the API documentation of `ScopedValue` (mostly the class level API documentation, but I had a look at the methods API too) and it looks good. It reads well, and when I had questions, they were usually answered in the paragraph that followed what I was reading. I didn't look at Structured Concurrency. ------------- Marked as reviewed by dfuchs (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13932#pullrequestreview-1459172232 From sspitsyn at openjdk.org Sat Jun 3 11:25:23 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Sat, 3 Jun 2023 11:25:23 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING Message-ID: When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based on the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY.
The fix adds: - a special case for passive carrier threads - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. Testing: - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` - submitted mach5 tiers 1-5 - TBD: to submit mach5 tier 6 ------------- Commit messages: - minor tweaks in libThreadStateTest.cpp - 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING Changes: https://git.openjdk.org/jdk/pull/14298/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14298&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8307153 Stats: 71 lines in 4 files changed: 61 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/14298.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14298/head:pull/14298 PR: https://git.openjdk.org/jdk/pull/14298 From dnsimon at openjdk.org Sat Jun 3 11:40:32 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Sat, 3 Jun 2023 11:40:32 GMT Subject: RFR: 8309390: [JVMCI] improve copying system properties into libgraal Message-ID: <9bsjzlbHK31VVyGwzyhpSBjSILWFxmAX0IfiWK6Wb_w=.197d2b45-dba5-43bc-ac4e-4f993d3e777a@github.com> This PR improves the startup time for libgraal by speeding up how `VM.savedProps` is copied into libgraal. This data structure is now serialized to a native buffer directly from C++ and the native buffer is then directly decoded by libgraal. ## Times The basic benchmarking below shows that this change brings the time for a nop Java app with eager libgraal initialization (2) down to almost the same time as lazy libgraal initialization (1). The latter typically means no libgraal initialization happens as a top tier JIT compilation is never scheduled in such a short running app. 
public class Nop { public static void main(String[] args) {} } (1) Baseline (no options): > for i in (seq 10); java Nop; end 0.05 real 0.04 user 0.01 sys 0.04 real 0.03 user 0.01 sys 0.04 real 0.03 user 0.01 sys 0.04 real 0.03 user 0.01 sys 0.03 real 0.03 user 0.00 sys 0.04 real 0.03 user 0.01 sys 0.04 real 0.03 user 0.00 sys 0.03 real 0.03 user 0.00 sys 0.04 real 0.03 user 0.01 sys 0.03 real 0.03 user 0.00 sys (2) Eagerly initialize libgraal (with PR): > for i in (seq 10); /usr/bin/time java -XX:+EagerJVMCI Nop; end 0.06 real 0.04 user 0.01 sys 0.05 real 0.03 user 0.01 sys 0.05 real 0.03 user 0.01 sys 0.05 real 0.03 user 0.01 sys 0.05 real 0.03 user 0.01 sys 0.05 real 0.03 user 0.01 sys 0.05 real 0.03 user 0.01 sys 0.05 real 0.03 user 0.01 sys 0.05 real 0.03 user 0.01 sys 0.05 real 0.03 user 0.01 sys (3) Eagerly initialize libgraal (without PR): > for i in (seq 10); /usr/bin/time java -XX:+EagerJVMCI Nop; end 0.11 real 0.08 user 0.02 sys 0.08 real 0.06 user 0.01 sys 0.08 real 0.07 user 0.01 sys 0.10 real 0.07 user 0.01 sys 0.08 real 0.06 user 0.01 sys 0.10 real 0.07 user 0.01 sys 0.08 real 0.07 user 0.01 sys 0.08 real 0.07 user 0.01 sys 0.08 real 0.06 user 0.01 sys 0.08 real 0.06 user 0.01 sys ------------- Commit messages: - more efficient copying of system properties into libjvmci Changes: https://git.openjdk.org/jdk/pull/14291/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14291&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8309390 Stats: 242 lines in 8 files changed: 172 ins; 30 del; 40 mod Patch: https://git.openjdk.org/jdk/pull/14291.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14291/head:pull/14291 PR: https://git.openjdk.org/jdk/pull/14291 From dnsimon at openjdk.org Sat Jun 3 11:40:32 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Sat, 3 Jun 2023 11:40:32 GMT Subject: RFR: 8309390: [JVMCI] improve copying system properties into libgraal In-Reply-To: 
<9bsjzlbHK31VVyGwzyhpSBjSILWFxmAX0IfiWK6Wb_w=.197d2b45-dba5-43bc-ac4e-4f993d3e777a@github.com> References: <9bsjzlbHK31VVyGwzyhpSBjSILWFxmAX0IfiWK6Wb_w=.197d2b45-dba5-43bc-ac4e-4f993d3e777a@github.com> Message-ID: On Fri, 2 Jun 2023 20:32:14 GMT, Doug Simon wrote: > This PR improves the startup time for libgraal by speeding up how `VM.savedProps` is copied into libgraal. This data structure is now serialized to a native buffer directly from C++ and the native buffer is then directly decoded by libgraal. > > ## Times > > The basic benchmarking below shows that this change brings the time for a nop Java app with eager libgraal initialization (2) down to almost the same time as lazy libgraal initialization (1). The latter typically means no libgraal initialization happens as a top tier JIT compilation is never scheduled in such a short running app. > > > public class Nop { > public static void main(String[] args) {} > } > > > (1) Baseline (no options): > >> for i in (seq 10); java Nop; end > 0.05 real 0.04 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.03 real 0.03 user 0.00 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.00 sys > 0.03 real 0.03 user 0.00 sys > 0.04 real 0.03 user 0.01 sys > 0.03 real 0.03 user 0.00 sys > > > (2) Eagerly initialize libgraal (with PR): > >> for i in (seq 10); /usr/bin/time java -XX:+EagerJVMCI Nop; end > 0.06 real 0.04 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > > > (3) Eagerly initialize libgraal (without PR): > >> for i in (seq 10); /usr/bin/time java -XX:+EagerJVMCI Nop; end > 0.11 real 0.08 user 0.02 sys > 0.08 real 0.06 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.10 real 0.07 user 0.01 sys > 0.08 real 0.06 
user 0.01 sys > 0.10 real 0.07 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.08 real ... src/java.base/share/classes/jdk/internal/util/SystemProps.java line 62: > 60: * are initialized by VersionProps.java-template. > 61: * > 62: * @return a Properties instance initialized with all of the properties I took the liberty of correcting this javadoc to reflect the code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14291#discussion_r1215453699 From jwaters at openjdk.org Sat Jun 3 13:45:21 2023 From: jwaters at openjdk.org (Julian Waters) Date: Sat, 3 Jun 2023 13:45:21 GMT Subject: RFR: 8250269: Replace ATTRIBUTE_ALIGNED with alignas [v16] In-Reply-To: <9QKV9cYFTo_1D8R-mI80lnewNkA0ceJNKFPbrvICxl4=.d6736b76-8324-4084-bede-6e144b4f6c04@github.com> References: <9QKV9cYFTo_1D8R-mI80lnewNkA0ceJNKFPbrvICxl4=.d6736b76-8324-4084-bede-6e144b4f6c04@github.com> Message-ID: > C++11 added the alignas attribute, for the purpose of specifying alignment on types, much like compiler specific syntax such as gcc's __attribute__((aligned(x))) or Visual C++'s __declspec(align(x)). > > We can phase out the use of the macro in favor of the standard attribute. In the meantime, we can replace the compiler specific definitions of ATTRIBUTE_ALIGNED with a portable definition. We might deprecate the use of the macro but changing its implementation quickly and cleanly applies the feature where the macro is being used. > > Note: With certain parts of HotSpot using ATTRIBUTE_ALIGNED so indiscriminately, this commit will likely take some time to get right > > This will require adding the alignas attribute to the list of language features approved for use in HotSpot code. (Completed with [8297912](https://github.com/openjdk/jdk/pull/11446)) Julian Waters has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 16 commits: - Merge branch 'master' into alignas - Merge branch 'openjdk:master' into alignas - alignas - Merge branch 'openjdk:master' into alignas - Merge branch 'openjdk:master' into alignas - Merge branch 'openjdk:master' into alignas - Merge branch 'openjdk:master' into alignas - Merge branch 'openjdk:master' into alignas - Merge branch 'openjdk:master' into alignas - Merge branch 'openjdk:master' into alignas - ... and 6 more: https://git.openjdk.org/jdk/compare/6edd786b...48d816d7 ------------- Changes: https://git.openjdk.org/jdk/pull/11431/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=11431&range=15 Stats: 9 lines in 3 files changed: 0 ins; 7 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/11431.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/11431/head:pull/11431 PR: https://git.openjdk.org/jdk/pull/11431 From stuefe at openjdk.org Sat Jun 3 14:07:07 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 3 Jun 2023 14:07:07 GMT Subject: RFR: 8307374: Add a JFR event for tracking RSS In-Reply-To: References: Message-ID: On Fri, 2 Jun 2023 13:41:18 GMT, Stefan Karlsson wrote: > Add A JFR event to track the resident set size (RSS) of the running process. This is a good complement to the new Native Memory Tracking events that were added for JDK 20 ([JDK-8157023](https://bugs.openjdk.org/browse/JDK-8157023)) > > You can use the JDK Mission Control tool to extract this data. 
Or, you can use the new [JFR Views](https://egahlin.github.io/2023/05/30/views.html) tool to get a textual representation of the values:
>
> # Create a JFR recording
> $ jdk/bin/java -XX:StartFlightRecording=dumponexit=true JavaApp
>
> # Extract the data from that recording
> $ jdk/bin/jfr view ResidentSetSize hotspot-pid-204767-id-1-2023_06_02_11_56_19.jfr
>
> Resident Set Size
>
> Time             Resident Set Size         Resident Set Size Peak Value
> ---------------- ------------------------- ------------------------------------
> 11:56:07         1.1 GB                    1.2 GB
> 11:56:08         333.7 MB                  1.2 GB
> 11:56:09         432.4 MB                  1.2 GB
> 11:56:10         695.9 MB                  1.2 GB
> 11:56:11         1.0 GB                    1.2 GB
> 11:56:12         1.3 GB                    1.3 GB
> 11:56:13         1.3 GB                    1.3 GB
> 11:56:14         1.3 GB                    1.3 GB
> 11:56:15         1.3 GB                    1.3 GB
> 11:56:16         1.3 GB                    1.3 GB
> 11:56:17         1.4 GB                    1.4 GB
> 11:56:18         1.8 GB                    1.8 GB
> 11:56:19         2.0 GB                    2.0 GB
>
> The event has been implemented for Linux, MacOS, and Windows. The name ResidentSetSize isn't a perfect fit for the values extracted from MacOS and Windows, but I think it is better to name this after something that many people are familiar with instead of trying to find a generic name that fits all platforms. Do you agree with that, or should we change it to something else?
>
> I've manually sanity checked that we get reasonable values on all OS:es. I've also added a jtreg test.

Useful, but the user still needs to know how to interpret that number. Interpreting RSS is not straightforward. Therefore I like the name "ResidentSetSize", any other name would just add obfuscation.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14285#issuecomment-1574969674

From stuefe at openjdk.org Sat Jun 3 14:11:05 2023
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Sat, 3 Jun 2023 14:11:05 GMT
Subject: RFR: 8307374: Add a JFR event for tracking RSS
In-Reply-To: References: Message-ID:

On Fri, 2 Jun 2023 13:41:18 GMT, Stefan Karlsson wrote:

> Add A JFR event to track the resident set size (RSS) of the running process. This is a good complement to the new Native Memory Tracking events that were added for JDK 20 ([JDK-8157023](https://bugs.openjdk.org/browse/JDK-8157023))
>
> You can use the JDK Mission Control tool to extract this data. Or, you can use the new [JFR Views](https://egahlin.github.io/2023/05/30/views.html) tool to get a textual representation of the values:
>
> # Create a JFR recording
> $ jdk/bin/java -XX:StartFlightRecording=dumponexit=true JavaApp
>
> # Extract the data from that recording
> $ jdk/bin/jfr view ResidentSetSize hotspot-pid-204767-id-1-2023_06_02_11_56_19.jfr
>
> Resident Set Size
>
> Time             Resident Set Size         Resident Set Size Peak Value
> ---------------- ------------------------- ------------------------------------
> 11:56:07         1.1 GB                    1.2 GB
> 11:56:08         333.7 MB                  1.2 GB
> 11:56:09         432.4 MB                  1.2 GB
> 11:56:10         695.9 MB                  1.2 GB
> 11:56:11         1.0 GB                    1.2 GB
> 11:56:12         1.3 GB                    1.3 GB
> 11:56:13         1.3 GB                    1.3 GB
> 11:56:14         1.3 GB                    1.3 GB
> 11:56:15         1.3 GB                    1.3 GB
> 11:56:16         1.3 GB                    1.3 GB
> 11:56:17         1.4 GB                    1.4 GB
> 11:56:18         1.8 GB                    1.8 GB
> 11:56:19         2.0 GB                    2.0 GB
>
> The event has been implemented for Linux, MacOS, and Windows. The name ResidentSetSize isn't a perfect fit for the values extracted from MacOS and Windows, but I think it is better to name this after something that many people are familiar with instead of trying to find a generic name that fits all platforms. Do you agree with that, or should we change it to something else?
>
> I've manually sanity checked that we get reasonable values on all OS:es. I've also added a jtreg test.

Good. Want to add vsize while you are at it :) ?

-------------

Marked as reviewed by stuefe (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/14285#pullrequestreview-1459461733 From jwaters at openjdk.org Sat Jun 3 14:27:11 2023 From: jwaters at openjdk.org (Julian Waters) Date: Sat, 3 Jun 2023 14:27:11 GMT Subject: RFR: 8250269: Replace ATTRIBUTE_ALIGNED with alignas [v15] In-Reply-To: References: <9QKV9cYFTo_1D8R-mI80lnewNkA0ceJNKFPbrvICxl4=.d6736b76-8324-4084-bede-6e144b4f6c04@github.com> Message-ID: On Wed, 12 Apr 2023 01:36:01 GMT, Kim Barrett wrote: >> Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 15 additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into alignas >> - alignas >> - Merge branch 'openjdk:master' into alignas >> - Merge branch 'openjdk:master' into alignas >> - Merge branch 'openjdk:master' into alignas >> - Merge branch 'openjdk:master' into alignas >> - Merge branch 'openjdk:master' into alignas >> - Merge branch 'openjdk:master' into alignas >> - Merge branch 'openjdk:master' into alignas >> - Merge branch 'openjdk:master' into alignas >> - ... and 5 more: https://git.openjdk.org/jdk/compare/11cb8426...a621bb62 > > I've been meaning to review this but have been swamped. Sorry. > > I don't think this change to HotSpot should be combined with JDK-8305341 / PR#13258. > > I'm concerned there might be uses of ATTRIBUTE_ALIGNED in other places than at > the front of the declaration (like the fixed offset_of macro in the proposed changes). > Obviously there aren't any that break compilation. But is alignas in other > places valid but with a different meaning? For a discussion of the kind of > thing I'm concerned about, see > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108796 @kimbarrett Sorry for such a long period of absence, I was dealing with some things. 
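To make the placement concern quoted above concrete, here is a minimal sketch. It is not code from the PR: `kTable`, `Padded`, `is_aligned`, and `check_alignas_sketch` are hypothetical names, and the sketch assumes a C++11 compiler that supports 64-byte alignment for static and automatic storage. It shows `alignas` in two positions: at the front of a variable declaration, where it aligns the object, and on a data member, where it changes the layout of the enclosing type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// alignas at the front of a declaration, the same position in which
// ATTRIBUTE_ALIGNED(x) is normally written: the whole array object is
// placed on a 64-byte boundary.
alignas(64) static const std::uint32_t kTable[] = {1, 2, 3, 4};

// alignas on a data member: the member is moved to the next 64-byte offset
// and the enclosing type takes on the stricter alignment.
struct Padded {
  char first;
  alignas(64) char second;
};

// Returns true if the address p is a multiple of alignment.
inline bool is_aligned(const void* p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Layout facts that follow from the member-level specifier.
static_assert(alignof(Padded) == 64, "type takes the member's alignment");
static_assert(offsetof(Padded, second) == 64, "member starts at offset 64");
static_assert(sizeof(Padded) == 128, "size is rounded up to the alignment");

// Runtime checks for the variable-level specifier and a stack object.
inline void check_alignas_sketch() {
  assert(is_aligned(kTable, 64));
  Padded p{};
  assert(is_aligned(&p.second, 64));
}
```

Calling `check_alignas_sketch()` from any test driver exercises the runtime checks; the `static_assert`s are evaluated at compile time. The point is only that the position of the specifier decides which entity it applies to, which is why each usage site is worth inspecting.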
I've written down all the usage sites excluding the globalDefinitions_gcc.hpp offset_of case as such (included the code affected for easy grepping as well):

macroAssembler_aarch64_log.cpp: `ATTRIBUTE_ALIGNED(64) juint _L_tbl[] =`

stubRoutines_aarch64.cpp:
    ATTRIBUTE_ALIGNED(4096) juint StubRoutines::aarch64::_crc_table[] =
    ATTRIBUTE_ALIGNED(64) jubyte StubRoutines::aarch64::_adler_table[] = {
    ATTRIBUTE_ALIGNED(64) juint StubRoutines::aarch64::_npio2_hw[] = {
    ATTRIBUTE_ALIGNED(64) jdouble StubRoutines::aarch64::_dsin_coef[] = {
    ATTRIBUTE_ALIGNED(64) jdouble StubRoutines::aarch64::_dcos_coef[] = {
    ATTRIBUTE_ALIGNED(64) jdouble StubRoutines::aarch64::_two_over_pi[] = {
    ATTRIBUTE_ALIGNED(64) jdouble StubRoutines::aarch64::_pio2[] = {

macroAssembler_x86_32_constants.cpp:
    ATTRIBUTE_ALIGNED(16) static const juint _ONES[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _PI4_INV[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _PI4X3[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _PI4X4[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _L_2IL0FLOATPACKET_0[] = {

macroAssembler_x86_32_cos.cpp: `ATTRIBUTE_ALIGNED(16) static const juint _static_const_table_cos[] =`

macroAssembler_x86_32_exp.cpp: `ATTRIBUTE_ALIGNED(16) static const juint _static_const_table[] =`

macroAssembler_x86_32_log.cpp: `ATTRIBUTE_ALIGNED(16) static const juint _static_const_table_log[] =`

macroAssembler_x86_32_log10.cpp: `ATTRIBUTE_ALIGNED(16) static const juint _static_const_table_log10[] =`

macroAssembler_x86_32_pow.cpp:
    ATTRIBUTE_ALIGNED(16) static const juint _static_const_table_pow[] =
    ATTRIBUTE_ALIGNED(8) static const double _DOUBLE2 = 2.0;
    ATTRIBUTE_ALIGNED(8) static const double _DOUBLE0 = 0.0;
    ATTRIBUTE_ALIGNED(8) static const double _DOUBLE0DOT5 = 0.5;

macroAssembler_x86_32_sin.cpp:
    ATTRIBUTE_ALIGNED(8) static const juint _zero_none[] =
    ATTRIBUTE_ALIGNED(4) static const juint __4onpi_d[] =
    ATTRIBUTE_ALIGNED(4) static const juint _TWO_32H[] =
    ATTRIBUTE_ALIGNED(4) static const juint _pi04_3d[] =
    ATTRIBUTE_ALIGNED(4) static const juint _pi04_5d[] =
    ATTRIBUTE_ALIGNED(4) static const juint _SCALE[] =
    ATTRIBUTE_ALIGNED(4) static const juint _zeros[] =
    ATTRIBUTE_ALIGNED(4) static const juint _pi04_2d[] =
    ATTRIBUTE_ALIGNED(4) static const juint _TWO_12H[] =
    ATTRIBUTE_ALIGNED(2) static const jushort __4onpi_31l[] =
    ATTRIBUTE_ALIGNED(16) static const jushort _SP[] =
    ATTRIBUTE_ALIGNED(16) static const jushort _CP[] =
    ATTRIBUTE_ALIGNED(16) static const juint _static_const_table_sin[] =

macroAssembler_x86_32_tan.cpp:
    ATTRIBUTE_ALIGNED(16) static const jushort _TP[] =
    ATTRIBUTE_ALIGNED(16) static const jushort _TQ[] =
    ATTRIBUTE_ALIGNED(16) static const jushort _GP[] =
    ATTRIBUTE_ALIGNED(16) static const juint _static_const_table_tan[] =

stubGenerator_x86_32.cpp:
    ATTRIBUTE_ALIGNED(16) static const uint32_t KEY_SHUFFLE_MASK[] = {
    ATTRIBUTE_ALIGNED(16) static const uint32_t COUNTER_SHUFFLE_MASK[] = {
    ATTRIBUTE_ALIGNED(16) static const uint32_t GHASH_BYTE_SWAP_MASK[] = {
    ATTRIBUTE_ALIGNED(16) static const uint32_t GHASH_LONG_SWAP_MASK[] = {

stubGenerator_x86_64_adler.cpp:
    ATTRIBUTE_ALIGNED(64) static const juint ADLER32_ASCALE_TABLE[] = {
    ATTRIBUTE_ALIGNED(32) static const juint ADLER32_SHUF0_TABLE[] = {
    ATTRIBUTE_ALIGNED(32) static const juint ADLER32_SHUF1_TABLE[] = {

stubGenerator_x86_64_aes.cpp:
    ATTRIBUTE_ALIGNED(16) static const uint64_t KEY_SHUFFLE_MASK[] = {
    ATTRIBUTE_ALIGNED(64) static const uint64_t COUNTER_SHUFFLE_MASK[] = {
    ATTRIBUTE_ALIGNED(64) static const uint64_t COUNTER_MASK_LINC0[] = {
    ATTRIBUTE_ALIGNED(16) static const uint64_t COUNTER_MASK_LINC1[] = {
    ATTRIBUTE_ALIGNED(64) static const uint64_t COUNTER_MASK_LINC4[] = {
    ATTRIBUTE_ALIGNED(64) static const uint64_t COUNTER_MASK_LINC8[] = {
    ATTRIBUTE_ALIGNED(64) static const uint64_t COUNTER_MASK_LINC16[] = {
    ATTRIBUTE_ALIGNED(64) static const uint64_t COUNTER_MASK_LINC32[] = {
    ATTRIBUTE_ALIGNED(64) static const uint64_t GHASH_POLYNOMIAL_REDUCTION[] = {
    ATTRIBUTE_ALIGNED(16) static const uint64_t GHASH_POLYNOMIAL_TWO_ONE[] = {

stubGenerator_x86_64_chacha.cpp:
    ATTRIBUTE_ALIGNED(64) static const uint64_t CC20_COUNTER_ADD_AVX[] = {
    ATTRIBUTE_ALIGNED(64) static const uint64_t CC20_COUNTER_ADD_AVX512[] = {
    ATTRIBUTE_ALIGNED(64) static const uint64_t CC20_LROT_CONSTS[] = {

stubGenerator_x86_64_constants.cpp:
    ATTRIBUTE_ALIGNED(8) static const juint _ONE[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _ONEHALF[] = {
    ATTRIBUTE_ALIGNED(8) static const juint _SIGN_MASK[] = {
    ATTRIBUTE_ALIGNED(8) static const juint _TWO_POW_55[] = {
    ATTRIBUTE_ALIGNED(8) static const juint _TWO_POW_M55[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _SHIFTER[] = {
    ATTRIBUTE_ALIGNED(4) static const juint _ZERO[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _SC_1[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _SC_2[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _SC_3[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _SC_4[] = {
    ATTRIBUTE_ALIGNED(8) static const juint _PI_4[] = {
    ATTRIBUTE_ALIGNED(8) static const juint _PI32INV[] = {
    ATTRIBUTE_ALIGNED(8) static const juint _NEG_ZERO[] = {
    ATTRIBUTE_ALIGNED(8) static const juint _P_1[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _P_2[] = {
    ATTRIBUTE_ALIGNED(8) static const juint _P_3[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _PI_INV_TABLE[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _Ctable[] = {

stubGenerator_x86_64_exp.cpp:
    ATTRIBUTE_ALIGNED(16) static const juint _cv[] =
    ATTRIBUTE_ALIGNED(16) static const juint _mmask[] =
    ATTRIBUTE_ALIGNED(16) static const juint _bias[] =
    ATTRIBUTE_ALIGNED(16) static const juint _Tbl_addr[] =
    ATTRIBUTE_ALIGNED(16) static const juint _ALLONES[] =
    ATTRIBUTE_ALIGNED(16) static const juint _ebias[] =
    ATTRIBUTE_ALIGNED(4) static const juint _XMAX[] =
    ATTRIBUTE_ALIGNED(4) static const juint _XMIN[] =
    ATTRIBUTE_ALIGNED(4) static const juint _INF[] =

stubGenerator_x86_64_ghash.cpp:
    ATTRIBUTE_ALIGNED(16) static const uint64_t GHASH_SHUFFLE_MASK[] = {
    ATTRIBUTE_ALIGNED(16) static const uint64_t GHASH_LONG_SWAP_MASK[] = {
    ATTRIBUTE_ALIGNED(16) static const uint64_t GHASH_BYTE_SWAP_MASK[] = {
    ATTRIBUTE_ALIGNED(16) static const uint64_t GHASH_POLYNOMIAL[] = {

stubGenerator_x86_64_log.cpp:
    ATTRIBUTE_ALIGNED(16) static const juint _L_tbl[] =
    ATTRIBUTE_ALIGNED(16) static const juint _log2[] =
    ATTRIBUTE_ALIGNED(16) static const juint _coeff[] =
    ATTRIBUTE_ALIGNED(16) static const juint _HIGHSIGMASK_log10[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _LOG10_E[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _L_tbl_log10[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _log2_log10[] =
    ATTRIBUTE_ALIGNED(16) static const juint _coeff_log10[] =

stubGenerator_x86_64_poly.cpp:
    ATTRIBUTE_ALIGNED(64) static const uint64_t POLY1305_PAD_MSG[] = {
    ATTRIBUTE_ALIGNED(64) static const uint64_t POLY1305_MASK42[] = {
    ATTRIBUTE_ALIGNED(64) static const uint64_t POLY1305_MASK44[] = {

stubGenerator_x86_64_pow.cpp:
    ATTRIBUTE_ALIGNED(16) static const juint _HIGHSIGMASK[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _LOG2_E[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _HIGHMASK_Y[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _T_exp[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _e_coeff[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _coeff_h[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _HIGHMASK_LOG_X[] = {
    ATTRIBUTE_ALIGNED(8) static const juint _HALFMASK[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _coeff_pow[] = {
    ATTRIBUTE_ALIGNED(16) static const juint _L_tbl_pow[] = {
    ATTRIBUTE_ALIGNED(8) static const juint _log2_pow[] = {
    ATTRIBUTE_ALIGNED(8) static const juint _DOUBLE2[] = {
    ATTRIBUTE_ALIGNED(8) static const juint _DOUBLE0[] = {
    ATTRIBUTE_ALIGNED(8) static const juint _DOUBLE0DOT5[] = {

stubGenerator_x86_64_sin.cpp: `ATTRIBUTE_ALIGNED(8) static const juint _ALL_ONES[] =`

stubGenerator_x86_64_tan.cpp:
    ATTRIBUTE_ALIGNED(16) static const juint _MUL16[] =
    ATTRIBUTE_ALIGNED(16) static const juint _sign_mask_tan[] =
    ATTRIBUTE_ALIGNED(16) static const juint _PI32INV_tan[] =
    ATTRIBUTE_ALIGNED(16) static const juint _P_1_tan[] =
    ATTRIBUTE_ALIGNED(16) static const juint _P_2_tan[] =
    ATTRIBUTE_ALIGNED(16) static const juint _P_3_tan[] =
    ATTRIBUTE_ALIGNED(16) static const juint _Ctable_tan[] =
    ATTRIBUTE_ALIGNED(16) static const juint _MASK_35_tan[] =
    ATTRIBUTE_ALIGNED(16) static const juint _Q_11_tan[] =
    ATTRIBUTE_ALIGNED(16) static const juint _Q_9_tan[] =
    ATTRIBUTE_ALIGNED(16) static const juint _Q_7_tan[] =
    ATTRIBUTE_ALIGNED(16) static const juint _Q_5_tan[] =
    ATTRIBUTE_ALIGNED(16) static const juint _Q_3_tan[] =
    ATTRIBUTE_ALIGNED(8) static const juint _PI_4_tan[] =
    ATTRIBUTE_ALIGNED(8) static const juint _QQ_2_tan[] =

stubRoutines_x86.cpp:
    ATTRIBUTE_ALIGNED(64) const juint StubRoutines::x86::_k256[] =
    ATTRIBUTE_ALIGNED(64) juint StubRoutines::x86::_k256_W[2*sizeof(StubRoutines::x86::_k256)];
    ATTRIBUTE_ALIGNED(64) const julong StubRoutines::x86::_k512_W[] =

For gc code that uses ATTRIBUTE_ALIGNED, since they define their own macros that use it:

xGlobals.hpp: `#define XCACHE_ALIGNED ATTRIBUTE_ALIGNED(XCacheLineSize)`

zGlobals.hpp: `#define ZCACHE_ALIGNED ATTRIBUTE_ALIGNED(ZCacheLineSize)`

xMarkStack.hpp:
    XCACHE_ALIGNED XMarkStackList _published;
    XCACHE_ALIGNED XMarkStackList _overflowed;

xMarkStackAllocator.hpp:
    XCACHE_ALIGNED XMarkStackMagazineList _freelist;
    XCACHE_ALIGNED XMarkStackSpace _space;

xMarkTerminate.hpp: `XCACHE_ALIGNED volatile uint _nworking_stage0;`

xNMethodTableIteration.hpp: `XCACHE_ALIGNED volatile size_t _claimed;`

zMarkStack.hpp:
    ZCACHE_ALIGNED ZMarkStackList _published;
    ZCACHE_ALIGNED ZMarkStackList _overflowed;

zMarkStackAllocator.hpp:
    ZCACHE_ALIGNED ZMarkStackSpace _space;
    ZCACHE_ALIGNED ZMarkStackMagazineList _freelist;
    ZCACHE_ALIGNED volatile bool _expanded_recently;

zNMethodTableIteration.hpp: `ZCACHE_ALIGNED volatile size_t _claimed;`

Fortunately, it seems like that mismatched offset_of macro in the globalDefinitions file really is the only code in HotSpot where the alignas specifier is in an area that is
semantically different in the C++ language.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/11431#issuecomment-1574985090

From ysr at openjdk.org Sat Jun 3 15:21:22 2023
From: ysr at openjdk.org (Y. Srinivas Ramakrishna)
Date: Sat, 3 Jun 2023 15:21:22 GMT
Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4]
In-Reply-To: <95yaqYTYoGnlkrDMbvZ-NTyVbGmHrL4DUYYIlV3wkwQ=.b6d787d2-3858-4486-b3dc-428a87969109@github.com> References: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> <7uARcGDHOuSUugc2zRg7JQgC2dSPBDOjeWGPjBPO2qs=.a09479b8-ba9b-4596-bc5a-7ace0968fe31@github.com> <95yaqYTYoGnlkrDMbvZ-NTyVbGmHrL4DUYYIlV3wkwQ=.b6d787d2-3858-4486-b3dc-428a87969109@github.com> Message-ID: <-jrieUm3r32vA5At0baw1nTndtNGoxG6EBrcEDjwyZw=.0a95dc08-1259-418d-a9bb-b2ba86b18c51@github.com>

On Fri, 2 Jun 2023 18:38:41 GMT, Martin Doerr wrote:

>> src/hotspot/cpu/riscv/gc/shenandoah/c1/shenandoahBarrierSetC1_riscv.cpp line 4:
>>
>>> 2: * Copyright (c) 2018, 2019, Red Hat, Inc. All rights reserved.
>>> 3: * Copyright (c) 2020, 2021, Huawei Technologies Co., Ltd. All rights reserved.
>>> 4: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved.
>>
>> This should be backed out, since it seems that there is no (other) change to this file.
>
> Yes. And also from files which were changed by non-Amazon employees only, please.

Thanks, Martin. Yes, we have noted that there were a few other files that were inadvertently caught in a copyright header dragnet. These will be reviewed and fixed in https://bugs.openjdk.org/browse/JDK-8309392 .
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1215600873

From alanb at openjdk.org Sun Jun 4 08:08:04 2023
From: alanb at openjdk.org (Alan Bateman)
Date: Sun, 4 Jun 2023 08:08:04 GMT
Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING
In-Reply-To: References: Message-ID:

On Sat, 3 Jun 2023 10:53:04 GMT, Serguei Spitsyn wrote:

> When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based on the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY.
> The fix adds:
>  - a special case for passive carrier threads
>  - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`.
>
> Testing:
>  - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest`
>  - submitted mach5 tiers 1-5
>  - TBD: to submit mach5 tier 6

src/hotspot/share/prims/jvmtiEnvBase.cpp line 764:

> 762:
> 763: if (is_passive_carrier_thread(jt, thread_oop)) {
> 764: state |= (JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY);

This is testing if the jt is carrying thread_oop, and it's okay for the JVMTI state to be reported as WAITING when waiting for something other than Object.wait.

One thing that is a bit confusing is the function name "is_passive_carrier_thread". A platform thread is either a carrier or not. Maybe for a different PR, but I think is_passive_carrier_thread should be renamed to avoid the use of the word "passive".
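A minimal sketch of the bit arithmetic behind the two lines quoted above. The constants are redefined locally so the example builds without a JDK; their values are assumed to match the corresponding `JVMTI_THREAD_STATE_*` definitions in `jvmti.h` and should be checked against that header. `carrier_state`, `is_waiting_outside_object_wait`, and `check_carrier_state_sketch` are hypothetical names, not HotSpot functions. The sketch shows that WAITING and WAITING_INDEFINITELY can be set without IN_OBJECT_WAIT, which is the "waiting for something other than Object.wait" case.

```cpp
#include <cassert>
#include <cstdint>

// Local stand-ins for the JVMTI thread-state bits. The values are assumed to
// match jvmti.h; verify them against the header before relying on them.
constexpr std::uint32_t STATE_ALIVE                = 0x0001;
constexpr std::uint32_t STATE_WAITING_INDEFINITELY = 0x0010;
constexpr std::uint32_t STATE_WAITING              = 0x0080;
constexpr std::uint32_t STATE_IN_OBJECT_WAIT       = 0x0100;

// Mirrors the quoted change: when the thread is carrying a mounted virtual
// thread, OR in the "waiting indefinitely" bits; otherwise leave the state.
inline std::uint32_t carrier_state(std::uint32_t base_state, bool is_carrying) {
  std::uint32_t state = base_state;
  if (is_carrying) {
    state |= (STATE_WAITING | STATE_WAITING_INDEFINITELY);
  }
  return state;
}

// True when the state says "waiting indefinitely" but not inside Object.wait.
inline bool is_waiting_outside_object_wait(std::uint32_t state) {
  const std::uint32_t waiting = STATE_WAITING | STATE_WAITING_INDEFINITELY;
  return (state & waiting) == waiting && (state & STATE_IN_OBJECT_WAIT) == 0;
}

// Small self-check of the two helpers.
inline void check_carrier_state_sketch() {
  assert(carrier_state(STATE_ALIVE, false) == STATE_ALIVE);
  assert(is_waiting_outside_object_wait(carrier_state(STATE_ALIVE, true)));
  assert(!is_waiting_outside_object_wait(STATE_ALIVE));
}
```

The sketch deliberately leaves any other bits in `base_state` untouched, since the quoted hunk only shows the OR; whether the real function also clears or masks other bits is not visible from this thread.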
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1216368303

From sspitsyn at openjdk.org Sun Jun 4 08:30:04 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Sun, 4 Jun 2023 08:30:04 GMT
Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING
In-Reply-To: References: Message-ID:

On Sun, 4 Jun 2023 08:05:34 GMT, Alan Bateman wrote:

>> When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based on the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY.
>> The fix adds:
>>  - a special case for passive carrier threads
>>  - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`.
>>
>> Testing:
>>  - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest`
>>  - submitted mach5 tiers 1-5
>>  - TBD: to submit mach5 tier 6
>
> src/hotspot/share/prims/jvmtiEnvBase.cpp line 764:
>
>> 762:
>> 763: if (is_passive_carrier_thread(jt, thread_oop)) {
>> 764: state |= (JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY);
>
> This is testing if the jt is carrying thread_oop, and it's okay for the JVMTI state to be reported as WAITING when waiting for something other than Object.wait.
>
> One thing that is a bit confusing is the function name "is_passive_carrier_thread". A platform thread is either a carrier or not. Maybe for a different PR, but I think is_passive_carrier_thread should be renamed to avoid the use of the word "passive".

Lines 763-764 correct the state specifically for a passive carrier thread, that is, a carrier thread which can't make progress until execution control has been returned from the virtual thread executing on top of it. They never apply to a platform thread which is not a carrier thread. "Passive" is the best word I was able to find for this meaning.
Do you have any other word/suggestion in mind?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1216390108

From aph at openjdk.org Sun Jun 4 10:20:23 2023
From: aph at openjdk.org (Andrew Haley)
Date: Sun, 4 Jun 2023 10:20:23 GMT
Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4]
In-Reply-To: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> References: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> Message-ID:

On Fri, 2 Jun 2023 02:49:25 GMT, Kelvin Nilsen wrote:

>> OpenJDK Colleagues:
>>
>> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314.
>>
>> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies `-XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards:
>>
>> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion.
>> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information.
>> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah.
>> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation.
>>
>> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah.
>>
>> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.
>
> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision:
>
>   Force PLAB sizes to align on card-table size

This patch was submitted 2023-05-26, two weeks before RDP1. It is large and complex. My biggest concern is that it doesn't seem to be well-isolated from the existing Shenandoah code. The biggest risk is that it breaks Trad (i.e. non-generational) Shenandoah. A patch like this takes a few weeks to review properly. I don't believe we should hurry the review process. While this looks interesting, and it has promise for the future, it's not worth breaking anything for.
-------------

PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1575504522

From alanb at openjdk.org Sun Jun 4 11:17:04 2023
From: alanb at openjdk.org (Alan Bateman)
Date: Sun, 4 Jun 2023 11:17:04 GMT
Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING
In-Reply-To: References: Message-ID:

On Sun, 4 Jun 2023 08:26:06 GMT, Serguei Spitsyn wrote:

> Lines 763-764 correct the state specifically for a passive carrier thread, that is, a carrier thread which can't make progress until execution control has been returned from the virtual thread executing on top of it. They never apply to a platform thread which is not a carrier thread. "Passive" is the best word I was able to find for this meaning. Do you have any other word/suggestion in mind?

It's just a carrier. A platform thread becomes a carrier when a virtual thread is mounted; it ceases to be a carrier once the virtual thread is unmounted. The mental model is that the carrier is blocked, so reporting its state as waiting indefinitely is correct. Maybe you don't want to rename it in this PR, but renaming this function to something like is_carrying would convey that it's asking whether a given JavaThread is carrying the given virtual thread oop.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1216568827

From jbhateja at openjdk.org Sun Jun 4 15:36:21 2023
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Sun, 4 Jun 2023 15:36:21 GMT
Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v6]
In-Reply-To: References: Message-ID:

On Fri, 2 Jun 2023 22:52:48 GMT, Scott Gibbons wrote:

>> Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem.
>>
>> Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191).
>>
>> Old:
>> gcc-12.2.1-4.fc36.x86_64
>> 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix
>> JVM version: 21-internal
>> Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68
>> Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67
>> Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70
>> Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69
>> Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44
>> Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40
>> Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40
>> Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40
>> Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41
>> Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40
>> New:
>> JVM version: 21-internal (float)
>> Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42
>> Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27
>> Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17
>> Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16
>> Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17
>> Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17
>> Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17
>> Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16
>> Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16
>> Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
>   Indentation; spread source into assembly

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 3941:

> 3939: generate_libm_stubs();
> 3940:
> 3941: if ((UseAVX >= 1) && (VM_Version::supports_avx512vlbwdq() || VM_Version::supports_fma())) {

We can relax this to supports_evex instead of vlbwdq.

src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 79:

> 77: __ enter(); // required for proper stackwalking of RuntimeStub frame
> 78:
> 79: if (VM_Version::supports_avx512vlbwdq()) { // AVX512 version

We can relax this to supports_evex().

src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 121:

> 119: // // |x|, |y|
> 120: // a = DP_AND(x, DP_CONST(7fffffffffffffff));
> 121: __ movq(xmm0, xmm0);

Redundant move.

src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 122:

> 120: // a = DP_AND(x, DP_CONST(7fffffffffffffff));
> 121: __ movq(xmm0, xmm0);
> 122: __ mov64(rax, 0x7FFFFFFFFFFFFFFF);

ULL suffix missing on long constant.

src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 123:

> 121: __ movq(xmm0, xmm0);
> 122: __ mov64(rax, 0x7FFFFFFFFFFFFFFF);
> 123: __ evpbroadcastq(xmm3, rax, Assembler::AVX_128bit);

Replace broadcast with cheaper PINSRQ.

src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 134:

> 132: __ evdivsd(xmm0, xmm6, xmm5, Assembler::EVEX_RZ);
> 133: // q = DP_ROUND_RZ(q);
> 134: __ movq(xmm0, xmm0);

Redundant movq.

src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 145:

> 143: __ jcc(Assembler::equal, L_5280);
> 144: // if (eq >= 0x7fefffffu) goto SPECIAL_FMOD;
> 145: __ cmpl(rax, 0x7feffffe);

Comment mentions comparison against 0x7fefffff.

src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 160:

> 158: __ jcc(Assembler::below, L_5300);
> 159: __ movsd(xmm0, ExternalAddress((address)CONST_INF), rax);
> 160: // return DP_FNMA(b, q, a); // NaN

Misplaced comment; the NaN case is already present at line 204.

src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 168:

> 166: __ jmp(L_exit);
> 167: // if (!eq) return x + sgn_a;
> 168: __ align32();

Redundant alignment?

src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 192:

> 190: __ evdivsd(xmm2, xmm6, xmm5, Assembler::EVEX_RZ);
> 191: // q = DP_ROUND_RZ(q);
> 192: __ movq(xmm2, xmm2);

Redundant move.

src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 215:

> 213: __ evdivsd(xmm0, xmm6, xmm2, Assembler::EVEX_RZ);
> 214: // q = DP_ROUND_RZ(q);
> 215: __ movq(xmm0, xmm0);

Redundant move.

src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 264:

> 262: __ evdivsd(xmm0, xmm7, xmm2, Assembler::EVEX_RZ);
> 263: // q = DP_ROUND_RZ(q);
> 264: __ movq(xmm0, xmm0);

Redundant move.

src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 305:

> 303: // // sign(x)
> 304: // sgn_a = DP_XOR(x, a);
> 305: __ mov64(rcx, 0x8000000000000000);

ULL suffix missing on long constant.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216815105
PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216810945
PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216613005
PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216612764
PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216616520
PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216617108
PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216622400
PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216733715
PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216736764
PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216756042
PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216767150
PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216779440
PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216630301

From jbhateja at openjdk.org Sun Jun 4 15:36:22 2023
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Sun, 4 Jun 2023 15:36:22 GMT
Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v2]
In-Reply-To: References: <2NOPy1QG4rGLMmXNTv_6E6WCKdRCLg466z_tGqo3xeE=.183282f8-8068-4bc0-941b-81b9a29138be@github.com> Message-ID:

On Thu, 1 Jun 2023 11:00:43 GMT, Jatin Bhateja wrote:

>> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Address review comments
>
> src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 306:
>
>> 304:
>> 305: Label L_104a, L_11bd, L_10c1, L_1090, L_11b9, L_10e7, L_11af, L_111c, L_10f3, L_116e, L_112a;
>> 306: Label L_1173, L_1157, L_117f, L_11a0;
>
> For the sake of clarity, can we segregate AVX2 functionality into a separate routine and indent the block.
It would be good to have this part in a separate routine.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216632754

From jbhateja at openjdk.org Sun Jun 4 15:55:10 2023
From: jbhateja at openjdk.org (Jatin Bhateja)
Date: Sun, 4 Jun 2023 15:55:10 GMT
Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v2]
In-Reply-To: <2m8KCrlZkRTg4pBAwASL4FKc_MFtL-POWmhG0ebAiwQ=.36cdb946-abb1-4938-aaa3-327775b26d65@github.com>
References: <2NOPy1QG4rGLMmXNTv_6E6WCKdRCLg466z_tGqo3xeE=.183282f8-8068-4bc0-941b-81b9a29138be@github.com> <2m8KCrlZkRTg4pBAwASL4FKc_MFtL-POWmhG0ebAiwQ=.36cdb946-abb1-4938-aaa3-327775b26d65@github.com>
Message-ID:

On Thu, 1 Jun 2023 16:00:19 GMT, Scott Gibbons wrote:

>> Hi @asgibbons ,
>> Kindly also include the results for following benchmark
>> test/micro/org/openjdk/bench/vm/floatingpoint/DremFrem.java
>>
>> Best Regards,
>> Jatin
>
>> Hi @asgibbons , Kindly also include the results for following benchmark test/micro/org/openjdk/bench/vm/floatingpoint/DremFrem.java
>>
>> Best Regards, Jatin
>
> Current top-of-tree results:
>
> Benchmark                      Mode  Cnt  Score   Error  Units
> DremFrem.calcDoubleJava        avgt   25  7.034 ± 0.001  ns/op
> DremFrem.calcFloatJava         avgt   25  7.011 ± 0.001  ns/op
> DremFrem.cornercaseDoubleJava  avgt   25  5.514 ± 0.006  ns/op
> DremFrem.cornercaseFloatJava   avgt   25  5.510 ± 0.003  ns/op
>
>
> My changes:
>
> Benchmark                      Mode  Cnt  Score   Error  Units
> DremFrem.calcDoubleJava        avgt   25  2.916 ± 0.001  ns/op
> DremFrem.calcFloatJava         avgt   25  4.011 ± 0.001  ns/op
> DremFrem.cornercaseDoubleJava  avgt   25  5.518 ± 0.008  ns/op
> DremFrem.cornercaseFloatJava   avgt   25  5.515 ± 0.007  ns/op

Hi @asgibbons ,
It would be good to back the special-case handling in the patch with a test case.
Best Regards, Jatin ------------- PR Comment: https://git.openjdk.org/jdk/pull/14224#issuecomment-1575618276 From sgibbons at openjdk.org Sun Jun 4 17:21:11 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Sun, 4 Jun 2023 17:21:11 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v6] In-Reply-To: References: Message-ID: <6AEj5wZji9ONqIxU-fTfIn6TF9bEttbQDGUerF79u-U=.b779a37d-45d0-445e-9ab9-b74f412f2038@github.com> On Sun, 4 Jun 2023 12:05:52 GMT, Jatin Bhateja wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Indentation; spread source into assembly > > src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 145: > >> 143: __ jcc(Assembler::equal, L_5280); >> 144: // if (eq >= 0x7fefffffu) goto SPECIAL_FMOD; >> 145: __ cmpl(rax, 0x7feffffe); > > Comment mention comparison against 0x7feffff. This is an artifact of block reordering by the compiler and should be correct. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216929285 From sgibbons at openjdk.org Sun Jun 4 17:34:11 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Sun, 4 Jun 2023 17:34:11 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v6] In-Reply-To: References: Message-ID: On Sun, 4 Jun 2023 11:56:48 GMT, Jatin Bhateja wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Indentation; spread source into assembly > > src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 121: > >> 119: // // |x|, |y| >> 120: // a = DP_AND(x, DP_CONST(7fffffffffffffff)); >> 121: __ movq(xmm0, xmm0); > > Redundatn move. I do not believe these are redundant, as the upper quadword of the register is cleared as a side-effect of the vmovq. I do not believe the icx compiler would insert random redundant vmovq instructions at this optimization level. 
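To make the point about `movq` concrete, here is a minimal C++ model of the register-to-register form. It is only a sketch, not HotSpot code: `Xmm`, `movq_model`, and `self_move_changes` are hypothetical names, and the model assumes the documented behaviour that the low quadword is copied and the upper quadword of the destination is zeroed. Under that assumption, `movq(xmm0, xmm0)` is a no-op only when the upper quadword already happens to be zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 128-bit XMM register as two 64-bit halves.
struct Xmm {
  std::uint64_t lo;  // bits 63:0
  std::uint64_t hi;  // bits 127:64
};

// Models the register form of movq: copy the low quadword of src into dst
// and clear the upper quadword of dst.
inline void movq_model(Xmm& dst, const Xmm& src) {
  const std::uint64_t low = src.lo;  // read first so that dst == src is safe
  dst.lo = low;
  dst.hi = 0;
}

// Returns true if a self-move (dst and src are the same register) would
// change the register contents.
inline bool self_move_changes(Xmm r) {
  const Xmm before = r;
  movq_model(r, r);
  return r.lo != before.lo || r.hi != before.hi;
}
```

For example, `self_move_changes(Xmm{0x3ff0000000000000ULL, 0xdeadbeefULL})` is true in this model, while the same call with a zero upper half is false.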
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1216942237 From sgibbons at openjdk.org Sun Jun 4 17:42:43 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Sun, 4 Jun 2023 17:42:43 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v7] In-Reply-To: References: Message-ID: > Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem. > > Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). > > Old: > gcc-12.2.1-4.fc36.x86_64 > 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix > JVM version: 21-internal > Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 > Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 > Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 > Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 > Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 > Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 > Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 > Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 > Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 > Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 > New: > JVM version: 21-internal (float) > Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42 > Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 > Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17 > Iteration 3 regression case Took : 17 noMod case took: 3 noPower 
case took: 16 > Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 > Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Review comments. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14224/files - new: https://git.openjdk.org/jdk/pull/14224/files/85999cd1..1b44cd62 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=05-06 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14224.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14224/head:pull/14224 PR: https://git.openjdk.org/jdk/pull/14224 From kdnilsen at openjdk.org Sun Jun 4 21:39:58 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Sun, 4 Jun 2023 21:39:58 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: References: Message-ID: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. 
Old-generation marking runs concurrently during the time that multiple young generation collections run to completion.
> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information.
> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah.
> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation.
>
> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah.
>
> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.
Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Remove three asserts making comparisons between atomic volatile variables Though changes to the volatile variables are individually protected by Atomic load and store operations, these asserts were not assuring atomic access to multiple volatile variables, each of which could be modified independently of the others. The asserts were therefore not trustworthy, as has been confirmed by more extensive testing. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/d4d2f1cf..8d80780a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=03-04 Stats: 18 lines in 2 files changed: 0 ins; 18 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From alanb at openjdk.org Mon Jun 5 06:19:27 2023 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 5 Jun 2023 06:19:27 GMT Subject: RFR: 8309408: Thread.sleep cleanup Message-ID: Thread.sleep has had quite a bit of churn recently to support virtual threads, add sleep(Duration), a JFR event, and the change the underlying implementation to support sub-millis precision. I think the changes have settled down now so we can do some small cleanups that came up in PR discussions. The cleanups were kicked down the road as it requires tracking down faraway tests that depend on the stack depth and the names of internal methods. The two cleanups proposed here are: 1. Add a private sleepNanos method that creates/commits the JFR event around the sleep, this avoids duplicate code in the 3 sleep methods. 2. Rename JVM_Sleep to JVM_SleepNanos to make it clear that it takes the sleep time in nanoseconds, esp. when Thread.sleep's parameter is milliseconds. 
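The hazard described in this commit message can be sketched as follows. This is a minimal illustration under stated assumptions, not Shenandoah code: `std::atomic` stands in for HotSpot's Atomic load and store helpers, and `GenerationSizes`, `young_used`, `old_used`, and the invariant itself are hypothetical. Each counter is updated atomically, but a check that reads both performs two separate loads, so it can observe the state between two related updates.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical pair of counters. Each one is individually atomic, but
// nothing makes an update of both appear as a single step.
struct GenerationSizes {
  std::atomic<std::size_t> young_used;
  std::atomic<std::size_t> old_used;
  std::size_t total;  // fixed sum the (hypothetical) assert would expect

  GenerationSizes(std::size_t young, std::size_t old_gen)
    : young_used(young), old_used(old_gen), total(young + old_gen) {}

  // A transfer of n units from young to old takes two atomic operations.
  void begin_transfer(std::size_t n)  { young_used.fetch_sub(n); }
  void finish_transfer(std::size_t n) { old_used.fetch_add(n); }

  // Two independent atomic loads: the pair is not a consistent snapshot,
  // so this can report a violation that never exists at a quiescent point.
  bool invariant_observed() const {
    return young_used.load() + old_used.load() == total;
  }
};
```

Calling `invariant_observed()` between `begin_transfer` and `finish_transfer` replays, on a single thread, what a concurrent checker could see: the check fails even though every individual access was atomic.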
------------- Commit messages: - Initial commit Changes: https://git.openjdk.org/jdk/pull/14303/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14303&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8309408 Stats: 75 lines in 10 files changed: 27 ins; 32 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/14303.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14303/head:pull/14303 PR: https://git.openjdk.org/jdk/pull/14303 From dzhang at openjdk.org Mon Jun 5 06:20:18 2023 From: dzhang at openjdk.org (Dingli Zhang) Date: Mon, 5 Jun 2023 06:20:18 GMT Subject: RFR: 8309418: RISC-V: Make use of vl1r_v & vfabs_v pseudo-instructions where appropriate Message-ID: Hi all, We should add assembler functions for two pseudo-instructions vl1r_v [1] & vfabs_v [2] and use them when appropriate for better readability. At the same time, we removed a few unused assembly instructions. Please take a look and have some reviews. Thanks a lot. [1] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#79-vector-loadstore-whole-register-instructions [2] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#1312-vector-floating-point-sign-injection-instructions ## Testing: qemu w/ UseRVV: - [x] Tier1 tests (release) - [x] Tier2 tests (release) - [ ] Tier3 tests (release) - [x] test/jdk/jdk/incubator/vector (release/fastdebug) ------------- Commit messages: - 8309418: RISC-V: Make use of vl1r_v & vfabs_v pseudo-instructions where appropriate Changes: https://git.openjdk.org/jdk/pull/14309/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14309&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8309418 Stats: 19 lines in 5 files changed: 8 ins; 7 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/14309.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14309/head:pull/14309 PR: https://git.openjdk.org/jdk/pull/14309 From fyang at openjdk.org Mon Jun 5 06:54:04 2023 From: fyang at openjdk.org (Fei Yang) Date: Mon, 5 Jun 2023 06:54:04 GMT 
Subject: RFR: 8309418: RISC-V: Make use of vl1r_v & vfabs_v pseudo-instructions where appropriate
In-Reply-To:
References:
Message-ID:

On Mon, 5 Jun 2023 06:13:08 GMT, Dingli Zhang wrote:

> Hi all,
> We should add assembler functions for two pseudo-instructions vl1r_v [1] &
> vfabs_v [2] and use them when appropriate for better readability.
>
> At the same time, we removed a few unused assembly instructions. Please take a look
> and have some reviews. Thanks a lot.
>
> [1] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#79-vector-loadstore-whole-register-instructions
> [2] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#1312-vector-floating-point-sign-injection-instructions
>
> ## Testing:
> qemu w/ UseRVV:
> - [x] Tier1 tests (release)
> - [x] Tier2 tests (release)
> - [ ] Tier3 tests (release)
> - [x] test/jdk/jdk/incubator/vector (release/fastdebug)

Looks good.

-------------

Marked as reviewed by fyang (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/14309#pullrequestreview-1461827176

From rcastanedalo at openjdk.org Mon Jun 5 07:06:15 2023
From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano)
Date: Mon, 5 Jun 2023 07:06:15 GMT
Subject: RFR: 8307374: Add a JFR event for tracking RSS
In-Reply-To:
References:
Message-ID:

On Fri, 2 Jun 2023 13:41:18 GMT, Stefan Karlsson wrote:

> Add A JFR event to track the resident set size (RSS) of the running process. This is a good complement to the new Native Memory Tracking events that were added for JDK 20 ([JDK-8157023](https://bugs.openjdk.org/browse/JDK-8157023))
>
> You can use the JDK Mission Control tool to extract this data.
Or, you can use the new [JFR Views](https://egahlin.github.io/2023/05/30/views.html) tool to get a textual representation of the values: > > > # Create a JFR recording > $ jdk/bin/java -XX:StartFlightRecording=dumponexit=true JavaApp > > # Extract the data from that recording > $ jdk/bin/jfr view ResidentSetSize hotspot-pid-204767-id-1-2023_06_02_11_56_19.jfr > > Resident Set Size > > Time Resident Set Size Resident Set Size Peak Value > ---------------- ------------------------- ------------------------------------ > 11:56:07 1.1 GB 1.2 GB > 11:56:08 333.7 MB 1.2 GB > 11:56:09 432.4 MB 1.2 GB > 11:56:10 695.9 MB 1.2 GB > 11:56:11 1.0 GB 1.2 GB > 11:56:12 1.3 GB 1.3 GB > 11:56:13 1.3 GB 1.3 GB > 11:56:14 1.3 GB 1.3 GB > 11:56:15 1.3 GB 1.3 GB > 11:56:16 1.3 GB 1.3 GB > 11:56:17 1.4 GB 1.4 GB > 11:56:18 1.8 GB 1.8 GB > 11:56:19 2.0 GB 2.0 GB > > > The event has been implemented for Linux, MacOS, and Windows. The name ResidentSetSize isn't a perfect fit for the values extracted from MacOS and Windows, but I think it is better to name this after something that many people are familiar with instead of trying to find a generic name that fits all platforms. Do you agree with that, or should we change it to something else? > > I've manually sanity checked that we get reasonable values on all OS:es. I've also added a jtreg test. Thanks again for this contribution, the Linux and OS-independent parts look good! test/jdk/jdk/jfr/event/runtime/TestResidentSetSizeEvent.java line 2: > 1: /* > 2: * Copyright (c) 2022, 2023, Oracle and/or its affiliates. All rights reserved. Is the `2022, 2023` intentional or should it just be `2023`? test/jdk/jdk/jfr/event/runtime/TestResidentSetSizeEvent.java line 31: > 29: import static jdk.test.lib.Asserts.assertEquals; > 30: > 31: import java.time.Instant; These three imports are unused. ------------- Marked as reviewed by rcastanedalo (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14285#pullrequestreview-1461839575 PR Review Comment: https://git.openjdk.org/jdk/pull/14285#discussion_r1217614714 PR Review Comment: https://git.openjdk.org/jdk/pull/14285#discussion_r1217615487 From luhenry at openjdk.org Mon Jun 5 07:10:05 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Mon, 5 Jun 2023 07:10:05 GMT Subject: RFR: 8309418: RISC-V: Make use of vl1r.v & vfabs.v pseudo-instructions where appropriate In-Reply-To: References: Message-ID: <9DNe_WzsYmitTEHPsIidt-bMTjVtDGa4b1uIn_WAh04=.672f310f-83e6-4e9d-bbe2-be1fdd446ebc@github.com> On Mon, 5 Jun 2023 06:13:08 GMT, Dingli Zhang wrote: > Hi all, > We should add assembler functions for two pseudo-instructions vl1r.v [1] & > vfabs.v [2] and use them when appropriate for better readability. > > At the same time, we removed a few unused assembly instructions. Please take a look > and have some reviews. Thanks a lot. > > [1] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#79-vector-loadstore-whole-register-instructions > [2] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#1312-vector-floating-point-sign-injection-instructions > > ## Testing: > qemu w/ UseRVV: > - [x] Tier1 tests (release) > - [x] Tier2 tests (release) > - [ ] Tier3 tests (release) > - [x] test/jdk/jdk/incubator/vector (release/fastdebug) Marked as reviewed by luhenry (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14309#pullrequestreview-1461848363 From dholmes at openjdk.org Mon Jun 5 07:31:12 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 5 Jun 2023 07:31:12 GMT Subject: RFR: 8309408: Thread.sleep cleanup In-Reply-To: References: Message-ID: On Sun, 4 Jun 2023 11:28:33 GMT, Alan Bateman wrote: > Thread.sleep has had quite a bit of churn recently to support virtual threads, add sleep(Duration), a JFR event, and the change the underlying implementation to support sub-millis precision. 
I think the changes have settled down now so we can do some small cleanups that came up in PR discussions. The cleanups were kicked down the road as it requires tracking down faraway tests that depend on the stack depth and the names of internal methods. The two cleanups proposed here are:
>
> 1. Add a private sleepNanos method that creates/commits the JFR event around the sleep, this avoids duplicate code in the 3 sleep methods.
> 2. Rename JVM_Sleep to JVM_SleepNanos to make it clear that it takes the sleep time in nanoseconds, esp. when Thread.sleep's parameter is milliseconds.

Looks good! Keeping these tests up-to-date is painful, but as you note this is hopefully stabilized now. There is one potential, pre-existing, test omission noted below.

Thanks.

test/hotspot/jtreg/vmTestbase/nsk/monitoring/share/ThreadController.java line 660:

> 658: expectedMethods.add(Thread.class.getName() + ".sleep");
> 659: expectedMethods.add(Thread.class.getName() + ".sleepNanos");
> 660: expectedMethods.add(Thread.class.getName() + ".sleepNanos0");

I'm surprised this test doesn't list `beforeSleep` and `afterSleep`.

-------------

Marked as reviewed by dholmes (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/14303#pullrequestreview-1461876843
PR Review Comment: https://git.openjdk.org/jdk/pull/14303#discussion_r1217641284

From stefank at openjdk.org Mon Jun 5 08:04:08 2023
From: stefank at openjdk.org (Stefan Karlsson)
Date: Mon, 5 Jun 2023 08:04:08 GMT
Subject: RFR: 8307374: Add a JFR event for tracking RSS
In-Reply-To:
References:
Message-ID:

On Mon, 5 Jun 2023 07:01:05 GMT, Roberto Castañeda Lozano wrote:

>> Add A JFR event to track the resident set size (RSS) of the running process. This is a good complement to the new Native Memory Tracking events that were added for JDK 20 ([JDK-8157023](https://bugs.openjdk.org/browse/JDK-8157023))
>>
>> You can use the JDK Mission Control tool to extract this data.
Or, you can use the new [JFR Views](https://egahlin.github.io/2023/05/30/views.html) tool to get a textual representation of the values: >> >> >> # Create a JFR recording >> $ jdk/bin/java -XX:StartFlightRecording=dumponexit=true JavaApp >> >> # Extract the data from that recording >> $ jdk/bin/jfr view ResidentSetSize hotspot-pid-204767-id-1-2023_06_02_11_56_19.jfr >> >> Resident Set Size >> >> Time Resident Set Size Resident Set Size Peak Value >> ---------------- ------------------------- ------------------------------------ >> 11:56:07 1.1 GB 1.2 GB >> 11:56:08 333.7 MB 1.2 GB >> 11:56:09 432.4 MB 1.2 GB >> 11:56:10 695.9 MB 1.2 GB >> 11:56:11 1.0 GB 1.2 GB >> 11:56:12 1.3 GB 1.3 GB >> 11:56:13 1.3 GB 1.3 GB >> 11:56:14 1.3 GB 1.3 GB >> 11:56:15 1.3 GB 1.3 GB >> 11:56:16 1.3 GB 1.3 GB >> 11:56:17 1.4 GB 1.4 GB >> 11:56:18 1.8 GB 1.8 GB >> 11:56:19 2.0 GB 2.0 GB >> >> >> The event has been implemented for Linux, MacOS, and Windows. The name ResidentSetSize isn't a perfect fit for the values extracted from MacOS and Windows, but I think it is better to name this after something that many people are familiar with instead of trying to find a generic name that fits all platforms. Do you agree with that, or should we change it to something else? >> >> I've manually sanity checked that we get reasonable values on all OS:es. I've also added a jtreg test. > > test/jdk/jdk/jfr/event/runtime/TestResidentSetSizeEvent.java line 2: > >> 1: /* >> 2: * Copyright (c) 2022, 2023, Oracle and/or its affiliates. All rights reserved. > > Is the `2022, 2023` intentional or should it just be `2023`? It was intentional since I copy-n-pasted the structure of the code from the NMT JFR test. > test/jdk/jdk/jfr/event/runtime/TestResidentSetSizeEvent.java line 31: > >> 29: import static jdk.test.lib.Asserts.assertEquals; >> 30: >> 31: import java.time.Instant; > > These three imports are unused. Thanks. Will remove them. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14285#discussion_r1217681022 PR Review Comment: https://git.openjdk.org/jdk/pull/14285#discussion_r1217681215 From stefank at openjdk.org Mon Jun 5 08:14:12 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 5 Jun 2023 08:14:12 GMT Subject: RFR: 8307374: Add a JFR event for tracking RSS [v2] In-Reply-To: References: Message-ID: <5--iQRwxU0JJYMoywpwEEX7dWRhnuGn6d6p2Av9kYDI=.55e444c1-f375-4eac-bc90-503ec61177c8@github.com> > Add A JFR event to track the resident set size (RSS) of the running process. This is a good complement to the new Native Memory Tracking events that were added for JDK 20 ([JDK-8157023](https://bugs.openjdk.org/browse/JDK-8157023)) > > You can use the JDK Mission Control tool to extract this data. Or, you can use the new [JFR Views](https://egahlin.github.io/2023/05/30/views.html) tool to get a textual representation of the values: > > > # Create a JFR recording > $ jdk/bin/java -XX:StartFlightRecording=dumponexit=true JavaApp > > # Extract the data from that recording > $ jdk/bin/jfr view ResidentSetSize hotspot-pid-204767-id-1-2023_06_02_11_56_19.jfr > > Resident Set Size > > Time Resident Set Size Resident Set Size Peak Value > ---------------- ------------------------- ------------------------------------ > 11:56:07 1.1 GB 1.2 GB > 11:56:08 333.7 MB 1.2 GB > 11:56:09 432.4 MB 1.2 GB > 11:56:10 695.9 MB 1.2 GB > 11:56:11 1.0 GB 1.2 GB > 11:56:12 1.3 GB 1.3 GB > 11:56:13 1.3 GB 1.3 GB > 11:56:14 1.3 GB 1.3 GB > 11:56:15 1.3 GB 1.3 GB > 11:56:16 1.3 GB 1.3 GB > 11:56:17 1.4 GB 1.4 GB > 11:56:18 1.8 GB 1.8 GB > 11:56:19 2.0 GB 2.0 GB > > > The event has been implemented for Linux, MacOS, and Windows. The name ResidentSetSize isn't a perfect fit for the values extracted from MacOS and Windows, but I think it is better to name this after something that many people are familiar with instead of trying to find a generic name that fits all platforms. 
Do you agree with that, or should we change it to something else? > > I've manually sanity checked that we get reasonable values on all OS:es. I've also added a jtreg test. Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Remove unused test imports ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14285/files - new: https://git.openjdk.org/jdk/pull/14285/files/f4cc021c..be9fc091 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14285&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14285&range=00-01 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14285.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14285/head:pull/14285 PR: https://git.openjdk.org/jdk/pull/14285 From stefank at openjdk.org Mon Jun 5 08:14:12 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 5 Jun 2023 08:14:12 GMT Subject: RFR: 8307374: Add a JFR event for tracking RSS [v2] In-Reply-To: References: Message-ID: On Sat, 3 Jun 2023 14:08:43 GMT, Thomas Stuefe wrote: > Good. Want to add vsize while you are at it :) ? Thanks! :) I have a few other things that I need to do before RDP1. I can review a patch if someone publishes a PR ... ------------- PR Comment: https://git.openjdk.org/jdk/pull/14285#issuecomment-1576302879 From stefank at openjdk.org Mon Jun 5 08:23:05 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 5 Jun 2023 08:23:05 GMT Subject: RFR: 8309408: Thread.sleep cleanup In-Reply-To: References: Message-ID: <5QMR3WIYHEm2QUPqwMEb5fX_eHEYZv9qrv9BeJeeooI=.c53f6306-26cf-4d32-966a-c6f6795b760c@github.com> On Sun, 4 Jun 2023 11:28:33 GMT, Alan Bateman wrote: > Thread.sleep has had quite a bit of churn recently to support virtual threads, add sleep(Duration), a JFR event, and the change the underlying implementation to support sub-millis precision. 
I think the changes have settled down now so we can do some small cleanups that came up in PR discussions. The cleanups were kicked down the road as it requires tracking down faraway tests that depend on the stack depth and the names of internal methods. The two cleanups proposed here are: > > 1. Add a private sleepNanos method that creates/commits the JFR event around the sleep, this avoids duplicate code in the 3 sleep methods. > 2. Rename JVM_Sleep to JVM_SleepNanos to make it clear that it takes the sleep time in nanoseconds, esp. when Thread.sleep's parameter is milliseconds. I welcome this change. We were looking at this code on Friday and was thinking that it would have been good to rename sleep0 and JVM_Sleep. :) ------------- Marked as reviewed by stefank (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14303#pullrequestreview-1461982066 From gcao at openjdk.org Mon Jun 5 09:19:04 2023 From: gcao at openjdk.org (Gui Cao) Date: Mon, 5 Jun 2023 09:19:04 GMT Subject: RFR: 8309418: RISC-V: Make use of vl1r.v & vfabs.v pseudo-instructions where appropriate In-Reply-To: References: Message-ID: On Mon, 5 Jun 2023 06:13:08 GMT, Dingli Zhang wrote: > Hi all, > We should add assembler functions for two pseudo-instructions vl1r.v [1] & > vfabs.v [2] and use them when appropriate for better readability. > > At the same time, we removed a few unused assembly instructions. Please take a look > and have some reviews. Thanks a lot. > > [1] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#79-vector-loadstore-whole-register-instructions > [2] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#1312-vector-floating-point-sign-injection-instructions > > ## Testing: > qemu w/ UseRVV: > - [x] Tier1 tests (release) > - [x] Tier2 tests (release) > - [ ] Tier3 tests (release) > - [x] test/jdk/jdk/incubator/vector (release/fastdebug) Look good for me, Thanks. ------------- Marked as reviewed by gcao (Author). 
PR Review: https://git.openjdk.org/jdk/pull/14309#pullrequestreview-1462096992 From bulasevich at openjdk.org Mon Jun 5 09:25:09 2023 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Mon, 5 Jun 2023 09:25:09 GMT Subject: RFR: 8305959: x86: Improve itable_stub [v2] In-Reply-To: References: Message-ID: On Fri, 2 Jun 2023 18:41:38 GMT, Aleksey Shipilev wrote: >> src/hotspot/cpu/x86/vtableStubs_x86_32.cpp line 203: >> >>> 201: >>> 202: start_pc = __ pc(); >>> 203: __ push(temp_reg); >> >> Why do we need to save this one? Do we care if this "tmp" is clobbered? > > This one is still not addressed, in case you missed it, @bulasevich. Right. Thanks. On x86_32 we have register pressure. I have to push-pop rdx to avoid crash. The question must be caused by temp_reg name, which is similar to rscatch. On x86_32 we do not have any scratch register. Will it be better if I replace push/pop(temp_reg) with explicit push/pop(rdx)? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13460#discussion_r1217785799 From bulasevich at openjdk.org Mon Jun 5 09:25:07 2023 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Mon, 5 Jun 2023 09:25:07 GMT Subject: RFR: 8305959: x86: Improve itable_stub [v3] In-Reply-To: References: Message-ID: <-zecbauEmvdtVM1qOZSAAk9x0Em8_z2VfTfnQh78_uk=.b6c31b51-e5b2-4ae6-a085-f532b372f15c@github.com> On Fri, 2 Jun 2023 05:49:59 GMT, Tobias Hartmann wrote: > I'm seeing build failures: @TobiHartmann Yes, thanks. I rebased my change to the tip. Now it must be Ok. ------------- PR Comment: https://git.openjdk.org/jdk/pull/13460#issuecomment-1576444043 From aph at openjdk.org Mon Jun 5 09:35:08 2023 From: aph at openjdk.org (Andrew Haley) Date: Mon, 5 Jun 2023 09:35:08 GMT Subject: RFR: 8305959: x86: Improve itable_stub [v2] In-Reply-To: References: Message-ID: On Mon, 5 Jun 2023 09:22:11 GMT, Boris Ulasevich wrote: >> This one is still not addressed, in case you missed it, @bulasevich. > > Right. Thanks. 
> On x86_32 we have register pressure. I have to push-pop rdx to avoid crash. The question must be caused by temp_reg name, which is similar to rscatch. On x86_32 we do not have any scratch register. > Will it be better if I replace push/pop(temp_reg) with explicit push/pop(rdx)? On x86_32 we do not have any scratch register. Will it be better if I replace push/pop(temp_reg) with explicit push/pop(rdx)? Yes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13460#discussion_r1217799527 From bulasevich at openjdk.org Mon Jun 5 10:10:22 2023 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Mon, 5 Jun 2023 10:10:22 GMT Subject: RFR: 8305959: x86: Improve itable_stub [v6] In-Reply-To: References: Message-ID: > Async profiler shows that applications spend up to 10% in itable_stubs. > > The current inefficiency of itable stubs is as follows. The generated itable_stub scans itable twice: first it checks if the object class is a subtype of the resolved_class, and then it finds the holder_class that implements the method. I suggest doing this in one pass: with a first loop over itable, check pointer equality to both holder_class and resolved_class. Once we have finished searching for resolved_class, continue searching for holder_class in a separate loop if it has not yet been found. > > This approach gives 1-10% improvement on the synthetic benchmarks and 3% improvement on Naive Bayes benchmark from the Renaissance Benchmark Suite (Intel Xeon X5675). 
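The one-pass lookup described in the quoted PR summary can be sketched in C++ roughly as follows. This is only an illustration of the control flow, not the generated stub: `ItableEntry`, `lookup_holder_offset`, the `std::vector` representation, and the use of -1 as a failure value are all assumptions made for the example (offsets are assumed non-negative).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical itable entry: an interface identity plus the offset of that
// interface's method table.
struct ItableEntry {
  const void* interface_klass;
  int offset;
};

// Returns the offset recorded for holder, or -1 if the receiver does not
// implement resolved or if holder is not present in the itable.
inline int lookup_holder_offset(const std::vector<ItableEntry>& itable,
                                const void* resolved,
                                const void* holder) {
  int holder_offset = -1;
  bool resolved_found = false;
  std::size_t i = 0;

  // First loop: compare each entry against both classes and stop as soon
  // as resolved has been seen.
  for (; i < itable.size(); ++i) {
    const ItableEntry& e = itable[i];
    if (e.interface_klass == holder) {
      holder_offset = e.offset;
    }
    if (e.interface_klass == resolved) {
      resolved_found = true;
      ++i;  // resume after this entry
      break;
    }
  }
  if (!resolved_found) {
    return -1;  // subtype check failed
  }

  // Second loop: only needed if holder has not been found yet.
  for (; holder_offset < 0 && i < itable.size(); ++i) {
    if (itable[i].interface_klass == holder) {
      holder_offset = itable[i].offset;
    }
  }
  return holder_offset;
}
```

When holder appears at or before resolved in the table, the second loop does not run at all, which is where the saving over two full scans would come from in this model.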
Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: push/pop(temp_get) -> push/pop(rdx) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/13460/files - new: https://git.openjdk.org/jdk/pull/13460/files/8f36e437..268875aa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=13460&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=13460&range=04-05 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/13460.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13460/head:pull/13460 PR: https://git.openjdk.org/jdk/pull/13460 From kdnilsen at openjdk.org Mon Jun 5 12:50:26 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 5 Jun 2023 12:50:26 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: References: Message-ID: On Sun, 4 Jun 2023 21:39:58 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. 
Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Remove three asserts making comparisons between atomic volatile variables > > Though changes to the volatile variables are individually protected by > Atomic load and store operations, these asserts were not assuring > atomic access to multiple volatile variables, each of which could be > modified independently of the others. The asserts were therefore not > trustworthy, as has been confirmed by more extensive testing.
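The hazard named in the commit message above can be replayed deterministically on a single thread. This is a minimal sketch that uses `std::atomic` in place of HotSpot's `Atomic` class; the counter names, `region_size`, the shape of the invariant and the chosen interleaving are all assumptions made for illustration, not code from the patch.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical counters mirroring the description; not HotSpot code.
constexpr std::size_t region_size = 1024;
std::atomic<std::size_t> used{0};
std::atomic<std::size_t> affiliated_region_count{0};

// What an allocating thread does: two separate atomic updates.
void allocate_regions(std::size_t regions) {
  used.fetch_add(regions * region_size);
  affiliated_region_count.fetch_add(regions);
}

// Replays one bad interleaving on a single thread. Returns whether the
// invariant "used <= count * region_size" appears to hold to the checker.
bool checker_sees_invariant() {
  // The checker loads the first operand...
  std::size_t count = affiliated_region_count.load();
  // ...another thread allocates in between (simulated inline here)...
  allocate_regions(4);
  // ...and only then does the checker load the second operand.
  std::size_t observed_used = used.load();
  return observed_used <= count * region_size;
}
```

Each load is atomic on its own, but the pair is not a consistent snapshot, so in this sketch the check can fail even though the counters are never actually inconsistent once both updates have landed.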
src/hotspot/share/gc/shenandoah/shenandoahGeneration.cpp line 660: > 658: void ShenandoahGeneration::increase_used(size_t bytes) { > 659: Atomic::add(&_used, bytes); > 660: } Note that C++ subexpression evaluation order is undefined. Here is an example of what can go wrong with the removed assertion: ThisThread: fetches _affiliated_region_count * begins to multiply with region_size OtherThread: increases _used and increases _affiliated_region_count appropriately (for a large allocation) ThisThread: fetches _used and observes assert violation ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1218025633 From kdnilsen at openjdk.org Mon Jun 5 13:06:25 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 5 Jun 2023 13:06:25 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: References: Message-ID: On Mon, 5 Jun 2023 12:46:58 GMT, Kelvin Nilsen wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Remove three asserts making comparisons between atomic volatile variables >> >> Though changes to the volatile variables are individually protected by >> Atomic load and store operations, these asserts were not assuring >> atomic access to multiple volatile variables, each of which could be >> modified independently of the others. The asserts were therefore not >> trustworthy, as has been confirmed by more extensive testing. > > src/hotspot/share/gc/shenandoah/shenandoahGeneration.cpp line 660: > >> 658: void ShenandoahGeneration::increase_used(size_t bytes) { >> 659: Atomic::add(&_used, bytes); >> 660: } > > Note that C++ subexpression evaluation order is undefined. 
Here is an example of what can go wrong with the removed assertion: > > ThisThread: fetches _affiliated_region_count * begins to multiply with region_size > OtherThread: increases _used and increases _affiliated_region_count appropriately (for a large allocation) > ThisThread: fetches _used and observes assert violation We have seen this race manifest in actual testing, which is what has motivated us to remove the assertions. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1218045035 From kdnilsen at openjdk.org Mon Jun 5 13:12:26 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 5 Jun 2023 13:12:26 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: References: Message-ID: On Sun, 4 Jun 2023 21:39:58 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. 
As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Remove three asserts making comparisons between atomic volatile variables > > Though changes to the volatile variables are individually protected by > Atomic load and store operations, these asserts were not assuring > atomic access to multiple volatile variables, each of which could be > modified independently of the others. The asserts were therefore not > trustworthy, as has been confirmed by more extensive testing.
src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.inline.hpp line 85: > 83: inline void ShenandoahHeapRegion::internal_increase_live_data(size_t s) { > 84: size_t new_live_data = Atomic::add(&_live_data, s, memory_order_relaxed); > 85: #ifdef ASSERT We have not observed a violation of this assert during testing. However, it appears unreliable in that _live_data increases monotonically under Atomic volatile math, whereas used() increases monotonically under heap lock, which we do not hold at the point of this assertion. It is possible that another allocating thread increases both used() and _live_data. We will see the increase in _live_data because _live_data changes are volatile Atomic. However, we may not see the increase in used() because we did not acquire the heap lock and the value of _top() that contributes to calculation of used() is not volatile. This situation can lead to an assertion failure. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1218051609 From alanb at openjdk.org Mon Jun 5 15:13:06 2023 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 5 Jun 2023 15:13:06 GMT Subject: RFR: 8309408: Thread.sleep cleanup In-Reply-To: References: Message-ID: On Mon, 5 Jun 2023 07:26:57 GMT, David Holmes wrote: >> Thread.sleep has had quite a bit of churn recently to support virtual threads, add sleep(Duration), a JFR event, and change the underlying implementation to support sub-millis precision. I think the changes have settled down now so we can do some small cleanups that came up in PR discussions. The cleanups were kicked down the road as they require tracking down faraway tests that depend on the stack depth and the names of internal methods. The two cleanups proposed here are: >> >> 1. Add a private sleepNanos method that creates/commits the JFR event around the sleep, this avoids duplicate code in the 3 sleep methods. >> 2.
Rename JVM_Sleep to JVM_SleepNanos to make it clear that it takes the sleep time in nanoseconds, esp. when Thread.sleep's parameter is milliseconds. > > test/hotspot/jtreg/vmTestbase/nsk/monitoring/share/ThreadController.java line 660: > >> 658: expectedMethods.add(Thread.class.getName() + ".sleep"); >> 659: expectedMethods.add(Thread.class.getName() + ".sleepNanos"); >> 660: expectedMethods.add(Thread.class.getName() + ".sleepNanos0"); > > I'm surprised this test doesn't list `beforeSleep` and `afterSleep`. > There is one potential, pre-existing, test omission noted below. > I'm surprised this test doesn't list `beforeSleep` and `afterSleep`. The monitoring/stress/thread tests will fail if they observe an unexpected method name in the stack trace. I don't think it can happen because the tests poll the thread state and for SleepingThread, it will sample the stack trace when the thread state is timed-wait. The beforeSleep/afterSleep methods won't be in the stack trace when sleeping. It would be harmless to add them in that they aren't going to cause these tests to fail but might help with any further changes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14303#discussion_r1218215964 From goetz at openjdk.org Mon Jun 5 15:39:06 2023 From: goetz at openjdk.org (Goetz Lindenmaier) Date: Mon, 5 Jun 2023 15:39:06 GMT Subject: RFR: JDK-8308288: Fix xlc17 clang warnings and build errors in hotspot In-Reply-To: <_mG48I6TlpqcdrS5N6DOIpRPpw6ZTrwUgMWzPzDjZ4o=.1eeaffd1-54ed-4344-b66e-a4a4a0583c4d@github.com> References: <_mG48I6TlpqcdrS5N6DOIpRPpw6ZTrwUgMWzPzDjZ4o=.1eeaffd1-54ed-4344-b66e-a4a4a0583c4d@github.com> Message-ID: On Fri, 2 Jun 2023 11:28:45 GMT, JoKern65 wrote: > This pr is a split off from JDK-8308288 : Fix xlc17 clang warnings in shared code https://github.com/openjdk/jdk/pull/14146 > It handles the part in hotspot.
> > It handles the error introduced by a redefine of malloc in stdlib.h resulting in the following build error: > > /data/d042520/pr/jdk/src/hotspot/share/runtime/os.cpp:616:5: error: no member named '_vec_malloc' in 'LogTag'; did you mean 'vec_malloc'? > log_warning(malloc, free)("ptr caught: " PTR_FORMAT, p2i(ptr)); > ^~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/log.hpp:46:28: note: expanded from macro 'log_warning' > #define log_warning(...) (!log_is_enabled(Warning, __VA_ARGS__)) ? (void)0 : LogImpl::write > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/log.hpp:68:45: note: expanded from macro 'log_is_enabled' > #define log_is_enabled(level, ...) (LogImpl::is_level(LogLevel::level)) > ^~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/logTag.hpp:221:38: note: expanded from macro 'LOG_TAGS' > #define LOG_TAGS(...) EXPAND_VARARGS(LOG_TAGS_EXPANDED(__VA_ARGS__, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG)) > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/logTag.hpp:217:57: note: expanded from macro 'LOG_TAGS_EXPANDED' > #define LOG_TAGS_EXPANDED(T0, T1, T2, T3, T4, T5, ...) PREFIX_LOG_TAG(T0), PREFIX_LOG_TAG(T1), PREFIX_LOG_TAG(T2), \ > ^~~~~~~~~~~~~~~~~~ > ... (rest of output omitted) > > > Additionally it solves the need for an #include on AIX for any usage of the alloca function, by adding the include to globalDefinitions_xlc.hpp os.cpp / timezone I leave it to you that it's the correct behaviour for AIX, for the other platforms it looks good. os.cpp / print_function_and_library_name() I don't like assignements in if(), but the fix here should not change this. Good. globalDefinitions_xlc.hpp / alloca.h Good. globalDefinitions_xlc.hpp / malloc Having looked at the discussion about this in #14146, I think this is the solution we want to have here. 
I don't think we should harden the code wrt. the fact that malloc and others can be macros. At least as long as none of the compilers of the main platforms detects this (e.g. running in GHA). Also, I know that Joachim asked IBM to remove the macro from the header. After all, it worked with xlc16. So let's fix this here in the _xlc file. ------------- Marked as reviewed by goetz (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14283#pullrequestreview-1462844698 From duke at openjdk.org Mon Jun 5 16:22:28 2023 From: duke at openjdk.org (Christine Flood) Date: Mon, 5 Jun 2023 16:22:28 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4] In-Reply-To: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> References: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> Message-ID: On Fri, 2 Jun 2023 02:49:25 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. 
Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Force PLAB sizes to align on card-table size There are some applications where generational collectors are effective, but there are some applications where they aren't, like LRU caches. I would like to see the changes made to Shenandoah to make it generational cleanly isolated from traditional Shenandoah so that both options remain available to our customers moving forward. It seems unlikely that such changes can be completed before the deadline for RDP1.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1575572311 From jbhateja at openjdk.org Mon Jun 5 17:06:16 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Mon, 5 Jun 2023 17:06:16 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v6] In-Reply-To: References: Message-ID: On Sun, 4 Jun 2023 17:31:06 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64_fmod.cpp line 121: >> >>> 119: // // |x|, |y| >>> 120: // a = DP_AND(x, DP_CONST(7fffffffffffffff)); >>> 121: __ movq(xmm0, xmm0); >> >> Redundant move. > > I do not believe these are redundant, as the upper quadword of the register is cleared as a side-effect of the vmovq. I do not believe the icx compiler would insert random redundant vmovq instructions at this optimization level. Subsequent uses of xmm0 operate on the 128-bit vector, and eventually it feeds into a DIVSD instruction operating on the first 64 bits of data. Given that we are clearing the upper 64 bits, the move may be issued to an execution port and consume 1 cycle. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14224#discussion_r1218352040 From shade at openjdk.org Mon Jun 5 17:28:08 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 5 Jun 2023 17:28:08 GMT Subject: RFR: 8309408: Thread.sleep cleanup In-Reply-To: References: Message-ID: On Sun, 4 Jun 2023 11:28:33 GMT, Alan Bateman wrote: > Thread.sleep has had quite a bit of churn recently to support virtual threads, add sleep(Duration), a JFR event, and change the underlying implementation to support sub-millis precision. I think the changes have settled down now so we can do some small cleanups that came up in PR discussions. The cleanups were kicked down the road as they require tracking down faraway tests that depend on the stack depth and the names of internal methods. The two cleanups proposed here are: > > 1.
Add a private sleepNanos method that creates/commits the JFR event around the sleep, this avoids duplicate code in the 3 sleep methods. > 2. Rename JVM_Sleep to JVM_SleepNanos to make it clear that it takes the sleep time in nanoseconds, esp. when Thread.sleep's parameter is milliseconds. I think we need to delay this until [JDK-8309361](https://bugs.openjdk.org/browse/JDK-8309361) is resolved, in case we would like to revert [JDK-8305092](https://bugs.openjdk.org/browse/JDK-8305092). ------------- PR Review: https://git.openjdk.org/jdk/pull/14303#pullrequestreview-1463041304 From sgibbons at openjdk.org Mon Jun 5 17:57:04 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 5 Jun 2023 17:57:04 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v8] In-Reply-To: References: Message-ID: <3V3eleKHO09NJ1RU7cfFK3mPOKa5ngtQYePCt8YAmWY=.9fc4c4bd-962f-445d-8a94-2be5fa654807@github.com> > Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem. > > Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). 
> > Old: > gcc-12.2.1-4.fc36.x86_64 > 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix > JVM version: 21-internal > Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 > Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 > Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 > Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 > Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 > Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 > Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 > Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 > Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 > Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 > New: > JVM version: 21-internal (float) > Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42 > Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 > Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17 > Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 > Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Finish review comments; add tests for corner cases ------------- Changes: - all: 
https://git.openjdk.org/jdk/pull/14224/files - new: https://git.openjdk.org/jdk/pull/14224/files/1b44cd62..624d1248 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=06-07 Stats: 238 lines in 4 files changed: 236 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14224.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14224/head:pull/14224 PR: https://git.openjdk.org/jdk/pull/14224 From vlivanov at openjdk.org Mon Jun 5 18:22:43 2023 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Mon, 5 Jun 2023 18:22:43 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v14] In-Reply-To: References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: On Thu, 25 May 2023 22:54:15 GMT, Cesar Soares Lucas wrote: >> Can I please get reviews for this PR? >> >> The most common and frequent use of NonEscaping Phis merging object allocations is for debugging information. The two graphs below show numbers for Renaissance and DaCapo benchmarks - similar results are obtained for all other applications that I tested. >> >> With what frequency does each IR node type occurs as an allocation merge user? I.e., if the same node type uses a Phi N times the counter is incremented by N: >> >> ![image](https://user-images.githubusercontent.com/2249648/222280517-4dcf5871-2564-4207-b49e-22aee47fa49d.png) >> >> What are the most common users of allocation merges? I.e., if the same node type uses a Phi N times the counter is incremented by 1: >> >> ![image](https://user-images.githubusercontent.com/2249648/222280608-ca742a4e-1622-4e69-a778-e4db6805ea02.png) >> >> This PR adds support scalar replacing allocations participating in merges used as debug information OR as a base for field loads. 
I plan to create subsequent PRs to enable scalar replacement of merges used by other node types (CmpP is next on the list) subsequently. >> >> The approach I used for _rematerialization_ is pretty straightforward. It consists basically of the following. 1) New IR node (suggested by V. Kozlov), named SafePointScalarMergeNode, to represent a set of SafePointScalarObjectNode; 2) Each scalar replaceable input participating in a merge will get a SafePointScalarObjectNode like if it weren't part of a merge. 3) Add a new Class to support the rematerialization of SR objects that are part of a merge; 4) Patch HotSpot to be able to serialize and deserialize debug information related to allocation merges; 5) Patch C2 to generate unique types for SR objects participating in some allocation merges. >> >> The approach I used for _enabling the scalar replacement of some of the inputs of the allocation merge_ is also pretty straightforward: call `MemNode::split_through_phi` to, well, split AddP->Load* through the merge which will render the Phi useless. >> >> I tested this with JTREG tests tier 1-4 (Windows, Linux, and Mac) and didn't see regression. I also experimented with several applications and didn't see any failure. I also ran tests with "-ea -esa -Xbatch -Xcomp -XX:+UnlockExperimentalVMOptions -XX:-TieredCompilation -server -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions -XX:+StressLCM -XX:+StressGCM -XX:+StressCCP" and didn't observe any related failures. > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: > > - Catching up with master branch. > > Merge remote-tracking branch 'origin/master' into rematerialization-of-merges > - Address PR review 6: refactoring around rematerialization & improve test cases. > - Address PR review 5: refactor on rematerialization & add tests. 
> - Merge remote-tracking branch 'origin/master' into rematerialization-of-merges > - Address part of PR review 4 & fix a bug setting only_candidate > - Catching up with master > > Merge remote-tracking branch 'origin/master' into rematerialization-of-merges > - Fix tests. Remember previous reducible Phis. > - Address PR review 3. Some comments and be able to abort compilation. > - Merge with Master > - Addressing PR review 2: refactor & reuse MacroExpand::scalar_replacement method. > - ... and 5 more: https://git.openjdk.org/jdk/compare/46c4da7f...8f81a7c8 src/hotspot/share/code/debugInfo.cpp line 251: > 249: // Set it to true so that the object will get rematerialized > 250: if (!_selected->is_root()) { > 251: _selected->set_root(true); Why do you need `_selected` to be marked as root? src/hotspot/share/code/debugInfo.cpp line 301: > 299: void ObjectMergeValue::print_detailed(outputStream* st) const { > 300: st->print("merge: ID=%d", _id); > 301: #ifndef PRODUCT Can you post a sample of the output, please? Why is it limited to non-product builds? It's valuable irrespective of build flavor. As I see in `ObjectValue::print_on` and `ScopeDesc::print_on`, you mix `print_on` with `print_fields_on`. Any particular reason for that? You could add an `is_object_merge` case in `ObjectValue::print_on` instead and extend `ObjectValue::print_fields_on` to cover the `ObjectMergeValue` case. I find it hard to reason about `ObjectValue::print_on` vs `ObjectMergeValue::print_on` since it's a non-virtual method. Also, formatting is broken. src/hotspot/share/opto/compile.cpp line 2332: > 2330: } > 2331: > 2332: NOT_PRODUCT(ConnectionGraph::verify_ram_nodes(this, root());) Why do you limit the check to non-product builds only? It won't fail the compilation with product builds.
src/hotspot/share/opto/output.cpp line 1101: > 1099: > 1100: if (!is_root) { > 1101: for (int k = 0; k < monarray->length(); k++) { I suggest to turn the lookup over `monarray` into a helper method and call it along with `locarray` and `exparray` checks: bool is_root = locarray->contains(ov) || exparray->contains(ov) || contains_as_owner(monarray, ov); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12897#discussion_r1217488199 PR Review Comment: https://git.openjdk.org/jdk/pull/12897#discussion_r1218419279 PR Review Comment: https://git.openjdk.org/jdk/pull/12897#discussion_r1217491794 PR Review Comment: https://git.openjdk.org/jdk/pull/12897#discussion_r1218431285 From sgibbons at openjdk.org Mon Jun 5 18:36:29 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 5 Jun 2023 18:36:29 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v9] In-Reply-To: References: Message-ID: <9gyMwajVcShejHFe9dDwsiaGubd4z4x8jn67-q3YBQM=.4e27a912-e2e5-48ee-8a12-fdcb52dfdb61@github.com> > Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem. > > Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). 
> > Old: > gcc-12.2.1-4.fc36.x86_64 > 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix > JVM version: 21-internal > Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 > Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 > Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 > Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 > Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 > Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 > Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 > Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 > Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 > Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 > New: > JVM version: 21-internal (float) > Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42 > Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 > Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17 > Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 > Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Fix test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14224/files - new: 
https://git.openjdk.org/jdk/pull/14224/files/624d1248..9b2c1db5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=07-08 Stats: 12 lines in 1 file changed: 1 ins; 0 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/14224.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14224/head:pull/14224 PR: https://git.openjdk.org/jdk/pull/14224 From sspitsyn at openjdk.org Mon Jun 5 19:00:49 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 5 Jun 2023 19:00:49 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v2] In-Reply-To: References: Message-ID: <5-_kURDcYKd5WYKi9B331c8h6okVmGfvjdy_xqi1UqU=.012c360b-1b57-4773-b64b-d727dbf4daeb@github.com> > When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY. > The fix adds: > - a special case for passive carrier threads > - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. > > Testing: > - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` > - submitted mach5 tiers 1-5 > - TBD: to submit mach5 tier 6 Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge - minor tweaks in libThreadStateTest.cpp - 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14298/files - new: https://git.openjdk.org/jdk/pull/14298/files/84d8825f..e60da02e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14298&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14298&range=00-01 Stats: 10489 lines in 228 files changed: 8499 ins; 1035 del; 955 mod Patch: https://git.openjdk.org/jdk/pull/14298.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14298/head:pull/14298 PR: https://git.openjdk.org/jdk/pull/14298 From never at openjdk.org Mon Jun 5 19:01:23 2023 From: never at openjdk.org (Tom Rodriguez) Date: Mon, 5 Jun 2023 19:01:23 GMT Subject: RFR: 8309390: [JVMCI] improve copying system properties into libgraal In-Reply-To: <9bsjzlbHK31VVyGwzyhpSBjSILWFxmAX0IfiWK6Wb_w=.197d2b45-dba5-43bc-ac4e-4f993d3e777a@github.com> References: <9bsjzlbHK31VVyGwzyhpSBjSILWFxmAX0IfiWK6Wb_w=.197d2b45-dba5-43bc-ac4e-4f993d3e777a@github.com> Message-ID: <1vHZFp-j2AKjYTX2bd_1RxQ2Ix252OCbqJ0m-AGaVTs=.b64ea885-8b22-45ad-8a76-b041a764a5de@github.com> On Fri, 2 Jun 2023 20:32:14 GMT, Doug Simon wrote: > This PR improves the startup time for libgraal by speeding up how `VM.savedProps` is copied into libgraal. This data structure is now serialized to a native buffer directly from C++ and the native buffer is then directly decoded by libgraal. > > ## Times > > The basic benchmarking below shows that this change brings the time for a nop Java app with eager libgraal initialization (2) down to almost the same time as lazy libgraal initialization (1). The latter typically means no libgraal initialization happens as a top tier JIT compilation is never scheduled in such a short running app. 
> > > public class Nop { > public static void main(String[] args) {} > } > > > (1) Baseline (no options): > >> for i in (seq 10); java Nop; end > 0.05 real 0.04 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.03 real 0.03 user 0.00 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.00 sys > 0.03 real 0.03 user 0.00 sys > 0.04 real 0.03 user 0.01 sys > 0.03 real 0.03 user 0.00 sys > > > (2) Eagerly initialize libgraal (with PR): > >> for i in (seq 10); /usr/bin/time java -XX:+EagerJVMCI Nop; end > 0.06 real 0.04 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > > > (3) Eagerly initialize libgraal (without PR): > >> for i in (seq 10); /usr/bin/time java -XX:+EagerJVMCI Nop; end > 0.11 real 0.08 user 0.02 sys > 0.08 real 0.06 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.10 real 0.07 user 0.01 sys > 0.08 real 0.06 user 0.01 sys > 0.10 real 0.07 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.08 real ... I don't really love the hard-coded parsing of the HashMap. What properties are actually required for JVMCI? It seems to me that the contents of Arguments::system_properties() should contain all the properties we want to advertise to JVMCI. That would avoid having to decode them after they've been converted into Java objects. 
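For readers following the thread, here is a minimal sketch of the general idea in the quoted PR description (serialize the properties once into a flat buffer, then decode that buffer directly). It assumes a hypothetical length-prefixed UTF-8 layout. The class name `SavedPropsCodec` and the wire format are illustrative assumptions only, not the encoding the PR or libgraal actually uses, and the sketch is Java on both sides whereas the PR writes the buffer from C++.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical codec for illustration; NOT the format used by the PR.
// Layout: int entryCount, then for every key and value: int byteLength, UTF-8 bytes.
final class SavedPropsCodec {

    // Flattens the map into one contiguous buffer.
    static byte[] encode(Map<String, String> props) {
        List<byte[]> chunks = new ArrayList<>();
        int size = Integer.BYTES; // room for the entry count
        for (Map.Entry<String, String> e : props.entrySet()) {
            for (String s : new String[] {e.getKey(), e.getValue()}) {
                byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
                chunks.add(utf8);
                size += Integer.BYTES + utf8.length; // length prefix + payload
            }
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(props.size());
        for (byte[] chunk : chunks) {
            buf.putInt(chunk.length);
            buf.put(chunk);
        }
        return buf.array();
    }

    // Rebuilds the map by walking the buffer front to back.
    static Map<String, String> decode(byte[] buffer) {
        ByteBuffer buf = ByteBuffer.wrap(buffer);
        int count = buf.getInt();
        Map<String, String> props = new LinkedHashMap<>();
        for (int i = 0; i < count; i++) {
            String key = readString(buf);
            String value = readString(buf);
            props.put(key, value);
        }
        return props;
    }

    private static String readString(ByteBuffer buf) {
        byte[] utf8 = new byte[buf.getInt()];
        buf.get(utf8);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("jvmci.Compiler", "graal");
        props.put("java.vm.name", "OpenJDK 64-Bit Server VM");
        // Round-trip the map through the flat buffer.
        System.out.println(decode(encode(props)).equals(props));
    }
}
```

The point of such a layout is that the reader needs no knowledge of the producer's object model: it only walks length-prefixed byte runs, which is why decoding a flat buffer can be cheaper than traversing a Java `HashMap` across the boundary.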
------------- PR Comment: https://git.openjdk.org/jdk/pull/14291#issuecomment-1577305531 From shade at openjdk.org Mon Jun 5 19:07:24 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 5 Jun 2023 19:07:24 GMT Subject: RFR: 8305959: x86: Improve itable_stub [v6] In-Reply-To: References: Message-ID: On Mon, 5 Jun 2023 10:10:22 GMT, Boris Ulasevich wrote: >> Async profiler shows that applications spend up to 10% in itable_stubs. >> >> The current inefficiency of itable stubs is as follows. The generated itable_stub scans itable twice: first it checks if the object class is a subtype of the resolved_class, and then it finds the holder_class that implements the method. I suggest doing this in one pass: with a first loop over itable, check pointer equality to both holder_class and resolved_class. Once we have finished searching for resolved_class, continue searching for holder_class in a separate loop if it has not yet been found. >> >> This approach gives 1-10% improvement on the synthetic benchmarks and 3% improvement on Naive Bayes benchmark from the Renaissance Benchmark Suite (Intel Xeon X5675). > > Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: > > push/pop(temp_get) -> push/pop(rdx)

I measured the last PR on `c6.8xlarge`:

Benchmark                         Mode  Cnt  Score   Error  Units

# Baseline
InterfaceCalls.test1stInt2Types   avgt   12  1.582 ± 0.098  ns/op
InterfaceCalls.test1stInt3Types   avgt   12  6.218 ± 0.001  ns/op
InterfaceCalls.test1stInt5Types   avgt   12  6.220 ± 0.004  ns/op
InterfaceCalls.test2ndInt2Types   avgt   12  2.215 ± 0.004  ns/op
InterfaceCalls.test2ndInt3Types   avgt   12  7.590 ± 0.008  ns/op
InterfaceCalls.test2ndInt5Types   avgt   12  7.591 ± 0.004  ns/op
InterfaceCalls.testIfaceCall      avgt   12  6.238 ± 0.006  ns/op
InterfaceCalls.testIfaceExtCall   avgt   12  8.389 ± 0.500  ns/op
InterfaceCalls.testMonomorphic    avgt   12  1.035 ± 0.001  ns/op

# Patched
InterfaceCalls.test1stInt2Types   avgt   12  1.476 ± 0.001  ns/op ; +7.2%
InterfaceCalls.test1stInt3Types   avgt   12  5.848 ± 0.012  ns/op ; +6.3%
InterfaceCalls.test1stInt5Types   avgt   12  5.842 ± 0.009  ns/op ; +6.4%
InterfaceCalls.test2ndInt2Types   avgt   12  2.213 ± 0.001  ns/op ;
InterfaceCalls.test2ndInt3Types   avgt   12  6.548 ± 0.002  ns/op ; +15.9%
InterfaceCalls.test2ndInt5Types   avgt   12  6.549 ± 0.003  ns/op ; +15.9%
InterfaceCalls.testIfaceCall      avgt   12  5.872 ± 0.007  ns/op ; +6.2%
InterfaceCalls.testIfaceExtCall   avgt   12  6.589 ± 0.008  ns/op ; +27.3% (high noise)
InterfaceCalls.testMonomorphic    avgt   12  1.035 ± 0.001  ns/op ;

------------- PR Comment: https://git.openjdk.org/jdk/pull/13460#issuecomment-1577320318 From cslucas at openjdk.org Mon Jun 5 19:30:07 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 5 Jun 2023 19:30:07 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v14] In-Reply-To: References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: On Mon, 5 Jun 2023 18:05:47 GMT, Vladimir Ivanov wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: >> >> - Catching up with master branch. >> >> Merge remote-tracking branch 'origin/master' into rematerialization-of-merges >> - Address PR review 6: refactoring around rematerialization & improve test cases. >> - Address PR review 5: refactor on rematerialization & add tests. >> - Merge remote-tracking branch 'origin/master' into rematerialization-of-merges >> - Address part of PR review 4 & fix a bug setting only_candidate >> - Catching up with master >> >> Merge remote-tracking branch 'origin/master' into rematerialization-of-merges >> - Fix tests. Remember previous reducible Phis. >> - Address PR review 3. Some comments and be able to abort compilation. 
>> - Merge with Master >> - Addressing PR review 2: refactor & reuse MacroExpand::scalar_replacement method. >> - ... and 5 more: https://git.openjdk.org/jdk/compare/46c4da7f...8f81a7c8 > > src/hotspot/share/code/debugInfo.cpp line 301: > >> 299: void ObjectMergeValue::print_detailed(outputStream* st) const { >> 300: st->print("merge: ID=%d", _id); >> 301: #ifndef PRODUCT > > Can you post a sample of the output, please? > > Why is it limited to non-product builds? It's valuable irrespective of build flavor. > > As I see in `ObjectValue::print_on` and `ScopeDesc::print_on`, you mix `print_on` with `print_fields_on`. Any particular reason for that? You could add `is_object_merge` case in ObjectValue::print_on` instead and extend `ObjectValue::print_fields_on` to cover `ObjectMergeValue` case. I find it hard to reason about `ObjectValue::print_on` vs `ObjectMergeValue::print_on` since it's a non-virtual method. > > > > Also, formatting is broken. I added a few samples below and there are a few more here: https://gist.github.com/JohnTortugo/913523947e08157def6cfebafa7d5daa Sample 1: Compiled method (c2) 415 24 TestTrapAfterMerge::test (57 bytes) total in heap [0x00007f7b4d03da90,0x00007f7b4d03de18] = 904 relocation [0x00007f7b4d03dc00,0x00007f7b4d03dc18] = 24 main code [0x00007f7b4d03dc20,0x00007f7b4d03dcb8] = 152 stub code [0x00007f7b4d03dcb8,0x00007f7b4d03dcd0] = 24 oops [0x00007f7b4d03dcd0,0x00007f7b4d03dce0] = 16 metadata [0x00007f7b4d03dce0,0x00007f7b4d03dce8] = 8 scopes data [0x00007f7b4d03dce8,0x00007f7b4d03dd50] = 104 scopes pcs [0x00007f7b4d03dd50,0x00007f7b4d03de10] = 192 dependencies [0x00007f7b4d03de10,0x00007f7b4d03de18] = 8 scopes: ScopeDesc(pc=0x00007f7b4d03dc3a offset=1a): TestTrapAfterMerge::test at -1 (line 3) ScopeDesc(pc=0x00007f7b4d03dc41 offset=21): TestTrapAfterMerge::test at 11 (line 5) ScopeDesc(pc=0x00007f7b4d03dc44 offset=24): TestTrapAfterMerge::test at 51 (line 12) ScopeDesc(pc=0x00007f7b4d03dc4a offset=2a): TestTrapAfterMerge::test at 46 
(line 8) ScopeDesc(pc=0x00007f7b4d03dc52 offset=32): TestTrapAfterMerge::test at 37 (line 9) ScopeDesc(pc=0x00007f7b4d03dc57 offset=37): TestTrapAfterMerge::test at 43 (line 8) ScopeDesc(pc=0x00007f7b4d03dc61 offset=41): TestTrapAfterMerge::test at 46 (line 8) reexecute=true Locals - l0: empty - l1: empty - l2: reg rbx [6],int - l3: empty - l4: merge: ID=26 - l5: reg r11 [22],int Objects - 0: merge: ID=26, selector="reg r10 [20],int", merge_pointer="nullptr", candidate objs=[27, 28] - 1: obj: ID=27, is_root=0, N.Fields=1, klass: Point Fields: reg r8 [16],int - 2: obj: ID=28, is_root=0, N.Fields=1, klass: Point Fields: reg rcx [2],int ScopeDesc(pc=0x00007f7b4d03dc63 offset=43): TestTrapAfterMerge::test at 46 (line 8) ScopeDesc(pc=0x00007f7b4d03dc6c offset=4c): TestTrapAfterMerge::test at 34 (line 8) ScopeDesc(pc=0x00007f7b4d03dc71 offset=51): TestTrapAfterMerge::test at 55 (line 12) - Sample2: Compiled method (c2) 443 24 TestManys::test (41 bytes) total in heap [0x00007f35e9155b90,0x00007f35e9155e78] = 744 relocation [0x00007f35e9155d00,0x00007f35e9155d18] = 24 main code [0x00007f35e9155d20,0x00007f35e9155d88] = 104 stub code [0x00007f35e9155d88,0x00007f35e9155da0] = 24 oops [0x00007f35e9155da0,0x00007f35e9155db0] = 16 metadata [0x00007f35e9155db0,0x00007f35e9155db8] = 8 scopes data [0x00007f35e9155db8,0x00007f35e9155e10] = 88 scopes pcs [0x00007f35e9155e10,0x00007f35e9155e70] = 96 dependencies [0x00007f35e9155e70,0x00007f35e9155e78] = 8 scopes: ScopeDesc(pc=0x00007f35e9155d3a offset=1a): TestManys::test at -1 (line 57) ScopeDesc(pc=0x00007f35e9155d42 offset=22): TestManys::test at 11 (line 59) ScopeDesc(pc=0x00007f35e9155d58 offset=38): TestManys::test at 25 (line 63) Locals - l0: empty - l1: empty - l2: empty - l3: empty - l4: empty - l5: empty - l6: empty - l7: empty - l8: merge: ID=26 Objects - 0: merge: ID=26, selector="reg rbp [10],int", merge_pointer="nullptr", candidate objs=[27, 28] - 1: obj: ID=27, is_root=0, N.Fields=4, klass: Point Fields: stack[36], 
stack[36], 0, 0 - 2: obj: ID=28, is_root=0, N.Fields=4, klass: Point Fields: 2023, 0, 0, 0 ScopeDesc(pc=0x00007f35e9155d74 offset=54): TestManys::test at 25 (line 63) - Sample3: Compiled method (c2) 436 24 TestMultiSFO::test (48 bytes) total in heap [0x00007f1df5155590,0x00007f1df5155850] = 704 relocation [0x00007f1df5155700,0x00007f1df5155718] = 24 main code [0x00007f1df5155720,0x00007f1df5155788] = 104 stub code [0x00007f1df5155788,0x00007f1df51557a0] = 24 oops [0x00007f1df51557a0,0x00007f1df51557b0] = 16 metadata [0x00007f1df51557b0,0x00007f1df51557b8] = 8 scopes data [0x00007f1df51557b8,0x00007f1df51557f8] = 64 scopes pcs [0x00007f1df51557f8,0x00007f1df5155848] = 80 dependencies [0x00007f1df5155848,0x00007f1df5155850] = 8 scopes: ScopeDesc(pc=0x00007f1df515573a offset=1a): TestMultiSFO::test at -1 (line 12) ScopeDesc(pc=0x00007f1df515575c offset=3c): TestMultiSFO::test at 28 (line 19) Locals - l0: empty - l1: empty - l2: empty - l3: merge: ID=14 - l4: obj: ID=15, is_root=1, N.Fields=2, klass: TestMultiSFO$Point Fields: stack[12], stack[8] Objects - 0: merge: ID=14, selector="reg rbp [10],int", merge_pointer="nullptr", candidate objs=[15, 16] - 1: obj: ID=15, is_root=1, N.Fields=2, klass: TestMultiSFO$Point Fields: stack[12], stack[8] - 2: obj: ID=16, is_root=0, N.Fields=2, klass: TestMultiSFO$Point Fields: stack[8], stack[12] ScopeDesc(pc=0x00007f1df5155778 offset=58): TestMultiSFO::test at 28 (line 19) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12897#discussion_r1218500009 From alanb at openjdk.org Mon Jun 5 19:46:53 2023 From: alanb at openjdk.org (Alan Bateman) Date: Mon, 5 Jun 2023 19:46:53 GMT Subject: RFR: 8309408: Thread.sleep cleanup In-Reply-To: References: Message-ID: On Mon, 5 Jun 2023 17:25:43 GMT, Aleksey Shipilev wrote: > I think we need to delay this until [JDK-8309361](https://bugs.openjdk.org/browse/JDK-8309361) is resolved, in case we would like to revert [JDK-8305092](https://bugs.openjdk.org/browse/JDK-8305092). 
Okay, I won't integrate this until we see what the issue is. That said, if the sub-millis support needs to be reverted then I think we should keep the interface as nanos on the boundary between the VM and the libraries. The reason is that it's mostly nanos at the Java level now, meaning 2 out of the 3 sleep methods support sub-millis, and the virtual thread sleep and the JFR event are in nanos too. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14303#issuecomment-1577374410 From cslucas at openjdk.org Mon Jun 5 19:55:09 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 5 Jun 2023 19:55:09 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v14] In-Reply-To: References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: On Mon, 5 Jun 2023 19:26:59 GMT, Cesar Soares Lucas wrote: >> src/hotspot/share/code/debugInfo.cpp line 301: >> >>> 299: void ObjectMergeValue::print_detailed(outputStream* st) const { >>> 300: st->print("merge: ID=%d", _id); >>> 301: #ifndef PRODUCT >> >> Can you post a sample of the output, please? >> >> Why is it limited to non-product builds? It's valuable irrespective of build flavor. >> >> As I see in `ObjectValue::print_on` and `ScopeDesc::print_on`, you mix `print_on` with `print_fields_on`. Any particular reason for that? You could add `is_object_merge` case in `ObjectValue::print_on` instead and extend `ObjectValue::print_fields_on` to cover `ObjectMergeValue` case. I find it hard to reason about `ObjectValue::print_on` vs `ObjectMergeValue::print_on` since it's a non-virtual method. >> >> >> >> Also, formatting is broken. 
> > I added a few samples below and there are a few more here: https://gist.github.com/JohnTortugo/913523947e08157def6cfebafa7d5daa > > Sample 1: > > > Compiled method (c2) 415 24 TestTrapAfterMerge::test (57 bytes) > total in heap [0x00007f7b4d03da90,0x00007f7b4d03de18] = 904 > relocation [0x00007f7b4d03dc00,0x00007f7b4d03dc18] = 24 > main code [0x00007f7b4d03dc20,0x00007f7b4d03dcb8] = 152 > stub code [0x00007f7b4d03dcb8,0x00007f7b4d03dcd0] = 24 > oops [0x00007f7b4d03dcd0,0x00007f7b4d03dce0] = 16 > metadata [0x00007f7b4d03dce0,0x00007f7b4d03dce8] = 8 > scopes data [0x00007f7b4d03dce8,0x00007f7b4d03dd50] = 104 > scopes pcs [0x00007f7b4d03dd50,0x00007f7b4d03de10] = 192 > dependencies [0x00007f7b4d03de10,0x00007f7b4d03de18] = 8 > scopes: > ScopeDesc(pc=0x00007f7b4d03dc3a offset=1a): > TestTrapAfterMerge::test at -1 (line 3) > ScopeDesc(pc=0x00007f7b4d03dc41 offset=21): > TestTrapAfterMerge::test at 11 (line 5) > ScopeDesc(pc=0x00007f7b4d03dc44 offset=24): > TestTrapAfterMerge::test at 51 (line 12) > ScopeDesc(pc=0x00007f7b4d03dc4a offset=2a): > TestTrapAfterMerge::test at 46 (line 8) > ScopeDesc(pc=0x00007f7b4d03dc52 offset=32): > TestTrapAfterMerge::test at 37 (line 9) > ScopeDesc(pc=0x00007f7b4d03dc57 offset=37): > TestTrapAfterMerge::test at 43 (line 8) > ScopeDesc(pc=0x00007f7b4d03dc61 offset=41): > TestTrapAfterMerge::test at 46 (line 8) reexecute=true > Locals > - l0: empty > - l1: empty > - l2: reg rbx [6],int > - l3: empty > - l4: merge: ID=26 > - l5: reg r11 [22],int > Objects > - 0: merge: ID=26, selector="reg r10 [20],int", merge_pointer="nullptr", candidate objs=[27, 28] > - 1: obj: ID=27, is_root=0, N.Fields=1, klass: Point > Fields: reg r8 [16],int > - 2: obj: ID=28, is_root=0, N.Fields=1, klass: Point > Fields: reg rcx [2],int > ScopeDesc(pc=0x00007f7b4d03dc63 offset=43): > TestTrapAfterMerge::test at 46 (line 8) > ScopeDesc(pc=0x00007f7b4d03dc6c offset=4c): > TestTrapAfterMerge::test at 34 (line 8) > ScopeDesc(pc=0x00007f7b4d03dc71 offset=51): > 
TestTrapAfterMerge::test at 55 (line 12) > > > - Sample2: > > > Compiled method (c2) 443 24 TestManys::test (41 bytes) > total in heap [0x00007f35e9155b90,0x00007f35e9155e78] = 744 > relocation [0x00007f35e9155d00,0x00007f35e9155d18] = 24 > main code [0x00007f35e9155d20,0x00007f35e9155d88] = 104 > stub code [0x00007f35e9155d88,0x00007f35e9155da0] = 24 > oops [0x00007f35e9155da0,0x00007f35e9155db0] =... > Why is it limited to non-product builds? It's valuable irrespective of build flavor. This is because `print_on` in `AnyObj` is only defined in non-product builds. I based the implementation of `ObjectMergeValue::print_on` on `ObjectValue::print_on`. In `ObjectValue::print_on` fields aren't printed in product builds. > Any particular reason for that? You could add `is_object_merge` case in `ObjectValue::print_on` instead and extend `ObjectValue::print_fields_on` to cover `ObjectMergeValue` case. I'll do that then. > Also, formatting is broken. Can you please share an example? If you mean the tabs on lines 303/304/306/307 I added those because I thought they would make the code easier to read, but if you want I can definitely remove that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12897#discussion_r1218523643 From cslucas at openjdk.org Mon Jun 5 19:55:14 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 5 Jun 2023 19:55:14 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v14] In-Reply-To: References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: On Mon, 5 Jun 2023 05:10:13 GMT, Vladimir Ivanov wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: >> >> - Catching up with master branch. 
>> >> Merge remote-tracking branch 'origin/master' into rematerialization-of-merges >> - Address PR review 6: refactoring around rematerialization & improve test cases. >> - Address PR review 5: refactor on rematerialization & add tests. >> - Merge remote-tracking branch 'origin/master' into rematerialization-of-merges >> - Address part of PR review 4 & fix a bug setting only_candidate >> - Catching up with master >> >> Merge remote-tracking branch 'origin/master' into rematerialization-of-merges >> - Fix tests. Remember previous reducible Phis. >> - Address PR review 3. Some comments and be able to abort compilation. >> - Merge with Master >> - Addressing PR review 2: refactor & reuse MacroExpand::scalar_replacement method. >> - ... and 5 more: https://git.openjdk.org/jdk/compare/46c4da7f...8f81a7c8 > > src/hotspot/share/opto/compile.cpp line 2332: > >> 2330: } >> 2331: >> 2332: NOT_PRODUCT(ConnectionGraph::verify_ram_nodes(this, root());) > > Why do you limit the check to non-product builds only? It won't fail the compilation with product builds. Duh. I'll fix that. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12897#discussion_r1218525026 From cjplummer at openjdk.org Mon Jun 5 20:10:54 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 5 Jun 2023 20:10:54 GMT Subject: RFR: 8309408: Thread.sleep cleanup In-Reply-To: References: Message-ID: On Mon, 5 Jun 2023 15:10:18 GMT, Alan Bateman wrote: >> test/hotspot/jtreg/vmTestbase/nsk/monitoring/share/ThreadController.java line 660: >> >>> 658: expectedMethods.add(Thread.class.getName() + ".sleep"); >>> 659: expectedMethods.add(Thread.class.getName() + ".sleepNanos"); >>> 660: expectedMethods.add(Thread.class.getName() + ".sleepNanos0"); >> >> I'm surprised this test doesn't list `beforeSleep` and `afterSleep`. > >> There is one potential, pre-existing, test omission noted below. >> I'm surprised this test doesn't list `beforeSleep` and `afterSleep`. 
> > The monitoring/stress/thread tests will fail if they observe an unexpected method name in the stack trace. I don't think it can happen because the tests poll the thread state and for SleepingThread, it will sample the stack trace when the thread state is timed-wait. The beforeSleep/afterSleep methods won't be in the stack trace when sleeping. It would be harmless to add them in that they aren't going to cause these tests to fail but might help with any further changes. The following commit in loom heavily modified this file with a lot of added expected methods. There are other related tests with similar changes. I'm not so sure I understand the need for so many additions, and also why expectedLength is so out of sync with the number of added methods. I don't believe this commit was reviewed individually, but was just part of the overall loom review when merged into jdk. Perhaps it should be revisited. https://github.com/openjdk/loom/commit/26e66bc1a6a0dd735c8138a696809caba3e82b26 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14303#discussion_r1218539693 From mchung at openjdk.org Mon Jun 5 20:11:56 2023 From: mchung at openjdk.org (Mandy Chung) Date: Mon, 5 Jun 2023 20:11:56 GMT Subject: RFR: 8306647: Implementation of Structured Concurrency (Preview) [v4] In-Reply-To: References: <6gZZEoP1WXdBcZUiL5890eNsgaRFzZNY_rBItZdXtNc=.5d8f7bd9-44d5-4074-8a5c-35f8203263b2@github.com> Message-ID: On Thu, 1 Jun 2023 13:43:33 GMT, Alan Bateman wrote: >> This is the implementation of: >> >> - JEP 453: Structured Concurrency (Preview) >> - JEP 446: Scoped Values (Preview) >> >> For the most part, this is just moving code and tests. StructuredTaskScope moves to j.u.concurrent as a preview API, ScopedValue moves to j.lang as a preview API, and module jdk.incubator.concurrent has been removed. 
The significant API changes since incubator are: >> >> - StructuredTaskScope.fork returns Subtask instead of Future (JEP 453 has a section on this) >> - ScopedValue.where methods are replaced with runWhere, callWhere and getWhere > > Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: > > - Sync up from loom repo > - Merge > - Sync with loom repo, re-work ScopedValue class description > - Sync up from loom repo > - Remove csm.Threads > - Merge > - Test should not be in update for main line > - Sync with loom repo > - Sync up tests frmo loom repo > - Sync up with loom repo > - ... and 5 more: https://git.openjdk.org/jdk/compare/a46b5acc...cc902ce6 I reviewed the implementation changes to promote an incubating API to a preview API. That part looks good. ------------- Marked as reviewed by mchung (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13932#pullrequestreview-1463317525 From kdnilsen at openjdk.org Mon Jun 5 20:13:11 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 5 Jun 2023 20:13:11 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: References: Message-ID: On Sun, 4 Jun 2023 21:39:58 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. 
After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. 
> > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Remove three asserts making comparisons between atomic volatile variables > > Though changes to the volatile variables are individually protected by > Atomic load and store operations, these asserts were not assuring > atomic access to multiple volatile variables, each of which could be > modified independently of the others. The asserts were therefore not > trustworthy, as has been confirmed by more extensive testing. src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp line 733: > 731: } else { > 732: iu_barrier(masm, val, tmp3); > 733: // TODO: store_check missing in upstream Remove this comment with integration. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1218543687 From vlivanov at openjdk.org Mon Jun 5 20:31:02 2023 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Mon, 5 Jun 2023 20:31:02 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v14] In-Reply-To: References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: On Mon, 5 Jun 2023 19:50:25 GMT, Cesar Soares Lucas wrote: > If you mean the tabs on lines 303/304/306/307 Yes, it confused me. 
As an alternative, you could put selector and merge_pointer-related statements on the same line, but I'm not sure how much it improves readability: st->print(", selector=\""); _selector->print_on(st); st->print("\""); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12897#discussion_r1218558213 From vlivanov at openjdk.org Mon Jun 5 20:31:03 2023 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Mon, 5 Jun 2023 20:31:03 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v14] In-Reply-To: References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: <4gr0ARilcuMl1Zfht5_7qYOd-OouT_2rIa8SgQuQWDw=.b55a2bc3-def0-4e29-bfb2-cc940d3493fb@github.com> On Mon, 5 Jun 2023 20:27:42 GMT, Vladimir Ivanov wrote: >>> Why is it limited to non-product builds? It's valuable irrespective of build flavor. >> >> This is because `print_on` in `AnyObj` is only defined in non-product builds. I based the implementation of `ObjectMergeValue::print_on` on `ObjectValue::print_on`. In `ObjectValue::print_on` fields aren't printed in product builds. >> >>> Any particular reason for that? You could add `is_object_merge` case in `ObjectValue::print_on` instead and extend `ObjectValue::print_fields_on` to cover `ObjectMergeValue` case. >> >> I'll do that then. >> >>> Also, formatting is broken. >> >> Can you please share an example? If you mean the tabs on lines 303/304/306/307 I added those because I thought they would make the code easier to read, but if you want I can definitely remove that. > >> If you mean the tabs on lines 303/304/306/307 > > Yes, it confused me. 
As an alternative, you could put selector and merge_pointer-related statements on the same line, but I'm not sure how much it improves readability: > > st->print(", selector=\""); _selector->print_on(st); st->print("\""); A couple of suggestions about the output: * `merge`: it's clearer to call it `merge_obj` * `obj` vs `merge` output: obj output is duplicated in ScopeDesc entries and Objects sections; before it was a short version printed in Locals/Expressions and all the details were included in Objects; I like to see field locations in the short version, but including everything looks way too much IMO; * it makes sense to include selector and merge_pointer info in short version, but `is_root` can be omitted ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12897#discussion_r1218558295 From cjplummer at openjdk.org Mon Jun 5 20:33:52 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 5 Jun 2023 20:33:52 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v2] In-Reply-To: <5-_kURDcYKd5WYKi9B331c8h6okVmGfvjdy_xqi1UqU=.012c360b-1b57-4773-b64b-d727dbf4daeb@github.com> References: <5-_kURDcYKd5WYKi9B331c8h6okVmGfvjdy_xqi1UqU=.012c360b-1b57-4773-b64b-d727dbf4daeb@github.com> Message-ID: On Mon, 5 Jun 2023 19:00:49 GMT, Serguei Spitsyn wrote: >> When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based on the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY. >> The fix adds: >> - a special case for passive carrier threads >> - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. 
>> >> Testing: >> - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` >> - submitted mach5 tiers 1-5 >> - TBD: to submit mach5 tier 6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge > - minor tweaks in libThreadStateTest.cpp > - 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING Without the fix in place, do your new tests reproduce the issue? I'm trying to recall the origins of the filing of this CR. I thought I had noticed this issue while working with a JDI test and discussed it with you and Alan. Just wondering if there is something that can be done to a JDI test to also test for this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14298#issuecomment-1577432927 From dnsimon at openjdk.org Mon Jun 5 20:51:07 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 5 Jun 2023 20:51:07 GMT Subject: RFR: 8309136: [JVMCI] add -XX:+UseGraalJIT flag [v4] In-Reply-To: References: Message-ID: On Fri, 2 Jun 2023 16:20:49 GMT, Doug Simon wrote: >> Use of the Graal-based JIT in OpenJDK currently requires the following flag: `-XX:+EnableJVMCIProduct` >> >> This has no direct association with Graal. If the JDK image happens to include a non-Graal JVMCI implementation, it will be automatically selected. This would come as a surprise to users who equate JVMCI with Graal. >> >> This PR introduces a new flag, `-XX:+UseGraalJIT` to address these shortcomings. It is an alias for `-XX:+EnableJVMCIProduct -Djvmci.Compiler=graal`. >> >> When `-XX:+UseGraalJIT` is specified, the VM fails fast at startup if there is a non-Graal JVMCI implementation or no JVMCI implementation in the JDK image. > > Doug Simon has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains eight additional commits since the last revision: > > - [skip-ci] Merge remote-tracking branch 'openjdk-jdk/master' into JDK-8309136 > - improve error message when UseGraalJIT is used without -XX:+UnlockExperimentalVMOptions > - use strncmp instead of strcmp > - fix date in copyright header > - set UseGraalJIT value in enable_jvmci_product_mode > - added missing test of UseJVMCICompiler when adjusting JVMCI flags under -Xint > - review based fixes > - add UseGraalJIT VM flag Thanks to all the reviewers. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14231#issuecomment-1577450756 From dnsimon at openjdk.org Mon Jun 5 20:51:08 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Mon, 5 Jun 2023 20:51:08 GMT Subject: Integrated: 8309136: [JVMCI] add -XX:+UseGraalJIT flag In-Reply-To: References: Message-ID: <3y3VWQ1UhRbnC7IdIbvXgHsuBxuo6aswfCrliAKax98=.c6d5f42d-5873-4802-85c7-2a755d19c615@github.com> On Tue, 30 May 2023 22:31:13 GMT, Doug Simon wrote: > Use of the Graal-based JIT in OpenJDK currently requires the following flag: `-XX:+EnableJVMCIProduct` > > This has no direct association with Graal. If the JDK image happens to include a non-Graal JVMCI implementation, it will be automatically selected. This would come as a surprise to users who equate JVMCI with Graal. > > This PR introduces a new flag, `-XX:+UseGraalJIT` to address these shortcomings. It is an alias for `-XX:+EnableJVMCIProduct -Djvmci.Compiler=graal`. > > When `-XX:+UseGraalJIT` is specified, the VM fails fast at startup if there is a non-Graal JVMCI implementation or no JVMCI implementation in the JDK image. This pull request has now been integrated. 
Changeset: b3c9d678 Author: Doug Simon URL: https://git.openjdk.org/jdk/commit/b3c9d6785e061faf5ea9574bed2f9ab73cc11eaf Stats: 66 lines in 5 files changed: 43 ins; 0 del; 23 mod 8309136: [JVMCI] add -XX:+UseGraalJIT flag Reviewed-by: dholmes, kvn ------------- PR: https://git.openjdk.org/jdk/pull/14231 From sspitsyn at openjdk.org Mon Jun 5 21:08:57 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 5 Jun 2023 21:08:57 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v2] In-Reply-To: <5-_kURDcYKd5WYKi9B331c8h6okVmGfvjdy_xqi1UqU=.012c360b-1b57-4773-b64b-d727dbf4daeb@github.com> References: <5-_kURDcYKd5WYKi9B331c8h6okVmGfvjdy_xqi1UqU=.012c360b-1b57-4773-b64b-d727dbf4daeb@github.com> Message-ID: On Mon, 5 Jun 2023 19:00:49 GMT, Serguei Spitsyn wrote: >> When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based on the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY. >> The fix adds: >> - a special case for passive carrier threads >> - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. >> >> Testing: >> - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` >> - submitted mach5 tiers 1-5 >> - TBD: to submit mach5 tier 6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: > > - Merge > - minor tweaks in libThreadStateTest.cpp > - 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING > ------------- PR Comment: https://git.openjdk.org/jdk/pull/14298#issuecomment-1577475470 From cslucas at openjdk.org Mon Jun 5 21:13:06 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 5 Jun 2023 21:13:06 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v14] In-Reply-To: <4gr0ARilcuMl1Zfht5_7qYOd-OouT_2rIa8SgQuQWDw=.b55a2bc3-def0-4e29-bfb2-cc940d3493fb@github.com> References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> <4gr0ARilcuMl1Zfht5_7qYOd-OouT_2rIa8SgQuQWDw=.b55a2bc3-def0-4e29-bfb2-cc940d3493fb@github.com> Message-ID: <3bgmER7fyi8uvkp58Fwr5s4XHT0BWOoED49EVDTRSDI=.a5839cc3-c2b2-4a94-a097-f748a3cf0a29@github.com> On Mon, 5 Jun 2023 20:27:48 GMT, Vladimir Ivanov wrote: >>> If you mean the tabs on lines 303/304/306/307 >> >> Yes, it confused me. As an alternative, you could put selector and merge_pointer-related statements on the same line, but I'm not sure how much it improves readability: >> >> st->print(", selector=\""); _selector->print_on(st); st->print("\""); > > A couple of suggestions about the output: > * `merge`: it's clearer to call it `merge_obj` > * `obj` vs `merge` output: obj output is duplicated in ScopeDesc entries and Objects sections; before it was a short version printed in Locals/Expressions and all the details were included in Objects; I like to see field locations in the short version, but including everything looks way too much IMO; > * it makes sense to include selector and merge_pointer info in short version, but `is_root` can be omitted Thanks @iwanowww. Does the output below look good to you? 
It prints ObjectValue in the same format as it was before this PR and only prints details of the merge in the "Objects" section. Is there any other output section that you think needs to be adjusted? Compiled method (c2) 436 24 TestMultiSFO::test (48 bytes) total in heap [0x00007f1df5155590,0x00007f1df5155850] = 704 relocation [0x00007f1df5155700,0x00007f1df5155718] = 24 main code [0x00007f1df5155720,0x00007f1df5155788] = 104 stub code [0x00007f1df5155788,0x00007f1df51557a0] = 24 oops [0x00007f1df51557a0,0x00007f1df51557b0] = 16 metadata [0x00007f1df51557b0,0x00007f1df51557b8] = 8 scopes data [0x00007f1df51557b8,0x00007f1df51557f8] = 64 scopes pcs [0x00007f1df51557f8,0x00007f1df5155848] = 80 dependencies [0x00007f1df5155848,0x00007f1df5155850] = 8 scopes: ScopeDesc(pc=0x00007f1df515573a offset=1a): TestMultiSFO::test at -1 (line 12) ScopeDesc(pc=0x00007f1df515575c offset=3c): TestMultiSFO::test at 28 (line 19) Locals - l0: empty - l1: empty - l2: empty - l3: merge_obj[14] - l4: obj[15] Objects - merge_obj[14], selector="reg rbp [10],int", merge_pointer="nullptr", candidate_objs=[15, 16] - obj[15], is_root=1, klass: TestMultiSFO$Point Fields: stack[12], stack[8] - obj[16], is_root=0, klass: TestMultiSFO$Point Fields: stack[8], stack[12] ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12897#discussion_r1218596666 From sspitsyn at openjdk.org Mon Jun 5 21:29:54 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 5 Jun 2023 21:29:54 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v2] In-Reply-To: References: Message-ID: On Sun, 4 Jun 2023 11:14:06 GMT, Alan Bateman wrote: >> The lines 763-764 are to correct the state exactly for passive carrier thread, a carrier thread which can't progress until the execution control has not been returned from a virtual thread executed on the top. It is never for a platform thread which is not a carrier thread. 
"Passive" is the best word I was able to find for this meaning. Do you have any other word/suggestion in mind? > >> The lines 763-764 are to correct the state exactly for passive carrier thread, a carrier thread which can't progress until the execution control has not been returned from a virtual thread executed on the top. It is never for a platform thread which is not a carrier thread. "Passive" is the best word I was able to find for this meaning. Do you have any other word/suggestion in mind? > > It's just a carrier. A platform thread becomes a carrier when a virtual thread is mounted, it ceases to be a carrier once the virtual thread is unmounted. The mental model is that the carrier is blocked so reporting its state as waiting indefinitely is correct. Maybe you don't want to rename it in this PR but renaming this function to something like is_carrying would convey that it's asking the question if a given JavaThread is carrying the given virtual thread oop. Okay, I see your point. Unfortunately, I've always referred to the platform thread with an executed FJP scheduler as a carrier thread. The term 'carrier' with this meaning is everywhere in the JVMTI code. It looks very confusing to call a thread a carrier thread only during some phases of its execution. 
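For readers following this thread, the reporting rule under discussion can be sketched roughly as below. This is an illustration only and not the code in the PR: the four constants are local copies of the JVMTI thread state bit values, while `ThreadInfo`, `has_mounted_vthread` and `reported_state` are hypothetical names, and whether the real fix keeps additional bits (for example the alive bit) is not stated in this thread.

```cpp
#include <cstdint>

// Local copies of JVMTI thread state bits (values as listed in the JVMTI spec).
constexpr uint32_t STATE_ALIVE                = 0x0001;
constexpr uint32_t STATE_RUNNABLE             = 0x0004;
constexpr uint32_t STATE_WAITING_INDEFINITELY = 0x0010;
constexpr uint32_t STATE_WAITING              = 0x0080;

// Hypothetical model of a platform thread as seen by GetThreadState.
struct ThreadInfo {
  uint32_t raw_state;        // state derived from the JavaThread status
  bool has_mounted_vthread;  // true while a virtual thread is mounted on it
};

// While a virtual thread is mounted, the carrier is reported as waiting
// indefinitely, as the PR description asks; otherwise the state derived
// from the JavaThread status is reported unchanged.
uint32_t reported_state(const ThreadInfo& t) {
  if (t.has_mounted_vthread) {
    return STATE_WAITING | STATE_WAITING_INDEFINITELY;
  }
  return t.raw_state;
}
```

In this model the "carrier" property is just the `has_mounted_vthread` flag, which matches the view that a platform thread is a carrier only while something is mounted on it.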
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1218613536 From vlivanov at openjdk.org Mon Jun 5 22:08:02 2023 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Mon, 5 Jun 2023 22:08:02 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v14] In-Reply-To: <3bgmER7fyi8uvkp58Fwr5s4XHT0BWOoED49EVDTRSDI=.a5839cc3-c2b2-4a94-a097-f748a3cf0a29@github.com> References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> <4gr0ARilcuMl1Zfht5_7qYOd-OouT_2rIa8SgQuQWDw=.b55a2bc3-def0-4e29-bfb2-cc940d3493fb@github.com> <3bgmER7fyi8uvkp58Fwr5s4XHT0BWOoED49EVDTRSDI=.a5839cc3-c2b2-4a94-a097-f748a3cf0a29@github.com> Message-ID: <4CuSp8KR3SDGjc88Pd57VcwsBdjG5_FUT94U8XkoM0s=.e61ce2a5-b9b4-40aa-9399-62b3ac275634@github.com> On Mon, 5 Jun 2023 21:10:22 GMT, Cesar Soares Lucas wrote: >> A couple of suggestions about the output: >> * `merge`: it's clearer to call it `merge_obj` >> * `obj` vs `merge` output: obj output is duplicated in ScopeDesc entries and Objects sections; before it was a short version printed in Locals/Expressions and all the details were included in Objects; I like to see field locations in the short version, but including everything looks way too much IMO; >> * it makes sense to include selector and merge_pointer info in short version, but `is_root` can be omitted > > Thanks @iwanowww . Does the output below look good to you? It prints ObjectValue in the same format as it was before this PR and only print details of the merge in the "Objects" section. Is there other output section that you think needs to be adjusted? 
> > > Compiled method (c2) 436 24 TestMultiSFO::test (48 bytes) > total in heap [0x00007f1df5155590,0x00007f1df5155850] = 704 > relocation [0x00007f1df5155700,0x00007f1df5155718] = 24 > main code [0x00007f1df5155720,0x00007f1df5155788] = 104 > stub code [0x00007f1df5155788,0x00007f1df51557a0] = 24 > oops [0x00007f1df51557a0,0x00007f1df51557b0] = 16 > metadata [0x00007f1df51557b0,0x00007f1df51557b8] = 8 > scopes data [0x00007f1df51557b8,0x00007f1df51557f8] = 64 > scopes pcs [0x00007f1df51557f8,0x00007f1df5155848] = 80 > dependencies [0x00007f1df5155848,0x00007f1df5155850] = 8 > scopes: > ScopeDesc(pc=0x00007f1df515573a offset=1a): > TestMultiSFO::test at -1 (line 12) > ScopeDesc(pc=0x00007f1df515575c offset=3c): > TestMultiSFO::test at 28 (line 19) > Locals > - l0: empty > - l1: empty > - l2: empty > - l3: merge_obj[14] > - l4: obj[15] > > Objects > - merge_obj[14], selector="reg rbp [10],int", merge_pointer="nullptr", candidate_objs=[15, 16] > - obj[15], is_root=1, klass: TestMultiSFO$Point > Fields: stack[12], stack[8] > - obj[16], is_root=0, klass: TestMultiSFO$Point > Fields: stack[8], stack[12] Thanks, it looks much better now (except the position in Objects array is missing). It makes sense to mention `is_root` for merge_obj case even though it always equals `1`. Also, make merge_pointer optional and omit it when its value is null. 
BTW instead of printing `is_root=0/1`, you can introduce a more compact notation and mark relevant lines with a single symbol: - 0: R merge_obj[14], selector="reg rbp [10],int" candidates=[15, 16] - 1: R obj[15], klass: TestMultiSFO$Point Fields: stack[12], stack[8] - 2: obj[16], klass: TestMultiSFO$Point Fields: stack[8], stack[12] ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12897#discussion_r1218642386 From cslucas at openjdk.org Mon Jun 5 22:49:04 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 5 Jun 2023 22:49:04 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v14] In-Reply-To: References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: On Mon, 5 Jun 2023 05:05:26 GMT, Vladimir Ivanov wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 15 commits: >> >> - Catching up with master branch. >> >> Merge remote-tracking branch 'origin/master' into rematerialization-of-merges >> - Address PR review 6: refactoring around rematerialization & improve test cases. >> - Address PR review 5: refactor on rematerialization & add tests. >> - Merge remote-tracking branch 'origin/master' into rematerialization-of-merges >> - Address part of PR review 4 & fix a bug setting only_candidate >> - Catching up with master >> >> Merge remote-tracking branch 'origin/master' into rematerialization-of-merges >> - Fix tests. Remember previous reducible Phis. >> - Address PR review 3. Some comments and be able to abort compilation. >> - Merge with Master >> - Addressing PR review 2: refactor & reuse MacroExpand::scalar_replacement method. >> - ... 
and 5 more: https://git.openjdk.org/jdk/compare/46c4da7f...8f81a7c8 > > src/hotspot/share/code/debugInfo.cpp line 251: > >> 249: // Set it to true so that the object will get rematerialized >> 250: if (!_selected->is_root()) { >> 251: _selected->set_root(true); > > Why do you need `_selected` to be marked as root? I think you're right, there is no need for that. I'll remove/refactor that and run tests again. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12897#discussion_r1218672363 From cslucas at openjdk.org Mon Jun 5 22:49:05 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 5 Jun 2023 22:49:05 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v14] In-Reply-To: <4CuSp8KR3SDGjc88Pd57VcwsBdjG5_FUT94U8XkoM0s=.e61ce2a5-b9b4-40aa-9399-62b3ac275634@github.com> References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> <4gr0ARilcuMl1Zfht5_7qYOd-OouT_2rIa8SgQuQWDw=.b55a2bc3-def0-4e29-bfb2-cc940d3493fb@github.com> <3bgmER7fyi8uvkp58Fwr5s4XHT0BWOoED49EVDTRSDI=.a5839cc3-c2b2-4a94-a097-f748a3cf0a29@github.com> <4CuSp8KR3SDGjc88Pd57VcwsBdjG5_FUT94U8XkoM0s=.e61ce2a5-b9b4-40aa-9399-62b3ac275634@github.com> Message-ID: On Mon, 5 Jun 2023 22:03:59 GMT, Vladimir Ivanov wrote: >> Thanks @iwanowww . Does the output below look good to you? It prints ObjectValue in the same format as it was before this PR and only print details of the merge in the "Objects" section. Is there other output section that you think needs to be adjusted? 
>> >> >> Compiled method (c2) 436 24 TestMultiSFO::test (48 bytes) >> total in heap [0x00007f1df5155590,0x00007f1df5155850] = 704 >> relocation [0x00007f1df5155700,0x00007f1df5155718] = 24 >> main code [0x00007f1df5155720,0x00007f1df5155788] = 104 >> stub code [0x00007f1df5155788,0x00007f1df51557a0] = 24 >> oops [0x00007f1df51557a0,0x00007f1df51557b0] = 16 >> metadata [0x00007f1df51557b0,0x00007f1df51557b8] = 8 >> scopes data [0x00007f1df51557b8,0x00007f1df51557f8] = 64 >> scopes pcs [0x00007f1df51557f8,0x00007f1df5155848] = 80 >> dependencies [0x00007f1df5155848,0x00007f1df5155850] = 8 >> scopes: >> ScopeDesc(pc=0x00007f1df515573a offset=1a): >> TestMultiSFO::test at -1 (line 12) >> ScopeDesc(pc=0x00007f1df515575c offset=3c): >> TestMultiSFO::test at 28 (line 19) >> Locals >> - l0: empty >> - l1: empty >> - l2: empty >> - l3: merge_obj[14] >> - l4: obj[15] >> >> Objects >> - merge_obj[14], selector="reg rbp [10],int", merge_pointer="nullptr", candidate_objs=[15, 16] >> - obj[15], is_root=1, klass: TestMultiSFO$Point >> Fields: stack[12], stack[8] >> - obj[16], is_root=0, klass: TestMultiSFO$Point >> Fields: stack[8], stack[12] > > Thanks, it looks much better now (except the position in Objects array is missing). > > It makes sense to mention `is_root` for merge_obj case even though it's always equals to '1`. Also, make merge_pointer optional and omit it when its value is null. > > BTW instead of printing `is_root=0/1`, you can introduce a more compact notation and mark relevant lines with a single symbol: > > - 0: R merge_obj[14], selector="reg rbp [10],int" candidates=[15, 16] > - 1: R obj[15], klass: TestMultiSFO$Point > Fields: stack[12], stack[8] > - 2: obj[16], klass: TestMultiSFO$Point > Fields: stack[8], stack[12] Sounds good. I'll make the changes and push them asap. 
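For illustration only (this is not code from the PR), the compact notation suggested above, with the position in the Objects array up front and a single `R` marker replacing `is_root=0/1`, could be produced along these lines. `ObjEntry` and `format_objects` are hypothetical names, and the exact spacing is an assumption since the quoted sample lost its original layout.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one entry of the Objects section.
struct ObjEntry {
  int id;              // object id, e.g. 14 in merge_obj[14]
  bool is_root;        // printed as a leading 'R' instead of is_root=0/1
  bool is_merge;       // selects the merge_obj[...] vs obj[...] label
  std::string detail;  // remaining text, e.g. selector or klass info
};

// Formats entries as " - <position>: <R or blank> <label>[id], <detail>".
std::string format_objects(const std::vector<ObjEntry>& objs) {
  std::ostringstream out;
  for (std::size_t i = 0; i < objs.size(); ++i) {
    const ObjEntry& o = objs[i];
    out << " - " << i << ": " << (o.is_root ? "R " : "  ")
        << (o.is_merge ? "merge_obj[" : "obj[") << o.id << "], "
        << o.detail << "\n";
  }
  return out.str();
}
```

Padding non-root lines with two spaces keeps the labels aligned, so the root marker can be scanned as a column.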
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/12897#discussion_r1218671994 From sgibbons at openjdk.org Mon Jun 5 23:48:21 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 5 Jun 2023 23:48:21 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v10] In-Reply-To: References: Message-ID: > Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem. > > Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). > > Old: > gcc-12.2.1-4.fc36.x86_64 > 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix > JVM version: 21-internal > Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 > Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 > Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 > Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 > Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 > Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 > Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 > Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 > Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 > Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 > New: > JVM version: 21-internal (float) > Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42 > Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 > Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17 > Iteration 3 regression case Took : 17 noMod case took: 3 noPower 
case took: 16 > Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 > Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 > Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16 > Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Fix tests; need vlbwdq for vpbroadcastq ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14224/files - new: https://git.openjdk.org/jdk/pull/14224/files/9b2c1db5..e77d0817 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14224&range=08-09 Stats: 43 lines in 4 files changed: 30 ins; 1 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/14224.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14224/head:pull/14224 PR: https://git.openjdk.org/jdk/pull/14224 From sgibbons at openjdk.org Mon Jun 5 23:48:46 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Mon, 5 Jun 2023 23:48:46 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v9] In-Reply-To: <9gyMwajVcShejHFe9dDwsiaGubd4z4x8jn67-q3YBQM=.4e27a912-e2e5-48ee-8a12-fdcb52dfdb61@github.com> References: <9gyMwajVcShejHFe9dDwsiaGubd4z4x8jn67-q3YBQM=.4e27a912-e2e5-48ee-8a12-fdcb52dfdb61@github.com> Message-ID: On Mon, 5 Jun 2023 18:36:29 GMT, Scott Gibbons wrote: >> Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem. 
>> >> Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). >> >> Old: >> gcc-12.2.1-4.fc36.x86_64 >> 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix >> JVM version: 21-internal >> Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 >> Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 >> Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 >> Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 >> Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 >> Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 >> Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 >> Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 >> Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 >> Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 >> New: >> JVM version: 21-internal (float) >> Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42 >> Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 >> Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17 >> Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 >> Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 >> Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 >> Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 > > Scott Gibbons has 
updated the pull request incrementally with one additional commit since the last revision: > > Fix test @vnkozlov I believe this is ready for integration now. Can I ask you to run your test battery on this PR please? I should have approvals very soon. Thank you. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14224#issuecomment-1577694036 From kdnilsen at openjdk.org Mon Jun 5 23:57:09 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Mon, 5 Jun 2023 23:57:09 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: References: Message-ID: <58eIIR00lhW278uv5z9Klo0SBJJWfn6D4JQFqgslqdE=.ca795de6-2480-4bb4-bbdc-12388f8fe388@github.com> On Sun, 4 Jun 2023 21:39:58 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. 
In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Remove three asserts making comparisons between atomic volatile variables > > Though changes to the volatile variables are individually protected by > Atomic load and store operations, these asserts were not assuring > atomic access to multiple volatile variables, each of which could be > modified independently of the others. The asserts were therefore not > trustworthy, as has been confirmed by more extensive testing. src/hotspot/share/gc/shenandoah/heuristics/shenandoahPassiveHeuristics.cpp line 3: > 1: /* > 2: * Copyright (c) 2018, 2019, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Copyright overreach. Revert. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1218719643 From iklam at openjdk.org Tue Jun 6 00:13:53 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 6 Jun 2023 00:13:53 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v3] In-Reply-To: References: <0_uC7OzsWkpKayYwBA4Q3h1ZJuyD3GNukoB8B_6jSdE=.a549624f-75b7-4c5f-abc2-ccdf4954d013@github.com> Message-ID: On Fri, 2 Jun 2023 20:41:53 GMT, Ashutosh Mehra wrote: >> src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 712: >> >>> 710: // the location of the archive space in the heap. The returned address may or may >>> 711: // not be same as the preferred address. >>> 712: HeapWord* alloc_archive_region(size_t word_size, HeapWord* preferred_addr); >> >> Sorry to be picky, but I think `region` implies a single region, but G1 could allocate one or more regions to satisfy the request. I think it's better to use `allocate_archive_range(MemRegion requested_range)` to be more neutral. Passing the range in a `MemRegion` will also look similar to the API right above this one. >> >> Maybe we should also change `populate_archive_regions_bot_part` and `dealloc_archive_regions` to use `_range` as well. What do you think, @tschatzl > > How about replacing `allocate_archive_range` with `allocate_archive_space` which is actually planned for the next patch? > I am more inclined to keep size and address as separate parameters because the address is just a hint for the collectors, while size is not. With MemRegion as the type this information is not conveyed to the reader without reading the comments/code. This is also the reason why I prefer using "preferred_addr" rather than "requested_addr". > But if you feel otherwise I will update the API to use MemRegion. 
> >> Maybe we should also change populate_archive_regions_bot_part and dealloc_archive_regions to use _range as well > > I am not sure if it is necessary to update these APIs, as I would anyway be replacing them with fixup_archive_space() and handle_archive_space_failure() as mentioned in the description of main issue [JDK-8296263](https://bugs.openjdk.org/browse/JDK-8296263). IMO these names are more generic. If you want I can update the code to use these new names in this patch. If you plan to change these soon I think it's OK to leave the names as you have for this PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1218731829 From sspitsyn at openjdk.org Tue Jun 6 01:34:26 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 01:34:26 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread Message-ID: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> The `GetThreadListStackTraces` returns `JVMTI_THREAD_STATE_RUNNABLE` for a VirtualThread blocked on a monitor when called for more than one thread. When called for a single VirtualThread it correctly returns a state that includes the `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` flag. The `VM_GetThreadListStackTraces::doit` should call the `get_threadOop_and_JavaThread` instead of `cv_external_thread_to_JavaThread`. But the `get_threadOop_and_JavaThread` has a check for the current thread by comparing with `JavaThread::current()`, which does not work for a `VM_op`. Some refactoring of the `GetSingleStackTraceClosure` and `get_threadOop_and_JavaThread` was made to make it work for a `VM_op`. Also, a minor bug in the `GetSingleStackTraceClosure::do_thread()` was discovered during testing. A minor refactoring of the `GetSingleStackTraceClosure` was made to fix the issue. 
Also, a new test was added to provide coverage: - `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` Testing: - ran new test: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` - TBD: tiers 1-6 ------------- Commit messages: - remove a trailing space in new ThreadListStackTracesTest.java - 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread Changes: https://git.openjdk.org/jdk/pull/14326/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14326&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8295976 Stats: 241 lines in 5 files changed: 224 ins; 10 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/14326.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14326/head:pull/14326 PR: https://git.openjdk.org/jdk/pull/14326 From shade at openjdk.org Tue Jun 6 05:50:53 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 6 Jun 2023 05:50:53 GMT Subject: RFR: 8305959: x86: Improve itable_stub [v6] In-Reply-To: References: Message-ID: On Mon, 5 Jun 2023 10:10:22 GMT, Boris Ulasevich wrote: >> Async profiler shows that applications spend up to 10% in itable_stubs. >> >> The current inefficiency of itable stubs is as follows. The generated itable_stub scans itable twice: first it checks if the object class is a subtype of the resolved_class, and then it finds the holder_class that implements the method. I suggest doing this in one pass: with a first loop over itable, check pointer equality to both holder_class and resolved_class. Once we have finished searching for resolved_class, continue searching for holder_class in a separate loop if it has not yet been found. >> >> This approach gives 1-10% improvement on the synthetic benchmarks and 3% improvement on Naive Bayes benchmark from the Renaissance Benchmark Suite (Intel Xeon X5675). 
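(Illustration of the one-pass itable scan described in the quoted text above. This is a minimal sketch in portable C++, not the generated x86 stub. `ItableEntry`, `lookup_single_pass`, the integer ids standing in for class pointers, and the `-1` "not found" value are all hypothetical names chosen for this sketch, and it assumes each interface appears at most once in the table.)

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified itable entry: an interface id (standing in for
// the interface class pointer) plus the offset of that interface's methods.
struct ItableEntry {
    int interface_id;
    int method_offset;  // assumed non-negative in this sketch
};

constexpr int kNotFound = -1;

// Returns the method offset for holder_id, or kNotFound if the table does
// not contain resolved_id (the subtype check fails) or holder_id.
int lookup_single_pass(const std::vector<ItableEntry>& itable,
                       int resolved_id, int holder_id) {
    int holder_offset = kNotFound;
    bool resolved_found = false;
    std::size_t i = 0;

    // First loop: compare every entry against both ids until the
    // resolved interface has been seen.
    for (; i < itable.size(); ++i) {
        const ItableEntry& e = itable[i];
        if (holder_offset == kNotFound && e.interface_id == holder_id) {
            holder_offset = e.method_offset;
        }
        if (e.interface_id == resolved_id) {
            resolved_found = true;
            ++i;  // continue after this entry in the second loop
            break;
        }
    }
    if (!resolved_found) {
        return kNotFound;  // receiver does not implement the resolved interface
    }

    // Second loop: only runs if the holder was not met during the first loop.
    for (; holder_offset == kNotFound && i < itable.size(); ++i) {
        if (itable[i].interface_id == holder_id) {
            holder_offset = itable[i].method_offset;
        }
    }
    return holder_offset;
}
```

In this model each entry is visited at most once, whereas scanning separately for the resolved interface and then for the holder may visit entries twice.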
> > Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: > > push/pop(temp_get) -> push/pop(rdx) Passes the `tier1 tier2 tier3` on Linux x86_64 fastdebug for me. Looks good! ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/13460#pullrequestreview-1464312822 From dzhang at openjdk.org Tue Jun 6 07:05:53 2023 From: dzhang at openjdk.org (Dingli Zhang) Date: Tue, 6 Jun 2023 07:05:53 GMT Subject: RFR: 8309418: RISC-V: Make use of vl1r.v & vfabs.v pseudo-instructions where appropriate In-Reply-To: References: Message-ID: On Mon, 5 Jun 2023 06:51:42 GMT, Fei Yang wrote: >> Hi all, >> We should add assembler functions for two pseudo-instructions vl1r.v [1] & >> vfabs.v [2] and use them when appropriate for better readability. >> >> At the same time, we removed a few unused assembly instructions. Please take a look >> and have some reviews. Thanks a lot. >> >> [1] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#79-vector-loadstore-whole-register-instructions >> [2] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#1312-vector-floating-point-sign-injection-instructions >> >> ## Testing: >> qemu w/ UseRVV: >> - [x] Tier1 tests (release) >> - [x] Tier2 tests (release) >> - [x] Tier3 tests (release) >> - [x] test/jdk/jdk/incubator/vector (release/fastdebug) > > Looks good. @RealFYang @luhenry @zifeihan Thanks for the review! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14309#issuecomment-1578040209 From alanb at openjdk.org Tue Jun 6 07:13:15 2023 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 6 Jun 2023 07:13:15 GMT Subject: RFR: 8306647: Implementation of Structured Concurrency (Preview) [v5] In-Reply-To: <6gZZEoP1WXdBcZUiL5890eNsgaRFzZNY_rBItZdXtNc=.5d8f7bd9-44d5-4074-8a5c-35f8203263b2@github.com> References: <6gZZEoP1WXdBcZUiL5890eNsgaRFzZNY_rBItZdXtNc=.5d8f7bd9-44d5-4074-8a5c-35f8203263b2@github.com> Message-ID: > This is the implementation of: > > - JEP 453: Structured Concurrency (Preview) > - JEP 446: Scoped Values (Preview) > > For the most part, this is just moving code and tests. StructuredTaskScope moves to j.u.concurrent as a preview API, ScopedValue moves to j.lang as a preview API, and module jdk.incubator.concurrent has been removed. The significant API changes since incubator are: > > - StructuredTaskScope.fork returns Subtask instead of Future (JEP 453 has a section on this) > - ScopedValue.where methods are replaced with runWhere, callWhere and getWhere Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 18 commits: - Fix typo in javadoc - Merge - Merge - Sync up from loom repo - Merge - Sync with loom repo, re-work ScopedValue class description - Sync up from loom repo - Remove csm.Threads - Merge - Test should not be in update for main line - ... 
and 8 more: https://git.openjdk.org/jdk/compare/2e9eff56...0f514588 ------------- Changes: https://git.openjdk.org/jdk/pull/13932/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13932&range=04 Stats: 9229 lines in 40 files changed: 4856 ins; 4315 del; 58 mod Patch: https://git.openjdk.org/jdk/pull/13932.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13932/head:pull/13932 PR: https://git.openjdk.org/jdk/pull/13932 From duke at openjdk.org Tue Jun 6 07:28:55 2023 From: duke at openjdk.org (JoKern65) Date: Tue, 6 Jun 2023 07:28:55 GMT Subject: RFR: JDK-8308288: Fix xlc17 clang warnings and build errors in hotspot In-Reply-To: <_mG48I6TlpqcdrS5N6DOIpRPpw6ZTrwUgMWzPzDjZ4o=.1eeaffd1-54ed-4344-b66e-a4a4a0583c4d@github.com> References: <_mG48I6TlpqcdrS5N6DOIpRPpw6ZTrwUgMWzPzDjZ4o=.1eeaffd1-54ed-4344-b66e-a4a4a0583c4d@github.com> Message-ID: On Fri, 2 Jun 2023 11:28:45 GMT, JoKern65 wrote: > This pr is a split off from JDK-8308288 : Fix xlc17 clang warnings in shared code https://github.com/openjdk/jdk/pull/14146 > It handles the part in hotspot. > > It handles the error introduced by a redefine of malloc in stdlib.h resulting in the following build error: > > /data/d042520/pr/jdk/src/hotspot/share/runtime/os.cpp:616:5: error: no member named '_vec_malloc' in 'LogTag'; did you mean 'vec_malloc'? > log_warning(malloc, free)("ptr caught: " PTR_FORMAT, p2i(ptr)); > ^~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/log.hpp:46:28: note: expanded from macro 'log_warning' > #define log_warning(...) (!log_is_enabled(Warning, __VA_ARGS__)) ? (void)0 : LogImpl::write > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/log.hpp:68:45: note: expanded from macro 'log_is_enabled' > #define log_is_enabled(level, ...) (LogImpl::is_level(LogLevel::level)) > ^~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/logTag.hpp:221:38: note: expanded from macro 'LOG_TAGS' > #define LOG_TAGS(...) 
EXPAND_VARARGS(LOG_TAGS_EXPANDED(__VA_ARGS__, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG)) > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/logTag.hpp:217:57: note: expanded from macro 'LOG_TAGS_EXPANDED' > #define LOG_TAGS_EXPANDED(T0, T1, T2, T3, T4, T5, ...) PREFIX_LOG_TAG(T0), PREFIX_LOG_TAG(T1), PREFIX_LOG_TAG(T2), \ > ^~~~~~~~~~~~~~~~~~ > ... (rest of output omitted) > > > Additionally it solves the need for an #include on AIX for any usage of the alloca function, by adding the include to globalDefinitions_xlc.hpp @kimbarrett, @tstuefe, are you fine with pushing the fix for the malloc macro as is? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14283#issuecomment-1578069668 From dholmes at openjdk.org Tue Jun 6 07:32:53 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 6 Jun 2023 07:32:53 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread In-Reply-To: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: On Tue, 6 Jun 2023 00:50:34 GMT, Serguei Spitsyn wrote: > The `GetThreadListStackTraces` returns `JVMTI_THREAD_STATE_RUNNABLE` for a VirtualThread blocked on a monitor when called for more than one thread. When called for a single VirtualThread it correctly returns a state that includes the `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` flag. > The `VM_GetThreadListStackTraces::doit` should call the `get_threadOop_and_JavaThread` instead of `cv_external_thread_to_JavaThread`. But the `get_threadOop_and_JavaThread` has a check for the current thread by comparing with the JavaThread::current() which does not work for a `VM_op`. 
Some refactoring of the `GetSingleStackTraceClosure` and `get_threadOop_and_JavaThread` was made to make it working for a `VM_op`. > > Also, a minor bug in the `GetSingleStackTraceClosure::do_thread()` was discovered during testing. > A minor refactoring of the `GetSingleStackTraceClosure` was made to fix the issue. > > Also, a new test was added to provide coverage: > - `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` > > Testing: > - ran new test: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` > - TBD: tiers 1-6 Just a passing comment but I happened to notice today that when a virtual thread blocks on a legacy synchronization mechanism, it delegates to its carrier thread to report its state. It is not at all clear to me how this is handled at the JVMTI level. ------------- PR Review: https://git.openjdk.org/jdk/pull/14326#pullrequestreview-1464472318 From alanb at openjdk.org Tue Jun 6 07:42:52 2023 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 6 Jun 2023 07:42:52 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread In-Reply-To: References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: <4CPJ19jOAYq8cQqJWqGRgwO-kbLv0glRGTna0FY4jtE=.c1ca455b-80f6-47e9-ba37-1db3b33bb95c@github.com> On Tue, 6 Jun 2023 07:30:26 GMT, David Holmes wrote: > Just a passing comment but I happened to notice today that when a virtual thread blocks on a legacy synchronization mechanism, it delegates to its carrier thread to report its state. It is not at all clear to me how this is handled at the JVMTI level. When mounted, the virtual thread state comes from its carrier. JVM TI GetThreadState and other functions that return state do the same. Somehow the bulk function GetThreadListStackTraces was missed. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14326#issuecomment-1578097719 From dzhang at openjdk.org Tue Jun 6 09:11:01 2023 From: dzhang at openjdk.org (Dingli Zhang) Date: Tue, 6 Jun 2023 09:11:01 GMT Subject: Integrated: 8309418: RISC-V: Make use of vl1r.v & vfabs.v pseudo-instructions where appropriate In-Reply-To: References: Message-ID: On Mon, 5 Jun 2023 06:13:08 GMT, Dingli Zhang wrote: > Hi all, > We should add assembler functions for two pseudo-instructions vl1r.v [1] & > vfabs.v [2] and use them when appropriate for better readability. > > At the same time, we removed a few unused assembly instructions. Please take a look > and have some reviews. Thanks a lot. > > [1] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#79-vector-loadstore-whole-register-instructions > [2] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#1312-vector-floating-point-sign-injection-instructions > > ## Testing: > qemu w/ UseRVV: > - [x] Tier1 tests (release) > - [x] Tier2 tests (release) > - [x] Tier3 tests (release) > - [x] test/jdk/jdk/incubator/vector (release/fastdebug) This pull request has now been integrated. 
Changeset: 5146a582 Author: Dingli Zhang Committer: Fei Yang URL: https://git.openjdk.org/jdk/commit/5146a58249bbbfdf7304e9f8062c95369ccd820f Stats: 19 lines in 5 files changed: 8 ins; 7 del; 4 mod 8309418: RISC-V: Make use of vl1r.v & vfabs.v pseudo-instructions where appropriate Reviewed-by: fyang, luhenry, gcao ------------- PR: https://git.openjdk.org/jdk/pull/14309 From tschatzl at openjdk.org Tue Jun 6 09:43:52 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 6 Jun 2023 09:43:52 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v3] In-Reply-To: References: <0_uC7OzsWkpKayYwBA4Q3h1ZJuyD3GNukoB8B_6jSdE=.a549624f-75b7-4c5f-abc2-ccdf4954d013@github.com> Message-ID: On Tue, 6 Jun 2023 00:11:29 GMT, Ioi Lam wrote: >> How about replacing `allocate_archive_range` with `allocate_archive_space` which is actually planned for the next patch? >> I am more inclined to keep size and address as separate parameters because the address is just a hint for the collectors, while size is not. With MemRegion as the type this information is not conveyed to the reader without reading the comments/code. This is also the reason why I prefer using "preferred_addr" rather than "requested_addr". >> But if you feel otherwise I will update the API to use MemRegion. >> >>> Maybe we should also change populate_archive_regions_bot_part and dealloc_archive_regions to use _range as well >> >> I am not sure if it is necessary to update these APIs, as I would anyway be replacing them with fixup_archive_space() and handle_archive_space_failure() as mentioned in the description of main issue [JDK-8296263](https://bugs.openjdk.org/browse/JDK-8296263). IMO these names are more generic. If you want I can update the code to use these new names in this patch. > > If you plan to change these soon I think it's OK to leave the names as you have for this PR. 
> Sorry to be picky, but I think `region` implies a single region, but G1 could allocate one or more regions to satisfy the request. I think it's better to use `allocate_archive_range(MemRegion requested_range)` to be more neutral. Passing the range in a `MemRegion` will also look similar to the API right above this one. > > Maybe we should also change `populate_archive_regions_bot_part` and `dealloc_archive_regions` to use `_range` as well. What do you think, @tschatzl I remember suggesting a rename long time ago in some earlier PR about refactoring this code.. > How about replacing `allocate_archive_range` with `allocate_archive_space` which is actually planned for the next patch? As long as the naming is consistent throughout. > I am more inclined to keep size and address as separate parameters because the address is just a hint for the collectors, while size is not. With MemRegion as the type this information is not conveyed to the reader without reading the comments/code. This is also the reason why I prefer using "preferred_addr" rather than "requested_addr". But if you feel otherwise I will update the API to use MemRegion. I think separating the arguments for the above mentioned reasons is fine. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1219296138 From alanb at openjdk.org Tue Jun 6 10:12:54 2023 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 6 Jun 2023 10:12:54 GMT Subject: RFR: 8309408: Thread.sleep cleanup In-Reply-To: References: Message-ID: <8l9FG4c1c4Ydlv3gQBiramw0YpSthxC-4wXOxVWiIOE=.b14f4319-2ecf-47ea-8118-5a3fe60d92c2@github.com> On Mon, 5 Jun 2023 20:05:24 GMT, Chris Plummer wrote: > The following commit in loom heavily modified this file with a lot of added expected methods. There are other related tests with similar changes. I'm not so sure I understand the need for so many additions, and also why expectedLength is so out of sync with the number of added method. 
I don't believe this commit was reviewed individually, but was just part of the overall loom review when merge into jdk. Perhaps it should be revisited. These tests aren't easy to read or maintain; it would be good to revisit them. In some cases, the tests capture the stack trace asynchronously so the test needs to know about all code paths. As regards ThreadController, used by the nsk/monitoring/stress/thread/straceXXX tests, the main thread waits at a barrier (a CountDownLatch) until all sleeping threads are ready to sleep. Once the main thread is released, it checks all the sleepers are in TIMED_WAITING state and samples their stack traces with the ThreadMXBean and related APIs. The test fails if there are frames corresponding to methods that the test doesn't know about. If a thread is sleeping then we shouldn't see frames for beforeSleep/afterSleep. My reading of these tests is that the main thread could poll a SleepingThread after it counts down and before it parks in sleep. It's doing an expensive ThreadMXBean::getAllThreadIds once released and that may explain why it hasn't been seen. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14303#discussion_r1219348675 From mbaesken at openjdk.org Tue Jun 6 10:38:53 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 6 Jun 2023 10:38:53 GMT Subject: RFR: JDK-8308288: Fix xlc17 clang warnings and build errors in hotspot In-Reply-To: <_mG48I6TlpqcdrS5N6DOIpRPpw6ZTrwUgMWzPzDjZ4o=.1eeaffd1-54ed-4344-b66e-a4a4a0583c4d@github.com> References: <_mG48I6TlpqcdrS5N6DOIpRPpw6ZTrwUgMWzPzDjZ4o=.1eeaffd1-54ed-4344-b66e-a4a4a0583c4d@github.com> Message-ID: On Fri, 2 Jun 2023 11:28:45 GMT, JoKern65 wrote: > This pr is a split off from JDK-8308288 : Fix xlc17 clang warnings in shared code https://github.com/openjdk/jdk/pull/14146 > It handles the part in hotspot.
> > It handles the error introduced by a redefine of malloc in stdlib.h resulting in the following build error: > > /data/d042520/pr/jdk/src/hotspot/share/runtime/os.cpp:616:5: error: no member named '_vec_malloc' in 'LogTag'; did you mean 'vec_malloc'? > log_warning(malloc, free)("ptr caught: " PTR_FORMAT, p2i(ptr)); > ^~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/log.hpp:46:28: note: expanded from macro 'log_warning' > #define log_warning(...) (!log_is_enabled(Warning, __VA_ARGS__)) ? (void)0 : LogImpl::write > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/log.hpp:68:45: note: expanded from macro 'log_is_enabled' > #define log_is_enabled(level, ...) (LogImpl::is_level(LogLevel::level)) > ^~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/logTag.hpp:221:38: note: expanded from macro 'LOG_TAGS' > #define LOG_TAGS(...) EXPAND_VARARGS(LOG_TAGS_EXPANDED(__VA_ARGS__, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG)) > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/logTag.hpp:217:57: note: expanded from macro 'LOG_TAGS_EXPANDED' > #define LOG_TAGS_EXPANDED(T0, T1, T2, T3, T4, T5, ...) PREFIX_LOG_TAG(T0), PREFIX_LOG_TAG(T1), PREFIX_LOG_TAG(T2), \ > ^~~~~~~~~~~~~~~~~~ > ... (rest of output omitted) > > > Additionally it solves the need for an #include on AIX for any usage of the alloca function, by adding the include to globalDefinitions_xlc.hpp The malloc change does not look very nice, but I think for now we can use that on AIX with xlc17. ------------- Marked as reviewed by mbaesken (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14283#pullrequestreview-1464865277 From myano at openjdk.org Tue Jun 6 11:04:12 2023 From: myano at openjdk.org (Masanori Yano) Date: Tue, 6 Jun 2023 11:04:12 GMT Subject: RFR: 8308751: Create new switch to print error reporting output to both hs_err_pid file and stdout/stderr [v3] In-Reply-To: References: Message-ID: > I think it makes sense to add the ErrorFileWithStdout and ErrorFileWithStderr for troubleshooting. > I would appreciate it if someone could review it. Masanori Yano has updated the pull request incrementally with one additional commit since the last revision: 8308751: Create new switch to print error reporting output to both hs_err_pid file and stdout/stderr ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14114/files - new: https://git.openjdk.org/jdk/pull/14114/files/2f228b3b..95148a4f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14114&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14114&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14114.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14114/head:pull/14114 PR: https://git.openjdk.org/jdk/pull/14114 From duke at openjdk.org Tue Jun 6 12:16:53 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Tue, 6 Jun 2023 12:16:53 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v3] In-Reply-To: References: <0_uC7OzsWkpKayYwBA4Q3h1ZJuyD3GNukoB8B_6jSdE=.a549624f-75b7-4c5f-abc2-ccdf4954d013@github.com> Message-ID: <_6oVhOi34xxHPz9f0DqKPqCunDLqTHOfzbYfAxdUns8=.b6743e6e-ab1f-4007-b2d1-884e98add0eb@github.com> On Tue, 6 Jun 2023 09:41:03 GMT, Thomas Schatzl wrote: >> If you plan to change these soon I think it's OK to leave the names as you have for this PR. > >> Sorry to be picky, but I think `region` implies a single region, but G1 could allocate one or more regions to satisfy the request.
I think it's better to use `allocate_archive_range(MemRegion requested_range)` to be more neutral. Passing the range in a `MemRegion` will also look similar to the API right above this one. >> >> Maybe we should also change `populate_archive_regions_bot_part` and `dealloc_archive_regions` to use `_range` as well. What do you think, @tschatzl > > I remember suggesting a rename long time ago in some earlier PR about refactoring this code.. > >> How about replacing `allocate_archive_range` with `allocate_archive_space` which is actually planned for the next patch? > > As long as the naming is consistent throughout. > >> I am more inclined to keep size and address as separate parameters because the address is just a hint for the collectors, while size is not. With MemRegion as the type this information is not conveyed to the reader without reading the comments/code. This is also the reason why I prefer using "preferred_addr" rather than "requested_addr". > But if you feel otherwise I will update the API to use MemRegion. > > I think separating the arguments for the above mentioned reasons is fine. > If you plan to change these soon I think it's OK to leave the names as you have for this PR. Okay, then let's do the renaming in the next PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1219528842 From duke at openjdk.org Tue Jun 6 12:31:52 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Tue, 6 Jun 2023 12:31:52 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v2] In-Reply-To: References: <_wIgfWGtjNLtTm6s9_WsLWWEbLlBsW-xEr1CwBBoM0M=.3ce944c1-23ed-4376-8896-e3a437de17b0@github.com> Message-ID: <6ib0Q1-yxhxQZWw14G9C1x1eCgZ06-wbIx-5FYAQtkU=.f7cfb363-ffbd-4ac0-8668-2f2ff7a95a6a@github.com> On Fri, 2 Jun 2023 07:27:17 GMT, Thomas Schatzl wrote: >>> @tschatzl >>> >>> > I'm not convinced not giving a preferred location is a good idea.
That seems to reduce the opportunity to directly map archives significantly. Previously, with only heap size changes, the archive could be mapped in still. >>> >>> I am not sure I get this. The patch does not change the ability to map the archive. It just moved the calculation to map the archive region from CDS to G1. Before this patch CDS code would determine the address for mapping the archive space towards the top of the heap and pass that address to G1. This patch just moves that calculation to G1. So it should be at-par with the current state. If it is not, please point out and I can work on fixing that. >>> >>> > Since this change is an intermediate step, could you provide an overview of the final API/change too? It is hard to comment on this without knowing where you are going with that. >>> >>> Ok, I have updated the description of the main issue [JDK-8296263](https://bugs.openjdk.org/browse/JDK-8296263) with some details on the expected changes in the future patches. My main aim with this work is mainly code reorganization to avoid using different GC APIs in CDS depending on the GC policy in use. When this is completed I expect it to provide same functionality as today. Any enhancement, like passing preferred location to map archive heap, can be built on top of this. >>> >>> Hope this helps. >> >> Hi Ashutosh, >> >> You are right that in the existing code, although filemap.cpp finds out where the requested range is, it doesn't actually pass that to G1. It just requests G1 to reserve a range at the end of the runtime heap. So your PR preserves this behavior. >> >> I think Thomas's point is, the requested range should be passed to the `alloc_archive_regions()` API, even though the collector may simply ignore it. >> >> In the past, the archive G1 regions were not movable, so it was preferable to put them at the end of the heap, even though that might cause relocation. 
Now that the archive regions are just regular "old" regions, which can move, it may be preferable to reserve them at the requested range. >> >> BTW, perhaps `alloc_archive_regions()` should be renamed to `alloc_archive_range()` going forward. The plural form of "regions" sounds odd for non-region based collectors. > >> Hi Ashutosh, >> >> You are right that in the existing code, although filemap.cpp finds out where the requested range is, it doesn't actually pass that to G1. It just requests G1 to reserve a range at the end of the runtime heap. So your PR preserves this behavior. >> >> I think Thomas's point is, the requested range should be passed to the `alloc_archive_regions()` API, even though the collector may simply ignore it. > > Exactly, sorry if I wasn't clear enough. > >> >> In the past, the archive G1 regions were not movable, so it was preferable to put them at the end of the heap, even though that might cause relocation. Now that the archive regions are just regular "old" regions, which can move, it may be preferable to reserve them at the requested range. >> >> BTW, perhaps `alloc_archive_regions()` should be renamed to `alloc_archive_range()` going forward. The plural form of "regions" sounds odd for non-region based collectors. > > +1 > > Thanks, > Thomas @tschatzl let me know if there are any other concerns to address in this patch. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14208#issuecomment-1578676159 From stuefe at openjdk.org Tue Jun 6 12:44:13 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 6 Jun 2023 12:44:13 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: References: Message-ID: On Sun, 4 Jun 2023 21:39:58 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. 
>> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies `-XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah.
>> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Remove three asserts making comparisons between atomic volatile variables > > Though changes to the volatile variables are individually protected by > Atomic load and store operations, these asserts were not assuring > atomic access to multiple volatile variables, each of which could be > modified independently of the others. The asserts were therefore not > trustworthy, as has been confirmed by more extensive testing. Hi @kdnilsen, I see that the following settings changed their default values for all of Shenandoah: - ShenandoahLearningSteps (was 10, now 5) - ShenandoahImmediateThreshold (was 90, now 70) - ShenandoahAdaptiveDecayFactor (was 0.5, now 0.1) - ShenandoahFullGCThreshold (was 3, now 64) Assuming that the behavior of legacy Shenandoah remains unchanged, I assume the switches are now handled differently to arrive at the same behavior. I see that we now have ShenandoahOOMGCRetries. Do the changed default for ShenandoahFullGCThreshold and this new ShenandoahOOMGCRetries switch mean that the degeneration behavior of legacy Shenandoah did change? The general thrust of my questions is this: you assured us that legacy Shenandoah will show the same behavior post-patch, but since the defaults changed, I assume that the meaning of these settings changed as well. We will need to document these effects for users of legacy Shenandoah, in case they need to translate existing settings in their environment. A release note would be really helpful.
Cheers, Thomas ------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1578693858 From tschatzl at openjdk.org Tue Jun 6 12:44:53 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 6 Jun 2023 12:44:53 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v2] In-Reply-To: References: <_wIgfWGtjNLtTm6s9_WsLWWEbLlBsW-xEr1CwBBoM0M=.3ce944c1-23ed-4376-8896-e3a437de17b0@github.com> Message-ID: On Fri, 2 Jun 2023 07:27:17 GMT, Thomas Schatzl wrote: >>> @tschatzl >>> >>> > I'm not convinced not giving a preferred location is a good idea. That seems to reduce the opportunity to directly map archives significantly. Previously, with only heap size changes, the archive could be mapped in still. >>> >>> I am not sure I get this. The patch does not change the ability to map the archive. It just moved the calculation to map the archive region from CDS to G1. Before this patch CDS code would determine the address for mapping the archive space towards the top of the heap and pass that address to G1. This patch just moves that calculation to G1. So it should be at-par with the current state. If it is not, please point out and I can work on fixing that. >>> >>> > Since this change is an intermediate step, could you provide an overview of the final API/change too? It is hard to comment on this without knowing where you are going with that. >>> >>> Ok, I have updated the description of the main issue [JDK-8296263](https://bugs.openjdk.org/browse/JDK-8296263) with some details on the expected changes in the future patches. My main aim with this work is mainly code reorganization to avoid using different GC APIs in CDS depending on the GC policy in use. When this is completed I expect it to provide same functionality as today. Any enhancement, like passing preferred location to map archive heap, can be built on top of this. >>> >>> Hope this helps. 
>> >> Hi Ashutosh, >> >> You are right that in the existing code, although filemap.cpp finds out where the requested range is, it doesn't actually pass that to G1. It just requests G1 to reserve a range at the end of the runtime heap. So your PR preserves this behavior. >> >> I think Thomas's point is, the requested range should be passed to the `alloc_archive_regions()` API, even though the collector may simply ignore it. >> >> In the past, the archive G1 regions were not movable, so it was preferable to put them at the end of the heap, even though that might cause relocation. Now that the archive regions are just regular "old" regions, which can move, it may be preferable to reserve them at the requested range. >> >> BTW, perhaps `alloc_archive_regions()` should be renamed to `alloc_archive_range()` going forward. The plural form of "regions" sounds odd for non-region based collectors. > >> Hi Ashutosh, >> >> You are right that in the existing code, although filemap.cpp finds out where the requested range is, it doesn't actually pass that to G1. It just requests G1 to reserve a range at the end of the runtime heap. So your PR preserves this behavior. >> >> I think Thomas's point is, the requested range should be passed to the `alloc_archive_regions()` API, even though the collector may simply ignore it. > > Exactly, sorry if I wasn't clear enough. > >> >> In the past, the archive G1 regions were not movable, so it was preferable to put them at the end of the heap, even though that might cause relocation. Now that the archive regions are just regular "old" regions, which can move, it may be preferable to reserve them at the requested range. >> >> BTW, perhaps `alloc_archive_regions()` should be renamed to `alloc_archive_range()` going forward. The plural form of "regions" sounds odd for non-region based collectors. > > +1 > > Thanks, > Thomas > @tschatzl let me know if there are any other concerns to address in this patch. 
Given the API is in flux, and the follow-up not certain to be reviewed by Thursday I would like to have this patch *at least* moved to after the JDK 21 fork. Otherwise maintainers need to deal with this awkwardness for a long time unnecessarily. Unlike Ioi I am not good with the current naming. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14208#issuecomment-1578693782 From tschatzl at openjdk.org Tue Jun 6 12:48:55 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 6 Jun 2023 12:48:55 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v3] In-Reply-To: References: Message-ID: On Fri, 2 Jun 2023 19:46:31 GMT, Ashutosh Mehra wrote: >> This patch is the first step towards having a single set of GC APIs for allocating heap space for the archived objects (See https://bugs.openjdk.org/browse/JDK-8296263). >> It moves some of the G1 specific logic from CDS to G1 gc without changing the functionality. >> >> Changes that add/update GC APIs for handling archive heap would be introduced in upcoming patches. > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Review comments - updates to alloc_archive_regions() api > > Signed-off-by: Ashutosh Mehra (I can't just request changes without a comment, although I commented before, so here the same again) Given soon JDK21 fork, and the risk of having this API change for a long time in JDK 21, I am not good with leaving the change as is. Please either fix up the naming or leave integration until after the fork. ------------- Changes requested by tschatzl (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14208#pullrequestreview-1465116293 From duke at openjdk.org Tue Jun 6 12:57:56 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Tue, 6 Jun 2023 12:57:56 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v2] In-Reply-To: References: <_wIgfWGtjNLtTm6s9_WsLWWEbLlBsW-xEr1CwBBoM0M=.3ce944c1-23ed-4376-8896-e3a437de17b0@github.com> Message-ID: On Tue, 6 Jun 2023 12:40:48 GMT, Thomas Schatzl wrote: > Given the API is in flux, and the follow-up not certain to be reviewed by Thursday I would like to have this patch at least moved to after the JDK 21 fork. Otherwise maintainers need to deal with this awkwardness for a long time unnecessarily. @tschatzl thanks for clarifying. I agree with your reasoning and I am fine with not having this patch in JDK 21. Having partial changes in JDK 21 doesn't serve any purpose anyway. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14208#issuecomment-1578715693 From sspitsyn at openjdk.org Tue Jun 6 13:31:11 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 13:31:11 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread [v2] In-Reply-To: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: > The `GetThreadListStackTraces` returns `JVMTI_THREAD_STATE_RUNNABLE` for a VirtualThread blocked on a monitor when called for more than one thread. When called for a single VirtualThread it correctly returns a state that includes the `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` flag. > The `VM_GetThreadListStackTraces::doit` should call the `get_threadOop_and_JavaThread` instead of `cv_external_thread_to_JavaThread`. 
But the `get_threadOop_and_JavaThread` has a check for the current thread by comparing with the JavaThread::current() which does not work for a `VM_op`. Some refactoring of the `GetSingleStackTraceClosure` and `get_threadOop_and_JavaThread` was made to make it working for a `VM_op`. > > Also, a minor bug in the `GetSingleStackTraceClosure::do_thread()` was discovered during testing. > A minor refactoring of the `GetSingleStackTraceClosure` was made to fix the issue. > > Also, a new test was added to provide coverage: > - `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` > > Testing: > - ran new test: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` > - TBD: tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: simplify GetSingleStackTraceClosure, fix issue in VM_GetThreadListStackTraces::doit, improve test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14326/files - new: https://git.openjdk.org/jdk/pull/14326/files/77718470..4e794bd5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14326&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14326&range=00-01 Stats: 27 lines in 4 files changed: 11 ins; 5 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/14326.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14326/head:pull/14326 PR: https://git.openjdk.org/jdk/pull/14326 From sspitsyn at openjdk.org Tue Jun 6 13:37:03 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 13:37:03 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread [v3] In-Reply-To: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: > The `GetThreadListStackTraces` returns `JVMTI_THREAD_STATE_RUNNABLE` for a 
VirtualThread blocked on a monitor when called for more than one thread. When called for a single VirtualThread it correctly returns a state that includes the `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` flag. > The `VM_GetThreadListStackTraces::doit` should call the `get_threadOop_and_JavaThread` instead of `cv_external_thread_to_JavaThread`. But the `get_threadOop_and_JavaThread` has a check for the current thread by comparing with the JavaThread::current() which does not work for a `VM_op`. Some refactoring of the `GetSingleStackTraceClosure` and `get_threadOop_and_JavaThread` was made to make it working for a `VM_op`. > > Also, a minor bug in the `GetSingleStackTraceClosure::do_thread()` was discovered during testing. > A minor refactoring of the `GetSingleStackTraceClosure` was made to fix the issue. > > Also, a new test was added to provide coverage: > - `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` > > Testing: > - ran new test: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` > - TBD: tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with two additional commits since the last revision: - fixed typo in a comment in jvmtiEnvBase.cpp - nit: restored one comment as was before ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14326/files - new: https://git.openjdk.org/jdk/pull/14326/files/4e794bd5..d20e1221 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14326&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14326&range=01-02 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14326.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14326/head:pull/14326 PR: https://git.openjdk.org/jdk/pull/14326 From kdnilsen at openjdk.org Tue Jun 6 13:48:18 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 6 Jun 2023 13:48:18 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah 
(Experimental) [v5] In-Reply-To: References: Message-ID: <28NBsNni8p68iy2K_TjXIC1oScdrAtyK-NJd5OH0gVc=.c4adc4a5-82c9-4afa-b1a9-9e5b99b2ecdd@github.com> On Sun, 4 Jun 2023 21:39:58 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies `-XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation.
>> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Remove three asserts making comparisons between atomic volatile variables > > Though changes to the volatile variables are individually protected by > Atomic load and store operations, these asserts were not assuring > atomic access to multiple volatile variables, each of which could be > modified independently of the others. The asserts were therefore not > trustworthy, as has been confirmed by more extensive testing. Thanks Thomas for the feedback: These proposed changes represent improvements to both Generational and Non-generational modes of operation. We can revert if that is desired, or we can specialize Generational versions of these parameters so that they can have different values in different modes, but here is a bit of background. We've done considerable testing on a variety of synthetic workloads and some limited testing on production workloads. As we move towards upstream integration, we expect this will help us gain exposure to more production workloads. The following changes were based on results of this testing: * Decrease ShenandoahLearningSteps to 5 (from 10): For some workloads, we observed that there were "way too many" learning cycles being triggered.
We also observed that the learning achieved during learning cycles was not as trustworthy as the learning achieved during actual operation, because these learning cycles typically trigger during initialization phases which are not representative of real-world operation and because they usually trigger so prematurely that there has not been enough time for allocated objects to die before we garbage collect. * Change ShenandoahImmediateThreshold to 70 from 90: We discovered during experiments with settings on certain real production workloads that reducing the threshold for abbreviated cycles significantly improved throughput, reduced degenerated cycles, and reduced high percentile end-to-end latency on the relevant services. These experiments were based on single-generation Shenandoah. We saw no negative impact of making this change on our various workloads. * I'll let @earthling-amzn comment on the change to ShenandoahAdaptiveDecayFactor. My recollection is that this change was also motivated by experience with single-generation Shenandoah on a real production workload. * The change of ShenandoahFullGCThreshold from 3 to 64 was motivated by some observations with specjbb performance as it ratchets up the workload to determine MaxJOPS. We observed that for both single-generation Shenandoah and generational Shenandoah, the typical behavior was that a single Full GC trigger causes an "infinite" sequence of Full GC, even though we may have only lost the concurrent GC race by a small amount. 
This is because (1) Full GC discards all the incremental work of the concurrent GC that was just interrupted, (2) STW Full GC creates a situation in which pent up demand for execution and allocation accumulates during the STW pause so there's a huge demand for allocation immediately following the end of Full GC, (3) The concurrent GC that triggers immediately after Full GC completes is "destined" to fail because no garbage has been introduced since Full GC finished and since SATB does not collect floating garbage that accumulates after the start of concurrent GC and since the allocation spike is so high immediately following the Full GC (e.g. 11GB/s instead of 3GB/s normally). This change allows a sequence of degenerated GCs to manage slow evolution and sudden bursts of allocation rate much more effectively than the original code. This is accompanied by a change in how we detect and throw OOM. We wait for at least one Full GC but we don't force ShenandoahFullGCThreshold allocation failures before throwing OOM. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1578800487 From shade at openjdk.org Tue Jun 6 14:32:05 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 6 Jun 2023 14:32:05 GMT Subject: RFR: 8309543: Micro-optimize x86 assembler UseCondCardMark Message-ID: Noticed this while explaining a related code: there is no need to make a full jump, and a short jump would suffice. Assembler does not know about this shortening, because it is a forward branch. Makes a slightly more compact interpreter code.
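[Editorial sketch, not part of the original message.] The point about forward branches can be illustrated with a small model. On x86, a conditional jump has a 2-byte short form (opcode plus signed 8-bit displacement) and a 6-byte near form (two-byte opcode plus 32-bit displacement). The Java below is only a model of how a one-pass assembler might pick between them; the class and method names (`BranchEncodingSketch`, `encodedSize`, `fitsInRel8`) are hypothetical and are not HotSpot APIs.

```java
public class BranchEncodingSketch {
    // Encoded sizes of an x86 conditional jump: the short form is one opcode
    // byte plus an 8-bit displacement, the near form is a two-byte opcode
    // plus a 32-bit displacement.
    static final int SHORT_JCC_BYTES = 2;
    static final int NEAR_JCC_BYTES = 6;

    // A short jump can only reach targets whose displacement fits in a signed byte.
    static boolean fitsInRel8(int displacement) {
        return displacement >= Byte.MIN_VALUE && displacement <= Byte.MAX_VALUE;
    }

    // Hypothetical model of a one-pass assembler choosing an encoding.
    // targetKnown is true for backward branches (the label is already bound),
    // false for forward branches (the distance is not known yet).
    static int encodedSize(boolean targetKnown, int displacement, boolean requestShort) {
        if (requestShort) {
            // The code author asserts the target is close; check when possible.
            if (targetKnown && !fitsInRel8(displacement)) {
                throw new IllegalArgumentException("short jump out of range: " + displacement);
            }
            return SHORT_JCC_BYTES;
        }
        if (targetKnown && fitsInRel8(displacement)) {
            return SHORT_JCC_BYTES; // backward branch: can be shortened automatically
        }
        // Forward branch with unknown distance: the long form must be reserved.
        return NEAR_JCC_BYTES;
    }

    public static void main(String[] args) {
        System.out.println(encodedSize(true, -20, false)); // backward, close: 2
        System.out.println(encodedSize(false, 0, false));  // forward, unknown: 6
        System.out.println(encodedSize(false, 0, true));   // forward, short requested: 2
    }
}
```

In this model, the saving described in the message corresponds to moving from the second case to the third: the author knows the forward target is nearby and requests the short form explicitly.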
------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/14335/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14335&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8309543 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14335.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14335/head:pull/14335 PR: https://git.openjdk.org/jdk/pull/14335 From cslucas at openjdk.org Tue Jun 6 14:51:46 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 6 Jun 2023 14:51:46 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v15] In-Reply-To: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: > Can I please get reviews for this PR? > > The most common and frequent use of NonEscaping Phis merging object allocations is for debugging information. The two graphs below show numbers for Renaissance and DaCapo benchmarks - similar results are obtained for all other applications that I tested. > > With what frequency does each IR node type occurs as an allocation merge user? I.e., if the same node type uses a Phi N times the counter is incremented by N: > > ![image](https://user-images.githubusercontent.com/2249648/222280517-4dcf5871-2564-4207-b49e-22aee47fa49d.png) > > What are the most common users of allocation merges? I.e., if the same node type uses a Phi N times the counter is incremented by 1: > > ![image](https://user-images.githubusercontent.com/2249648/222280608-ca742a4e-1622-4e69-a778-e4db6805ea02.png) > > This PR adds support scalar replacing allocations participating in merges used as debug information OR as a base for field loads. 
I plan to create subsequent PRs to enable scalar replacement of merges used by other node types (CmpP is next on the list) subsequently. > > The approach I used for _rematerialization_ is pretty straightforward. It consists basically of the following. 1) New IR node (suggested by V. Kozlov), named SafePointScalarMergeNode, to represent a set of SafePointScalarObjectNode; 2) Each scalar replaceable input participating in a merge will get a SafePointScalarObjectNode like if it weren't part of a merge. 3) Add a new Class to support the rematerialization of SR objects that are part of a merge; 4) Patch HotSpot to be able to serialize and deserialize debug information related to allocation merges; 5) Patch C2 to generate unique types for SR objects participating in some allocation merges. > > The approach I used for _enabling the scalar replacement of some of the inputs of the allocation merge_ is also pretty straightforward: call `MemNode::split_through_phi` to, well, split AddP->Load* through the merge which will render the Phi useless. > > I tested this with JTREG tests tier 1-4 (Windows, Linux, and Mac) and didn't see regression. I also experimented with several applications and didn't see any failure. I also ran tests with "-ea -esa -Xbatch -Xcomp -XX:+UnlockExperimentalVMOptions -XX:-TieredCompilation -server -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions -XX:+StressLCM -XX:+StressGCM -XX:+StressCCP" and didn't observe any related failures. Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Address PR review 6: debug format output & some refactoring. 
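[Editorial sketch, not part of the original message.] For readers unfamiliar with the term, an "allocation merge" is the situation where two or more allocations flow into one reference at a control-flow join, which a compiler IR represents as a Phi. The Java below shows the source-level shape of the "base for field loads" case the description mentions, together with a hand-written equivalent of the load after it has been split through the merge. All names are hypothetical, and whether C2 actually scalar-replaces a given method depends on inlining, escape analysis results, and VM flags.

```java
public class AllocationMergeSketch {
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Two allocations merge at the join point after the if/else. In a
    // compiler IR the merged reference is a Phi; neither object escapes
    // this method.
    static int loadThroughMerge(boolean cond, int a, int b) {
        Point p;
        if (cond) {
            p = new Point(a, b);
        } else {
            p = new Point(b, a);
        }
        return p.x; // field load whose base is the merged reference
    }

    // Hand-written equivalent once the field load is split through the
    // merge: each branch supplies the field value directly, so no Point
    // object is needed at all.
    static int loadSplitByHand(boolean cond, int a, int b) {
        int x;
        if (cond) {
            x = a;
        } else {
            x = b;
        }
        return x;
    }

    public static void main(String[] args) {
        System.out.println(loadThroughMerge(true, 1, 2));  // 1
        System.out.println(loadThroughMerge(false, 1, 2)); // 2
        System.out.println(loadSplitByHand(false, 1, 2));  // 2
    }
}
```

The two methods return the same value for every input, which is the property the optimization has to preserve when it removes the allocations.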
------------- Changes: - all: https://git.openjdk.org/jdk/pull/12897/files - new: https://git.openjdk.org/jdk/pull/12897/files/8f81a7c8..3a5ed401 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12897&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12897&range=13-14 Stats: 112 lines in 6 files changed: 37 ins; 59 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/12897.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/12897/head:pull/12897 PR: https://git.openjdk.org/jdk/pull/12897 From mdoerr at openjdk.org Tue Jun 6 15:16:52 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 6 Jun 2023 15:16:52 GMT Subject: RFR: 8309543: Micro-optimize x86 assembler UseCondCardMark In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 14:24:14 GMT, Aleksey Shipilev wrote: > Noticed this while explaining a related code: there is no need to make a full jump, and a short jump would suffice. Assembler does not know about this shortening, because it is a forward branch. > > Makes a slightly more compact interpreter code. Looks good. I side note: I wonder if it makes any sense to use conditional card marking in the interpreter. I don't think it's fast enough to cause trouble in the processor's memory subsystem by executing oop accesses. Maybe shorter code, less complexity and fewer branches would be a better choice for the interpreter? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14335#issuecomment-1578957992 From kvn at openjdk.org Tue Jun 6 15:26:54 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 6 Jun 2023 15:26:54 GMT Subject: RFR: 8309543: Micro-optimize x86 assembler UseCondCardMark In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 14:24:14 GMT, Aleksey Shipilev wrote: > Noticed this while explaining a related code: there is no need to make a full jump, and a short jump would suffice. Assembler does not know about this shortening, because it is a forward branch. 
> > Makes a slightly more compact interpreter code. Good. ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14335#pullrequestreview-1465482787 From shade at openjdk.org Tue Jun 6 15:30:55 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 6 Jun 2023 15:30:55 GMT Subject: RFR: 8309543: Micro-optimize x86 assembler UseCondCardMark In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 15:14:20 GMT, Martin Doerr wrote: > Looks good. I side note: I wonder if it makes any sense to use conditional card marking in the interpreter. I don't think it's fast enough to cause trouble in the processor's memory subsystem by executing oop accesses. Maybe shorter code, less complexity and fewer branches would be a better choice for the interpreter? We actually went for conditional card marks in interpreter in https://bugs.openjdk.org/browse/JDK-8078438, because we wanted to avoid interaction with the warm code :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14335#issuecomment-1578985608 From wkemper at openjdk.org Tue Jun 6 15:44:15 2023 From: wkemper at openjdk.org (William Kemper) Date: Tue, 6 Jun 2023 15:44:15 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: References: Message-ID: On Sun, 4 Jun 2023 21:39:58 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. 
Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.
> > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Remove three asserts making comparisons between atomic volatile variables > > Though changes to the volatile variables are individually protected by > Atomic load and store operations, these asserts were not assuring > atomic access to multiple volatile variables, each of which could be > modified independently of the others. The asserts were therefore not > trustworthy, as has been confirmed by more extensive testing. Lowering `ShenandoahAdaptiveDecayFactor` allows the heuristic to give more weight to older samples of the allocation rate and cycle times. We found that with the original value (0.5), the heuristics would "forget" history too soon. With the original value, the heuristics were more likely to mistime their trigger because of a few recent, short cycles. This was particularly true after we lowered `ShenandoahImmediateThreshold`, which resulted in more cycles which could skip evacuation and updating refs. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1579008730 From alanb at openjdk.org Tue Jun 6 15:45:57 2023 From: alanb at openjdk.org (Alan Bateman) Date: Tue, 6 Jun 2023 15:45:57 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v2] In-Reply-To: References: Message-ID: On Mon, 5 Jun 2023 21:26:39 GMT, Serguei Spitsyn wrote: > Okay, I see your point. Unfortunately, I've always referred to the platform thread with an executed FJP scheduler as a carrier thread. The term 'carrier' with this meaning is everywhere in the JVMTI code. It looks very confusing to call a thread to be a carrier thread only during some phases of its execution. Okay, I'm just pointing out that is_passive_carrier_thread looks a bit strange here, as it is testing whether a JavaThread is carrying a virtual thread oop - it's not testing whether the thread is owned by the virtual thread scheduler.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1219888219 From jbhateja at openjdk.org Tue Jun 6 16:12:03 2023 From: jbhateja at openjdk.org (Jatin Bhateja) Date: Tue, 6 Jun 2023 16:12:03 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v10] In-Reply-To: References: Message-ID: On Mon, 5 Jun 2023 23:48:21 GMT, Scott Gibbons wrote: >> Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem. >> >> Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). >> >> Old: >> gcc-12.2.1-4.fc36.x86_64 >> 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix >> JVM version: 21-internal >> Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 >> Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 >> Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 >> Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 >> Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 >> Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 >> Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 >> Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 >> Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 >> Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 >> New: >> JVM version: 21-internal (float) >> Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42 >> Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 >> Iteration 2 regression case Took : 17 noMod case took: 19 noPower 
case took: 17 >> Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 >> Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 >> Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 >> Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Fix tests; need vlbwdq for vpbroadcastq Marked as reviewed by jbhateja (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14224#pullrequestreview-1465575042 From sviswanathan at openjdk.org Tue Jun 6 16:15:01 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Tue, 6 Jun 2023 16:15:01 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v10] In-Reply-To: References: Message-ID: On Mon, 5 Jun 2023 23:48:21 GMT, Scott Gibbons wrote: >> Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem. >> >> Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). 
>> >> Old: >> gcc-12.2.1-4.fc36.x86_64 >> 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix >> JVM version: 21-internal >> Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 >> Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 >> Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 >> Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 >> Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 >> Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 >> Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 >> Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 >> Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 >> Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 >> New: >> JVM version: 21-internal (float) >> Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42 >> Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 >> Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17 >> Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 >> Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 >> Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 >> Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Fix tests; need vlbwdq for vpbroadcastq The changes look 
good to me as well. ------------- Marked as reviewed by sviswanathan (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14224#pullrequestreview-1465580781 From mdoerr at openjdk.org Tue Jun 6 16:26:21 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 6 Jun 2023 16:26:21 GMT Subject: RFR: 8309543: Micro-optimize x86 assembler UseCondCardMark In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 14:24:14 GMT, Aleksey Shipilev wrote: > Noticed this while explaining a related code: there is no need to make a full jump, and a short jump would suffice. Assembler does not know about this shortening, because it is a forward branch. > > Makes a slightly more compact interpreter code. Marked as reviewed by mdoerr (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14335#pullrequestreview-1465600814 From mdoerr at openjdk.org Tue Jun 6 16:26:31 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Tue, 6 Jun 2023 16:26:31 GMT Subject: RFR: 8309543: Micro-optimize x86 assembler UseCondCardMark In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 15:28:25 GMT, Aleksey Shipilev wrote: > > Looks good. A side note: I wonder if it makes any sense to use conditional card marking in the interpreter. I don't think it's fast enough to cause trouble in the processor's memory subsystem by executing oop accesses. Maybe shorter code, less complexity and fewer branches would be a better choice for the interpreter? > > We actually went for conditional card marks in interpreter in https://bugs.openjdk.org/browse/JDK-8078438, because we wanted to avoid interaction with the warm code :) Ok. Thanks for the pointer. We don't have it for all platforms, but I don't think it's important enough.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14335#issuecomment-1579082351 From dholmes at openjdk.org Tue Jun 6 16:28:16 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 6 Jun 2023 16:28:16 GMT Subject: RFR: 8308751: Create new switch to print error reporting output to both hs_err_pid file and stdout/stderr In-Reply-To: References: Message-ID: On Wed, 24 May 2023 08:15:29 GMT, Masanori Yano wrote: > I think it makes sense to add the ErrorFileWithStdout and ErrorFileWithStderr for troubleshooting. > I would appreciate it if someone could review it. I have my doubts about the usefulness of this. The hs_err content can be very large (and only gets bigger over time) and can easily overwhelm any observed "screen". If stdout and stderr are being captured by another utility, such as jtreg when running tests, then again we can easily hit buffer limits and actually lose useful information in the captured log file for stdout/stderr. While the implementation is relatively simple I do not like adding yet another pair of flags in this area - and a CSR request is needed to do this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14114#issuecomment-1562126987 From cslucas at openjdk.org Tue Jun 6 16:51:04 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 6 Jun 2023 16:51:04 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v16] In-Reply-To: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: > Can I please get reviews for this PR? > > The most common and frequent use of NonEscaping Phis merging object allocations is for debugging information. The two graphs below show numbers for Renaissance and DaCapo benchmarks - similar results are obtained for all other applications that I tested.
> > With what frequency does each IR node type occur as an allocation merge user? I.e., if the same node type uses a Phi N times the counter is incremented by N: > > ![image](https://user-images.githubusercontent.com/2249648/222280517-4dcf5871-2564-4207-b49e-22aee47fa49d.png) > > What are the most common users of allocation merges? I.e., if the same node type uses a Phi N times the counter is incremented by 1: > > ![image](https://user-images.githubusercontent.com/2249648/222280608-ca742a4e-1622-4e69-a778-e4db6805ea02.png) > > This PR adds support for scalar replacing allocations participating in merges used as debug information OR as a base for field loads. I plan to create subsequent PRs to enable scalar replacement of merges used by other node types (CmpP is next on the list). > > The approach I used for _rematerialization_ is pretty straightforward. It consists basically of the following. 1) New IR node (suggested by V. Kozlov), named SafePointScalarMergeNode, to represent a set of SafePointScalarObjectNode; 2) Each scalar replaceable input participating in a merge will get a SafePointScalarObjectNode as if it weren't part of a merge. 3) Add a new Class to support the rematerialization of SR objects that are part of a merge; 4) Patch HotSpot to be able to serialize and deserialize debug information related to allocation merges; 5) Patch C2 to generate unique types for SR objects participating in some allocation merges. > > The approach I used for _enabling the scalar replacement of some of the inputs of the allocation merge_ is also pretty straightforward: call `MemNode::split_through_phi` to, well, split AddP->Load* through the merge which will render the Phi useless. > > I tested this with JTREG tests tier 1-4 (Windows, Linux, and Mac) and didn't see any regressions. I also experimented with several applications and didn't see any failure.
I also ran tests with "-ea -esa -Xbatch -Xcomp -XX:+UnlockExperimentalVMOptions -XX:-TieredCompilation -server -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions -XX:+StressLCM -XX:+StressGCM -XX:+StressCCP" and didn't observe any related failures. Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 17 commits: - Merge remote-tracking branch 'origin/master' into rematerialization-of-merges Catching up with master. - Address PR review 6: debug format output & some refactoring. - Catching up with master branch. Merge remote-tracking branch 'origin/master' into rematerialization-of-merges - Address PR review 6: refactoring around rematerialization & improve test cases. - Address PR review 5: refactor on rematerialization & add tests. - Merge remote-tracking branch 'origin/master' into rematerialization-of-merges - Address part of PR review 4 & fix a bug setting only_candidate - Catching up with master Merge remote-tracking branch 'origin/master' into rematerialization-of-merges - Fix tests. Remember previous reducible Phis. - Address PR review 3. Some comments and be able to abort compilation. - ... 
and 7 more: https://git.openjdk.org/jdk/compare/ca6f07f9...cb0b6702 ------------- Changes: https://git.openjdk.org/jdk/pull/12897/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=12897&range=15 Stats: 2741 lines in 26 files changed: 2486 ins; 113 del; 142 mod Patch: https://git.openjdk.org/jdk/pull/12897.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/12897/head:pull/12897 PR: https://git.openjdk.org/jdk/pull/12897 From psandoz at openjdk.org Tue Jun 6 17:16:36 2023 From: psandoz at openjdk.org (Paul Sandoz) Date: Tue, 6 Jun 2023 17:16:36 GMT Subject: RFR: 8306647: Implementation of Structured Concurrency (Preview) [v5] In-Reply-To: References: <6gZZEoP1WXdBcZUiL5890eNsgaRFzZNY_rBItZdXtNc=.5d8f7bd9-44d5-4074-8a5c-35f8203263b2@github.com> Message-ID: On Tue, 6 Jun 2023 07:13:15 GMT, Alan Bateman wrote: >> This is the implementation of: >> >> - JEP 453: Structured Concurrency (Preview) >> - JEP 446: Scoped Values (Preview) >> >> For the most part, this is just moving code and tests. StructuredTaskScope moves to j.u.concurrent as a preview API, ScopedValue moves to j.lang as a preview API, and module jdk.incubator.concurrent has been removed. The significant API changes since incubator are: >> >> - StructuredTaskScope.fork returns Subtask instead of Future (JEP 453 has a section on this) >> - ScopedValue.where methods are replaced with runWhere, callWhere and getWhere > > Alan Bateman has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 18 commits: > > - Fix typo in javadoc > - Merge > - Merge > - Sync up from loom repo > - Merge > - Sync with loom repo, re-work ScopedValue class description > - Sync up from loom repo > - Remove csm.Threads > - Merge > - Test should not be in update for main line > - ... and 8 more: https://git.openjdk.org/jdk/compare/2e9eff56...0f514588 Marked as reviewed by psandoz (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/13932#pullrequestreview-1465700542 From jwaters at openjdk.org Tue Jun 6 17:42:17 2023 From: jwaters at openjdk.org (Julian Waters) Date: Tue, 6 Jun 2023 17:42:17 GMT Subject: RFR: 8250269: Replace ATTRIBUTE_ALIGNED with alignas [v16] In-Reply-To: References: <9QKV9cYFTo_1D8R-mI80lnewNkA0ceJNKFPbrvICxl4=.d6736b76-8324-4084-bede-6e144b4f6c04@github.com> Message-ID: On Sat, 3 Jun 2023 13:45:21 GMT, Julian Waters wrote: >> C++11 added the alignas attribute, for the purpose of specifying alignment on types, much like compiler specific syntax such as gcc's __attribute__((aligned(x))) or Visual C++'s __declspec(align(x)). >> >> We can phase out the use of the macro in favor of the standard attribute. In the meantime, we can replace the compiler specific definitions of ATTRIBUTE_ALIGNED with a portable definition. We might deprecate the use of the macro but changing its implementation quickly and cleanly applies the feature where the macro is being used. >> >> Note: With certain parts of HotSpot using ATTRIBUTE_ALIGNED so indiscriminately, this commit will likely take some time to get right >> >> This will require adding the alignas attribute to the list of language features approved for use in HotSpot code. (Completed with [8297912](https://github.com/openjdk/jdk/pull/11446)) > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: > > - Merge branch 'master' into alignas > - Merge branch 'openjdk:master' into alignas > - alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - ... 
and 6 more: https://git.openjdk.org/jdk/compare/6edd786b...48d816d7 Bumping ------------- PR Comment: https://git.openjdk.org/jdk/pull/11431#issuecomment-1579193565 From stuefe at openjdk.org Tue Jun 6 17:59:56 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 6 Jun 2023 17:59:56 GMT Subject: RFR: 8308751: Create new switch to print error reporting output to both hs_err_pid file and stdout/stderr [v3] In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 11:04:12 GMT, Masanori Yano wrote: >> I think it makes sense to add the ErrorFileWithStdout and ErrorFileWithStderr for troubleshooting. >> I would appreciate it if someone could review it. > > Masanori Yano has updated the pull request incrementally with one additional commit since the last revision: > > 8308751: Create new switch to print error reporting output to both hs_err_pid file and stdout/stderr > /csr needed > > I have my doubts about the usefulness of this. I agree. This is nothing a simple `tee` couldn't solve. We've long departed from the Unix line of "do one thing and do it well", but there is a limit to how much functionality we should heap onto the JVM. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14114#issuecomment-1579218843 From sviswanathan at openjdk.org Tue Jun 6 18:09:02 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Tue, 6 Jun 2023 18:09:02 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v10] In-Reply-To: References: Message-ID: <0lQJvljjXjPCoK8TAVG2wNevqMuErq_tBTsDct7jvuI=.157e6338-4203-4857-9d51-30a6f0ab5083@github.com> On Mon, 5 Jun 2023 23:48:21 GMT, Scott Gibbons wrote: >> Add an intrinsic for x86 AVX and AVX512 fmod. This addresses both a performance regression and acceleration of the floating point remainder operation (fmod / frem). Also addresses dmod / drem.
>> >> Performance has increased an average of ~4x as indicated by the benchmark included with [JDK-8302191](https://bugs.openjdk.org/browse/JDK-8302191). >> >> Old: >> gcc-12.2.1-4.fc36.x86_64 >> 3db352d003c5996a5f86f0f465adf86326f7e1fe openjdk21 + fix >> JVM version: 21-internal >> Iteration 0 regression case Took : 89 noMod case took: 39 noPower case took: 68 >> Iteration 1 regression case Took : 86 noMod case took: 39 noPower case took: 67 >> Iteration 2 regression case Took : 41 noMod case took: 39 noPower case took: 70 >> Iteration 3 regression case Took : 41 noMod case took: 39 noPower case took: 69 >> Iteration 4 regression case Took : 40 noMod case took: 39 noPower case took: 44 >> Iteration 5 regression case Took : 47 noMod case took: 39 noPower case took: 40 >> Iteration 6 regression case Took : 41 noMod case took: 39 noPower case took: 40 >> Iteration 7 regression case Took : 40 noMod case took: 39 noPower case took: 40 >> Iteration 8 regression case Took : 41 noMod case took: 38 noPower case took: 41 >> Iteration 9 regression case Took : 40 noMod case took: 39 noPower case took: 40 >> New: >> JVM version: 21-internal (float) >> Iteration 0 regression case Took : 24 noMod case took: 11 noPower case took: 42 >> Iteration 1 regression case Took : 35 noMod case took: 22 noPower case took: 27 >> Iteration 2 regression case Took : 17 noMod case took: 19 noPower case took: 17 >> Iteration 3 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 4 regression case Took : 17 noMod case took: 3 noPower case took: 17 >> Iteration 5 regression case Took : 16 noMod case took: 3 noPower case took: 17 >> Iteration 6 regression case Took : 16 noMod case took: 3 noPower case took: 17 >> Iteration 7 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 8 regression case Took : 17 noMod case took: 3 noPower case took: 16 >> Iteration 9 regression case Took : 17 noMod case took: 3 noPower case took: 17 > > Scott Gibbons has 
updated the pull request incrementally with one additional commit since the last revision: > > Fix tests; need vlbwdq for vpbroadcastq @TobiHartmann @vnkozlov Please advise if we could go ahead and integrate this PR from Scott. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14224#issuecomment-1579230349 From shade at openjdk.org Tue Jun 6 18:28:55 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 6 Jun 2023 18:28:55 GMT Subject: RFR: 8309543: Micro-optimize x86 assembler UseCondCardMark In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 14:24:14 GMT, Aleksey Shipilev wrote: > Noticed this while explaining a related code: there is no need to make a full jump, and a short jump would suffice. Assembler does not know about this shortening, because it is a forward branch. > > Makes a slightly more compact interpreter code. Trivial, right? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14335#issuecomment-1579255057 From duke at openjdk.org Tue Jun 6 18:48:14 2023 From: duke at openjdk.org (Christine Flood) Date: Tue, 6 Jun 2023 18:48:14 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: References: Message-ID: <5kABzdo1H-Hs5ncz0KFnNH_9Ae0Z7hwPM8Qigg_0bjU=.b42d954e-5b69-4e77-82d8-217b6acb3e43@github.com> On Sun, 4 Jun 2023 21:39:58 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. 
Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.
> > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Remove three asserts making comparisons between atomic volatile variables > > Though changes to the volatile variables are individually protected by > Atomic load and store operations, these asserts were not assuring > atomic access to multiple volatile variables, each of which could be > modified independently of the others. The asserts were therefore not > trustworthy, as has been confirmed by more extensive testing. I wrote an LRU program back in 2017 which allocates trees and stores them in an array in a round-robin fashion, freeing the last allocated. At the time this was written, its purpose was to show how generational GCs can hit the wall and start performing very badly. I ran this on a clean openjdk build, a genshen build in generational mode and a genshen build in non-generational mode. These results are repeatable for me. I would like to understand where the degradation is coming from before moving forward with this patch since it appears to penalize those who wish to just run traditional Shenandoah.
Clean
[cflood at fedora java_programs]$ ~/genshen/cleanjdk/build/linux-x86_64-server-release/images/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC LRU 1000 1000
Took 341892ms to allocate 1000000 trees in a cache of 1000

Genshen generational (we expect this to be bad)
[cflood at fedora java_programs]$ ~/genshen/jdk/build/linux-x86_64-server-release/images/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -XX:ShenandoahGCMode=generational LRU 1000 1000
Took 442012ms to allocate 1000000 trees in a cache of 1000

Genshen non-generational (shows what I feel is a significant degradation from the clean build)
[cflood at fedora java_programs]$ ~/genshen/jdk/build/linux-x86_64-server-release/images/jdk/bin/java -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC LRU 1000 1000
Took 395679ms to allocate 1000000 trees in a cache of 1000

I think that generational Shenandoah can be a big win for some applications, but I want to fully understand the cost for all applications. I can't attach a .java file so here it is inline in the post.
class TreeNode {
    public TreeNode left, right;
    public int val;
}

public class LRU {
    static int cache_size;
    static int reps;
    static int tree_height=16;
    private static TreeNode[] trees;

    private static int getIndex(int i) {return i % cache_size;}

    private static TreeNode makeTree(int h) {
        if (h == 0) { return null;}
        else {
            TreeNode res = new TreeNode();
            res.left = makeTree(h - 1);
            res.right = makeTree(h - 1);
            res.val = h;
            return res;
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("LRU requires args: cache_size reps");
            return;
        }
        cache_size = Integer.parseInt(args[0]);
        reps = Integer.parseInt(args[1]) * cache_size;
        trees = new TreeNode[cache_size];
        long start = System.currentTimeMillis();
        for (int i = 0; i < reps; i++)
            trees[getIndex(i)] = makeTree(tree_height);
        long end = System.currentTimeMillis();
        long ms = end - start;
        System.out.println("Took " + ms + "ms to allocate " + reps + " trees in a cache of " + cache_size);
    }
}

------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1579278537 From phh at openjdk.org Tue Jun 6 19:51:15 2023 From: phh at openjdk.org (Paul Hohensee) Date: Tue, 6 Jun 2023 19:51:15 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: References: Message-ID: On Sun, 4 Jun 2023 21:39:58 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1.
Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.
> > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Remove three asserts making comparisons between atomic volatile variables > > Though changes to the volatile variables are individually protected by > Atomic load and store operations, these asserts were not assuring > atomic access to multiple volatile variables, each of which could be > modified independently of the others. The asserts were therefore not > trustworthy, as has been confirmed by more extensive testing. Thanks for finding the single-gen regression, we're very happy you took the time to run it and write up your results. We're very concerned about single-gen regressions too because we have single-gen Shen in production for several critical services. We'd like to propose to push now, and tackle/fix the single-gen issue you identified during RDP1, as well as any other significant single-gen regressions that may come up. We have four Shen experts on board, Roman, Aleksey, Kelvin, and William, so believe it's doable before RDP2 in July. In the worst case that we fail, we'd emulate ZGC and move GenShen to its own directory as an entirely separate collector before RDP2. Make sense? ------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1579351993 From aph at openjdk.org Tue Jun 6 19:59:13 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 6 Jun 2023 19:59:13 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: References: Message-ID: On Sun, 4 Jun 2023 21:39:58 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`.
The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.
> > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Remove three asserts making comparisons between atomic volatile variables > > Though changes to the volatile variables are individually protected by > Atomic load and store operations, these asserts were not assuring > atomic access to multiple volatile variables, each of which could be > modified independently of the others. The asserts were therefore not > trustworthy, as has been confirmed by more extensive testing. Marked as reviewed by aph (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14185#pullrequestreview-1465971078 From aph at openjdk.org Tue Jun 6 19:59:14 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 6 Jun 2023 19:59:14 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 19:48:05 GMT, Paul Hohensee wrote: > We'd like to propose to push now, and tackle/fix the single-gen issue you identified during RDP1, as well as any other significant single-gen regressions that may come up. We have four Shen experts on board, Roman, Aleksey, Kelvin, and William, so believe it's doable before RDP2 in July. In the worst case that we fail, we'd emulate ZGC and move GenShen to its own directory as an entirely separate collector before RDP2. Make sense? That sounds great to me. I'll approve this PR now, but please wait for Christine's ack.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1579360356 From duke at openjdk.org Tue Jun 6 20:04:22 2023 From: duke at openjdk.org (Christine Flood) Date: Tue, 6 Jun 2023 20:04:22 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: References: Message-ID: <7VPTUAp0zrlWrCGVgmWqkw5dXda3-1UPA8fcTEW6KoA=.9188218c-7216-4b74-b467-8dab75131578@github.com> On Sun, 4 Jun 2023 21:39:58 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. 
These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the ?survivor? regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the ?experimental? qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Remove three asserts making comparisons between atomic volatile variables > > Though changes to the volatile variables are individually protected by > Atomic load and store operations, these asserts were not assuring > atomic access to multiple volatile variables, each of which could be > modified independently of the others. The asserts were therefore not > trustworthy, as has been confirmed by more extensive testing. This sounds good to me. On Tue, Jun 6, 2023 at 3:55?PM Andrew Haley ***@***.***> wrote: > We'd like to propose to push now, and tackle/fix the single-gen issue you > identified during RDP1, as well as any other significant single-gen > regressions that may come up. We have four Shen experts on board, Roman, > Aleksey, Kelvin, and William, so believe it's doable before RDP2 in July. > In the worst case that we fail, we'd emulate ZGC and move GenShen to it's > own directory as an entirely separate collector before RDP2. Make sense? > > That sounds great to me. I'll approve this PR now, but please wait for > Christine's ack. > > ? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1579367147 From mr at openjdk.org Tue Jun 6 20:19:12 2023 From: mr at openjdk.org (Mark Reinhold) Date: Tue, 6 Jun 2023 20:19:12 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: <7VPTUAp0zrlWrCGVgmWqkw5dXda3-1UPA8fcTEW6KoA=.9188218c-7216-4b74-b467-8dab75131578@github.com> References: <7VPTUAp0zrlWrCGVgmWqkw5dXda3-1UPA8fcTEW6KoA=.9188218c-7216-4b74-b467-8dab75131578@github.com> Message-ID: On Tue, 6 Jun 2023 20:01:02 GMT, Christine Flood wrote: > We'd like to propose to push now, and tackle/fix the single-gen issue you identified during RDP1, as well as any other significant single-gen regressions that may come up. We have four Shen experts on board, Roman, Aleksey, Kelvin, and William, so believe it's doable before RDP2 in July. In the worst case that we fail, we'd emulate ZGC and move GenShen to its own directory as an entirely separate collector before RDP2. Make sense? Unsolicited advice: If you're planning for this amount of change during RDP 1 then I'd say that you're not ready for RDP 1. If this patch were less isolated from the rest of HotSpot then I'd be extremely nervous. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1579385819 From phh at openjdk.org Tue Jun 6 20:59:12 2023 From: phh at openjdk.org (Paul Hohensee) Date: Tue, 6 Jun 2023 20:59:12 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v5] In-Reply-To: References: Message-ID: <3GigY7Hg2Clm2MqFsQnCWHxLB77_d-WM2c1kOi6wavA=.cc31a2e4-424d-48d1-9016-c409be4790c3@github.com> On Sun, 4 Jun 2023 21:39:58 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. 
These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Remove three asserts making comparisons between atomic volatile variables > > Though changes to the volatile variables are individually protected by > Atomic load and store operations, these asserts were not assuring > atomic access to multiple volatile variables, each of which could be > modified independently of the others. The asserts were therefore not > trustworthy, as has been confirmed by more extensive testing. We understand, and would not have proposed the last-chance split-directory alternative without the level of isolation HotSpot's GC interface enables. We've added single-gen performance enhancements along the way and would like to keep them! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1579439750 From kdnilsen at openjdk.org Tue Jun 6 21:12:41 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 6 Jun 2023 21:12:41 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v6] In-Reply-To: References: Message-ID: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. 
If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Remove an inappropriate copyright notice ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/8d80780a..9811d2aa Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=04-05 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From cjplummer at openjdk.org Tue Jun 6 21:15:55 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 6 Jun 2023 21:15:55 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread [v3] In-Reply-To: References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: On Tue, 6 Jun 2023 13:37:03 GMT, Serguei Spitsyn wrote: >> The `GetThreadListStackTraces` returns `JVMTI_THREAD_STATE_RUNNABLE` for a VirtualThread blocked on a monitor when called for more than one thread. 
When called for a single VirtualThread it correctly returns a state that includes the `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` flag. >> The `VM_GetThreadListStackTraces::doit` should call the `get_threadOop_and_JavaThread` instead of `cv_external_thread_to_JavaThread`. But the `get_threadOop_and_JavaThread` has a check for the current thread by comparing with the JavaThread::current() which does not work for a `VM_op`. Some refactoring of the `get_threadOop_and_JavaThread` was made to make it working for a `VM_op`. >> Also, a minor bug in the `GetSingleStackTraceClosure::do_thread()` was discovered during testing. >> >> The list of changes is: >> - minor refactoring of the function`get_threadOop_and_JavaThread`: added an overloaded version of this function with the extra parameter `JavaThread* cur_thread`. It is called instead of `JvmtiExport::cv_external_thread_to_JavaThread` from the `VM_GetThreadListStackTraces::doit`. >> - `GetSingleStackTraceClosure::do_thread()`: The use of `jt->threadObj()` is replaced with the `JNIHandles::resolve_external_guard(_jthread)`. >> - added new test to provide needed coverage: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` >> >> Testing: >> - ran new test: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` >> - TBD: tiers 1-6 (all are good) > > Serguei Spitsyn has updated the pull request incrementally with two additional commits since the last revision: > > - fixed typo in a comment in jvmtiEnvBase.cpp > - nit: restored one comment as was before Changes requested by cjplummer (Reviewer). test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest/ThreadListStackTracesTest.java line 63: > 61: public void run() { > 62: log("TestTask.run()"); > 63: } I think this should be an abstract method. 
test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest/ThreadListStackTracesTest.java line 106: > 104: final Thread.State expState = Thread.State.WAITING; > 105: reentrantLock.lock(); > 106: String name = "ObjectMonitorTestTask"; Should be "ReentrantLockTestTask" test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest/libThreadListStackTracesTest.cpp line 35: > 33: extern "C" { > 34: > 35: JNIEXPORT jint JNICALL Java_ThreadListStackTracesTest_getStateSingle(JNIEnv* jni, jclass clazz, jthread vthread) { I'd suggest splitting into 2 lines just like Java_ThreadListStackTracesTest_getStateMultiple() for the sake of consistency and being able to more easily compare the two. ------------- PR Review: https://git.openjdk.org/jdk/pull/14326#pullrequestreview-1466112714 PR Review Comment: https://git.openjdk.org/jdk/pull/14326#discussion_r1220365810 PR Review Comment: https://git.openjdk.org/jdk/pull/14326#discussion_r1220367970 PR Review Comment: https://git.openjdk.org/jdk/pull/14326#discussion_r1220361274 From amenkov at openjdk.org Tue Jun 6 21:25:56 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 6 Jun 2023 21:25:56 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v2] In-Reply-To: <5-_kURDcYKd5WYKi9B331c8h6okVmGfvjdy_xqi1UqU=.012c360b-1b57-4773-b64b-d727dbf4daeb@github.com> References: <5-_kURDcYKd5WYKi9B331c8h6okVmGfvjdy_xqi1UqU=.012c360b-1b57-4773-b64b-d727dbf4daeb@github.com> Message-ID: On Mon, 5 Jun 2023 19:00:49 GMT, Serguei Spitsyn wrote: >> When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY. 
>> The fix adds: >> - a special case for passive carrier threads >> - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. >> >> Testing: >> - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` >> - submitted mach5 tiers 1-5 >> - TBD: to submit mach5 tier 6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge > - minor tweaks in libThreadStateTest.cpp > - 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING src/hotspot/share/prims/jvmtiEnvBase.cpp line 764: > 762: > 763: if (is_passive_carrier_thread(jt, thread_oop)) { > 764: state |= (JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY); Not sure I understand this. I'd expect `JVMTI_THREAD_STATE_ALIVE | JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY` to be returned in the case. How can a thread be JVMTI_THREAD_STATE_RUNNABLE and JVMTI_THREAD_STATE_WAITING at the same time? 
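A minimal sketch of the concern raised in the review comment above, assuming that the base state computed for a carrier with a mounted virtual thread is `ALIVE | RUNNABLE`. The bit values are the ones published in the JVMTI specification; the three helper functions are hypothetical stand-ins and are not HotSpot code. OR-ing the waiting bits onto that base state leaves `RUNNABLE` set, whereas returning a fixed mask does not:

```cpp
#include <cstdint>

// Thread state bits as defined by the JVMTI specification (jvmti.h).
constexpr std::uint32_t JVMTI_THREAD_STATE_ALIVE                = 0x0001;
constexpr std::uint32_t JVMTI_THREAD_STATE_RUNNABLE             = 0x0004;
constexpr std::uint32_t JVMTI_THREAD_STATE_WAITING_INDEFINITELY = 0x0010;
constexpr std::uint32_t JVMTI_THREAD_STATE_WAITING              = 0x0080;

// Hypothetical stand-in for the base state of a running JavaThread
// (assumption: a carrier with a mounted virtual thread looks runnable).
constexpr std::uint32_t base_state_of_running_thread() {
  return JVMTI_THREAD_STATE_ALIVE | JVMTI_THREAD_STATE_RUNNABLE;
}

// Variant questioned in the review: add the waiting bits to the base state.
constexpr std::uint32_t carrier_state_or_onto_base() {
  return base_state_of_running_thread() |
         JVMTI_THREAD_STATE_WAITING |
         JVMTI_THREAD_STATE_WAITING_INDEFINITELY;
}

// Variant the reviewer expected: a fixed mask without RUNNABLE.
constexpr std::uint32_t carrier_state_fixed() {
  return JVMTI_THREAD_STATE_ALIVE |
         JVMTI_THREAD_STATE_WAITING |
         JVMTI_THREAD_STATE_WAITING_INDEFINITELY;
}

// A state is contradictory if it claims RUNNABLE and WAITING together.
constexpr bool is_contradictory(std::uint32_t state) {
  return (state & JVMTI_THREAD_STATE_RUNNABLE) != 0 &&
         (state & JVMTI_THREAD_STATE_WAITING) != 0;
}

static_assert(is_contradictory(carrier_state_or_onto_base()),
              "OR-ing onto a runnable base keeps the RUNNABLE bit");
static_assert(!is_contradictory(carrier_state_fixed()),
              "the fixed mask carries no RUNNABLE bit");
```

Under that assumption, the fixed-mask variant evaluates to `0x0091`, which is the combination the reviewer expected.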
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1220384120 From sspitsyn at openjdk.org Tue Jun 6 21:32:02 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 21:32:02 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread [v3] In-Reply-To: References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: On Tue, 6 Jun 2023 21:07:46 GMT, Chris Plummer wrote: >> Serguei Spitsyn has updated the pull request incrementally with two additional commits since the last revision: >> >> - fixed typo in a comment in jvmtiEnvBase.cpp >> - nit: restored one comment as was before > > test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest/ThreadListStackTracesTest.java line 63: > >> 61: public void run() { >> 62: log("TestTask.run()"); >> 63: } > > I think this should be an abstract method. Thanks. Fixed now. > test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest/ThreadListStackTracesTest.java line 106: > >> 104: final Thread.State expState = Thread.State.WAITING; >> 105: reentrantLock.lock(); >> 106: String name = "ObjectMonitorTestTask"; > > Should be "ReentrantLockTestTask" Thanks. Fixed now. > test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest/libThreadListStackTracesTest.cpp line 35: > >> 33: extern "C" { >> 34: >> 35: JNIEXPORT jint JNICALL Java_ThreadListStackTracesTest_getStateSingle(JNIEnv* jni, jclass clazz, jthread vthread) { > > I'd suggest splitting into 2 lines just like > Java_ThreadListStackTracesTest_getStateMultiple() for the sake of consistency and being able to more easily compare the two. Thanks. I've overlooked this. Fixed now. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14326#discussion_r1220390676 PR Review Comment: https://git.openjdk.org/jdk/pull/14326#discussion_r1220392423 PR Review Comment: https://git.openjdk.org/jdk/pull/14326#discussion_r1220387806 From kdnilsen at openjdk.org Tue Jun 6 21:36:44 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 6 Jun 2023 21:36:44 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v7] In-Reply-To: References: Message-ID: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. 
Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Exit during initialization on unsupported platforms ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/9811d2aa..cc149904 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=05-06 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From sspitsyn at openjdk.org Tue Jun 6 21:37:25 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 21:37:25 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread [v4] In-Reply-To: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> References: 
<_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: > The `GetThreadListStackTraces` returns `JVMTI_THREAD_STATE_RUNNABLE` for a VirtualThread blocked on a monitor when called for more than one thread. When called for a single VirtualThread it correctly returns a state that includes the `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` flag. > The `VM_GetThreadListStackTraces::doit` should call the `get_threadOop_and_JavaThread` instead of `cv_external_thread_to_JavaThread`. But the `get_threadOop_and_JavaThread` has a check for the current thread by comparing with the JavaThread::current() which does not work for a `VM_op`. Some refactoring of the `get_threadOop_and_JavaThread` was made to make it working for a `VM_op`. > Also, a minor bug in the `GetSingleStackTraceClosure::do_thread()` was discovered during testing. > > The list of changes is: > - minor refactoring of the function`get_threadOop_and_JavaThread`: added an overloaded version of this function with the extra parameter `JavaThread* cur_thread`. It is called instead of `JvmtiExport::cv_external_thread_to_JavaThread` from the `VM_GetThreadListStackTraces::doit`. > - `GetSingleStackTraceClosure::do_thread()`: The use of `jt->threadObj()` is replaced with the `JNIHandles::resolve_external_guard(_jthread)`. 
> - added new test to provide needed coverage: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` > > Testing: > - ran new test: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` > - TBD: tiers 1-6 (all are good) Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: addressed new test related review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14326/files - new: https://git.openjdk.org/jdk/pull/14326/files/d20e1221..e982f97e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14326&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14326&range=02-03 Stats: 6 lines in 2 files changed: 1 ins; 2 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14326.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14326/head:pull/14326 PR: https://git.openjdk.org/jdk/pull/14326 From kdnilsen at openjdk.org Tue Jun 6 21:46:12 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 6 Jun 2023 21:46:12 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v7] In-Reply-To: References: Message-ID: On Thu, 1 Jun 2023 13:32:49 GMT, Thomas Stuefe wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Exit during initialization on unsupported platforms > > test/hotspot/jtreg/gc/shenandoah/TestEvilSyncBug.java line 33: > >> 31: * @modules java.base/jdk.internal.misc >> 32: * java.management >> 33: * @run driver/timeout=480 TestEvilSyncBug -XX:ShenandoahGCHeuristics=aggressive > > Probably fine, but why this change to non-generational testing? Will aggressive heuristic sharpen the test? We moved this argument from the source code (original line 64) to here so that we can invoke the test differently in generational mode. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1220410522 From cjplummer at openjdk.org Tue Jun 6 21:50:55 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 6 Jun 2023 21:50:55 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread [v4] In-Reply-To: References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: <14FIsTgic9d0rLeA9rAVsfPlBdWnyQkvXKGnYg2j9to=.f5e31a08-e29b-4e3a-ba44-15206f05ff83@github.com> On Tue, 6 Jun 2023 21:37:25 GMT, Serguei Spitsyn wrote: >> The `GetThreadListStackTraces` returns `JVMTI_THREAD_STATE_RUNNABLE` for a VirtualThread blocked on a monitor when called for more than one thread. When called for a single VirtualThread it correctly returns a state that includes the `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` flag. >> The `VM_GetThreadListStackTraces::doit` should call the `get_threadOop_and_JavaThread` instead of `cv_external_thread_to_JavaThread`. But the `get_threadOop_and_JavaThread` has a check for the current thread by comparing with the JavaThread::current() which does not work for a `VM_op`. Some refactoring of the `get_threadOop_and_JavaThread` was made to make it working for a `VM_op`. >> Also, a minor bug in the `GetSingleStackTraceClosure::do_thread()` was discovered during testing. >> >> The list of changes is: >> - minor refactoring of the function`get_threadOop_and_JavaThread`: added an overloaded version of this function with the extra parameter `JavaThread* cur_thread`. It is called instead of `JvmtiExport::cv_external_thread_to_JavaThread` from the `VM_GetThreadListStackTraces::doit`. >> - `GetSingleStackTraceClosure::do_thread()`: The use of `jt->threadObj()` is replaced with the `JNIHandles::resolve_external_guard(_jthread)`. 
>> - added new test to provide needed coverage: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` >> >> Testing: >> - ran new test: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` >> - TBD: tiers 1-6 (all are good) > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > addressed new test related review comments Looks good. ------------- Marked as reviewed by cjplummer (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14326#pullrequestreview-1466172053 From kdnilsen at openjdk.org Tue Jun 6 21:54:14 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 6 Jun 2023 21:54:14 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v7] In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 21:43:36 GMT, Kelvin Nilsen wrote: >> test/hotspot/jtreg/gc/shenandoah/TestEvilSyncBug.java line 33: >> >>> 31: * @modules java.base/jdk.internal.misc >>> 32: * java.management >>> 33: * @run driver/timeout=480 TestEvilSyncBug -XX:ShenandoahGCHeuristics=aggressive >> >> Probably fine, but why this change to non-generational testing? Will aggressive heuristic sharpen the test? > > We moved this argument from the source code (original line 64) to here so that we can invoke the test differently in generational mode. See line 64 of the original source code. We did not change the behavior. We only changed how the behavior is realized. This enables generalization of the test to generational mode. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1220418431 From kdnilsen at openjdk.org Tue Jun 6 21:54:16 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 6 Jun 2023 21:54:16 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v7] In-Reply-To: References: Message-ID: On Thu, 1 Jun 2023 14:25:19 GMT, Thomas Stuefe wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Exit during initialization on unsupported platforms > > test/hotspot/jtreg/gc/shenandoah/oom/TestAllocOutOfMemory.java line 92: > >> 90: expectFailure("-Xmx16m", >> 91: "-XX:+UnlockExperimentalVMOptions", >> 92: "-XX:+UseShenandoahGC", > > Nit: should not need UnlockExperimentalVMOptions anymore. We actually do need `-XX:+UnlockExperimentalVMOptions` here, because generational mode is currently an experimental feature. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1220417069 From sspitsyn at openjdk.org Tue Jun 6 22:07:14 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 22:07:14 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v3] In-Reply-To: References: Message-ID: > When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based on the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY. > The fix adds: > - a special case for passive carrier threads > - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. 
> > Testing: > - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` > - submitted mach5 tiers 1-5 > - TBD: to submit mach5 tier 6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: removed JVMTI_THREAD_STATE_RUNNABLE from a carrier thread state ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14298/files - new: https://git.openjdk.org/jdk/pull/14298/files/e60da02e..b5eb3835 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14298&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14298&range=01-02 Stats: 4 lines in 2 files changed: 2 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14298.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14298/head:pull/14298 PR: https://git.openjdk.org/jdk/pull/14298 From sspitsyn at openjdk.org Tue Jun 6 22:07:18 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 22:07:18 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v2] In-Reply-To: References: <5-_kURDcYKd5WYKi9B331c8h6okVmGfvjdy_xqi1UqU=.012c360b-1b57-4773-b64b-d727dbf4daeb@github.com> Message-ID: On Tue, 6 Jun 2023 21:22:40 GMT, Alex Menkov wrote: >> Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge >> - minor tweaks in libThreadStateTest.cpp >> - 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 764: > >> 762: >> 763: if (is_passive_carrier_thread(jt, thread_oop)) { >> 764: state |= (JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY); > > Not sure I understand this. 
> I'd expect > `JVMTI_THREAD_STATE_ALIVE | JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY` to be returned in the case. > How can a thread be JVMTI_THREAD_STATE_RUNNABLE and JVMTI_THREAD_STATE_WAITING at the same time? Good catch, thanks. Fixed now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1220428448 From sspitsyn at openjdk.org Tue Jun 6 22:09:54 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 22:09:54 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v3] In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 15:43:02 GMT, Alan Bateman wrote: >> Okay, I see your point. Unfortunately, I've always referred to the platform thread executing the FJP scheduler as a carrier thread. The term 'carrier' with this meaning is everywhere in the JVMTI code. It looks very confusing to call a thread a carrier thread only during some phases of its execution. > > Okay, I'm just pointing out that is_passive_carrier_thread looks a bit strange here as it is testing if a JavaThread is carrying a virtual thread oop - it's not testing if the thread is owned by the virtual thread scheduler. I'm still thinking about what identifier to use instead of `is_passive_carrier_thread`. Just `is_carrier_thread` is going to be confusing as well. What about `is_carrier_thread_waiting_for_virtual` or `is_carrier_thread_waiting_for_virtual_to_unmount`? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1220435738 From amenkov at openjdk.org Tue Jun 6 22:20:59 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 6 Jun 2023 22:20:59 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v3] In-Reply-To: References: Message-ID: <96WXnBOucA3CuQDIfxYFBKhHC1_Uo651LphuRFeLgcI=.6cd798f1-2849-4871-94e3-ab9df30f393d@github.com> On Tue, 6 Jun 2023 22:07:14 GMT, Serguei Spitsyn wrote: >> When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY. >> The fix adds: >> - a special case for passive carrier threads >> - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. >> >> Testing: >> - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` >> - submitted mach5 tiers 1-5 >> - TBD: to submit mach5 tier 6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: removed JVMTI_THREAD_STATE_RUNNABLE from a carrier thread state src/hotspot/share/prims/jvmtiEnvBase.cpp line 768: > 766: } > 767: return state; > 768: } You don't need to call get_thread_state_base in case "passive carrier thread": if (is_passive_carrier_thread(jt, thread_oop)) { return JVMTI_THREAD_STATE_ALIVE | JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY; } return get_thread_state_base(thread_oop, jt); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1220447606 From amenkov at openjdk.org Tue Jun 6 22:39:09 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 6 Jun 2023 22:39:09 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread 
[v4] In-Reply-To: References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: <7-6AydLkwjCOkTHA2sI-f7axGSMpCq5havZC5K6qz5U=.93d29276-ab0e-444e-b22c-d4b5f1c9b68b@github.com> On Tue, 6 Jun 2023 21:37:25 GMT, Serguei Spitsyn wrote: >> The `GetThreadListStackTraces` returns `JVMTI_THREAD_STATE_RUNNABLE` for a VirtualThread blocked on a monitor when called for more than one thread. When called for a single VirtualThread it correctly returns a state that includes the `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` flag. >> The `VM_GetThreadListStackTraces::doit` should call the `get_threadOop_and_JavaThread` instead of `cv_external_thread_to_JavaThread`. But the `get_threadOop_and_JavaThread` has a check for the current thread by comparing with the JavaThread::current() which does not work for a `VM_op`. Some refactoring of the `get_threadOop_and_JavaThread` was made to make it working for a `VM_op`. >> Also, a minor bug in the `GetSingleStackTraceClosure::do_thread()` was discovered during testing. >> >> The list of changes is: >> - minor refactoring of the function`get_threadOop_and_JavaThread`: added an overloaded version of this function with the extra parameter `JavaThread* cur_thread`. It is called instead of `JvmtiExport::cv_external_thread_to_JavaThread` from the `VM_GetThreadListStackTraces::doit`. >> - `GetSingleStackTraceClosure::do_thread()`: The use of `jt->threadObj()` is replaced with the `JNIHandles::resolve_external_guard(_jthread)`. >> - added new test to provide needed coverage: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` >> >> Testing: >> - ran new test: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` >> - TBD: tiers 1-6 (all are good) > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > addressed new test related review comments Marked as reviewed by amenkov (Reviewer). 
test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest/ThreadListStackTracesTest.java line 29: > 27: * @summary GetThreadListStackTraces returns wrong state for blocked VirtualThread > 28: * @requires vm.continuations > 29: * @modules java.base/java.lang:+open I think `@modules` is not needed test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest/ThreadListStackTracesTest.java line 34: > 32: > 33: import java.util.concurrent.locks.ReentrantLock; > 34: import java.util.concurrent.*; unused imports ------------- PR Review: https://git.openjdk.org/jdk/pull/14326#pullrequestreview-1466246664 PR Review Comment: https://git.openjdk.org/jdk/pull/14326#discussion_r1220481289 PR Review Comment: https://git.openjdk.org/jdk/pull/14326#discussion_r1220480268 From sspitsyn at openjdk.org Tue Jun 6 22:41:54 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 22:41:54 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: Message-ID: > When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based on the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY. > The fix adds: > - a special case for passive carrier threads > - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. 
> > Testing: > - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` > - submitted mach5 tiers 1-5 > - TBD: to submit mach5 tier 6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: call get_thread_state_base only when needed ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14298/files - new: https://git.openjdk.org/jdk/pull/14298/files/b5eb3835..77771816 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14298&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14298&range=02-03 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14298.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14298/head:pull/14298 PR: https://git.openjdk.org/jdk/pull/14298 From sspitsyn at openjdk.org Tue Jun 6 22:41:58 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 22:41:58 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v3] In-Reply-To: <96WXnBOucA3CuQDIfxYFBKhHC1_Uo651LphuRFeLgcI=.6cd798f1-2849-4871-94e3-ab9df30f393d@github.com> References: <96WXnBOucA3CuQDIfxYFBKhHC1_Uo651LphuRFeLgcI=.6cd798f1-2849-4871-94e3-ab9df30f393d@github.com> Message-ID: On Tue, 6 Jun 2023 22:17:57 GMT, Alex Menkov wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: removed JVMTI_THREAD_STATE_RUNNABLE from a carrier thread state > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 768: > >> 766: } >> 767: return state; >> 768: } > > You don't need to call get_thread_state_base in case "passive carrier thread": > > if (is_passive_carrier_thread(jt, thread_oop)) { > return JVMTI_THREAD_STATE_ALIVE | JVMTI_THREAD_STATE_WAITING > | JVMTI_THREAD_STATE_WAITING_INDEFINITELY; > } > return get_thread_state_base(thread_oop, jt); Thanks. Yes, noticed it. :) Fixed now. 
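For readers following the review, the early-return shape agreed on above can be looked at in isolation. What follows is only a minimal standalone sketch, not the HotSpot code: the flag values are the ones published in `jvmti.h`, while `FakeThread`, `base_state` and `carrier_state` are hypothetical stand-ins for the `JavaThread`/thread-oop pair, `get_thread_state_base` and its caller.

```cpp
#include <cassert>
#include <cstdint>

// JVMTI thread state bits, with the values published in jvmti.h.
constexpr std::int32_t JVMTI_THREAD_STATE_ALIVE                = 0x0001;
constexpr std::int32_t JVMTI_THREAD_STATE_RUNNABLE             = 0x0004;
constexpr std::int32_t JVMTI_THREAD_STATE_WAITING_INDEFINITELY = 0x0010;
constexpr std::int32_t JVMTI_THREAD_STATE_WAITING              = 0x0080;

// Hypothetical stand-in for the (JavaThread*, thread oop) pair.
struct FakeThread {
  bool has_mounted_virtual_thread;  // true while a virtual thread is mounted
  std::int32_t raw_state;           // state derived from the JavaThread status
};

// Hypothetical stand-in for get_thread_state_base().
inline std::int32_t base_state(const FakeThread& t) {
  return t.raw_state;
}

// A carrier with a mounted virtual thread reports "waiting" without
// consulting the base state at all; every other thread falls through.
inline std::int32_t carrier_state(const FakeThread& t) {
  if (t.has_mounted_virtual_thread) {
    const std::int32_t state = JVMTI_THREAD_STATE_ALIVE
                             | JVMTI_THREAD_STATE_WAITING
                             | JVMTI_THREAD_STATE_WAITING_INDEFINITELY;
    // RUNNABLE and WAITING must never be reported together.
    assert((state & JVMTI_THREAD_STATE_RUNNABLE) == 0);
    return state;
  }
  return base_state(t);
}
```

The point of the shape is that the base computation is skipped entirely on the carrier path, so its RUNNABLE bit never has to be masked out afterwards.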
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1220484149 From sspitsyn at openjdk.org Tue Jun 6 22:54:04 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 22:54:04 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread [v5] In-Reply-To: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: > The `GetThreadListStackTraces` returns `JVMTI_THREAD_STATE_RUNNABLE` for a VirtualThread blocked on a monitor when called for more than one thread. When called for a single VirtualThread it correctly returns a state that includes the `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` flag. > The `VM_GetThreadListStackTraces::doit` should call the `get_threadOop_and_JavaThread` instead of `cv_external_thread_to_JavaThread`. But the `get_threadOop_and_JavaThread` has a check for the current thread by comparing with the JavaThread::current() which does not work for a `VM_op`. Some refactoring of the `get_threadOop_and_JavaThread` was made to make it working for a `VM_op`. > Also, a minor bug in the `GetSingleStackTraceClosure::do_thread()` was discovered during testing. > > The list of changes is: > - minor refactoring of the function`get_threadOop_and_JavaThread`: added an overloaded version of this function with the extra parameter `JavaThread* cur_thread`. It is called instead of `JvmtiExport::cv_external_thread_to_JavaThread` from the `VM_GetThreadListStackTraces::doit`. > - `GetSingleStackTraceClosure::do_thread()`: The use of `jt->threadObj()` is replaced with the `JNIHandles::resolve_external_guard(_jthread)`. 
> - added new test to provide needed coverage: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` > > Testing: > - ran new test: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` > - TBD: tiers 1-6 (all are good) Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: ThreadListStackTracesTest cleanup ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14326/files - new: https://git.openjdk.org/jdk/pull/14326/files/e982f97e..6b685ca3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14326&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14326&range=03-04 Stats: 3 lines in 1 file changed: 0 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14326.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14326/head:pull/14326 PR: https://git.openjdk.org/jdk/pull/14326 From sspitsyn at openjdk.org Tue Jun 6 22:54:06 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 22:54:06 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread [v4] In-Reply-To: References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: <0U4aaMnv803nuwN5lempsD-eUFIEiC6WbQr0lCnwTKo=.6b13a626-79ff-4968-ae05-6da81ed5e997@github.com> On Tue, 6 Jun 2023 21:37:25 GMT, Serguei Spitsyn wrote: >> The `GetThreadListStackTraces` returns `JVMTI_THREAD_STATE_RUNNABLE` for a VirtualThread blocked on a monitor when called for more than one thread. When called for a single VirtualThread it correctly returns a state that includes the `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` flag. >> The `VM_GetThreadListStackTraces::doit` should call the `get_threadOop_and_JavaThread` instead of `cv_external_thread_to_JavaThread`. 
But the `get_threadOop_and_JavaThread` has a check for the current thread by comparing with the JavaThread::current() which does not work for a `VM_op`. Some refactoring of the `get_threadOop_and_JavaThread` was made to make it working for a `VM_op`. >> Also, a minor bug in the `GetSingleStackTraceClosure::do_thread()` was discovered during testing. >> >> The list of changes is: >> - minor refactoring of the function`get_threadOop_and_JavaThread`: added an overloaded version of this function with the extra parameter `JavaThread* cur_thread`. It is called instead of `JvmtiExport::cv_external_thread_to_JavaThread` from the `VM_GetThreadListStackTraces::doit`. >> - `GetSingleStackTraceClosure::do_thread()`: The use of `jt->threadObj()` is replaced with the `JNIHandles::resolve_external_guard(_jthread)`. >> - added new test to provide needed coverage: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` >> >> Testing: >> - ran new test: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` >> - TBD: tiers 1-6 (all are good) > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > addressed new test related review comments Chris and Alex, thank you for review! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14326#issuecomment-1579560041 From sspitsyn at openjdk.org Tue Jun 6 22:54:07 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 22:54:07 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread [v4] In-Reply-To: <7-6AydLkwjCOkTHA2sI-f7axGSMpCq5havZC5K6qz5U=.93d29276-ab0e-444e-b22c-d4b5f1c9b68b@github.com> References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> <7-6AydLkwjCOkTHA2sI-f7axGSMpCq5havZC5K6qz5U=.93d29276-ab0e-444e-b22c-d4b5f1c9b68b@github.com> Message-ID: On Tue, 6 Jun 2023 22:34:33 GMT, Alex Menkov wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> addressed new test related review comments > > test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest/ThreadListStackTracesTest.java line 29: > >> 27: * @summary GetThreadListStackTraces returns wrong state for blocked VirtualThread >> 28: * @requires vm.continuations >> 29: * @modules java.base/java.lang:+open > > I think `@modules` it's not needed Thanks. Removed now. > test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest/ThreadListStackTracesTest.java line 34: > >> 32: >> 33: import java.util.concurrent.locks.ReentrantLock; >> 34: import java.util.concurrent.*; > > unused imports Thanks. Removed now. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14326#discussion_r1220495535 PR Review Comment: https://git.openjdk.org/jdk/pull/14326#discussion_r1220495047 From dnsimon at openjdk.org Tue Jun 6 23:04:55 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Tue, 6 Jun 2023 23:04:55 GMT Subject: RFR: 8309390: [JVMCI] improve copying system properties into libgraal In-Reply-To: <1vHZFp-j2AKjYTX2bd_1RxQ2Ix252OCbqJ0m-AGaVTs=.b64ea885-8b22-45ad-8a76-b041a764a5de@github.com> References: <9bsjzlbHK31VVyGwzyhpSBjSILWFxmAX0IfiWK6Wb_w=.197d2b45-dba5-43bc-ac4e-4f993d3e777a@github.com> <1vHZFp-j2AKjYTX2bd_1RxQ2Ix252OCbqJ0m-AGaVTs=.b64ea885-8b22-45ad-8a76-b041a764a5de@github.com> Message-ID: On Mon, 5 Jun 2023 18:58:36 GMT, Tom Rodriguez wrote: > I don't really love the hard-coded parsing of the HashMap. What properties are actually required for JVMCI? It seems to me that the contents of Arguments::system_properties() should contain all the properties we want to advertise to JVMCI. That would avoid having to decode them after they've been converted into Java objects. 
I tried this but unfortunately, Graal relies on some properties that are only initialized in Java: Caused by: java.lang.NullPointerException: Cannot invoke "String.compareTo(String)" because "this.javaSpecVersion" is null at jdk.internal.vm.compiler/org.graalvm.compiler.hotspot.JVMCIVersionCheck.run(JVMCIVersionCheck.java:181) at jdk.internal.vm.compiler/org.graalvm.compiler.hotspot.JVMCIVersionCheck.check(JVMCIVersionCheck.java:166) at jdk.internal.vm.compiler/org.graalvm.compiler.hotspot.HotSpotGraalCompilerFactory.initialize(HotSpotGraalCompilerFactory.java:117) at jdk.internal.vm.compiler/org.graalvm.compiler.hotspot.HotSpotGraalCompilerFactory.ensureInitialized(HotSpotGraalCompilerFactory.java:90) at jdk.internal.vm.compiler/org.graalvm.compiler.hotspot.HotSpotGraalCompilerFactory.createCompiler(HotSpotGraalCompilerFactory.java:180) at jdk.internal.vm.compiler/org.graalvm.compiler.hotspot.HotSpotGraalCompilerFactory.createCompiler(HotSpotGraalCompilerFactory.java:53) at jdk.internal.vm.ci/jdk.vm.ci.hotspot.HotSpotJVMCIRuntime.getCompiler(HotSpotJVMCIRuntime.java:810) at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.util.GraalAccess.(GraalAccess.java:50) That code is reading the "java.specification.version" property which is initialized in `java.lang.VersionProps#init`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14291#issuecomment-1579573072 From cslucas at openjdk.org Tue Jun 6 23:14:14 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 6 Jun 2023 23:14:14 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v17] In-Reply-To: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: > Can I please get reviews for this PR? 
> > The most common and frequent use of NonEscaping Phis merging object allocations is for debugging information. The two graphs below show numbers for Renaissance and DaCapo benchmarks - similar results are obtained for all other applications that I tested. > > With what frequency does each IR node type occur as an allocation merge user? I.e., if the same node type uses a Phi N times the counter is incremented by N: > > ![image](https://user-images.githubusercontent.com/2249648/222280517-4dcf5871-2564-4207-b49e-22aee47fa49d.png) > > What are the most common users of allocation merges? I.e., if the same node type uses a Phi N times the counter is incremented by 1: > > ![image](https://user-images.githubusercontent.com/2249648/222280608-ca742a4e-1622-4e69-a778-e4db6805ea02.png) > > This PR adds support for scalar replacing allocations participating in merges used as debug information OR as a base for field loads. I plan to create subsequent PRs to enable scalar replacement of merges used by other node types (CmpP is next on the list). > > The approach I used for _rematerialization_ is pretty straightforward. It consists basically of the following. 1) New IR node (suggested by V. Kozlov), named SafePointScalarMergeNode, to represent a set of SafePointScalarObjectNode; 2) Each scalar replaceable input participating in a merge will get a SafePointScalarObjectNode as if it weren't part of a merge. 3) Add a new Class to support the rematerialization of SR objects that are part of a merge; 4) Patch HotSpot to be able to serialize and deserialize debug information related to allocation merges; 5) Patch C2 to generate unique types for SR objects participating in some allocation merges. > > The approach I used for _enabling the scalar replacement of some of the inputs of the allocation merge_ is also pretty straightforward: call `MemNode::split_through_phi` to, well, split AddP->Load* through the merge which will render the Phi useless. 
> > I tested this with JTREG tests tier 1-4 (Windows, Linux, and Mac) and didn't see regression. I also experimented with several applications and didn't see any failure. I also ran tests with "-ea -esa -Xbatch -Xcomp -XX:+UnlockExperimentalVMOptions -XX:-TieredCompilation -server -XX:+IgnoreUnrecognizedVMOptions -XX:+UnlockDiagnosticVMOptions -XX:+StressLCM -XX:+StressGCM -XX:+StressCCP" and didn't observe any related failures. Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Rome minor refactorings. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/12897/files - new: https://git.openjdk.org/jdk/pull/12897/files/cb0b6702..14ddb63a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=12897&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=12897&range=15-16 Stats: 12 lines in 5 files changed: 6 ins; 3 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/12897.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/12897/head:pull/12897 PR: https://git.openjdk.org/jdk/pull/12897 From amenkov at openjdk.org Tue Jun 6 23:23:55 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 6 Jun 2023 23:23:55 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread [v5] In-Reply-To: References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: On Tue, 6 Jun 2023 22:54:04 GMT, Serguei Spitsyn wrote: >> The `GetThreadListStackTraces` returns `JVMTI_THREAD_STATE_RUNNABLE` for a VirtualThread blocked on a monitor when called for more than one thread. When called for a single VirtualThread it correctly returns a state that includes the `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` flag. >> The `VM_GetThreadListStackTraces::doit` should call the `get_threadOop_and_JavaThread` instead of `cv_external_thread_to_JavaThread`. 
But the `get_threadOop_and_JavaThread` has a check for the current thread by comparing with the JavaThread::current() which does not work for a `VM_op`. Some refactoring of the `get_threadOop_and_JavaThread` was made to make it work for a `VM_op`. >> Also, a minor bug in the `GetSingleStackTraceClosure::do_thread()` was discovered during testing. >> >> The list of changes is: >> - minor refactoring of the function `get_threadOop_and_JavaThread`: added an overloaded version of this function with the extra parameter `JavaThread* cur_thread`. It is called instead of `JvmtiExport::cv_external_thread_to_JavaThread` from the `VM_GetThreadListStackTraces::doit`. >> - `GetSingleStackTraceClosure::do_thread()`: The use of `jt->threadObj()` is replaced with the `JNIHandles::resolve_external_guard(_jthread)`. >> - added new test to provide needed coverage: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` >> >> Testing: >> - ran new test: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` >> - TBD: tiers 1-6 (all are good) > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: ThreadListStackTracesTest cleanup test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest/ThreadListStackTracesTest.java line 37: > 35: import java.util.List; > 36: import java.lang.reflect.Constructor; > 37: import java.lang.reflect.InvocationTargetException; I tried to comment on all these lines, but something went wrong.. 
AFAIC the only required import is java.util.concurrent.locks.ReentrantLock, the rest are unused ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14326#discussion_r1220526323 From kdnilsen at openjdk.org Tue Jun 6 23:24:33 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 6 Jun 2023 23:24:33 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v8] In-Reply-To: References: Message-ID: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. 
These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Improve efficiency of card-size alignment calculations ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/cc149904..8f9e2a84 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=06-07 Stats: 8 lines in 1 file changed: 0 ins; 4 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From kdnilsen at openjdk.org Tue Jun 6 23:24:34 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 6 Jun 2023 23:24:34 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4] In-Reply-To: References: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> Message-ID: On Fri, 2 Jun 2023 17:55:56 GMT, Y. 
Srinivas Ramakrishna wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Force PLAB sizes to align on card-table size > > src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp line 1285: > >> 1283: if (unalignment != 0) { >> 1284: word_size = word_size - unalignment + CardTable::card_size_in_words(); >> 1285: } > > Probably not a big deal since this is only used when refilling a PLAB, which is an infrequent operation, but `mod` is an expensive operation, in general, and best to avoid in our code except in assertion checks (or even there given recent experiences with debug tests timing out). Since card size is a power of 2, maybe we could use addition and masking instead. Something like defining the following inline in the CardTable class and using it everywhere where card alignment granularity is sought. There may even be a macro or method defined for this already perhaps: > > > (FOO + CardSize - 1) & ~((1 << LogCardSize) - 1) > > > One could even store the mask to avoid the arithmetic to produce the mask although it's pretty cheap. > > This may turn out to be less expensive than mod, test, and branch, but as I said probably not a big deal here. We should make sure we don't overuse mods in our allocation paths much. Thanks for this suggestion. I've modified the code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1220523992 From amenkov at openjdk.org Tue Jun 6 23:39:55 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 6 Jun 2023 23:39:55 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 22:06:14 GMT, Serguei Spitsyn wrote: >>> Okay, I see your point. Unfortunately, I've always referred to the platform thread with an executed FJP scheduler as a carrier thread. The term 'carrier' with this meaning is everywhere in the JVMTI code. 
It looks very confusing to call a thread to be a carrier thread only during some phases of its execution. >> >> Okay, I'm just pointing out that is_passive_carrier_thread looks a bit strange here as it is testing if a JavaThread is carrying a virtual thread oop - it's not testing if the thread is owned by the virtual thread scheduler. > > I'm still thinking what identifier to use instead of `is_passive_carrier_thread`. > Just `is_carrier_thread` is going to be confusing as well. > What about `is_carrier_thread_waiting_for_virtual` or `is_carrier_thread_waiting_for_virtual_to_unmount`? `is_carrying_carrier_thread`? a bit artificial, but it's a carrier thread and it's carrying a virtual thread ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1220541192 From kdnilsen at openjdk.org Tue Jun 6 23:42:08 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Tue, 6 Jun 2023 23:42:08 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v4] In-Reply-To: <-jrieUm3r32vA5At0baw1nTndtNGoxG6EBrcEDjwyZw=.0a95dc08-1259-418d-a9bb-b2ba86b18c51@github.com> References: <2sgbRGVCiStjmAspEqqpyWAM0IzbZfjFC6HHXlhbcyE=.9637c274-1b10-4103-b528-34719037362b@github.com> <7uARcGDHOuSUugc2zRg7JQgC2dSPBDOjeWGPjBPO2qs=.a09479b8-ba9b-4596-bc5a-7ace0968fe31@github.com> <95yaqYTYoGnlkrDMbvZ-NTyVbGmHrL4DUYYIlV3wkwQ=.b6d787d2-3858-4486-b3dc-428a87969109@github.com> <-jrieUm3r32vA5At0baw1nTndtNGoxG6EBrcEDjwyZw=.0a95dc08-1259-418d-a9bb-b2ba86b18c51@github.com> Message-ID: On Sat, 3 Jun 2023 15:17:37 GMT, Y. Srinivas Ramakrishna wrote: >> Yes. And also from files which were changed by non-Amazon employees only, please. > > Thanks, Martin. Yes, we have noted that there were a few other files that were inadvertently caught in a copyright header dragnet. These will be reviewed and fixed in https://bugs.openjdk.org/browse/JDK-8309392 . I'm fixing this copyright notice and others. 
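On the card-size alignment comment for `shenandoahHeap.cpp` line 1285 earlier in this thread: the two ways of rounding up can be compared side by side in a small standalone sketch. The constant `kLogCardSizeInWords` and all function names below are hypothetical and chosen only for illustration; the real card size in HotSpot is not fixed at 64 words, and this is not the code that was committed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical card size for illustration: 64 words (must be a power of two).
constexpr std::size_t kLogCardSizeInWords = 6;
constexpr std::size_t kCardSizeInWords = std::size_t{1} << kLogCardSizeInWords;

// Round up using mod, test, and branch, as in the reviewed snippet.
constexpr std::size_t align_up_mod(std::size_t word_size) {
  const std::size_t unalignment = word_size % kCardSizeInWords;
  if (unalignment != 0) {
    word_size = word_size - unalignment + kCardSizeInWords;
  }
  return word_size;
}

// Round up using addition and masking; valid only for power-of-two sizes.
constexpr std::size_t align_up_mask(std::size_t word_size) {
  return (word_size + kCardSizeInWords - 1) & ~(kCardSizeInWords - 1);
}

// Debug helper: both roundings must agree on every size below `limit`.
inline void check_equivalence(std::size_t limit) {
  for (std::size_t w = 0; w < limit; ++w) {
    assert(align_up_mod(w) == align_up_mask(w));
  }
}
```

Both functions leave an already aligned size unchanged; the masking form simply avoids the division and the branch, which is the trade-off the review comment describes.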
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1220542561 From amenkov at openjdk.org Tue Jun 6 23:44:55 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Tue, 6 Jun 2023 23:44:55 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 22:41:54 GMT, Serguei Spitsyn wrote: >> When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY. >> The fix adds: >> - a special case for passive carrier threads >> - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. >> >> Testing: >> - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` >> - submitted mach5 tiers 1-5 >> - TBD: to submit mach5 tier 6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: call get_thread_state_base only when needed Marked as reviewed by amenkov (Reviewer). src/hotspot/share/prims/jvmtiEnvBase.hpp line 386: > 384: > 385: // get platform thread state > 386: static jint get_thread_state_base(oop thread_oop, JavaThread* jt); maybe rename it to `get_platform_thread_state`? ------------- PR Review: https://git.openjdk.org/jdk/pull/14298#pullrequestreview-1466301122 PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1220543652 From sspitsyn at openjdk.org Tue Jun 6 23:44:57 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 23:44:57 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 23:37:03 GMT, Alex Menkov wrote: > is_carrying_carrier_thread? 
a bit artificial, but it's a carrier thread and it's carrying a virtual thread I guess, your suggestion is `is_carrying_virtual_thread`. Is it right? If so, I like this suggestion. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1220546031 From sspitsyn at openjdk.org Tue Jun 6 23:45:22 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 23:45:22 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread [v6] In-Reply-To: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: > The `GetThreadListStackTraces` returns `JVMTI_THREAD_STATE_RUNNABLE` for a VirtualThread blocked on a monitor when called for more than one thread. When called for a single VirtualThread it correctly returns a state that includes the `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` flag. > The `VM_GetThreadListStackTraces::doit` should call the `get_threadOop_and_JavaThread` instead of `cv_external_thread_to_JavaThread`. But the `get_threadOop_and_JavaThread` has a check for the current thread by comparing with the JavaThread::current() which does not work for a `VM_op`. Some refactoring of the `get_threadOop_and_JavaThread` was made to make it working for a `VM_op`. > Also, a minor bug in the `GetSingleStackTraceClosure::do_thread()` was discovered during testing. > > The list of changes is: > - minor refactoring of the function`get_threadOop_and_JavaThread`: added an overloaded version of this function with the extra parameter `JavaThread* cur_thread`. It is called instead of `JvmtiExport::cv_external_thread_to_JavaThread` from the `VM_GetThreadListStackTraces::doit`. > - `GetSingleStackTraceClosure::do_thread()`: The use of `jt->threadObj()` is replaced with the `JNIHandles::resolve_external_guard(_jthread)`. 
> - added new test to provide needed coverage: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` > > Testing: > - ran new test: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` > - TBD: tiers 1-6 (all are good) Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: remove unused imports from ThreadListStackTracesTest.java ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14326/files - new: https://git.openjdk.org/jdk/pull/14326/files/6b685ca3..7506b539 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14326&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14326&range=04-05 Stats: 5 lines in 1 file changed: 0 ins; 5 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14326.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14326/head:pull/14326 PR: https://git.openjdk.org/jdk/pull/14326 From sspitsyn at openjdk.org Tue Jun 6 23:45:22 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 23:45:22 GMT Subject: RFR: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread [v5] In-Reply-To: References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: On Tue, 6 Jun 2023 23:20:53 GMT, Alex Menkov wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: ThreadListStackTracesTest cleanup > > test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest/ThreadListStackTracesTest.java line 37: > >> 35: import java.util.List; >> 36: import java.lang.reflect.Constructor; >> 37: import java.lang.reflect.InvocationTargetException; > > I tried to comment all these lines, but something went wrong.. > AFAIC the only required import is java.util.concurrent.locks.ReentrantLock, the rest are unused Thanks. Fixed now.
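For reference, the difference between the wrong and the expected result in 8295976 (`JVMTI_THREAD_STATE_RUNNABLE` versus a state that includes `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER`) can be seen by decoding the JVMTI thread-state bitmask. Below is a minimal sketch, assuming the flag values from the JVMTI specification; the class `JvmtiThreadStateBits` and its `decode` method are hypothetical names used only for illustration and are not part of the PR or of HotSpot, and the example masks are illustrative rather than captured from a real run.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the PR: decodes a few JVMTI thread-state flags.
class JvmtiThreadStateBits {
    // Flag values as defined by the JVMTI specification (jvmti.h).
    static final int ALIVE                    = 0x0001;
    static final int TERMINATED               = 0x0002;
    static final int RUNNABLE                 = 0x0004;
    static final int WAITING_INDEFINITELY     = 0x0010;
    static final int WAITING_WITH_TIMEOUT     = 0x0020;
    static final int WAITING                  = 0x0080;
    static final int BLOCKED_ON_MONITOR_ENTER = 0x0400;

    private static final int[] MASKS = {
        ALIVE, TERMINATED, RUNNABLE, WAITING_INDEFINITELY,
        WAITING_WITH_TIMEOUT, WAITING, BLOCKED_ON_MONITOR_ENTER
    };
    private static final String[] NAMES = {
        "ALIVE", "TERMINATED", "RUNNABLE", "WAITING_INDEFINITELY",
        "WAITING_WITH_TIMEOUT", "WAITING", "BLOCKED_ON_MONITOR_ENTER"
    };

    // Returns the names of the known flags set in 'state', in ascending bit order.
    static List<String> decode(int state) {
        List<String> flags = new ArrayList<>();
        for (int i = 0; i < MASKS.length; i++) {
            if ((state & MASKS[i]) != 0) {
                flags.add(NAMES[i]);
            }
        }
        return flags;
    }

    public static void main(String[] args) {
        // Illustrative masks only.
        System.out.println(decode(ALIVE | RUNNABLE));
        System.out.println(decode(ALIVE | BLOCKED_ON_MONITOR_ENTER));
    }
}
```

An agent would normally test individual bits with `state & JVMTI_THREAD_STATE_...` in native code; the Java form here is only meant to make the bit layout easy to read.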
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14326#discussion_r1220538628 From sspitsyn at openjdk.org Tue Jun 6 23:56:55 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 6 Jun 2023 23:56:55 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 23:39:54 GMT, Alex Menkov wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: call get_thread_state_base only when needed > > src/hotspot/share/prims/jvmtiEnvBase.hpp line 386: > >> 384: >> 385: // get platform thread state >> 386: static jint get_thread_state_base(oop thread_oop, JavaThread* jt); > > maybe rename it to `get_platform_thread_state`? I was thinking about it. It will be inconsistent with `get_vthread_state`. Ideally, `get_vthread_state` would then need to be renamed to `get_virtual_thread_state`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1220559583 From never at openjdk.org Wed Jun 7 00:15:57 2023 From: never at openjdk.org (Tom Rodriguez) Date: Wed, 7 Jun 2023 00:15:57 GMT Subject: RFR: 8309390: [JVMCI] improve copying system properties into libgraal In-Reply-To: <9bsjzlbHK31VVyGwzyhpSBjSILWFxmAX0IfiWK6Wb_w=.197d2b45-dba5-43bc-ac4e-4f993d3e777a@github.com> References: <9bsjzlbHK31VVyGwzyhpSBjSILWFxmAX0IfiWK6Wb_w=.197d2b45-dba5-43bc-ac4e-4f993d3e777a@github.com> Message-ID: On Fri, 2 Jun 2023 20:32:14 GMT, Doug Simon wrote: > This PR improves the startup time for libgraal by speeding up how `VM.savedProps` is copied into libgraal. This data structure is now serialized to a native buffer directly from C++ and the native buffer is then directly decoded by libgraal.
> > ## Times > > The basic benchmarking below shows that this change brings the time for a nop Java app with eager libgraal initialization (2) down to almost the same time as lazy libgraal initialization (1). The latter typically means no libgraal initialization happens as a top tier JIT compilation is never scheduled in such a short running app. > > > public class Nop { > public static void main(String[] args) {} > } > > > (1) Baseline (no options): > >> for i in (seq 10); java Nop; end > 0.05 real 0.04 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.03 real 0.03 user 0.00 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.00 sys > 0.03 real 0.03 user 0.00 sys > 0.04 real 0.03 user 0.01 sys > 0.03 real 0.03 user 0.00 sys > > > (2) Eagerly initialize libgraal (with PR): > >> for i in (seq 10); /usr/bin/time java -XX:+EagerJVMCI Nop; end > 0.06 real 0.04 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > > > (3) Eagerly initialize libgraal (without PR): > >> for i in (seq 10); /usr/bin/time java -XX:+EagerJVMCI Nop; end > 0.11 real 0.08 user 0.02 sys > 0.08 real 0.06 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.10 real 0.07 user 0.01 sys > 0.08 real 0.06 user 0.01 sys > 0.10 real 0.07 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.08 real 0.06 user 0.01 sys > 0.08 real ... HotSpot sets `java.vm.specification.version` which has the same value so we could change Graal to read that instead. 
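As a small illustration, `java.vm.specification.version` is one of the standard system properties and can be read from ordinary Java code as sketched below. This is only an assumption-labelled example: the class `VmSpecVersion` is a hypothetical name, and how Graal or libgraal would actually read the property is not shown in this thread.

```java
// Hypothetical example class, not part of the PR or of Graal.
class VmSpecVersion {
    // Reads the standard system property mentioned above.
    static String vmSpecVersion() {
        return System.getProperty("java.vm.specification.version");
    }

    public static void main(String[] args) {
        System.out.println("java.vm.specification.version = " + vmSpecVersion());
    }
}
```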
------------- PR Comment: https://git.openjdk.org/jdk/pull/14291#issuecomment-1579632648 From kdnilsen at openjdk.org Wed Jun 7 00:39:52 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 00:39:52 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9] In-Reply-To: References: Message-ID: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. 
These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Update copyright notices ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/8f9e2a84..f6c073a5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=07-08 Stats: 7 lines in 6 files changed: 0 ins; 5 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From amenkov at openjdk.org Wed Jun 7 01:08:05 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 7 Jun 2023 01:08:05 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 23:42:24 GMT, Serguei Spitsyn wrote: > > is_carrying_carrier_thread? a bit artificial, but it's a carrier thread and it's carrying a virtual thread > > I guess, your suggestion is `is_carrying_virtual_thread`. Is it right? If so, I like this suggestion. Up to you.
I think any of these names is better than is_passive_carrier_thread. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1220627250 From ysr at openjdk.org Wed Jun 7 03:41:13 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 7 Jun 2023 03:41:13 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9] In-Reply-To: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> References: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> Message-ID: On Wed, 7 Jun 2023 00:39:52 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies `-XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation.
In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.
> > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright notices > Hi, I have built this pr based on [aa85a90](https://github.com/openjdk/jdk/commit/aa85a9073e2a71d6bf920409e739d555f9dcf302), Tier1 tests failed on `gc/TestAllocHumongousFragment.java#generational` on Linux/RISC-V with the following output: > > ``` > # > # A fatal error has been detected by the Java Runtime Environment: > # > # Internal Error (shenandoahVerifier.cpp:1244), pid=2951116, tid=2951124 > # Error: Verify init-mark remembered set violation; clean card should be dirty > # > # JRE version: OpenJDK Runtime Environment (21.0) (build 21-internal-adhoc.ubuntu.jdk) > # Java VM: OpenJDK 64-Bit Server VM (21-internal-adhoc.ubuntu.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, shenandoah gc, linux-riscv64) > ``` > > Looks like Generational Shenandoah does not fully support RISC-V port, should we disable this test on RISC-V port for now? Fixed (platform disabled) by @kdnilsen in https://github.com/openjdk/jdk/pull/14185/commits/cc149904d76c78355fc994da171f0f21411e903f ------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1579829038 From cjplummer at openjdk.org Wed Jun 7 04:48:59 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 7 Jun 2023 04:48:59 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: Message-ID: <0g78gMSXw683BQh9UYpDXbpDgBOS5DsIP9df5Qr3XUM=.5c557de4-8b29-4a3f-9e83-ed23a853739f@github.com> On Tue, 6 Jun 2023 22:41:54 GMT, Serguei Spitsyn wrote: >> When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based on the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY.
>> The fix adds: >> - a special case for passive carrier threads >> - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. >> >> Testing: >> - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` >> - submitted mach5 tiers 1-5 >> - TBD: to submit mach5 tier 6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: call get_thread_state_base only when needed Marked as reviewed by cjplummer (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14298#pullrequestreview-1466591721 From thartmann at openjdk.org Wed Jun 7 05:10:06 2023 From: thartmann at openjdk.org (Tobias Hartmann) Date: Wed, 7 Jun 2023 05:10:06 GMT Subject: RFR: 8308966 Add intrinsic for float/double modulo for x86 AVX2 and AVX512 [v10] In-Reply-To: <0lQJvljjXjPCoK8TAVG2wNevqMuErq_tBTsDct7jvuI=.157e6338-4203-4857-9d51-30a6f0ab5083@github.com> References: <0lQJvljjXjPCoK8TAVG2wNevqMuErq_tBTsDct7jvuI=.157e6338-4203-4857-9d51-30a6f0ab5083@github.com> Message-ID: On Tue, 6 Jun 2023 18:06:11 GMT, Sandhya Viswanathan wrote: >> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix tests; need vlbwdq for vpbroadcastq > > @TobiHartmann @vnkozlov Please advise if we could go ahead and integrate this PR from Scott. @sviswa7 Thanks for the notification. I'll run this through our testing and report back. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14224#issuecomment-1579902442 From alanb at openjdk.org Wed Jun 7 05:53:56 2023 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 7 Jun 2023 05:53:56 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 01:05:07 GMT, Alex Menkov wrote: > I guess, your suggestion is `is_carrying_virtual_thread`. Is it right? If so, I like this suggestion. 
Good, I think it will be easy to understand at the use sites. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1220891958 From alanb at openjdk.org Wed Jun 7 06:44:13 2023 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 7 Jun 2023 06:44:13 GMT Subject: Integrated: 8306647: Implementation of Structured Concurrency (Preview) In-Reply-To: <6gZZEoP1WXdBcZUiL5890eNsgaRFzZNY_rBItZdXtNc=.5d8f7bd9-44d5-4074-8a5c-35f8203263b2@github.com> References: <6gZZEoP1WXdBcZUiL5890eNsgaRFzZNY_rBItZdXtNc=.5d8f7bd9-44d5-4074-8a5c-35f8203263b2@github.com> Message-ID: On Thu, 11 May 2023 13:08:55 GMT, Alan Bateman wrote: > This is the implementation of: > > - JEP 453: Structured Concurrency (Preview) > - JEP 446: Scoped Values (Preview) > > For the most part, this is just moving code and tests. StructuredTaskScope moves to j.u.concurrent as a preview API, ScopedValue moves to j.lang as a preview API, and module jdk.incubator.concurrent has been removed. The significant API changes since incubator are: > > - StructuredTaskScope.fork returns Subtask instead of Future (JEP 453 has a section on this) > - ScopedValue.where methods are replaced with runWhere, callWhere and getWhere This pull request has now been integrated. Changeset: f1c7afcc Author: Alan Bateman URL: https://git.openjdk.org/jdk/commit/f1c7afcc3fe39622c33ac7bac1ebdd9f96fa333d Stats: 9229 lines in 40 files changed: 4856 ins; 4315 del; 58 mod 8306647: Implementation of Structured Concurrency (Preview) 8306572: Implementation of Scoped Values (Preview) Co-authored-by: Alan Bateman Co-authored-by: Andrew Haley Reviewed-by: psandoz, dfuchs, mchung ------------- PR: https://git.openjdk.org/jdk/pull/13932 From ysr at openjdk.org Wed Jun 7 07:25:16 2023 From: ysr at openjdk.org (Y.
Srinivas Ramakrishna) Date: Wed, 7 Jun 2023 07:25:16 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9] In-Reply-To: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> References: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> Message-ID: On Wed, 7 Jun 2023 00:39:52 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. 
These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright notices src/hotspot/cpu/ppc/gc/shenandoah/shenandoahBarrierSetAssembler_ppc.cpp line 4: > 2: * Copyright (c) 2018, 2021, Red Hat, Inc. All rights reserved. > 3: * Copyright (c) 2012, 2022 SAP SE. All rights reserved. > 4: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. I believe line 4 should be deleted; the copyright header change here is unnecessary. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221032491 From ysr at openjdk.org Wed Jun 7 07:43:20 2023 From: ysr at openjdk.org (Y.
Srinivas Ramakrishna) Date: Wed, 7 Jun 2023 07:43:20 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9] In-Reply-To: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> References: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> Message-ID: On Wed, 7 Jun 2023 00:39:52 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. 
These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright notices src/hotspot/cpu/riscv/gc/shenandoah/shenandoahBarrierSetAssembler_riscv.cpp line 4: > 2: * Copyright (c) 2018, 2020, Red Hat, Inc. All rights reserved. > 3: * Copyright (c) 2020, 2021, Huawei Technologies Co., Ltd. All rights reserved. > 4: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Remove this line; extent of changes doesn't warrant copyright header change. src/hotspot/share/gc/shenandoah/c1/shenandoahBarrierSetC1.hpp line 3: > 1: /* > 2: * Copyright (c) 2018, 2021, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Should probably be removed. src/hotspot/share/gc/shenandoah/c2/shenandoahBarrierSetC2.hpp line 3: > 1: /* > 2: * Copyright (c) 2018, 2021, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Check if this is necessary. src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp line 4: > 2: * Copyright (c) 2015, 2021, Red Hat, Inc. All rights reserved.
> 3: * Copyright (C) 2022 THL A29 Limited, a Tencent company. All rights reserved. > 4: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Should be removed. src/hotspot/share/gc/shenandoah/heuristics/shenandoahCompactHeuristics.hpp line 3: > 1: /* > 2: * Copyright (c) 2018, 2019, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Can be removed? src/hotspot/share/gc/shenandoah/heuristics/shenandoahPassiveHeuristics.hpp line 3: > 1: /* > 2: * Copyright (c) 2018, 2019, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Should be removed? src/hotspot/share/gc/shenandoah/heuristics/shenandoahStaticHeuristics.hpp line 3: > 1: /* > 2: * Copyright (c) 2018, 2019, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Should be removed? src/hotspot/share/gc/shenandoah/mode/shenandoahPassiveMode.hpp line 3: > 1: /* > 2: * Copyright (c) 2019, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Should be removed? src/hotspot/share/gc/shenandoah/mode/shenandoahSATBMode.cpp line 3: > 1: /* > 2: * Copyright (c) 2019, 2021, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Can be removed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221051315 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221054767 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221056157 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221056909 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221058681 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221060221 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221060613 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221061900 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221063047 From ysr at openjdk.org Wed Jun 7 07:46:20 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 7 Jun 2023 07:46:20 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9] In-Reply-To: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> References: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> Message-ID: On Wed, 7 Jun 2023 00:39:52 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. 
After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright notices src/hotspot/share/gc/shenandoah/shenandoahBarrierSetClone.inline.hpp line 3: > 1: /* > 2: * Copyright (c) 2013, 2021, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Unnecessary. Delete.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221070505 From stuefe at openjdk.org Wed Jun 7 07:50:20 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 7 Jun 2023 07:50:20 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9] In-Reply-To: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> References: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> Message-ID: On Wed, 7 Jun 2023 00:39:52 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. 
Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright notices > Thanks Thomas for the feedback: > > These proposed changes represent improvements to both Generational and Non-generational modes of operation. We can revert if that is desired, or we can specialize Generational versions of these parameters so that they can have different values in different modes, but here is a bit of background. We've done considerable testing on a variety of synthetic workloads and some limited testing on production workloads. As we move towards upstream integration, we expect this will help us gain exposure to more production workloads. The following changes were based on results of this testing: > Hi Kelvin, thanks for the thorough explanations! It is a pity that these valuable insights are buried in a GH discussion and these changes inside such a large patch. I also looked at the originating patch in openjdk/shenandoah, which I assume is your development repo for Shenandoah (?). 
Could I convince you to adopt the JBS issue process in the shenandoah repo (so, opening an issue on JBS, with some clear explanation, then fixing the bug)? Roman convinced me of this for the Lilliput repository, and now I think the added work is well worth it. JBS is a treasure trove of insights, if filled with care, and can help us for many years. Some more questions about `ShenandoahFullGCThreshold`: I am looking at the nice ASCII art in `ShenandoahControlThread::service_concurrent_normal_cycle`. IIUC, the cycle goes: Concurrent GC -> Alloc failure -> n x Degenerated GC -> Alloc Failure -> Full GC right? So the change is now in how often we try a degenerated GC before falling back to a full GC? With GenShen, does a degenerated GC still collect only the young regions? And does only Full GC collect all regions? Are comment and ASCII-art still correct for GenShen? E.g. the comment says: // If second allocation failure happens during Degenerated GC cycle (for example, when GC // tries to evac something and no memory is available), cycle degrades to Full GC. Is "second allocation failure" correct? Since even before this patch, we tried three times before falling back to a Full GC. Thank you, Thomas ------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1580127311 From ysr at openjdk.org Wed Jun 7 07:50:20 2023 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Wed, 7 Jun 2023 07:50:20 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9] In-Reply-To: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> References: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> Message-ID: <11_EM-LgCnPY9x_t3s6wz3-oh8eN_e-GMlo2mWtiHbc=.4ade74ee-6f36-4621-b911-dce9e127cd98@github.com> On Wed, 7 Jun 2023 00:39:52 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. 
Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright notices src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.hpp line 3: > 1: /* > 2: * Copyright (c) 2013, 2021, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Probably unnecessary. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221076190 From sspitsyn at openjdk.org Wed Jun 7 07:55:13 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 07:55:13 GMT Subject: Integrated: 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread In-Reply-To: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> References: <_bbL6afkGfa1lw1UFa26F4lGRiLQaiIkjo0tkDMUHm4=.0079cdec-34c0-45f4-836f-61118c111f43@github.com> Message-ID: On Tue, 6 Jun 2023 00:50:34 GMT, Serguei Spitsyn wrote: > The `GetThreadListStackTraces` returns `JVMTI_THREAD_STATE_RUNNABLE` for a VirtualThread blocked on a monitor when called for more than one thread. When called for a single VirtualThread it correctly returns a state that includes the `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` flag. > The `VM_GetThreadListStackTraces::doit` should call the `get_threadOop_and_JavaThread` instead of `cv_external_thread_to_JavaThread`. But the `get_threadOop_and_JavaThread` has a check for the current thread by comparing with the `JavaThread::current()`, which does not work for a `VM_op`. Some refactoring of the `get_threadOop_and_JavaThread` was made to make it work for a `VM_op`. > Also, a minor bug in the `GetSingleStackTraceClosure::do_thread()` was discovered during testing. > > The list of changes is: > - minor refactoring of the function `get_threadOop_and_JavaThread`: added an overloaded version of this function with the extra parameter `JavaThread* cur_thread`. It is called instead of `JvmtiExport::cv_external_thread_to_JavaThread` from the `VM_GetThreadListStackTraces::doit`. > - `GetSingleStackTraceClosure::do_thread()`: The use of `jt->threadObj()` is replaced with the `JNIHandles::resolve_external_guard(_jthread)`. 
> - added new test to provide needed coverage: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` > > Testing: > - ran new test: `test/hotspot/jtreg/serviceability/jvmti/vthread/ThreadListStackTracesTest` > - TBD: tiers 1-6 (all are good) This pull request has now been integrated. Changeset: a25b7b8b Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/a25b7b8b55f2dcd3c2945193d78f754580421733 Stats: 222 lines in 5 files changed: 215 ins; 2 del; 5 mod 8295976: GetThreadListStackTraces returns wrong state for blocked VirtualThread Reviewed-by: cjplummer, amenkov ------------- PR: https://git.openjdk.org/jdk/pull/14326 From ysr at openjdk.org Wed Jun 7 07:58:22 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 7 Jun 2023 07:58:22 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9] In-Reply-To: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> References: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> Message-ID: On Wed, 7 Jun 2023 00:39:52 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. 
Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright notices src/hotspot/share/gc/shenandoah/shenandoahDegeneratedGC.hpp line 3: > 1: /* > 2: * Copyright (c) 2021, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Unnecessary. 
src/hotspot/share/gc/shenandoah/shenandoahFullGC.hpp line 3: > 1: /* > 2: * Copyright (c) 2014, 2021, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Unnecessary src/hotspot/share/gc/shenandoah/shenandoahGC.cpp line 3: > 1: /* > 2: * Copyright (c) 2021, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Unnecessary src/hotspot/share/gc/shenandoah/shenandoahGC.hpp line 3: > 1: /* > 2: * Copyright (c) 2021, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Unnecessary ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221087455 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221091340 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221092320 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221093246 From ysr at openjdk.org Wed Jun 7 08:12:28 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 7 Jun 2023 08:12:28 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9] In-Reply-To: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> References: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> Message-ID: On Wed, 7 Jun 2023 00:39:52 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. 
Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. 
> > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright notices src/hotspot/share/gc/shenandoah/shenandoahNMethod.cpp line 3: > 1: /* > 2: * Copyright (c) 2019, 2022, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Unnecessary? src/hotspot/share/gc/shenandoah/shenandoahNumberSeq.hpp line 3: > 1: /* > 2: * Copyright (c) 2018, 2019, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Unnecessary? src/hotspot/share/gc/shenandoah/shenandoahSTWMark.hpp line 3: > 1: /* > 2: * Copyright (c) 2021, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. unnecessary. src/hotspot/share/gc/shenandoah/shenandoahUnload.cpp line 3: > 1: /* > 2: * Copyright (c) 2019, 2021, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Delete src/hotspot/share/gc/shenandoah/shenandoahWorkerPolicy.cpp line 3: > 1: /* > 2: * Copyright (c) 2017, 2019, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. Unnecessary src/hotspot/share/gc/shenandoah/shenandoahWorkerPolicy.hpp line 3: > 1: /* > 2: * Copyright (c) 2017, 2022, Red Hat, Inc. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. 
Unnecessary ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221104857 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221106767 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221110530 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221113191 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221115925 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221116941 From ysr at openjdk.org Wed Jun 7 08:17:22 2023 From: ysr at openjdk.org (Y. Srinivas Ramakrishna) Date: Wed, 7 Jun 2023 08:17:22 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9] In-Reply-To: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> References: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> Message-ID: On Wed, 7 Jun 2023 00:39:52 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. 
Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright notices test/hotspot/gtest/gc/shenandoah/test_shenandoahNumberSeq.cpp line 2: > 1: /* > 2: * Copyright (c) 2022, 2023, Oracle and/or its affiliates. All rights reserved. This may be deleted as far as I can tell, or we can just leave it in there. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221124209 From ysr at openjdk.org Wed Jun 7 08:26:18 2023 From: ysr at openjdk.org (Y. 
Srinivas Ramakrishna) Date: Wed, 7 Jun 2023 08:26:18 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9] In-Reply-To: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> References: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> Message-ID: <47uWbayDcFRkkQ5crcpFyIFGvBwM0yCAXvxeIonOdcI=.5387f07c-d89f-4839-acdd-bb49d686a2d7@github.com> On Wed, 7 Jun 2023 00:39:52 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. 
Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright notices test/hotspot/jtreg/gc/stress/gcold/TestGCOldWithShenandoah.java line 3: > 1: /* > 2: * Copyright (c) 2016, 2020, Oracle and/or its affiliates. All rights reserved. > 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. General comment: Looking at the history, I might have expected Red Hat copyright headers also for many of these tests, but that isn't a change that's happened with generational Shenandoah. So, nothing for us to do in this PR. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221136441 From rcastanedalo at openjdk.org Wed Jun 7 08:51:55 2023 From: rcastanedalo at openjdk.org (Roberto Castañeda Lozano) Date: Wed, 7 Jun 2023 08:51:55 GMT Subject: RFR: 8307374: Add a JFR event for tracking RSS [v2] In-Reply-To: <5--iQRwxU0JJYMoywpwEEX7dWRhnuGn6d6p2Av9kYDI=.55e444c1-f375-4eac-bc90-503ec61177c8@github.com> References: <5--iQRwxU0JJYMoywpwEEX7dWRhnuGn6d6p2Av9kYDI=.55e444c1-f375-4eac-bc90-503ec61177c8@github.com> Message-ID: On Mon, 5 Jun 2023 08:14:12 GMT, Stefan Karlsson wrote: >> Add a JFR event to track the resident set size (RSS) of the running process. This is a good complement to the new Native Memory Tracking events that were added for JDK 20 ([JDK-8157023](https://bugs.openjdk.org/browse/JDK-8157023)) >> >> You can use the JDK Mission Control tool to extract this data. Or, you can use the new [JFR Views](https://egahlin.github.io/2023/05/30/views.html) tool to get a textual representation of the values: >> >> >> # Create a JFR recording >> $ jdk/bin/java -XX:StartFlightRecording=dumponexit=true JavaApp >> >> # Extract the data from that recording >> $ jdk/bin/jfr view ResidentSetSize hotspot-pid-204767-id-1-2023_06_02_11_56_19.jfr >> >> Resident Set Size >> >> Time Resident Set Size Resident Set Size Peak Value >> ---------------- ------------------------- ------------------------------------ >> 11:56:07 1.1 GB 1.2 GB >> 11:56:08 333.7 MB 1.2 GB >> 11:56:09 432.4 MB 1.2 GB >> 11:56:10 695.9 MB 1.2 GB >> 11:56:11 1.0 GB 1.2 GB >> 11:56:12 1.3 GB 1.3 GB >> 11:56:13 1.3 GB 1.3 GB >> 11:56:14 1.3 GB 1.3 GB >> 11:56:15 1.3 GB 1.3 GB >> 11:56:16 1.3 GB 1.3 GB >> 11:56:17 1.4 GB 1.4 GB >> 11:56:18 1.8 GB 1.8 GB >> 11:56:19 2.0 GB 2.0 GB >> >> >> The event has been implemented for Linux, MacOS, and Windows. 
The name ResidentSetSize isn't a perfect fit for the values extracted from MacOS and Windows, but I think it is better to name this after something that many people are familiar with instead of trying to find a generic name that fits all platforms. Do you agree with that, or should we change it to something else? >> >> I've manually sanity checked that we get reasonable values on all OS:es. I've also added a jtreg test. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Remove unused test imports Marked as reviewed by rcastanedalo (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14285#pullrequestreview-1467036406 From stuefe at openjdk.org Wed Jun 7 09:34:59 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 7 Jun 2023 09:34:59 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v3] In-Reply-To: References: Message-ID: On Fri, 2 Jun 2023 19:46:31 GMT, Ashutosh Mehra wrote: >> This patch is the first step towards having a single set of GC APIs for allocating heap space for the archived objects (See https://bugs.openjdk.org/browse/JDK-8296263). >> It moves some of the G1 specific logic from CDS to G1 gc without changing the functionality. >> >> Changes that add/update GC APIs for handling archive heap would be introduced in upcoming patches. > > Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: > > Review comments - updates to alloc_archive_regions() api > > Signed-off-by: Ashutosh Mehra This looks okay, it is a cleaner separation than before. Small nits and questions only. Approved (not for jdk21 and pending @tschatzl approval) One note, I don't see a clear usage of INCLUDE_CDS and INCLUDE_G1GC, and it predates this patch. Don't we do this anymore? We still want to be buildable without CDS but with G1, and vice versa, right? 
src/hotspot/share/cds/filemap.cpp line 2124: > 2122: > 2123: size_t word_size = size / HeapWordSize; > 2124: address requested_start = heap_region_requested_address(); Possibly for another RFE, if you intend to make this code move as verbatim as possible: This feels like something that should use `r->mapping_offset()` directly - would make the code easier to understand. src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 551: > 549: HeapWord* start_addr = reserved.end() - align_up(word_size, HeapRegion::GrainWords); > 550: MemRegion range = MemRegion(start_addr, word_size); > 551: HeapWord* last_address = range.last(); Do I understand this - and the old code - correctly: we map the archive region at the end of the heap, therefore, if heap size changed between dump time and runtime, we will always have to relocate? I didn't look but I assume the dumptime heap size is the default heap size? Another, possibly stupid, question: why don't we use the Heap bottom - we'd still avoid fragmentation but have a much higher chance of not relocating? src/hotspot/share/gc/g1/g1CollectedHeap.hpp line 712: > 710: // the location of the archive space in the heap. The returned address may or may > 711: // not be same as the preferred address. > 712: HeapWord* alloc_archive_region(size_t word_size, HeapWord* preferred_addr); The comment is good. This is only supposed to be used at JVM start, CDS runtime, before we start actually using the heap, right? Could we add this limitation to the comment for the casual reader? Also, here and in other places G1: guard with `#ifdef INCLUDE_CDS` ? ------------- Marked as reviewed by stuefe (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14208#pullrequestreview-1467074758 PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1221212920 PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1221241979 PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1221218276 From sspitsyn at openjdk.org Wed Jun 7 11:31:02 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 11:31:02 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v5] In-Reply-To: References: Message-ID: > When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based on the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY. > The fix adds: > - a special case for passive carrier threads > - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. > > Testing: > - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` > - submitted mach5 tiers 1-5 > - TBD: to submit mach5 tier 6 Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains six additional commits since the last revision: - Merge - review: call get_thread_state_base only when needed - review: removed JVMTI_THREAD_STATE_RUNNABLE from a carrier thread state - Merge - minor tweaks in libThreadStateTest.cpp - 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14298/files - new: https://git.openjdk.org/jdk/pull/14298/files/77771816..3e7618c4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14298&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14298&range=03-04 Stats: 14603 lines in 141 files changed: 9240 ins; 4758 del; 605 mod Patch: https://git.openjdk.org/jdk/pull/14298.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14298/head:pull/14298 PR: https://git.openjdk.org/jdk/pull/14298 From bulasevich at openjdk.org Wed Jun 7 11:31:07 2023 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Wed, 7 Jun 2023 11:31:07 GMT Subject: RFR: 8305959: x86: Improve itable_stub [v6] In-Reply-To: References: Message-ID: On Mon, 5 Jun 2023 10:10:22 GMT, Boris Ulasevich wrote: >> Async profiler shows that applications spend up to 10% in itable_stubs. >> >> The current inefficiency of itable stubs is as follows. The generated itable_stub scans itable twice: first it checks if the object class is a subtype of the resolved_class, and then it finds the holder_class that implements the method. I suggest doing this in one pass: with a first loop over itable, check pointer equality to both holder_class and resolved_class. Once we have finished searching for resolved_class, continue searching for holder_class in a separate loop if it has not yet been found. >> >> This approach gives 1-10% improvement on the synthetic benchmarks and 3% improvement on Naive Bayes benchmark from the Renaissance Benchmark Suite (Intel Xeon X5675). 
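[Editorial sketch, not part of the original message.] The one-pass itable scan described in the quoted text above can be illustrated roughly as follows. This is a minimal C++ model under stated assumptions, not the stub that the PR generates: the real itable_stub is emitted as x86 assembly, and `Klass`, `ItableOffsetEntry`, `lookup_one_pass`, and the `-1` failure value are hypothetical stand-ins invented for this illustration. The first loop compares each entry against both `holder` and `resolved`; once `resolved` is found, a second loop continues only if `holder` has not been seen yet.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; not HotSpot's real types.
struct Klass { const char* name; };

struct ItableOffsetEntry {
    const Klass* interface_klass;  // nullptr marks the end of the table
    int method_table_offset;       // where this interface's methods start
};

// Returns the method table offset for `holder`, or -1 when the receiver's
// itable does not contain `resolved` (the subtype check fails) or `holder`.
int lookup_one_pass(const std::vector<ItableOffsetEntry>& itable,
                    const Klass* resolved, const Klass* holder) {
    int holder_offset = -1;
    std::size_t i = 0;

    // First loop: test each entry against both classes until `resolved` is found.
    for (; i < itable.size() && itable[i].interface_klass != nullptr; ++i) {
        const Klass* k = itable[i].interface_klass;
        if (k == holder) holder_offset = itable[i].method_table_offset;
        if (k == resolved) break;
    }

    // Reached the end (or the terminator) without seeing `resolved`.
    if (i == itable.size() || itable[i].interface_klass == nullptr) return -1;

    // `holder` was already seen on the way to `resolved`.
    if (holder_offset >= 0) return holder_offset;

    // Second loop: keep scanning the remaining entries for `holder` only.
    for (++i; i < itable.size() && itable[i].interface_klass != nullptr; ++i) {
        if (itable[i].interface_klass == holder) return itable[i].method_table_offset;
    }
    return -1;
}
```

In this model each itable entry is visited at most once, whereas the two-scan approach described in the quoted text may walk the same prefix of the table twice.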
> > Boris Ulasevich has updated the pull request incrementally with one additional commit since the last revision: > > push/pop(temp_get) -> push/pop(rdx) Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/13460#issuecomment-1580589791 From sspitsyn at openjdk.org Wed Jun 7 11:31:06 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 11:31:06 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v5] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 05:50:46 GMT, Alan Bateman wrote: >>> > is_carrying_carrier_thread? a bit artificial, but it's a carrier thread and it's carrying a virtual thread >>> >>> I guess, your suggestion is `is_carrying_virtual_thread`. Is it right? If so, I like this suggestion. >> >> Up to you. I think any of this names is better than is_passive_carrier_thread. > >> I guess, your suggestion is `is_carrying_virtual_thread`. Is it right? If so, I like this suggestion. > > Good, I think will be easy to understand at the use sites. Thank you, Alan and Alex. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1221417248 From bulasevich at openjdk.org Wed Jun 7 11:31:09 2023 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Wed, 7 Jun 2023 11:31:09 GMT Subject: Integrated: 8305959: x86: Improve itable_stub In-Reply-To: References: Message-ID: On Thu, 13 Apr 2023 14:33:52 GMT, Boris Ulasevich wrote: > Async profiler shows that applications spend up to 10% in itable_stubs. > > The current inefficiency of itable stubs is as follows. The generated itable_stub scans itable twice: first it checks if the object class is a subtype of the resolved_class, and then it finds the holder_class that implements the method. I suggest doing this in one pass: with a first loop over itable, check pointer equality to both holder_class and resolved_class. 
Once we have finished searching for resolved_class, continue searching for holder_class in a separate loop if it has not yet been found. > > This approach gives 1-10% improvement on the synthetic benchmarks and 3% improvement on Naive Bayes benchmark from the Renaissance Benchmark Suite (Intel Xeon X5675). This pull request has now been integrated. Changeset: 8cdd95e8 Author: Boris Ulasevich URL: https://git.openjdk.org/jdk/commit/8cdd95e8a2a7814ab7983fb3f41e6fa5793d410f Stats: 295 lines in 5 files changed: 250 ins; 24 del; 21 mod 8305959: x86: Improve itable_stub Reviewed-by: phh, shade, aph ------------- PR: https://git.openjdk.org/jdk/pull/13460 From stefank at openjdk.org Wed Jun 7 11:44:12 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 7 Jun 2023 11:44:12 GMT Subject: RFR: 8307374: Add a JFR event for tracking RSS [v2] In-Reply-To: <5--iQRwxU0JJYMoywpwEEX7dWRhnuGn6d6p2Av9kYDI=.55e444c1-f375-4eac-bc90-503ec61177c8@github.com> References: <5--iQRwxU0JJYMoywpwEEX7dWRhnuGn6d6p2Av9kYDI=.55e444c1-f375-4eac-bc90-503ec61177c8@github.com> Message-ID: On Mon, 5 Jun 2023 08:14:12 GMT, Stefan Karlsson wrote: >> Add a JFR event to track the resident set size (RSS) of the running process. This is a good complement to the new Native Memory Tracking events that were added for JDK 20 ([JDK-8157023](https://bugs.openjdk.org/browse/JDK-8157023)) >> >> You can use the JDK Mission Control tool to extract this data.
Or, you can use the new [JFR Views](https://egahlin.github.io/2023/05/30/views.html) tool to get a textual representation of the values:
>>
>> # Create a JFR recording
>> $ jdk/bin/java -XX:StartFlightRecording=dumponexit=true JavaApp
>>
>> # Extract the data from that recording
>> $ jdk/bin/jfr view ResidentSetSize hotspot-pid-204767-id-1-2023_06_02_11_56_19.jfr
>>
>> Resident Set Size
>>
>> Time             Resident Set Size         Resident Set Size Peak Value
>> ---------------- ------------------------- ------------------------------------
>> 11:56:07         1.1 GB                    1.2 GB
>> 11:56:08         333.7 MB                  1.2 GB
>> 11:56:09         432.4 MB                  1.2 GB
>> 11:56:10         695.9 MB                  1.2 GB
>> 11:56:11         1.0 GB                    1.2 GB
>> 11:56:12         1.3 GB                    1.3 GB
>> 11:56:13         1.3 GB                    1.3 GB
>> 11:56:14         1.3 GB                    1.3 GB
>> 11:56:15         1.3 GB                    1.3 GB
>> 11:56:16         1.3 GB                    1.3 GB
>> 11:56:17         1.4 GB                    1.4 GB
>> 11:56:18         1.8 GB                    1.8 GB
>> 11:56:19         2.0 GB                    2.0 GB
>>
>> The event has been implemented for Linux, MacOS, and Windows. The name ResidentSetSize isn't a perfect fit for the values extracted from MacOS and Windows, but I think it is better to name this after something that many people are familiar with instead of trying to find a generic name that fits all platforms. Do you agree with that, or should we change it to something else?
>>
>> I've manually sanity checked that we get reasonable values on all OS:es. I've also added a jtreg test.
>
> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:
>
>   Remove unused test imports

Thanks for the reviews!
------------- PR Comment: https://git.openjdk.org/jdk/pull/14285#issuecomment-1580615718
From stefank at openjdk.org Wed Jun 7 11:44:14 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 7 Jun 2023 11:44:14 GMT Subject: Integrated: 8307374: Add a JFR event for tracking RSS In-Reply-To: References: Message-ID:
On Fri, 2 Jun 2023 13:41:18 GMT, Stefan Karlsson wrote:
> Add a JFR event to track the resident set size (RSS) of the running process. This is a good complement to the new Native Memory Tracking events that were added for JDK 20 ([JDK-8157023](https://bugs.openjdk.org/browse/JDK-8157023))
>
> You can use the JDK Mission Control tool to extract this data. Or, you can use the new [JFR Views](https://egahlin.github.io/2023/05/30/views.html) tool to get a textual representation of the values:
>
> # Create a JFR recording
> $ jdk/bin/java -XX:StartFlightRecording=dumponexit=true JavaApp
>
> # Extract the data from that recording
> $ jdk/bin/jfr view ResidentSetSize hotspot-pid-204767-id-1-2023_06_02_11_56_19.jfr
>
> Resident Set Size
>
> Time             Resident Set Size         Resident Set Size Peak Value
> ---------------- ------------------------- ------------------------------------
> 11:56:07         1.1 GB                    1.2 GB
> 11:56:08         333.7 MB                  1.2 GB
> 11:56:09         432.4 MB                  1.2 GB
> 11:56:10         695.9 MB                  1.2 GB
> 11:56:11         1.0 GB                    1.2 GB
> 11:56:12         1.3 GB                    1.3 GB
> 11:56:13         1.3 GB                    1.3 GB
> 11:56:14         1.3 GB                    1.3 GB
> 11:56:15         1.3 GB                    1.3 GB
> 11:56:16         1.3 GB                    1.3 GB
> 11:56:17         1.4 GB                    1.4 GB
> 11:56:18         1.8 GB                    1.8 GB
> 11:56:19         2.0 GB                    2.0 GB
>
> The event has been implemented for Linux, MacOS, and Windows. The name ResidentSetSize isn't a perfect fit for the values extracted from MacOS and Windows, but I think it is better to name this after something that many people are familiar with instead of trying to find a generic name that fits all platforms. Do you agree with that, or should we change it to something else?
>
> I've manually sanity checked that we get reasonable values on all OS:es.
I've also added a jtreg test. This pull request has now been integrated. Changeset: 5722903d Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/5722903d53e90e36b284967aeb60d2f8b65a744c Stats: 211 lines in 11 files changed: 211 ins; 0 del; 0 mod 8307374: Add a JFR event for tracking RSS Reviewed-by: stuefe, rcastanedalo ------------- PR: https://git.openjdk.org/jdk/pull/14285 From sspitsyn at openjdk.org Wed Jun 7 11:48:19 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 11:48:19 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v6] In-Reply-To: References: Message-ID: > When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based on the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY. > The fix adds: > - a special case for passive carrier threads > - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`.
> > Testing: > - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` > - submitted mach5 tiers 1-5 > - TBD: to submit mach5 tier 6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: one function renaming ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14298/files - new: https://git.openjdk.org/jdk/pull/14298/files/3e7618c4..a6e6c981 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14298&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14298&range=04-05 Stats: 8 lines in 2 files changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/14298.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14298/head:pull/14298 PR: https://git.openjdk.org/jdk/pull/14298 From sspitsyn at openjdk.org Wed Jun 7 11:53:57 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 11:53:57 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v6] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 11:48:19 GMT, Serguei Spitsyn wrote: >> When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY. >> The fix adds: >> - a special case for passive carrier threads >> - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. >> >> Testing: >> - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` >> - submitted mach5 tiers 1-5 >> - TBD: to submit mach5 tier 6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: one function renaming Thank you for review, Alex and Chris. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14298#issuecomment-1580639648 From kdnilsen at openjdk.org Wed Jun 7 12:31:18 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 12:31:18 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9] In-Reply-To: References: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> Message-ID: On Wed, 7 Jun 2023 07:22:13 GMT, Y. Srinivas Ramakrishna wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Update copyright notices > > src/hotspot/cpu/ppc/gc/shenandoah/shenandoahBarrierSetAssembler_ppc.cpp line 4: > >> 2: * Copyright (c) 2018, 2021, Red Hat, Inc. All rights reserved. >> 3: * Copyright (c) 2012, 2022 SAP SE. All rights reserved. >> 4: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > I believe line 4 should be deleted; the copyright header change here is unnecessary. Will remove. I noticed that Amazon had also contributed to this file, but the changes were very minor. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221506981 From kdnilsen at openjdk.org Wed Jun 7 12:37:42 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 12:37:42 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v10] In-Reply-To: References: Message-ID: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies `-XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity.
Generational mode of Shenandoah resembles G1 in the following regards:
>
> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion.
> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information.
> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah.
> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation.
>
> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah.
>
> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.
Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Remove one more extraneous Amazon copyright ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/f6c073a5..221c88ff Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=08-09 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From sspitsyn at openjdk.org Wed Jun 7 12:40:08 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 12:40:08 GMT Subject: RFR: 8309602: update JVMTI history table for jdk 21 Message-ID: This is a minor update of the `jvmti.xml` file. The JVM TI history table needs to be updated to list new capability, functions and events added to support virtual threads as a permanent feature in JDK 21. Also, it should list a minor update with the `Implementation Note` that dynamic loading of agents into a running VM is now specified to print a warning (JEP 451). The JVM TI history table is maintained for convenience only and does not require a CSR. 
------------- Commit messages: - 8309602: update JVMTI history table for jdk 21 Changes: https://git.openjdk.org/jdk/pull/14352/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14352&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8309602 Stats: 18 lines in 1 file changed: 17 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14352.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14352/head:pull/14352 PR: https://git.openjdk.org/jdk/pull/14352 From kdnilsen at openjdk.org Wed Jun 7 12:44:53 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 12:44:53 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v11] In-Reply-To: References: Message-ID: <1g_0xe2WOXuGOGYZ02225hi4yyz6OKWsRxVNrpZjhhE=.bacf7c2e-24d4-4060-8cff-7e27cd6f3721@github.com> > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. 
As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation.
>
> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah.
>
> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.

Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision:

  JDK-8309322: [GenShen] TestAllocOutOfMemory#large failed

  When generational Shenandoah is used, there may be an additional alignment-related heap size adjustment that the test should be cognizant of. Such alignment might also happen in the non-generational case, but in this case the specific size used in the test was affected on machines with larger than usual OS page size settings. The alignment-related adjustment would have affected all generational collectors (except perhaps Gen Z). In the future, we might try to relax this alignment constraint.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/221c88ff..88958669 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=09-10 Stats: 27 lines in 1 file changed: 16 ins; 0 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From sspitsyn at openjdk.org Wed Jun 7 12:57:34 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 12:57:34 GMT Subject: RFR: 8309602: update JVMTI history table for jdk 21 [v2] In-Reply-To: References: Message-ID: > This is a minor update of the `jvmti.xml` file. > The JVM TI history table needs to be updated to list new capability, functions and events added to support virtual threads as a permanent feature in JDK 21. Also, it should list a minor update with the `Implementation Note` that dynamic loading of agents into a running VM is now specified to print a warning (JEP 451). > The JVM TI history table is maintained for convenience only and does not require a CSR. 
Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: simplified the latest history entries ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14352/files - new: https://git.openjdk.org/jdk/pull/14352/files/cb95bf11..a9e4f8eb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14352&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14352&range=00-01 Stats: 15 lines in 1 file changed: 0 ins; 13 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14352.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14352/head:pull/14352 PR: https://git.openjdk.org/jdk/pull/14352 From alanb at openjdk.org Wed Jun 7 12:58:56 2023 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 7 Jun 2023 12:58:56 GMT Subject: RFR: 8309602: update JVMTI history table for jdk 21 [v2] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 12:57:34 GMT, Serguei Spitsyn wrote: >> This is a minor update of the `jvmti.xml` file. >> The JVM TI history table needs to be updated to list: >> - Virtual threads finalized to be a permanent feature. >> - Agent start-up in the live phase now specified to print a warning. >> >> The JVM TI history table is maintained for convenience only and does not require a CSR. > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: simplified the latest history entries A small suggestion for the 19.0.0 note, otherwise good. src/hotspot/share/prims/jvmti.xml line 15459: > 15457: > 15458: > 15459: Preview feature - Support for virtual threads: Maybe "Support for virtual threads as preview feature" or "Support for Virtual Threads (Preview)". src/hotspot/share/prims/jvmti.xml line 15474: > 15472: Virtual threads finalized to be a permanent feature. > 15473: Agent start-up in the live phase now specified to print a warning. > 15474: This is just the history text, no normative changes, shouldn't need a CSR. 
------------- Marked as reviewed by alanb (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14352#pullrequestreview-1467588938 PR Review Comment: https://git.openjdk.org/jdk/pull/14352#discussion_r1221548406 PR Review Comment: https://git.openjdk.org/jdk/pull/14352#discussion_r1221549609 From sspitsyn at openjdk.org Wed Jun 7 13:08:00 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 13:08:00 GMT Subject: RFR: 8309602: update JVMTI history table for jdk 21 [v3] In-Reply-To: References: Message-ID: <2cVTkVoKxRkH6B_a7B6Toc6hK5IWhnwJkBX0hnP_g-0=.e169eba0-bf85-4a9c-b227-211c5315d74a@github.com> > This is a minor update of the `jvmti.xml` file. > The JVM TI history table needs to be updated to list: > - Virtual threads finalized to be a permanent feature. > - Agent start-up in the live phase now specified to print a warning. > > The JVM TI history table is maintained for convenience only and does not require a CSR. Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: minor history table tweak ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14352/files - new: https://git.openjdk.org/jdk/pull/14352/files/a9e4f8eb..11db4f4f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14352&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14352&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14352.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14352/head:pull/14352 PR: https://git.openjdk.org/jdk/pull/14352 From sspitsyn at openjdk.org Wed Jun 7 13:08:03 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 13:08:03 GMT Subject: RFR: 8309602: update JVMTI history table for jdk 21 [v2] In-Reply-To: References: Message-ID: <3iju4FBlvcDEeRLIAUCPoV-nZEIuJnrka8cSN-qFmUA=.ab40cb6b-31be-4ecc-aa15-040f1dc9ffac@github.com> On Wed, 7 Jun 2023 12:55:27 GMT, Alan Bateman wrote: >> 
Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: simplified the latest history entries > > src/hotspot/share/prims/jvmti.xml line 15459: > >> 15457: >> 15458: >> 15459: Preview feature - Support for virtual threads: > > Maybe "Support for virtual threads as preview feature" or "Support for Virtual Threads (Preview)". Thanks. Fixed now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14352#discussion_r1221558022 From sspitsyn at openjdk.org Wed Jun 7 13:09:57 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 13:09:57 GMT Subject: RFR: 8309602: update JVMTI history table for jdk 21 [v2] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 12:56:18 GMT, Alan Bateman wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: simplified the latest history entries > > src/hotspot/share/prims/jvmti.xml line 15474: > >> 15472: Virtual threads finalized to be a permanent feature. >> 15473: Agent start-up in the live phase now specified to print a warning. >> 15474: > > This is just the history text, no normative changes, shouldn't need a CSR. Okay, thanks. Updated the description. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14352#discussion_r1221564482 From shade at openjdk.org Wed Jun 7 13:15:01 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 7 Jun 2023 13:15:01 GMT Subject: Integrated: 8309543: Micro-optimize x86 assembler UseCondCardMark In-Reply-To: References: Message-ID: On Tue, 6 Jun 2023 14:24:14 GMT, Aleksey Shipilev wrote: > Noticed this while explaining a related code: there is no need to make a full jump, and a short jump would suffice. Assembler does not know about this shortening, because it is a forward branch. > > Makes a slightly more compact interpreter code. This pull request has now been integrated. 
Changeset: f0236edf Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/f0236edfba1303207e46b5b292cf4c6a18b87d1d Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8309543: Micro-optimize x86 assembler UseCondCardMark Reviewed-by: kvn, mdoerr ------------- PR: https://git.openjdk.org/jdk/pull/14335 From sspitsyn at openjdk.org Wed Jun 7 13:20:02 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 13:20:02 GMT Subject: Integrated: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING In-Reply-To: References: Message-ID: On Sat, 3 Jun 2023 10:53:04 GMT, Serguei Spitsyn wrote: > When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based on the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY. > The fix adds: > - a special case for passive carrier threads > - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. > > Testing: > - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` > - submitted mach5 tiers 1-5 > - TBD: to submit mach5 tier 6 This pull request has now been integrated.
Changeset: 177e8327 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/177e8327d685444d63235567f2a9bde0ec3d51cf Stats: 82 lines in 4 files changed: 65 ins; 0 del; 17 mod 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING Reviewed-by: amenkov, cjplummer ------------- PR: https://git.openjdk.org/jdk/pull/14298 From alanb at openjdk.org Wed Jun 7 13:27:06 2023 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 7 Jun 2023 13:27:06 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v6] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 11:48:19 GMT, Serguei Spitsyn wrote: >> When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based on the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY. >> The fix adds: >> - a special case for passive carrier threads >> - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. >> >> Testing: >> - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` >> - submitted mach5 tiers 1-5 >> - TBD: to submit mach5 tier 6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: one function renaming src/hotspot/share/prims/jvmtiEnvBase.cpp line 1741: > 1739: "sanity check"); > 1740: > 1741: // An attempt to handshake-suspend a passive carrier thread will result in The rename from is_passive_carrier_thread to is_thread_carrying_vthread looks fine. There are a few stray comments that still say "passive carrier thread" that probably should be cleaned up. I see you've just integrated this change but maybe the next change in the area can do this.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1221598356 From sspitsyn at openjdk.org Wed Jun 7 13:36:06 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 13:36:06 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v6] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 13:24:32 GMT, Alan Bateman wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: one function renaming > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1741: > >> 1739: "sanity check"); >> 1740: >> 1741: // An attempt to handshake-suspend a passive carrier thread will result in > > The rename from is_passive_carrier_thread to is_thread_carrying_vthread looks fine. There are a few stray comments that still say "passive carrier thread" that probably should be cleaned up. I see you've just integrated this change but maybe the next change in the area can do this. Thanks, Alan. I will do this cleanup when there is a chance. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14298#discussion_r1221612240 From kdnilsen at openjdk.org Wed Jun 7 13:37:44 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 13:37:44 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v12] In-Reply-To: References: Message-ID: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies `-XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity.
Generational mode of Shenandoah resembles G1 in the following regards:
>
> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion.
> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information.
> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah.
> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation.
>
> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah.
>
> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.
Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Remove more extraneous copyright notices ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/88958669..8e5c3b73 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=10-11 Stats: 10 lines in 10 files changed: 0 ins; 10 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From stefank at openjdk.org Wed Jun 7 13:38:09 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 7 Jun 2023 13:38:09 GMT Subject: RFR: 8306841: Generational ZGC: NMT reports Java heap size larger than max heap size Message-ID: ZGC has separated the committing of physical memory from the mapping of the committed memory to virtual memory. It also has asynchronous, lazy unmapping of virtual memory from physical memory. This leads to a situation where multiple virtual memory areas can be mapped to the same physical memory. NMT has a strong assumption that there's a 1-to-1 correspondence between committed memory and its virtual memory areas. Because of this NMT and ZGC is not entirely compatible. ZGC has worked around this by adding NMT hooks where the virtual memory is mapped to the committed memory. This mostly works, but there are situations where we have multiple virtual memory areas mapped to the same physical memory, and that causes the NMT values to be inflated. I propose that we move the NMT committed memory tracking from the mapping of virtual memory to the actual committing of physical memory. FWIW, given that NMT and ZGC doesn't agree about how memory is committed, we have to fake the virtual memory addresses reported to NMT. 
This could probably be noticed if you look for the Java heap addresses in the NMT details output, but I don't see why anyone should be looking for those address for the Java heap in NMT. The interesting number is the amount of committed memory, not the exact addresses, IMHO. This isn't something that we change with this patch, but it can be worth understanding while looking at this Bug and the associated PR. ------------- Commit messages: - 8306841: ZGC: NMT reports Java heap size larger than max heap size Changes: https://git.openjdk.org/jdk/pull/14355/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14355&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8306841 Stats: 128 lines in 2 files changed: 104 ins; 18 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/14355.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14355/head:pull/14355 PR: https://git.openjdk.org/jdk/pull/14355 From kdnilsen at openjdk.org Wed Jun 7 13:37:57 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 13:37:57 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9] In-Reply-To: References: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> Message-ID: On Wed, 7 Jun 2023 07:54:17 GMT, Y. Srinivas Ramakrishna wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Update copyright notices > > src/hotspot/share/gc/shenandoah/shenandoahFullGC.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 2014, 2021, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Unnecessary fixed. > src/hotspot/share/gc/shenandoah/shenandoahGC.cpp line 3: > >> 1: /* >> 2: * Copyright (c) 2021, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Unnecessary fixed. 
> src/hotspot/share/gc/shenandoah/shenandoahGC.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 2021, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Unnecessary fixed. > src/hotspot/share/gc/shenandoah/shenandoahNMethod.cpp line 3: > >> 1: /* >> 2: * Copyright (c) 2019, 2022, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Unnecessary? fixed. > src/hotspot/share/gc/shenandoah/shenandoahNumberSeq.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 2018, 2019, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Unnecessary? fixed. > src/hotspot/share/gc/shenandoah/shenandoahSTWMark.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 2021, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > unnecessary. fixed. > src/hotspot/share/gc/shenandoah/shenandoahUnload.cpp line 3: > >> 1: /* >> 2: * Copyright (c) 2019, 2021, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Delete fixed. > src/hotspot/share/gc/shenandoah/shenandoahWorkerPolicy.cpp line 3: > >> 1: /* >> 2: * Copyright (c) 2017, 2019, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Unnecessary fixed. > src/hotspot/share/gc/shenandoah/shenandoahWorkerPolicy.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 2017, 2022, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Unnecessary fixed. > test/hotspot/gtest/gc/shenandoah/test_shenandoahNumberSeq.cpp line 2: > >> 1: /* >> 2: * Copyright (c) 2022, 2023, Oracle and/or its affiliates. All rights reserved. > > This may be deleted as far as I can tell, or we can just leave it in there. will leave as is. 
> test/hotspot/jtreg/gc/stress/gcold/TestGCOldWithShenandoah.java line 3: > >> 1: /* >> 2: * Copyright (c) 2016, 2020, Oracle and/or its affiliates. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > General comment: Looking at the history, I might have expected RedHat copyright headers also for many of these tests, but that isn't a change that's happened with generational shenandoah. So, nothing for us to do in this PR. ok. will leave as is. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221595485 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221596525 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221597200 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221597834 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221598462 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221599944 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221599533 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221600615 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221601596 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221602442 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221604163 From duke at openjdk.org Wed Jun 7 13:39:57 2023 From: duke at openjdk.org (JoKern65) Date: Wed, 7 Jun 2023 13:39:57 GMT Subject: RFR: JDK-8308288: Fix xlc17 clang warnings and build errors in hotspot In-Reply-To: <_mG48I6TlpqcdrS5N6DOIpRPpw6ZTrwUgMWzPzDjZ4o=.1eeaffd1-54ed-4344-b66e-a4a4a0583c4d@github.com> References: <_mG48I6TlpqcdrS5N6DOIpRPpw6ZTrwUgMWzPzDjZ4o=.1eeaffd1-54ed-4344-b66e-a4a4a0583c4d@github.com> Message-ID: On Fri, 2 Jun 2023 11:28:45 GMT, JoKern65 wrote: > This pr is a split off from JDK-8308288 : Fix xlc17 clang warnings in shared code 
https://github.com/openjdk/jdk/pull/14146 > It handles the part in hotspot. > > It handles the error introduced by a redefine of malloc in stdlib.h resulting in the following build error: > > /data/d042520/pr/jdk/src/hotspot/share/runtime/os.cpp:616:5: error: no member named '_vec_malloc' in 'LogTag'; did you mean 'vec_malloc'? > log_warning(malloc, free)("ptr caught: " PTR_FORMAT, p2i(ptr)); > ^~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/log.hpp:46:28: note: expanded from macro 'log_warning' > #define log_warning(...) (!log_is_enabled(Warning, __VA_ARGS__)) ? (void)0 : LogImpl::write > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/log.hpp:68:45: note: expanded from macro 'log_is_enabled' > #define log_is_enabled(level, ...) (LogImpl::is_level(LogLevel::level)) > ^~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/logTag.hpp:221:38: note: expanded from macro 'LOG_TAGS' > #define LOG_TAGS(...) EXPAND_VARARGS(LOG_TAGS_EXPANDED(__VA_ARGS__, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG)) > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/logTag.hpp:217:57: note: expanded from macro 'LOG_TAGS_EXPANDED' > #define LOG_TAGS_EXPANDED(T0, T1, T2, T3, T4, T5, ...) PREFIX_LOG_TAG(T0), PREFIX_LOG_TAG(T1), PREFIX_LOG_TAG(T2), \ > ^~~~~~~~~~~~~~~~~~ > ... 
(rest of output omitted) > > > Additionally it solves the need for an #include on AIX for any usage of the alloca function, by adding the include to globalDefinitions_xlc.hpp /Integrate ------------- PR Comment: https://git.openjdk.org/jdk/pull/14283#issuecomment-1580842441 From duke at openjdk.org Wed Jun 7 13:48:04 2023 From: duke at openjdk.org (JoKern65) Date: Wed, 7 Jun 2023 13:48:04 GMT Subject: Integrated: JDK-8308288: Fix xlc17 clang warnings and build errors in hotspot In-Reply-To: <_mG48I6TlpqcdrS5N6DOIpRPpw6ZTrwUgMWzPzDjZ4o=.1eeaffd1-54ed-4344-b66e-a4a4a0583c4d@github.com> References: <_mG48I6TlpqcdrS5N6DOIpRPpw6ZTrwUgMWzPzDjZ4o=.1eeaffd1-54ed-4344-b66e-a4a4a0583c4d@github.com> Message-ID: On Fri, 2 Jun 2023 11:28:45 GMT, JoKern65 wrote: > This pr is a split off from JDK-8308288 : Fix xlc17 clang warnings in shared code https://github.com/openjdk/jdk/pull/14146 > It handles the part in hotspot. > > It handles the error introduced by a redefine of malloc in stdlib.h resulting in the following build error: > > /data/d042520/pr/jdk/src/hotspot/share/runtime/os.cpp:616:5: error: no member named '_vec_malloc' in 'LogTag'; did you mean 'vec_malloc'? > log_warning(malloc, free)("ptr caught: " PTR_FORMAT, p2i(ptr)); > ^~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/log.hpp:46:28: note: expanded from macro 'log_warning' > #define log_warning(...) (!log_is_enabled(Warning, __VA_ARGS__)) ? (void)0 : LogImpl::write > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/log.hpp:68:45: note: expanded from macro 'log_is_enabled' > #define log_is_enabled(level, ...) (LogImpl::is_level(LogLevel::level)) > ^~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/logTag.hpp:221:38: note: expanded from macro 'LOG_TAGS' > #define LOG_TAGS(...) 
EXPAND_VARARGS(LOG_TAGS_EXPANDED(__VA_ARGS__, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG, _NO_TAG)) > ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > /data/d042520/pr/jdk/src/hotspot/share/logging/logTag.hpp:217:57: note: expanded from macro 'LOG_TAGS_EXPANDED' > #define LOG_TAGS_EXPANDED(T0, T1, T2, T3, T4, T5, ...) PREFIX_LOG_TAG(T0), PREFIX_LOG_TAG(T1), PREFIX_LOG_TAG(T2), \ > ^~~~~~~~~~~~~~~~~~ > ... (rest of output omitted) > > > Additionally it solves the need for an #include on AIX for any usage of the alloca function, by adding the include to globalDefinitions_xlc.hpp This pull request has now been integrated. Changeset: 5b147eb5 Author: JoKern65 Committer: Martin Doerr URL: https://git.openjdk.org/jdk/commit/5b147eb5e46ac7fa637ed997c6da8f238f685ea4 Stats: 13 lines in 2 files changed: 10 ins; 0 del; 3 mod 8308288: Fix xlc17 clang warnings and build errors in hotspot Reviewed-by: goetz, mbaesken ------------- PR: https://git.openjdk.org/jdk/pull/14283 From kdnilsen at openjdk.org Wed Jun 7 14:09:23 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 14:09:23 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v12] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 13:37:44 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. 
Old-generation marking runs concurrently during the time that multiple young generation collections run to completion.
>> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information.
>> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah.
>> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation.
>>
>> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah.
>>
>> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.
>
> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision:
>
> Remove more extraneous copyright notices

Hi Thomas,

Thank you for your followup comments.
I am in total agreement that it is a shame the challenges we have faced and the progress we have made are not better documented in the history of JBS tickets. I have been the worst offender. I apologize. One aspect of this problem is that our work has included a large degree of uncertainty and "research", and it is not always clear to us what needs to be addressed until after we finish and test certain fixes as integrated with a variety of other fixes. We will commit to being more engaged with JBS from this point forward, both for any further work done on the Shenandoah branch, and definitely for work done on tip.

You are correct that the change is to N, the number of times in a row that we perform degenerated GC before we automatically upgrade to Full GC. It is still possible that we will upgrade to Full GC before N is reached, because there are other situations, such as lack of progress by degenerated GC, that will cause us to upgrade to Full even before N is reached.

The comment is still valid as written. During degenerated GC, the mutator threads are all blocked, so the ONLY kind of allocation failure that can occur during degenerated GC is a GC-worker-thread allocation for the purpose of evacuating memory. If we experience an "evacuation failure" during degenerated GC, we will upgrade to Full GC.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1580896496

From kdnilsen at openjdk.org Wed Jun 7 14:09:25 2023
From: kdnilsen at openjdk.org (Kelvin Nilsen)
Date: Wed, 7 Jun 2023 14:09:25 GMT
Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9]
In-Reply-To:
References: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com>
Message-ID: <4oGdxjol_Xzl_3bE1LgSwsvWzfNgt7yli7dqVBc5yb8=.bab01852-5037-4b19-bc81-f67b6dd13611@github.com>

On Wed, 7 Jun 2023 07:52:35 GMT, Y.
Srinivas Ramakrishna wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Update copyright notices > > src/hotspot/share/gc/shenandoah/shenandoahDegeneratedGC.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 2021, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Unnecessary. Thanks Ramki for sifting through these again. Sorry I missed so many. I'm making your suggested fixes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221612109 From eosterlund at openjdk.org Wed Jun 7 14:14:55 2023 From: eosterlund at openjdk.org (Erik =?UTF-8?B?w5ZzdGVybHVuZA==?=) Date: Wed, 7 Jun 2023 14:14:55 GMT Subject: RFR: 8306841: Generational ZGC: NMT reports Java heap size larger than max heap size In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 13:30:05 GMT, Stefan Karlsson wrote: > ZGC has separated the committing of physical memory from the mapping of the committed memory to virtual memory. It also has asynchronous, lazy unmapping of virtual memory from physical memory. This leads to a situation where multiple virtual memory areas can be mapped to the same physical memory. NMT has a strong assumption that there's a 1-to-1 correspondence between committed memory and its virtual memory areas. Because of this NMT and ZGC is not entirely compatible. ZGC has worked around this by adding NMT hooks where the virtual memory is mapped to the committed memory. This mostly works, but there are situations where we have multiple virtual memory areas mapped to the same physical memory, and that causes the NMT values to be inflated. > > I propose that we move the NMT committed memory tracking from the mapping of virtual memory to the actual committing of physical memory. > > FWIW, given that NMT and ZGC doesn't agree about how memory is committed, we have to fake the virtual memory addresses reported to NMT. 
This could probably be noticed if you look for the Java heap addresses in the NMT details output, but I don't see why anyone should be looking for those address for the Java heap in NMT. The interesting number is the amount of committed memory, not the exact addresses, IMHO. This isn't something that we change with this patch, but it can be worth understanding while looking at this Bug and the associated PR. > > I've written a small sanity test for the NMT Java Heap values, however it's non-trivial to write a test that efficiently provokes this. I've verified this fix by manually running an over-provisioned SPECjbb2015 run, which results in a lot of splitting of ZGC heap regions, which in turn gives us multiple virtual memory area mapping for the same physical memory. > > Side note: the lazy unmapping of virtual memory can cause other problems with too many virtual memory areas. The inflated NMT numbers have been a smoking gun showing us that issue. We are tracking that issue with [JDK-8308783](https://bugs.openjdk.org/browse/JDK-8308783). Looks good. ------------- Marked as reviewed by eosterlund (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14355#pullrequestreview-1467790288 From stuefe at openjdk.org Wed Jun 7 14:23:59 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 7 Jun 2023 14:23:59 GMT Subject: RFR: 8306841: Generational ZGC: NMT reports Java heap size larger than max heap size In-Reply-To: References: Message-ID: <7q-C6XTmgkOMgXGYjDNXop5dKrTdvhh52jagvS8YfO0=.9fde096e-1889-4ce1-80c2-9ee9608a91d4@github.com> On Wed, 7 Jun 2023 13:30:05 GMT, Stefan Karlsson wrote: > ZGC has separated the committing of physical memory from the mapping of the committed memory to virtual memory. It also has asynchronous, lazy unmapping of virtual memory from physical memory. This leads to a situation where multiple virtual memory areas can be mapped to the same physical memory. 
NMT has a strong assumption that there's a 1-to-1 correspondence between committed memory and its virtual memory areas. Because of this NMT and ZGC is not entirely compatible. ZGC has worked around this by adding NMT hooks where the virtual memory is mapped to the committed memory. This mostly works, but there are situations where we have multiple virtual memory areas mapped to the same physical memory, and that causes the NMT values to be inflated. > > I propose that we move the NMT committed memory tracking from the mapping of virtual memory to the actual committing of physical memory. > > FWIW, given that NMT and ZGC doesn't agree about how memory is committed, we have to fake the virtual memory addresses reported to NMT. This could probably be noticed if you look for the Java heap addresses in the NMT details output, but I don't see why anyone should be looking for those address for the Java heap in NMT. The interesting number is the amount of committed memory, not the exact addresses, IMHO. This isn't something that we change with this patch, but it can be worth understanding while looking at this Bug and the associated PR. > > I've written a small sanity test for the NMT Java Heap values, however it's non-trivial to write a test that efficiently provokes this. I've verified this fix by manually running an over-provisioned SPECjbb2015 run, which results in a lot of splitting of ZGC heap regions, which in turn gives us multiple virtual memory area mapping for the same physical memory. > > Side note: the lazy unmapping of virtual memory can cause other problems with too many virtual memory areas. The inflated NMT numbers have been a smoking gun showing us that issue. We are tracking that issue with [JDK-8308783](https://bugs.openjdk.org/browse/JDK-8308783). Looks good. Question: why is this limited to generational ZGC? Just a decision not to fix old ZGC, or does it not happen with old ZGC? 
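
The inflation described in the quoted text can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not ZGC or NMT code, and `ToyAccounting` and `remap_before_lazy_unmap` are hypothetical names invented for the illustration. It assumes one 2M chunk of physical memory that gets a second virtual mapping before the first mapping has been lazily unmapped.

```cpp
#include <cassert>
#include <cstddef>

// Toy model only; every name here is hypothetical and nothing is taken
// from the actual ZGC or NMT sources.
struct ToyAccounting {
  size_t committed_physical = 0;  // bytes of physical memory really committed
  size_t reported_at_map = 0;     // total if the tracker is hooked into map/unmap
  size_t reported_at_commit = 0;  // total if the tracker is hooked into commit/uncommit

  void commit(size_t bytes) {
    committed_physical += bytes;
    reported_at_commit += bytes;
  }
  void uncommit(size_t bytes) {
    committed_physical -= bytes;
    reported_at_commit -= bytes;
  }
  // Mapping a virtual area onto already committed physical memory.
  void map(size_t bytes)   { reported_at_map += bytes; }
  // The (possibly delayed) removal of such a mapping.
  void unmap(size_t bytes) { reported_at_map -= bytes; }
};

// One physical chunk, mapped twice because the old mapping has not yet
// been removed when the new one is created.
inline ToyAccounting remap_before_lazy_unmap() {
  const size_t two_mb = 2 * 1024 * 1024;
  ToyAccounting a;
  a.commit(two_mb);
  a.map(two_mb);  // first virtual area
  a.map(two_mb);  // second virtual area, same physical memory
  return a;
}
```

In this model the map-time total is twice the committed amount until the delayed `unmap` runs, while the commit-time total always equals `committed_physical`.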
> FWIW, given that NMT and ZGC doesn't agree about how memory is committed, we have to fake the virtual memory addresses reported to NMT. This could probably be noticed if you look for the Java heap addresses in the NMT details output, but I don't see why anyone should be looking for those address for the Java heap in NMT. We do, but it is not such an important use case: in hs_err file "unknown pointer" printing, I use NMT to make sense of an otherwise unknown address. src/hotspot/share/gc/z/zPhysicalMemory.cpp line 285: > 283: // When this function is called we don't know where in the virtual memory > 284: // this physical memory will be mapped. So we fake that the virtual memory > 285: // address is the heap base + the given offset. Question of a casual ZGC source reader: when you talk about physical vs virtual here, you are not talking about the real physical vs virtual, right? You are talking about offsets into the ZGC backing file vs attach points of said offsets in the virtual address space? ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14355#pullrequestreview-1467796287 PR Review Comment: https://git.openjdk.org/jdk/pull/14355#discussion_r1221676342 From duke at openjdk.org Wed Jun 7 14:25:58 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 7 Jun 2023 14:25:58 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v3] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 09:20:17 GMT, Thomas Stuefe wrote: > we map the archive region at the end of the heap, therefore, if heap size changed between dump time and runtime, we will always have to relocate Not always. For small heap sizes under 2G with compressedoops the heap is mapped such that top of the heap is at 4G boundary. So by keeping the archive regions towards the end, we can get same offset even if the heap size changes. 
If we map towards the bottom, then the offsets would change as the heap size changes.

For example, the default archive is created with 128m. So executing:

$ java -Xlog:cds -Xmx128m -version
[0.003s][info][cds] The current max heap size = 128M, HeapRegion::GrainBytes = 1048576
[0.003s][info][cds] narrow_klass_base = 0x0000000800000000, narrow_klass_shift = 0
[0.003s][info][cds] narrow_oop_mode = 0, narrow_oop_base = 0x0000000000000000, narrow_oop_shift = 0
[0.003s][info][cds] heap range = [0x00000000f8000000 - 0x0000000100000000]
[0.003s][info][cds] Preferred address to map heap data (to avoid relocation) is 0x00000000ffe00000
[0.003s][info][cds] Heap data mapped at 0x00000000ffe00000, size = 1071752 bytes
[0.003s][info][cds] CDS heap data relocation delta = 0 bytes

Changing the heap size to 256m:

$ java -Xlog:cds -Xmx256m -version
[0.003s][info][cds] The current max heap size = 256M, HeapRegion::GrainBytes = 1048576
[0.003s][info][cds] narrow_klass_base = 0x0000000800000000, narrow_klass_shift = 0
[0.003s][info][cds] narrow_oop_mode = 0, narrow_oop_base = 0x0000000000000000, narrow_oop_shift = 0
[0.003s][info][cds] heap range = [0x00000000f0000000 - 0x0000000100000000]
[0.003s][info][cds] Preferred address to map heap data (to avoid relocation) is 0x00000000ffe00000
[0.003s][info][cds] Heap data mapped at 0x00000000ffe00000, size = 1071752 bytes
[0.003s][info][cds] CDS heap data relocation delta = 0 bytes

Notice that relocation delta is 0 in both cases and the heap data is mapped at same address.
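
The arithmetic behind the two runs above can be sketched as follows. This is an illustration only, not the HotSpot code: the helper names are made up, and the constants are copied from the log output (heap end at the 4G boundary, region size 1048576 bytes, archive size 1071752 bytes). The start address follows the `reserved.end() - align_up(word_size, HeapRegion::GrainWords)` shape from the patch under review, expressed in bytes.

```cpp
#include <cassert>
#include <cstdint>

// Round value up to the next multiple of alignment (a power of two).
constexpr uint64_t align_up(uint64_t value, uint64_t alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: start of the archive when it is placed in the
// last region(s) of the heap, in bytes rather than heap words.
constexpr uint64_t archive_start_at_heap_end(uint64_t heap_end,
                                             uint64_t archive_bytes,
                                             uint64_t region_bytes) {
  return heap_end - align_up(archive_bytes, region_bytes);
}

// Hypothetical helper: bottom of a heap whose top is pinned at heap_end.
constexpr uint64_t heap_bottom(uint64_t heap_end, uint64_t max_heap_bytes) {
  return heap_end - max_heap_bytes;
}

// Values taken from the -Xlog:cds output above.
constexpr uint64_t M = 1024 * 1024;
constexpr uint64_t kHeapEnd = 0x100000000ULL;  // 4G boundary
constexpr uint64_t kRegionBytes = 1048576;     // HeapRegion::GrainBytes
constexpr uint64_t kArchiveBytes = 1071752;    // size of the mapped heap data

// The end-relative address does not depend on -Xmx ...
static_assert(archive_start_at_heap_end(kHeapEnd, kArchiveBytes, kRegionBytes)
                  == 0xffe00000ULL,
              "archive start matches the logged preferred address");
// ... while the heap bottom moves with it.
static_assert(heap_bottom(kHeapEnd, 128 * M) == 0xf8000000ULL, "128m bottom");
static_assert(heap_bottom(kHeapEnd, 256 * M) == 0xf0000000ULL, "256m bottom");
```

Because the archive address is computed from the heap end, it stays at 0xffe00000 for both heap sizes, whereas an address computed from the heap bottom would differ between 0xf8000000 and 0xf0000000.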
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1221694835 From stuefe at openjdk.org Wed Jun 7 14:33:05 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 7 Jun 2023 14:33:05 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v3] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 14:23:20 GMT, Ashutosh Mehra wrote: >> src/hotspot/share/gc/g1/g1CollectedHeap.cpp line 551: >> >>> 549: HeapWord* start_addr = reserved.end() - align_up(word_size, HeapRegion::GrainWords); >>> 550: MemRegion range = MemRegion(start_addr, word_size); >>> 551: HeapWord* last_address = range.last(); >> >> Do I understand this - and the old code - correctly: we map the archive region at the end of the heap, therefore, if heap size changed between dump time and runtime, we will always have to relocate? I didn't look but I assume the dumptime heap size is the default heap size? >> >> Another, possibly stupid, question: why don't we use the Heap bottom - we'd still avoid fragmentation but have a much higher chance of not relocating? > >> we map the archive region at the end of the heap, therefore, if heap size changed between dump time and runtime, we will always have to relocate > > Not always. For small heap sizes under 2G with compressedoops the heap is mapped such that top of the heap is at 4G boundary. So by keeping the archive regions towards the end, we can get same offset even if the heap size changes. If we map towards the bottom, then the offsets would change as the heap size changes. > > For example, the default archive is created with 128m. 
So executing:
>
> $ java -Xlog:cds -Xmx128m -version
> [0.003s][info][cds] The current max heap size = 128M, HeapRegion::GrainBytes = 1048576
> [0.003s][info][cds] narrow_klass_base = 0x0000000800000000, narrow_klass_shift = 0
> [0.003s][info][cds] narrow_oop_mode = 0, narrow_oop_base = 0x0000000000000000, narrow_oop_shift = 0
> [0.003s][info][cds] heap range = [0x00000000f8000000 - 0x0000000100000000]
> [0.003s][info][cds] Preferred address to map heap data (to avoid relocation) is 0x00000000ffe00000
> [0.003s][info][cds] Heap data mapped at 0x00000000ffe00000, size = 1071752 bytes
> [0.003s][info][cds] CDS heap data relocation delta = 0 bytes
>
> Changing the heap size to 256m:
>
> $ java -Xlog:cds -Xmx256m -version
> [0.003s][info][cds] The current max heap size = 256M, HeapRegion::GrainBytes = 1048576
> [0.003s][info][cds] narrow_klass_base = 0x0000000800000000, narrow_klass_shift = 0
> [0.003s][info][cds] narrow_oop_mode = 0, narrow_oop_base = 0x0000000000000000, narrow_oop_shift = 0
> [0.003s][info][cds] heap range = [0x00000000f0000000 - 0x0000000100000000]
> [0.003s][info][cds] Preferred address to map heap data (to avoid relocation) is 0x00000000ffe00000
> [0.003s][info][cds] Heap data mapped at 0x00000000ffe00000, size = 1071752 bytes
> [0.003s][info][cds] CDS heap data relocation delta = 0 bytes
>
> Notice that relocation delta is 0 in both cases and the heap data is mapped at same address.

@ashu-mehra Ah, thank you. I remembered this wrongly as the heap having a preferred fixed start address when the fix point was actually the end of the zero-based encoding range.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1221704559 From duke at openjdk.org Wed Jun 7 14:33:06 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 7 Jun 2023 14:33:06 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v3] In-Reply-To: References: Message-ID: <_8f8oqpdYT56hxKf3MfApsT658ARTF0jXIMnP3UQo7E=.4936b832-9b5f-4853-be3d-c21daa3720c6@github.com> On Wed, 7 Jun 2023 14:29:07 GMT, Thomas Stuefe wrote: >>> we map the archive region at the end of the heap, therefore, if heap size changed between dump time and runtime, we will always have to relocate >> >> Not always. For small heap sizes under 2G with compressedoops the heap is mapped such that top of the heap is at 4G boundary. So by keeping the archive regions towards the end, we can get same offset even if the heap size changes. If we map towards the bottom, then the offsets would change as the heap size changes. >> >> For example, the default archive is created with 128m. 
So executing:
>>
>> $ java -Xlog:cds -Xmx128m -version
>> [0.003s][info][cds] The current max heap size = 128M, HeapRegion::GrainBytes = 1048576
>> [0.003s][info][cds] narrow_klass_base = 0x0000000800000000, narrow_klass_shift = 0
>> [0.003s][info][cds] narrow_oop_mode = 0, narrow_oop_base = 0x0000000000000000, narrow_oop_shift = 0
>> [0.003s][info][cds] heap range = [0x00000000f8000000 - 0x0000000100000000]
>> [0.003s][info][cds] Preferred address to map heap data (to avoid relocation) is 0x00000000ffe00000
>> [0.003s][info][cds] Heap data mapped at 0x00000000ffe00000, size = 1071752 bytes
>> [0.003s][info][cds] CDS heap data relocation delta = 0 bytes
>>
>> Changing the heap size to 256m:
>>
>> $ java -Xlog:cds -Xmx256m -version
>> [0.003s][info][cds] The current max heap size = 256M, HeapRegion::GrainBytes = 1048576
>> [0.003s][info][cds] narrow_klass_base = 0x0000000800000000, narrow_klass_shift = 0
>> [0.003s][info][cds] narrow_oop_mode = 0, narrow_oop_base = 0x0000000000000000, narrow_oop_shift = 0
>> [0.003s][info][cds] heap range = [0x00000000f0000000 - 0x0000000100000000]
>> [0.003s][info][cds] Preferred address to map heap data (to avoid relocation) is 0x00000000ffe00000
>> [0.003s][info][cds] Heap data mapped at 0x00000000ffe00000, size = 1071752 bytes
>> [0.003s][info][cds] CDS heap data relocation delta = 0 bytes
>>
>> Notice that relocation delta is 0 in both cases and the heap data is mapped at same address.
>
> @ashu-mehra Ah, thank you. I remembered this wrongly as the heap having a preferred fixed start address when the fix point was actually the end of the zero-based encoding range.

> I didn't look but I assume the dumptime heap size is the default heap size?

I am wondering why do you think dumptime heap size is the default heap size? When creating an archive the heap size can be anything. For default archive it is 128m.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1221706723 From duke at openjdk.org Wed Jun 7 14:33:03 2023 From: duke at openjdk.org (Ashutosh Mehra) Date: Wed, 7 Jun 2023 14:33:03 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v3] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 09:05:24 GMT, Thomas Stuefe wrote: >> Ashutosh Mehra has updated the pull request incrementally with one additional commit since the last revision: >> >> Review comments - updates to alloc_archive_regions() api >> >> Signed-off-by: Ashutosh Mehra > > src/hotspot/share/cds/filemap.cpp line 2124: > >> 2122: >> 2123: size_t word_size = size / HeapWordSize; >> 2124: address requested_start = heap_region_requested_address(); > > Possibly for another RFE, if you intend to make this code move as verbatim as possible: > > This feels like something that should use `r->mapping_offset()` directly - would make the code easier to understand. Not sure I get this. Are you suggesting replacing `heap_region_requested_address` with `r->mapping_offset()`? But they are not the same, right? > Could we add this limitation to the comment for the casual reader? Yes, good point. I will update the comment to add this limitation. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1221704233 PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1221705900 From kdnilsen at openjdk.org Wed Jun 7 14:38:29 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 14:38:29 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v9] In-Reply-To: References: <5tvdtUZnFnmoUyyVFYoOZm_KtVSzjfzBI7aPXpCpgVw=.2c09ff27-4004-44f3-b86d-a88d2f43a2a8@github.com> Message-ID: On Wed, 7 Jun 2023 07:34:44 GMT, Y.
Srinivas Ramakrishna wrote: >> Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: >> >> Update copyright notices > > src/hotspot/share/gc/shenandoah/c1/shenandoahBarrierSetC1.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 2018, 2021, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Should probably be removed. removed. > src/hotspot/share/gc/shenandoah/c2/shenandoahBarrierSetC2.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 2018, 2021, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Check if this is necessary. ok. i'll remove. > src/hotspot/share/gc/shenandoah/heuristics/shenandoahCompactHeuristics.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 2018, 2019, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Can be removed? removed. > src/hotspot/share/gc/shenandoah/heuristics/shenandoahPassiveHeuristics.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 2018, 2019, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Should be removed? removed. > src/hotspot/share/gc/shenandoah/heuristics/shenandoahStaticHeuristics.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 2018, 2019, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Should be removed? removed. > src/hotspot/share/gc/shenandoah/mode/shenandoahPassiveMode.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 2019, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Should be removed? removed. > src/hotspot/share/gc/shenandoah/mode/shenandoahSATBMode.cpp line 3: > >> 1: /* >> 2: * Copyright (c) 2019, 2021, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. 
All Rights Reserved. > > Can be removed. removed. > src/hotspot/share/gc/shenandoah/shenandoahBarrierSetClone.inline.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 2013, 2021, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Unnecessary. Delete. removed. > src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.hpp line 3: > >> 1: /* >> 2: * Copyright (c) 2013, 2021, Red Hat, Inc. All rights reserved. >> 3: * Copyright Amazon.com Inc. or its affiliates. All Rights Reserved. > > Probably unnecessary. removed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221700458 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221704676 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221706236 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221707142 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221709229 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221710192 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221711143 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221712173 PR Review Comment: https://git.openjdk.org/jdk/pull/14185#discussion_r1221713033 From mdoerr at openjdk.org Wed Jun 7 14:40:25 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 7 Jun 2023 14:40:25 GMT Subject: RFR: 8309613: [Windows] hs_err files sometimes miss information about the code containing the error Message-ID: We have seen hs_err files for errors triggered by C2 compiled methods which miss the most relevant information: the C2 method (see JBS issue for more details). I have found a possibility to add it. Please take a look and provide feedback. 
------------- Commit messages: - 8309613: [Windows] hs_err files sometimes miss information about the code containing the error Changes: https://git.openjdk.org/jdk/pull/14358/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14358&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8309613 Stats: 33 lines in 10 files changed: 14 ins; 4 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/14358.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14358/head:pull/14358 PR: https://git.openjdk.org/jdk/pull/14358 From dcubed at openjdk.org Wed Jun 7 14:44:10 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Wed, 7 Jun 2023 14:44:10 GMT Subject: RFR: 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING [v6] In-Reply-To: References: Message-ID: <89TTwIzHO2gB9XwBttdaPsa09Y00QHzAGZ3YG3IG2Z8=.8d977a17-207c-4728-b6d2-f3d7c12aa13c@github.com> On Wed, 7 Jun 2023 11:48:19 GMT, Serguei Spitsyn wrote: >> When a virtual thread is mounted, the carrier thread should be reported as "waiting" until the virtual thread unmounts. Right now, GetThreadState reports a state based on the JavaThread status when it should return JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY. >> The fix adds: >> - a special case for passive carrier threads >> - necessary test coverage to the existing JVMTI test: `serviceability/jvmti/vthread/ThreadStateTest`. >> >> Testing: >> - tested with the updated test: `serviceability/jvmti/vthread/ThreadStateTest` >> - submitted mach5 tiers 1-5 >> - TBD: to submit mach5 tier 6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: one function renaming Looks like this PR has caused regression failures in Tier1. We have between 2 and 5 failures per Tier1.
See: [JDK-8309612](https://bugs.openjdk.org/browse/JDK-8309612) serviceability/jvmti/vthread/SuspendResume1/SuspendResume1.java#default fails after JDK-8307153 Because this failure is happening in Tier1, combined with the fact that we get much more JVM/TI testing in the upper Tiers, and tomorrow is the code-fork I'm proceeding with a [BACKOUT] and am testing that [BACKOUT] with an urgent Tier1 right now. See: [JDK-8309614](https://bugs.openjdk.org/browse/JDK-8309614) [BACKOUT] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING ------------- PR Comment: https://git.openjdk.org/jdk/pull/14298#issuecomment-1580968712 From stuefe at openjdk.org Wed Jun 7 14:45:06 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 7 Jun 2023 14:45:06 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v3] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 14:28:54 GMT, Ashutosh Mehra wrote: >> src/hotspot/share/cds/filemap.cpp line 2124: >> >>> 2122: >>> 2123: size_t word_size = size / HeapWordSize; >>> 2124: address requested_start = heap_region_requested_address(); >> >> Possibly for another RFE, if you intend to make this code move as verbatim as possible: >> >> This feels like something that should use `r->mapping_offset()` directly - would make the code easier to understand. > > Not sure I get this. Are you suggesting replacing `heap_region_requested_address` with `r->mapping_offset()`. But they are not the same, right? Never mind, it is not important at all. What I originally meant was: The requested address is calculated from information baked into the archive we are loading (original heap base and region offset), and we already have the archive in hand (`r`) and use it for subsequent operations, so we could calculate the requested address right here; but I was missing that heap_region_requested_address is called from several sites, so it makes sense to have this functionality in a utility function.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1221724217 From kdnilsen at openjdk.org Wed Jun 7 14:47:13 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 14:47:13 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v13] In-Reply-To: References: Message-ID: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. 
If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Remove a few more unneeded copyright notices ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/8e5c3b73..01c62516 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=11-12 Stats: 9 lines in 9 files changed: 0 ins; 9 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From kdnilsen at openjdk.org Wed Jun 7 15:02:34 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 15:02:34 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v14] In-Reply-To: References: Message-ID: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`.
The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.
Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Simplify test logic, fail if name of Shenandoah young gen pool changes (#3) ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/01c62516..240d413d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=12-13 Stats: 13 lines in 1 file changed: 0 ins; 9 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From iris at openjdk.org Wed Jun 7 15:41:55 2023 From: iris at openjdk.org (Iris Clark) Date: Wed, 7 Jun 2023 15:41:55 GMT Subject: RFR: 8309602: update JVMTI history table for jdk 21 [v3] In-Reply-To: <2cVTkVoKxRkH6B_a7B6Toc6hK5IWhnwJkBX0hnP_g-0=.e169eba0-bf85-4a9c-b227-211c5315d74a@github.com> References: <2cVTkVoKxRkH6B_a7B6Toc6hK5IWhnwJkBX0hnP_g-0=.e169eba0-bf85-4a9c-b227-211c5315d74a@github.com> Message-ID: On Wed, 7 Jun 2023 13:08:00 GMT, Serguei Spitsyn wrote: >> This is a minor update of the `jvmti.xml` file. >> The JVM TI history table needs to be updated to list: >> - Virtual threads finalized to be a permanent feature. >> - Agent start-up in the live phase now specified to print a warning. >> >> The JVM TI history table has no normative changes. This update does not need a CSR. > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: minor history table tweak Thanks! ------------- Marked as reviewed by iris (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14352#pullrequestreview-1468021106 From kdnilsen at openjdk.org Wed Jun 7 16:00:37 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 16:00:37 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v15] In-Reply-To: References: Message-ID: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. 
If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. Kelvin Nilsen has updated the pull request incrementally with 174 additional commits since the last revision: - 8309614: [BACKOUT] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING Reviewed-by: azvegint - 8308288: Fix xlc17 clang warnings and build errors in hotspot Reviewed-by: goetz, mbaesken - 8309225: Fix xlc17 clang 15 warnings in security and servicability Reviewed-by: goetz, mdoerr, clanger - 8309219: Fix xlc17 clang 15 warnings in java.base Reviewed-by: goetz, mdoerr - 8307153: JVMTI GetThreadState on carrier should return STATE_WAITING Reviewed-by: amenkov, cjplummer - 8309543: Micro-optimize x86 assembler UseCondCardMark Reviewed-by: kvn, mdoerr - 8280982: [Wayland] [XWayland] java.awt.Robot taking screenshots Reviewed-by: prr, kizune, psadhukhan - 8309550: jdk.jfr.internal.Utils::formatDataAmount method should gracefully handle amounts equal to Long.MIN_VALUE Reviewed-by: stuefe, mgronlun - 8308445: Linker should check that capture state segment is big enough Reviewed-by: mcimadamore - 8308031: Linkers should reject unpromoted variadic parameters Reviewed-by: mcimadamore - ...
and 164 more: https://git.openjdk.org/jdk/compare/240d413d...8b2edd9c ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/240d413d..8b2edd9c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=13-14 Stats: 55913 lines in 825 files changed: 45254 ins; 7582 del; 3077 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From kdnilsen at openjdk.org Wed Jun 7 16:51:38 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 16:51:38 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v16] In-Reply-To: References: Message-ID: <9vJprjFu5K2ZymOM-lzCm_IAA2pAnGBgCDAtwQ0sOvw=.47abdb57-237b-4c57-a5e1-25f11ee14a80@github.com> > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. 
Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. Kelvin Nilsen has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 305 commits: - Merge branch 'master' of https://git.openjdk.org/jdk into merge-generational-shenandoah - Simplify test logic, fail if name of Shenandoah young gen pool changes (#3) - Remove a few more unneeded copyright notices - Remove more extraneous copyright notices - JDK-8309322: [GenShen] TestAllocOutOfMemory#large failed When generational Shenandoah is used, there may be an additional alignment related heap size adjustment that the test should be cognizant of.
Such alignment might also happen in the non-generational case, but in this case the specific size used in the test was affected on machines with larger than usual os page size settings. The alignment related adjustment would have affected all generational collectors (except perhaps Gen Z). In the future, we might try and relax this alignment constraint.alignment. - Remove one more extraneous Amazon copyright - Update copyright notices - Improve efficiency of card-size alignment calculations - Exit during initialization on unsupported platforms - Remove an inappropriate copyright notice - ... and 295 more: https://git.openjdk.org/jdk/compare/33bb64f2...612072a4 ------------- Changes: https://git.openjdk.org/jdk/pull/14185/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=15 Stats: 20143 lines in 202 files changed: 18218 ins; 916 del; 1009 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From stuefe at openjdk.org Wed Jun 7 18:21:48 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 7 Jun 2023 18:21:48 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v16] In-Reply-To: <9vJprjFu5K2ZymOM-lzCm_IAA2pAnGBgCDAtwQ0sOvw=.47abdb57-237b-4c57-a5e1-25f11ee14a80@github.com> References: <9vJprjFu5K2ZymOM-lzCm_IAA2pAnGBgCDAtwQ0sOvw=.47abdb57-237b-4c57-a5e1-25f11ee14a80@github.com> Message-ID: On Wed, 7 Jun 2023 16:51:38 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. 
The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64.
> > Kelvin Nilsen has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 305 commits: > > - Merge branch 'master' of https://git.openjdk.org/jdk into merge-generational-shenandoah > - Simplify test logic, fail if name of Shenandoah young gen pool changes (#3) > > - Remove a few more unneeded copyright notices > - Remove more extraneous copyright notices > - JDK-8309322: [GenShen] TestAllocOutOfMemory#large failed > > When generational Shenandoah is used, there may be an additional > alignment related heap size adjustment that the test should be cognizant > of. Such alignment might also happen in the non-generational case, but > in this case the specific size used in the test was affected on machines > with larger than usual os page size settings. > > The alignment related adjustment would have affected all generational > collectors (except perhaps Gen Z). In the future, we might try and relax > this alignment constraint.alignment. > - Remove one more extraneous Amazon copyright > - Update copyright notices > - Improve efficiency of card-size alignment calculations > - Exit during initialization on unsupported platforms > - Remove an inappropriate copyright notice > - ... and 295 more: https://git.openjdk.org/jdk/compare/33bb64f2...612072a4 I won't be able to give reasonable input here in the short time left before RDP1. Nor am I the most qualified to do so. Just wanted to re-iterate that I see this rushed review with worry. Nothing I have not said already, and of course, it does not diminish the massive effort behind this JEP. Process-wise, when talking about Lilliput integration we touched on the idea of moving RDP1 down to a sooner date. I still think this makes sense. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1581244415 From kdnilsen at openjdk.org Wed Jun 7 18:21:43 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 18:21:43 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v17] In-Reply-To: References: Message-ID: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. 
If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: Fix budgeting assertion to allow equal or greater than ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14185/files - new: https://git.openjdk.org/jdk/pull/14185/files/612072a4..19e62fe0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14185&range=15-16 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/14185.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14185/head:pull/14185 PR: https://git.openjdk.org/jdk/pull/14185 From stuefe at openjdk.org Wed Jun 7 18:21:45 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 7 Jun 2023 18:21:45 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v12] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 14:06:31 GMT, Kelvin Nilsen wrote: > Hi Thomas, > > Thank you for your followup comments. I am in total agreement that it is a shame the challenges we have faced and the progress we have made is not better documented in the history of JBS tickets. I have been the worst offender. I apologize. Please, no need to apologize.
I understand that during early development one needs to move quickly. I just thought that your team's experience with tuning Shenandoah is valuable, and it is regrettable when it is lost. > You are correct that the change is to N, the number of times in a row that we perform degenerated GC before we automatically upgrade to Full GC. It is still possible that we will upgrade to Full GC before N is reached, because there are other situations, such as lack of progress by degenerated GC, that will cause us to upgrade to Full even before N is reached. > > The comment is still valid as written. During degenerated GC, the mutator threads are all blocked, so the ONLY kind of allocation failure that can occur during degenerated GC is a GC-worker-thread allocation for the purpose of evacuating memory. If we experience an "evacuation failure" during degenerated GC, we will upgrade to Full GC. Thank you for the thorough explanation. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1581232734 From sspitsyn at openjdk.org Wed Jun 7 18:51:55 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 18:51:55 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING Message-ID: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> This is REDO the fix of [JDK-8307153](https://bugs.openjdk.org/browse/JDK-8307153). The last update of the fix in the review cycle was incorrect and incorrectly tested, so the issue has not been noticed. It is why the fix was backed out. The issue is that the SUSPEND bit was missed in the JVMTI thread state of platform/carrier threads carrying virtual threads (see `JvmtiEnvBase::get_thread_state` function). The first push/patch is the original fix of JDK-8307153. The fix of the SUSPEND bit issue will be in the incremental update. It is to simplify the review.
Testing: - TBD: mach5 tiers 1-5 ------------- Commit messages: - 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING Changes: https://git.openjdk.org/jdk/pull/14366/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14366&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8309612 Stats: 82 lines in 4 files changed: 65 ins; 0 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/14366.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14366/head:pull/14366 PR: https://git.openjdk.org/jdk/pull/14366 From sspitsyn at openjdk.org Wed Jun 7 19:27:05 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 19:27:05 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v2] In-Reply-To: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> Message-ID: > This is REDO the fix of [JDK-8307153](https://bugs.openjdk.org/browse/JDK-8307153). > The last update of the fix in the review cycle was incorrect and incorrectly tested, so the issue has not been noticed. It is why the fix was backed out. > The issue is that the SUSPEND bit was missed in the JVMTI thread state of platform/carrier threads carrying virtual threads (see`JvmtiEnvBase::get_thread_state` function). > > The first push/patch is the original fix of JDK-8307153. > The fix of the SUSPEND bit issue will be in the incremental update. > It is to simplify the review. 
> > Testing: > - TBD: mach5 tiers 1-5 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: fixed the SUSPEND bit issue in JVMTI thread state of carrier threads ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14366/files - new: https://git.openjdk.org/jdk/pull/14366/files/00f51d34..29adb0af Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14366&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14366&range=00-01 Stats: 5 lines in 1 file changed: 0 ins; 2 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14366.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14366/head:pull/14366 PR: https://git.openjdk.org/jdk/pull/14366 From sspitsyn at openjdk.org Wed Jun 7 19:32:23 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 19:32:23 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v3] In-Reply-To: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> Message-ID: > This is REDO the fix of [JDK-8307153](https://bugs.openjdk.org/browse/JDK-8307153). > The last update of the fix in the review cycle was incorrect and incorrectly tested, so the issue has not been noticed. It is why the fix was backed out. > The issue is that the SUSPEND bit was missed in the JVMTI thread state of platform/carrier threads carrying virtual threads (see`JvmtiEnvBase::get_thread_state` function). > > The first push/patch is the original fix of JDK-8307153. > The fix of the SUSPEND bit issue will be in the incremental update. > It is to simplify the review. 
> > Testing: > - TBD: mach5 tiers 1-5 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: cleanup in comments: replace confusing term: passive carrier thread ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14366/files - new: https://git.openjdk.org/jdk/pull/14366/files/29adb0af..094b5f28 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14366&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14366&range=01-02 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14366.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14366/head:pull/14366 PR: https://git.openjdk.org/jdk/pull/14366 From sspitsyn at openjdk.org Wed Jun 7 20:05:45 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 20:05:45 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> Message-ID: > This is REDO the fix of [JDK-8307153](https://bugs.openjdk.org/browse/JDK-8307153). > The last update of the fix in the review cycle was incorrect and incorrectly tested, so the issue has not been noticed. It is why the fix was backed out. > The issue is that the SUSPEND bit was missed in the JVMTI thread state of platform/carrier threads carrying virtual threads (see`JvmtiEnvBase::get_thread_state` function). > > The first push/patch is the original fix of JDK-8307153. > The fix of the SUSPEND bit issue will be in the incremental update. > It is to simplify the review. 
> > Testing: > - TBD: mach5 tiers 1-5 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: fix trailing space in jvmtiEnvBase.cpp ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14366/files - new: https://git.openjdk.org/jdk/pull/14366/files/094b5f28..4defcf2e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14366&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14366&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14366.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14366/head:pull/14366 PR: https://git.openjdk.org/jdk/pull/14366 From cjplummer at openjdk.org Wed Jun 7 20:15:54 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 7 Jun 2023 20:15:54 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> Message-ID: On Wed, 7 Jun 2023 20:05:45 GMT, Serguei Spitsyn wrote: >> This is REDO the fix of [JDK-8307153](https://bugs.openjdk.org/browse/JDK-8307153). >> The last update of the fix in the review cycle was incorrect and incorrectly tested, so the issue has not been noticed. It is why the fix was backed out. >> The issue is that the SUSPEND bit was missed in the JVMTI thread state of platform/carrier threads carrying virtual threads (see`JvmtiEnvBase::get_thread_state` function). >> >> The first push/patch is the original fix of JDK-8307153. >> The fix of the SUSPEND bit issue will be in the incremental update. >> It is to simplify the review. >> >> Testing: >> - TBD: mach5 tiers 1-5 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > fix trailing space in jvmtiEnvBase.cpp Changes requested by cjplummer (Reviewer). 
src/hotspot/share/prims/jvmtiEnvBase.cpp line 765: > 763: if (is_thread_carrying_vthread(jt, thread_oop)) { > 764: state &= ~JVMTI_THREAD_STATE_RUNNABLE; > 765: state |= JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY; How about a comment here: "Clear RUNNABLE state and add WAITING state because..." src/hotspot/share/prims/jvmtiEnvBase.cpp line 1739: > 1737: "sanity check"); > 1738: > 1739: // An attempt to handshake-suspend a thread carrying virtual thread will result in Suggestion: // An attempt to handshake-suspend a thread carrying a virtual thread will result in src/hotspot/share/prims/jvmtiEnvBase.hpp line 99: > 97: static bool is_in_thread_list(jint count, const jthread* list, oop jt_oop); > 98: > 99: // check if thread_oop represents a thread carrying virtual thread Suggestion: // check if thread_oop represents a thread carrying a virtual thread src/hotspot/share/prims/jvmtiEnvBase.hpp line 183: > 181: > 182: // Return true if the thread identified with a pair is current. > 183: // A thread carrying virtual thread is not treated as current. Suggestion: // A thread carrying a virtual thread is not treated as current. 
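For readers following the review comment on `jvmtiEnvBase.cpp` line 765, the three quoted lines can be read as a plain bit-mask adjustment. The following is only a minimal, self-contained sketch of that adjustment, not the HotSpot code: `carrier_state` and `example` are hypothetical names, the `STATE_*` constants are local copies of the corresponding `JVMTI_THREAD_STATE_*` values from the JVMTI specification (redefined so the sketch compiles without `jvmti.h`), and the boolean parameter stands in for the result of `is_thread_carrying_vthread(jt, thread_oop)`. Leaving every other bit untouched is an assumption of the sketch, not a statement about what the final patch does.

```cpp
#include <cassert>
#include <cstdint>

// Local copies of JVMTI_THREAD_STATE_* bit values (assumption: redefined here
// instead of including jvmti.h, so the sketch is standalone).
constexpr std::uint32_t STATE_ALIVE                = 0x0001;
constexpr std::uint32_t STATE_RUNNABLE             = 0x0004;
constexpr std::uint32_t STATE_WAITING_INDEFINITELY = 0x0010;
constexpr std::uint32_t STATE_WAITING              = 0x0080;
constexpr std::uint32_t STATE_SUSPENDED            = 0x100000;

// Hypothetical helper, not a HotSpot function: mirrors the quoted lines by
// clearing RUNNABLE and setting WAITING | WAITING_INDEFINITELY when the
// thread is carrying a virtual thread. All other bits pass through unchanged.
constexpr std::uint32_t carrier_state(std::uint32_t base_state,
                                      bool carrying_vthread) {
  if (!carrying_vthread) {
    return base_state;  // nothing to adjust for an ordinary platform thread
  }
  std::uint32_t state = base_state;
  state &= ~STATE_RUNNABLE;                             // clear RUNNABLE
  state |= STATE_WAITING | STATE_WAITING_INDEFINITELY;  // report as waiting
  return state;
}

// Worked example: an alive, runnable, suspended carrier is reported as
// waiting, and the SUSPENDED bit from the base state is still present.
inline void example() {
  const std::uint32_t base = STATE_ALIVE | STATE_RUNNABLE | STATE_SUSPENDED;
  const std::uint32_t adjusted = carrier_state(base, /*carrying_vthread=*/true);
  assert((adjusted & STATE_RUNNABLE) == 0);
  assert((adjusted & STATE_WAITING) != 0);
  assert((adjusted & STATE_SUSPENDED) != 0);
}
```

Keeping the adjustment in a small pure function like this makes it easy to check each input state by hand; which bits of the base state should survive for a carrier thread is a design question the sketch does not answer.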
------------- PR Review: https://git.openjdk.org/jdk/pull/14366#pullrequestreview-1468479443 PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222104282 PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222104787 PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222105165 PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222105551 From cslucas at openjdk.org Wed Jun 7 20:22:01 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Wed, 7 Jun 2023 20:22:01 GMT Subject: RFR: JDK-8287061: Support for rematerializing scalar replaced objects participating in allocation merges [v13] In-Reply-To: References: <7nqFW-lgT1FzuMHPMUQiCj1ATcV_bQtroolf4V_kCc4=.ccd12605-aad0-433e-ba44-5772d972f05d@github.com> Message-ID: On Tue, 23 May 2023 17:19:23 GMT, Vladimir Ivanov wrote: >>> I verified that the new test cases do trigger SR+NSR scenario. >>> >>> How do you test that deoptimization works as expected? >>> >> >> I have a copy of the tests in AllocationMergesTests.java in a separate file (not included in this PR) and I run the tests with a tool that compares the output of the test with RAM enabled and disabled. So, the way I test that deoptimization worked is basically just making sure the tests that "deoptimize" have the same output with RAM enabled and disabled. >> >>> Diagnostic output is still hard to read. On one hand, it's too verbose when it comes to PcDesc/ScopeDesc sections ("pc-bytecode offsets" and "scopes") in nmethod output (enabled either w/ `-XX:+PrintAssembly` or `-XX:CompileCommand=print,...`). On the other hand, it lacks some important details, like `selector` and `merge_ptr` location information which is essential to make sense of debug information at a safepoint in the code. >>> >> >> I'll take care of that. I was testing only with PrintDebugInfo. >> >>> FTR `_skip_rematerialization` flag is unused now. >>> >> >> yeah, I forgot to remove that. Thanks. 
>> >>> Speaking of `_only_merge_candidate` flag, I find it easier about the code when the property being tracked is whether the `ObjectValue` is referenced from corresponding JVM state or not. (Maybe call it `is_root()`?) So, `ScopeDesc::objects_to_rematerialize()` would skip everything not referenced from JVM state, but then unconditionally accept anything returned by `ObjectMergeValue::select()` which doesn't need to adjust the flag before returning selected object. Also, it's safer to track the flag status for every `ObjectValues`, even for `ObjectMergeValue`. >>> >> >> Sounds like a good idea. I'll do that. Thanks. >> >>> Are you sure there's no way to end up with nested `ObjectMergeValue`s in presence of iterative EA? >> >> I don't think so. This current patch only handle Phis that don't have NULL as input. As part of the reduction process we set at least one of the reducible Phi inputs to NULL. Therefore, subsequent iterations of EA won't reduce the same Phi. > >> So, the way I test that deoptimization worked is basically just making sure the tests that "deoptimize" have the same output with RAM enabled and disabled. > > Please, enhance `AllocationMergesTests` to cover deoptimization (e.g., using WhiteBox API or additional run w/ -XX:+DeoptimizeALot) and ensure that tests are sensitive enough to fail when wrong state is rematerialized. @iwanowww - I pushed some changes to address your feedback. Please let me know if you have any more comments. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/12897#issuecomment-1581452155 From kdnilsen at openjdk.org Wed Jun 7 21:07:39 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 21:07:39 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v17] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 18:21:43 GMT, Kelvin Nilsen wrote: >> OpenJDK Colleagues: >> >> Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. >> >> Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`. The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: >> >> 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. >> 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. >> 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. >> 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. 
If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. >> >> We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. >> >> **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. > > Kelvin Nilsen has updated the pull request incrementally with one additional commit since the last revision: > > Fix budgeting assertion to allow equal or greater than We would like to thank everyone who has taken time to review and provide feedback on our pull request. Given the risks identified during the review process and the lack of time available to perform the thorough review that such a large contribution of code requires, we have decided to close this PR at the current time. We will seek to target JDK 22. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1581509386 From kdnilsen at openjdk.org Wed Jun 7 21:07:41 2023 From: kdnilsen at openjdk.org (Kelvin Nilsen) Date: Wed, 7 Jun 2023 21:07:41 GMT Subject: Withdrawn: JDK-8307314: Implementation: Generational Shenandoah (Experimental) In-Reply-To: References: Message-ID: On Fri, 26 May 2023 20:46:29 GMT, Kelvin Nilsen wrote: > OpenJDK Colleagues: > > Please review this proposed integration of Generational mode for Shenandoah GC under https://bugs.openjdk.org/browse/JDK-8307314. > > Generational mode of Shenandoah is enabled by adding `-XX:+UnlockExperimentalVMOptions -XX:ShenandoahGCMode=generational` to a command line that already specifies ` -XX:+UseShenandoahGC`.
The implementation automatically adjusts the sizes of old generation and young generation to efficiently utilize the entire heap capacity. Generational mode of Shenandoah resembles G1 in the following regards: > > 1. Old-generation marking runs concurrently during the time that multiple young generation collections run to completion. > 2. After old-generation marking completes, we perform a sequence of mixed collections. Each mixed collection combines collection of young generation with evacuation of a portion of the old-generation regions identified for collection based on old-generation marking information. > 3. Unlike G1, young-generation collections and evacuations are entirely concurrent, as with single-generation Shenandoah. > 4. As with single-generation Shenandoah, there is no explicit notion of eden and survivor space within the young generation. In practice, regions that were most recently allocated tend to have large amounts of garbage and these regions tend to be collected with very little effort. Young-generation objects that survive garbage collection tend to accumulate in regions that hold survivor objects. These regions tend to have smaller amounts of garbage, and are less likely to be collected. If they survive a sufficient number of young-generation collections, the "survivor" regions are promoted into the old generation. > > We expect to refine heuristics as we gain experience with more production workloads. In the future, we plan to remove the "experimental" qualifier from generational mode, at which time we expect that generational mode will become the default mode for Shenandoah. > > **Testing**: We continuously run jtreg tiers 1-4 + hotspot_gc_shenandoah, gcstress, jck compiler, jck runtime, Dacapo, SpecJBB, SpecVM, Extremem, HyperAlloc, and multiple AWS production workload simulators. We test on Linux x64 and aarch64, Alpine x64 and aarch64, macOS x64 and aarch64, and Windows x64. This pull request has been closed without being integrated.
------------- PR: https://git.openjdk.org/jdk/pull/14185 From sspitsyn at openjdk.org Wed Jun 7 21:22:49 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 21:22:49 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> Message-ID: On Wed, 7 Jun 2023 20:10:24 GMT, Chris Plummer wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> fix trailing space in jvmtiEnvBase.cpp > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 765: > >> 763: if (is_thread_carrying_vthread(jt, thread_oop)) { >> 764: state &= ~JVMTI_THREAD_STATE_RUNNABLE; >> 765: state |= JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY; > > How about a comment here: > > "Clear RUNNABLE state and add WAITING state because..." Thanks. Added comment. > src/hotspot/share/prims/jvmtiEnvBase.cpp line 1739: > >> 1737: "sanity check"); >> 1738: >> 1739: // An attempt to handshake-suspend a thread carrying virtual thread will result in > > Suggestion: > > // An attempt to handshake-suspend a thread carrying a virtual thread will result in Thanks. Updated now. > src/hotspot/share/prims/jvmtiEnvBase.hpp line 99: > >> 97: static bool is_in_thread_list(jint count, const jthread* list, oop jt_oop); >> 98: >> 99: // check if thread_oop represents a thread carrying virtual thread > > Suggestion: > > // check if thread_oop represents a thread carrying a virtual thread Thanks. Updated now. > src/hotspot/share/prims/jvmtiEnvBase.hpp line 183: > >> 181: >> 182: // Return true if the thread identified with a pair is current. >> 183: // A thread carrying virtual thread is not treated as current. > > Suggestion: > > // A thread carrying a virtual thread is not treated as current. Thanks. Updated now. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222185513 PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222185817 PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222185985 PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222186135 From amenkov at openjdk.org Wed Jun 7 21:22:44 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 7 Jun 2023 21:22:44 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> Message-ID: <8pd-OWsWDM6RK-_X479uitxD-NNERoHXggUi4F7iemM=.229a79e8-5140-456b-80ba-2916e7dd95d9@github.com> On Wed, 7 Jun 2023 20:05:45 GMT, Serguei Spitsyn wrote: >> This is REDO the fix of [JDK-8307153](https://bugs.openjdk.org/browse/JDK-8307153). >> The last update of the fix in the review cycle was incorrect and incorrectly tested, so the issue has not been noticed. It is why the fix was backed out. >> The issue is that the SUSPEND bit was missed in the JVMTI thread state of platform/carrier threads carrying virtual threads (see`JvmtiEnvBase::get_thread_state` function). >> >> The first push/patch is the original fix of JDK-8307153. >> The fix of the SUSPEND bit issue will be in the incremental update. >> It is to simplify the review. >> >> Testing: >> - TBD: mach5 tiers 1-5 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > fix trailing space in jvmtiEnvBase.cpp src/hotspot/share/prims/jvmtiEnvBase.cpp line 765: > 763: if (is_thread_carrying_vthread(jt, thread_oop)) { > 764: state &= ~JVMTI_THREAD_STATE_RUNNABLE; > 765: state |= JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY; This does not look correct. 
GetThreadState spec provides hierarchical set of questions to interpret thread state value. JVMTI_THREAD_STATE_ALIVE | JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY is only one branch and I'd expect all other bits are not set for this state. Need to decide what do we want to report as carrier thread state for all possible values returned by get_thread_state_base(). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222182733 From sspitsyn at openjdk.org Wed Jun 7 21:22:41 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 21:22:41 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v5] In-Reply-To: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> Message-ID: > This is REDO the fix of [JDK-8307153](https://bugs.openjdk.org/browse/JDK-8307153). > The last update of the fix in the review cycle was incorrect and incorrectly tested, so the issue has not been noticed. It is why the fix was backed out. > The issue is that the SUSPEND bit was missed in the JVMTI thread state of platform/carrier threads carrying virtual threads (see`JvmtiEnvBase::get_thread_state` function). > > The first push/patch is the original fix of JDK-8307153. > The fix of the SUSPEND bit issue will be in the incremental update. > It is to simplify the review. 
> > Testing: > - TBD: mach5 tiers 1-5 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: added/adjusted some comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14366/files - new: https://git.openjdk.org/jdk/pull/14366/files/4defcf2e..8f26e277 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14366&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14366&range=03-04 Stats: 7 lines in 2 files changed: 4 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/14366.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14366/head:pull/14366 PR: https://git.openjdk.org/jdk/pull/14366 From mchung at openjdk.org Wed Jun 7 21:37:04 2023 From: mchung at openjdk.org (Mandy Chung) Date: Wed, 7 Jun 2023 21:37:04 GMT Subject: RFR: 8305104: Remove old core reflection implementation Message-ID: JEP 416 integrated in JDK 18 and since then, only a couple minor issues has been reported. Those issues were related with exception being thrown with invalid arguments. We propose to remove the old core reflection implementation in JDK 22. The `-Djdk.reflect.useDirectMethodHandle=false` workaround to revert to the old implementation will stop to work. 
------------- Commit messages: - clean up - Merge branch 'master' of https://github.com/openjdk/jdk into remove-old-reflection - Merge branch 'master' of https://github.com/openjdk/jdk into remove-old-reflection - Merge branch 'master' of https://github.com/openjdk/jdk into remove-old-reflection - merge - Merge branch 'master' of https://github.com/openjdk/jdk into remove-old-reflection - 8305104: Remove the old core reflection implementation Changes: https://git.openjdk.org/jdk/pull/14371/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14371&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8305104 Stats: 6319 lines in 78 files changed: 13 ins; 6238 del; 68 mod Patch: https://git.openjdk.org/jdk/pull/14371.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14371/head:pull/14371 PR: https://git.openjdk.org/jdk/pull/14371 From sspitsyn at openjdk.org Wed Jun 7 21:57:49 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 21:57:49 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: <8pd-OWsWDM6RK-_X479uitxD-NNERoHXggUi4F7iemM=.229a79e8-5140-456b-80ba-2916e7dd95d9@github.com> References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> <8pd-OWsWDM6RK-_X479uitxD-NNERoHXggUi4F7iemM=.229a79e8-5140-456b-80ba-2916e7dd95d9@github.com> Message-ID: On Wed, 7 Jun 2023 21:12:24 GMT, Alex Menkov wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> fix trailing space in jvmtiEnvBase.cpp > > src/hotspot/share/prims/jvmtiEnvBase.cpp line 765: > >> 763: if (is_thread_carrying_vthread(jt, thread_oop)) { >> 764: state &= ~JVMTI_THREAD_STATE_RUNNABLE; >> 765: state |= JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY; > > This does not look correct. 
> GetThreadState spec provides hierarchical set of questions to interpret thread state value. > JVMTI_THREAD_STATE_ALIVE | JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY is only one branch and I'd expect all other bits are not set for this state. > Need to decide what do we want to report as carrier thread state for all possible values returned by get_thread_state_base(). Good concern. There are two bits (and the related RUNNABLE bit) that we care in this sub-tree of state bits: `SUSPENDED` and `INTERRUPTED`. This update clones these two bits. The RUNNABLE bit must be cleared. A thread carrying a virtual thread can not be in native, blocked, parked, sleeping or waiting on some object. The state returned by the `get_thread_state_base` is based on the call: ` state = (jint)java_lang_Thread::get_thread_status(thread_oop);` and addition of the derived from JavaThread bits: `SUSPENDED`, `INTERRUPTED` and `IN_NATIVE`. The three bit derived from the JavaThread are not relevant. This call has to be made directly: ` state = (jint)java_lang_Thread::get_thread_status(thread_oop);` The SUSPEND bit has to be based on the call: ` jt->is_carrier_thread_suspended();` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222212064 From mchung at openjdk.org Wed Jun 7 22:02:57 2023 From: mchung at openjdk.org (Mandy Chung) Date: Wed, 7 Jun 2023 22:02:57 GMT Subject: RFR: 8305104: Remove the old core reflection implementation [v2] In-Reply-To: References: Message-ID: > JEP 416 integrated in JDK 18 and since then, only a couple minor issues has been reported. Those issues were related with exception being thrown with invalid arguments. We propose to remove the old core reflection implementation in JDK 22. The `-Djdk.reflect.useDirectMethodHandle=false` workaround to revert to the old implementation will stop to work. 
Mandy Chung has updated the pull request incrementally with one additional commit since the last revision: fix merge issue ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14371/files - new: https://git.openjdk.org/jdk/pull/14371/files/73340aa9..d161a384 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14371&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14371&range=00-01 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/14371.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14371/head:pull/14371 PR: https://git.openjdk.org/jdk/pull/14371 From cjplummer at openjdk.org Wed Jun 7 22:23:49 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 7 Jun 2023 22:23:49 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v5] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> Message-ID: On Wed, 7 Jun 2023 21:22:41 GMT, Serguei Spitsyn wrote: >> This is REDO the fix of [JDK-8307153](https://bugs.openjdk.org/browse/JDK-8307153). >> The last update of the fix in the review cycle was incorrect and incorrectly tested, so the issue has not been noticed. It is why the fix was backed out. >> The issue is that the SUSPEND bit was missed in the JVMTI thread state of platform/carrier threads carrying virtual threads (see`JvmtiEnvBase::get_thread_state` function). >> >> The first push/patch is the original fix of JDK-8307153. >> The fix of the SUSPEND bit issue will be in the incremental update. >> It is to simplify the review. >> >> Testing: >> - TBD: mach5 tiers 1-5 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: added/adjusted some comments Marked as reviewed by cjplummer (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/14366#pullrequestreview-1468678777 From amenkov at openjdk.org Wed Jun 7 22:45:49 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 7 Jun 2023 22:45:49 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> <8pd-OWsWDM6RK-_X479uitxD-NNERoHXggUi4F7iemM=.229a79e8-5140-456b-80ba-2916e7dd95d9@github.com> Message-ID: On Wed, 7 Jun 2023 21:52:33 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/prims/jvmtiEnvBase.cpp line 765: >> >>> 763: if (is_thread_carrying_vthread(jt, thread_oop)) { >>> 764: state &= ~JVMTI_THREAD_STATE_RUNNABLE; >>> 765: state |= JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY; >> >> This does not look correct. >> GetThreadState spec provides hierarchical set of questions to interpret thread state value. >> JVMTI_THREAD_STATE_ALIVE | JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY is only one branch and I'd expect all other bits are not set for this state. >> Need to decide what do we want to report as carrier thread state for all possible values returned by get_thread_state_base(). > > Good concern. > There are two bits (and the related RUNNABLE bit) that we care in this sub-tree of state bits: `SUSPENDED` and `INTERRUPTED`. This update clones these two bits. The RUNNABLE bit must be cleared. > A thread carrying a virtual thread can not be in native, blocked, parked, sleeping or waiting on some object. > The state returned by the `get_thread_state_base` is based on the call: > ` state = (jint)java_lang_Thread::get_thread_status(thread_oop);` > and addition of the derived from JavaThread bits: `SUSPENDED`, `INTERRUPTED` and `IN_NATIVE`. > The three bits derived from the JavaThread are not relevant. 
> This call has to be made directly: > ` state = (jint)java_lang_Thread::get_thread_status(thread_oop);` > The SUSPEND bit has to be based on the call: > ` jt->is_carrier_thread_suspended();` > > The function `get_thread_state` will look as below: > > if (is_thread_carrying_vthread(jt, thread_oop)) { > jint state = (jint)java_lang_Thread::get_thread_status(thread_oop); > if (jt->is_carrier_thread_suspended()) { > state |= JVMTI_THREAD_STATE_SUSPENDED; > } > // It's okay for the JVMTI state to be reported as WAITING when waiting > // for something other than an Object.wait. So, we treat a thread carrying > // a virtual thread as waiting indefinitely which is not runnable. > // It is why the RUNNABLE bit is cleared and the WAITING bits are added. > state &= ~JVMTI_THREAD_STATE_RUNNABLE; > state |= JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY; > return state; > } else { > return get_thread_state_base(thread_oop, jt); > } Do you need to check jt->is_interrupted(false) and set INTERRUPTED bit? It looks like java_lang_Thread::get_thread_status(thread_oop) can only return RUNNABLE in the case and we clear it, so the call is not needed: if (is_thread_carrying_vthread(jt, thread_oop)) { jint state = JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY; if (jt->is_carrier_thread_suspended()) { state |= JVMTI_THREAD_STATE_SUSPENDED; } if (jt->is_interrupted(false)) { state |= JVMTI_THREAD_STATE_INTERRUPTED; } return state; } else ... 
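
The variant proposed just above can be tried outside HotSpot as a minimal sketch. The bit values are the ones published in `jvmti.h`; the `CarrierFlags` struct and the `carrier_state` function are hypothetical stand-ins for the two `JavaThread` queries and are not part of the PR. Note that, like the snippet it mirrors, this sketch does not set the `ALIVE` bit.

```cpp
#include <cstdint>

// Thread state bits, with the values published in jvmti.h.
constexpr std::int32_t JVMTI_THREAD_STATE_WAITING_INDEFINITELY = 0x0010;
constexpr std::int32_t JVMTI_THREAD_STATE_WAITING              = 0x0080;
constexpr std::int32_t JVMTI_THREAD_STATE_SUSPENDED            = 0x100000;
constexpr std::int32_t JVMTI_THREAD_STATE_INTERRUPTED          = 0x200000;

// Hypothetical stand-in for the JavaThread queries used in the review.
struct CarrierFlags {
  bool suspended;    // models jt->is_carrier_thread_suspended()
  bool interrupted;  // models jt->is_interrupted(false)
};

// Computes the state reported for a carrier thread that has a virtual
// thread mounted: always WAITING | WAITING_INDEFINITELY, plus the two
// orthogonal bits when the corresponding query returns true.
std::int32_t carrier_state(const CarrierFlags& f) {
  std::int32_t state =
      JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY;
  if (f.suspended) {
    state |= JVMTI_THREAD_STATE_SUSPENDED;
  }
  if (f.interrupted) {
    state |= JVMTI_THREAD_STATE_INTERRUPTED;
  }
  return state;
}
```

Starting from a fixed value rather than from `get_thread_status()` means no unexpected bit can leak into the result, which is the point of the suggestion; the cost is that any bit the carrier legitimately needs has to be added explicitly.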
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222252628 From amenkov at openjdk.org Wed Jun 7 22:51:49 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 7 Jun 2023 22:51:49 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> <8pd-OWsWDM6RK-_X479uitxD-NNERoHXggUi4F7iemM=.229a79e8-5140-456b-80ba-2916e7dd95d9@github.com> Message-ID: <_52Zqcx-rftOA6EzeCRNEcUPBNI-fK3nvYQJ4TItudM=.00584b6b-f4a8-486a-b0c4-40d77bca5f2b@github.com> On Wed, 7 Jun 2023 22:42:49 GMT, Alex Menkov wrote: >> Good concern. >> There are two bits (and the related RUNNABLE bit) that we care in this sub-tree of state bits: `SUSPENDED` and `INTERRUPTED`. This update clones these two bits. The RUNNABLE bit must be cleared. >> A thread carrying a virtual thread can not be in native, blocked, parked, sleeping or waiting on some object. >> The state returned by the `get_thread_state_base` is based on the call: >> ` state = (jint)java_lang_Thread::get_thread_status(thread_oop);` >> and addition of the derived from JavaThread bits: `SUSPENDED`, `INTERRUPTED` and `IN_NATIVE`. >> The three bits derived from the JavaThread are not relevant. >> This call has to be made directly: >> ` state = (jint)java_lang_Thread::get_thread_status(thread_oop);` >> The SUSPEND bit has to be based on the call: >> ` jt->is_carrier_thread_suspended();` >> >> The function `get_thread_state` will look as below: >> >> if (is_thread_carrying_vthread(jt, thread_oop)) { >> jint state = (jint)java_lang_Thread::get_thread_status(thread_oop); >> if (jt->is_carrier_thread_suspended()) { >> state |= JVMTI_THREAD_STATE_SUSPENDED; >> } >> // It's okay for the JVMTI state to be reported as WAITING when waiting >> // for something other than an Object.wait. 
So, we treat a thread carrying >> // a virtual thread as waiting indefinitely which is not runnable. >> // It is why the RUNNABLE bit is cleared and the WAITING bits are added. >> state &= ~JVMTI_THREAD_STATE_RUNNABLE; >> state |= JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY; >> return state; >> } else { >> return get_thread_state_base(thread_oop, jt); >> } > > Do you need to check jt->is_interrupted(false) and set INTERRUPTED bit? > It looks like java_lang_Thread::get_thread_status(thread_oop) can only return RUNNABLE in the case and we clear it, so the call is not needed: > > if (is_thread_carrying_vthread(jt, thread_oop)) { > jint state = JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY; > if (jt->is_carrier_thread_suspended()) { > state |= JVMTI_THREAD_STATE_SUSPENDED; > } > if (jt->is_interrupted(false)) { > state |= JVMTI_THREAD_STATE_INTERRUPTED; > } > return state; > } else ... > A thread carrying a virtual thread can not be in native, blocked, parked, sleeping or waiting on some object. Actually it can be in native. And if I remember correctly synchronized block pins virtual thread, so inside synchronized we can get other states ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222255498 From dnsimon at openjdk.org Wed Jun 7 22:56:23 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 7 Jun 2023 22:56:23 GMT Subject: RFR: 8309390: [JVMCI] improve copying system properties into libgraal [v2] In-Reply-To: <9bsjzlbHK31VVyGwzyhpSBjSILWFxmAX0IfiWK6Wb_w=.197d2b45-dba5-43bc-ac4e-4f993d3e777a@github.com> References: <9bsjzlbHK31VVyGwzyhpSBjSILWFxmAX0IfiWK6Wb_w=.197d2b45-dba5-43bc-ac4e-4f993d3e777a@github.com> Message-ID: > This PR improves the startup time for libgraal by speeding up how `VM.savedProps` is copied into libgraal. This data structure is now serialized to a native buffer directly from C++ and the native buffer is then directly decoded by libgraal. 
> > ## Times > > The basic benchmarking below shows that this change brings the time for a nop Java app with eager libgraal initialization (2) down to almost the same time as lazy libgraal initialization (1). The latter typically means no libgraal initialization happens as a top tier JIT compilation is never scheduled in such a short running app. > > > public class Nop { > public static void main(String[] args) {} > } > > > (1) Baseline (no options): > >> for i in (seq 10); java Nop; end > 0.05 real 0.04 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.03 real 0.03 user 0.00 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.00 sys > 0.03 real 0.03 user 0.00 sys > 0.04 real 0.03 user 0.01 sys > 0.03 real 0.03 user 0.00 sys > > > (2) Eagerly initialize libgraal (with PR): > >> for i in (seq 10); /usr/bin/time java -XX:+EagerJVMCI Nop; end > 0.06 real 0.04 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > > > (3) Eagerly initialize libgraal (without PR): > >> for i in (seq 10); /usr/bin/time java -XX:+EagerJVMCI Nop; end > 0.11 real 0.08 user 0.02 sys > 0.08 real 0.06 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.10 real 0.07 user 0.01 sys > 0.08 real 0.06 user 0.01 sys > 0.10 real 0.07 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.08 real 0.06 user 0.01 sys > 0.08 real ... Doug Simon has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains two commits: - [skip ci] snapshot Arguments::system_properties() in JVMCI instead of System.savedProps - more efficient copying of system properties into libjvmci ------------- Changes: https://git.openjdk.org/jdk/pull/14291/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14291&range=01 Stats: 277 lines in 17 files changed: 103 ins; 89 del; 85 mod Patch: https://git.openjdk.org/jdk/pull/14291.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14291/head:pull/14291 PR: https://git.openjdk.org/jdk/pull/14291 From dnsimon at openjdk.org Wed Jun 7 22:56:23 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 7 Jun 2023 22:56:23 GMT Subject: RFR: 8309390: [JVMCI] improve copying system properties into libgraal In-Reply-To: <9bsjzlbHK31VVyGwzyhpSBjSILWFxmAX0IfiWK6Wb_w=.197d2b45-dba5-43bc-ac4e-4f993d3e777a@github.com> References: <9bsjzlbHK31VVyGwzyhpSBjSILWFxmAX0IfiWK6Wb_w=.197d2b45-dba5-43bc-ac4e-4f993d3e777a@github.com> Message-ID: On Fri, 2 Jun 2023 20:32:14 GMT, Doug Simon wrote: > This PR improves the startup time for libgraal by speeding up how `VM.savedProps` is copied into libgraal. This data structure is now serialized to a native buffer directly from C++ and the native buffer is then directly decoded by libgraal. > > ## Times > > The basic benchmarking below shows that this change brings the time for a nop Java app with eager libgraal initialization (2) down to almost the same time as lazy libgraal initialization (1). The latter typically means no libgraal initialization happens as a top tier JIT compilation is never scheduled in such a short running app. 
> > > public class Nop { > public static void main(String[] args) {} > } > > > (1) Baseline (no options): > >> for i in (seq 10); java Nop; end > 0.05 real 0.04 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.01 sys > 0.03 real 0.03 user 0.00 sys > 0.04 real 0.03 user 0.01 sys > 0.04 real 0.03 user 0.00 sys > 0.03 real 0.03 user 0.00 sys > 0.04 real 0.03 user 0.01 sys > 0.03 real 0.03 user 0.00 sys > > > (2) Eagerly initialize libgraal (with PR): > >> for i in (seq 10); /usr/bin/time java -XX:+EagerJVMCI Nop; end > 0.06 real 0.04 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > 0.05 real 0.03 user 0.01 sys > > > (3) Eagerly initialize libgraal (without PR): > >> for i in (seq 10); /usr/bin/time java -XX:+EagerJVMCI Nop; end > 0.11 real 0.08 user 0.02 sys > 0.08 real 0.06 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.10 real 0.07 user 0.01 sys > 0.08 real 0.06 user 0.01 sys > 0.10 real 0.07 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.08 real 0.07 user 0.01 sys > 0.08 real 0.06 user 0.01 sys > 0.08 real ... Ok, I've pushed a change with your suggestion Tom and it seems to work. It assumes `Arguments::system_properties()` is fully initialized and effectively read-only by the time `JVMCIEnv::init_saved_properties` is called. 
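
The idea of serializing the properties into one flat buffer on the C++ side and decoding it on the other side can be illustrated with a small sketch. The actual encoding used by the PR is not shown in this thread, so the layout below (a big-endian entry count followed by length-prefixed key and value bytes) is purely an assumption, and `serialize_properties` / `deserialize_properties` are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using Properties = std::vector<std::pair<std::string, std::string>>;
using Buffer = std::vector<std::uint8_t>;

// Appends a 32-bit value in big-endian byte order.
static void put_u32(Buffer& buf, std::uint32_t v) {
  for (int shift = 24; shift >= 0; shift -= 8) {
    buf.push_back(static_cast<std::uint8_t>(v >> shift));
  }
}

// Appends a length-prefixed string.
static void put_string(Buffer& buf, const std::string& s) {
  put_u32(buf, static_cast<std::uint32_t>(s.size()));
  buf.insert(buf.end(), s.begin(), s.end());
}

// Assumed layout: u32 count, then (u32 len, bytes) for each key and value.
Buffer serialize_properties(const Properties& props) {
  Buffer buf;
  put_u32(buf, static_cast<std::uint32_t>(props.size()));
  for (const auto& kv : props) {
    put_string(buf, kv.first);
    put_string(buf, kv.second);
  }
  return buf;
}

// Reads a big-endian 32-bit value, advancing pos.
static std::uint32_t get_u32(const Buffer& buf, std::size_t& pos) {
  if (buf.size() - pos < 4) throw std::out_of_range("truncated u32");
  std::uint32_t v = 0;
  for (int i = 0; i < 4; i++) {
    v = (v << 8) | buf[pos++];
  }
  return v;
}

// Reads a length-prefixed string, advancing pos.
static std::string get_string(const Buffer& buf, std::size_t& pos) {
  std::uint32_t len = get_u32(buf, pos);
  if (buf.size() - pos < len) throw std::out_of_range("truncated string");
  std::string s(buf.begin() + pos, buf.begin() + pos + len);
  pos += len;
  return s;
}

// Decodes a buffer produced by serialize_properties.
Properties deserialize_properties(const Buffer& buf) {
  std::size_t pos = 0;
  std::uint32_t count = get_u32(buf, pos);
  Properties props;
  for (std::uint32_t i = 0; i < count; i++) {
    std::string key = get_string(buf, pos);
    std::string value = get_string(buf, pos);
    props.emplace_back(std::move(key), std::move(value));
  }
  return props;
}
```

A single contiguous buffer needs only one copy across the boundary, which is presumably where the startup saving comes from; the read-only assumption stated above matters because the snapshot is taken once and never refreshed.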
------------- PR Comment: https://git.openjdk.org/jdk/pull/14291#issuecomment-1581611494 From dnsimon at openjdk.org Wed Jun 7 22:56:23 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Wed, 7 Jun 2023 22:56:23 GMT Subject: RFR: 8309390: [JVMCI] improve copying system properties into libgraal [v2] In-Reply-To: References: <9bsjzlbHK31VVyGwzyhpSBjSILWFxmAX0IfiWK6Wb_w=.197d2b45-dba5-43bc-ac4e-4f993d3e777a@github.com> Message-ID: <3jeOthuzzSjWn2lHwJmkJGK3gzuK7JutCyQDoo4xrKQ=.577bc443-26fc-4da1-9143-1a00bc4c5d91@github.com> On Wed, 7 Jun 2023 22:51:48 GMT, Doug Simon wrote: >> This PR improves the startup time for libgraal by speeding up how `VM.savedProps` is copied into libgraal. This data structure is now serialized to a native buffer directly from C++ and the native buffer is then directly decoded by libgraal. >> >> ## Times >> >> The basic benchmarking below shows that this change brings the time for a nop Java app with eager libgraal initialization (2) down to almost the same time as lazy libgraal initialization (1). The latter typically means no libgraal initialization happens as a top tier JIT compilation is never scheduled in such a short running app. 
>> >> >> public class Nop { >> public static void main(String[] args) {} >> } >> >> >> (1) Baseline (no options): >> >>> for i in (seq 10); java Nop; end >> 0.05 real 0.04 user 0.01 sys >> 0.04 real 0.03 user 0.01 sys >> 0.04 real 0.03 user 0.01 sys >> 0.04 real 0.03 user 0.01 sys >> 0.03 real 0.03 user 0.00 sys >> 0.04 real 0.03 user 0.01 sys >> 0.04 real 0.03 user 0.00 sys >> 0.03 real 0.03 user 0.00 sys >> 0.04 real 0.03 user 0.01 sys >> 0.03 real 0.03 user 0.00 sys >> >> >> (2) Eagerly initialize libgraal (with PR): >> >>> for i in (seq 10); /usr/bin/time java -XX:+EagerJVMCI Nop; end >> 0.06 real 0.04 user 0.01 sys >> 0.05 real 0.03 user 0.01 sys >> 0.05 real 0.03 user 0.01 sys >> 0.05 real 0.03 user 0.01 sys >> 0.05 real 0.03 user 0.01 sys >> 0.05 real 0.03 user 0.01 sys >> 0.05 real 0.03 user 0.01 sys >> 0.05 real 0.03 user 0.01 sys >> 0.05 real 0.03 user 0.01 sys >> 0.05 real 0.03 user 0.01 sys >> >> >> (3) Eagerly initialize libgraal (without PR): >> >>> for i in (seq 10); /usr/bin/time java -XX:+EagerJVMCI Nop; end >> 0.11 real 0.08 user 0.02 sys >> 0.08 real 0.06 user 0.01 sys >> 0.08 real 0.07 user 0.01 sys >> 0.10 real 0.07 user 0.01 sys >> 0.08 real 0.06 user 0.01 sys >> 0.10 real 0.07 user 0.01 sys >> 0.08 real 0.07 user 0.01 sys >> 0.08 real ... > > Doug Simon has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains two commits: > > - [skip ci] snapshot Arguments::system_properties() in JVMCI instead of System.savedProps > - more efficient copying of system properties into libjvmci src/hotspot/share/jvmci/jvmciJavaClasses.hpp line 60: > 58: jvmci_constructor) \ > 59: start_class(Services, jdk_vm_ci_services_Services) \ > 60: jvmci_method(CallStaticVoidMethod, GetStaticMethodID, call_static, void, Services, initializeSavedProperties, byte_array_void_signature, (JVMCIObject serializedProperties)) \ The final arg for the `jvmci_method` macro is never used so I removed it everywhere. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14291#discussion_r1222257086 From sspitsyn at openjdk.org Wed Jun 7 23:11:48 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 7 Jun 2023 23:11:48 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: <_52Zqcx-rftOA6EzeCRNEcUPBNI-fK3nvYQJ4TItudM=.00584b6b-f4a8-486a-b0c4-40d77bca5f2b@github.com> References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> <8pd-OWsWDM6RK-_X479uitxD-NNERoHXggUi4F7iemM=.229a79e8-5140-456b-80ba-2916e7dd95d9@github.com> <_52Zqcx-rftOA6EzeCRNEcUPBNI-fK3nvYQJ4TItudM=.00584b6b-f4a8-486a-b0c4-40d77bca5f2b@github.com> Message-ID: On Wed, 7 Jun 2023 22:48:47 GMT, Alex Menkov wrote: >> Do you need to check jt->is_interrupted(false) and set INTERRUPTED bit? >> It looks like java_lang_Thread::get_thread_status(thread_oop) can only return RUNNABLE in the case and we clear it, so the call is not needed: >> >> if (is_thread_carrying_vthread(jt, thread_oop)) { >> jint state = JVMTI_THREAD_STATE_WAITING | JVMTI_THREAD_STATE_WAITING_INDEFINITELY; >> if (jt->is_carrier_thread_suspended()) { >> state |= JVMTI_THREAD_STATE_SUSPENDED; >> } >> if (jt->is_interrupted(false)) { >> state |= JVMTI_THREAD_STATE_INTERRUPTED; >> } >> return state; >> } else ... > >> A thread carrying a virtual thread can not be in native, blocked, parked, sleeping or waiting on some object. > > Actually it can be in native. > And if I remember correctly synchronized block pins virtual thread, so inside synchronized we can get other states The INTERRUPTED bit we need has to be returned by the `java_lang_Thread::get_thread_status`. Not completely sure but the bit jt->is_interrupted(false) can be set for the mounted virtual thread. 
The JVMTI InterruptThread calls this function to set interrupt bit for non-virtual threads: ` java_lang_Thread::set_interrupted(thread_obj, true);` ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222268031 From liach at openjdk.org Thu Jun 8 01:39:50 2023 From: liach at openjdk.org (Chen Liang) Date: Thu, 8 Jun 2023 01:39:50 GMT Subject: RFR: 8305104: Remove the old core reflection implementation [v2] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 22:02:57 GMT, Mandy Chung wrote: >> JEP 416 integrated in JDK 18 and since then, only a couple minor issues has been reported. Those issues were related with exception being thrown with invalid arguments. We propose to remove the old core reflection implementation in JDK 22. The `-Djdk.reflect.useDirectMethodHandle=false` workaround to revert to the old implementation will stop to work. > > Mandy Chung has updated the pull request incrementally with one additional commit since the last revision: > > fix merge issue test/jdk/java/lang/reflect/Field/NegativeTest.java line 27: > 25: * @test > 26: * @bug 8277451 > 27: * @run testng/othervm NegativeTest Does this still need othervm if it doesn't set system properties? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14371#discussion_r1222350708 From sspitsyn at openjdk.org Thu Jun 8 01:42:10 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 8 Jun 2023 01:42:10 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v6] In-Reply-To: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> Message-ID: > This is REDO the fix of [JDK-8307153](https://bugs.openjdk.org/browse/JDK-8307153). > The last update of the fix in the review cycle was incorrect and incorrectly tested, so the issue has not been noticed. 
It is why the fix was backed out. > The issue is that the SUSPEND bit was missed in the JVMTI thread state of platform/carrier threads carrying virtual threads (see`JvmtiEnvBase::get_thread_state` function). > > The first push/patch is the original fix of JDK-8307153. > The fix of the SUSPEND bit issue will be in the incremental update. > It is to simplify the review. > > Testing: > - TBD: mach5 tiers 1-5 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: corrected the function get_thread_state for safety ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14366/files - new: https://git.openjdk.org/jdk/pull/14366/files/8f26e277..5fd74f39 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14366&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14366&range=04-05 Stats: 13 lines in 1 file changed: 10 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14366.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14366/head:pull/14366 PR: https://git.openjdk.org/jdk/pull/14366 From sspitsyn at openjdk.org Thu Jun 8 01:42:25 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 8 Jun 2023 01:42:25 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> <8pd-OWsWDM6RK-_X479uitxD-NNERoHXggUi4F7iemM=.229a79e8-5140-456b-80ba-2916e7dd95d9@github.com> <_52Zqcx-rftOA6EzeCRNEcUPBNI-fK3nvYQJ4TItudM=.00584b6b-f4a8-486a-b0c4-40d77bca5f2b@github.com> Message-ID: On Wed, 7 Jun 2023 23:08:52 GMT, Serguei Spitsyn wrote: >>> A thread carrying a virtual thread can not be in native, blocked, parked, sleeping or waiting on some object. >> >> Actually it can be in native. 
>> And if I remember correctly synchronized block pins virtual thread, so inside synchronized we can get other states > > The INTERRUPTED bit we need has to be returned by the `java_lang_Thread::get_thread_status`. > Not completely sure but the bit jt->is_interrupted(false) can be set for the mounted virtual thread. > The JVMTI InterruptThread calls this function to set interrupt bit for non-virtual threads: > ` java_lang_Thread::set_interrupted(thread_obj, true);` Corrected the function `get_thread_state()` to make it more safe. Only `ALIVE` and `INTERRUPTED` bits are taken from result of `java_lang_Thread::get_thread_status(thread_oop)`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222352148 From dholmes at openjdk.org Thu Jun 8 01:53:52 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 8 Jun 2023 01:53:52 GMT Subject: RFR: 8305104: Remove the old core reflection implementation [v2] In-Reply-To: References: Message-ID: <99mbKbiEdzbFp3PyMA-I8UX3LPWriM6PZFX3itSTObw=.c3238b5d-1552-4455-8ebb-923ccd60480d@github.com> On Wed, 7 Jun 2023 22:02:57 GMT, Mandy Chung wrote: >> JEP 416 integrated in JDK 18 and since then, only a couple minor issues has been reported. Those issues were related with exception being thrown with invalid arguments. We propose to remove the old core reflection implementation in JDK 22. The `-Djdk.reflect.useDirectMethodHandle=false` workaround to revert to the old implementation will stop to work. > > Mandy Chung has updated the pull request incrementally with one additional commit since the last revision: > > fix merge issue Hotspot code and test changes look fine. Thanks. src/hotspot/share/classfile/verifier.cpp line 298: > 296: // NOTE: this is called too early in the bootstrapping process to be > 297: // guarded by Universe::is_gte_jdk14x_version(). 
> 298: // Also for lambda generated code, gte jdk8 While you are here could you delete these version comments please - they are meaningless these days. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/14371#pullrequestreview-1468854747 PR Review Comment: https://git.openjdk.org/jdk/pull/14371#discussion_r1222355124 From dholmes at openjdk.org Thu Jun 8 02:09:48 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 8 Jun 2023 02:09:48 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> <8pd-OWsWDM6RK-_X479uitxD-NNERoHXggUi4F7iemM=.229a79e8-5140-456b-80ba-2916e7dd95d9@github.com> <_52Zqcx-rftOA6EzeCRNEcUPBNI-fK3nvYQJ4TItudM=.00584b6b-f4a8-486a-b0c4-40d77bca5f2b@github.com> Message-ID: On Thu, 8 Jun 2023 01:40:06 GMT, Serguei Spitsyn wrote: >> The INTERRUPTED bit we need has to be returned by the `java_lang_Thread::get_thread_status`. >> Not completely sure but the bit jt->is_interrupted(false) can be set for the mounted virtual thread. >> The JVMTI InterruptThread calls this function to set interrupt bit for non-virtual threads: >> ` java_lang_Thread::set_interrupted(thread_obj, true);` > > Corrected the function `get_thread_state()` to make it more safe. > Only `ALIVE` and `INTERRUPTED` bits are taken from result of `java_lang_Thread::get_thread_status(thread_oop)`. > A thread carrying a virtual thread can not be in native, blocked, parked, sleeping or waiting on some object. A virtual thread can call native code, be blocked on an object monitor, or waiting on an object monitor. Only parking and sleeping are specialized for virtual threads in the list you gave. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222364495 From dholmes at openjdk.org Thu Jun 8 02:27:48 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 8 Jun 2023 02:27:48 GMT Subject: RFR: 8309613: [Windows] hs_err files sometimes miss information about the code containing the error In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 14:32:13 GMT, Martin Doerr wrote: > We have seen hs_err files for errors triggered by C2 compiled methods which miss the most relevant information: the C2 method (see JBS issue for more details). I have found a possibility to add it. Please take a look and provide feedback. > > Testing: > > diff --git a/src/hotspot/share/opto/parse1.cpp b/src/hotspot/share/opto/parse1.cpp > index f179d3ba88d..c35a1ac595e 100644 > --- a/src/hotspot/share/opto/parse1.cpp > +++ b/src/hotspot/share/opto/parse1.cpp > @@ -1210,6 +1210,12 @@ void Parse::do_method_entry() { > make_dtrace_method_entry(method()); > } > > + if (UseNewCode) { > + Node* halt = _gvn.transform(new HaltNode(control(), frameptr(), "Requested Halt!")); > + C->root()->add_req(halt); > + set_control(halt); > + } > + > #ifdef ASSERT > // Narrow receiver type when it is too broad for the method being parsed. > if (!method()->is_static()) { IIUC the basic fix here is track the last pc that was found before things "got stuck" so we print it. Though I'm unclear on the details - didn't we already print this pc in the stack trace as it was the last good pc? If so what does the new `print_code` show in addition? An example of the before/after output in the hs_err file would be helpful. Thanks. src/hotspot/share/utilities/vmError.cpp line 680: > 678: // keep track of which code has already been printed > 679: const int printed_capacity = max_error_log_print_code; > 680: address printed[printed_capacity]; Does this buffer get reused/overwritten by the "printing code blobs" logic? 
src/hotspot/share/utilities/vmError.cpp line 976: > 974: // We have printed the native stack in platform-specific code > 975: // Windows/x64 needs special handling. > 976: // Stack walking may got stuck. Try to print the calling code. Nit: s/got/get/ ------------- PR Review: https://git.openjdk.org/jdk/pull/14358#pullrequestreview-1468870840 PR Review Comment: https://git.openjdk.org/jdk/pull/14358#discussion_r1222365748 PR Review Comment: https://git.openjdk.org/jdk/pull/14358#discussion_r1222366172 From sspitsyn at openjdk.org Thu Jun 8 03:39:47 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 8 Jun 2023 03:39:47 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> <8pd-OWsWDM6RK-_X479uitxD-NNERoHXggUi4F7iemM=.229a79e8-5140-456b-80ba-2916e7dd95d9@github.com> <_52Zqcx-rftOA6EzeCRNEcUPBNI-fK3nvYQJ4TItudM=.00584b6b-f4a8-486a-b0c4-40d77bca5f2b@github.com> Message-ID: On Thu, 8 Jun 2023 02:07:00 GMT, David Holmes wrote: > > A thread carrying a virtual thread can not be in native, blocked, parked, sleeping or waiting on some object. >A virtual thread can call native code, be blocked on an object monitor, or waiting on an object monitor. Only parking and sleeping are specialized for virtual threads in the list you gave. This statement was about carrier thread when there is a virtual thread executed at the top. We are getting state bits with the `java_lang_Thread::get_thread_status(thread_oop)` where the `thread_oop` belongs to the carrier thread. But you are talking about a virtual thread which, of course, can be in almost any state. 
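
To make the bits under discussion concrete, here is a sketch of how an agent might interpret a `GetThreadState` result, loosely following the hierarchical questions in the JVMTI specification. The bit values are those published in `jvmti.h`; the function `describe_thread_state` and its output strings are hypothetical and only for illustration.

```cpp
#include <cstdint>
#include <string>

// Thread state bits, with the values published in jvmti.h.
constexpr std::int32_t JVMTI_THREAD_STATE_ALIVE                    = 0x0001;
constexpr std::int32_t JVMTI_THREAD_STATE_TERMINATED               = 0x0002;
constexpr std::int32_t JVMTI_THREAD_STATE_RUNNABLE                 = 0x0004;
constexpr std::int32_t JVMTI_THREAD_STATE_WAITING_INDEFINITELY     = 0x0010;
constexpr std::int32_t JVMTI_THREAD_STATE_WAITING_WITH_TIMEOUT     = 0x0020;
constexpr std::int32_t JVMTI_THREAD_STATE_WAITING                  = 0x0080;
constexpr std::int32_t JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER = 0x0400;
constexpr std::int32_t JVMTI_THREAD_STATE_SUSPENDED                = 0x100000;
constexpr std::int32_t JVMTI_THREAD_STATE_INTERRUPTED              = 0x200000;
constexpr std::int32_t JVMTI_THREAD_STATE_IN_NATIVE                = 0x400000;

// Returns a short, human-readable summary of a JVMTI thread state value.
std::string describe_thread_state(std::int32_t s) {
  // First question: is the thread alive at all?
  if ((s & JVMTI_THREAD_STATE_ALIVE) == 0) {
    return (s & JVMTI_THREAD_STATE_TERMINATED) ? "terminated" : "new";
  }
  // Exactly one of blocked / waiting / runnable is expected for a live thread.
  std::string out;
  if (s & JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER) {
    out = "blocked";
  } else if (s & JVMTI_THREAD_STATE_WAITING) {
    out = (s & JVMTI_THREAD_STATE_WAITING_WITH_TIMEOUT)
              ? "waiting(timed)"
              : "waiting(indefinitely)";
  } else if (s & JVMTI_THREAD_STATE_RUNNABLE) {
    out = "runnable";
  } else {
    out = "alive";
  }
  // These three bits are orthogonal to the branch chosen above.
  if (s & JVMTI_THREAD_STATE_SUSPENDED)   out += "+suspended";
  if (s & JVMTI_THREAD_STATE_INTERRUPTED) out += "+interrupted";
  if (s & JVMTI_THREAD_STATE_IN_NATIVE)   out += "+native";
  return out;
}
```

Under this reading, a carrier with a mounted virtual thread that reports `ALIVE | WAITING | WAITING_INDEFINITELY` lands in the "waiting(indefinitely)" branch, while the mounted virtual thread is queried separately and can land in any branch.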
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222413499 From sspitsyn at openjdk.org Thu Jun 8 04:01:56 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 8 Jun 2023 04:01:56 GMT Subject: RFR: 8309602: update JVMTI history table for jdk 21 [v4] In-Reply-To: References: Message-ID: <7xXMXXSeQLwosyHqOBA9N6cIKIIzHRfGTOVNxYLcMOY=.4a7ce215-bf6b-4659-a502-fb001293a5c7@github.com> > This is a minor update of the `jvmti.xml` file. > The JVM TI history table needs to be updated to list: > - Virtual threads finalized to be a permanent feature. > - Agent start-up in the live phase now specified to print a warning. > > The JVM TI history table has no normative changes. This update does not need a CSR. Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: improved formatting in the jvmti.xml history table ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14352/files - new: https://git.openjdk.org/jdk/pull/14352/files/11db4f4f..8b0997a2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14352&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14352&range=02-03 Stats: 43 lines in 1 file changed: 0 ins; 14 del; 29 mod Patch: https://git.openjdk.org/jdk/pull/14352.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14352/head:pull/14352 PR: https://git.openjdk.org/jdk/pull/14352 From sspitsyn at openjdk.org Thu Jun 8 04:06:55 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 8 Jun 2023 04:06:55 GMT Subject: RFR: 8309602: update JVMTI history table for jdk 21 [v4] In-Reply-To: <7xXMXXSeQLwosyHqOBA9N6cIKIIzHRfGTOVNxYLcMOY=.4a7ce215-bf6b-4659-a502-fb001293a5c7@github.com> References: <7xXMXXSeQLwosyHqOBA9N6cIKIIzHRfGTOVNxYLcMOY=.4a7ce215-bf6b-4659-a502-fb001293a5c7@github.com> Message-ID: On Thu, 8 Jun 2023 04:01:56 GMT, Serguei Spitsyn wrote: >> This is a minor update of the `jvmti.xml` file. 
>> The JVM TI history table needs to be updated to list: >> - Virtual threads finalized to be a permanent feature. >> - Agent start-up in the live phase now specified to print a warning. >> >> The JVM TI history table has no normative changes. This update does not need a CSR. > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > improved formatting in the jvmti.xml history table Alan and Iris, thank you for review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14352#issuecomment-1581857901 From sspitsyn at openjdk.org Thu Jun 8 04:06:57 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 8 Jun 2023 04:06:57 GMT Subject: Integrated: 8309602: update JVMTI history table for jdk 21 In-Reply-To: References: Message-ID: <2z5JUAtnKoIy2wAx3tSYZP_SYkRlvmEm334SnqUzmxI=.5886d218-223f-4a3d-902a-5b92721b20d1@github.com> On Wed, 7 Jun 2023 12:32:14 GMT, Serguei Spitsyn wrote: > This is a minor update of the `jvmti.xml` file. > The JVM TI history table needs to be updated to list: > - Virtual threads finalized to be a permanent feature. > - Agent start-up in the live phase now specified to print a warning. > > The JVM TI history table has no normative changes. This update does not need a CSR. This pull request has now been integrated. 
Changeset: 5af9d2a0 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/5af9d2a0ac82ad83dc83461e5b8ce793cc995ad3 Stats: 44 lines in 1 file changed: 0 ins; 10 del; 34 mod 8309602: update JVMTI history table for jdk 21 Reviewed-by: alanb, iris ------------- PR: https://git.openjdk.org/jdk/pull/14352 From dholmes at openjdk.org Thu Jun 8 04:32:48 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 8 Jun 2023 04:32:48 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> <8pd-OWsWDM6RK-_X479uitxD-NNERoHXggUi4F7iemM=.229a79e8-5140-456b-80ba-2916e7dd95d9@github.com> <_52Zqcx-rftOA6EzeCRNEcUPBNI-fK3nvYQJ4TItudM=.00584b6b-f4a8-486a-b0c4-40d77bca5f2b@github.com> Message-ID: On Thu, 8 Jun 2023 03:36:52 GMT, Serguei Spitsyn wrote: >>> A thread carrying a virtual thread can not be in native, blocked, parked, sleeping or waiting on some object. >> >> A virtual thread can call native code, be blocked on an object monitor, or waiting on an object monitor. Only parking and sleeping are specialized for virtual threads in the list you gave. > >> > A thread carrying a virtual thread can not be in native, blocked, parked, sleeping or waiting on some object. > >>A virtual thread can call native code, be blocked on an object monitor, or waiting on an object monitor. Only parking and sleeping are specialized for virtual threads in the list you gave. > > This statement was about a carrier thread (not a `JavaThread` and not a ` java.lang.VirtualThread`) when there is a virtual thread executed at the top. We are getting state bits with the `java_lang_Thread::get_thread_status(thread_oop)` where the `thread_oop` belongs to the carrier thread. But you are talking about a virtual thread which, of course, can be in almost any state. 
Thanks for clarifying - it gets very confusing as to which "thread" is being talked about. But if a virtual thread is mounted on this JavaThread then I thought the carrier thread's thread-oop is supposed to be in a blocked state? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222437953 From sspitsyn at openjdk.org Thu Jun 8 04:46:50 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 8 Jun 2023 04:46:50 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> <8pd-OWsWDM6RK-_X479uitxD-NNERoHXggUi4F7iemM=.229a79e8-5140-456b-80ba-2916e7dd95d9@github.com> <_52Zqcx-rftOA6EzeCRNEcUPBNI-fK3nvYQJ4TItudM=.00584b6b-f4a8-486a-b0c4-40d77bca5f2b@github.com> Message-ID: On Thu, 8 Jun 2023 04:29:47 GMT, David Holmes wrote: >>> > A thread carrying a virtual thread can not be in native, blocked, parked, sleeping or waiting on some object. >> >>>A virtual thread can call native code, be blocked on an object monitor, or waiting on an object monitor. Only parking and sleeping are specialized for virtual threads in the list you gave. >> >> This statement was about a carrier thread (not a `JavaThread` and not a ` java.lang.VirtualThread`) when there is a virtual thread executed at the top. We are getting state bits with the `java_lang_Thread::get_thread_status(thread_oop)` where the `thread_oop` belongs to the carrier thread. But you are talking about a virtual thread which, of course, can be in almost any state. > > Thanks for clarifying - it gets very confusing as to which "thread" is being talked about. But if a virtual thread is mounted on this JavaThread then I thought the carrier thread's thread-oop is supposed to be in a blocked state? It was decided with Alan that it is okay to be in a waiting state. 
The `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` state requires a monitor to be blocked on, so it can be confusing. Alan's comment in the original PR [https://github.com/openjdk/jdk/pull/14298](https://github.com/openjdk/jdk/pull/14298) was: > if the jt is carrying thread_oop and it's okay for the JVMTI state to reported as WAITING when waiting for something other than Object.wait. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222442892 From iklam at openjdk.org Thu Jun 8 05:08:53 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 8 Jun 2023 05:08:53 GMT Subject: RFR: 8309065: Move the logic to determine archive heap location from CDS to G1 GC [v3] In-Reply-To: <_8f8oqpdYT56hxKf3MfApsT658ARTF0jXIMnP3UQo7E=.4936b832-9b5f-4853-be3d-c21daa3720c6@github.com> References: <_8f8oqpdYT56hxKf3MfApsT658ARTF0jXIMnP3UQo7E=.4936b832-9b5f-4853-be3d-c21daa3720c6@github.com> Message-ID: <8pwA_2qyJgwWaeeCNihqyrFI8vPPd7_Y2CMsxw32YrI=.fd4e81e9-0387-4e2e-affb-8cc279fe66c3@github.com> On Wed, 7 Jun 2023 14:30:32 GMT, Ashutosh Mehra wrote: > > I didn't look but I assume the dumptime heap size is the default heap size? > > > > I am wondering why do you think dumptime heap size is the default heap size? When creating an archive the heap size can be anything. For default archive it is 128m. The dump time heap size is set to 128mb for the default archive so that it can be mapped without relocation for small workloads with heaps that can fit under the 4gb boundary. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14208#discussion_r1222455270 From alanb at openjdk.org Thu Jun 8 06:27:49 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 8 Jun 2023 06:27:49 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> <8pd-OWsWDM6RK-_X479uitxD-NNERoHXggUi4F7iemM=.229a79e8-5140-456b-80ba-2916e7dd95d9@github.com> <_52Zqcx-rftOA6EzeCRNEcUPBNI-fK3nvYQJ4TItudM=.00584b6b-f4a8-486a-b0c4-40d77bca5f2b@github.com> Message-ID: On Thu, 8 Jun 2023 04:41:10 GMT, Serguei Spitsyn wrote: >> Thanks for clarifying - it gets very confusing as to which "thread" is being talked about. But if a virtual thread is mounted on this JavaThread then I thought the carrier thread's thread-oop is supposed to be in a blocked state? > > It was decided with Alan that it is okay to be in a waiting state. The `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` state requires a monitor to be blocked on, so it can be confusing. Alan's comment in the original PR [https://github.com/openjdk/jdk/pull/14298](https://github.com/openjdk/jdk/pull/14298) was: >> if the jt is carrying thread_oop and it's okay for the JVMTI state to reported as WAITING when waiting for something other than Object.wait. The mental model is that the carrier is blocked, so this is what an observer using the APIs should see. My recollection is that JVMTI_THREAD_STATE_WAITING was okay because there is wriggle room in the JVM TI spec: it only uses Object.wait as an example. There may be a few rough edges to smooth down in this area. It's okay to take time with this PR, expand the tests to cover more cases, and get more confident that there aren't more issues.
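For readers following the state-bit discussion, here is a minimal sketch of how a JVMTI thread-state mask decomposes into flags. It is not code from the PR. The constants are redefined locally with the values listed in the JVM TI specification (please check them against your JDK's `jvmti.h`), and `describe_state` and `example` are hypothetical helpers invented for illustration. The bit combination used in `example` only illustrates "waiting, but not in Object.wait"; it makes no claim about what HotSpot reports for a carrier thread.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Local copies of a few JVMTI_THREAD_STATE_* values from the JVM TI spec,
// so the sketch compiles without jvmti.h (assumption: verify against jvmti.h).
constexpr std::uint32_t ALIVE                    = 0x0001;
constexpr std::uint32_t RUNNABLE                 = 0x0004;
constexpr std::uint32_t WAITING_INDEFINITELY     = 0x0010;
constexpr std::uint32_t WAITING                  = 0x0080;
constexpr std::uint32_t IN_OBJECT_WAIT           = 0x0100;
constexpr std::uint32_t BLOCKED_ON_MONITOR_ENTER = 0x0400;
constexpr std::uint32_t SUSPENDED                = 0x100000;

// Hypothetical helper: renders the set bits as a '|'-separated string.
std::string describe_state(std::uint32_t state) {
  std::string out;
  auto add = [&](std::uint32_t bit, const char* name) {
    if (state & bit) {
      if (!out.empty()) out += '|';
      out += name;
    }
  };
  add(ALIVE, "ALIVE");
  add(RUNNABLE, "RUNNABLE");
  add(BLOCKED_ON_MONITOR_ENTER, "BLOCKED_ON_MONITOR_ENTER");
  add(WAITING, "WAITING");
  add(WAITING_INDEFINITELY, "WAITING_INDEFINITELY");
  add(IN_OBJECT_WAIT, "IN_OBJECT_WAIT");
  add(SUSPENDED, "SUSPENDED");
  return out;
}

// Illustrative only: a thread that is waiting on something other than
// Object.wait and has also been suspended.
void example() {
  const std::uint32_t s = ALIVE | WAITING | WAITING_INDEFINITELY | SUSPENDED;
  assert(describe_state(s) == "ALIVE|WAITING|WAITING_INDEFINITELY|SUSPENDED");
  assert((s & IN_OBJECT_WAIT) == 0);            // not an Object.wait
  assert((s & BLOCKED_ON_MONITOR_ENTER) == 0);  // no monitor involved
}
```

The point of the sketch is that WAITING is a separate bit from IN_OBJECT_WAIT, so a state can say "waiting" without naming a monitor, whereas BLOCKED_ON_MONITOR_ENTER implies one.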
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1222511997 From stefank at openjdk.org Thu Jun 8 06:40:49 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 8 Jun 2023 06:40:49 GMT Subject: RFR: 8306841: Generational ZGC: NMT reports Java heap size larger than max heap size In-Reply-To: <7q-C6XTmgkOMgXGYjDNXop5dKrTdvhh52jagvS8YfO0=.9fde096e-1889-4ce1-80c2-9ee9608a91d4@github.com> References: <7q-C6XTmgkOMgXGYjDNXop5dKrTdvhh52jagvS8YfO0=.9fde096e-1889-4ce1-80c2-9ee9608a91d4@github.com> Message-ID: On Wed, 7 Jun 2023 14:20:57 GMT, Thomas Stuefe wrote: > Question: why is this limited to generational ZGC? Just a decision not to fix old ZGC, or does it not happen with old ZGC? It's a little bit of both. We are mainly focusing on improving Generational ZGC, but I actually did create a Bug for the Singlegen ZGC (JDK-8309607) for this. The fix in this PR was actually an old patch for Generational ZGC that I had lying around and had already tested. We can probably still fix this for Singlegen ZGC, but it needs a little bit more care since we have yet another layer with the multi-mapping. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14355#issuecomment-1581973726 From stefank at openjdk.org Thu Jun 8 06:44:48 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 8 Jun 2023 06:44:48 GMT Subject: RFR: 8306841: Generational ZGC: NMT reports Java heap size larger than max heap size In-Reply-To: <7q-C6XTmgkOMgXGYjDNXop5dKrTdvhh52jagvS8YfO0=.9fde096e-1889-4ce1-80c2-9ee9608a91d4@github.com> References: <7q-C6XTmgkOMgXGYjDNXop5dKrTdvhh52jagvS8YfO0=.9fde096e-1889-4ce1-80c2-9ee9608a91d4@github.com> Message-ID: On Wed, 7 Jun 2023 14:20:57 GMT, Thomas Stuefe wrote: > We do, but it is not such an important use case: in hs_err file "unknown pointer" printing, I use NMT to make sense of an otherwise unknown address. OK. I didn't think about that newish feature.
However, if those pointers were actually into the Java Heap they wouldn't be "unknown pointers" but instead reported as uncolored ZGC pointers by the code in ZCollectedHeap::print_location (IIUC). Though I see your point, and maybe there's a way to rewrite NMT in the future. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14355#issuecomment-1581976917 From stefank at openjdk.org Thu Jun 8 06:54:52 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 8 Jun 2023 06:54:52 GMT Subject: RFR: 8306841: Generational ZGC: NMT reports Java heap size larger than max heap size In-Reply-To: <7q-C6XTmgkOMgXGYjDNXop5dKrTdvhh52jagvS8YfO0=.9fde096e-1889-4ce1-80c2-9ee9608a91d4@github.com> References: <7q-C6XTmgkOMgXGYjDNXop5dKrTdvhh52jagvS8YfO0=.9fde096e-1889-4ce1-80c2-9ee9608a91d4@github.com> Message-ID: On Wed, 7 Jun 2023 14:14:45 GMT, Thomas Stuefe wrote: >> ZGC has separated the committing of physical memory from the mapping of the committed memory to virtual memory. It also has asynchronous, lazy unmapping of virtual memory from physical memory. This leads to a situation where multiple virtual memory areas can be mapped to the same physical memory. NMT has a strong assumption that there's a 1-to-1 correspondence between committed memory and its virtual memory areas. Because of this NMT and ZGC is not entirely compatible. ZGC has worked around this by adding NMT hooks where the virtual memory is mapped to the committed memory. This mostly works, but there are situations where we have multiple virtual memory areas mapped to the same physical memory, and that causes the NMT values to be inflated. >> >> I propose that we move the NMT committed memory tracking from the mapping of virtual memory to the actual committing of physical memory. >> >> FWIW, given that NMT and ZGC doesn't agree about how memory is committed, we have to fake the virtual memory addresses reported to NMT. 
This could probably be noticed if you look for the Java heap addresses in the NMT details output, but I don't see why anyone should be looking for those address for the Java heap in NMT. The interesting number is the amount of committed memory, not the exact addresses, IMHO. This isn't something that we change with this patch, but it can be worth understanding while looking at this Bug and the associated PR. >> >> I've written a small sanity test for the NMT Java Heap values, however it's non-trivial to write a test that efficiently provokes this. I've verified this fix by manually running an over-provisioned SPECjbb2015 run, which results in a lot of splitting of ZGC heap regions, which in turn gives us multiple virtual memory area mapping for the same physical memory. >> >> Side note: the lazy unmapping of virtual memory can cause other problems with too many virtual memory areas. The inflated NMT numbers have been a smoking gun showing us that issue. We are tracking that issue with [JDK-8308783](https://bugs.openjdk.org/browse/JDK-8308783). > > src/hotspot/share/gc/z/zPhysicalMemory.cpp line 285: > >> 283: // When this function is called we don't know where in the virtual memory >> 284: // this physical memory will be mapped. So we fake that the virtual memory >> 285: // address is the heap base + the given offset. > > Question of a casual ZGC source reader: when you talk about physical vs virtual here, you are not talking about the real physical vs virtual, right? You are talking about offsets into the ZGC backing file vs attach points of said offsets in the virtual address space? Right, but the backing file is backed by physical memory and that's where the name comes from. The backing file is just the way for us to get hold of physical memory, which we can map into the virtual address space. I hope that makes sense. 
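To illustrate the point for casual readers, here is a minimal sketch in plain POSIX; it is not the ZGC code. A temporary file stands in for the heap backing file, and the helper name `two_views_share_backing` is hypothetical. How ZGC actually obtains and maps its backing memory is platform-specific and is not shown here. The sketch maps the same file range at two virtual addresses and checks that a write through one view is visible through the other.

```cpp
#include <cstddef>
#include <cstring>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical helper: returns true if two distinct virtual mappings of the
// same file range observe the same bytes.
bool two_views_share_backing() {
  const std::size_t len = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));

  FILE* f = tmpfile();  // stand-in for a heap backing file
  if (f == nullptr) return false;
  const int fd = fileno(f);

  bool ok = false;
  if (ftruncate(fd, static_cast<off_t>(len)) == 0) {
    // Two independent mappings of offset 0: two virtual areas, one backing page.
    void* a = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    void* b = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (a != MAP_FAILED && b != MAP_FAILED) {
      std::strcpy(static_cast<char*>(a), "zgc");  // write through view a
      ok = (a != b) &&
           std::strcmp(static_cast<const char*>(b), "zgc") == 0;  // read via b
    }
    if (a != MAP_FAILED) munmap(a, len);
    if (b != MAP_FAILED) munmap(b, len);
  }
  fclose(f);
  return ok;
}
```

A tracker that counts committed memory per virtual mapping would count two pages here although only one page of backing memory exists, which is the kind of inflation the PR description talks about.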
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14355#discussion_r1222536953 From stuefe at openjdk.org Thu Jun 8 07:06:49 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 8 Jun 2023 07:06:49 GMT Subject: RFR: 8306841: Generational ZGC: NMT reports Java heap size larger than max heap size In-Reply-To: References: <7q-C6XTmgkOMgXGYjDNXop5dKrTdvhh52jagvS8YfO0=.9fde096e-1889-4ce1-80c2-9ee9608a91d4@github.com> Message-ID: On Thu, 8 Jun 2023 06:52:09 GMT, Stefan Karlsson wrote: >> src/hotspot/share/gc/z/zPhysicalMemory.cpp line 285: >> >>> 283: // When this function is called we don't know where in the virtual memory >>> 284: // this physical memory will be mapped. So we fake that the virtual memory >>> 285: // address is the heap base + the given offset. >> >> Question of a casual ZGC source reader: when you talk about physical vs virtual here, you are not talking about the real physical vs virtual, right? You are talking about offsets into the ZGC backing file vs attach points of said offsets in the virtual address space? > > Right, but the backing file is backed by physical memory and that's where the name comes from. The backing file is just the way for us to get hold of physical memory, which we can map into the virtual address space. I hope that makes sense. It does, thank you. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14355#discussion_r1222548008 From jwaters at openjdk.org Thu Jun 8 07:46:59 2023 From: jwaters at openjdk.org (Julian Waters) Date: Thu, 8 Jun 2023 07:46:59 GMT Subject: RFR: 8250269: Replace ATTRIBUTE_ALIGNED with alignas [v16] In-Reply-To: References: <9QKV9cYFTo_1D8R-mI80lnewNkA0ceJNKFPbrvICxl4=.d6736b76-8324-4084-bede-6e144b4f6c04@github.com> Message-ID: On Sat, 3 Jun 2023 13:45:21 GMT, Julian Waters wrote: >> C++11 added the alignas attribute, for the purpose of specifying alignment on types, much like compiler specific syntax such as gcc's __attribute__((aligned(x))) or Visual C++'s __declspec(align(x)). >> >> We can phase out the use of the macro in favor of the standard attribute. In the meantime, we can replace the compiler specific definitions of ATTRIBUTE_ALIGNED with a portable definition. We might deprecate the use of the macro but changing its implementation quickly and cleanly applies the feature where the macro is being used. >> >> Note: With certain parts of HotSpot using ATTRIBUTE_ALIGNED so indiscriminately, this commit will likely take some time to get right >> >> This will require adding the alignas attribute to the list of language features approved for use in HotSpot code. (Completed with [8297912](https://github.com/openjdk/jdk/pull/11446)) > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: > > - Merge branch 'master' into alignas > - Merge branch 'openjdk:master' into alignas > - alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - ... 
and 6 more: https://git.openjdk.org/jdk/compare/6edd786b...48d816d7 Anyone? There's confirmation that no other uses of the macro are in a position that causes issues without compiler errors... ------------- PR Comment: https://git.openjdk.org/jdk/pull/11431#issuecomment-1582067464 From aph at openjdk.org Thu Jun 8 08:25:35 2023 From: aph at openjdk.org (Andrew Haley) Date: Thu, 8 Jun 2023 08:25:35 GMT Subject: RFR: JDK-8307314: Implementation: Generational Shenandoah (Experimental) [v17] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 21:03:47 GMT, Kelvin Nilsen wrote: > We would like to thank everyone who has taken time to review and provide feedback on our pull request. Given the risks identified during the review process and the lack of time available to perform the thorough review that such a large contribution of code requires, we have decided to close this PR at the current time. We will seek to target JDK 22. Thank you for this. It's the right decision. In hindsight, there never was a highly-likely prospect of getting such a substantial and interwoven patch successfully reviewed in such a short time, even with the most skilful and experienced team. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14185#issuecomment-1582124942 From alanb at openjdk.org Thu Jun 8 08:45:50 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 8 Jun 2023 08:45:50 GMT Subject: RFR: 8305104: Remove the old core reflection implementation [v2] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 22:02:57 GMT, Mandy Chung wrote: >> JEP 416 integrated in JDK 18 and since then, only a couple minor issues has been reported. Those issues were related with exception being thrown with invalid arguments. We propose to remove the old core reflection implementation in JDK 22. The `-Djdk.reflect.useDirectMethodHandle=false` workaround to revert to the old implementation will stop to work.
> > Mandy Chung has updated the pull request incrementally with one additional commit since the last revision: > > fix merge issue src/java.base/share/classes/jdk/internal/reflect/ReflectionFactory.java line 578: > 576: // then switch to the bytecode-based implementations. > 577: > 578: private static final Config DEFAULT_CONFIG = new Config(false, // useNativeAccessorOnly The block comment just before this will need updating. There's another one in Config.config where it uses the default/native implementation in early startup. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14371#discussion_r1222659670 From alanb at openjdk.org Thu Jun 8 08:50:48 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 8 Jun 2023 08:50:48 GMT Subject: RFR: 8305104: Remove the old core reflection implementation [v2] In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 22:02:57 GMT, Mandy Chung wrote: >> JEP 416 integrated in JDK 18 and since then, only a couple minor issues has been reported. Those issues were related with exception being thrown with invalid arguments. We propose to remove the old core reflection implementation in JDK 22. The `-Djdk.reflect.useDirectMethodHandle=false` workaround to revert to the old implementation will stop to work. > > Mandy Chung has updated the pull request incrementally with one additional commit since the last revision: > > fix merge issue The old + new implementations have been in four releases, which is good for shaking out any regressions/issues. Removing the old implementation in JDK 22 is good. ------------- Marked as reviewed by alanb (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14371#pullrequestreview-1469340126 From mdoerr at openjdk.org Thu Jun 8 09:34:52 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 8 Jun 2023 09:34:52 GMT Subject: RFR: 8309613: [Windows] hs_err files sometimes miss information about the code containing the error In-Reply-To: References: Message-ID: On Thu, 8 Jun 2023 02:10:00 GMT, David Holmes wrote:

>> We have seen hs_err files for errors triggered by C2 compiled methods which miss the most relevant information: the C2 method (see JBS issue for more details). I have found a possibility to add it. Please take a look and provide feedback.
>>
>> Testing:
>>
>> diff --git a/src/hotspot/share/opto/parse1.cpp b/src/hotspot/share/opto/parse1.cpp
>> index f179d3ba88d..c35a1ac595e 100644
>> --- a/src/hotspot/share/opto/parse1.cpp
>> +++ b/src/hotspot/share/opto/parse1.cpp
>> @@ -1210,6 +1210,12 @@ void Parse::do_method_entry() {
>>      make_dtrace_method_entry(method());
>>    }
>>
>> +  if (UseNewCode) {
>> +    Node* halt = _gvn.transform(new HaltNode(control(), frameptr(), "Requested Halt!"));
>> +    C->root()->add_req(halt);
>> +    set_control(halt);
>> +  }
>> +
>>  #ifdef ASSERT
>>    // Narrow receiver type when it is too broad for the method being parsed.
>>    if (!method()->is_static()) {
>>
>>
>> "java -XX:+UseNewCode -version" shows the following output (when no hsdis lib is provided):
>>
>> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
>> V [jvm.dll+0x6ca5b9] os::win32::platform_print_native_stack+0xd9 (os_windows_x86.cpp:236)
>> V [jvm.dll+0x8a3afa] VMError::report+0xd6a (vmError.cpp:973)
>> V [jvm.dll+0x8a5cde] VMError::report_and_die+0x5fe (vmError.cpp:1765)
>> V [jvm.dll+0x283061] report_fatal+0x71 (debug.cpp:212)
>> V [jvm.dll+0x621c3e] MacroAssembler::debug64+0x8e (macroAssembler_x86.cpp:829)
>> C 0x000001635fe021f4
>>
>>
>> called by the following code:
>> Compiled method (c2) 87 16 4 java.lang.Object::<init> (1 bytes)
>>  total in heap [0x000001635fe02010,0x000001635fe02250] = 576
>>  relocation [0x000001635fe02170,0x000001635fe02188] = 24
>>  main code [0x000001635fe021a0,0x000001635fe02200] = 96
>>  stub code [0x000001635fe02200,0x000001635fe02218] = 24
>>  metadata [0x000001635fe02218,0x000001635fe02220] = 8
>>  scopes data [0x000001635fe02220,0x000001635fe02228] = 8
>>  scopes pcs [0x000001635fe02228,0x000001635fe02248] = 32
>>  dependencies [0x000001635fe02248,0x000001635fe02250] = 8
>>
>> [Constant Pool (empty)]
>>
>> [MachCode]
>> [Entry Point]
>> # {method} {0x0000000800478d78} '<init>' '()V' in 'java/lang/Object'
>> # [sp+0x20] (sp of caller)
>> 0x000001635fe021a0: 448b 5208 | 49bb 0000 | 0000 0800 | 0000 4d03 | d349 3bc2
>>
>> 0x000001635fe021b4: ; {runtime_call ic_miss_stub}
>> 0x000001635fe021b4: 0f85 c6c4 | 8fff 6690 | 0f1f 4000
>> [Verified Ent...
>
> src/hotspot/share/utilities/vmError.cpp line 680:
>
>> 678: // keep track of which code has already been printed
>> 679: const int printed_capacity = max_error_log_print_code;
>> 680: address printed[printed_capacity];
>
> Does this buffer get reused/overwritten by the "printing code blobs" logic?

The purpose of the buffer is to keep track of what has already been printed.
We don't want to print the same code again if we encounter the address again. This can be used across several error reporting steps. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14358#discussion_r1222716743 From stefank at openjdk.org Thu Jun 8 14:06:51 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 8 Jun 2023 14:06:51 GMT Subject: RFR: 8306841: Generational ZGC: NMT reports Java heap size larger than max heap size In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 13:30:05 GMT, Stefan Karlsson wrote: > ZGC has separated the committing of physical memory from the mapping of the committed memory to virtual memory. It also has asynchronous, lazy unmapping of virtual memory from physical memory. This leads to a situation where multiple virtual memory areas can be mapped to the same physical memory. NMT has a strong assumption that there's a 1-to-1 correspondence between committed memory and its virtual memory areas. Because of this NMT and ZGC is not entirely compatible. ZGC has worked around this by adding NMT hooks where the virtual memory is mapped to the committed memory. This mostly works, but there are situations where we have multiple virtual memory areas mapped to the same physical memory, and that causes the NMT values to be inflated. > > I propose that we move the NMT committed memory tracking from the mapping of virtual memory to the actual committing of physical memory. > > FWIW, given that NMT and ZGC doesn't agree about how memory is committed, we have to fake the virtual memory addresses reported to NMT. This could probably be noticed if you look for the Java heap addresses in the NMT details output, but I don't see why anyone should be looking for those address for the Java heap in NMT. The interesting number is the amount of committed memory, not the exact addresses, IMHO. This isn't something that we change with this patch, but it can be worth understanding while looking at this Bug and the associated PR. 
> > I've written a small sanity test for the NMT Java Heap values, however it's non-trivial to write a test that efficiently provokes this. I've verified this fix by manually running an over-provisioned SPECjbb2015 run, which results in a lot of splitting of ZGC heap regions, which in turn gives us multiple virtual memory area mapping for the same physical memory. > > Side note: the lazy unmapping of virtual memory can cause other problems with too many virtual memory areas. The inflated NMT numbers have been a smoking gun showing us that issue. We are tracking that issue with [JDK-8308783](https://bugs.openjdk.org/browse/JDK-8308783). Thanks for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/14355#issuecomment-1582638155 From stefank at openjdk.org Thu Jun 8 14:10:04 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Thu, 8 Jun 2023 14:10:04 GMT Subject: Integrated: 8306841: Generational ZGC: NMT reports Java heap size larger than max heap size In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 13:30:05 GMT, Stefan Karlsson wrote: > ZGC has separated the committing of physical memory from the mapping of the committed memory to virtual memory. It also has asynchronous, lazy unmapping of virtual memory from physical memory. This leads to a situation where multiple virtual memory areas can be mapped to the same physical memory. NMT has a strong assumption that there's a 1-to-1 correspondence between committed memory and its virtual memory areas. Because of this NMT and ZGC is not entirely compatible. ZGC has worked around this by adding NMT hooks where the virtual memory is mapped to the committed memory. This mostly works, but there are situations where we have multiple virtual memory areas mapped to the same physical memory, and that causes the NMT values to be inflated. > > I propose that we move the NMT committed memory tracking from the mapping of virtual memory to the actual committing of physical memory. 
> > FWIW, given that NMT and ZGC doesn't agree about how memory is committed, we have to fake the virtual memory addresses reported to NMT. This could probably be noticed if you look for the Java heap addresses in the NMT details output, but I don't see why anyone should be looking for those address for the Java heap in NMT. The interesting number is the amount of committed memory, not the exact addresses, IMHO. This isn't something that we change with this patch, but it can be worth understanding while looking at this Bug and the associated PR. > > I've written a small sanity test for the NMT Java Heap values, however it's non-trivial to write a test that efficiently provokes this. I've verified this fix by manually running an over-provisioned SPECjbb2015 run, which results in a lot of splitting of ZGC heap regions, which in turn gives us multiple virtual memory area mapping for the same physical memory. > > Side note: the lazy unmapping of virtual memory can cause other problems with too many virtual memory areas. The inflated NMT numbers have been a smoking gun showing us that issue. We are tracking that issue with [JDK-8308783](https://bugs.openjdk.org/browse/JDK-8308783). This pull request has now been integrated. Changeset: bb377b26 Author: Stefan Karlsson URL: https://git.openjdk.org/jdk/commit/bb377b26730f3d9da7c76e0d171517e811cef3ce Stats: 128 lines in 2 files changed: 104 ins; 18 del; 6 mod 8306841: Generational ZGC: NMT reports Java heap size larger than max heap size Reviewed-by: eosterlund, stuefe ------------- PR: https://git.openjdk.org/jdk/pull/14355 From mchung at openjdk.org Thu Jun 8 16:46:55 2023 From: mchung at openjdk.org (Mandy Chung) Date: Thu, 8 Jun 2023 16:46:55 GMT Subject: RFR: 8305104: Remove the old core reflection implementation [v3] In-Reply-To: References: Message-ID: > JEP 416 integrated in JDK 18 and since then, only a couple minor issues has been reported. 
Those issues were related with exception being thrown with invalid arguments. We propose to remove the old core reflection implementation in JDK 22. The `-Djdk.reflect.useDirectMethodHandle=false` workaround to revert to the old implementation will stop to work. Mandy Chung has updated the pull request incrementally with one additional commit since the last revision: review feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14371/files - new: https://git.openjdk.org/jdk/pull/14371/files/d161a384..fc164c71 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14371&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14371&range=01-02 Stats: 49 lines in 4 files changed: 15 ins; 28 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/14371.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14371/head:pull/14371 PR: https://git.openjdk.org/jdk/pull/14371 From mchung at openjdk.org Thu Jun 8 16:46:57 2023 From: mchung at openjdk.org (Mandy Chung) Date: Thu, 8 Jun 2023 16:46:57 GMT Subject: RFR: 8305104: Remove the old core reflection implementation [v2] In-Reply-To: <99mbKbiEdzbFp3PyMA-I8UX3LPWriM6PZFX3itSTObw=.c3238b5d-1552-4455-8ebb-923ccd60480d@github.com> References: <99mbKbiEdzbFp3PyMA-I8UX3LPWriM6PZFX3itSTObw=.c3238b5d-1552-4455-8ebb-923ccd60480d@github.com> Message-ID: On Thu, 8 Jun 2023 01:46:55 GMT, David Holmes wrote: >> Mandy Chung has updated the pull request incrementally with one additional commit since the last revision: >> >> fix merge issue > > src/hotspot/share/classfile/verifier.cpp line 298: > >> 296: // NOTE: this is called too early in the bootstrapping process to be >> 297: // guarded by Universe::is_gte_jdk14x_version(). >> 298: // Also for lambda generated code, gte jdk8 > > While you are here could you delete these version comments please - they are meaningless these days. Thanks. Done. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14371#discussion_r1223305314 From mchung at openjdk.org Thu Jun 8 16:47:00 2023 From: mchung at openjdk.org (Mandy Chung) Date: Thu, 8 Jun 2023 16:47:00 GMT Subject: RFR: 8305104: Remove the old core reflection implementation [v2] In-Reply-To: References: Message-ID: <6kpeFY6P4c-vULMHVmKx1moS_087FvmsMDlsWBeSC6o=.88b3c7a7-803b-466f-8223-dd459a9eb98c@github.com> On Thu, 8 Jun 2023 08:42:41 GMT, Alan Bateman wrote: >> Mandy Chung has updated the pull request incrementally with one additional commit since the last revision: >> >> fix merge issue > > src/java.base/share/classes/jdk/internal/reflect/ReflectionFactory.java line 578: > >> 576: // then switch to the bytecode-based implementations. >> 577: >> 578: private static final Config DEFAULT_CONFIG = new Config(false, // useNativeAccessorOnly > > The block comment just before this will need updating. There's another one in Config.config where it uses the default/native implementation in early startup. Thanks for pointing this out. Updated. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14371#discussion_r1223305554 From mchung at openjdk.org Thu Jun 8 16:47:03 2023 From: mchung at openjdk.org (Mandy Chung) Date: Thu, 8 Jun 2023 16:47:03 GMT Subject: RFR: 8305104: Remove the old core reflection implementation [v2] In-Reply-To: References: Message-ID: On Thu, 8 Jun 2023 01:36:40 GMT, Chen Liang wrote: >> Mandy Chung has updated the pull request incrementally with one additional commit since the last revision: >> >> fix merge issue > > test/jdk/java/lang/reflect/Field/NegativeTest.java line 27: > >> 25: * @test >> 26: * @bug 8277451 >> 27: * @run testng/othervm NegativeTest > > Does this still need othervm if it doesn't set system properties? Updated. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14371#discussion_r1223305257 From alanb at openjdk.org Thu Jun 8 17:32:18 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 8 Jun 2023 17:32:18 GMT Subject: RFR: 8305104: Remove the old core reflection implementation [v3] In-Reply-To: References: Message-ID: <1U7a_JyO7dEQz3VP5nxeNyEVXn5zVT4p0CO3iLgAa1w=.bbfda9c7-39d9-4127-8b4a-98bf4a15bc76@github.com> On Thu, 8 Jun 2023 16:46:55 GMT, Mandy Chung wrote: >> JEP 416 integrated in JDK 18 and since then, only a couple minor issues has been reported. Those issues were related with exception being thrown with invalid arguments. We propose to remove the old core reflection implementation in JDK 22. The `-Djdk.reflect.useDirectMethodHandle=false` workaround to revert to the old implementation will stop to work. > > Mandy Chung has updated the pull request incrementally with one additional commit since the last revision: > > review feedback Marked as reviewed by alanb (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14371#pullrequestreview-1470455536 From amenkov at openjdk.org Thu Jun 8 18:21:44 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Thu, 8 Jun 2023 18:21:44 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v6] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> Message-ID: On Thu, 8 Jun 2023 01:42:10 GMT, Serguei Spitsyn wrote: >> This is REDO the fix of [JDK-8307153](https://bugs.openjdk.org/browse/JDK-8307153). >> The last update of the fix in the review cycle was incorrect and incorrectly tested, so the issue has not been noticed. It is why the fix was backed out. >> The issue is that the SUSPEND bit was missed in the JVMTI thread state of platform/carrier threads carrying virtual threads (see`JvmtiEnvBase::get_thread_state` function). 
>> >> The first push/patch is the original fix of JDK-8307153. >> The fix of the SUSPEND bit issue will be in the incremental update. >> It is to simplify the review. >> >> Testing: >> - TBD: mach5 tiers 1-5 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: corrected the function get_thread_state for safety Marked as reviewed by amenkov (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/14366#pullrequestreview-1470533353 From kbarrett at openjdk.org Thu Jun 8 18:47:45 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Thu, 8 Jun 2023 18:47:45 GMT Subject: RFR: 8250269: Replace ATTRIBUTE_ALIGNED with alignas [v16] In-Reply-To: References: <9QKV9cYFTo_1D8R-mI80lnewNkA0ceJNKFPbrvICxl4=.d6736b76-8324-4084-bede-6e144b4f6c04@github.com> Message-ID: On Thu, 8 Jun 2023 07:44:08 GMT, Julian Waters wrote: > Anyone? There's confirmation that no other cases of the macro is in a position that causes issues without compiler errors... I plan to look at this again soon-ish, but swamped right now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/11431#issuecomment-1583156443 From dnsimon at openjdk.org Thu Jun 8 19:08:53 2023 From: dnsimon at openjdk.org (Doug Simon) Date: Thu, 8 Jun 2023 19:08:53 GMT Subject: RFR: 8306028: separate ThreadStart/ThreadEnd events posting code in JVMTI VTMS transitions [v8] In-Reply-To: References: Message-ID: <9RJb8PvlZxLYs2zsDb-lDqoILDkiFwPx54vq3NpIdQQ=.c158dc31-68b0-49bd-8ac4-1ff1069be454@github.com> On Tue, 2 May 2023 02:01:44 GMT, Serguei Spitsyn wrote: >> This refactoring to separate ThreadStart/ThreadEnd events posting code in the JVMTI VTMS transitions is needed for future work on JVMTI scalability and performance improvements. It is to easier put this code on slow path. >> >> Testing: mach5 tiers 1-6 were successful. 
> > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > update copyright comments src/hotspot/share/runtime/sharedRuntime.cpp line 641: > 639: JRT_ENTRY(void, SharedRuntime::notify_jvmti_vthread_start(oopDesc* vt, jboolean hide, JavaThread* current)) > 640: assert(hide == JNI_FALSE, "must be VTMS transition finish"); > 641: jobject vthread = JNIHandles::make_local(const_cast<oopDesc*>(vt)); Since the current thread is in the `current` arg, it could be used here when creating the local handle. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13484#discussion_r1223444559 From phh at openjdk.org Thu Jun 8 19:20:40 2023 From: phh at openjdk.org (Paul Hohensee) Date: Thu, 8 Jun 2023 19:20:40 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives In-Reply-To: References: Message-ID: On Wed, 24 May 2023 00:38:27 GMT, Dmitry Chuyko wrote: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior.
In such a case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives. Methods in general are often inlined, and this information is hard to track down. > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Obviously there is a performance penalty, so it should be applied with care. Hot code will most likely be recompiled soon, as nothing happens to its hotness. > > A new flag '`-d`' has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and marks for deoptimization those methods that have any active non-default matching compiler directives. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be deoptimized. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be deoptimized, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives` has been added for convenience. It's like a combinatio... "refresh" (-r) would be better than "deoptimize" (-d). The latter implies a specific implementation, the former is generic. If the method is to be recompiled, perhaps rather than deopt and wait, add it to the compile queue immediately and deopt the old version when the new compilation is complete, similar to what happens when the c1 version of the method is replaced by the c2 version.
------------- PR Comment: https://git.openjdk.org/jdk/pull/14111#issuecomment-1583199824 From vladimir.petko at canonical.com Thu Jun 8 22:12:49 2023 From: vladimir.petko at canonical.com (Vladimir Petko) Date: Fri, 9 Jun 2023 10:12:49 +1200 Subject: src/hotspot/share/adlc/formsopt.cpp: FrameForm constructor does not initialise all the members Message-ID: Dear Maintainers, src/hotspot/share/adlc/formsopt.cpp contains the FrameForm() constructor that does not initialise all of its members. Would it be possible to consider a patch that fixes FrameForm initialisation? Alternatively this class could benefit from adding default member initialisers, e.g. --------cut------------ class FrameForm : public Form { private: public: // Public Data char *_sync_stack_slots{}; char *_inline_cache_reg{}; .... ----------cut-------------- and removing initialisation from the constructor. In this case it would be harder to leave a new member uninitialised and it would be immediately obvious. Best Regards, Vladimir. -------------- next part -------------- A non-text attachment was scrubbed... Name: formopt.cpp.patch Type: text/x-patch Size: 727 bytes Desc: not available URL: From vladimir.petko at canonical.com Fri Jun 9 02:07:22 2023 From: vladimir.petko at canonical.com (Vladimir Petko) Date: Fri, 9 Jun 2023 14:07:22 +1200 Subject: hotspot/jtreg/runtime/NMT/VirtualAllocCommitMerge.java fails after JDK-8299089 on S390X Message-ID: Dear Maintainers, test/hotspot/jtreg/runtime/NMT/VirtualAllocCommitMerge.java started failing in release configuration after commit "c7056737e33 8299089: Instrument global jni handles with tag to make them distinguishable" on S390X platform. Please see attached VirtualAllocCommitMerge.jtr log. Note: this test fails in slowdebug mode before this commit (see VirtualAllocCommitMerge-slowdebug.jtr) Best Regards, Vladimir. -------------- next part -------------- A non-text attachment was scrubbed...
Name: VirtualAllocCommitMerge.jtr.gz Type: application/gzip Size: 14557 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: VirtualAllocCommitMerge-slowdebug.jtr.gz Type: application/gzip Size: 14397 bytes Desc: not available URL: From david.holmes at oracle.com Fri Jun 9 02:34:34 2023 From: david.holmes at oracle.com (David Holmes) Date: Fri, 9 Jun 2023 12:34:34 +1000 Subject: hotspot/jtreg/runtime/NMT/VirtualAllocCommitMerge.java fails after JDK-8299089 on S390X In-Reply-To: References: Message-ID: Hi Vladimir On 9/06/2023 12:07 pm, Vladimir Petko wrote: > Dear Maintainers, > > test/hotspot/jtreg/runtime/NMT/VirtualAllocCommitMerge.java started > failing in release configuration after commit "c7056737e33 8299089: > Instrument global jni handles with tag to make them distinguishable" > on S390X platform. > > Please see attached VirtualAllocCommitMerge.jtr log. > > Note: this test fails in slowdebug mode before this commit (see > VirtualAllocCommitMerge-slowdebug.jtr) > I have filed https://bugs.openjdk.org/browse/JDK-8309698 for this issue. Thanks for reporting it. David ----- > Best Regards, > Vladimir. From dholmes at openjdk.org Fri Jun 9 04:11:41 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 9 Jun 2023 04:11:41 GMT Subject: RFR: 8309613: [Windows] hs_err files sometimes miss information about the code containing the error In-Reply-To: References: Message-ID: On Thu, 8 Jun 2023 09:32:00 GMT, Martin Doerr wrote: >> src/hotspot/share/utilities/vmError.cpp line 680: >> >>> 678: // keep track of which code has already been printed >>> 679: const int printed_capacity = max_error_log_print_code; >>> 680: address printed[printed_capacity]; >> >> Does this buffer get reused/overwritten by the "printing code blobs" logic? > > The purpose of the buffer is to keep track of what has already been printed. We don't want to print the same code again if we encounter the address again.
This can be used across several error reporting steps. Okay, I just wanted to make sure the different usages don't interfere with each other. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14358#discussion_r1223820488 From sspitsyn at openjdk.org Fri Jun 9 05:15:56 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 9 Jun 2023 05:15:56 GMT Subject: RFR: 8306028: separate ThreadStart/ThreadEnd events posting code in JVMTI VTMS transitions [v8] In-Reply-To: <9RJb8PvlZxLYs2zsDb-lDqoILDkiFwPx54vq3NpIdQQ=.c158dc31-68b0-49bd-8ac4-1ff1069be454@github.com> References: <9RJb8PvlZxLYs2zsDb-lDqoILDkiFwPx54vq3NpIdQQ=.c158dc31-68b0-49bd-8ac4-1ff1069be454@github.com> Message-ID: On Thu, 8 Jun 2023 19:05:54 GMT, Doug Simon wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> update copyright comments > > src/hotspot/share/runtime/sharedRuntime.cpp line 641: > >> 639: JRT_ENTRY(void, SharedRuntime::notify_jvmti_vthread_start(oopDesc* vt, jboolean hide, JavaThread* current)) >> 640: assert(hide == JNI_FALSE, "must be VTMS transition finish"); >> 641: jobject vthread = JNIHandles::make_local(const_cast<oopDesc*>(vt)); > > Since the current thread is in the `current` arg, it could be used here when creating the local handle. That's right. Thanks.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/13484#discussion_r1223851214 From sspitsyn at openjdk.org Fri Jun 9 06:17:49 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 9 Jun 2023 06:17:49 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v4] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> <8pd-OWsWDM6RK-_X479uitxD-NNERoHXggUi4F7iemM=.229a79e8-5140-456b-80ba-2916e7dd95d9@github.com> <_52Zqcx-rftOA6EzeCRNEcUPBNI-fK3nvYQJ4TItudM=.00584b6b-f4a8-486a-b0c4-40d77bca5f2b@github.com> Message-ID: On Thu, 8 Jun 2023 06:25:20 GMT, Alan Bateman wrote: >> It was decided with Alan that it is okay to be in a waiting state. The `JVMTI_THREAD_STATE_BLOCKED_ON_MONITOR_ENTER` state requires a monitor to be blocked on, so it can be confusing. Alan's comment in the original PR [https://github.com/openjdk/jdk/pull/14298](https://github.com/openjdk/jdk/pull/14298) was: >>> if the jt is carrying thread_oop and it's okay for the JVMTI state to reported as WAITING when waiting for something other than Object.wait. > > The mental model is that the carrier is blocked so this is what an observer using the APIs should see. My recollection is that JVMTI_THREAD_STATE_WAITING was okay because there is a wriggle room in the JVM TI spec, it only uses Object.wait as an example. There may be a few rough edges to smooth down in this area. It's okay to take time with this PR and expand the tests to cover more cases and get more confident that there aren't more issues. We agreed with Alex to file a test RFE to improve test coverage in this area. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14366#discussion_r1223883294 From sspitsyn at openjdk.org Fri Jun 9 06:17:51 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 9 Jun 2023 06:17:51 GMT Subject: Integrated: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING In-Reply-To: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> Message-ID: On Wed, 7 Jun 2023 18:42:34 GMT, Serguei Spitsyn wrote: > This is REDO the fix of [JDK-8307153](https://bugs.openjdk.org/browse/JDK-8307153). > The last update of the fix in the review cycle was incorrect and incorrectly tested, so the issue has not been noticed. It is why the fix was backed out. > The issue is that the SUSPEND bit was missed in the JVMTI thread state of platform/carrier threads carrying virtual threads (see`JvmtiEnvBase::get_thread_state` function). > > The first push/patch is the original fix of JDK-8307153. > The fix of the SUSPEND bit issue will be in the incremental update. > It is to simplify the review. > > Testing: > - TBD: mach5 tiers 1-5 This pull request has now been integrated. 
Changeset: f91e9ba7 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/f91e9ba757f04983655c23542e06973805465249 Stats: 96 lines in 4 files changed: 76 ins; 0 del; 20 mod 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING Reviewed-by: cjplummer, amenkov ------------- PR: https://git.openjdk.org/jdk/pull/14366 From sspitsyn at openjdk.org Fri Jun 9 06:26:50 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 9 Jun 2023 06:26:50 GMT Subject: RFR: 8309612: [REDO] JDK-8307153 JVMTI GetThreadState on carrier should return STATE_WAITING [v6] In-Reply-To: References: <2yWxf_TT2Dw2LLUa9fs8GZM-EfYIAOD-mTv1GLmg6o4=.705caff7-40a4-4dcb-862f-cebac0be68db@github.com> Message-ID: On Thu, 8 Jun 2023 01:42:10 GMT, Serguei Spitsyn wrote: >> This is REDO the fix of [JDK-8307153](https://bugs.openjdk.org/browse/JDK-8307153). >> The last update of the fix in the review cycle was incorrect and incorrectly tested, so the issue has not been noticed. It is why the fix was backed out. >> The issue is that the SUSPEND bit was missed in the JVMTI thread state of platform/carrier threads carrying virtual threads (see`JvmtiEnvBase::get_thread_state` function). >> >> The first push/patch is the original fix of JDK-8307153. >> The fix of the SUSPEND bit issue will be in the incremental update. >> It is to simplify the review. >> >> Testing: >> - TBD: mach5 tiers 1-5 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: corrected the function get_thread_state for safety Chris and Alex, thank you for review! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14366#issuecomment-1584039912 From jwaters at openjdk.org Fri Jun 9 07:30:55 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 9 Jun 2023 07:30:55 GMT Subject: RFR: 8250269: Replace ATTRIBUTE_ALIGNED with alignas [v16] In-Reply-To: References: <9QKV9cYFTo_1D8R-mI80lnewNkA0ceJNKFPbrvICxl4=.d6736b76-8324-4084-bede-6e144b4f6c04@github.com> Message-ID: On Sat, 3 Jun 2023 13:45:21 GMT, Julian Waters wrote: >> C++11 added the alignas attribute, for the purpose of specifying alignment on types, much like compiler specific syntax such as gcc's __attribute__((aligned(x))) or Visual C++'s __declspec(align(x)). >> >> We can phase out the use of the macro in favor of the standard attribute. In the meantime, we can replace the compiler specific definitions of ATTRIBUTE_ALIGNED with a portable definition. We might deprecate the use of the macro but changing its implementation quickly and cleanly applies the feature where the macro is being used. >> >> Note: With certain parts of HotSpot using ATTRIBUTE_ALIGNED so indiscriminately, this commit will likely take some time to get right >> >> This will require adding the alignas attribute to the list of language features approved for use in HotSpot code. (Completed with [8297912](https://github.com/openjdk/jdk/pull/11446)) > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: > > - Merge branch 'master' into alignas > - Merge branch 'openjdk:master' into alignas > - alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - Merge branch 'openjdk:master' into alignas > - ... 
and 6 more: https://git.openjdk.org/jdk/compare/6edd786b...48d816d7 Alright, thanks Kim :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/11431#issuecomment-1584104874 From mdoerr at openjdk.org Fri Jun 9 10:24:08 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 9 Jun 2023 10:24:08 GMT Subject: RFR: 8309613: [Windows] hs_err files sometimes miss information about the code containing the error [v2] In-Reply-To: References: Message-ID: > We have seen hs_err files for errors triggered by C2 compiled methods which miss the most relevant information: the C2 method (see JBS issue for more details). I have found a possibility to add it. Please take a look and provide feedback. > > Testing: > > diff --git a/src/hotspot/share/opto/parse1.cpp b/src/hotspot/share/opto/parse1.cpp > index f179d3ba88d..c35a1ac595e 100644 > --- a/src/hotspot/share/opto/parse1.cpp > +++ b/src/hotspot/share/opto/parse1.cpp > @@ -1210,6 +1210,12 @@ void Parse::do_method_entry() { > make_dtrace_method_entry(method()); > } > > + if (UseNewCode) { > + Node* halt = _gvn.transform(new HaltNode(control(), frameptr(), "Requested Halt!")); > + C->root()->add_req(halt); > + set_control(halt); > + } > + > #ifdef ASSERT > // Narrow receiver type when it is too broad for the method being parsed. 
> if (!method()->is_static()) { > > > "java -XX:+UseNewCode -version" shows the following output (when no hsdis lib is provided): > > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [jvm.dll+0x6ca5b9] os::win32::platform_print_native_stack+0xd9 (os_windows_x86.cpp:236) > V [jvm.dll+0x8a3afa] VMError::report+0xd6a (vmError.cpp:973) > V [jvm.dll+0x8a5cde] VMError::report_and_die+0x5fe (vmError.cpp:1765) > V [jvm.dll+0x283061] report_fatal+0x71 (debug.cpp:212) > V [jvm.dll+0x621c3e] MacroAssembler::debug64+0x8e (macroAssembler_x86.cpp:829) > C 0x000001635fe021f4 > > > called by the following code: > Compiled method (c2) 87 16 4 java.lang.Object::<init> (1 bytes) > total in heap [0x000001635fe02010,0x000001635fe02250] = 576 > relocation [0x000001635fe02170,0x000001635fe02188] = 24 > main code [0x000001635fe021a0,0x000001635fe02200] = 96 > stub code [0x000001635fe02200,0x000001635fe02218] = 24 > metadata [0x000001635fe02218,0x000001635fe02220] = 8 > scopes data [0x000001635fe02220,0x000001635fe02228] = 8 > scopes pcs [0x000001635fe02228,0x000001635fe02248] = 32 > dependencies [0x000001635fe02248,0x000001635fe02250] = 8 > > [Constant Pool (empty)] > > [MachCode] > [Entry Point] > # {method} {0x0000000800478d78} '<init>' '()V' in 'java/lang/Object' > # [sp+0x20] (sp of caller) > 0x000001635fe021a0: 448b 5208 | 49bb 0000 | 0000 0800 | 0000 4d03 | d349 3bc2 > > 0x000001635fe021b4: ; {runtime_call ic_miss_stub} > 0x000001635fe021b4: 0f85 c6c4 | 8fff 6690 | 0f1f 4000 > [Verified Entry Point] > 0x000001635fe021c0: 4881 ec18 | 0000 0048 | 896c 2410 | 4181 7f20 | 0100 0000 | 0f85 1b00 > > 0x0000... Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Check result of print_code and update printed_len.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/14358/files - new: https://git.openjdk.org/jdk/pull/14358/files/5f7bcc5d..3bbd2a04 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14358&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14358&range=00-01 Stats: 6 lines in 1 file changed: 4 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/14358.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14358/head:pull/14358 PR: https://git.openjdk.org/jdk/pull/14358 From mdoerr at openjdk.org Fri Jun 9 10:24:09 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 9 Jun 2023 10:24:09 GMT Subject: RFR: 8309613: [Windows] hs_err files sometimes miss information about the code containing the error In-Reply-To: References: Message-ID: On Wed, 7 Jun 2023 14:32:13 GMT, Martin Doerr wrote: > We have seen hs_err files for errors triggered by C2 compiled methods which miss the most relevant information: the C2 method (see JBS issue for more details). I have found a possibility to add it. Please take a look and provide feedback. > > Testing: > > diff --git a/src/hotspot/share/opto/parse1.cpp b/src/hotspot/share/opto/parse1.cpp > index f179d3ba88d..c35a1ac595e 100644 > --- a/src/hotspot/share/opto/parse1.cpp > +++ b/src/hotspot/share/opto/parse1.cpp > @@ -1210,6 +1210,12 @@ void Parse::do_method_entry() { > make_dtrace_method_entry(method()); > } > > + if (UseNewCode) { > + Node* halt = _gvn.transform(new HaltNode(control(), frameptr(), "Requested Halt!")); > + C->root()->add_req(halt); > + set_control(halt); > + } > + > #ifdef ASSERT > // Narrow receiver type when it is too broad for the method being parsed. 
> if (!method()->is_static()) { > > > "java -XX:+UseNewCode -version" shows the following output (when no hsdis lib is provided): > > Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) > V [jvm.dll+0x6ca5b9] os::win32::platform_print_native_stack+0xd9 (os_windows_x86.cpp:236) > V [jvm.dll+0x8a3afa] VMError::report+0xd6a (vmError.cpp:973) > V [jvm.dll+0x8a5cde] VMError::report_and_die+0x5fe (vmError.cpp:1765) > V [jvm.dll+0x283061] report_fatal+0x71 (debug.cpp:212) > V [jvm.dll+0x621c3e] MacroAssembler::debug64+0x8e (macroAssembler_x86.cpp:829) > C 0x000001635fe021f4 > > > called by the following code: > Compiled method (c2) 87 16 4 java.lang.Object::<init> (1 bytes) > total in heap [0x000001635fe02010,0x000001635fe02250] = 576 > relocation [0x000001635fe02170,0x000001635fe02188] = 24 > main code [0x000001635fe021a0,0x000001635fe02200] = 96 > stub code [0x000001635fe02200,0x000001635fe02218] = 24 > metadata [0x000001635fe02218,0x000001635fe02220] = 8 > scopes data [0x000001635fe02220,0x000001635fe02228] = 8 > scopes pcs [0x000001635fe02228,0x000001635fe02248] = 32 > dependencies [0x000001635fe02248,0x000001635fe02250] = 8 > > [Constant Pool (empty)] > > [MachCode] > [Entry Point] > # {method} {0x0000000800478d78} '<init>' '()V' in 'java/lang/Object' > # [sp+0x20] (sp of caller) > 0x000001635fe021a0: 448b 5208 | 49bb 0000 | 0000 0800 | 0000 4d03 | d349 3bc2 > > 0x000001635fe021b4: ; {runtime_call ic_miss_stub} > 0x000001635fe021b4: 0f85 c6c4 | 8fff 6690 | 0f1f 4000 > [Verified Entry Point] > 0x000001635fe021c0: 4881 ec18 | 0000 0048 | 896c 2410 | 4181 7f20 | 0100 0000 | 0f85 1b00 > > 0x0000... I noticed that we should check the result of `print_code` and update `printed_len`. Fixed with 2nd commit. Example output added to the description.
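The bookkeeping discussed in this thread (remember which code has already been printed, and advance the count only when a print actually happened) can be illustrated outside HotSpot with a minimal sketch. This is not the code in vmError.cpp: the names `PrintedLog`, `print_code_once` and `report_code` are hypothetical, and `address` is only a stand-in for HotSpot's type.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

using address = const std::uint8_t*;  // stand-in for HotSpot's address type (assumption)

// Fixed-capacity record of code start addresses that were already printed.
struct PrintedLog {
  static constexpr int capacity = 4;
  address entries[capacity] = {};
  int len = 0;
};

// Hypothetical helper: prints one line for `start` unless it was printed
// before or the log is full. Returns true only if something was printed.
bool print_code_once(std::FILE* out, address start, const PrintedLog& log) {
  for (int i = 0; i < log.len; i++) {
    if (log.entries[i] == start) return false;        // already printed
  }
  if (log.len == PrintedLog::capacity) return false;  // no room to track more
  std::fprintf(out, "code at %p\n", static_cast<const void*>(start));
  return true;
}

// The caller checks the result and only then records the address,
// so the length never counts a print that did not happen.
bool report_code(std::FILE* out, address start, PrintedLog& log) {
  if (!print_code_once(out, start, log)) return false;
  log.entries[log.len++] = start;
  return true;
}
```

Because one `PrintedLog` would be shared by every reporting step, a second request for the same address is skipped no matter which step makes it.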
------------- PR Comment: https://git.openjdk.org/jdk/pull/14358#issuecomment-1584338427 From mdoerr at openjdk.org Fri Jun 9 10:24:10 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 9 Jun 2023 10:24:10 GMT Subject: RFR: 8309613: [Windows] hs_err files sometimes miss information about the code containing the error [v2] In-Reply-To: References: Message-ID: On Thu, 8 Jun 2023 02:10:49 GMT, David Holmes wrote: >> Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: >> >> Check result of print_code and update printed_len. > > src/hotspot/share/utilities/vmError.cpp line 976: > >> 974: // We have printed the native stack in platform-specific code >> 975: // Windows/x64 needs special handling. >> 976: // Stack walking may got stuck. Try to print the calling code. > > Nit: s/got/get/ Done. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14358#discussion_r1224127180 From jsjolen at openjdk.org Fri Jun 9 10:25:12 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 9 Jun 2023 10:25:12 GMT Subject: RFR: 8309717: C2: Remove Arena::move_contents usage Message-ID: Hi, Instead of using `Arena::move_contents` we can just see the arena swap as a form of double buffering, reducing this to a pointer swap and a clear. This allows us to remove `Arena::move_contents`, cleaning up the arena code. Since this requires allocating another pointer for `Compile`, I took the time to move some members around in order to reduce the padding. This means that this patch does *not* introduce a size change for `Compile`. I'm currently running tier1-3 tests.
Thanks for considering this, Johan ------------- Commit messages: - Rewrite the comments to reflect current source - Move around members to reduce padding - Do a pointer swap instead Changes: https://git.openjdk.org/jdk/pull/14391/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14391&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8309717 Stats: 53 lines in 5 files changed: 19 ins; 22 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/14391.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14391/head:pull/14391 PR: https://git.openjdk.org/jdk/pull/14391 From jsjolen at openjdk.org Fri Jun 9 11:57:43 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Fri, 9 Jun 2023 11:57:43 GMT Subject: RFR: 8309717: C2: Remove Arena::move_contents usage In-Reply-To: References: Message-ID: On Fri, 9 Jun 2023 10:17:46 GMT, Johan Sjölen wrote: > Hi, > > Instead of using `Arena::move_contents` we can just see the arena swap as a form of double buffering, reducing this to a pointer swap and a clear. This allows us to remove `Arena::move_contents`, cleaning up the arena code. > > Since this requires allocating another pointer for `Compile`, I took the time to move some members around in order to reduce the padding. This means that this patch does *not* introduce a size change for `Compile`. > > I'm currently running tier1-3 tests. > > Thanks for considering this, > Johan Passes the tests.
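The double-buffering idea in the quoted description can be sketched in standalone C++. This is only an illustration under stated assumptions, not the patch itself: `ToyArena` and `ToyCompile` are hypothetical stand-ins for HotSpot's `Arena` and `Compile`, and the real classes differ in layout and API.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Toy stand-in for an arena: hands out memory and frees all of it at once.
class ToyArena {
 public:
  void* allocate(std::size_t bytes) {
    _chunks.emplace_back(bytes);   // one zero-filled chunk per request
    _used += bytes;
    return _chunks.back().data();
  }
  void clear() {                   // drop every allocation in one step
    _chunks.clear();
    _used = 0;
  }
  std::size_t used() const { return _used; }

 private:
  std::vector<std::vector<unsigned char>> _chunks;
  std::size_t _used = 0;
};

// Hypothetical owner that reaches its two arenas through pointers.
class ToyCompile {
 public:
  ToyCompile() = default;
  ToyCompile(const ToyCompile&) = delete;             // pointers refer to own members
  ToyCompile& operator=(const ToyCompile&) = delete;

  ToyArena* current() { return _current; }
  ToyArena* old() { return _old; }

  // Double buffering: exchanging two pointers replaces moving arena contents.
  void swap_arenas() { std::swap(_current, _old); }
  // Once live data has been copied into current(), the old arena is cleared.
  void release_old() { _old->clear(); }

 private:
  ToyArena _a;
  ToyArena _b;
  ToyArena* _current = &_a;
  ToyArena* _old = &_b;
};
```

In this sketch the swap costs two pointer writes regardless of how much was allocated. The extra pointer member it needs is the kind of size cost the description mentions offsetting by reordering members.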
------------- PR Comment: https://git.openjdk.org/jdk/pull/14391#issuecomment-1584458476 From tholenstein at openjdk.org Fri Jun 9 12:16:44 2023 From: tholenstein at openjdk.org (Tobias Holenstein) Date: Fri, 9 Jun 2023 12:16:44 GMT Subject: RFR: JDK-8282797: CompileCommand parsing errors should exit VM [v3] In-Reply-To: <9JYXI2snHKlZ3prTgMdCUsugWnLCiWwc2_9DBUQoBrU=.bbd26296-2d87-4ac4-a915-c57ae2691be3@github.com> References: <4ZMRDekcK4Yc3dS9rHvpqj9NcU6dH30aR7u2wtrc_Ac=.4f6c76cd-636a-445f-99e3-973e6fc51360@github.com> <9JYXI2snHKlZ3prTgMdCUsugWnLCiWwc2_9DBUQoBrU=.bbd26296-2d87-4ac4-a915-c57ae2691be3@github.com> Message-ID: On Wed, 31 May 2023 18:51:42 GMT, Vladimir Kozlov wrote: >> Tobias Holenstein has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update Scenario.java >> - Update compilerOracle.cpp > > Update is good. Thanks @vnkozlov , @TobiHartmann and @chhagedorn for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/13753#issuecomment-1584480044 From tholenstein at openjdk.org Fri Jun 9 12:21:45 2023 From: tholenstein at openjdk.org (Tobias Holenstein) Date: Fri, 9 Jun 2023 12:21:45 GMT Subject: RFR: JDK-8282797: CompileCommand parsing errors should exit VM [v3] In-Reply-To: <9JYXI2snHKlZ3prTgMdCUsugWnLCiWwc2_9DBUQoBrU=.bbd26296-2d87-4ac4-a915-c57ae2691be3@github.com> References: <4ZMRDekcK4Yc3dS9rHvpqj9NcU6dH30aR7u2wtrc_Ac=.4f6c76cd-636a-445f-99e3-973e6fc51360@github.com> <9JYXI2snHKlZ3prTgMdCUsugWnLCiWwc2_9DBUQoBrU=.bbd26296-2d87-4ac4-a915-c57ae2691be3@github.com> Message-ID: On Wed, 31 May 2023 18:51:42 GMT, Vladimir Kozlov wrote: >> Tobias Holenstein has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update Scenario.java >> - Update compilerOracle.cpp > > Update is good. Thanks @vnkozlov , @TobiHartmann and @chhagedorn for the reviews! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/13753#issuecomment-1584484867 From tholenstein at openjdk.org Fri Jun 9 13:03:51 2023 From: tholenstein at openjdk.org (Tobias Holenstein) Date: Fri, 9 Jun 2023 13:03:51 GMT Subject: RFR: JDK-8282797: CompileCommand parsing errors should exit VM [v3] In-Reply-To: <9JYXI2snHKlZ3prTgMdCUsugWnLCiWwc2_9DBUQoBrU=.bbd26296-2d87-4ac4-a915-c57ae2691be3@github.com> References: <4ZMRDekcK4Yc3dS9rHvpqj9NcU6dH30aR7u2wtrc_Ac=.4f6c76cd-636a-445f-99e3-973e6fc51360@github.com> <9JYXI2snHKlZ3prTgMdCUsugWnLCiWwc2_9DBUQoBrU=.bbd26296-2d87-4ac4-a915-c57ae2691be3@github.com> Message-ID: <-JXh4mBUe792JGWakWq-nxf5uA739OIgfyR8YxDyobY=.681b54fc-d145-4224-9919-9162eadd8182@github.com> On Wed, 31 May 2023 18:51:42 GMT, Vladimir Kozlov wrote: >> Tobias Holenstein has updated the pull request incrementally with two additional commits since the last revision: >> >> - Update Scenario.java >> - Update compilerOracle.cpp > > Update is good. Thanks @vnkozlov , @TobiHartmann and @chhagedorn for the reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/13753#issuecomment-1584538189 From tholenstein at openjdk.org Fri Jun 9 13:03:53 2023 From: tholenstein at openjdk.org (Tobias Holenstein) Date: Fri, 9 Jun 2023 13:03:53 GMT Subject: Integrated: JDK-8282797: CompileCommand parsing errors should exit VM In-Reply-To: References: Message-ID: <4jvG6LBFzDziGmAtliU9j9VpxzrKEwchDa5UTklVYHg=.83c2eaa6-18d4-428f-9db1-74b96a504587@github.com> On Tue, 2 May 2023 11:35:54 GMT, Tobias Holenstein wrote: > Currently, errors during compile command parsing just print an error but don't exit the VM. As a result, issues go unnoticed. > > With this PR the behavior is changed to exit the VM when an error occurs. > > E.g. `java -XX:CompileCommand=compileonly,HashMap:: -version` will exit the VM after a parsing error occurred.
> > CompileCommand: An error occurred during parsing > Error: Could not parse method pattern > Line: 'compileonly,HashMap::' > > Usage: '-XX:CompileCommand=