From xgong at openjdk.org Fri Dec 1 01:16:15 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Fri, 1 Dec 2023 01:16:15 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5] In-Reply-To: <9VeMdTAJPaPZDg9ZW7FVJOf9XGl4gGqAS-2g8SFc9c0=.36792cd5-66d9-4abc-ba0c-aee3478627f4@github.com> References: <9VeMdTAJPaPZDg9ZW7FVJOf9XGl4gGqAS-2g8SFc9c0=.36792cd5-66d9-4abc-ba0c-aee3478627f4@github.com> Message-ID: On Thu, 30 Nov 2023 11:13:14 GMT, Andrew Haley wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: >> >> Rename vmath to sleef in configure > > make/autoconf/lib-sleef.m4 line 56: > >> 54: AC_MSG_CHECKING([for the specified LIBSLEEF]) >> 55: if test -e ${with_libsleef}/lib/libsleef.so && >> 56: test -e ${with_libsleef}/include/sleef.h; then > > This fails on my system because libsleef is in `/usr/local/lib64/`. This is the correct place to look according to the Linux FHS. You should _not_ hard-code `/lib` Did you try to find the libsleef by passing `--with-libsleef=` ? Currently `--with-libsleef=` can only work for people manually built from sleef source code. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1411488447 From xgong at openjdk.org Fri Dec 1 01:22:15 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Fri, 1 Dec 2023 01:22:15 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5] In-Reply-To: References: Message-ID: On Thu, 30 Nov 2023 20:13:06 GMT, Magnus Ihse Bursie wrote: > Not having a build time dependency on libsleef means you cannot really verify that the functions you want to call are correct, but maybe you feel secure that they will never change? I'm not sure. The main reason that we add such a wrapper library is to catch the sleef's ABI version changing earlier (i.e. at build time). 
So using .s code and not including sleef at build time would not meet this requirement? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1835248759 From xgong at openjdk.org Fri Dec 1 01:33:12 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Fri, 1 Dec 2023 01:33:12 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5] In-Reply-To: References: Message-ID: On Thu, 30 Nov 2023 06:39:43 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on the AArch64 platform, which causes a large performance gap on AArch64. Note that those APIs are optimized by the C2 compiler on X86 platforms by calling Intel's SVML code [1]. To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which is available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match the Vector API's requirement and implement the math ops on AArch64, we 1) by default call the libsleef stubs that provide 1.0 ULP accuracy and use FMA instructions for most of the operations, and 2) add the vector calling convention for the runtime calls to stub code in libsleef. Note that for those APIs for which libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help load the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to specify the libsleef path/name explicitly. By default, it points to the system-installed library. If the library does not exist or loading it dynamically at runtime fails, the math vector ops fall back to the default scalar version without error. But a warning is printed if a nonexistent library is specified explicitly. 
>> >> Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedback from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Rename vmath to sleef in configure > Okay, now I found a few more of your comments that I missed before. I apologize, the Github PR review UI can be a bit confusing when discussions are taking place in multiple locations. So, here's a revision to my list above: > > 1. An aarch64 CPU can have both Neon and SVE present at the same time. > 2. You are assuming that Neon is always present, and what I referred to as the fallback case is in fact using Neon instead of SVE. > 3. You would like to split vect_math.c into two parts, e.g. vect_math_neon.c and vect_math_sve.c. > 4. You will then use heuristics in hotspot to determine at runtime if SVE or Neon functionality should be used. Even if SVE is present on the runtime machine, heuristics can choose to use the Neon implementation anyway in some cases. > 5. Only vect_math_sve.c needs the -march+sve flag. Yes, all the above are true. > 6. The neon part does not need the -march+sve flag, and will fail if built with this flag. (???) The current NEON code does not need the `-march=arm-a+sve` flag, but it will not fail if we build with it, because the C compiler (GCC/Clang) can support SVE and NEON at the same time. My concern is that if we build the NEON code with SVE flags, the compiler might generate SVE-specific instructions inside the NEON functions in the future (e.g. 
if sve has new features related to method call) , although it doesn't happen now. If so, calling the neon stubs (which contains sve instructions) on non-sve supported machines in runtime may crash. Hence, it's more safe if we can separate neon and sve code and use different flags for them. > Anyway, it is straightforward to add compiler flags to individual files. You do it like this: > > $(eval $(call SetupJdkLibrary, BUILD_LIBVMATH, \ NAME := vmath, \ CFLAGS := $(CFLAGS_JDKLIB) $(LIBSLEEF_CFLAGS) -fvisibility=default, \ vect_math_sve.c_CFLAGS := $(SVE_CFLAGS), \ ... Thanks so much for this. That's very helpful! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1835258683 From cjplummer at openjdk.org Fri Dec 1 01:58:11 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 1 Dec 2023 01:58:11 GMT Subject: RFR: 8308614: Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 [v5] In-Reply-To: References: Message-ID: On Thu, 30 Nov 2023 21:27:27 GMT, Serguei Spitsyn wrote: >> This is a fix of a performance/scalability related issue. The `JvmtiThreadState` objects for virtual thread filtered events enabled globally are created eagerly because it is needed when the `interp_only_mode` is enabled. Otherwise, some events which are generated in `interp_only_mode` from the debugging version of interpreter chunks can be missed. >> However, it has to be okay to avoid eager creation of these object if no `interp_only_mode` has ever been requested. >> It seems to be an extremely important optimization to create JvmtiThreadState objects lazily in such cases. >> It is done by introducing the flag `JvmtiThreadState::_seen_interp_only_mode` which indicates when the `JvmtiThreadState` objects have to be created eagerly. 
>> >> Additionally, the fix includes the following related changes: >> - Use condition double checking idiom for `MutexLocker mu(JvmtiThreadState_lock)` in the function `JvmtiVTMSTransitionDisabler::VTMS_mount_end` which is on a performance-critical path and looks like this: >> >> JvmtiThreadState* state = thread->jvmti_thread_state(); >> if (state != nullptr && state->is_pending_interp_only_mode()) { >> MutexLocker mu(JvmtiThreadState_lock); >> state = thread->jvmti_thread_state(); >> if (state != nullptr && state->is_pending_interp_only_mode()) { >> JvmtiEventController::enter_interp_only_mode(); >> } >> } >> >> >> - Add extra check of `JvmtiExport::can_support_virtual_threads()` when virtual thread mount and unmount are posted. >> - Minor: Added a `ThreadsListHandle` to the `JvmtiEventControllerPrivate::enter_interp_only_mode`. It is needed because of the dynamic creation of compensating carrier threads which is racy for JVMTI `SetEventNotificationMode` implementation. >> >> Performance mesurements: >> - Without this fix the test provided by the bug submitter gives execution numbers: >> - no ClassLoad events enabled: 3251 ms >> - ClassLoad events are enabled: 40534 ms >> >> - With the fix: >> - no ClassLoad events enabled: 3270 ms >> - ClassLoad events are enabled: 3385 ms >> >> Testing: >> - Ran mach5 tiers 1-6, no regressions are noticed > > Serguei Spitsyn has updated the pull request incrementally with two additional commits since the last revision: > > - review: one more minor correction of a comment > - review: minor correction of a comment Marked as reviewed by cjplummer (Reviewer). 
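For readers less familiar with the idiom, the double-checked pattern quoted above can be sketched in standalone C++ roughly as follows. This is only an illustration under stated assumptions: `ThreadState`, `state_lock` and `enter_interp_only_if_pending` are made-up stand-ins, not the HotSpot `JvmtiThreadState`, `JvmtiThreadState_lock` or `JvmtiEventController` code, and the sketch takes the state pointer as a parameter instead of re-reading it from the thread.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-in for per-thread JVMTI state (not the HotSpot class).
struct ThreadState {
  std::atomic<bool> pending_interp_only{false};
  int interp_only_entries = 0;  // only modified while holding state_lock
};

// Plays the role that JvmtiThreadState_lock plays in the quoted snippet.
static std::mutex state_lock;

// Returns true if this call performed the pending transition.
bool enter_interp_only_if_pending(ThreadState* state) {
  // First check, without the lock: the common case returns here cheaply.
  if (state == nullptr ||
      !state->pending_interp_only.load(std::memory_order_acquire)) {
    return false;
  }
  std::lock_guard<std::mutex> guard(state_lock);
  // Second check, under the lock: another thread may have handled the
  // request between the first check and the lock acquisition.
  if (!state->pending_interp_only.load(std::memory_order_relaxed)) {
    return false;
  }
  state->interp_only_entries++;
  state->pending_interp_only.store(false, std::memory_order_release);
  return true;
}
```

The point of the shape is that the lock is only taken when the unlocked check says there may be work to do, which is what keeps a performance-critical path cheap.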
------------- PR Review: https://git.openjdk.org/jdk/pull/16686#pullrequestreview-1758873109 From stuefe at openjdk.org Fri Dec 1 05:18:26 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 1 Dec 2023 05:18:26 GMT Subject: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11] In-Reply-To: References: Message-ID: On Wed, 29 Nov 2023 11:49:31 GMT, Jaroslav Bachorik wrote: >> Please, review this fix for a corner case handling of `jmethodID` values. >> >> The issue is related to the interplay between `jmethodID` values and method redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` instance. Once that method gets redefined, the `jmethodID` is updated to point to the last `Method` version. >> Unless the method is still on stack/running, in which case the original `jmethodID` will be redirected to the latest `Method` version and at the same time the 'previous' `Method` version will receive a new `jmethodID` pointing to that previous version. >> >> If we happen to capture stacktrace via `GetStackTrace` or `GetAllStackTraces` JVMTI calls while this previous `Method` version is still on stack we will have the corresponding frame identified by a `jmethodID` pointing to that version. >> However, sooner or later the 'previous' class version becomes eligible for cleanup at what time all contained `Method` instances. The cleanup process will not perform the `jmethodID` pointer maintenance and we will end up with pointers to deallocated memory. >> This is caused by the fact that the `jmethodID` lifecycle is bound to `ClassLoaderData` instance and all relevant `jmethodID`s will get batch-updated when the class loader is being released and all its classes are getting unloaded. >> >> This means that we need to make sure that if a `Method` instance is being deallocate the associated `jmethodID` (if any) must not point to the deallocated instance once we are finished. 
Unfortunately, we can not just update the `jmethodID` values in bulk when purging an old class version - the per `InstanceKlass` jmethodID cache is present only for the main class version and contains `jmethodID` values for both the old and current method versions. >> >> ~Therefore we need to perform `jmethodID` lookup when we are about to deallocate a `Method` instance and clean up the pointer only if that `jmethodID` is pointing to the `Method` instance which is being deallocated.~ >> >> Therefore, we need to perform `jmethodID` lookup for each method in an old class version that is getting purged, and null out the pointer of that `jmethodID` to break the link from `jmethodID` to the method instance that is about to get deallocated. >> >> _(For anyone interested, a much lengthier writeup is available in [my blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ > > Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: > > Restrict cleanup to obsolete methods only I won't be able to review this this week, too snowed in atm. I can take a look next week. We can always just revert the change if needed. Thinking about Skara, I think as long as we have this confusing mixture of rules (hotspot wants 2 reviewers that are Reviewer/Committer, but some jdk libs only want one, but then you need two for desktop I think otherwise Phil gets angry) - we should hard-code the 2-reviewer rule into skara as default since it affects the lion's share of all changes. 
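The cleanup described in the quoted PR text (null out the `jmethodID` link before the old `Method` is deallocated) can be pictured with a small standalone C++ model. This is a sketch only, assuming a simple "ID is a pointer to an indirection cell" model; `MethodIdTable`, `make_id`, `clear_ids_for` and `resolve` are hypothetical names and do not correspond to the HotSpot implementation.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical model: a method, and an ID that points at an indirection cell.
struct Method { std::string name; };
using MethodCell = Method*;    // the cell holds the current Method pointer
using MethodId = MethodCell*;  // analogue of jmethodID: pointer to the cell

class MethodIdTable {
 public:
  // Hands out a stable ID for a method; the cell outlives the method.
  MethodId make_id(Method* m) {
    _cells.push_back(std::make_unique<MethodCell>(m));
    return _cells.back().get();
  }
  // Must run before 'm' is freed: break the link from every ID to 'm'.
  void clear_ids_for(const Method* m) {
    for (auto& cell : _cells) {
      if (*cell == m) *cell = nullptr;
    }
  }
 private:
  // Cells are heap-allocated individually, so IDs stay valid as the
  // vector grows.
  std::vector<std::unique_ptr<MethodCell>> _cells;
};

// A consumer that resolves an ID sees nullptr instead of freed memory.
Method* resolve(MethodId id) { return id == nullptr ? nullptr : *id; }
```

In this model a stack trace captured earlier can keep holding the ID; after the old method version is purged, resolving it yields a null pointer that callers can check, rather than a dangling one.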
------------- PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1835474161 From haosun at openjdk.org Fri Dec 1 05:24:08 2023 From: haosun at openjdk.org (Hao Sun) Date: Fri, 1 Dec 2023 05:24:08 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v4] In-Reply-To: References: Message-ID: On Wed, 29 Nov 2023 22:40:35 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. >> >> # Testing status >> >> ## tier1 >> >> | | Win | Mac | Linux | >> |----------|---------|---------|---------| >> | ARM64 | | | | >> | ARM32 | | | | >> | x86 | | | | >> | x64 | | | | >> | PPC64 | | | | >> | S390x | | | | >> | RiscV | | | | > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - Fix merge > - Catch up with master branch. > > Merge remote-tracking branch 'origin/master' into reuse-macroasm > - Some inst_mark fixes; Catch up with master. > - Catch up with changes on master > - Reuse same C2_MacroAssembler object to emit instructions. src/hotspot/share/jvmci/jvmciCodeInstaller.cpp line 1246: > 1244: if (_next_call_type == INVOKESTATIC || _next_call_type == INVOKESPECIAL) { > 1245: // Need a static call stub for transitions from compiled to interpreted. > 1246: C2_MacroAssembler masm(&buffer); Hi, I encountered one build failure: JDK build **without C2** fails on Linux/AArch64. 
The configure I used --with-debug-level=release --with-jvm-features=-compiler2 --disable-precompiled-headers The error log === Output from failing command(s) repeated here === * For target hotspot_variant-server_libjvm_gtest_objs_BUILD_GTEST_LIBJVM_link: /usr/bin/ld: /tmp/build-release/hotspot/variant-server/libjvm/objs/jvmciCodeInstaller.o: in function `C2_MacroAssembler::C2_MacroAssembler(CodeBuffer*)': make/hotspot/src/hotspot/share/opto/c2_MacroAssembler.hpp:38: undefined reference to `vtable for C2_MacroAssembler' /usr/bin/ld: make/hotspot/src/hotspot/share/opto/c2_MacroAssembler.hpp:38: undefined reference to `vtable for C2_MacroAssembler' collect2: error: ld returned 1 exit status * For target hotspot_variant-server_libjvm_objs_BUILD_LIBJVM_link: /usr/bin/ld: /tmp/build-release/hotspot/variant-server/libjvm/objs/jvmciCodeInstaller.o: in function `C2_MacroAssembler::C2_MacroAssembler(CodeBuffer*)': make/hotspot/src/hotspot/share/opto/c2_MacroAssembler.hpp:38: undefined reference to `vtable for C2_MacroAssembler' /usr/bin/ld: make/hotspot/src/hotspot/share/opto/c2_MacroAssembler.hpp:38: undefined reference to `vtable for C2_MacroAssembler' collect2: error: ld returned 1 exit status * All command lines available in /tmp/build-release/make-support/failure-logs. 
=== End of repeated output === I suggest making the following change: Suggestion: MacroAssembler masm(&buffer); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16484#discussion_r1411551297 From duke at openjdk.org Fri Dec 1 07:52:10 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Fri, 1 Dec 2023 07:52:10 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v7] In-Reply-To: References: Message-ID: <3qPd3ZhLhbYsRHcljYxk_m3C-4CmwBGyUADUAil64a8=.a15fd7e3-1c9b-4be1-b67c-24967427ef5f@github.com> On Thu, 30 Nov 2023 11:32:23 GMT, Hamlin Li wrote: >> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: >> >> Use concrete registers for input parameters. > > src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1479: > >> 1477: case T_SHORT: BLOCK_COMMENT("arrays_hashcode(short) {"); break; >> 1478: case T_INT: BLOCK_COMMENT("arrays_hashcode(int) {"); break; >> 1479: default: BLOCK_COMMENT("arrays_hashcode {"); break; > > Is this `BLOCK_COMMENT("arrays_hashcode {"); break;` necessary? I just borrowed that part of the code from its x86 counterpart: https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp#L3354 It is dead code, so `ShouldNotReachHere();` looks more appropriate here. Do you think we should fix this as part of this patch, or in a follow-up for both x86 and RISC-V? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1411732405 From epeter at openjdk.org Fri Dec 1 08:27:07 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 1 Dec 2023 08:27:07 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v4] In-Reply-To: References: Message-ID: <_mYSInzsMDBTfmoIIh7_9Rdgaap8vQO7pe8D-30NGwQ=.d9e990ff-8f97-43d7-a22d-fde755a70035@github.com> On Thu, 30 Nov 2023 18:21:29 GMT, Tom Rodriguez wrote: >> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: >> >> - manual merge with master after JDK-8267532 >> - more locking, still fails tho - WIP >> - adding more verification and more locking, WIP >> - add locks for jvmci calls to allocate_bci_to_data >> - 8306767 > > I agree with Roland that none of the accesses from C++ code are performance critical so always requiring a lock shouldn't matter for performance. It is somewhat intrusive though. The alternative is to make the API distinguish clearly between preallocated data and the extra data and adjust all internal usages to select the right one. For instance speculative_trap_data_tag usages are only in the extra_data section so access to those could have a distinct API. The only other thing in extra_data are bit_data_tag so any caller that's looking for a different kind of record is always safe since it must be from the preallocated section. A change like that might be clarifying in general as well but I think there's a question of effort vs benefit. > > Also to clarify, I never actually observed this problem in practice but inferred the possibility while addressing MDO concurrency issues with Graal. It would be very hard to notice and very transient but it could lead to crashes since SpeculativeTrapData contains a Method*. @tkrodriguez Yes, we could now do quite the refactoring. The question is how far I should take this. 
I think I will just make things safe with locks now, and then if someone really wants to fully refactor all of this, they can do that separately. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16840#issuecomment-1835673004 From rehn at openjdk.org Fri Dec 1 08:46:07 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 1 Dec 2023 08:46:07 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg In-Reply-To: References: Message-ID: On Thu, 30 Nov 2023 17:48:11 GMT, Ludovic Henry wrote: > 8315856: RISC-V: Use Zacas extension for cmpxchg Thank you, looks good. ------------- Marked as reviewed by rehn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16910#pullrequestreview-1759287998 From rehn at openjdk.org Fri Dec 1 08:48:06 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 1 Dec 2023 08:48:06 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr In-Reply-To: References: Message-ID: <6yxWGO7Wu4Wg6hXrnic-R2p0IBf10YDrKyIGos_gVkI=.1df5c691-d184-4c29-9838-6415ade49024@github.com> On Wed, 29 Nov 2023 11:58:31 GMT, Gui Cao wrote: > MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header is non-trivial on the linux-riscv64 platform. Passing t0 (aka x5) as a temporary register to these functions is error-prone: as a reserved scratch register, t0 is implicitly clobbered by various assembler functions. @robehn can you help review this PR? > This issue tracks avoiding passing t0 as a temporary register in the following cases: > 1. avoid passing t0 as temp register to MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header. > 2. avoid passing t0 as temp register to x_load_barrier and x_load_barrier_slow_path function in x_riscv.ad > 3. 
avoid passing t0 as temp register to z_store_barrier and z_color function in z_riscv.ad > > Note that I didn't touch MacroAssembler::cmpxchg because it seems to me that this function is designed that it allows t0 to be used as the result register. As the result register will be set on exits, there should be no risk when using t0 for receiving the result. > https://github.com/openjdk/jdk/blob/e44d4b24ed794957c47c140ab6f15544efa2b278/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L2910-L2925 > > ### Testing: > - [x] Run tier1-3 tests on qemu 8.1.50 with UseRVV (release) > - [x] Run tier1-3 tests with SiFive unmatched (release) My time is a bit constrained right now, hopefully I can look at this next week. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16880#issuecomment-1835697810 From xgong at openjdk.org Fri Dec 1 08:48:52 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Fri, 1 Dec 2023 08:48:52 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: References: Message-ID: > Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64. Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). > > SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. 
> > To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error. But a warning is printed out if people specifies a nonexistent library explicitly. > > Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedbacks from more hotspot experts. > > [1] https://github.com/openjdk/jdk/pull/3638 > [2] https://sleef.org/ > [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ > [4] https://packages.debian.org/bookworm/libsleef3 > [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains ten additional commits since the last revision: - Separate neon and sve functions into two source files - Merge branch 'jdk:master' into JDK-8312425 - Rename vmath to sleef in configure - Address review comments in build system - Add a bundled native lib in jdk as a bridge to libsleef - Merge 'jdk:master' into JDK-8312425 - Disable sleef by default - Merge 'jdk:master' into JDK-8312425 - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16234/files - new: https://git.openjdk.org/jdk/pull/16234/files/c1ce1968..ee5caf6d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16234&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16234&range=04-05 Stats: 638331 lines in 1866 files changed: 100400 ins; 474467 del; 63464 mod Patch: https://git.openjdk.org/jdk/pull/16234.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16234/head:pull/16234 PR: https://git.openjdk.org/jdk/pull/16234 From never at openjdk.org Fri Dec 1 08:49:10 2023 From: never at openjdk.org (Tom Rodriguez) Date: Fri, 1 Dec 2023 08:49:10 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v4] In-Reply-To: References: Message-ID: <6CTbUBPmPjgvR_Rk6lQbJhvCuOgEqAVIHUPr-QLCx1c=.a470f8d8-51f2-4d3f-b080-cc30a7c8e70b@github.com> On Thu, 30 Nov 2023 16:16:54 GMT, Emanuel Peter wrote: >> I'm making sure that `allocate_bci_to_data` is only called when holding the `extra_data_lock`, so that no concurrent calls of it can ever occur. >> >> Testing: tier1-3 and stress. > > Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains five commits: > > - manual merge with master after JDK-8267532 > - more locking, still fails tho - WIP > - adding more verification and more locking, WIP > - add locks for jvmci calls to allocate_bci_to_data > - 8306767 Sounds reasonable to me. 
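As a side note on the "only call `allocate_bci_to_data` while holding `extra_data_lock`" rule discussed in this thread, one way to make such a rule visible in a signature is to require the caller to pass the lock guard as a token. The sketch below is standalone C++ under stated assumptions: `ProfileStore`, `record_trap` and the `std::map` storage are hypothetical and unrelated to the real `MethodData` layout; only the two names taken from the thread are reused for readability.

```cpp
#include <map>
#include <mutex>

// Hypothetical container standing in for the extra-data section.
class ProfileStore {
 public:
  std::mutex& extra_data_lock() { return _lock; }

  // The guard parameter documents (and loosely enforces) that the caller
  // holds a lock. It does not prove the guard wraps *this* store's mutex.
  int* allocate_bci_to_data(int bci,
                            const std::lock_guard<std::mutex>& held) {
    (void)held;
    return &_extra[bci];  // creates a zero-initialized entry if missing
  }

 private:
  std::mutex _lock;
  std::map<int, int> _extra;  // bci -> counter; node addresses are stable
};

// Typical caller: take the lock, then allocate and update under it.
int record_trap(ProfileStore& store, int bci) {
  std::lock_guard<std::mutex> guard(store.extra_data_lock());
  int* slot = store.allocate_bci_to_data(bci, guard);
  return ++*slot;
}
```

HotSpot code would more likely assert lock ownership directly; the token parameter is simply the closest equivalent available with the standard library mutex.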
------------- PR Comment: https://git.openjdk.org/jdk/pull/16840#issuecomment-1835699832 From mbaesken at openjdk.org Fri Dec 1 09:05:22 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 1 Dec 2023 09:05:22 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: > [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. > However, the information is not added to the JFR events, and this should be improved. > The existing NativeLibraryLoad event can be used to store the additional information, because the IEEE conformance check and the fenv get/set are placed in the HS dlopen_helper, where the NativeLibraryLoad event objects are already created/committed. 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Adjust macOS coding ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16903/files - new: https://git.openjdk.org/jdk/pull/16903/files/976d7fcd..c7e63a27 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16903&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16903&range=00-01 Stats: 81 lines in 2 files changed: 19 ins; 41 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/16903.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16903/head:pull/16903 PR: https://git.openjdk.org/jdk/pull/16903 From tschatzl at openjdk.org Fri Dec 1 09:15:08 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Fri, 1 Dec 2023 09:15:08 GMT Subject: RFR: 8317809: Insertion of free code blobs into code cache can be very slow during class unloading [v3] In-Reply-To: References: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> Message-ID: On Thu, 30 Nov 2023 10:34:37 GMT, Thomas Schatzl wrote: >> src/hotspot/share/gc/shared/classUnloadingContext.hpp line 63: >> >>> 61: }; >>> 62: >>> 63: class DefaultClassUnloadingContext : public ClassUnloadingContext { >> >> I don't understand why they need to be two classes, even after reading "These are the reason for the class hierarchy for...". The reference to future/other PR(s) in the description doesn't really help -- it's unclear what is *necessary* for the current PR and what is preparation for future PR(s). > > The base class is unnecessary for this change, but very nice to have for future changes. I'll just merge them for now, and separate them again later. As explained in the now removed original description: >These are the reason for the class hierarchy for `ClassUnloadingContext`: the goal is to ultimately have about this phasing (for G1): >1. collect all dead CLDs, using the `register_unloading_class_loader_data` method *only* >2. 
parallelize the stuff in `ClassLoaderData::unload()` in one way or another, adding them to the `complete_cleaning` >(parallel) phase. > >Particularly the split of `SystemDictionary::do_unloading` into "only" traversing the CLDs to find the dead ones and then in parallel process them in 2. above warrants a separate `ClassUnloadingContext` (to facilitate parallelism). I.e. the non-parallelized case does not need the necessary data structure complications and helper methods to do efficient parallel iteration. However as mentioned I removed the class hierarchy for this change as it's unnecessary for now; let's discuss this hierarchy separately. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16759#discussion_r1411815090 From gcao at openjdk.org Fri Dec 1 09:36:03 2023 From: gcao at openjdk.org (Gui Cao) Date: Fri, 1 Dec 2023 09:36:03 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr In-Reply-To: <6yxWGO7Wu4Wg6hXrnic-R2p0IBf10YDrKyIGos_gVkI=.1df5c691-d184-4c29-9838-6415ade49024@github.com> References: <6yxWGO7Wu4Wg6hXrnic-R2p0IBf10YDrKyIGos_gVkI=.1df5c691-d184-4c29-9838-6415ade49024@github.com> Message-ID: On Fri, 1 Dec 2023 08:45:00 GMT, Robbin Ehn wrote: > My time is a bit constrained right now, hopefully I can look at this next week. OK. There is no hurry. Take your time. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16880#issuecomment-1835766167 From stefank at openjdk.org Fri Dec 1 09:51:17 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 1 Dec 2023 09:51:17 GMT Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC Message-ID:

There is code in `os::large_page_init()` that checks `/sys/kernel/mm/transparent_hugepage/enabled` and forcefully turns off `UseTransparentHugePages` if anonymous THPs are disabled in the OS:

    if (UseTransparentHugePages && !HugePages::supports_thp()) {
      if (!FLAG_IS_DEFAULT(UseTransparentHugePages)) {
        log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.");
      }
      UseLargePages = UseTransparentHugePages = false;
      return;
    }

This is problematic because ZGC doesn't use the `/sys/kernel/mm/transparent_hugepage/enabled` THPs, but instead the `/sys/kernel/mm/transparent_hugepage/shmem_enabled` THPs. So, with the following settings:

    /sys/kernel/mm/transparent_hugepage/enabled: never
    /sys/kernel/mm/transparent_hugepage/shmem_enabled: advise

the above code will force ZGC to run without THPs.

This PR is a proposal for how to work around this in the ZGC code without disturbing the rest of the JVM too much. The patch:

1) remembers the initial values for UseLargePages and UseTransparentHugePages and saves those so that ZGC can continue using THPs even though they have been disabled for the rest of the JVM.

2) adds better logic to figure out if ZGC is actually going to get THPs for the heap or not. This is then used to more accurately log the current situation and allows for a precise usage of `madvise + MADV_HUGEPAGE`.

3) tweaks the generic pagesize logging to better reflect the situation when anonymous THPs are disabled but shared memory THPs are enabled and ZGC is used.
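To make the distinction between the two sysfs files concrete, here is a small standalone C++ sketch. It is not the code in this PR: `selected_thp_mode` and `shmem_thp_expected` are hypothetical helpers, the only kernel fact relied on is that these files mark the active mode with square brackets (e.g. `always madvise [never]`), and the decision function merely mirrors the shared-memory tables in this mail.

```cpp
#include <string>

// Extracts the bracketed (active) mode from sysfs contents such as
// "always madvise [never]". Returns "" if no bracketed token is found.
std::string selected_thp_mode(const std::string& sysfs_contents) {
  const std::size_t open = sysfs_contents.find('[');
  if (open == std::string::npos) return "";
  const std::size_t close = sysfs_contents.find(']', open);
  if (close == std::string::npos) return "";
  return sysfs_contents.substr(open + 1, close - open - 1);
}

// Whether a shared-memory backed heap is expected to get THPs, given the
// active shmem_enabled mode and whether the JVM would madvise the heap
// (i.e. whether -XX:+UseTransparentHugePages was requested).
bool shmem_thp_expected(const std::string& shmem_mode, bool use_thp_flag) {
  if (shmem_mode == "always" || shmem_mode == "within_size" ||
      shmem_mode == "force") {
    return true;          // applied by the OS regardless of the flag
  }
  if (shmem_mode == "advise") {
    return use_thp_flag;  // only when the JVM asks via madvise
  }
  return false;           // "never", "deny", or unrecognized contents
}
```

With the settings from the example above (anonymous `never`, shared memory `advise`), `shmem_thp_expected("advise", true)` is true even though the anonymous mode parsed from the first file is `never`, which is the case the current generic check gets wrong for ZGC.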
The result of this change can be seen in these tables:

ZGC large pages log output:

E (T)     = Enabled (Transparent)
E (T, OS) = Enabled (Transparent, OS enforced)
D         = Disabled
D (OS)    = Disabled (OS enforced)

-XX:+UseTransparentHugePages

shmem \ anon | always | madvise | never
-------------+--------+---------+--------
always       | E (T)  | E (T)   | E (T)
within_size  | E (T)  | E (T)   | E (T)
advise       | E (T)  | E (T)   | E (T)
never        | D (OS) | D (OS)  | D (OS)
deny         | D (OS) | D (OS)  | D (OS)
force        | E (T)  | E (T)   | E (T)

-XX:-UseTransparentHugePages

shmem \ anon | always    | madvise   | never
-------------+-----------+-----------+-----------
always       | E (T, OS) | E (T, OS) | E (T, OS)
within_size  | E (T, OS) | E (T, OS) | E (T, OS)
advise       | D         | D         | D
never        | D         | D         | D
deny         | D         | D         | D
force        | E (T, OS) | E (T, OS) | E (T, OS)

OS reported usage of shared memory huge pages

Y = Yes
- = No

-XX:+UseTransparentHugePages

shmem \ anon | always | madvise | never
-------------+--------+---------+-------
always       | Y      | Y       | Y
within_size  | Y      | Y       | Y
advise       | Y      | Y       | Y
never        | -      | -       | -
deny         | -      | -       | -
force        | Y      | Y       | Y

-XX:-UseTransparentHugePages

shmem \ anon | always | madvise | never
-------------+--------+---------+-------
always       | Y      | Y       | Y
within_size  | Y      | Y       | Y
advise       | -      | -       | -
never        | -      | -       | -
deny         | -      | -       | -
force        | Y      | Y       | Y

OS reported usage of anonymous memory huge pages

Y = Yes
- = No

-XX:+UseTransparentHugePages

shmem \ anon | always | madvise | never
-------------+--------+---------+-------
always       | Y      | Y       | -
within_size  | Y      | Y       | -
advise       | Y      | Y       | -
never        | Y      | Y       | -
deny         | Y      | Y       | -
force        | Y      | Y       | -

-XX:-UseTransparentHugePages

shmem \ anon | always | madvise | never
-------------+--------+---------+-------
always       | Y      | -       | -
within_size  | Y      | -       | -
advise       | Y      | -       | -
never        | Y      | -       | -
deny         | Y      | -       | -
force        | Y      | -       | -

-------------

Commit messages:
 - 8319969: os::large_page_init() turns off THPs for ZGC

Changes: https://git.openjdk.org/jdk/pull/16690/files
Webrev:
https://webrevs.openjdk.org/?repo=jdk&pr=16690&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8319969 Stats: 303 lines in 10 files changed: 258 ins; 12 del; 33 mod Patch: https://git.openjdk.org/jdk/pull/16690.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16690/head:pull/16690 PR: https://git.openjdk.org/jdk/pull/16690 From stefank at openjdk.org Fri Dec 1 09:55:09 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 1 Dec 2023 09:55:09 GMT Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC In-Reply-To: References: Message-ID: On Thu, 16 Nov 2023 13:30:48 GMT, Stefan Karlsson wrote: > There is code in `os::large_page_init()` that checks `/sys/kernel/mm/transparent_hugepage/enabled` and forcefully turns off `UseTransparentHugePages` if anonymous THPs are disabled in the OS: > > > if (UseTransparentHugePages && !HugePages::supports_thp()) { > if (!FLAG_IS_DEFAULT(UseTransparentHugePages)) { > log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system."); > } > UseLargePages = UseTransparentHugePages = false; > return; > } > > > This is problematic because ZGC doesn't use the `/sys/kernel/mm/transparent_hugepage/enabled` THPs, but instead the `/sys/kernel/mm/transparent_hugepage/shmem_enabled` THPs. So, with the following settings: > > /sys/kernel/mm/transparent_hugepage/enabled: never > /sys/kernel/mm/transparent_hugepage/shmem_enabled: advise > > > the above code will force ZGC to run without THPs. > > This PR is a proposal for how to work around this in the ZGC code without disturbing the the rest of the JVM too much. The patch: > > 1) remembers the initial values for UseLargePages and UseTransparentHugePages and saves those so that ZGC can continue using THPs even though they have been disabled for the rest of the JVM. > > 2) adds better logic to figure out if ZGC is actually going to get THPs for the heap or not. 
This is then used to more accurately log the current situation and allows for a precise usage of `madvise + MADV_HUGEPAGE`. > > 3) tweaks the generic pagesize logging to better reflect the situation when anonymous THPs are disabled but shared memory THPs are enabled and ZGC is used. > > The result of this change can be seen in these tables: > > ZGC large pages log output: > > E (T) = Enabled (Transparent) > E (T, OS) = Enabled (Transparent, OS enforced) > D = Disabled > D = Disabled (OS enforced) > > -XX:+UseTransparentHugePages > > shem \ anon | always | madvise | never > ------------+--------+---------+------- > always | E (T) | E (T) | E (T) > within_size | E (T) | E (T) | E (T) > advise | E (T) | E (T) | E (T) > never | D (OS) | D (OS) | D (OS) > deny | D (OS) | D (OS) | D (OS) > force | E (T) | E (T) | E (T) > > -XX:-UseTransparentHugePages > > shem \ anon | always | madvise | never > ------------+-----------+-----------+------- > always | E (T, OS) | E (T, OS) | E (T, OS) > within_size | E (T, OS) | E (T, OS) | E (T, OS) > advise | D | D | D > never | D | D | D > deny | D | D | D > force ... src/hotspot/os/linux/os_linux.cpp line 2886: > 2884: > 2885: void os::pd_realign_memory(char *addr, size_t bytes, size_t alignment_hint) { > 2886: if (HugePages::should_madvise_anonymous_thps() && alignment_hint > vm_page_size()) { The use of `HugePages::should_madvise_anonymous_thps()` adds a change in behavior. By using it instead of `UseTransparentHugepages`, we only call `madvise` when the OS is configured to care about `madvise`. I've been using this in my testing, but I can revert back to using `UseTransparentHugepages`, and then we can change this separately with [JDK-8312468](https://bugs.openjdk.org/browse/JDK-8312468). 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16690#discussion_r1411859149 From ogillespie at openjdk.org Fri Dec 1 09:55:47 2023 From: ogillespie at openjdk.org (Oli Gillespie) Date: Fri, 1 Dec 2023 09:55:47 GMT Subject: RFR: 8315559: Delay TempSymbol cleanup to avoid symbol table churn [v14] In-Reply-To: References: Message-ID: > Attempt to fix regressions in class-loading performance caused by fixing a symbol table leak in [JDK-8313678](https://bugs.openjdk.org/browse/JDK-8313678). > > See lengthy discussion in https://bugs.openjdk.org/browse/JDK-8315559 for more background. In short, the leak was providing an accidental cache for temporary symbols, allowing reuse. > > This change keeps new temporary symbols alive in a queue for a short time, allowing them to be re-used by subsequent operations. For example, when attempting to load a class we may call JVM_FindLoadedClass for multiple classloaders back-to-back, and all of them will create a TempNewSymbol for the same string. At present, each call will leave a dead symbol in the table and create a new one. Dead symbols add cleanup and lookup overhead, and insertion is also costly. With this change, the symbol from the first invocation will live for some time after it is used, and subsequent callers can find the symbol alive in the table - avoiding the extra work. > > The queue is bounded, and when full new entries displace the oldest entry. This means symbols are held for the time it takes for 100 new temp symbols to be created. 100 is chosen arbitrarily - the tradeoff is memory usage versus 'cache' hit rate. > > When concurrent symbol table cleanup runs, it also drains the queue. > > In my testing, this brings Dacapo pmd performance back to where it was before the leak was fixed. > > Thanks @shipilev , @coleenp and @MDBijman for helping with this fix. 
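
The bounded delay queue described above can be pictured with a small single-threaded sketch. This is not the code from the PR: `TempSymbolDelayQueue` and `SymbolRef` are hypothetical names, and `std::shared_ptr` stands in for HotSpot's symbol reference counting. It only illustrates the behaviour the description states: new entries are kept alive, a full queue drops its oldest entry, and cleanup drains everything.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <utility>

// Stand-in for a refcounted symbol; the shared_ptr models the refcount.
using SymbolRef = std::shared_ptr<const std::string>;

class TempSymbolDelayQueue {
 public:
  explicit TempSymbolDelayQueue(std::size_t capacity) : _capacity(capacity) {}

  // Keeps `sym` alive. If the queue is full, the oldest entry is released
  // first, so at most `capacity` symbols are held at any time.
  void add(SymbolRef sym) {
    if (_capacity == 0) return;
    if (_entries.size() == _capacity) {
      _entries.pop_front();
    }
    _entries.push_back(std::move(sym));
  }

  // Releases every held symbol, as symbol table cleanup would.
  void drain() { _entries.clear(); }

  std::size_t size() const { return _entries.size(); }

 private:
  const std::size_t _capacity;
  std::deque<SymbolRef> _entries;
};
```

With a capacity of 100, as in the description, a symbol created by one lookup stays alive until 100 further temporary symbols have been queued, so back-to-back lookups of the same name can find it still alive in the table. A production version would additionally need to be safe for concurrent callers, which this sketch ignores.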
Oli Gillespie has updated the pull request incrementally with one additional commit since the last revision: Add copyright header for new file ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16398/files - new: https://git.openjdk.org/jdk/pull/16398/files/2e5dd556..b88c70dd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16398&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16398&range=12-13 Stats: 23 lines in 1 file changed: 23 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16398.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16398/head:pull/16398 PR: https://git.openjdk.org/jdk/pull/16398 From aph at openjdk.org Fri Dec 1 10:00:14 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 1 Dec 2023 10:00:14 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5] In-Reply-To: References: <8SBUvWGDLtQmwYPRBDeUkeuq4pf2nJfKfDY5rzZODFU=.3f1cc0ff-02e6-4a4e-9425-5ffccc9cbc8f@github.com> Message-ID: On Thu, 30 Nov 2023 14:50:24 GMT, Andrew Haley wrote: >> Do this, but with the name vect_math.S. Don't use SLEEF headers in the build. I think you can do this with no build-time dependency on SLEEF at all if you load the library lazily at runtime. >> >> [vect_math.S.txt](https://github.com/openjdk/jdk/files/13512306/vect_math.S.txt) > >> [vect_math.S.txt](https://github.com/openjdk/jdk/files/13512306/vect_math.S.txt) > > I guess this will live only in os_linux and os_bsd because the Windows compiler won't like it AFAIK. > @theRealAph So your suggestion is that this assembly files lives in hotspot, instead of jdk.incubator.vector? I don't think it much matters, because these functions will only be available to HotSpot. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1835801386 From sjohanss at openjdk.org Fri Dec 1 10:01:27 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Fri, 1 Dec 2023 10:01:27 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v53] In-Reply-To: References: <-GX8bATX2hz3YWgnJbhTNEYbi4t8HxfdhYqBP-ulyGg=.0080d7b0-8e43-4b81-b885-1d4a742048cc@github.com> Message-ID: <4NJOslObcIY-G1nUAbKeiCWKK-wbhEw94avA2c2cJ7s=.2ff4560f-5184-40ad-982f-96570cc4a9fc@github.com> On Thu, 30 Nov 2023 21:59:41 GMT, Man Cao wrote: >> src/hotspot/share/runtime/cpuTimeCounters.hpp line 59: >> >>> 57: NONCOPYABLE(CPUTimeCounters); >>> 58: >>> 59: static CPUTimeCounters* _instance; >> >> I would prefer if we made the whole class static and got rid of the instance and just made the `_cpu_time_counters` array static. The only drawback I/we (discussed this with Albert as well) can see is that the memory for the array would not be accounted in NMT, but this array will always be very small so should not be a big problem. >> >> Do you see any other concerns? > > I thought it is typically preferred to initialize a singleton object on the heap, rather than using several static variables. It is easier to control the initialization order and timing of an on-heap singleton object than statics. > > Moreover, for this class, `initialize()` could also check `if (UsePerfData)`, and only create the singleton object under `UsePerfData`. This could save some memory when `UsePerfData` is false. I would say it depends on the use-case and here when switching to use static functions to use the instance it felt more like an all-static class. I agree that it would be nice to avoid the additional memory usage if `UsePerfData` is `false` so I'm ok with keeping the instance if we add that. 
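
As a rough illustration of the compromise discussed here (static entry points backed by an instance that only exists when perf data is enabled), consider the sketch below. `PerfCountersSingleton` and its members are hypothetical and far simpler than `CPUTimeCounters`; the `use_perf_data` parameter stands in for the `UsePerfData` flag.

```cpp
#include <cassert>
#include <cstdint>

class PerfCountersSingleton {
 public:
  // Creates the instance only when perf data is enabled; safe to call twice.
  static void initialize(bool use_perf_data) {
    if (use_perf_data && _instance == nullptr) {
      _instance = new PerfCountersSingleton();
    }
  }

  static bool is_active() { return _instance != nullptr; }

  // Static entry point: a no-op when no instance was created.
  static void add(std::int64_t delta) {
    if (_instance != nullptr) {
      _instance->_total += delta;
    }
  }

  static std::int64_t total() {
    return _instance != nullptr ? _instance->_total : 0;
  }

 private:
  PerfCountersSingleton() = default;

  std::int64_t _total = 0;
  static PerfCountersSingleton* _instance;
};

PerfCountersSingleton* PerfCountersSingleton::_instance = nullptr;
```

Callers only ever use the static functions, so the class reads like an all-static one, while the memory for the counters is allocated only when the flag is on.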
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1411872139 From aph at openjdk.org Fri Dec 1 10:05:19 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 1 Dec 2023 10:05:19 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 01:19:12 GMT, Xiaohong Gong wrote: > > Not having a build time dependency on libsleef means you cannot really verify that the functions you want to call are correct, but maybe you feel secure that they will never change? > > I'm not sure. The main reason that we add such a wrapper library is to catch the sleef's ABI version changing earlier (i.e. at build time). So using .s code and not including sleef at built time can not match this requirement? I don't know what this means. If we're using an external SLEEF, then we can't catch ABI versions changes at build time. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1835805290 From sjohanss at openjdk.org Fri Dec 1 10:05:25 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Fri, 1 Dec 2023 10:05:25 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v47] In-Reply-To: References: Message-ID: On Thu, 30 Nov 2023 01:43:28 GMT, Jonathan Joo wrote: >> (I just realized that I made a typo in my previous msg; should be *callee* instead.) That is what I have in mind. >> >> >> void CPUTimeCounters::update_counter(name, total) { >> auto counter = get_counter(name); >> auto old_v = counter->get_value(); >> auto diff = total - old_v; >> counter->inc(diff); >> if (counter->is_gc_counter()) { >> counter->inc(diff); >> } >> } > > I'm not sure I understood correctly, could you let me know if this latest commit addresses your comment in the way you were intending? It does. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1411876609 From aph at openjdk.org Fri Dec 1 10:05:23 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 1 Dec 2023 10:05:23 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5] In-Reply-To: References: <9VeMdTAJPaPZDg9ZW7FVJOf9XGl4gGqAS-2g8SFc9c0=.36792cd5-66d9-4abc-ba0c-aee3478627f4@github.com> Message-ID: On Fri, 1 Dec 2023 01:13:37 GMT, Xiaohong Gong wrote: >> make/autoconf/lib-sleef.m4 line 56: >> >>> 54: AC_MSG_CHECKING([for the specified LIBSLEEF]) >>> 55: if test -e ${with_libsleef}/lib/libsleef.so && >>> 56: test -e ${with_libsleef}/include/sleef.h; then >> >> This fails on my system because libsleef is in `/usr/local/lib64/`. This is the correct place to look according to the Linux FHS. You should _not_ hard-code `/lib` > > Did you try to find the libsleef by passing `--with-libsleef=` ? Currently `--with-libsleef=` can only work for people manually built from sleef source code. Yes. It still failed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1411877145 From aph at openjdk.org Fri Dec 1 10:08:11 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 1 Dec 2023 10:08:11 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 09:59:58 GMT, Andrew Haley wrote: > Not having a build time dependency on libsleef means you cannot really verify that the functions you want to call are correct, but maybe you feel secure that they will never change? We can still have SLEEF tests, but they will require a SLEEF library to be there. We can't control what is present at runtime, though. What are you trying to verify? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1835812963 From aph at openjdk.org Fri Dec 1 10:19:16 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 1 Dec 2023 10:19:16 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64. Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error. But a warning is printed out if people specifies a nonexistent library explicitly. >> >> Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedbacks from more hotspot experts. 
>> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision: > > - Separate neon and sve functions into two source files > - Merge branch 'jdk:master' into JDK-8312425 > - Rename vmath to sleef in configure > - Address review comments in build system > - Add a bundled native lib in jdk as a bridge to libsleef > - Merge 'jdk:master' into JDK-8312425 > - Disable sleef by default > - Merge 'jdk:master' into JDK-8312425 > - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF Let me summarize the issues, as I see them: To a high-order approximation, no one builds their own JDK. We want people to be able to use these vector math operations. There is no need to depend on a specific SLEEF library version. I do not expect SLEEF to break the ABI by e.g. renaming functions. (I know, but let's assume.) As long as the functions we want to use are present, we should use them. SLEEF is not (yet) a standard part of OSs and build systems. We don't want to fail unnecessarily at runtime because of a SLEEF ABI version change. We don't want to fail to build the JDK if our GCC is too old for SVE. (Is that a problem now? It might be.) We want to be able to test and run with any version of SLEEF, as long as the functions we need are present. It should be possible to drop a SLEEF library into the system, and the JDK will use it. The alternative is to package SLEEF with the JDK. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1835829658 From aph at openjdk.org Fri Dec 1 10:22:21 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 1 Dec 2023 10:22:21 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: References: Message-ID: <3cK9QjVQNIgZoWWhrWKEb3XxfbLjprjRMBbStWegH7M=.6df92651-b97d-445a-aa42-302ea791bbea@github.com> On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64. Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error. But a warning is printed out if people specifies a nonexistent library explicitly. 
>> >> Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedbacks from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision: > > - Separate neon and sve functions into two source files > - Merge branch 'jdk:master' into JDK-8312425 > - Rename vmath to sleef in configure > - Address review comments in build system > - Add a bundled native lib in jdk as a bridge to libsleef > - Merge 'jdk:master' into JDK-8312425 > - Disable sleef by default > - Merge 'jdk:master' into JDK-8312425 > - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF Oh, and: If we can't trust SLEEF not to break the ABI we're using, we should not be using SLEEF. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1835833150 From sjohanss at openjdk.org Fri Dec 1 10:24:19 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Fri, 1 Dec 2023 10:24:19 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v48] In-Reply-To: References: <_lEBVrWV8wrVbmhOiu3AAqPJo_xBs718ZtA9V-VSzGM=.253c0ec8-256e-4dee-b125-90be6338e4b8@github.com> Message-ID: On Thu, 30 Nov 2023 21:43:36 GMT, Man Cao wrote: >> @simonis was the original suggester of this counter, so I will defer to his expertise. I do agree that dropping the counter would simplify things, but it also might not hurt to just leave it in. I'm okay with either option! 
>
> Right, see @simonis's comments at https://github.com/openjdk/jdk/pull/15082#pullrequestreview-1613868256, https://github.com/openjdk/jdk/pull/15082#discussion_r1321703912.
>
> I initially had a similar thought that `gc_total` isn't necessary and provides redundant data. Now I agree with @simonis that `gc_total` is valuable to users. It saves users the extra work of aggregating different sets of counters for different garbage collectors, and avoids the potential mistake of missing some counters. It is also future-proof: if the GC implementation changes to add additional threads, users wouldn't need to change their code to add the counter for the additional threads.
>
> I think the maintenance overhead is quite small for `gc_total` since it is mostly in this class. The benefit to users is worth it.

I agree that the counter is valuable if always up-to-date, but if it is out of sync compared to the "concurrent counters" I think it will confuse some users. So if we want to keep it I think we should try to keep it in sync. I suppose adding a lock for updating `gc_total` should be ok. In the safepoint case we should never contend on that lock and for the concurrent updates it should not be a big deal.

Basically, what I have in mind is to change `update_counter(...)` to do something like:

    if (CPUTimeGroups::is_gc_counter(name)) {
      instance->get_counter(CPUTimeGroups::CPUTimeType::gc_total)->inc(net_cpu_time);
    }

This way we would also be able to remove the publish part above, right? Any other problems with this approach?
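
A self-contained sketch of that idea follows, under stated assumptions: `CPUTimeCountersSketch` is a hypothetical stand-in, the enum values other than `gc_total` are made up for illustration, and plain integers replace the real perf counters. The point is only that the per-counter update and the `gc_total` update happen under the same lock, so the aggregate cannot drift from its parts.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical counter names; only gc_total is taken from the discussion.
enum class CPUTimeType { gc_total, gc_parallel_workers, gc_conc_mark, vm, COUNT };

class CPUTimeCountersSketch {
 public:
  // `total` is the cumulative CPU time reported so far for `name`.
  void update_counter(CPUTimeType name, std::int64_t total) {
    std::lock_guard<std::mutex> guard(_lock);
    std::int64_t& counter = _values[index(name)];
    const std::int64_t net_cpu_time = total - counter;
    counter += net_cpu_time;
    if (is_gc_counter(name)) {
      // Same lock, same delta: gc_total stays in sync with its parts.
      _values[index(CPUTimeType::gc_total)] += net_cpu_time;
    }
  }

  std::int64_t get(CPUTimeType name) {
    std::lock_guard<std::mutex> guard(_lock);
    return _values[index(name)];
  }

 private:
  static std::size_t index(CPUTimeType name) {
    return static_cast<std::size_t>(name);
  }

  static bool is_gc_counter(CPUTimeType name) {
    return name == CPUTimeType::gc_parallel_workers ||
           name == CPUTimeType::gc_conc_mark;
  }

  std::mutex _lock;
  std::array<std::int64_t, static_cast<std::size_t>(CPUTimeType::COUNT)> _values{};
};
```

Because every GC counter update adds its own delta to `gc_total`, no separate publish step is needed in this sketch; whether that holds for the real class depends on details not shown in this thread.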
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1411897115 From stuefe at openjdk.org Fri Dec 1 10:25:04 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 1 Dec 2023 10:25:04 GMT Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC In-Reply-To: References: Message-ID: On Thu, 16 Nov 2023 13:30:48 GMT, Stefan Karlsson wrote: > There is code in `os::large_page_init()` that checks `/sys/kernel/mm/transparent_hugepage/enabled` and forcefully turns off `UseTransparentHugePages` if anonymous THPs are disabled in the OS: > > > if (UseTransparentHugePages && !HugePages::supports_thp()) { > if (!FLAG_IS_DEFAULT(UseTransparentHugePages)) { > log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system."); > } > UseLargePages = UseTransparentHugePages = false; > return; > } > > > This is problematic because ZGC doesn't use the `/sys/kernel/mm/transparent_hugepage/enabled` THPs, but instead the `/sys/kernel/mm/transparent_hugepage/shmem_enabled` THPs. So, with the following settings: > > /sys/kernel/mm/transparent_hugepage/enabled: never > /sys/kernel/mm/transparent_hugepage/shmem_enabled: advise > > > the above code will force ZGC to run without THPs. > > This PR is a proposal for how to work around this in the ZGC code without disturbing the the rest of the JVM too much. The patch: > > 1) remembers the initial values for UseLargePages and UseTransparentHugePages and saves those so that ZGC can continue using THPs even though they have been disabled for the rest of the JVM. > > 2) adds better logic to figure out if ZGC is actually going to get THPs for the heap or not. This is then used to more accurately log the current situation and allows for a precise usage of `madvise + MADV_HUGEPAGE`. > > 3) tweaks the generic pagesize logging to better reflect the situation when anonymous THPs are disabled but shared memory THPs are enabled and ZGC is used. 
> > The result of this change can be seen in these tables: > > ZGC large pages log output: > > E (T) = Enabled (Transparent) > E (T, OS) = Enabled (Transparent, OS enforced) > D = Disabled > D = Disabled (OS enforced) > > -XX:+UseTransparentHugePages > > shem \ anon | always | madvise | never > ------------+--------+---------+------- > always | E (T) | E (T) | E (T) > within_size | E (T) | E (T) | E (T) > advise | E (T) | E (T) | E (T) > never | D (OS) | D (OS) | D (OS) > deny | D (OS) | D (OS) | D (OS) > force | E (T) | E (T) | E (T) > > -XX:-UseTransparentHugePages > > shem \ anon | always | madvise | never > ------------+-----------+-----------+------- > always | E (T, OS) | E (T, OS) | E (T, OS) > within_size | E (T, OS) | E (T, OS) | E (T, OS) > advise | D | D | D > never | D | D | D > deny | D | D | D > force ... At first glance it looks reasonable, but I will look at it closer next week (no time). Thanks for following the clean separation of OS-info vs what-the-jvm-does-with-it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16690#issuecomment-1835838214 From aph at openjdk.org Fri Dec 1 10:38:05 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 1 Dec 2023 10:38:05 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 09:05:22 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. 
>> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust macOS coding I'd be careful about worrying users unnecessarily about this. We fully correct this problem if it happens, so an OpenJDK user is not affected. If we fail to fix the FP environment, the most likely reason is that it was broken before the shared library was loaded. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1835856491 From stefank at openjdk.org Fri Dec 1 10:55:30 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 1 Dec 2023 10:55:30 GMT Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC [v2] In-Reply-To: References: Message-ID: > There is code in `os::large_page_init()` that checks `/sys/kernel/mm/transparent_hugepage/enabled` and forcefully turns off `UseTransparentHugePages` if anonymous THPs are disabled in the OS: > > > if (UseTransparentHugePages && !HugePages::supports_thp()) { > if (!FLAG_IS_DEFAULT(UseTransparentHugePages)) { > log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system."); > } > UseLargePages = UseTransparentHugePages = false; > return; > } > > > This is problematic because ZGC doesn't use the `/sys/kernel/mm/transparent_hugepage/enabled` THPs, but instead the `/sys/kernel/mm/transparent_hugepage/shmem_enabled` THPs. So, with the following settings: > > /sys/kernel/mm/transparent_hugepage/enabled: never > /sys/kernel/mm/transparent_hugepage/shmem_enabled: advise > > > the above code will force ZGC to run without THPs. 
> > This PR is a proposal for how to work around this in the ZGC code without disturbing the the rest of the JVM too much. The patch: > > 1) remembers the initial values for UseLargePages and UseTransparentHugePages and saves those so that ZGC can continue using THPs even though they have been disabled for the rest of the JVM. > > 2) adds better logic to figure out if ZGC is actually going to get THPs for the heap or not. This is then used to more accurately log the current situation and allows for a precise usage of `madvise + MADV_HUGEPAGE`. > > 3) tweaks the generic pagesize logging to better reflect the situation when anonymous THPs are disabled but shared memory THPs are enabled and ZGC is used. > > The result of this change can be seen in these tables: > > ZGC large pages log output: > > E (T) = Enabled (Transparent) > E (T, OS) = Enabled (Transparent, OS enforced) > D = Disabled > D = Disabled (OS enforced) > > -XX:+UseTransparentHugePages > > shem \ anon | always | madvise | never > ------------+--------+---------+------- > always | E (T) | E (T) | E (T) > within_size | E (T) | E (T) | E (T) > advise | E (T) | E (T) | E (T) > never | D (OS) | D (OS) | D (OS) > deny | D (OS) | D (OS) | D (OS) > force | E (T) | E (T) | E (T) > > -XX:-UseTransparentHugePages > > shem \ anon | always | madvise | never > ------------+-----------+-----------+------- > always | E (T, OS) | E (T, OS) | E (T, OS) > within_size | E (T, OS) | E (T, OS) | E (T, OS) > advise | D | D | D > never | D | D | D > deny | D | D | D > force ... 
Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Small tweaks ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16690/files - new: https://git.openjdk.org/jdk/pull/16690/files/c2774174..901b4b10 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16690&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16690&range=00-01 Stats: 5 lines in 1 file changed: 0 ins; 2 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16690.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16690/head:pull/16690 PR: https://git.openjdk.org/jdk/pull/16690 From stefank at openjdk.org Fri Dec 1 10:55:30 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 1 Dec 2023 10:55:30 GMT Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 10:22:32 GMT, Thomas Stuefe wrote: > At first glance it looks reasonable, but I will look at it closer next week (no time). Thanks for following the clean separation of OS-info vs what-the-jvm-does-with-it. Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16690#issuecomment-1835881814 From lucy at openjdk.org Fri Dec 1 11:09:05 2023 From: lucy at openjdk.org (Lutz Schmidt) Date: Fri, 1 Dec 2023 11:09:05 GMT Subject: RFR: 8320807: [PPC64][ZGC] C1 generates wrong code for atomics In-Reply-To: References: Message-ID: On Mon, 27 Nov 2023 21:52:53 GMT, Martin Doerr wrote: > Debugging test failures on PPC64 in java/lang/Thread/virtual/stress/Skynet.java#ZGenerational has shown that the ldarx+stdcx_ loop for uncompressed Oops in `LIR_Assembler::atomic_op` is wrong: `__ mr(Rtmp, Robj);` is inside of the ldarx+stdcx_ loop, but must be outside of it. Repeated execution leads to wrong store value. > In addition, zBarrierSetC1.cpp expects `cas_obj` and `xchg` to contain all necessary memory barriers. 
That doesn't fit to the current PPC64 design which inserts memory barriers on LIR level instead. I've changed this and moved them into the assembler code for all GCs. > While debugging, I have optimized out an unnecessary branch in `ZBarrierSetAssembler::store_barrier_medium`. Looks good. Thanks for detecting and fixing this bug. ------------- Marked as reviewed by lucy (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16835#pullrequestreview-1759560030 From duke at openjdk.org Fri Dec 1 11:16:19 2023 From: duke at openjdk.org (Lei Zaakjyu) Date: Fri, 1 Dec 2023 11:16:19 GMT Subject: RFR: JDK-8234502 : Merge GenCollectedHeap and SerialHeap [v9] In-Reply-To: References: Message-ID: <290NH775QtYIm1U2D6PUgPcxlrpeZPIrkhAMgUdhCP0=.68543776-26d9-4a57-9c09-0c91b2a7a828@github.com> > JDK-8234502 : Merge GenCollectedHeap and SerialHeap Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: resolve conflicts #1 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16623/files - new: https://git.openjdk.org/jdk/pull/16623/files/12c680a3..4a28d0e8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16623&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16623&range=07-08 Stats: 145 lines in 7 files changed: 0 ins; 145 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16623.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16623/head:pull/16623 PR: https://git.openjdk.org/jdk/pull/16623 From duke at openjdk.org Fri Dec 1 11:25:34 2023 From: duke at openjdk.org (Lei Zaakjyu) Date: Fri, 1 Dec 2023 11:25:34 GMT Subject: RFR: JDK-8234502 : Merge GenCollectedHeap and SerialHeap [v10] In-Reply-To: References: Message-ID: > JDK-8234502 : Merge GenCollectedHeap and SerialHeap Lei Zaakjyu has updated the pull request with a new target base due to a merge or a rebase. 
------------- Changes: https://git.openjdk.org/jdk/pull/16623/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16623&range=09 Stats: 0 lines in 0 files changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16623.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16623/head:pull/16623 PR: https://git.openjdk.org/jdk/pull/16623 From duke at openjdk.org Fri Dec 1 11:25:36 2023 From: duke at openjdk.org (Lei Zaakjyu) Date: Fri, 1 Dec 2023 11:25:36 GMT Subject: Withdrawn: JDK-8234502 : Merge GenCollectedHeap and SerialHeap In-Reply-To: References: Message-ID: On Sat, 11 Nov 2023 06:44:14 GMT, Lei Zaakjyu wrote: > JDK-8234502 : Merge GenCollectedHeap and SerialHeap This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/16623 From jkern at openjdk.org Fri Dec 1 11:39:11 2023 From: jkern at openjdk.org (Joachim Kern) Date: Fri, 1 Dec 2023 11:39:11 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality Message-ID: On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles being returned from libc. In this respect AIX differs from typical libc implementations, which cache dl handles. This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604).
We propose a different, cleaner way of handling this:
- Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload.
- Cache dl handles; repeated opening of a library should return the cached handle.
- Increase a handle-local ref counter on open and decrease it on close.
- Make sure calls to os::dll_load are matched to os::dll_unload (see [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)).
This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. ------------- Commit messages: - JDK-8320890 Changes: https://git.openjdk.org/jdk/pull/16920/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8320890 Stats: 202 lines in 7 files changed: 122 ins; 70 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/16920.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16920/head:pull/16920 PR: https://git.openjdk.org/jdk/pull/16920 From rrich at openjdk.org Fri Dec 1 11:47:07 2023 From: rrich at openjdk.org (Richard Reingruber) Date: Fri, 1 Dec 2023 11:47:07 GMT Subject: RFR: 8320807: [PPC64][ZGC] C1 generates wrong code for atomics In-Reply-To: References: Message-ID: On Mon, 27 Nov 2023 21:52:53 GMT, Martin Doerr wrote: > Debugging test failures on PPC64 in java/lang/Thread/virtual/stress/Skynet.java#ZGenerational has shown that the ldarx+stdcx_ loop for uncompressed Oops in `LIR_Assembler::atomic_op` is wrong: `__ mr(Rtmp, Robj);` is inside of the ldarx+stdcx_ loop, but must be outside of it. Repeated execution leads to wrong store value. > In addition, zBarrierSetC1.cpp expects `cas_obj` and `xchg` to contain all necessary memory barriers. That doesn't fit to the current PPC64 design which inserts memory barriers on LIR level instead. I've changed this and moved them into the assembler code for all GCs. > While debugging, I have optimized out an unnecessary branch in `ZBarrierSetAssembler::store_barrier_medium`.
Looks good. Just two suggestions to improve comments. Cheers, Richard. src/hotspot/cpu/ppc/c1_LIRAssembler_ppc.cpp line 2603: > 2601: } > 2602: > 2603: // Volatile load may be followed by Unsafe CAS. Suggestion: // There might be a volatile load before this Unsafe CAS. src/hotspot/cpu/ppc/c1_LIRAssembler_ppc.cpp line 2984: > 2982: } > 2983: > 2984: // Volatile load may be followed by Unsafe OP. Suggestion: // There might be a volatile load before this Unsafe OP. ------------- Marked as reviewed by rrich (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16835#pullrequestreview-1759593616 PR Review Comment: https://git.openjdk.org/jdk/pull/16835#discussion_r1411967401 PR Review Comment: https://git.openjdk.org/jdk/pull/16835#discussion_r1411988559 From duke at openjdk.org Fri Dec 1 12:06:17 2023 From: duke at openjdk.org (Lei Zaakjyu) Date: Fri, 1 Dec 2023 12:06:17 GMT Subject: RFR: JDK-8234502 : Merge GenCollectedHeap and SerialHeap [v10] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 11:25:34 GMT, Lei Zaakjyu wrote: >> JDK-8234502 : Merge GenCollectedHeap and SerialHeap > > Lei Zaakjyu has updated the pull request with a new target base due to a merge or a rebase. I will open another pr later. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16623#issuecomment-1836008485 From mdoerr at openjdk.org Fri Dec 1 13:22:41 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 1 Dec 2023 13:22:41 GMT Subject: RFR: 8320807: [PPC64][ZGC] C1 generates wrong code for atomics In-Reply-To: References: Message-ID: On Mon, 27 Nov 2023 21:52:53 GMT, Martin Doerr wrote: > Debugging test failures on PPC64 in java/lang/Thread/virtual/stress/Skynet.java#ZGenerational has shown that the ldarx+stdcx_ loop for uncompressed Oops in `LIR_Assembler::atomic_op` is wrong: `__ mr(Rtmp, Robj);` is inside of the ldarx+stdcx_ loop, but must be outside of it. Repeated execution leads to wrong store value. 
> In addition, zBarrierSetC1.cpp expects `cas_obj` and `xchg` to contain all necessary memory barriers. That doesn't fit to the current PPC64 design which inserts memory barriers on LIR level instead. I've changed this and moved them into the assembler code for all GCs. > While debugging, I have optimized out an unnecessary branch in `ZBarrierSetAssembler::store_barrier_medium`. Thanks for the reviews! I've improved the comments as suggested. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16835#issuecomment-1836106556 From mdoerr at openjdk.org Fri Dec 1 13:22:39 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 1 Dec 2023 13:22:39 GMT Subject: RFR: 8320807: [PPC64][ZGC] C1 generates wrong code for atomics [v2] In-Reply-To: References: Message-ID: > Debugging test failures on PPC64 in java/lang/Thread/virtual/stress/Skynet.java#ZGenerational has shown that the ldarx+stdcx_ loop for uncompressed Oops in `LIR_Assembler::atomic_op` is wrong: `__ mr(Rtmp, Robj);` is inside of the ldarx+stdcx_ loop, but must be outside of it. Repeated execution leads to wrong store value. > In addition, zBarrierSetC1.cpp expects `cas_obj` and `xchg` to contain all necessary memory barriers. That doesn't fit to the current PPC64 design which inserts memory barriers on LIR level instead. I've changed this and moved them into the assembler code for all GCs. > While debugging, I have optimized out an unnecessary branch in `ZBarrierSetAssembler::store_barrier_medium`. Martin Doerr has updated the pull request incrementally with one additional commit since the last revision: Improve comments. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/16835/files - new: https://git.openjdk.org/jdk/pull/16835/files/54a0d25d..e6958e4c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16835&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16835&range=00-01 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16835.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16835/head:pull/16835 PR: https://git.openjdk.org/jdk/pull/16835 From ayang at openjdk.org Fri Dec 1 13:31:18 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 1 Dec 2023 13:31:18 GMT Subject: RFR: JDK-8234502 : Merge GenCollectedHeap and SerialHeap [v10] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 11:25:34 GMT, Lei Zaakjyu wrote: >> JDK-8234502 : Merge GenCollectedHeap and SerialHeap > > Lei Zaakjyu has updated the pull request with a new target base due to a merge or a rebase. Both reusing this PR and creating another one are fine, IMO. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16623#issuecomment-1836120441 From mbaesken at openjdk.org Fri Dec 1 14:20:07 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 1 Dec 2023 14:20:07 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 10:35:23 GMT, Andrew Haley wrote: > If we fail to fix the FP environment, the most likely reason is that it was broken before the shared library was loaded. Yes, but in those cases the user should be informed . The user can then look at the libs on the runtime system and see if there are any candidates causing the issue (like in our RHEL aarch64 example). 
There is already the option to trigger an assert (but few users will run (fast)debug binaries) and the option to get information from UL (but who will have UL enabled - for sure not all users). But quite a few people use JFR so it is a nice way to get the info 'recorded' . ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1836191900 From ayang at openjdk.org Fri Dec 1 14:45:24 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 1 Dec 2023 14:45:24 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v53] In-Reply-To: <4NJOslObcIY-G1nUAbKeiCWKK-wbhEw94avA2c2cJ7s=.2ff4560f-5184-40ad-982f-96570cc4a9fc@github.com> References: <-GX8bATX2hz3YWgnJbhTNEYbi4t8HxfdhYqBP-ulyGg=.0080d7b0-8e43-4b81-b885-1d4a742048cc@github.com> <4NJOslObcIY-G1nUAbKeiCWKK-wbhEw94avA2c2cJ7s=.2ff4560f-5184-40ad-982f-96570cc4a9fc@github.com> Message-ID: On Fri, 1 Dec 2023 09:58:16 GMT, Stefan Johansson wrote: >> I thought it is typically preferred to initialize a singleton object on the heap, rather than using several static variables. It is easier to control the initialization order and timing of an on-heap singleton object than statics. >> >> Moreover, for this class, `initialize()` could also check `if (UsePerfData)`, and only create the singleton object under `UsePerfData`. This could save some memory when `UsePerfData` is false. > > I would say it depends on the use-case and here when switching to use static functions to use the instance it felt more like an all-static class. I agree that it would be nice to avoid the additional memory usage if `UsePerfData` is `false` so I'm ok with keeping the instance if we add that. > It is easier to control the initialization order and timing of an on-heap singleton object than statics. It's generally true, but the init of CPUTimeCounters is not sensitive to ordering. > This could save some memory when UsePerfData is false. 
True, but the mem savings will be marginal at most, `PerfCounter* _cpu_time_counters[static_cast(CPUTimeGroups::CPUTimeType::COUNT)];` will be 8 * ~12 = ~96 bytes (including future cpu-time-type enums). (I don't have a strong preference here; however, I'd like to make the pros and cons explicit.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1412193296 From mdoerr at openjdk.org Fri Dec 1 14:47:17 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 1 Dec 2023 14:47:17 GMT Subject: Integrated: 8320807: [PPC64][ZGC] C1 generates wrong code for atomics In-Reply-To: References: Message-ID: On Mon, 27 Nov 2023 21:52:53 GMT, Martin Doerr wrote: > Debugging test failures on PPC64 in java/lang/Thread/virtual/stress/Skynet.java#ZGenerational has shown that the ldarx+stdcx_ loop for uncompressed Oops in `LIR_Assembler::atomic_op` is wrong: `__ mr(Rtmp, Robj);` is inside of the ldarx+stdcx_ loop, but must be outside of it. Repeated execution leads to wrong store value. > In addition, zBarrierSetC1.cpp expects `cas_obj` and `xchg` to contain all necessary memory barriers. That doesn't fit to the current PPC64 design which inserts memory barriers on LIR level instead. I've changed this and moved them into the assembler code for all GCs. > While debugging, I have optimized out an unnecessary branch in `ZBarrierSetAssembler::store_barrier_medium`. This pull request has now been integrated. 
Changeset: 3087e14c Author: Martin Doerr URL: https://git.openjdk.org/jdk/commit/3087e14cde9257680f0406b11942f9cb7739cb7b Stats: 121 lines in 4 files changed: 42 ins; 70 del; 9 mod 8320807: [PPC64][ZGC] C1 generates wrong code for atomics Reviewed-by: lucy, rrich ------------- PR: https://git.openjdk.org/jdk/pull/16835 From ihse at openjdk.org Fri Dec 1 15:04:19 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 1 Dec 2023 15:04:19 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: <3cK9QjVQNIgZoWWhrWKEb3XxfbLjprjRMBbStWegH7M=.6df92651-b97d-445a-aa42-302ea791bbea@github.com> References: <3cK9QjVQNIgZoWWhrWKEb3XxfbLjprjRMBbStWegH7M=.6df92651-b97d-445a-aa42-302ea791bbea@github.com> Message-ID: On Fri, 1 Dec 2023 10:19:01 GMT, Andrew Haley wrote: >> Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision: >> >> - Separate neon and sve functions into two source files >> - Merge branch 'jdk:master' into JDK-8312425 >> - Rename vmath to sleef in configure >> - Address review comments in build system >> - Add a bundled native lib in jdk as a bridge to libsleef >> - Merge 'jdk:master' into JDK-8312425 >> - Disable sleef by default >> - Merge 'jdk:master' into JDK-8312425 >> - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF > > Oh, and: > > If we can't trust SLEEF not to break the ABI we're using, we should not be using SLEEF. @theRealAph You are making good points. You are basically saying: "we don't need any configure support for libsleef, we can just hard-code the names and dispatch to them directly to a dlopened library at runtime". 
That is technically correct, but I'd still like to argue that the current setup has some merits:
* It will check that there is no typo in the function names. I agree that the likelihood of getting this wrong is low, but it is still a good practice to use official include files to have the compiler help checking this.
* If we would like to bundle libsleef.so with the JVM, now we have the possibility to do so easily. (Especially if it is like you say that it is not commonly installed). (If licenses allow etc)
* If we want to incorporate/bundle the source code of libsleef into OpenJDK, and build it as part of our internal library, we will have a good starting position, compared to starting from a hard-coded assembly file in hotspot. (I thought I heard some noise about this prospect).
So at this point, I am okay with the general approach of this PR. There are still some build issues to sort out, though, I'll address them separately. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1836266314 From simonis at openjdk.org Fri Dec 1 16:22:23 2023 From: simonis at openjdk.org (Volker Simonis) Date: Fri, 1 Dec 2023 16:22:23 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v53] In-Reply-To: <-GX8bATX2hz3YWgnJbhTNEYbi4t8HxfdhYqBP-ulyGg=.0080d7b0-8e43-4b81-b885-1d4a742048cc@github.com> References: <-GX8bATX2hz3YWgnJbhTNEYbi4t8HxfdhYqBP-ulyGg=.0080d7b0-8e43-4b81-b885-1d4a742048cc@github.com> Message-ID: <810qMt__o90-A1Csix4IiygZEpyP09w8tisrwY5mQC4=.b8696cfe-2323-4a6d-884c-47df6568a337@github.com> On Thu, 30 Nov 2023 09:46:03 GMT, Stefan Johansson wrote: >> Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision: >> >> Add missing include > > src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 905: > >> 903: gc_threads_do(&tttc); >> 904: >> 905: CPUTimeCounters::publish_gc_total_cpu_time(); > > As I suggested in the other comment, maybe we should not keep
the total counter, but if we do we need to make sure the destructor of the closure is run before the call to `publish_gc_total_cpu_time()`. Otherwise we will publish a not yet updated value. I still think that a total counter is useful and I'd appreciate if you can keep it. To second what @caoman said before, it is GC agnostic, easy to use even for non GC experts and future proof with regards to implementation changes in the GCs. Please keep it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1412322098 From redestad at openjdk.org Fri Dec 1 16:23:17 2023 From: redestad at openjdk.org (Claes Redestad) Date: Fri, 1 Dec 2023 16:23:17 GMT Subject: RFR: 8311906: Improve robustness of String constructors with mutable array inputs [v14] In-Reply-To: References: <6SKlGLh5MmxoEx07wHCCUc8KWbbhcspLJmcc1uxQ_FI=.ca33bfb4-fa5c-45f0-b49f-ee6c5c6b68b4@github.com> Message-ID: On Thu, 30 Nov 2023 15:51:46 GMT, Roger Riggs wrote: >> Strings, after construction, are immutable but may be constructed from mutable arrays of bytes, characters, or integers. >> The string constructors should guard against the effects of mutating the arrays during construction that might invalidate internal invariants for the correct behavior of operations on the resulting strings. In particular, a number of operations have optimizations for operations on pairs of latin1 strings and pairs of non-latin1 strings, while operations between latin1 and non-latin1 strings use a more general implementation. >> >> The changes include: >> >> - Adding a warning to each constructor with an array as an argument to indicate that the results are indeterminate >> if the input array is modified before the constructor returns. >> The resulting string may contain any combination of characters sampled from the input array. >> >> - Ensure that strings that are represented as non-latin1 contain at least one non-latin1 character. 
>> For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF8, or another encoding decoded to latin1 the scanning and compression is unchanged. >> If a non-latin1 character is found, the string is represented as non-latin1 with the added verification that a non-latin1 character is present at the same index. >> If that character is found to be latin1, then the input array has been modified and the result of the scan may be incorrect. >> Though a ConcurrentModificationException could be thrown, the risk to an existing application of an unexpected exception should be avoided. >> Instead, the non-latin1 copy of the input is re-scanned and compressed; that scan determines whether the latin1 or the non-latin1 representation is returned. >> >> - The methods that scan for non-latin1 characters and their intrinsic implementations are updated to return the index of the non-latin1 character. >> >> - String construction from StringBuilder and CharSequence must also be guarded as their contents may be modified during construction. > > Roger Riggs has updated the pull request incrementally with one additional commit since the last revision: > > Correct jcc/jccb branches Marked as reviewed by redestad (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/16425#pullrequestreview-1760162195 From rriggs at openjdk.org Fri Dec 1 16:23:19 2023 From: rriggs at openjdk.org (Roger Riggs) Date: Fri, 1 Dec 2023 16:23:19 GMT Subject: RFR: 8311906: Improve robustness of String constructors with mutable array inputs [v13] In-Reply-To: References: <6SKlGLh5MmxoEx07wHCCUc8KWbbhcspLJmcc1uxQ_FI=.ca33bfb4-fa5c-45f0-b49f-ee6c5c6b68b4@github.com> Message-ID: On Thu, 30 Nov 2023 08:00:12 GMT, Damon Fenacci wrote: >> Roger Riggs has updated the pull request incrementally with one additional commit since the last revision: >> >> Use byte off branches in char_array_compress >> Verified by manual tests with "-XX:AVX3Threshold=0" >> And test in the PR test/hotspot/jtreg/compiler/intrinsics/string/TestStringConstructionIntrinsics.java > > src/hotspot/cpu/x86/macroAssembler_x86.cpp line 8547: > >> 8545: // bail out when there is nothing to be done >> 8546: testl(tmp5, 0xFFFFFFFF); >> 8547: jcc(Assembler::zero, post_alignment); > > @RogerRiggs I think you changed the wrong line ? > Suggestion: > > jccb(Assembler::zero, post_alignment); Thanks for spotting that mistake, corrected in [b2fc385](https://github.com/openjdk/jdk/pull/16425/commits/b2fc38550ba95bcd7ec1ae4f52f22b220afcb045). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16425#discussion_r1410870186 From ihse at openjdk.org Fri Dec 1 16:29:18 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 1 Dec 2023 16:29:18 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64. Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error. But a warning is printed out if people specifies a nonexistent library explicitly. >> >> Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedbacks from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains ten additional commits since the last revision: > > - Separate neon and sve functions into two source files > - Merge branch 'jdk:master' into JDK-8312425 > - Rename vmath to sleef in configure > - Address review comments in build system > - Add a bundled native lib in jdk as a bridge to libsleef > - Merge 'jdk:master' into JDK-8312425 > - Disable sleef by default > - Merge 'jdk:master' into JDK-8312425 > - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF doc/building.md line 639: > 637: > 638: libsleef, the [SIMD Library for Evaluating Elementary Functions]( > 639: https://sleef.org/) is required when building libvmath.so on Linux/aarch64 This is incorrect. The library is not required, but if it is present, we will build libvmath with it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1412330808 From ihse at openjdk.org Fri Dec 1 16:39:14 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 1 Dec 2023 16:39:14 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5] In-Reply-To: References: <9VeMdTAJPaPZDg9ZW7FVJOf9XGl4gGqAS-2g8SFc9c0=.36792cd5-66d9-4abc-ba0c-aee3478627f4@github.com> Message-ID: On Fri, 1 Dec 2023 10:02:35 GMT, Andrew Haley wrote: >> Did you try to find the libsleef by passing `--with-libsleef=` ? Currently `--with-libsleef=` can only work for people manually built from sleef source code. > > Yes. It still failed. You need to expand this logic to cover more instances. See e.g. lib-ffi.m4 for inspiration. Basic flow: * if user has specified libsleef root with argument, check both lib/ and lib64/ under that root. * if user has not specified libsleef root, and we have no SYSROOT, try PKG_CHECK * Otherwise, look in well-known directories which is $SYSROOT/usr/[local/]lib[64]. 
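For illustration only (this is not code from the PR): a minimal shell sketch of the first and third steps of the lookup flow above, leaving out the PKG_CHECK step. The function names `find_sleef_libdir` and `find_sleef_root` are hypothetical, and the sketch assumes that `sleef.h` sits in `include/` directly under the candidate root; the real check would live in `make/autoconf/lib-sleef.m4` and use the build system's own macros.

```shell
# Hypothetical helper: print the directory under ROOT that holds libsleef.so,
# trying lib/ first and then lib64/. Fails if include/sleef.h is missing.
find_sleef_libdir() {
  sleef_root="$1"
  test -e "${sleef_root}/include/sleef.h" || return 1
  for libdir in lib lib64; do
    if test -e "${sleef_root}/${libdir}/libsleef.so"; then
      echo "${sleef_root}/${libdir}"
      return 0
    fi
  done
  return 1
}

# Hypothetical fallback: search the well-known roots below SYSROOT
# (pass an empty string when there is no sysroot) and print the first match.
find_sleef_root() {
  sysroot="$1"
  for candidate in "${sysroot}/usr" "${sysroot}/usr/local"; do
    if find_sleef_libdir "${candidate}" >/dev/null; then
      echo "${candidate}"
      return 0
    fi
  done
  return 1
}
```

With a layout such as the one reported earlier in the thread (`/usr/local/lib64/libsleef.so`), `find_sleef_root ""` would be expected to print `/usr/local`, provided `/usr/local/include/sleef.h` exists as well.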
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1412340745 From ihse at openjdk.org Fri Dec 1 16:39:15 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 1 Dec 2023 16:39:15 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5] In-Reply-To: References: <9VeMdTAJPaPZDg9ZW7FVJOf9XGl4gGqAS-2g8SFc9c0=.36792cd5-66d9-4abc-ba0c-aee3478627f4@github.com> Message-ID: On Fri, 1 Dec 2023 16:35:31 GMT, Magnus Ihse Bursie wrote: >> Yes. It still failed. > > You need to expand this logic to cover more instances. See e.g. lib-ffi.m4 for inspiration. > > Basic flow: > * if user has specified libsleef root with argument, check both lib/ and lib64/ under that root. > * if user has not specified libsleef root, and we have no SYSROOT, try PKG_CHECK > * Otherwise, look in well-known directories which is $SYSROOT/usr/[local/]lib[64]. also, ideally, you will add the corresponding specific overrides like in ffi: AC_ARG_WITH(libffi-include, [AS_HELP_STRING([--with-libffi-include], [specify directory for the libffi include files])]) AC_ARG_WITH(libffi-lib, [AS_HELP_STRING([--with-libffi-lib], [specify directory for the libffi library])]) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1412341451 From ihse at openjdk.org Fri Dec 1 16:48:17 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Fri, 1 Dec 2023 16:48:17 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64. Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. 
To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error. But a warning is printed out if people specifies a nonexistent library explicitly. >> >> Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedbacks from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains ten additional commits since the last revision: > > - Separate neon and sve functions into two source files > - Merge branch 'jdk:master' into JDK-8312425 > - Rename vmath to sleef in configure > - Address review comments in build system > - Add a bundled native lib in jdk as a bridge to libsleef > - Merge 'jdk:master' into JDK-8312425 > - Disable sleef by default > - Merge 'jdk:master' into JDK-8312425 > - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF The final thing we need to resolve properly is the SVE compiler test. @theRealAph says: > arm_sve.h is part of GCC. It was added to GCC in 2019. A more relevant question is in which version of gcc it was added, and whether that also implies that the compiler knows about `-march=armv8-a+sve`. If so, then this test could basically be framed as a gcc version check. I'm still leaning towards failing configure if the SVE code cannot be compiled. Under what circumstances can this test possibly fail, so that SVE_CFLAGS would not be set? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1836444674 From aph at openjdk.org Fri Dec 1 16:52:14 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 1 Dec 2023 16:52:14 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: <3cK9QjVQNIgZoWWhrWKEb3XxfbLjprjRMBbStWegH7M=.6df92651-b97d-445a-aa42-302ea791bbea@github.com> References: <3cK9QjVQNIgZoWWhrWKEb3XxfbLjprjRMBbStWegH7M=.6df92651-b97d-445a-aa42-302ea791bbea@github.com> Message-ID: On Fri, 1 Dec 2023 10:19:01 GMT, Andrew Haley wrote: >> Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains ten additional commits since the last revision: >> >> - Separate neon and sve functions into two source files >> - Merge branch 'jdk:master' into JDK-8312425 >> - Rename vmath to sleef in configure >> - Address review comments in build system >> - Add a bundled native lib in jdk as a bridge to libsleef >> - Merge 'jdk:master' into JDK-8312425 >> - Disable sleef by default >> - Merge 'jdk:master' into JDK-8312425 >> - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF > > Oh, and: > > If we can't trust SLEEF not to break the ABI we're using, we should not be using SLEEF. > @theRealAph You are making good points. > > You are basically saying: "we don't need any configure support for libsleef, we can just hard-code the names and dispatch to them directly to a dlopened library at runtime". Yep. > That is technically correct, but I'd still like to argue that the current setup have some merits: > > * It will check that there is no typo in the function names. I agree that the likelihood of getting this wrong is low, but it is still a good practice to use official include files to have the compiler help checking this. > > * If we would like to bundle libsleef.so with the JVM, now we have the possibility do do so easily. (Especially if it is like you say that it is not commonly installed). (If licenses allow etc) > > * If we want to incorporate/bundle the source code of libsleef into OpenJDK, and build it as part of our internal library, we will have a good starting position, compared to starting from a hard-coded assembly file in hotspot. (I thought I heard some noise about this prospect). > > > So at this point, I am okay with the general approach of this PR. There are still some build issues to sort out, though, I'll address them separately. I see, OK. The question in my mind is whether the common builds of OpenJDK (Oracle, Adoptium, etc.) will support running with SLEEF. 
If by default SLEEF is not required, support won't be built, and (to an nth order approximation) no one will use it. But I guess it's better than nothing. Or is there likely to be a plan to e.g. build Oracle's releases with SLEEF support? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1836449876 From duke at openjdk.org Fri Dec 1 16:58:21 2023 From: duke at openjdk.org (Lei Zaakjyu) Date: Fri, 1 Dec 2023 16:58:21 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap Message-ID: 8234502: Merge GenCollectedHeap and SerialHeap ------------- Commit messages: - merge 'CollectedHeap' and 'SerialHeap' Changes: https://git.openjdk.org/jdk/pull/16927/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8234502 Stats: 2908 lines in 15 files changed: 1431 ins; 1463 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/16927.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16927/head:pull/16927 PR: https://git.openjdk.org/jdk/pull/16927 From duke at openjdk.org Fri Dec 1 17:01:07 2023 From: duke at openjdk.org (Lei Zaakjyu) Date: Fri, 1 Dec 2023 17:01:07 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 16:47:28 GMT, Lei Zaakjyu wrote: > 8234502: Merge GenCollectedHeap and SerialHeap I wonder if the files in 'src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/' which contain the java class 'GenCollectedHeap' should also be modified in this pr. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16927#issuecomment-1836462105 From mli at openjdk.org Fri Dec 1 17:04:12 2023 From: mli at openjdk.org (Hamlin Li) Date: Fri, 1 Dec 2023 17:04:12 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v7] In-Reply-To: <3qPd3ZhLhbYsRHcljYxk_m3C-4CmwBGyUADUAil64a8=.a15fd7e3-1c9b-4be1-b67c-24967427ef5f@github.com> References: <3qPd3ZhLhbYsRHcljYxk_m3C-4CmwBGyUADUAil64a8=.a15fd7e3-1c9b-4be1-b67c-24967427ef5f@github.com> Message-ID: On Fri, 1 Dec 2023 07:49:48 GMT, Yuri Gaevsky wrote: >> src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1479: >> >>> 1477: case T_SHORT: BLOCK_COMMENT("arrays_hashcode(short) {"); break; >>> 1478: case T_INT: BLOCK_COMMENT("arrays_hashcode(int) {"); break; >>> 1479: default: BLOCK_COMMENT("arrays_hashcode {"); break; >> >> Is this `BLOCK_COMMENT("arrays_hashcode {"); break;` necessary? > > I have just borrowed that part of the code from the x86 counterpart: > https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp#L3354 > It is dead code, so 'ShouldNotReachHere();' looks more appropriate here. Do you think we should fix this as part of this patch or as a follow-up for both x86 and RISC-V? Yes, as we already have `ShouldNotReachHere();` to guard the default case. What do you think?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1412366205 From egahlin at openjdk.org Fri Dec 1 17:49:06 2023 From: egahlin at openjdk.org (Erik Gahlin) Date: Fri, 1 Dec 2023 17:49:06 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 09:05:22 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. >> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set are placed in the HS dlopen_helper, where the NativeLibraryLoad event objects are already created/committed. > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust macOS coding I'm not sure I understand the issue, but adding a field to an event because of a GCC bug seems excessive. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1836529303 From uschindler at openjdk.org Fri Dec 1 18:02:18 2023 From: uschindler at openjdk.org (Uwe Schindler) Date: Fri, 1 Dec 2023 18:02:18 GMT Subject: RFR: 8310644: Make panama memory segment close use async handshakes [v5] In-Reply-To: References: Message-ID: On Fri, 24 Nov 2023 18:30:17 GMT, Erik Österlund wrote: >> The current logic for closing memory in panama today is susceptible to live lock if we have a closing thread that wants to close the memory in a loop that keeps failing, and a bunch of accessing threads that want to perform accesses as long as the memory is alive.
They can both create impediments for the other. >> >> By using asynchronous handshakes to install an exception onto threads that are in @Scoped memory accesses, we can have close always succeed, and the accessing threads bail out. The idea is that we perform a synchronous handshake first to find threads that are in scoped methods. They might however be in the middle of throwing an exception or something wild like that, where an exception can't be delivered. We install an async handshake that will roll us forward to the first place where we can indeed install exceptions, then we reevaluate if we still need to do that, or if we have unwound out from the scoped method. If we are still inside of it, we ensure an exception is installed so we don't continue executing bytecodes that might access the memory that we have freed. >> >> Tested tier 1-5 as well as running test/jdk/java/foreign/TestHandshake.java hundreds of times, which tests this API pretty well. > > Erik Österlund has updated the pull request incrementally with two additional commits since the last revision: > > - Merge pull request #3 from JornVernee/PR_async_close+NoToNativeTrans > > - don't transition to native state on Unsafe_CopySwapMemory0 I can confirm: This works fine with Lucene! The isAlive state is visible in other threads and closing the Arena's scope can no longer throw IllegalStateException.
See comment here: https://github.com/apache/lucene/pull/12706#issuecomment-1836540146 ------------- PR Comment: https://git.openjdk.org/jdk/pull/16792#issuecomment-1836546089 From ayang at openjdk.org Fri Dec 1 18:37:04 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 1 Dec 2023 18:37:04 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 16:58:30 GMT, Lei Zaakjyu wrote: > I wonder if the files in 'src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/' which contain the java class 'GenCollectedHeap' should also be modified in this pr. They should reflect the actual type inside VM. Some can probably be done before this PR though, e.g. `class PointerFinder` -- it's unclear why we report extra location info for SerialGC only. This should be enough for all GCs. (I don't think the additional info is that useful.) if (heap.isIn(a)) { loc.heap = heap; return loc; } (There may be more examples that can be done before this large PR.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/16927#issuecomment-1836592457 From rehn at openjdk.org Fri Dec 1 18:49:20 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 1 Dec 2023 18:49:20 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v2] In-Reply-To: <5ydUXSyM7-XcGRH86bvVH4LJM94sAY7rahyUeqcrkBk=.e237d328-06f4-4919-af88-ea6f56d0b202@github.com> References: <9x5sC6aXWG2OUYXdS97o-fJgjhNODf-mVC69bQNSSjI=.6425f2fc-d793-4b49-bf97-1ea55d0fd443@github.com> <_LkvimbOKKuIZon0Ajv9lKReO19xQjFI2VH2b4hsCE4=.89f5725a-150c-4a03-a6c2-a71a2f5fe3b6@github.com> <1rTN32en51Pjpr-mdaDjw3UzQnf7W4J8JQTf-CMG04s=.904657b9-7a3a-46e3-8936-cf0f16b5c7b9@github.com> <5ydUXSyM7-XcGRH86bvVH4LJM94sAY7rahyUeqcrkBk=.e237d328-06f4-4919-af88-ea6f56d0b202@github.com> Message-ID: On Tue, 21 Nov 2023 10:57:35 GMT, Robbin Ehn wrote: >> Thanks. Now I see what you mean. That makes sense to me. It will be interesting to see how the performance numbers may vary. 
Unfortunately, I don't have access to the hardware yet. > > We don't have such hardware either; we simulate via gem5. > Ventana v2 should have a 15-wide pipeline with RVV 1.0; who knows how this will execute on it :) > > As we don't know, I think you are correct that we should write the most readable version first. > And later we can adapt these for the hwprobe triplet of vendor/arch/impl if we think that it's worth it. Not yet addressed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1407833282 From rehn at openjdk.org Fri Dec 1 18:49:55 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 1 Dec 2023 18:49:55 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v2] In-Reply-To: References: Message-ID: On Wed, 15 Nov 2023 07:36:54 GMT, Fei Yang wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: >> >> - Flag fixes >> - Merge branch 'master' into sha256 >> - Share code >> - SHA-2 > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4146: > >> 4144: Register ofs = c_rarg2; >> 4145: Register limit = c_rarg3; >> 4146: Register consts = t0; > > Similar here. Please consider using `t1` instead for `consts`. Used t2 with spill.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1407833049 From fparain at openjdk.org Fri Dec 1 19:37:32 2023 From: fparain at openjdk.org (Frederic Parain) Date: Fri, 1 Dec 2023 19:37:32 GMT Subject: RFR: 8320275: assert(_chunk->bitmap().at(index)) failed: Bit not set at index [v3] In-Reply-To: <8uEcqzVz-sUB1NACfJnQ2c1s3Vxjf0d-V5Upwgi703o=.c9b8c675-f3e5-4a2d-94a0-64e9d5613bf4@github.com> References: <8uEcqzVz-sUB1NACfJnQ2c1s3Vxjf0d-V5Upwgi703o=.c9b8c675-f3e5-4a2d-94a0-64e9d5613bf4@github.com> Message-ID: <3l_lV16uKxsDEYIbfJzOp8o80di0FOHdgEdLKXWNt2k=.f7a6623e-523b-48f4-9513-65207088e8f7@github.com> On Thu, 30 Nov 2023 15:26:22 GMT, Patricio Chilano Mateo wrote: >> Please review the following fix. The assert fails while verifying the top frame of the stackChunk before returning from a thaw call. The stackChunk is in gc mode but we found a narrow oop for this c2 compiled frame that doesn't have its corresponding bit set. This is because while thawing its callee we cleared the bitmap range associated with the argument area, but this narrow oop happens to land at the very last stack slot of that region. >> Loom code assumes the size of the argument area is always a multiple of 2 stack slots, as SharedRuntime::java_calling_convention() shows. But c2 doesn't seem to follow this convention and, knowing the last passed argument only takes one stack slot, it's using the remaining space to store a narrow oop for the caller. There are more details about the specific crash in JBS. >> >> The initial proposed fix is to just restrict the range of the bitmap we clear by excluding the last stack slot of the argument area, since passed oops are always word aligned. I've also experimented with a patch where I changed SharedRuntime::java_calling_convention() and Fingerprinter::do_type_calling_convention() to not round up the number of stack slots used, and then changed the callers to use a round up value or not depending on the needs [1]. 
I wasn't convinced it was worthy given we only care about this difference in this Loom code, but I don't mind going with that fix instead. The 3rd alternative would be to just change c2 to not use this stack slot and start spilling at a word aligned offset from the sp. >> >> I run the patch with the failing test and verified the crash doesn't reproduce anymore. I've also run this patch through loom tiers1-5. >> >> Thanks, >> Patricio >> >> [1] https://github.com/pchilano/jdk/commit/42ae9269b28beb6f36c502182116545b680e418f > > Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: > > add comment in clear_bitmap_bits() LGTM Thank you for having explored the different options to fix this bug. ------------- Marked as reviewed by fparain (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16837#pullrequestreview-1760449408 From mli at openjdk.org Fri Dec 1 19:54:39 2023 From: mli at openjdk.org (Hamlin Li) Date: Fri, 1 Dec 2023 19:54:39 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v2] In-Reply-To: References: Message-ID: On Tue, 28 Nov 2023 14:20:40 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: > > - Flag fixes > - Merge branch 'master' into sha256 > - Share code > - SHA-2 Thanks for updating. Please check the initial comments below. 
src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 1359: > 1357: } > 1358: > 1359: inline void vmsltu_vi(VectorRegister Vd, VectorRegister Vs2, int32_t imm, VectorMask vm = unmasked) { It seems this function is not used in the code? Also, when `imm == 0`, it seems it will produce an unexpected value? src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 1363: > 1361: } > 1362: > 1363: inline void vmsgeu_vi(VectorRegister Vd, VectorRegister Vs2, int32_t imm, VectorMask vm = unmasked) { Same comments as for `vmsltu_vi` above. src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3672: > 3670: Sha2Generator(MacroAssembler* masm, StubCodeGenerator* cgen) : MacroAssembler(masm->code()), _cgen(cgen) {} > 3671: address generate_sha256_implCompress(bool multi_block) { > 3672: return generate_sha2_implCompress(multi_block); Not sure if we should use a template or just add a parameter here; the latter seems to achieve the same effect without bloating the code. src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3681: > 3679: template > 3680: void vl1reXX_v(VectorRegister vr, Register sr) { > 3681: if (T == Assembler::e32) __ vl1re32_v(vr, sr); Related to the comment above about template usage: a parameter seems more suitable here, and maybe also in some other places below. src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3895: > 3893: __ enter(); > 3894: > 3895: __ push_reg(saved_regs, sp); Not sure if we need to push and pop `saved_regs`, as t2 is the only register in it, or am I missing something? src/hotspot/cpu/riscv/vm_version_riscv.cpp line 159: > 157: } > 158: > 159: if (UseZvkn) { Maybe it's safe to move the code behind `#endif // COMPILER2` at line 291, as it depends on UseRVV.
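To make the `imm == 0` concern about `vmsltu_vi` concrete, here is a scalar sketch of a single 32-bit lane. It assumes the helper is expanded the way the RVV pseudo-instruction is usually described, i.e. "less than unsigned `imm`" emitted as "less than or equal unsigned `imm - 1`"; the helper body is not shown in this thread, so that expansion is an assumption, and the function names below are hypothetical.

```cpp
#include <cstdint>

// Scalar model of one 32-bit vector lane. Hypothetical names, not HotSpot code.

// What "x <u imm" is expected to mean for an unsigned compare.
bool ltu_expected(uint32_t x, int32_t imm) {
  return x < static_cast<uint32_t>(imm);
}

// Assumed expansion: "x <=u (imm - 1)". The immediate is sign-extended and
// then compared as unsigned, so imm == 0 turns into a compare against the
// maximum unsigned value.
bool ltu_via_leu(uint32_t x, int32_t imm) {
  return x <= static_cast<uint32_t>(imm - 1);
}
```

Under that assumption the two forms agree for positive immediates but diverge at zero: nothing is unsigned-less-than 0, yet every value is less than or equal to the wrapped-around maximum. The mirrored `vmsgeu_vi` case would presumably diverge in the opposite direction.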
------------- PR Review: https://git.openjdk.org/jdk/pull/16562#pullrequestreview-1760417335 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1412479561 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1412479867 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1412501848 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1412502971 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1412509509 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1412487773 From dcubed at openjdk.org Fri Dec 1 20:15:57 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Fri, 1 Dec 2023 20:15:57 GMT Subject: RFR: 8321066: Multiple JFR tests have started failing In-Reply-To: References: Message-ID: On Thu, 30 Nov 2023 15:28:51 GMT, Erik Österlund wrote: >> Before integrating https://bugs.openjdk.org/browse/JDK-8310644 we added a seemingly innocent NoSafepointVerifier in some code that really shouldn't safepoint and reran tier1 only. >> >> What nobody anticipated is that the JFR dumping during crash reporting code probably introduced by https://bugs.openjdk.org/browse/JDK-8233706 performs safepoint polls from inside the crash reporter. These JFR tests try to provoke a crash and check that the JFR recording gets dumped. But we crash during crash reporting in debug builds, because the NSV doesn't like the safepoint polls inside the crash reporter. >> >> Now while this crash reporting code can seemingly make any NSV in the JVM fail if you get a crash there, and even worse, accept safepoints and do GC while crash reporting from totally safepoint unsafe code and what not, this change merely removes the new NSV from the fix that introduced the test failures. But maybe going forward we shouldn't poll for safepoints in the crash reporter. >> >> I tested that the reported test failures fail deterministically without this patch and do not fail with this patch.
I also re-ran tier1 just to be safe. > > Since this is causing a bit of a christmas tree in the CI and christmas shouldn't come quite yet, I'm going to go ahead and /integrate @fisk - Has a follow-up bug been filed for the fact that we're polling for safepoints in the crash reporting code? I don't see a link in [JDK-8321066](https://bugs.openjdk.org/browse/JDK-8321066) Multiple JFR tests have started failing ------------- PR Comment: https://git.openjdk.org/jdk/pull/16900#issuecomment-1836717912 From amenkov at openjdk.org Fri Dec 1 20:16:47 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Fri, 1 Dec 2023 20:16:47 GMT Subject: RFR: 8308614: Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 [v5] In-Reply-To: References: Message-ID: On Thu, 30 Nov 2023 21:27:27 GMT, Serguei Spitsyn wrote: >> This is a fix of a performance/scalability related issue. The `JvmtiThreadState` objects for virtual thread filtered events enabled globally are created eagerly because it is needed when the `interp_only_mode` is enabled. Otherwise, some events which are generated in `interp_only_mode` from the debugging version of interpreter chunks can be missed. >> However, it has to be okay to avoid eager creation of these object if no `interp_only_mode` has ever been requested. >> It seems to be an extremely important optimization to create JvmtiThreadState objects lazily in such cases. >> It is done by introducing the flag `JvmtiThreadState::_seen_interp_only_mode` which indicates when the `JvmtiThreadState` objects have to be created eagerly. 
>> >> Additionally, the fix includes the following related changes: >> - Use condition double checking idiom for `MutexLocker mu(JvmtiThreadState_lock)` in the function `JvmtiVTMSTransitionDisabler::VTMS_mount_end` which is on a performance-critical path and looks like this: >> >> JvmtiThreadState* state = thread->jvmti_thread_state(); >> if (state != nullptr && state->is_pending_interp_only_mode()) { >> MutexLocker mu(JvmtiThreadState_lock); >> state = thread->jvmti_thread_state(); >> if (state != nullptr && state->is_pending_interp_only_mode()) { >> JvmtiEventController::enter_interp_only_mode(); >> } >> } >> >> >> - Add extra check of `JvmtiExport::can_support_virtual_threads()` when virtual thread mount and unmount are posted. >> - Minor: Added a `ThreadsListHandle` to the `JvmtiEventControllerPrivate::enter_interp_only_mode`. It is needed because of the dynamic creation of compensating carrier threads which is racy for JVMTI `SetEventNotificationMode` implementation. >> >> Performance mesurements: >> - Without this fix the test provided by the bug submitter gives execution numbers: >> - no ClassLoad events enabled: 3251 ms >> - ClassLoad events are enabled: 40534 ms >> >> - With the fix: >> - no ClassLoad events enabled: 3270 ms >> - ClassLoad events are enabled: 3385 ms >> >> Testing: >> - Ran mach5 tiers 1-6, no regressions are noticed > > Serguei Spitsyn has updated the pull request incrementally with two additional commits since the last revision: > > - review: one more minor correction of a comment > - review: minor correction of a comment Marked as reviewed by amenkov (Reviewer). 
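For readers unfamiliar with the double-checked condition quoted above for `VTMS_mount_end`, the pattern can be sketched in standalone C++. This uses `std::mutex` and `std::atomic` rather than HotSpot's `MutexLocker` and `JvmtiThreadState`; `ThreadState`, `state_lock` and `mount_end` are hypothetical stand-ins, so treat it as an illustration of the idiom only.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical stand-ins for illustration; not HotSpot types.
struct ThreadState {
  std::atomic<bool> pending{false};  // models is_pending_interp_only_mode()
  int transitions = 0;               // only modified while state_lock is held
};

std::mutex state_lock;

// Fast path: check the flag without taking the lock. Only when the flag looks
// set do we pay for the lock, and then re-check, because another thread may
// have handled the request between the first check and lock acquisition.
void mount_end(ThreadState* state) {
  if (state != nullptr && state->pending.load(std::memory_order_acquire)) {
    std::lock_guard<std::mutex> guard(state_lock);
    if (state->pending.load(std::memory_order_relaxed)) {
      state->transitions++;  // models enter_interp_only_mode()
      state->pending.store(false, std::memory_order_release);
    }
  }
}
```

The benefit on a performance-critical path is that the common case, with no pending request, never touches the lock.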
------------- PR Review: https://git.openjdk.org/jdk/pull/16686#pullrequestreview-1760508007 From duke at openjdk.org Fri Dec 1 20:28:39 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Fri, 1 Dec 2023 20:28:39 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v7] In-Reply-To: References: <3qPd3ZhLhbYsRHcljYxk_m3C-4CmwBGyUADUAil64a8=.a15fd7e3-1c9b-4be1-b67c-24967427ef5f@github.com> Message-ID: <99A-EjDHpG4MZGOW8F-mOZROA5lZ_HEKXEHOxwZqtnA=.cf248b3d-86b6-4195-a699-718a0987f5b2@github.com> On Fri, 1 Dec 2023 17:00:53 GMT, Hamlin Li wrote: >> I have just borrowed that part of the code from the x86 counterpart: >> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp#L3354 >> It is dead code, so 'ShouldNotReachHere();' looks more appropriate here. Do you think we should fix this as part of this patch or as a follow-up for both x86 and RISC-V? > > Yes, as we already have `ShouldNotReachHere();` to guard the default case. What do you think? Oops, somehow I completely missed that ShouldNotReachHere() is already in place, will fix that shortly. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1412544588 From pchilanomate at openjdk.org Fri Dec 1 20:43:35 2023 From: pchilanomate at openjdk.org (Patricio Chilano Mateo) Date: Fri, 1 Dec 2023 20:43:35 GMT Subject: RFR: 8320275: assert(_chunk->bitmap().at(index)) failed: Bit not set at index [v3] In-Reply-To: References: <8uEcqzVz-sUB1NACfJnQ2c1s3Vxjf0d-V5Upwgi703o=.c9b8c675-f3e5-4a2d-94a0-64e9d5613bf4@github.com> Message-ID: On Thu, 30 Nov 2023 20:27:35 GMT, Dean Long wrote: >> Patricio Chilano Mateo has updated the pull request incrementally with one additional commit since the last revision: >> >> add comment in clear_bitmap_bits() > > I would be tempted to put the round up in `compiled_frame_stack_argsize`, but it's not a big deal. Thanks for the reviews @dean-long and @fparain!
I want to give it a couple more rounds of testing in the upper tiers. Since I'll be out on vacation after today that means I'll integrate this once I'm back. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16837#issuecomment-1836748002 From sspitsyn at openjdk.org Fri Dec 1 20:57:50 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 1 Dec 2023 20:57:50 GMT Subject: RFR: 8308614: Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 [v5] In-Reply-To: References: Message-ID: On Thu, 30 Nov 2023 21:27:27 GMT, Serguei Spitsyn wrote: >> This is a fix of a performance/scalability related issue. The `JvmtiThreadState` objects for virtual thread filtered events enabled globally are created eagerly because it is needed when the `interp_only_mode` is enabled. Otherwise, some events which are generated in `interp_only_mode` from the debugging version of interpreter chunks can be missed. >> However, it has to be okay to avoid eager creation of these object if no `interp_only_mode` has ever been requested. >> It seems to be an extremely important optimization to create JvmtiThreadState objects lazily in such cases. >> It is done by introducing the flag `JvmtiThreadState::_seen_interp_only_mode` which indicates when the `JvmtiThreadState` objects have to be created eagerly. 
>> >> Additionally, the fix includes the following related changes: >> - Use condition double checking idiom for `MutexLocker mu(JvmtiThreadState_lock)` in the function `JvmtiVTMSTransitionDisabler::VTMS_mount_end` which is on a performance-critical path and looks like this: >> >> JvmtiThreadState* state = thread->jvmti_thread_state(); >> if (state != nullptr && state->is_pending_interp_only_mode()) { >> MutexLocker mu(JvmtiThreadState_lock); >> state = thread->jvmti_thread_state(); >> if (state != nullptr && state->is_pending_interp_only_mode()) { >> JvmtiEventController::enter_interp_only_mode(); >> } >> } >> >> >> - Add extra check of `JvmtiExport::can_support_virtual_threads()` when virtual thread mount and unmount are posted. >> - Minor: Added a `ThreadsListHandle` to the `JvmtiEventControllerPrivate::enter_interp_only_mode`. It is needed because of the dynamic creation of compensating carrier threads which is racy for JVMTI `SetEventNotificationMode` implementation. >> >> Performance mesurements: >> - Without this fix the test provided by the bug submitter gives execution numbers: >> - no ClassLoad events enabled: 3251 ms >> - ClassLoad events are enabled: 40534 ms >> >> - With the fix: >> - no ClassLoad events enabled: 3270 ms >> - ClassLoad events are enabled: 3385 ms >> >> Testing: >> - Ran mach5 tiers 1-6, no regressions are noticed > > Serguei Spitsyn has updated the pull request incrementally with two additional commits since the last revision: > > - review: one more minor correction of a comment > - review: minor correction of a comment Chris and Alex, thank you for review! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16686#issuecomment-1836761160 From sspitsyn at openjdk.org Fri Dec 1 20:57:52 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 1 Dec 2023 20:57:52 GMT Subject: Integrated: 8308614: Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 In-Reply-To: References: Message-ID: On Thu, 16 Nov 2023 11:15:27 GMT, Serguei Spitsyn wrote: > This is a fix of a performance/scalability related issue. The `JvmtiThreadState` objects for virtual thread filtered events enabled globally are created eagerly because it is needed when the `interp_only_mode` is enabled. Otherwise, some events which are generated in `interp_only_mode` from the debugging version of interpreter chunks can be missed. > However, it has to be okay to avoid eager creation of these object if no `interp_only_mode` has ever been requested. > It seems to be an extremely important optimization to create JvmtiThreadState objects lazily in such cases. > It is done by introducing the flag `JvmtiThreadState::_seen_interp_only_mode` which indicates when the `JvmtiThreadState` objects have to be created eagerly. > > Additionally, the fix includes the following related changes: > - Use condition double checking idiom for `MutexLocker mu(JvmtiThreadState_lock)` in the function `JvmtiVTMSTransitionDisabler::VTMS_mount_end` which is on a performance-critical path and looks like this: > > JvmtiThreadState* state = thread->jvmti_thread_state(); > if (state != nullptr && state->is_pending_interp_only_mode()) { > MutexLocker mu(JvmtiThreadState_lock); > state = thread->jvmti_thread_state(); > if (state != nullptr && state->is_pending_interp_only_mode()) { > JvmtiEventController::enter_interp_only_mode(); > } > } > > > - Add extra check of `JvmtiExport::can_support_virtual_threads()` when virtual thread mount and unmount are posted. > - Minor: Added a `ThreadsListHandle` to the `JvmtiEventControllerPrivate::enter_interp_only_mode`. 
It is needed because of the dynamic creation of compensating carrier threads which is racy for JVMTI `SetEventNotificationMode` implementation. > > Performance mesurements: > - Without this fix the test provided by the bug submitter gives execution numbers: > - no ClassLoad events enabled: 3251 ms > - ClassLoad events are enabled: 40534 ms > > - With the fix: > - no ClassLoad events enabled: 3270 ms > - ClassLoad events are enabled: 3385 ms > > Testing: > - Ran mach5 tiers 1-6, no regressions are noticed This pull request has now been integrated. Changeset: 42af8ce1 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/42af8ce1f6605376fdb69e03df9e22381a54fc36 Stats: 24 lines in 2 files changed: 13 ins; 2 del; 9 mod 8308614: Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 Reviewed-by: dcubed, cjplummer, amenkov ------------- PR: https://git.openjdk.org/jdk/pull/16686 From manc at openjdk.org Fri Dec 1 21:04:51 2023 From: manc at openjdk.org (Man Cao) Date: Fri, 1 Dec 2023 21:04:51 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v48] In-Reply-To: References: <_lEBVrWV8wrVbmhOiu3AAqPJo_xBs718ZtA9V-VSzGM=.253c0ec8-256e-4dee-b125-90be6338e4b8@github.com> Message-ID: On Fri, 1 Dec 2023 10:21:31 GMT, Stefan Johansson wrote: >> Right, see @simonis 's comments at https://github.com/openjdk/jdk/pull/15082#pullrequestreview-1613868256, https://github.com/openjdk/jdk/pull/15082#discussion_r1321703912. >> >> I initially had similar thought that `gc_total` isn't necessary and provides redundant data. Now I agree with @simonis that the `gc_total` is valuable to users. It saves users from extra work of aggregating different sets of counters for different garbage collectors, and potential mistakes of missing some counters. It is also future-proof that if GC implementation changes that add additional threads, users wouldn't need to change their code to add the counter for additional threads. 
>> >> I think the maintenance overhead is quite small for `gc_total` since it is mostly in this class. The benefit to users is worth it. > > I agree that the counter is valuable if always up-to-date, but if it is out of sync compared to the "concurrent counters" I think it will confuse some users. So if we want to keep it I think we should try to keep it in sync. > > I suppose adding a lock for updating `gc_total` should be ok. In the safepoint case we should never contend on that lock and for the concurrent updates it should not be a big deal. Basically what I think would be to change `update_counter(...)` to do something like: > > if (CPUTimeGroups::is_gc_counter(name)) { > > instance->get_counter(CPUTimeGroups::CPUTimeType::gc_total)->inc(net_cpu_time); > } > > > This way we would also be able to remove the publish part above, right? Any other problems with this approach? I think the ideal approach to simplify this is to support Atomic operation on a `PerfCounter`. We could either introduce a `PerfAtomicCounter`/`PerfAtomicLongCounter` class, or perform `Atomic::add()` on the `PerfData::_valuep` pointer. There's already `PerfData::get_address()`, so we might be able to do: Atomic::add((volatile jlong *)(instance->get_counter(CPUTimeGroups::CPUTimeType::gc_total)->get_address()), net_cpu_time); However, a new class `PerfAtomicCounter` is likely cleaner. E.g., we may also want to make `PerfAtomicCounter::sample()` use a CAS. It is probably better to introduce `PerfAtomicCounter` in a separate RFE later. Would the `Atomic::add()` with `PerfData::get_address()` approach be OK for now, or would we rather introduce a lock, or leave the `gc_total` mechanism as-is and address the out-of-sync-ness in a follow-up RFE? IMO the out-of-sync-ness problem is minor, because users are likely to either look at a single `gc_total` counter, or look at each individual GC CPU counter and disregard `gc_total`. 
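As a rough illustration of the atomic-add option weighed above, here is a minimal sketch using plain `std::atomic` instead of HotSpot's `Atomic::add()` and `PerfData`. `CpuTimeCounters`, `add_gc_time` and `run_demo` are hypothetical names, and the two group counters are placeholders rather than the actual hsperf counters.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for the per-group CPU time counters; not HotSpot code.
struct CpuTimeCounters {
  std::atomic<int64_t> gc_parallel_workers{0};
  std::atomic<int64_t> gc_conc_mark{0};
  std::atomic<int64_t> gc_total{0};

  // Adds the net CPU time to one group counter and to the aggregate.
  // Each add is atomic, so no increment is lost without a lock. The pair is
  // not a single transaction, though, so a concurrent reader can briefly
  // observe the group counter and gc_total out of sync.
  void add_gc_time(std::atomic<int64_t>& group, int64_t net_cpu_time) {
    group.fetch_add(net_cpu_time, std::memory_order_relaxed);
    gc_total.fetch_add(net_cpu_time, std::memory_order_relaxed);
  }
};

// Runs several updater threads concurrently and returns the final gc_total.
int64_t run_demo(int threads, int updates_per_thread, int64_t delta) {
  CpuTimeCounters counters;
  std::vector<std::thread> pool;
  for (int t = 0; t < threads; ++t) {
    pool.emplace_back([&counters, t, updates_per_thread, delta] {
      std::atomic<int64_t>& group =
          (t % 2 == 0) ? counters.gc_parallel_workers : counters.gc_conc_mark;
      for (int i = 0; i < updates_per_thread; ++i) {
        counters.add_gc_time(group, delta);
      }
    });
  }
  for (auto& th : pool) th.join();
  return counters.gc_total.load();
}
```

Compared with a lock around the aggregate update, this keeps the concurrent path cheap, at the cost of the transient skew noted in the comment; whether that skew is acceptable is exactly the open question in this thread.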
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1412569208 From cjplummer at openjdk.org Fri Dec 1 21:13:37 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 1 Dec 2023 21:13:37 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v5] In-Reply-To: <81dXSHvLQMGj3s1BcBs8fmJUEoJpaU-5wBRSIjnztMM=.d53f8a2f-8353-49ec-8a9b-695b32f03d20@github.com> References: <81dXSHvLQMGj3s1BcBs8fmJUEoJpaU-5wBRSIjnztMM=.d53f8a2f-8353-49ec-8a9b-695b32f03d20@github.com> Message-ID: On Tue, 28 Nov 2023 23:25:27 GMT, Yi-Fan Tsai wrote: >> `jcmd Compiler.perfmap` uses the hard-coded file name for a perf map: `/tmp/perf-%d.map`. This change adds an optional argument for specifying a file name. >> >> `jcmd PID help Compiler.perfmap` shows the following usage. >> >> >> Compiler.perfmap >> Write map file for Linux perf tool. >> >> Impact: Low >> >> Syntax : Compiler.perfmap [] >> >> Arguments: >> filename : [optional] Name of the map file (STRING, no default value) >> >> >> The man page of jcmd will be updated in a separate PR. > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Apply man changes @yftsai Can you update the man page output in the PR description? Thanks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15871#issuecomment-1836781584 From cjplummer at openjdk.org Fri Dec 1 21:34:39 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Fri, 1 Dec 2023 21:34:39 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v5] In-Reply-To: <81dXSHvLQMGj3s1BcBs8fmJUEoJpaU-5wBRSIjnztMM=.d53f8a2f-8353-49ec-8a9b-695b32f03d20@github.com> References: <81dXSHvLQMGj3s1BcBs8fmJUEoJpaU-5wBRSIjnztMM=.d53f8a2f-8353-49ec-8a9b-695b32f03d20@github.com> Message-ID: On Tue, 28 Nov 2023 23:25:27 GMT, Yi-Fan Tsai wrote: >> `jcmd Compiler.perfmap` uses the hard-coded file name for a perf map: `/tmp/perf-%d.map`. 
This change adds an optional argument for specifying a file name. >> >> `jcmd PID help Compiler.perfmap` shows the following usage. >> >> >> Compiler.perfmap >> Write map file for Linux perf tool. >> >> Impact: Low >> >> Syntax : Compiler.perfmap [] >> >> Arguments: >> filename : [optional] Name of the map file (STRING, no default value) >> >> >> The man page of jcmd will be updated in a separate PR. > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Apply man changes src/hotspot/share/code/codeCache.cpp line 1809: > 1807: } > 1808: > 1809: void CodeCache::write_perf_map(const char* filename) { Why not have a `filename == nullptr` indicate that the default should be used. Then you don't need CodeCache::DefaultPerfMapFile. You can just have a private `CodeCache::defaultPerfmapFileName()` method. src/hotspot/share/code/codeCache.hpp line 232: > 230: const char* name() const { return _name; } > 231: };) > 232: Multi-line sections like this should really use `#ifdef LINUX`. src/hotspot/share/runtime/java.cpp line 507: > 505: if (DumpPerfMapAtExit) { > 506: CodeCache::DefaultPerfMapFile file; > 507: CodeCache::write_perf_map(file.name()); It's a bit inconsistent to support a user provided filename for the dcmd but not when using `DumpPerfMapAtExit`. Perhaps add `PerfMapFilename`. I'm not insisting, but something to consider. Would be good for the compiler team to comment on this. 
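
The first suggestion above (treat `filename == nullptr` as "use the default") can be sketched in standalone C++ as follows. This is only an illustration under assumptions: `default_perf_map_file_name` and `resolve_perf_map_file_name` are hypothetical helpers, the pid is passed in explicitly instead of being queried from the OS, and only the `/tmp/perf-%d.map` pattern comes from the PR description.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical private helper: builds the default name from the pattern
// quoted in the PR description (/tmp/perf-%d.map).
std::string default_perf_map_file_name(int pid) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "/tmp/perf-%d.map", pid);
  return std::string(buf);
}

// A null filename means "no name given by the caller", so fall back to the
// default; any non-null name is used as-is.
std::string resolve_perf_map_file_name(const char* filename, int pid) {
  return filename != nullptr ? std::string(filename)
                             : default_perf_map_file_name(pid);
}
```

With this shape, both the dcmd path and an at-exit dump could call the same function and pass `nullptr` when no name was supplied, which is what makes a separate default-file holder type unnecessary.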
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15871#discussion_r1412585004 PR Review Comment: https://git.openjdk.org/jdk/pull/15871#discussion_r1412579044 PR Review Comment: https://git.openjdk.org/jdk/pull/15871#discussion_r1412582441 From matsaave at openjdk.org Fri Dec 1 22:17:03 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Fri, 1 Dec 2023 22:17:03 GMT Subject: RFR: 8320935: Move CDS config initialization code to cdsConfig.cpp In-Reply-To: References: Message-ID: On Tue, 28 Nov 2023 23:24:53 GMT, Ioi Lam wrote: > This is a simple clean up that moves the code for initializing the CDS config states from arguments.cpp to cdsConfig.cpp > > I renamed a few functions, but otherwise the code is unchanged. > > - `get_default_shared_archive_path()` -> `default_archive_path()` > - `GetSharedArchivePath()` -> `static_archive_path()` > - `GetSharedDynamicArchivePath()` -> `dynamic_archive_path()` > > There's also less `#if INCLUDE_CDS` since the entire cdsConfig.cpp file is compiled only if CDS is enabled. This looks good! I think this is a good opportunity to refactor some of the code for better readability so I left some comments below. src/hotspot/share/cds/cdsConfig.cpp line 101: > 99: void CDSConfig::extract_shared_archive_paths(const char* archive_path, > 100: char** base_archive_path, > 101: char** top_archive_path) { Could you align these arguments? src/hotspot/share/cds/cdsConfig.cpp line 125: > 123: } > 124: > 125: void CDSConfig::init_shared_archive_paths() { Now that I see this there is a lot of indentation thanks to the nested conditionals. I don't have much to offer but is there a cleaner way to format this method? Maybe you can extract the code in `if (archives == 1)` into its own method for better readability. src/hotspot/share/runtime/arguments.cpp line 1262: > 1260: } > 1261: > 1262: CDSConfig::check_system_property(key, value); I see this is only called once, do you expect this method to be used again? 
It may be unnecessary to extract this code into its own method.

-------------

PR Review: https://git.openjdk.org/jdk/pull/16868#pullrequestreview-1760626242
PR Review Comment: https://git.openjdk.org/jdk/pull/16868#discussion_r1412613074
PR Review Comment: https://git.openjdk.org/jdk/pull/16868#discussion_r1412616593
PR Review Comment: https://git.openjdk.org/jdk/pull/16868#discussion_r1412610795

From duke at openjdk.org Fri Dec 1 22:17:11 2023
From: duke at openjdk.org (Yuri Gaevsky)
Date: Fri, 1 Dec 2023 22:17:11 GMT
Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v8]
In-Reply-To:
References:
Message-ID:

> Hello All,
>
> Please review these changes to support _vectorizedHashCode intrinsic on
> RISC-V platform. The patch adds the "scalar" code for the intrinsic without
> usage of any RVV instruction but provides manual unrolling of the appropriate
> loop. The code with usage of RVV instruction could be added as follow-up of
> the patch or independently.
>
> Thanks,
> -Yuri Gaevsky
>
> P.S. My OCA has been accepted recently (ygaevsky).
>
> ### Correctness checks
>
> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux.
>
> ### Performance results (the numbers for non-ints are similar)
>
> #### StarFive JH7110 board:
>
>
> ArraysHashCode:                     without intrinsic         with intrinsic
> -------------------------------------------------------------------------------
> Benchmark  (size)  Mode  Cnt       Score     Error        Score     Error  Units
> -------------------------------------------------------------------------------
> multiints       0  avgt   30       2.658 ±   0.001        2.661 ±   0.004  ns/op
> multiints       1  avgt   30       4.881 ±   0.011        4.892 ±   0.015  ns/op
> multiints       2  avgt   30      16.109 ±   0.041       10.451 ±   0.075  ns/op
> multiints       3  avgt   30      14.873 ±   0.068       11.753 ±   0.024  ns/op
> multiints       4  avgt   30      17.283 ±   0.078       13.176 ±   0.044  ns/op
> multiints       5  avgt   30      19.691 ±   0.136       14.723 ±   0.046  ns/op
> multiints       6  avgt   30      21.727 ±   0.166       15.463 ±   0.124  ns/op
> multiints       7  avgt   30      23.790 ±   0.126       18.298 ±   0.059  ns/op
> multiints       8  avgt   30      23.527 ±   0.116       18.267 ±   0.046  ns/op
> multiints       9  avgt   30      27.981 ±   0.303       20.453 ±   0.069  ns/op
> multiints      10  avgt   30      26.947 ±   0.215       20.541 ±   0.051  ns/op
> multiints      50  avgt   30      95.373 ±   0.588       69.238 ±   0.208  ns/op
> multiints     100  avgt   30     177.109 ±   0.525      137.852 ±   0.417  ns/op
> multiints     200  avgt   30     341.074 ±   1.363      296.832 ±   0.725  ns/op
> multiints     500  avgt   30     847.993 ±   1.713      752.415 ±   1.918  ns/op
> multiints    1000  avgt   30    1610.199 ±   5.424     1426.112 ±   3.407  ns/op
> multiints   10000  avgt   30   16234.260 ±  26.789    14447.936 ±  26.345  ns/op
> multiints  100000  avgt   30  170726.025 ± 184.003   152587.649 ± 381.964  ns/op
> -------------------------------------------------------------------------------
>
> #### T-Head RVB-ICE board:
>
>
> ArraysHashCode: ...

Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:

  Removed comment and break clause from default switch case.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/16629/files
  - new: https://git.openjdk.org/jdk/pull/16629/files/23db372c..a57afe9c

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16629&range=07
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16629&range=06-07

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/16629.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16629/head:pull/16629

PR: https://git.openjdk.org/jdk/pull/16629

From duke at openjdk.org Fri Dec 1 22:17:12 2023
From: duke at openjdk.org (Yuri Gaevsky)
Date: Fri, 1 Dec 2023 22:17:12 GMT
Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v7]
In-Reply-To:
References: <3qPd3ZhLhbYsRHcljYxk_m3C-4CmwBGyUADUAil64a8=.a15fd7e3-1c9b-4be1-b67c-24967427ef5f@github.com>
Message-ID:

On Fri, 1 Dec 2023 17:00:53 GMT, Hamlin Li wrote:

>> I have just borrowed that part of code from X86 counterpart:
>> https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp#L3354
>> It is a
dead code so 'ShouldNotReachHere();' looks more appropriate here. Do you think we should fix this as a part of this patch or as some follow-ups for both x86/RISC-V?
>
> Yes, as we already have `ShouldNotReachHere();` to guard the default case. What do you think about it?

Fixed. Thanks for catching this, @Hamlin-Li!

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1412612384

From jjoo at openjdk.org Fri Dec 1 22:40:55 2023
From: jjoo at openjdk.org (Jonathan Joo)
Date: Fri, 1 Dec 2023 22:40:55 GMT
Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v48]
In-Reply-To:
References: <_lEBVrWV8wrVbmhOiu3AAqPJo_xBs718ZtA9V-VSzGM=.253c0ec8-256e-4dee-b125-90be6338e4b8@github.com>
Message-ID:

On Fri, 1 Dec 2023 21:01:29 GMT, Man Cao wrote:

>> I agree that the counter is valuable if always up-to-date, but if it is out of sync compared to the "concurrent counters" I think it will confuse some users. So if we want to keep it I think we should try to keep it in sync.
>>
>> I suppose adding a lock for updating `gc_total` should be ok. In the safepoint case we should never contend on that lock and for the concurrent updates it should not be a big deal. Basically what I think would be to change `update_counter(...)` to do something like:
>>
>>   if (CPUTimeGroups::is_gc_counter(name)) {
>>     instance->get_counter(CPUTimeGroups::CPUTimeType::gc_total)->inc(net_cpu_time);
>>   }
>>
>>
>> This way we would also be able to remove the publish part above, right? Any other problems with this approach?
>
> I think the ideal approach to simplify this is to support Atomic operation on a `PerfCounter`.
> We could either introduce a `PerfAtomicCounter`/`PerfAtomicLongCounter` class, or perform `Atomic::add()` on the `PerfData::_valuep` pointer.
There's already `PerfData::get_address()`, so we might be able to do: > > > Atomic::add((volatile jlong *)(instance->get_counter(CPUTimeGroups::CPUTimeType::gc_total)->get_address()), net_cpu_time); > > > However, a new class `PerfAtomicCounter` is likely cleaner. E.g., we may also want to make `PerfAtomicCounter::sample()` use a CAS. It is probably better to introduce `PerfAtomicCounter` in a separate RFE later. > > Would the `Atomic::add()` with `PerfData::get_address()` approach be OK for now, or would we rather introduce a lock, or leave the `gc_total` mechanism as-is and address the out-of-sync-ness in a follow-up RFE? > > IMO the out-of-sync-ness problem is minor, because users are likely to either look at a single `gc_total` counter, or look at each individual GC CPU counter and disregard `gc_total`. In the interest of the RDP1 deadline, should we leave improving the sync issues with gc_total to a separate RFE? (Especially given that a "correct" design may take some time to come up with, and that gc_total being slightly out of sync is not a major issue.) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1412631873 From iklam at openjdk.org Sat Dec 2 00:38:58 2023 From: iklam at openjdk.org (Ioi Lam) Date: Sat, 2 Dec 2023 00:38:58 GMT Subject: RFR: 8320935: Move CDS config initialization code to cdsConfig.cpp [v2] In-Reply-To: References: Message-ID: > This is a simple clean up that moves the code for initializing the CDS config states from arguments.cpp to cdsConfig.cpp > > I renamed a few functions, but otherwise the code is unchanged. > > - `get_default_shared_archive_path()` -> `default_archive_path()` > - `GetSharedArchivePath()` -> `static_archive_path()` > - `GetSharedDynamicArchivePath()` -> `dynamic_archive_path()` > > There's also less `#if INCLUDE_CDS` since the entire cdsConfig.cpp file is compiled only if CDS is enabled. 
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: fixed indentation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16868/files - new: https://git.openjdk.org/jdk/pull/16868/files/72f3e44c..01dd47bc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16868&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16868&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16868/head:pull/16868 PR: https://git.openjdk.org/jdk/pull/16868 From iklam at openjdk.org Sat Dec 2 00:38:58 2023 From: iklam at openjdk.org (Ioi Lam) Date: Sat, 2 Dec 2023 00:38:58 GMT Subject: RFR: 8320935: Move CDS config initialization code to cdsConfig.cpp [v2] In-Reply-To: <7tqgQeAidnvr6kp8hkHZ4QPCV_pFbVvWbafTiWzEEbg=.0e728f7b-0c77-4012-bc3d-6cec099b9e68@github.com> References: <7tqgQeAidnvr6kp8hkHZ4QPCV_pFbVvWbafTiWzEEbg=.0e728f7b-0c77-4012-bc3d-6cec099b9e68@github.com> Message-ID: On Wed, 29 Nov 2023 21:53:06 GMT, Calvin Cheung wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed indentation > > src/hotspot/share/cds/cdsConfig.cpp line 34: > >> 32: #include "logging/log.hpp" >> 33: #include "runtime/arguments.hpp" >> 34: #include "runtime/java.hpp" > > I was able to build with your patch without including `java.hpp`. > The #include java.hpp could also be removed from arguments.cpp. cdsConfig.cpp needs the declaration of `vm_exit_during_initialization()` from java.hpp. Although java.hpp is included by arguments.hpp, we usually try to avoid such indirectly inclusions. Otherwise if arguments.hpp is changed to no longer include java.hpp, then cdsConfig.hpp will fail to compile. I am not sure about arguments.cpp -- if java.hpp is already included by arguments.hpp, do we need to explicitly include it in arguments.cpp? 
I'll leave that alone in this PR. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16868#discussion_r1412682898 From iklam at openjdk.org Sat Dec 2 00:39:03 2023 From: iklam at openjdk.org (Ioi Lam) Date: Sat, 2 Dec 2023 00:39:03 GMT Subject: RFR: 8320935: Move CDS config initialization code to cdsConfig.cpp [v2] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 22:04:22 GMT, Matias Saavedra Silva wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> fixed indentation > > src/hotspot/share/cds/cdsConfig.cpp line 101: > >> 99: void CDSConfig::extract_shared_archive_paths(const char* archive_path, >> 100: char** base_archive_path, >> 101: char** top_archive_path) { > > Could you align these arguments? Fixed. > src/hotspot/share/cds/cdsConfig.cpp line 125: > >> 123: } >> 124: >> 125: void CDSConfig::init_shared_archive_paths() { > > Now that I see this there is a lot of indentation thanks to the nested conditionals. I don't have much to offer but is there a cleaner way to format this method? Maybe you can extract the code in `if (archives == 1)` into its own method for better readability. I want to keep the code change minimal while moving code from one file to another. I'll refactor this function in a follow-on PR. That way it will be easier to track the code history. > src/hotspot/share/runtime/arguments.cpp line 1262: > >> 1260: } >> 1261: >> 1262: CDSConfig::check_system_property(key, value); > > I see this is only called once, do you expect this method to be used again? It may be unnecessary to extract this code into its own method. I wanted to move the code from arguments.cpp to cdsConfig.cpp, so I had to put it in a new function. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16868#discussion_r1412683767 PR Review Comment: https://git.openjdk.org/jdk/pull/16868#discussion_r1412683760 PR Review Comment: https://git.openjdk.org/jdk/pull/16868#discussion_r1412683786 From jjoo at openjdk.org Sat Dec 2 01:25:53 2023 From: jjoo at openjdk.org (Jonathan Joo) Date: Sat, 2 Dec 2023 01:25:53 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v53] In-Reply-To: <810qMt__o90-A1Csix4IiygZEpyP09w8tisrwY5mQC4=.b8696cfe-2323-4a6d-884c-47df6568a337@github.com> References: <-GX8bATX2hz3YWgnJbhTNEYbi4t8HxfdhYqBP-ulyGg=.0080d7b0-8e43-4b81-b885-1d4a742048cc@github.com> <810qMt__o90-A1Csix4IiygZEpyP09w8tisrwY5mQC4=.b8696cfe-2323-4a6d-884c-47df6568a337@github.com> Message-ID: On Fri, 1 Dec 2023 16:19:49 GMT, Volker Simonis wrote: >> src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp line 905: >> >>> 903: gc_threads_do(&tttc); >>> 904: >>> 905: CPUTimeCounters::publish_gc_total_cpu_time(); >> >> As I suggested in the other comment, maybe we should not keep the total counter, but if we do we need to make sure the destructor of the closure is run before the call to `publish_gc_total_cpu_time()`. Otherwise we will publish a not yet updated value. > > I still think that a total counter is useful and I'd appreciate if you can keep it. To second what @caoman said before, it is GC agnostic, easy to use even for non GC experts and future proof with regards to implementation changes in the GCs. Please keep it. Put the closure in a scope, I think that should address the concern. 
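
The "put the closure in a scope" fix relies on ordinary C++ destructor timing. A minimal sketch, assuming a closure that accumulates per-thread CPU time and only flushes the sum in its destructor (all names here, such as `ThreadTimeClosureSketch` and `TotalSketch`, are hypothetical and not the actual HotSpot types):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical aggregate: 'pending' is what closures flush into,
// 'published' is what an external reader would see.
struct TotalSketch {
  int64_t pending = 0;
  int64_t published = 0;
};

// Hypothetical closure: sums CPU time per thread and flushes the sum
// into the aggregate only when it is destroyed.
class ThreadTimeClosureSketch {
 public:
  explicit ThreadTimeClosureSketch(TotalSketch& total) : _total(total) {}
  ~ThreadTimeClosureSketch() { _total.pending += _sum; }
  void do_thread(int64_t cpu_time) { _sum += cpu_time; }

 private:
  TotalSketch& _total;
  int64_t _sum = 0;
};

void publish(TotalSketch& total) { total.published = total.pending; }

// The closure lives in its own block, so its destructor has already
// flushed the sum by the time publish() runs.
int64_t publish_with_scope() {
  TotalSketch total;
  {
    ThreadTimeClosureSketch tttc(total);
    tttc.do_thread(30);
    tttc.do_thread(12);
  }  // destructor runs here
  publish(total);
  return total.published;
}

// Without the extra block the closure is still alive when publish() runs,
// so a not-yet-updated value is published.
int64_t publish_without_scope() {
  TotalSketch total;
  ThreadTimeClosureSketch tttc(total);
  tttc.do_thread(30);
  tttc.do_thread(12);
  publish(total);  // too early: the sum has not been flushed yet
  return total.published;
}
```

In this sketch the scoped variant publishes the full sum, while the unscoped one publishes the stale value from before the closure ran.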
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1412694432 From duke at openjdk.org Sat Dec 2 01:28:59 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Sat, 2 Dec 2023 01:28:59 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v6] In-Reply-To: References: Message-ID: > `jcmd Compiler.perfmap` uses the hard-coded file name for a perf map: `/tmp/perf-%d.map`. This change adds an optional argument for specifying a file name. > > `jcmd PID help Compiler.perfmap` shows the following usage. > > > Compiler.perfmap > Write map file for Linux perf tool. > > Impact: Low > > Syntax : Compiler.perfmap [] > > Arguments: > filename : [optional] Name of the map file (STRING, no default value) > > > The man page of jcmd will be updated in a separate PR. Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: Remove DefaultPerfMapFile ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15871/files - new: https://git.openjdk.org/jdk/pull/15871/files/a7dcf426..6a854920 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15871&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15871&range=04-05 Stats: 26 lines in 4 files changed: 9 ins; 15 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/15871.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15871/head:pull/15871 PR: https://git.openjdk.org/jdk/pull/15871 From ccheung at openjdk.org Sat Dec 2 03:38:45 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Sat, 2 Dec 2023 03:38:45 GMT Subject: RFR: 8320935: Move CDS config initialization code to cdsConfig.cpp [v2] In-Reply-To: References: <7tqgQeAidnvr6kp8hkHZ4QPCV_pFbVvWbafTiWzEEbg=.0e728f7b-0c77-4012-bc3d-6cec099b9e68@github.com> Message-ID: On Sat, 2 Dec 2023 00:32:30 GMT, Ioi Lam wrote: >> src/hotspot/share/cds/cdsConfig.cpp line 34: >> >>> 32: #include "logging/log.hpp" >>> 33: #include "runtime/arguments.hpp" >>> 34: #include 
"runtime/java.hpp" >> >> I was able to build with your patch without including `java.hpp`. >> The #include java.hpp could also be removed from arguments.cpp. > > cdsConfig.cpp needs the declaration of `vm_exit_during_initialization()` from java.hpp. Although java.hpp is included by arguments.hpp, we usually try to avoid such indirectly inclusions. Otherwise if arguments.hpp is changed to no longer include java.hpp, then cdsConfig.hpp will fail to compile. > > I am not sure about arguments.cpp -- if java.hpp is already included by arguments.hpp, do we need to explicitly include it in arguments.cpp? I'll leave that alone in this PR. Thanks for the explanation. Looks good then. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16868#discussion_r1412714559 From ccheung at openjdk.org Sat Dec 2 03:38:43 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Sat, 2 Dec 2023 03:38:43 GMT Subject: RFR: 8320935: Move CDS config initialization code to cdsConfig.cpp [v2] In-Reply-To: References: Message-ID: On Sat, 2 Dec 2023 00:38:58 GMT, Ioi Lam wrote: >> This is a simple clean up that moves the code for initializing the CDS config states from arguments.cpp to cdsConfig.cpp >> >> I renamed a few functions, but otherwise the code is unchanged. >> >> - `get_default_shared_archive_path()` -> `default_archive_path()` >> - `GetSharedArchivePath()` -> `static_archive_path()` >> - `GetSharedDynamicArchivePath()` -> `dynamic_archive_path()` >> >> There's also less `#if INCLUDE_CDS` since the entire cdsConfig.cpp file is compiled only if CDS is enabled. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > fixed indentation Marked as reviewed by ccheung (Reviewer). 
-------------

PR Review: https://git.openjdk.org/jdk/pull/16868#pullrequestreview-1760771168

From duke at openjdk.org Sat Dec 2 05:12:37 2023
From: duke at openjdk.org (Lei Zaakjyu)
Date: Sat, 2 Dec 2023 05:12:37 GMT
Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap
In-Reply-To:
References:
Message-ID:

On Fri, 1 Dec 2023 16:47:28 GMT, Lei Zaakjyu wrote:

> 8234502: Merge GenCollectedHeap and SerialHeap

It's weird to see GathererTest::testMassivelyComposedGatherers fail when testing in linux-x86.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16927#issuecomment-1837043039

From duke at openjdk.org Sat Dec 2 05:34:35 2023
From: duke at openjdk.org (Lei Zaakjyu)
Date: Sat, 2 Dec 2023 05:34:35 GMT
Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap
In-Reply-To:
References:
Message-ID:

On Fri, 1 Dec 2023 16:47:28 GMT, Lei Zaakjyu wrote:

> 8234502: Merge GenCollectedHeap and SerialHeap

#16928 found this.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16927#issuecomment-1837048728

From stuefe at openjdk.org Sat Dec 2 07:01:40 2023
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Sat, 2 Dec 2023 07:01:40 GMT
Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC [v2]
In-Reply-To:
References:
Message-ID:

On Fri, 1 Dec 2023 10:55:30 GMT, Stefan Karlsson wrote:

>> There is code in `os::large_page_init()` that checks `/sys/kernel/mm/transparent_hugepage/enabled` and forcefully turns off `UseTransparentHugePages` if anonymous THPs are disabled in the OS:
>>
>>
>>   if (UseTransparentHugePages && !HugePages::supports_thp()) {
>>     if (!FLAG_IS_DEFAULT(UseTransparentHugePages)) {
>>       log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.");
>>     }
>>     UseLargePages = UseTransparentHugePages = false;
>>     return;
>>   }
>>
>>
>> This is problematic because ZGC doesn't use the `/sys/kernel/mm/transparent_hugepage/enabled` THPs, but instead the
`/sys/kernel/mm/transparent_hugepage/shmem_enabled` THPs. So, with the following settings:
>>
>> /sys/kernel/mm/transparent_hugepage/enabled: never
>> /sys/kernel/mm/transparent_hugepage/shmem_enabled: advise
>>
>>
>> the above code will force ZGC to run without THPs.
>>
>> This PR is a proposal for how to work around this in the ZGC code without disturbing the rest of the JVM too much. The patch:
>>
>> 1) remembers the initial values for UseLargePages and UseTransparentHugePages and saves those so that ZGC can continue using THPs even though they have been disabled for the rest of the JVM.
>>
>> 2) adds better logic to figure out if ZGC is actually going to get THPs for the heap or not. This is then used to more accurately log the current situation and allows for a precise usage of `madvise + MADV_HUGEPAGE`.
>>
>> 3) tweaks the generic pagesize logging to better reflect the situation when anonymous THPs are disabled but shared memory THPs are enabled and ZGC is used.
>>
>> The result of this change can be seen in these tables:
>>
>> ZGC large pages log output:
>>
>> E (T) = Enabled (Transparent)
>> E (T, OS) = Enabled (Transparent, OS enforced)
>> D = Disabled
>> D (OS) = Disabled (OS enforced)
>>
>> -XX:+UseTransparentHugePages
>>
>> shmem \ anon | always | madvise | never
>> -------------+--------+---------+--------
>> always       | E (T)  | E (T)   | E (T)
>> within_size  | E (T)  | E (T)   | E (T)
>> advise       | E (T)  | E (T)   | E (T)
>> never        | D (OS) | D (OS)  | D (OS)
>> deny         | D (OS) | D (OS)  | D (OS)
>> force        | E (T)  | E (T)   | E (T)
>>
>> -XX:-UseTransparentHugePages
>>
>> shmem \ anon | always    | madvise   | never
>> -------------+-----------+-----------+-----------
>> always       | E (T, OS) | E (T, OS) | E (T, OS)
>> within_size  | E (T, OS) | E (T, OS) | E (T, OS)
>> advise       | D ...
>
> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:
>
>   Small tweaks

Small question: https://wiki.openjdk.org/display/zgc/Main#Main-EnablingTransparentHugePagesOnLinux mentions that to use THPs with ZGC, one needs both `/sys/kernel/mm/transparent_hugepage/enabled -> "madvise"` and `/sys/kernel/mm/transparent_hugepage/shmem_enabled -> "advise"` in conjunction. Is that correct, the latter needs the former? I did not read this from https://www.kernel.org/doc/html/next/admin-guide/mm/transhuge.html.

src/hotspot/os/linux/hugepages.cpp line 321:

> 319:
> 320:   const bool huge_pages_turned_off = !FLAG_IS_DEFAULT(UseLargePages) && !UseLargePages;
> 321:   _thp_requested = UseTransparentHugePages && !huge_pages_turned_off;

This muddles the water a bit, since the original intent of HugePages vs whatever happens in os_linux was to let HugePages give me the unadulterated info of what the OS supports, whereas processing switches and deciding on them should happen in os_linux in large_page_init.

Would it be possible to move "_thp_requested" up to the caller? We can keep the "should_madvise_anonymous_thps" since those make sense here, but move the "requested" condition up to the caller.

src/hotspot/os/linux/os_linux.cpp line 3722:

> 3720:     }
> 3721:
> 3722:     log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.");

Would it not be clearer to define when to warn, as we do in warn_no_large_pages?

Related to that, should we not warn if ZGC and +shmemthp configured but -anonymous thp? I am not sure the heap is the only part of the JVM that uses THP, and other parts would still use anon THP, or? E.g. Code heap.

Also, maybe a better message for the poor admin that tries to set this up. E.g.:

    bool requires_shmem_thp = UseTHP + UseZGC
    bool requires_anon_thp = UseTHP
    bool off = false;
    if (requires_shmem_thp && !shmem configured)
      (log_warning "Shmem thp are not supported. Set /sys/kernel/mm/transparent_hugepage/shmem_enabled to advise to support shmem thp")
      off = true;
    if (requires_anon_thp && !anon_thp configured)
      (log_warning "anonymous Thp are not supported. Set /sys/kernel/mm/transparent_hugepage/enabled to madvise")
      off = true;
    if (off)
      UseTHP = 0
      log_warning(UseTHP disabled (see previous messages)

    if ZGC and !supports shmemthp or

src/hotspot/os/linux/os_linux.cpp line 3736:

> 3734:     ls.print_cr(". Default large page size: " EXACTFMT ".", EXACTFMTARGS(os::large_page_size()));
> 3735:   } else {
> 3736:     ls.print("Large page support %sdisabled.", uses_zgc_shmem_thp() ? "partially " : "");

I wonder whether we could make our life simpler by not supporting mixes: we could require that for ZGC, to use THP, both shmem and anon thps have to be active. Would that be acceptable or do you think there are too many misconfigured systems out there?

-------------

PR Review: https://git.openjdk.org/jdk/pull/16690#pullrequestreview-1760790216
PR Review Comment: https://git.openjdk.org/jdk/pull/16690#discussion_r1412735114
PR Review Comment: https://git.openjdk.org/jdk/pull/16690#discussion_r1412736663
PR Review Comment: https://git.openjdk.org/jdk/pull/16690#discussion_r1412737495

From stuefe at openjdk.org Sat Dec 2 07:01:42 2023
From: stuefe at openjdk.org (Thomas Stuefe)
Date: Sat, 2 Dec 2023 07:01:42 GMT
Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC [v2]
In-Reply-To:
References:
Message-ID: <8trTzQjUgAfCai6zCRnxhMcYBFRUpk8WhfKPYggXBxI=.52662d99-e5e5-4a2f-9b6f-2c780ba585e0@github.com>

On Fri, 1 Dec 2023 09:52:29 GMT, Stefan Karlsson wrote:

>> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Small tweaks
>
> src/hotspot/os/linux/os_linux.cpp line 2886:
>
>> 2884:
>> 2885: void os::pd_realign_memory(char *addr, size_t bytes, size_t alignment_hint) {
>> 2886:   if (HugePages::should_madvise_anonymous_thps() && alignment_hint > vm_page_size()) {
>
> The use of
`HugePages::should_madvise_anonymous_thps()` adds a change in behavior. By using it instead of `UseTransparentHugepages`, we only call `madvise` when the OS is configured to care about `madvise`. I've been using this in my testing, but I can revert back to using `UseTransparentHugepages`, and then we can change this separately with [JDK-8312468](https://bugs.openjdk.org/browse/JDK-8312468). This makes sense. We can close https://bugs.openjdk.org/browse/JDK-8312468 as dup then. But why don't you madvise for shmemthp ? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16690#discussion_r1412735637 From stuefe at openjdk.org Sat Dec 2 07:08:40 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Sat, 2 Dec 2023 07:08:40 GMT Subject: RFR: 8320935: Move CDS config initialization code to cdsConfig.cpp [v2] In-Reply-To: References: Message-ID: <_yjGgwhpDdYsUK7I-FrhA_j4Efi326KJqgD-p4GjK8c=.0840954b-b06f-446e-bcfc-f0eb9d7a5855@github.com> On Sat, 2 Dec 2023 00:38:58 GMT, Ioi Lam wrote: >> This is a simple clean up that moves the code for initializing the CDS config states from arguments.cpp to cdsConfig.cpp >> >> I renamed a few functions, but otherwise the code is unchanged. >> >> - `get_default_shared_archive_path()` -> `default_archive_path()` >> - `GetSharedArchivePath()` -> `static_archive_path()` >> - `GetSharedDynamicArchivePath()` -> `dynamic_archive_path()` >> >> There's also less `#if INCLUDE_CDS` since the entire cdsConfig.cpp file is compiled only if CDS is enabled. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > fixed indentation Looks good. Did not find any functional difference to the original code. ------------- Marked as reviewed by stuefe (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16868#pullrequestreview-1760796058 From jjoo at openjdk.org Sat Dec 2 07:37:26 2023 From: jjoo at openjdk.org (Jonathan Joo) Date: Sat, 2 Dec 2023 07:37:26 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v55] In-Reply-To: References: Message-ID: > 8315149: Add hsperf counters for CPU time of internal GC threads Jonathan Joo has updated the pull request incrementally with two additional commits since the last revision: - Only create CPUTimeCounters if supported - Ensure TTTC is destructed before publishing ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15082/files - new: https://git.openjdk.org/jdk/pull/15082/files/fcf00cfe..242fef84 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15082&range=54 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15082&range=53-54 Stats: 23 lines in 3 files changed: 11 ins; 0 del; 12 mod Patch: https://git.openjdk.org/jdk/pull/15082.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15082/head:pull/15082 PR: https://git.openjdk.org/jdk/pull/15082 From jjoo at openjdk.org Sat Dec 2 07:37:28 2023 From: jjoo at openjdk.org (Jonathan Joo) Date: Sat, 2 Dec 2023 07:37:28 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v53] In-Reply-To: References: <-GX8bATX2hz3YWgnJbhTNEYbi4t8HxfdhYqBP-ulyGg=.0080d7b0-8e43-4b81-b885-1d4a742048cc@github.com> <4NJOslObcIY-G1nUAbKeiCWKK-wbhEw94avA2c2cJ7s=.2ff4560f-5184-40ad-982f-96570cc4a9fc@github.com> Message-ID: On Fri, 1 Dec 2023 14:42:22 GMT, Albert Mingkun Yang wrote: >> I would say it depends on the use-case and here when switching to use static functions to use the instance it felt more like an all-static class. I agree that it would be nice to avoid the additional memory usage if `UsePerfData` is `false` so I'm ok with keeping the instance if we add that. 
> >> It is easier to control the initialization order and timing of an on-heap singleton object than statics. > > It's generally true, but the init of CPUTimeCounters is not sensitive to ordering. > >> This could save some memory when UsePerfData is false. > > True, but the mem savings will be marginal at most, `PerfCounter* _cpu_time_counters[static_cast(CPUTimeGroups::CPUTimeType::COUNT)];` will be 8 * ~12 = ~96 bytes (including future cpu-time-type enums). > > (I don't have a strong preference here; however, I'd like to make the pros and cons explicit.) Added the check in the initializer ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1412749097 From mli at openjdk.org Sat Dec 2 10:34:36 2023 From: mli at openjdk.org (Hamlin Li) Date: Sat, 2 Dec 2023 10:34:36 GMT Subject: RFR: 8319313: G1: Rename G1EvacFailureInjector appropriately In-Reply-To: References: Message-ID: On Thu, 30 Nov 2023 15:29:22 GMT, Thomas Schatzl wrote: > Hi all, > > please review this rename of `G1EvacFailureInjector` and associated options to `G1AllocationFailureInjector` according to the results of the discussion for the review of [JDK-8318706](https://bugs.openjdk.org/browse/JDK-8318706). > > To facilitate review the first commit implements the renaming changes, the second moves the affected files only. > > Testing: gha, local gc/g1 tests > > Thanks, > Thomas LGTM. ------------- Marked as reviewed by mli (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16905#pullrequestreview-1760853320 From duke at openjdk.org Sat Dec 2 11:07:24 2023 From: duke at openjdk.org (Lei Zaakjyu) Date: Sat, 2 Dec 2023 11:07:24 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v2] In-Reply-To: References: Message-ID: > 8234502: Merge GenCollectedHeap and SerialHeap Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: line-break for EOF ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16927/files - new: https://git.openjdk.org/jdk/pull/16927/files/4ef3a2e2..bdf57c83 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16927.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16927/head:pull/16927 PR: https://git.openjdk.org/jdk/pull/16927 From mgronlun at openjdk.org Sat Dec 2 11:33:53 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Sat, 2 Dec 2023 11:33:53 GMT Subject: RFR: 8211238: @Deprecated JFR event Message-ID: Greetings, please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. 
Testing: jdk_jfr, CI 1-6, stress testing Thanks Markus ------------- Commit messages: - Merge branch 'openjdk:master' into 8211238 - whitespace - emergency dump support - whitespace - remove assert - 8211238 Changes: https://git.openjdk.org/jdk/pull/16931/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8211238 Stats: 2391 lines in 65 files changed: 2043 ins; 251 del; 97 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From duke at openjdk.org Sat Dec 2 16:02:56 2023 From: duke at openjdk.org (Lei Zaakjyu) Date: Sat, 2 Dec 2023 16:02:56 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v3] In-Reply-To: References: Message-ID: <1eZytBv58RjqU1NMHbf4sZEgAjoN0ArxjOkCvy8gylU=.a10d482f-983f-4c32-82f1-3dbdf44d0262@github.com> > 8234502: Merge GenCollectedHeap and SerialHeap Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: restore comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16927/files - new: https://git.openjdk.org/jdk/pull/16927/files/bdf57c83..e297b42f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=01-02 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16927.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16927/head:pull/16927 PR: https://git.openjdk.org/jdk/pull/16927 From mgronlun at openjdk.org Sat Dec 2 17:20:58 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Sat, 2 Dec 2023 17:20:58 GMT Subject: RFR: 8211238: @Deprecated JFR event [v2] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared 
deprecated in the JDK. > > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with two additional commits since the last revision: - Merge branch '8211238' of github.com:mgronlun/jdk into 8211238 - reflection support ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16931/files - new: https://git.openjdk.org/jdk/pull/16931/files/cdedbb20..1d0b8a98 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=00-01 Stats: 131 lines in 5 files changed: 120 ins; 9 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From aph at openjdk.org Sat Dec 2 21:52:39 2023 From: aph at openjdk.org (Andrew Haley) Date: Sat, 2 Dec 2023 21:52:39 GMT Subject: RFR: 8320709: AArch64: Vectorized Poly1305 intrinsics [v2] In-Reply-To: References: Message-ID: On Tue, 28 Nov 2023 13:12:32 GMT, Andrew Haley wrote: >> Vectorizing Poly1305 is quite tricky. We already have a highly- >> efficient scalar Poly1305 implementation that runs on the core integer >> unit, but it's highly serialized, so it does not make good use of >> the parallelism available. >> >> The scalar implementation takes advantage of some particular features >> of the Poly1305 keys. In particular, certain bits of r, the secret >> key, are required to be 0. These make it possible to use a full >> 64-bit-wide multiply-accumulate operation without needing to process >> carries between partial products. >> >> While this works well for a serial implementation, a parallel >> implementation cannot do this because rather than multiplying by r, >> each step multiplies by some integer power of r, modulo >> 2^130-5.
>> >> In order to avoid processing carries between partial products we use a >> redundant representation, in which each 130-bit integer is encoded >> either as a 5-digit integer in base 2^26 or as a 3-digit integer in >> base 2^52, depending on whether we are using a 32- or 64-bit >> multiply-accumulate, respectively. >> >> In AArch64 Advanced SIMD, there is no 64-bit multiply-accumulate >> operation available to us, so we must use 32*32 -> 64-bit operations. >> >> In order to achieve maximum performance we'd like to get close to the >> processor's decode bandwidth, so that every clock cycle does something >> useful. In a typical high-end AArch64 implementation, the core integer >> unit has a fast 64-bit multiplier pipeline and the ASIMD unit has a >> fast(ish) two-way 32-bit multiplier, which may be slower than the >> core integer unit's. It is not at all obvious whether it's best to use >> ASIMD or core instructions. >> >> Fortunately, if we have a wide-bandwidth instruction decode, we can do >> both at the same time, by feeding alternating instructions to the core >> and the ASIMD units. This also allows us to make good use of all of >> the available core and ASIMD registers, in parallel. >> >> To do this we use generators, which here are a kind of iterator that >> emits a group of instructions each time it is called. In this case we >> use 4 parallel generators, and by calling them alternately we interleave >> the ASIMD and the core instructions. We also take care to ensure that >> each generator finishes at about the same time, to maximize the >> distance between instructions which generate and consume data. >> >> The results are pretty good, ranging from 2* - 3* speedup. It is >> possible that a pure in-order processor (Raspberry Pi?) migh... > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > remove debug code Some Raspberry Pi 4 numbers, before and after.
There are some regressions for very small data sizes (below 1 KB), but even on a very-low-power processor the vectorized version has a small advantage.

Benchmark                        (dataSize)  (provider)  Mode  Cnt     Score     Error  Units
Poly1305DigestBench.updateBytes          64              avgt    3     0.174 ±   0.001  us/op
Poly1305DigestBench.updateBytes         256              avgt    3     0.536 ±   0.002  us/op
Poly1305DigestBench.updateBytes        1024              avgt    3     1.976 ±   0.004  us/op
Poly1305DigestBench.updateBytes       16384              avgt    3    30.820 ±   0.024  us/op
Poly1305DigestBench.updateBytes     1048576              avgt    3  2001.719 ±  49.343  us/op

Benchmark                        (dataSize)  (provider)  Mode  Cnt     Score     Error  Units
Poly1305DigestBench.updateBytes          64              avgt    3     0.226 ±   0.002  us/op
Poly1305DigestBench.updateBytes         256              avgt    3     0.765 ±   0.105  us/op
Poly1305DigestBench.updateBytes        1024              avgt    3     1.807 ±   0.111  us/op
Poly1305DigestBench.updateBytes       16384              avgt    3    22.679 ±   0.043  us/op
Poly1305DigestBench.updateBytes     1048576              avgt    3  1452.466 ±  14.891  us/op

------------- PR Comment: https://git.openjdk.org/jdk/pull/16812#issuecomment-1837262405 From alanb at openjdk.org Sun Dec 3 10:28:44 2023 From: alanb at openjdk.org (Alan Bateman) Date: Sun, 3 Dec 2023 10:28:44 GMT Subject: RFR: 8211238: @Deprecated JFR event [v2] In-Reply-To: References: Message-ID: On Sat, 2 Dec 2023 17:20:58 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. >> >> Testing: jdk_jfr, CI 1-6, stress testing >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with two additional commits since the last revision: > > - Merge branch '8211238' of github.com:mgronlun/jdk into 8211238 > - reflection support src/jdk.jfr/share/classes/jdk/jfr/internal/test/DeprecatedThing.java line 90: > 88: public static void reflectionForRemoval() { > 89: staticCounter++; > 90: } You might want to extend the set of tests to include cases that have the "since" element.
There is a 2x2 matrix of cases to fully exercise the parsing of the RuntimeVisibleAnnotations content. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16931#discussion_r1413041171 From mgronlun at openjdk.org Sun Dec 3 13:29:21 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Sun, 3 Dec 2023 13:29:21 GMT Subject: RFR: 8211238: @Deprecated JFR event [v3] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. > > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: restructured testcase ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16931/files - new: https://git.openjdk.org/jdk/pull/16931/files/1d0b8a98..f8ef13cf Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=01-02 Stats: 279 lines in 4 files changed: 86 ins; 91 del; 102 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From mgronlun at openjdk.org Sun Dec 3 13:36:01 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Sun, 3 Dec 2023 13:36:01 GMT Subject: RFR: 8211238: @Deprecated JFR event [v4] In-Reply-To: References: Message-ID: <848Z_qNEkLyfuHIDhH8zeM93fKdNQUk8IH9HNSX8R2g=.ecb41dae-630d-4515-9598-f642b93f4a0e@github.com> > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK.
> > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: whitespace ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16931/files - new: https://git.openjdk.org/jdk/pull/16931/files/f8ef13cf..2239ce2e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=02-03 Stats: 63 lines in 1 file changed: 0 ins; 0 del; 63 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From mgronlun at openjdk.org Sun Dec 3 13:36:05 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Sun, 3 Dec 2023 13:36:05 GMT Subject: RFR: 8211238: @Deprecated JFR event [v2] In-Reply-To: References: Message-ID: On Sun, 3 Dec 2023 10:15:41 GMT, Alan Bateman wrote: >> Markus Grönlund has updated the pull request incrementally with two additional commits since the last revision: >> >> - Merge branch '8211238' of github.com:mgronlun/jdk into 8211238 >> - reflection support > > src/jdk.jfr/share/classes/jdk/jfr/internal/test/DeprecatedThing.java line 90: > >> 88: public static void reflectionForRemoval() { >> 89: staticCounter++; >> 90: } > > You might want to extend the set of tests to include cases that have the "since" element. There is a 2x2 matrix of cases to fully exercise the parsing of the RuntimeVisibleAnnotations content. Thanks for this input, Alan. I have made the tests and cases more structured to cover this. Cheers.
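
For readers following along, the 2x2 matrix mentioned above (`since` present or absent, crossed with `forRemoval` true or false) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class `DeprecationMatrix` and its method names are hypothetical and are not taken from `DeprecatedThing.java` in the PR, and the sketch reads the annotation back through reflection rather than showing how the JFR code parses RuntimeVisibleAnnotations.

```java
import java.lang.reflect.Method;

// Hypothetical illustration only; not the DeprecatedThing.java from the PR.
final class DeprecationMatrix {

    // Case 1: no "since", forRemoval defaults to false.
    @Deprecated
    public static void plain() {}

    // Case 2: "since" present, forRemoval defaults to false.
    @Deprecated(since = "9")
    public static void withSince() {}

    // Case 3: no "since", terminally deprecated.
    @Deprecated(forRemoval = true)
    public static void terminal() {}

    // Case 4: "since" present, terminally deprecated.
    @Deprecated(since = "9", forRemoval = true)
    public static void terminalWithSince() {}

    // Looks up a public no-arg method by name and summarizes its @Deprecated elements.
    static String describe(String methodName) {
        try {
            Method m = DeprecationMatrix.class.getMethod(methodName);
            Deprecated d = m.getAnnotation(Deprecated.class);
            String since = d.since().isEmpty() ? "(none)" : d.since();
            return "since=" + since + ", forRemoval=" + d.forRemoval();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] names = {"plain", "withSince", "terminal", "terminalWithSince"};
        for (String name : names) {
            System.out.println(name + ": " + describe(name));
        }
    }
}
```

Running `main` prints one line per method. When an element is omitted, `since` is the empty string and `forRemoval` is `false`, which is why a parser has to cope with all four shapes.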
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16931#discussion_r1413080638 From mgronlun at openjdk.org Sun Dec 3 15:44:57 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Sun, 3 Dec 2023 15:44:57 GMT Subject: RFR: 8211238: @Deprecated JFR event [v5] In-Reply-To: References: Message-ID: <0WwjBlkrqLti-FWPfWAcPrLEdgDbJPLfc0f1FGBtZM8=.d760cb3b-81cb-45d5-b8c1-994a9e16413b@github.com> > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. > > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: tighter lock scopes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16931/files - new: https://git.openjdk.org/jdk/pull/16931/files/2239ce2e..a5717e16 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=03-04 Stats: 45 lines in 6 files changed: 16 ins; 13 del; 16 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From mgronlun at openjdk.org Sun Dec 3 16:06:16 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Sun, 3 Dec 2023 16:06:16 GMT Subject: RFR: 8211238: @Deprecated JFR event [v6] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK.
> > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: tuning ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16931/files - new: https://git.openjdk.org/jdk/pull/16931/files/a5717e16..628651b3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=04-05 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From mgronlun at openjdk.org Sun Dec 3 16:13:19 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Sun, 3 Dec 2023 16:13:19 GMT Subject: RFR: 8211238: @Deprecated JFR event [v7] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK.
> > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: adjustements ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16931/files - new: https://git.openjdk.org/jdk/pull/16931/files/628651b3..5d76503f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=05-06 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From mgronlun at openjdk.org Sun Dec 3 16:31:13 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Sun, 3 Dec 2023 16:31:13 GMT Subject: RFR: 8211238: @Deprecated JFR event [v8] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK.
> > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: minor adjustment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16931/files - new: https://git.openjdk.org/jdk/pull/16931/files/5d76503f..4e78e895 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=06-07 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From alanb at openjdk.org Sun Dec 3 16:36:40 2023 From: alanb at openjdk.org (Alan Bateman) Date: Sun, 3 Dec 2023 16:36:40 GMT Subject: RFR: 8211238: @Deprecated JFR event [v8] In-Reply-To: References: Message-ID: On Sun, 3 Dec 2023 16:31:13 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. >> >> Testing: jdk_jfr, CI 1-6, stress testing >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > minor adjustment A question about "level". Is the intention that the value can be anything, e.g. some new event next month might use the values "1", "2", "3"? Just asking because ordinarily deprecated vs. terminally deprecated is very specific to the manner in which a program element is deprecated and I assume you don't want this event grabbing the general name for a very specific event setting.
------------- PR Comment: https://git.openjdk.org/jdk/pull/16931#issuecomment-1837532534 From mgronlun at openjdk.org Sun Dec 3 16:47:38 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Sun, 3 Dec 2023 16:47:38 GMT Subject: RFR: 8211238: @Deprecated JFR event [v8] In-Reply-To: References: Message-ID: On Sun, 3 Dec 2023 16:33:48 GMT, Alan Bateman wrote: > A question about "level". Is the intention that the value can be anything, e.g. some new event next month might use the values "1", "2", "3"? Just asking because ordinarily deprecated vs. terminally deprecated is very specific to the manner in which a program element is deprecated and I assume you don't want this event grabbing the general name for a very specific event setting. Yes, the design is generic. An event control/setting to be used also for other events. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16931#issuecomment-1837534965 From dholmes at openjdk.org Mon Dec 4 00:43:56 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 4 Dec 2023 00:43:56 GMT Subject: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11] In-Reply-To: References: Message-ID: On Wed, 29 Nov 2023 11:49:31 GMT, Jaroslav Bachorik wrote: >> Please review this fix for a corner case handling of `jmethodID` values. >> >> The issue is related to the interplay between `jmethodID` values and method redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` instance. Once that method gets redefined, the `jmethodID` is updated to point to the last `Method` version. >> Unless the method is still on stack/running, in which case the original `jmethodID` will be redirected to the latest `Method` version and at the same time the 'previous' `Method` version will receive a new `jmethodID` pointing to that previous version.
>> >> If we happen to capture a stack trace via `GetStackTrace` or `GetAllStackTraces` JVMTI calls while this previous `Method` version is still on stack we will have the corresponding frame identified by a `jmethodID` pointing to that version. >> However, sooner or later the 'previous' class version becomes eligible for cleanup, at which time all contained `Method` instances are deallocated. The cleanup process will not perform the `jmethodID` pointer maintenance and we will end up with pointers to deallocated memory. >> This is caused by the fact that the `jmethodID` lifecycle is bound to the `ClassLoaderData` instance and all relevant `jmethodID`s will get batch-updated when the class loader is being released and all its classes are getting unloaded. >> >> This means that we need to make sure that if a `Method` instance is being deallocated, the associated `jmethodID` (if any) must not point to the deallocated instance once we are finished. Unfortunately, we cannot just update the `jmethodID` values in bulk when purging an old class version - the per `InstanceKlass` jmethodID cache is present only for the main class version and contains `jmethodID` values for both the old and current method versions. >> >> ~Therefore we need to perform `jmethodID` lookup when we are about to deallocate a `Method` instance and clean up the pointer only if that `jmethodID` is pointing to the `Method` instance which is being deallocated.~ >> >> Therefore, we need to perform `jmethodID` lookup for each method in an old class version that is getting purged, and null out the pointer of that `jmethodID` to break the link from `jmethodID` to the method instance that is about to get deallocated.
>> >> _(For anyone interested, a much lengthier writeup is available in [my blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ > > Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: > > Restrict cleanup to obsolete methods only From the blog: > Yes! The methods are being deallocated for a class loader that is still alive. Okay so why does this happen and is it a reasonable thing to be happening? On the surface it sounds wrong to deallocate anything associated with a live classloader. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1837668001 From fyang at openjdk.org Mon Dec 4 03:11:45 2023 From: fyang at openjdk.org (Fei Yang) Date: Mon, 4 Dec 2023 03:11:45 GMT Subject: RFR: 8318227: RISC-V: C2 ConvHF2F [v3] In-Reply-To: References: Message-ID: <13Ot4D45ppGcgnXjlGP1xrYEcZ8LejbI5cxjRruUD4c=.4cd4ca6f-8e4f-4679-9706-59a86d867b6f@github.com> On Wed, 29 Nov 2023 11:15:23 GMT, Hamlin Li wrote: >> Hi, >> Can you review the patch to add ConvHF2F intrinsic to JDK for riscv? >> Thanks! >> >> (By latest kernel patch, `#define RISCV_HWPROBE_EXT_ZFH (1 << 27)` >> https://lore.kernel.org/lkml/20231114141256.126749-11-cleger@rivosinc.com/) >> >> ## Test >> ### Functionality >> #### hotspot tests >> test/hotspot/jtreg/compiler/intrinsics/ >> test/hotspot/jtreg/compiler/c2/irTests >> >> #### jdk tests >> test/jdk/java/lang/Float/Binary16Conversion*.java >> >> ### Performance >> tested on licheepi. >> >> #### with UseZfh enabled & stub out-of-band >>
>> Benchmark                                     (size)  Mode  Cnt      Score    Error  Units
>> Fp16ConversionBenchmark.float16ToFloat          2048  avgt   10   3493.376 ± 18.631  ns/op
>> Fp16ConversionBenchmark.float16ToFloatMemory    2048  avgt   10     19.819 ±  0.193  ns/op
>>
>> #### with UseZfh enabled only
>> (i.e. enable the intrinsic)
>>
>> Benchmark                                     (size)  Mode  Cnt      Score    Error  Units
>> Fp16ConversionBenchmark.float16ToFloat          2048  avgt   10   4659.796 ± 13.262  ns/op
>> Fp16ConversionBenchmark.float16ToFloatMemory    2048  avgt   10     22.957 ±  0.098  ns/op
>>
>> #### with UseZfh disabled
>> (i.e. disable the intrinsic)
>>
>> Benchmark                                     (size)  Mode  Cnt      Score    Error  Units
>> Fp16ConversionBenchmark.float16ToFloat          2048  avgt   10  22930.591 ± 72.595  ns/op
>> Fp16ConversionBenchmark.float16ToFloatMemory    2048  avgt   10     25.970 ±  0.063  ns/op
> > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > Fix pipeline cost in ad; Add comments Hi Hamlin, updated change looks good to me. Please wait a while for the kernel patch to land. Thanks. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16802#pullrequestreview-1761527566 From dholmes at openjdk.org Mon Dec 4 03:23:44 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 4 Dec 2023 03:23:44 GMT Subject: RFR: 8320886: Unsafe_SetMemory0 is not guarded [v2] In-Reply-To: References: <5kRdxpEyFZLzxlyHpdHju1w9qLbm4OA6UkVZMr17nt0=.339b7543-574c-4a06-84e9-2ffb9d9a345a@github.com> Message-ID: On Wed, 29 Nov 2023 19:13:34 GMT, Jorn Vernee wrote: >> See JBS issue. >> >> Guard the memory access done in Unsafe_SetMemory0 to prevent a SIGBUS error from crashing the VM when a truncated memory mapped file is accessed. >> >> Testing: local `InternalErrorTest`, Tier 1-5 (ongoing) > > Jorn Vernee has updated the pull request incrementally with two additional commits since the last revision: > > - add handling for missing instruction > - Print out instruction Okay fix seems fine. Thanks test/hotspot/jtreg/runtime/Unsafe/InternalErrorTest.java line 158: > 156: case 3: > 157: MemorySegment segment = MemorySegment.ofBuffer(buffer); > 158: // testing Unsafe.setMemory, trying to access next page after truncation. Pre-existing nit in the line you copied - there is a double-space after 'next' ------------- Marked as reviewed by dholmes (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16848#pullrequestreview-1761533861 PR Review Comment: https://git.openjdk.org/jdk/pull/16848#discussion_r1413325910 From dholmes at openjdk.org Mon Dec 4 04:57:35 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 4 Dec 2023 04:57:35 GMT Subject: RFR: JDK-8320892: AArch64: Restore FPU control state after JNI [v3] In-Reply-To: References: <-Jv5Xvre3lonydwQ5uzYN3QB8V0VIuORIhM1RtIdW5g=.06167df9-7268-4945-8e18-a04d19ee97e1@github.com> Message-ID: On Thu, 30 Nov 2023 09:58:28 GMT, Andrew Haley wrote: >> I was also going to suggest adding a new flag and creating an alias. The new flag will need a CSR request of course. > >> I was also going to suggest adding a new flag and creating an alias. The new flag will need a CSR request of course. > > Given that it's new and it's a diagnostic flag I'm a bit surprised at that. I was trying for a quick fix. > > Anyway, how do you create an alias? I can't see any examples, and I haven't found a way through the maze of twisty `#define` passages. @theRealAph the `RestoreMXCSROnJNICalls` flag is a product flag, not diagnostic. Aliased flags are set up in arguments.cpp by editing this:

static AliasedFlag const aliased_jvm_flags[] = {
  { "DefaultMaxRAMFraction", "MaxRAMFraction" },
  { "CreateMinidumpOnCrash", "CreateCoredumpOnCrash" },
  { nullptr, nullptr}
};

------------- PR Comment: https://git.openjdk.org/jdk/pull/16851#issuecomment-1837839773 From dholmes at openjdk.org Mon Dec 4 04:57:38 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 4 Dec 2023 04:57:38 GMT Subject: RFR: JDK-8320892: AArch64: Restore FPU control state after JNI [v3] In-Reply-To: References: <-Jv5Xvre3lonydwQ5uzYN3QB8V0VIuORIhM1RtIdW5g=.06167df9-7268-4945-8e18-a04d19ee97e1@github.com> Message-ID: On Tue, 28 Nov 2023 15:58:04 GMT, Andrew Haley wrote: >> Some buggy libraries corrupt the floating-point control register. Provide something similar to the x86 RestoreMXCSROnJNICalls.
>> >> I realize that using the x86ish name "RestoreMXCSROnJNICalls" might be a little controversial, but it is a _global_ flag, not a CPU-specific one. And it's clearly intended for this purpose. It might have been better if that flag had been given a better name twentyish years ago, but we can't change it now. > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > Fix thinko Also from arguments.cpp

 * ALIASED: An option that is simply another name for another option. This is often
 *          part of the process of deprecating a flag, but not all aliases need
 *          to be deprecated.
 *
 *          Create an alias for an option by adding the old and new option names to the
 *          "aliased_jvm_flags" table. Delete the old variable from globals.hpp (etc).

------------- PR Comment: https://git.openjdk.org/jdk/pull/16851#issuecomment-1837841265 From fyang at openjdk.org Mon Dec 4 06:38:38 2023 From: fyang at openjdk.org (Fei Yang) Date: Mon, 4 Dec 2023 06:38:38 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg In-Reply-To: References: Message-ID: On Thu, 30 Nov 2023 17:48:11 GMT, Ludovic Henry wrote: > 8315856: RISC-V: Use Zacas extension for cmpxchg src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 2630: > 2628: mv(tmp, oldv); > 2629: atomic_cas(tmp, newv, addr, Assembler::int64, Assembler::aq, Assembler::rl); > 2630: beq(tmp, oldv, succeed); The Zacas spec says: `The memory operation performed by an AMOCAS.W/D/Q, when not successful, has acquire semantics if aq bit is 1 but does not have release semantics, regardless of rl.` So when the CAS fails, I think we are lacking the needed semantics which is enforced at L2645 for the else block. It seems that we should place a `membar(AnyAny);` after the `beq` instruction, as our aarch64 counterpart does [1].
[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L2758 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1413422831 From vkempik at openjdk.org Mon Dec 4 07:23:38 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Mon, 4 Dec 2023 07:23:38 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6] In-Reply-To: References: Message-ID: On Wed, 15 Nov 2023 15:44:47 GMT, Olga Mikhaltsova wrote: >> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform. >> >> In the table below it is shown that NaN argument should be processed as a special case.
>>
>>                                         RISC-V                  Java
>>                                         (FCVT.W.S)  (FCVT.L.D)  (long round(double a))  (int round(float a))
>> Minimum valid input (after rounding)    -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
>> Maximum valid input (after rounding)    2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for out-of-range negative input  -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for -∞                           -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for out-of-range positive input  2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for +∞                           2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for NaN                          2^31 - 1    2^63 - 1    0                       0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555  0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760  0.103  ops/ms
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956  0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947  0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: > > Replaced tmp with t0 Can some reviewer take a look again please ? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1837971648 From rehn at openjdk.org Mon Dec 4 07:24:39 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 4 Dec 2023 07:24:39 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg In-Reply-To: References: Message-ID: <6My9uP_jRGDPa31RMLN07O7DpheOFp6d2KgLSDieWU8=.bb4a52a3-b766-4167-9078-08dd9af564cf@github.com> On Mon, 4 Dec 2023 06:32:59 GMT, Fei Yang wrote: >> 8315856: RISC-V: Use Zacas extension for cmpxchg > > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 2630: > >> 2628: mv(tmp, oldv); >> 2629: atomic_cas(tmp, newv, addr, Assembler::int64, Assembler::aq, Assembler::rl); >> 2630: beq(tmp, oldv, succeed); > > The Zacas spec says: `The memory operation performed by an AMOCAS.W/D/Q, when not successful, has acquire semantics if aq bit is 1 but does not have release semantics, regardless of rl.` > > So when the CAS fails, I think we are lacking the needed semantics which is enforced at L2645 for the else block. Seems that we should place a `membar(AnyAny);` after the `beq` instruction when like our aarch64 counterpart [1].
> > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L2758 Good! Yea, we discussed that internally and I thought we fixed that, those changes seem to have been lost, thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1413457961 From fweimer at redhat.com Mon Dec 4 07:28:32 2023 From: fweimer at redhat.com (Florian Weimer) Date: Mon, 04 Dec 2023 08:28:32 +0100 Subject: Use of C++ dynamic global object initialization with thread guards Message-ID: <87fs0izasf.fsf@oldenburg.str.redhat.com> As far as I understand it, the Hotspot C++ style guide advises against using C++ run-time library features. However, it seems that the use of dynamic initialization guards (that involve calls to __cxa_guard_acquire and __cxa_guard_release) has increased quite a bit over the years. The implementation of __cxa_guard_acquire is not entirely trivial because it detects recursive initialization and throws __gnu_cxx::recursive_init_error, which means that it pulls in the C++ unwinder (at least with a traditional GNU/Linux build of libstdc++.a). Furthermore, most uses of C++ dynamic initialization involve a computation that is idempotent and have unused bit patterns in the initialized value. This means that a separate guard variable is not needed, and a simple atomic store/atomic load could be used. In other cases, the use of global objects seems unnecessary. For example, src/hotspot/share/jfr/recorder/checkpoint/jfrCheckpointManager.cpp has a dynamically initialized static variable max_elem_size:

  BufferPtr JfrCheckpointManager::lease_global(Thread* thread, bool previous_epoch /* false */, size_t size /* 0 */) {
    JfrCheckpointMspace* const mspace = instance()._global_mspace;
    assert(mspace != nullptr, "invariant");
    static const size_t max_elem_size = mspace->min_element_size(); // min is max
    BufferPtr buffer;
    if (size <= max_elem_size) {
      buffer = mspace_acquire_live(size, mspace, thread, previous_epoch);
      if (buffer != nullptr) {
        buffer->set_lease();
        DEBUG_ONLY(assert_lease(buffer);)
        return buffer;
      }
    }
    buffer = mspace_allocate_transient_lease_to_live_list(size, mspace, thread, previous_epoch);
    DEBUG_ONLY(assert_lease(buffer);)
    return buffer;
  }

The min_element_size() member function is inline and just returns _min_element_size, which is declared const, so it cannot change over time. This means that caching that value is pointless, and the static should probably be removed. Would it make sense to minimize the use of __cxa_guard_acquire and __cxa_guard_release? There are currently 400 such calls, but many of them appear in templated code, so I could get it down to ~80 calls with about a day of work. Those calls are clearly visible in --with-stdc++lib=dynamic, so it would be possible to add a regression test to avoid introducing further such calls. Thanks, Florian From dfenacci at openjdk.org Mon Dec 4 07:48:54 2023 From: dfenacci at openjdk.org (Damon Fenacci) Date: Mon, 4 Dec 2023 07:48:54 GMT Subject: RFR: 8311906: Improve robustness of String constructors with mutable array inputs [v14] In-Reply-To: References: <6SKlGLh5MmxoEx07wHCCUc8KWbbhcspLJmcc1uxQ_FI=.ca33bfb4-fa5c-45f0-b49f-ee6c5c6b68b4@github.com> Message-ID: On Thu, 30 Nov 2023 15:51:46 GMT, Roger Riggs wrote: >> Strings, after construction, are immutable but may be constructed from mutable arrays of bytes, characters, or integers.
>> The string constructors should guard against the effects of mutating the arrays during construction that might invalidate internal invariants for the correct behavior of operations on the resulting strings. In particular, a number of operations have optimizations for operations on pairs of latin1 strings and pairs of non-latin1 strings, while operations between latin1 and non-latin1 strings use a more general implementation. >> >> The changes include: >> >> - Adding a warning to each constructor with an array as an argument to indicate that the results are indeterminate >> if the input array is modified before the constructor returns. >> The resulting string may contain any combination of characters sampled from the input array. >> >> - Ensure that strings that are represented as non-latin1 contain at least one non-latin1 character. >> For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF8, or another encoding decoded to latin1 the scanning and compression is unchanged. >> If a non-latin1 character is found, the string is represented as non-latin1 with the added verification that a non-latin1 character is present at the same index. >> If that character is found to be latin1, then the input array has been modified and the result of the scan may be incorrect. >> Though a ConcurrentModificationException could be thrown, the risk to an existing application of an unexpected exception should be avoided. >> Instead, the non-latin1 copy of the input is re-scanned and compressed; that scan determines whether the latin1 or the non-latin1 representation is returned. >> >> - The methods that scan for non-latin1 characters and their intrinsic implementations are updated to return the index of the non-latin1 character. >> >> - String construction from StringBuilder and CharSequence must also be guarded as their contents may be modified during construction. 
> > Roger Riggs has updated the pull request incrementally with one additional commit since the last revision: > > Correct jcc/jccb branches Intrinsics look OK to me. ------------- Marked as reviewed by dfenacci (Committer). PR Review: https://git.openjdk.org/jdk/pull/16425#pullrequestreview-1761763069 From tschatzl at openjdk.org Mon Dec 4 07:56:04 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 4 Dec 2023 07:56:04 GMT Subject: RFR: 8317809: Insertion of free code blobs into code cache can be very slow during class unloading [v5] In-Reply-To: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> References: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> Message-ID: > Insert code blobs in a sorted fashion to exploit the finger-optimization when adding, making this procedure O(n) instead of O(n^2) > > Introduces a globally available ClassUnloadingContext that contains common methods pertaining to class and code unloading. GCs may use it to efficiently manage unlinked class loader datas and nmethods to allow use of common methods (unlink/merge). > > The steps typically are registering a new to be unlinked CLD/nmethod, and then purge its memory later. STW collectors perform this work in one big chunk taking the CodeCache_lock, for the entire duration, while concurrent collectors lock/unlock for every insertion to allow for concurrent users for the lock to progress. > > Some care has been taken to stay consistent with an "unloading = unlinking + purge" scheme; however particularly the existing CLD handling API (still) mixes unlinking and purging in its CLD::unload() call. To simplify this change that is mostly geared towards separating nmethod unlinking from purging, to make code blob freeing O(n) instead of O(n^2). 
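The effect of sorting before insertion can be sketched as follows. This is a minimal illustration only, not the HotSpot code heap implementation: `Block`, `SortedFreeList`, `count_steps` and the step counter are hypothetical names invented here, and the "finger" is assumed to be a pointer to the most recently inserted block that serves as the starting point of the next search.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical free block, identified only by its start address.
struct Block {
  std::uintptr_t addr;
  Block* next;
};

// Hypothetical address-ordered free list with a "finger" search hint.
class SortedFreeList {
 public:
  void insert(Block* b) {
    Block* prev = nullptr;
    // Start at the finger if the new block belongs after it,
    // otherwise fall back to a search from the head.
    if (_finger != nullptr && _finger->addr < b->addr) {
      prev = _finger;
    }
    Block* cur = (prev != nullptr) ? prev->next : _head;
    while (cur != nullptr && cur->addr < b->addr) {
      prev = cur;
      cur = cur->next;
      ++_steps;  // count links walked, for illustration only
    }
    b->next = cur;
    if (prev != nullptr) {
      prev->next = b;
    } else {
      _head = b;
    }
    _finger = b;
  }

  std::size_t steps() const { return _steps; }
  void reset_steps() { _steps = 0; }

  bool is_sorted() const {
    for (const Block* b = _head; b != nullptr && b->next != nullptr; b = b->next) {
      if (b->addr > b->next->addr) return false;
    }
    return true;
  }

 private:
  Block* _head = nullptr;
  Block* _finger = nullptr;
  std::size_t _steps = 0;
};

// Populates a list with `existing`, then frees `to_free` into it and
// returns how many links were walked while doing so.
std::size_t count_steps(std::vector<std::uintptr_t> existing,
                        std::vector<std::uintptr_t> to_free,
                        bool sort_first) {
  std::sort(existing.begin(), existing.end());
  if (sort_first) {
    std::sort(to_free.begin(), to_free.end());
  }

  std::vector<Block> storage;
  storage.reserve(existing.size() + to_free.size());  // keeps Block* stable

  SortedFreeList list;
  for (std::uintptr_t a : existing) {
    storage.push_back(Block{a, nullptr});
    list.insert(&storage.back());
  }
  list.reset_steps();
  for (std::uintptr_t a : to_free) {
    storage.push_back(Block{a, nullptr});
    list.insert(&storage.back());
  }
  assert(list.is_sorted());
  return list.steps();
}
```

With ascending input every search resumes at the finger, so the list is walked at most once in total. With unordered input the finger is frequently behind the insertion point and each insertion may restart from the head, which is where the quadratic behaviour comes from.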
> > Upcoming changes will > * separate nmethod unregistering from nmethod purging to allow doing that in bulk (for the STW collectors); that can significantly reduce code purging time for the STW collectors. > * better name the second stage of unlinking (called "cleaning" throughout, e.g. the work done in `G1CollectedHeap::complete_cleaning`) > * untangle CLD unlinking and what's called "cleaning" now to allow moving more stuff into the second unlinking stage for better parallelism > * G1: move some significant tasks from the remark pause to concurrent (unregistering nmethods, freeing code blobs and cld/metaspace purging) > * Maybe move Serial/Parallel GC metaspace purging closer to other unlinking/purging code to keep things local and allow easier logging. > > Please also first looking into the (small) PR this depends on. > > The crash on linux-x86 is fixed by PR#16766 which I split out for quicker reviews. > > Testing: tier1-7 > > Thanks, > Thomas Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: - Merge branch 'master' into 8317809-sorted-insertion-of-free-blobs - remove trailing whitespace - fix indentation after recent commit - Address ayang/iwalulya review comments, remove inheritance in ClassUnloadingContext for now as unnecessary for this change, use iterators, other review comments - Merge branch 'master' into mergeme - iwalulya review, naming - 8317809 Insert code blobs in a sorted fashion to exploit the finger-optimization when adding, making this procedure O(n) instead of O(n^2) Introduce a globally available ClassUnloadingContext that contains common methods pertaining to class and code unloading. GCs may use it to efficiently manage unlinked class loader datas and nmethods to allow use of common methods (unlink/merge). The steps typically are registering a new to be unlinked CLD/nmethod, and then purge its memory later. 
STW collectors perform this work in one big chunk taking the CodeCache_lock, for the entire duration, while concurrent collectors lock/unlock for every insertion to allow for concurrent users for the lock to progress. Some care has been taken to stay consistent with an "unloading = unlinking + purge" scheme; however particularly the existing CLD handling API (still) mixes unlinking and purging in its CLD::unload() call. To simplify this change that is mostly geared towards separating nmethod unlinking from purging, to make code blob freeing O(n) instead of O(n^2). Upcoming changes will * separate nmethod unregistering from nmethod purging to allow doing that in bulk (for the STW collectors); that can significantly reduce code purging time for the STW collectors. * better name the second stage of unlinking (called "cleaning" throughout, e.g. the work done in `G1CollectedHeap::complete_cleaning`) * untangle CLD unlinking and what's called "cleaning" now to allow moving more stuff into the second unlinking stage for better parallelism * G1: move some significant tasks from the remark pause to concurrent (unregistering nmethods, freeing code blobs and cld/metaspace purging) * Maybe move Serial/Parallel GC metaspace purging closer to other unlinking/purging code to keep things local and allow easier logging.
- Only run test case on debug VMs, sufficient - 8320331 g1 full gc "during" verification accesses half-unloaded metadata ------------- Changes: https://git.openjdk.org/jdk/pull/16759/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16759&range=04 Stats: 474 lines in 28 files changed: 347 ins; 83 del; 44 mod Patch: https://git.openjdk.org/jdk/pull/16759.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16759/head:pull/16759 PR: https://git.openjdk.org/jdk/pull/16759 From mbaesken at openjdk.org Mon Dec 4 07:58:37 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 4 Dec 2023 07:58:37 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 09:05:22 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. >> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust macOS coding > /label jfr > > I'm not sure I understand the issue, but adding a field to an event because of a GCC bug seems excessive. I think in practise the issue happens mostly when using binaries (shared libs) compiled by gcc with certain flags. 
But it could happen as well with other shared libs loaded by the JVM that mess around with the fp environment; that's why we now check and reset the fp env. Maybe Andrew Haley could comment, he knows better about the details of these issues. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1838012540 From stefank at openjdk.org Mon Dec 4 08:17:54 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 4 Dec 2023 08:17:54 GMT Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC [v3] In-Reply-To: References: Message-ID: <0G1XJXBoSVApN6amRh4g8S4QjXtVLNqAgEl-4De5ir4=.7e1038af-c579-4077-aec8-f713ffd62294@github.com> > There is code in `os::large_page_init()` that checks `/sys/kernel/mm/transparent_hugepage/enabled` and forcefully turns off `UseTransparentHugePages` if anonymous THPs are disabled in the OS:
>
>   if (UseTransparentHugePages && !HugePages::supports_thp()) {
>     if (!FLAG_IS_DEFAULT(UseTransparentHugePages)) {
>       log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.");
>     }
>     UseLargePages = UseTransparentHugePages = false;
>     return;
>   }
>
> This is problematic because ZGC doesn't use the `/sys/kernel/mm/transparent_hugepage/enabled` THPs, but instead the `/sys/kernel/mm/transparent_hugepage/shmem_enabled` THPs. So, with the following settings:
>
>   /sys/kernel/mm/transparent_hugepage/enabled: never
>   /sys/kernel/mm/transparent_hugepage/shmem_enabled: advise
>
> the above code will force ZGC to run without THPs.
>
> This PR is a proposal for how to work around this in the ZGC code without disturbing the rest of the JVM too much. The patch:
>
> 1) remembers the initial values for UseLargePages and UseTransparentHugePages and saves those so that ZGC can continue using THPs even though they have been disabled for the rest of the JVM.
>
> 2) adds better logic to figure out if ZGC is actually going to get THPs for the heap or not. This is then used to more accurately log the current situation and allows for a precise usage of `madvise + MADV_HUGEPAGE`.
>
> 3) tweaks the generic pagesize logging to better reflect the situation when anonymous THPs are disabled but shared memory THPs are enabled and ZGC is used.
>
> The result of this change can be seen in these tables:
>
> ZGC large pages log output:
>
> E (T) = Enabled (Transparent)
> E (T, OS) = Enabled (Transparent, OS enforced)
> D = Disabled
> D (OS) = Disabled (OS enforced)
>
> -XX:+UseTransparentHugePages
>
> shmem \ anon | always | madvise | never
> -------------+--------+---------+-------
> always       | E (T)  | E (T)   | E (T)
> within_size  | E (T)  | E (T)   | E (T)
> advise       | E (T)  | E (T)   | E (T)
> never        | D (OS) | D (OS)  | D (OS)
> deny         | D (OS) | D (OS)  | D (OS)
> force        | E (T)  | E (T)   | E (T)
>
> -XX:-UseTransparentHugePages
>
> shmem \ anon | always    | madvise   | never
> -------------+-----------+-----------+-----------
> always       | E (T, OS) | E (T, OS) | E (T, OS)
> within_size  | E (T, OS) | E (T, OS) | E (T, OS)
> advise       | D         | D         | D
> never        | D         | D         | D
> deny         | D         | D         | D
> force ...
Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: Move _thp_requested out from HugePages ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16690/files - new: https://git.openjdk.org/jdk/pull/16690/files/901b4b10..cb692dea Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16690&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16690&range=01-02 Stats: 49 lines in 6 files changed: 33 ins; 10 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/16690.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16690/head:pull/16690 PR: https://git.openjdk.org/jdk/pull/16690 From stefank at openjdk.org Mon Dec 4 08:20:45 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 4 Dec 2023 08:20:45 GMT Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC [v2] In-Reply-To: References: Message-ID: On Sat, 2 Dec 2023 06:36:59 GMT, Thomas Stuefe wrote: >> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: >> >> Small tweaks > > src/hotspot/os/linux/hugepages.cpp line 321: > >> 319: >> 320: const bool huge_pages_turned_off = !FLAG_IS_DEFAULT(UseLargePages) && !UseLargePages; >> 321: _thp_requested = UseTransparentHugePages && !huge_pages_turned_off; > > This muddles the water a bit, since the original intent of HugePages vs whatever happens in os_linux was to let HugePages give me the unadulterated info of what the OS supports, whereas processing switches and deciding on them should happen in os_linux in large_page_init. Would it be possible to move "_thp_requested" up to the caller? > > We can keep the "should_madvise_anonymous_thps" since those make sense here, but move the "requested" condition up to the caller. I've pushed a change that moves the "requested" condition out of `HugePages` and into `os::Linux`. 
I also moved `should_madvise_anonymous_thps`, since it depends on the "requested" property and not only on what's available in the HugePages structs. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16690#discussion_r1413512440 From shade at openjdk.org Mon Dec 4 08:23:51 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 4 Dec 2023 08:23:51 GMT Subject: Integrated: 8320924: Improve heap dump performance by optimizing archived object checks In-Reply-To: <8Ek_2iD6dG8MJE0AEHlzxcD4GDCYYEmKeVoBMO4PBF8=.4352c26a-76b9-46ae-af3f-8666821c9a9c@github.com> References: <8Ek_2iD6dG8MJE0AEHlzxcD4GDCYYEmKeVoBMO4PBF8=.4352c26a-76b9-46ae-af3f-8666821c9a9c@github.com> Message-ID: <8oLYqV5GjeB1BRfVAhZFPAGPSVdKTGa6_ADmnR7fNI0=.447c05ea-4e41-45bc-90e5-c3591004f716@github.com> On Tue, 28 Nov 2023 20:24:17 GMT, Aleksey Shipilev wrote: > Profiling heap dumping code reveals another simple issue: `mask_dormant_archived_object` on dumping hotpath takes quite a bit of time. We can reflow it for better inlineability, throwing out the non-essential parts into cold method. There is also no reason to peek into java mirror with (default) keep-alive, if we only use the result for null-check.
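The general shape of such a reflow, a small inlinable hot path that hands the rare case to an out-of-line cold function, might look like the sketch below. It is an illustration under assumptions, not the code from the changeset: `Mirror`, `Klass`, `Object`, `report_dormant` and `mask_dormant` are hypothetical stand-ins, and the `noinline`/`cold` attributes are GCC/Clang extensions that are simply skipped on other compilers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the VM types; not the HotSpot definitions.
struct Mirror {};
struct Klass { const Mirror* mirror; };  // nullptr models "mirror not loaded"
struct Object { const Klass* klass; };

static std::size_t dormant_reports = 0;

// Cold path: rarely taken, kept out of line so the hot caller stays small.
#if defined(__GNUC__)
__attribute__((noinline, cold))
#endif
static void report_dormant(const Object* o) {
  (void)o;
  ++dormant_reports;  // a real implementation would log details here
}

// Hot path: a null check plus one pointer comparison, cheap to inline.
static inline const Object* mask_dormant(const Object* o) {
  if (o != nullptr && o->klass->mirror == nullptr) {
    report_dormant(o);
    return nullptr;
  }
  return o;
}
```

Keeping the logging out of the inlined function means each call site only pays for the two checks, and the reporting code no longer counts against the caller's inlining budget.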
> > Example improvements on Mac M1: > > > % for I in `seq 1 5`; do build/macosx-aarch64-server-release/images/jdk/bin/java -XX:+UseParallelGC -XX:+HeapDumpAfterFullGC -Xms8g -Xmx8g HeapDump.java 2>&1 | grep created; rm *.hprof; done > > # Before > Heap dump file created [1897307608 bytes in 1.584 secs] > Heap dump file created [1897308278 bytes in 1.439 secs] > Heap dump file created [1897308508 bytes in 1.460 secs] > Heap dump file created [1897308505 bytes in 1.423 secs] > Heap dump file created [1897308554 bytes in 1.414 secs] > > # After > Heap dump file created [1897307648 bytes in 1.509 secs] > Heap dump file created [1897308498 bytes in 1.281 secs] > Heap dump file created [1897308554 bytes in 1.282 secs] > Heap dump file created [1897308512 bytes in 1.263 secs] > Heap dump file created [1897308554 bytes in 1.270 secs] > > > ...which is about +12% faster heap dump. > > I also eyeballed the generated code and saw `mask_dormant_archived_object` fully inlined at least on x86_64. This pull request has now been integrated. 
Changeset: f32ab8cc Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/f32ab8cc47c8a1b4887e9c7c86b145ce4b85c546 Stats: 41 lines in 3 files changed: 19 ins; 17 del; 5 mod 8320924: Improve heap dump performance by optimizing archived object checks Reviewed-by: yyang, stuefe ------------- PR: https://git.openjdk.org/jdk/pull/16863 From stefank at openjdk.org Mon Dec 4 08:26:41 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 4 Dec 2023 08:26:41 GMT Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC [v3] In-Reply-To: <8trTzQjUgAfCai6zCRnxhMcYBFRUpk8WhfKPYggXBxI=.52662d99-e5e5-4a2f-9b6f-2c780ba585e0@github.com> References: <8trTzQjUgAfCai6zCRnxhMcYBFRUpk8WhfKPYggXBxI=.52662d99-e5e5-4a2f-9b6f-2c780ba585e0@github.com> Message-ID: On Sat, 2 Dec 2023 06:40:55 GMT, Thomas Stuefe wrote: >> src/hotspot/os/linux/os_linux.cpp line 2886: >> >>> 2884: >>> 2885: void os::pd_realign_memory(char *addr, size_t bytes, size_t alignment_hint) { >>> 2886: if (HugePages::should_madvise_anonymous_thps() && alignment_hint > vm_page_size()) { >> >> The use of `HugePages::should_madvise_anonymous_thps()` adds a change in behavior. By using it instead of `UseTransparentHugepages`, we only call `madvise` when the OS is configured to care about `madvise`. I've been using this in my testing, but I can revert back to using `UseTransparentHugepages`, and then we can change this separately with [JDK-8312468](https://bugs.openjdk.org/browse/JDK-8312468). > > This makes sense. We can close https://bugs.openjdk.org/browse/JDK-8312468 as dup then. > > But why don't you madvise for shmemthp ? The assumption here is that pd_realign_memory is used for anonymous memory. 
The madvise for shmemthp is in the ZGC code (which is the only user of shmemthp, AFAIK):

  // Maybe madvise the mapping to use transparent huge pages
  if (os::Linux::should_madvise_shmem_thps()) {
    os::Linux::madvise_transparent_huge_pages(addr, length);
  }

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16690#discussion_r1413517946 From xgong at openjdk.org Mon Dec 4 08:33:49 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Mon, 4 Dec 2023 08:33:49 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 16:45:49 GMT, Magnus Ihse Bursie wrote: > The final thing we need to resolve properly is the SVE compiler test. > > @theRealAph says: > > > arm_sve.h is part of GCC. It was added to GCC in 2019. > > A more relevant question is what version of gcc it was added, and if that also implies that the compiler knows about `-march=armv8-a+sve`. If so, then this test could basically be framed as a gcc version check. > > I'm still leaning towards failing configure if the SVE code cannot be compiled. Under what circumstances can this test possibly fail, so SVE_CFLAGS would not be set? Yes, the SVE compiler test code could be treated as a gcc/clang version check. `arm_sve.h`, which is included in `sleef.h` and then in `vect_math_sve.c`, is the SVE ACLE (Arm C Language Extensions) header file. It has been included in gcc starting from version 10 (may not be exact, but gcc 8/9 fail when compiling C code that includes this header). We have to make sure the compiler supports the SVE ACLE before using it. Here are the different scenarios:

1. The SVE compiler test succeeds, and `SVE_CFLAGS` is set to `-march=armv8-a+sve`. All symbols in `libvmath.so` are built successfully, including NEON/SVE. Hence, the vector math operations with all kinds of vector size on both NEON/SVE machines will be improved as expected.
2. The SVE compiler test fails, and `SVE_CFLAGS` is null. SVE symbols in `libvmath.so` cannot be built. Only NEON symbols exist in `libvmath.so`. Hence, the enhancement for vector math operations with > 128-bit vector size on SVE machines is missing.

------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1838061502 From stuefe at openjdk.org Mon Dec 4 09:08:41 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 4 Dec 2023 09:08:41 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 09:05:22 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. >> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust macOS coding I also think it seems odd to include it in JFR. As an alternative, I propose to put a test for that condition into the JNI checker (`-Xcheck:jni`). The point of that option is to check third-party native code for problems, so it's a perfect fit.
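The condition under discussion, a native library changing the floating-point environment as a side effect of being loaded, can be detected with the standard `<cfenv>` functions. The sketch below is an assumption-laden illustration rather than the HotSpot `dlopen_helper` or JNI checker code: `load_with_fenv_guard` is a hypothetical name, the callback stands in for the actual library load, and only the rounding mode is compared.

```cpp
#include <cassert>
#include <cfenv>
#include <functional>

// Runs `load` (standing in for loading a native library) and reports
// whether it changed the rounding mode. If it did, the floating-point
// environment saved beforehand is restored.
// Returns false when nothing changed or when the environment could not be saved.
bool load_with_fenv_guard(const std::function<void()>& load) {
  std::fenv_t saved;
  if (std::fegetenv(&saved) != 0) {
    return false;  // could not capture the environment, nothing to compare
  }
  const int rounding_before = std::fegetround();

  load();

  if (std::fegetround() == rounding_before) {
    return false;  // rounding mode untouched
  }
  std::fesetenv(&saved);  // undo the change made by the library
  return true;
}
```

A checker built along these lines could log or warn whenever the function returns `true`, which identifies the offending library at the moment it is loaded.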
------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1838113884 From ayang at openjdk.org Mon Dec 4 09:11:41 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 4 Dec 2023 09:11:41 GMT Subject: RFR: 8317809: Insertion of free code blobs into code cache can be very slow during class unloading [v4] In-Reply-To: <2_ONmN3qxsdTIEJMbQhE82nBn10l_RnZm5-DZAmQn2I=.9ad49c74-3721-4299-8a9e-b8c1973eb494@github.com> References: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> <2_ONmN3qxsdTIEJMbQhE82nBn10l_RnZm5-DZAmQn2I=.9ad49c74-3721-4299-8a9e-b8c1973eb494@github.com> Message-ID: On Thu, 30 Nov 2023 11:38:28 GMT, Thomas Schatzl wrote: >> Insert code blobs in a sorted fashion to exploit the finger-optimization when adding, making this procedure O(n) instead of O(n^2) >> >> Introduces a globally available ClassUnloadingContext that contains common methods pertaining to class and code unloading. GCs may use it to efficiently manage unlinked class loader datas and nmethods to allow use of common methods (unlink/merge). >> >> The steps typically are registering a new to be unlinked CLD/nmethod, and then purge its memory later. STW collectors perform this work in one big chunk taking the CodeCache_lock, for the entire duration, while concurrent collectors lock/unlock for every insertion to allow for concurrent users for the lock to progress. >> >> Some care has been taken to stay consistent with an "unloading = unlinking + purge" scheme; however particularly the existing CLD handling API (still) mixes unlinking and purging in its CLD::unload() call. To simplify this change that is mostly geared towards separating nmethod unlinking from purging, to make code blob freeing O(n) instead of O(n^2). >> >> Upcoming changes will >> * separate nmethod unregistering from nmethod purging to allow doing that in bulk (for the STW collectors); that can significantly reduce code purging time for the STW collectors. 
>> * better name the second stage of unlinking (called "cleaning" throughout, e.g. the work done in `G1CollectedHeap::complete_cleaning`) >> * untangle CLD unlinking and what's called "cleaning" now to allow moving more stuff into the second unlinking stage for better parallelism >> * G1: move some significant tasks from the remark pause to concurrent (unregistering nmethods, freeing code blobs and cld/metaspace purging) >> * Maybe move Serial/Parallel GC metaspace purging closer to other unlinking/purging code to keep things local and allow easier logging. >> >> Please also first looking into the (small) PR this depends on. >> >> The crash on linux-x86 is fixed by PR#16766 which I split out for quicker reviews. >> >> Testing: tier1-7 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with three additional commits since the last revision: > > - remove trailing whitespace > - fix indentation after recent commit > - Address ayang/iwalulya review comments, remove inheritance in ClassUnloadingContext for now as unnecessary for this change, use iterators, other review comments src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 1686: > 1684: // Unload Klasses, String, Code Cache, etc. > 1685: if (ClassUnloadingWithConcurrentMark) { > 1686: _g1h->unload_classes_and_code("Class Unloading", &is_alive, _gc_timer_cm); Kind of preexisting: I'd not expect to find "Class Unloading" inside `weak_refs_work()`. Its caller level is more reasonable, IMO.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16759#discussion_r1410762799 From ayang at openjdk.org Mon Dec 4 09:11:47 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 4 Dec 2023 09:11:47 GMT Subject: RFR: 8317809: Insertion of free code blobs into code cache can be very slow during class unloading [v5] In-Reply-To: References: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> Message-ID: On Mon, 4 Dec 2023 07:56:04 GMT, Thomas Schatzl wrote: >> Insert code blobs in a sorted fashion to exploit the finger-optimization when adding, making this procedure O(n) instead of O(n^2) >> >> Introduces a globally available ClassUnloadingContext that contains common methods pertaining to class and code unloading. GCs may use it to efficiently manage unlinked class loader datas and nmethods to allow use of common methods (unlink/merge). >> >> The steps typically are registering a new to be unlinked CLD/nmethod, and then purge its memory later. STW collectors perform this work in one big chunk taking the CodeCache_lock, for the entire duration, while concurrent collectors lock/unlock for every insertion to allow for concurrent users for the lock to progress. >> >> Some care has been taken to stay consistent with an "unloading = unlinking + purge" scheme; however particularly the existing CLD handling API (still) mixes unlinking and purging in its CLD::unload() call. To simplify this change that is mostly geared towards separating nmethod unlinking from purging, to make code blob freeing O(n) instead of O(n^2). >> >> Upcoming changes will >> * separate nmethod unregistering from nmethod purging to allow doing that in bulk (for the STW collectors); that can significantly reduce code purging time for the STW collectors. >> * better name the second stage of unlinking (called "cleaning" throughout, e.g. 
the work done in `G1CollectedHeap::complete_cleaning`) >> * untangle CLD unlinking and what's called "cleaning" now to allow moving more stuff into the second unlinking stage for better parallelism >> * G1: move some significant tasks from the remark pause to concurrent (unregistering nmethods, freeing code blobs and cld/metaspace purging) >> * Maybe move Serial/Parallel GC metaspace purging closer to other unlinking/purging code to keep things local and allow easier logging. >> >> Please also first looking into the (small) PR this depends on. >> >> The crash on linux-x86 is fixed by PR#16766 which I split out for quicker reviews. >> >> Testing: tier1-7 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: > > - Merge branch 'master' into 8317809-sorted-insertion-of-free-blobs > - remove trailing whitespace > - fix indentation after recent commit > - Address ayang/iwalulya review comments, remove inheritance in ClassUnloadingContext for now as unnecessary for this change, use iterators, other review comments > - Merge branch 'master' into mergeme > - iwalulya review, naming > - 8317809 Insert code blobs in a sorted fashion to exploit the finger-optimization when adding, making this procedure O(n) instead of O(n^2) > > Introduce a globally available ClassUnloadingContext that contains common methods pertaining to class and code unloading. > GCs may use it to efficiently manage unlinked class loader datas and nmethods to allow use of common methods (unlink/merge). > > The steps typically are registering a new to be unlinked CLD/nmethod, and then purge its memory later. STW collectors perform > this work in one big chunk taking the CodeCache_lock, for the entire duration, while concurrent collectors lock/unlock for every > insertion to allow for concurrent users for the lock to progress. 
> > Some care has been taken to stay consistent with an "unloading = unlinking + purge" scheme; however particularly the existing > CLD handling API (still) mixes unlinking and purging in its CLD::unload() call. To simplify this change that is mostly geared > towards separating nmethod unlinking from purging, to make code blob freeing O(n) instead of O(n^2). > > Upcoming changes will > * separate nmethod unregistering from nmethod purging to allow doing that in bulk (for the STW collectors); that can significantly > reduce code purging time for the STW collectors. > * better name the second stage of unlinking (called "cleaning" throughout, e.g. the work done in `G1CollectedHeap::complete_cleaning`) > * untangle CLD unlinking and what's called "cleaning" now to allow moving more stuff into the second unlinking stage for better > parallelism > * G1: move some significant tasks from the remark pause to concurrent (unregistering nmethods, freeing code blobs and cld/metaspace purging) > * Maybe move Serial/Parallel GC metaspace purging closer to other unlinking/purging code to keep things local and allow easier logging. > - Only run test case on debug VMs, sufficient > - 8320331 g1 full gc "during" verification accesses half-unloa... src/hotspot/share/gc/shared/classUnloadingContext.hpp line 38: > 36: static ClassUnloadingContext* _context; > 37: > 38: ClassLoaderData* volatile _cld_head; I don't get why `ClassLoaderData* volatile _cld_head;` needs to be inside `ClassUnloadingContext`. What's the motivation for the change to files in `src/hotspot/share/classfile/`? Seems that they have nothing to do with other fields inside this class, which are for sort-before-free of nmethods.
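A minimal sketch of the sorted-insertion idea from the PR description quoted above, assuming an address-ordered free list that remembers its last insertion point (the "finger"). All names here (`FreeNode`, `SortedFreeList`, `search_steps`) are hypothetical and are not HotSpot's code heap implementation; the sketch only illustrates why sorting before freeing avoids the repeated list walks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical free-list node, for illustration only.
struct FreeNode {
  std::uintptr_t addr = 0;
  FreeNode* next = nullptr;
};

// Address-ordered singly linked list with a "finger" at the last insertion.
class SortedFreeList {
  FreeNode* _head = nullptr;
  FreeNode* _finger = nullptr;  // most recently inserted node
  std::size_t _steps = 0;       // nodes walked while searching (for illustration)

 public:
  void insert(FreeNode* n) {
    // Resume at the finger if the new node lies above it, else restart at the head.
    FreeNode* prev = (_finger != nullptr && _finger->addr < n->addr) ? _finger : nullptr;
    FreeNode* cur = (prev != nullptr) ? prev->next : _head;
    while (cur != nullptr && cur->addr < n->addr) {
      prev = cur;
      cur = cur->next;
      _steps++;
    }
    n->next = cur;
    if (prev == nullptr) {
      _head = n;
    } else {
      prev->next = n;
    }
    _finger = n;
  }

  std::size_t steps() const { return _steps; }

  bool is_sorted() const {
    for (FreeNode* p = _head; p != nullptr && p->next != nullptr; p = p->next) {
      if (p->addr > p->next->addr) return false;
    }
    return true;
  }
};

// Inserts all addresses, optionally sorting them first, and returns how many
// list nodes had to be walked in total.
std::size_t search_steps(std::vector<std::uintptr_t> addrs, bool sort_first) {
  if (sort_first) std::sort(addrs.begin(), addrs.end());
  std::vector<FreeNode> nodes(addrs.size());
  for (std::size_t i = 0; i < addrs.size(); i++) nodes[i].addr = addrs[i];
  SortedFreeList list;
  for (FreeNode& n : nodes) list.insert(&n);
  assert(list.is_sorted());
  return list.steps();
}
```

With pre-sorted input every insertion starts at the finger and walks nothing; with an unfavourable order each insertion may restart from the head, which is where the quadratic cost comes from.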
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16759#discussion_r1413566268 From luhenry at openjdk.org Mon Dec 4 09:15:39 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Mon, 4 Dec 2023 09:15:39 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg In-Reply-To: References: Message-ID: <-3jwuz4yezWZ6T1YsziD2LxZq_8mO9-TMSeGxmlYHR0=.552c9484-699f-4744-89bb-76680bc450b3@github.com> On Thu, 30 Nov 2023 17:48:11 GMT, Ludovic Henry wrote: > 8315856: RISC-V: Use Zacas extension for cmpxchg I'm OOO this week, I'll take a look next week. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16910#issuecomment-1838124434 From stefank at openjdk.org Mon Dec 4 09:32:03 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 4 Dec 2023 09:32:03 GMT Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC [v4] In-Reply-To: References: Message-ID: > There is code in `os::large_page_init()` that checks `/sys/kernel/mm/transparent_hugepage/enabled` and forcefully turns off `UseTransparentHugePages` if anonymous THPs are disabled in the OS: > > > if (UseTransparentHugePages && !HugePages::supports_thp()) { > if (!FLAG_IS_DEFAULT(UseTransparentHugePages)) { > log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system."); > } > UseLargePages = UseTransparentHugePages = false; > return; > } > > > This is problematic because ZGC doesn't use the `/sys/kernel/mm/transparent_hugepage/enabled` THPs, but instead the `/sys/kernel/mm/transparent_hugepage/shmem_enabled` THPs. So, with the following settings: > > /sys/kernel/mm/transparent_hugepage/enabled: never > /sys/kernel/mm/transparent_hugepage/shmem_enabled: advise > > > the above code will force ZGC to run without THPs. > > This PR is a proposal for how to work around this in the ZGC code without disturbing the rest of the JVM too much.
The patch:
>
> 1) remembers the initial values for UseLargePages and UseTransparentHugePages and saves those so that ZGC can continue using THPs even though they have been disabled for the rest of the JVM.
>
> 2) adds better logic to figure out if ZGC is actually going to get THPs for the heap or not. This is then used to more accurately log the current situation and allows for a precise usage of `madvise + MADV_HUGEPAGE`.
>
> 3) tweaks the generic pagesize logging to better reflect the situation when anonymous THPs are disabled but shared memory THPs are enabled and ZGC is used.
>
> The result of this change can be seen in these tables:
>
> ZGC large pages log output:
>
> E (T)     = Enabled (Transparent)
> E (T, OS) = Enabled (Transparent, OS enforced)
> D         = Disabled
> D (OS)    = Disabled (OS enforced)
>
> -XX:+UseTransparentHugePages
>
> shmem \ anon | always | madvise | never
> -------------+--------+---------+-------
> always       | E (T)  | E (T)   | E (T)
> within_size  | E (T)  | E (T)   | E (T)
> advise       | E (T)  | E (T)   | E (T)
> never        | D (OS) | D (OS)  | D (OS)
> deny         | D (OS) | D (OS)  | D (OS)
> force        | E (T)  | E (T)   | E (T)
>
> -XX:-UseTransparentHugePages
>
> shmem \ anon | always    | madvise   | never
> -------------+-----------+-----------+----------
> always       | E (T, OS) | E (T, OS) | E (T, OS)
> within_size  | E (T, OS) | E (T, OS) | E (T, OS)
> advise       | D         | D         | D
> never        | D         | D         | D
> deny         | D         | D         | D
> force ...
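For reference, the two sysfs files named in the description mark the active mode with square brackets, e.g. `always within_size advise [never] deny force`. Below is a minimal sketch of extracting that selection and applying the `-XX:+UseTransparentHugePages` table from the description. The function names are hypothetical and this is not the code from the patch; the `-XX:-UseTransparentHugePages` case is left out because its table is cut off in the quoted text.

```cpp
#include <string>

// Returns the token enclosed in [...] on a sysfs THP mode line,
// or an empty string if no bracketed token is present.
std::string selected_thp_mode(const std::string& line) {
  const std::string::size_type open = line.find('[');
  if (open == std::string::npos) return "";
  const std::string::size_type close = line.find(']', open + 1);
  if (close == std::string::npos) return "";
  return line.substr(open + 1, close - open - 1);
}

// Mirrors the -XX:+UseTransparentHugePages table: every shmem mode except
// "never" and "deny" yields "E (T)", independent of the anonymous THP mode.
bool shmem_thp_enabled_with_flag(const std::string& shmem_mode) {
  return shmem_mode == "always" || shmem_mode == "within_size" ||
         shmem_mode == "advise" || shmem_mode == "force";
}
```

A caller would read one line from `/sys/kernel/mm/transparent_hugepage/shmem_enabled`, pass it to `selected_thp_mode`, and feed the result to the predicate.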
Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: More precise THP warning messages ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16690/files - new: https://git.openjdk.org/jdk/pull/16690/files/cb692dea..ec05ba30 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16690&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16690&range=02-03 Stats: 44 lines in 1 file changed: 26 ins; 15 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16690.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16690/head:pull/16690 PR: https://git.openjdk.org/jdk/pull/16690 From sjohanss at openjdk.org Mon Dec 4 09:32:11 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Mon, 4 Dec 2023 09:32:11 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v53] In-Reply-To: References: <-GX8bATX2hz3YWgnJbhTNEYbi4t8HxfdhYqBP-ulyGg=.0080d7b0-8e43-4b81-b885-1d4a742048cc@github.com> <810qMt__o90-A1Csix4IiygZEpyP09w8tisrwY5mQC4=.b8696cfe-2323-4a6d-884c-47df6568a337@github.com> Message-ID: On Sat, 2 Dec 2023 01:22:33 GMT, Jonathan Joo wrote: >> I still think that a total counter is useful and I'd appreciate if you can keep it. To second what @caoman said before, it is GC agnostic, easy to use even for non GC experts and future proof with regards to implementation changes in the GCs. Please keep it. > > Put the closure in a scope, I think that should address the concern. Yes, scoping it will work. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1413591437 From stefank at openjdk.org Mon Dec 4 09:32:08 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 4 Dec 2023 09:32:08 GMT Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC [v2] In-Reply-To: References: Message-ID: <3ZiujwFod2l9RQKLDZTHOLpCFCqWdBmlSEMANllEDkQ=.6750f422-cc50-4684-b7d0-cf211ef91dd8@github.com> On Sat, 2 Dec 2023 06:51:07 GMT, Thomas Stuefe wrote: > Would it be not clearer to define when to warn, as we do in warn_no_large_pages? I don't understand what you are suggesting with this question / request, so I'm not sure exactly what you are looking for. Instead, I made my own version of the pseudo code you posted. This is the warnings I get with that change: Without ZGC: $ thp never never always madvise [never] always within_size advise [never] deny force $ java -XX:+UseTransparentHugePages -version [0.002s][warning][pagesize] Anonymous transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' to enable them. [0.002s][warning][pagesize] UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system. ... $ thp never advise always madvise [never] always within_size [advise] never deny force java -XX:+UseTransparentHugePages -version [0.002s][warning][pagesize] Anonymous transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' to enable them. [0.002s][warning][pagesize] UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system. ... $ thp madvise never always [madvise] never always within_size advise [never] deny force $ java -XX:+UseTransparentHugePages -version ... $ thp madvise advise always [madvise] never always within_size [advise] never deny force $ java -XX:+UseTransparentHugePages -version ... 
With ZGC: $ thp never never always madvise [never] always within_size advise [never] deny force $ java -XX:+UseTransparentHugePages -XX:+UseZGC -version [0.002s][warning][pagesize] Shared memory transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/shmem_enabled to 'advise' to enable them. [0.002s][warning][pagesize] Anonymous transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' to enable them. [0.002s][warning][pagesize] UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system. ... $ thp never advise always madvise [never] always within_size [advise] never deny force $ java -XX:+UseTransparentHugePages -XX:+UseZGC -version [0.001s][warning][pagesize] Anonymous transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' to enable them. [0.001s][warning][pagesize] UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system. ... $ thp madvise never always [madvise] never always within_size advise [never] deny force $ java -XX:+UseTransparentHugePages -XX:+UseZGC -version [0.002s][warning][pagesize] Shared memory transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/shmem_enabled to 'advise' to enable them. ... $ thp madvise advise always [madvise] never always within_size [advise] never deny force $ java -XX:+UseTransparentHugePages -XX:+UseZGC -version ... Please take a look and see if this is an OK solution. > src/hotspot/os/linux/os_linux.cpp line 3736: > >> 3734: ls.print_cr(". Default large page size: " EXACTFMT ".", EXACTFMTARGS(os::large_page_size())); >> 3735: } else { >> 3736: ls.print("Large page support %sdisabled.", uses_zgc_shmem_thp() ? 
"partially " : ""); > > I wonder whether we could make our life simpler by not supporting mixes: we could require that for ZGC, to use THP, both shmen and anon thps have to be active. Would that be acceptable or do you think there are too many misconfigured systems out there? I would prefer to not force users to set both. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16690#discussion_r1413588574 PR Review Comment: https://git.openjdk.org/jdk/pull/16690#discussion_r1413591291 From aboldtch at openjdk.org Mon Dec 4 09:53:15 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 4 Dec 2023 09:53:15 GMT Subject: RFR: 8319773: Avoid inflating monitors when installing hash codes for LM_LIGHTWEIGHT [v10] In-Reply-To: References: <2MRTHFoYSaSW2NH922LOEvqKx4NLjshWaHJaYV2RdVY=.e234046a-aac8-4d7b-81b9-269506944165@github.com> Message-ID: On Thu, 30 Nov 2023 22:02:14 GMT, Daniel D. Daugherty wrote: >> For now I believe the extra code noise from trying to handle this race with deflation is not worth it. I creates some questionable code paths and head scratchers. If we were to add a separate FastHashCode just for LM_LIGHTWEIGHT it would be worth it as the while loop body would look quite a bit different and be easier to reason about. >> >> But I was looking for input if we should handle this case regardless of code complexity. Or maybe taking this all the way and create a separate FastHashCode with its own more understandable logic which does not have to try to fit in with the legacy locking/inflation protocol. >> >> Regardless if we were to just go with it as it is now there should probably be a comment here along the line: >> ```c++ >> // With LM_LIGHTWEIGHT FastHashCode may race with deflation here and cause a monitor to be re-inflated. > > I don't think the race with deflation is limited to LM_LIGHTWEIGHT. 
The inflation > code below detects when there is a collision with async deflation and retries > which can lead to a re-inflation when we loop around again. We can reach the > code below with LM_LEGACY, LM_LIGHTWEIGHT, or LM_MONITOR so I don't > think you need the LM_LIGHTWEIGHT specific comment. > > Yes, we can reach this point in the code when `mark.has_monitor() == true` and > not just when `LockingMode == LM_LIGHTWEIGHT`, but the `inflate()` function > already has to handle that race (and it does). When a Java monitor is lightweight > locked or stack-locked, there can be more than one contending thread and each > of those threads will attempt to `inflate()` the Java monitor into an ObjectMonitor. > Only one thread can win the inflation race and all of the racers trust `inflate()` > to do the right thing. What's the "right thing"? One of the callers to `inflate()` will > install the ObjectMonitor successfully and return it to that caller. All of the other > callers to `inflate()` will detect that they lost the race and return the winner's > ObjectMonitor to their callers. > > There's no reason for the logic to skip the call to `inflate()` because races are > already handled by `inflate()`. > > We got into this spiraling thread because we were trying to figure out if a > non-JavaThread could call `inflate()` because `inflate()` can call `is_lock_owned()` > which has a header comment which talks about non-JavaThreads... > > I believe that is possible with JVM/TI tagging even when we are in > LM_LIGHTWEIGHT mode because a lightweight monitor can be inflated > by a contending thread which can cause the ObjectMonitor to have an > anonymous owner. 
In that case, this if-statement in `inflate()` can execute: > > if (LockingMode == LM_LIGHTWEIGHT && inf->is_owner_anonymous() && is_lock_owned(current, object)) { > inf->set_owner_from_anonymous(current); > JavaThread::cast(current)->lock_stack().remove(object); > } > > Of course, if our caller is the VMThread, `is_lock_owned()` will return > false so we won't execute the if-statement's code block. There might be some confusion about what I am asking for here. This enhancement is to avoid inflating monitors when installing hash codes on objects with LM_LIGHTWEIGHT. The current state of the PR does this except for when it is racing with deflation. It is very possible to avoid inflating for the race as well. The question is not whether the race is handled, rather that it could be handled in such a way that installing a hash code would never cause monitor inflation. My question in this thread is whether we should handle this case. As already stated my opinion is let the race be handled by inflating and accept that we get some occasional `InflateCause::inflate_cause_hash_code` even with LM_LIGHTWEIGHT. But I do believe that there should be a comment about this. And if the consensus is to instead handle the race by retrying (and thus avoiding inflation completely), then we should split out the lightweight FastHashCode into its own loop. > We got into this spiraling thread because we were trying to figure out if a > non-JavaThread could call `inflate()` because `inflate()` can call `is_lock_owned()` > which has a header comment which talks about non-JavaThreads... I think it ultimately is because this enhancement claims to avoid inflating monitors, so why would `is_lock_owned()` be needed, but it is not the case as the current implementation does not handle the potential race with deflation. I wanted to add a comment to make it clear that this is known and intentional. 
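The "one winner, everyone else gets the winner's monitor" behaviour of `inflate()` described in the quoted reply can be illustrated with a single compare-and-swap. This is only a sketch under simplifying assumptions: `Obj`, `Monitor` and `inflate_sketch` are hypothetical names, the real code installs the monitor through the object's mark word, and the race with deflation discussed in this thread is not modelled.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-ins for ObjectMonitor and a Java object header.
struct Monitor {
  int id;
};

struct Obj {
  std::atomic<Monitor*> monitor{nullptr};
};

// Tries to install 'candidate' as the object's monitor and returns whichever
// monitor is installed afterwards. A caller that gets back something other
// than its own candidate lost the race and should discard the candidate.
Monitor* inflate_sketch(Obj& obj, Monitor* candidate) {
  assert(candidate != nullptr);
  Monitor* expected = nullptr;
  if (obj.monitor.compare_exchange_strong(expected, candidate)) {
    return candidate;  // this caller won the race
  }
  return expected;     // on failure, 'expected' now holds the winner's monitor
}
```

Because every caller ends up with the same monitor regardless of who won, callers do not need to check for a concurrent inflation before calling.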
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16603#discussion_r1413616484 From jsjolen at openjdk.org Mon Dec 4 09:57:39 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Mon, 4 Dec 2023 09:57:39 GMT Subject: RFR: 8319709: Make GrowableArrayCHeap copyable [v2] In-Reply-To: <0CTdcVjWDVOuV23lJ0EGFGgqa4x_P_UxSo5N4WzTJTE=.45f3ba14-bd81-4c58-afe7-6b4849e568aa@github.com> References: <2SEJ0Rh7DNmKgcylAW7_DFxas2Bs3YzTnUSe39OIVsI=.03298520-694f-4ba7-bdce-d1e67eb3872e@github.com> <0UAh881Jw6L5YNbClDQmuE_Q6fzv0ayeqkrblIoigZ8=.5d81b8a8-d04c-48cb-8987-f3fba98ac403@github.com> <0CTdcVjWDVOuV23lJ0EGFGgqa4x_P_UxSo5N4WzTJTE=.45f3ba14-bd81-4c58-afe7-6b4849e568aa@github.com> Message-ID: On Thu, 9 Nov 2023 07:17:32 GMT, Johan Sjölen wrote: >> What is the motivation for this? Please add something to JBS. Also see query below. >> >> Thanks > >> What is the motivation for this? Please add something to JBS. Also see query below. >> >> Thanks > > I need this feature because I want to store `GrowableArray`s within `GrowableArray`s without an unnecessary pointer indirection. I'll add this justification to JBS. > @jdksjolen have you checked all use cases where `GrowableArrayCHeap` is copied? I don't know the state of `GrowableArrayCHeap`, but for `GrowableArray`, there are lots of cases where we actually just would like to "swap" or "move" over the data, and not really duplicate/clone the data. It would just be nice to avoid the overhead of more allocations. > > I guess that is the drawback of the copy-constructor: you can very easily miss heavy allocations. > > Might it be better to forbid the assignment operator and the copy constructor, and make the copying explicit, i.e. with a `copy_from` method? Sorry, I forgot to reply last week. I went through all of the usages and nothing does unintended cloning of the data.
The code is written generically for both GA and GACH, so internally unnecessary copying is done (if that's what you're talking about). I think `copy_from` is a guard rail that would inconvenience us as it'd be impossible to use the rest of the C++ idioms that come with copy ctr+copy assignment. If you create a copy, then you need to think about the consequences of that copy, regardless of the class. That's just (one of the many) consequences of coding in C++. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16559#issuecomment-1838196027 From epeter at openjdk.org Mon Dec 4 10:01:39 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 4 Dec 2023 10:01:39 GMT Subject: RFR: 8319709: Make GrowableArrayCHeap copyable [v2] In-Reply-To: References: <2SEJ0Rh7DNmKgcylAW7_DFxas2Bs3YzTnUSe39OIVsI=.03298520-694f-4ba7-bdce-d1e67eb3872e@github.com> <0UAh881Jw6L5YNbClDQmuE_Q6fzv0ayeqkrblIoigZ8=.5d81b8a8-d04c-48cb-8987-f3fba98ac403@github.com> <0CTdcVjWDVOuV23lJ0EGFGgqa4x_P_UxSo5N4WzTJTE=.45f3ba14-bd81-4c58-afe7-6b4849e568aa@github.com> Message-ID: On Mon, 4 Dec 2023 09:55:14 GMT, Johan Sjölen wrote: >>> What is the motivation for this? Please add something to JBS. Also see query below. >>> >>> Thanks >> >> I need this feature because I want to store `GrowableArray`s within `GrowableArray`s without an unnecessary pointer indirection. I'll add this justification to JBS. > >> @jdksjolen have you checked all use cases where `GrowableArrayCHeap` is copied? I don't know the state of `GrowableArrayCHeap`, but for `GrowableArray`, there are lots of cases where we actually just would like to "swap" or "move" over the data, and not really duplicate/clone the data. It would just be nice to avoid the overhead of more allocations. >> >> I guess that is the drawback of the copy-constructor: you can very easily miss heavy allocations. >> >> Might it be better to forbid the assignment operator and the copy constructor, and make the copying explicit, i.e.
with a `copy_from` method? > > Sorry, I forgot to reply last week. I went through all of the usages and nothing does unintended cloning of the data. The code is written generically for both GA and GACH, so internally unnecessary copying is done (if that's what you're talking about). > > I think `copy_from` is a guard rail that would inconvenience us as it'd be impossible to use the rest of the C++ idioms that come with copy ctr+copy assignment. If you create a copy, then you need to think about the consequences of that copy, regardless of the class. That's just (one of the many) consequences of coding in C++. @jdksjolen I think the problem is that copy-constructing is also done implicitly, without people realizing it. The question is if we want to trust everybody to understand this and catch it. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16559#issuecomment-1838201418 From jbachorik at openjdk.org Mon Dec 4 10:12:57 2023 From: jbachorik at openjdk.org (Jaroslav Bachorik) Date: Mon, 4 Dec 2023 10:12:57 GMT Subject: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 05:15:15 GMT, Thomas Stuefe wrote: >> Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: >> >> Restrict cleanup to obsolete methods only > > I won't be able to review this this week, too snowed in atm. I can take a look next week. We can always just revert the change if needed. > > Thinking about Skara, I think as long as we have this confusing mixture of rules (hotspot wants 2 reviewers that are Reviewer/Committer, but some jdk libs only want one, but then you need two for desktop I think otherwise Phil gets angry) - we should hard-code the 2-reviewer rule into skara as default since it affects the lion's share of all changes. @tstuefe I got confused by the Skara tooling. 
I had a vague memory of some discussions going on about relaxing the requirement of 2 reviewers for some parts to the code base and I thought I was in a good shape seeing the Skara checkbox. ![Screenshot 2023-12-04 at 11 04 00](https://github.com/openjdk/jdk/assets/738413/a5e363ee-a9e0-4121-9677-c059aa299dd4) As for not having a review for the final version - I am not that restless. I specifically dismissed the previous review to avoid incidentally integrating based on a review of a version that was not actual. Then I asked @coleenp to re-do the review on the final bits (https://github.com/openjdk/jdk/pull/16662#issuecomment-1827432032) @dholmes-ora >Okay so why does this happen and is it a reasonable thing to be happening? On the surface it sounds wrong to deallocate anything associated with a live classloader. This is happening for previous versions of retransformed methods. As long as those methods are still on stack they are kept alive. But once they are not executing they are free to be destroyed. And this is where the problem was happening - the previous versions of methods were being destroyed but the associated jmethodIDs were not updated not to point to what became an invalid memory block. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1838221097 From shade at openjdk.org Mon Dec 4 10:17:49 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 4 Dec 2023 10:17:49 GMT Subject: RFR: 8315559: Delay TempSymbol cleanup to avoid symbol table churn [v14] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 09:55:47 GMT, Oli Gillespie wrote: >> Attempt to fix regressions in class-loading performance caused by fixing a symbol table leak in [JDK-8313678](https://bugs.openjdk.org/browse/JDK-8313678). >> >> See lengthy discussion in https://bugs.openjdk.org/browse/JDK-8315559 for more background. In short, the leak was providing an accidental cache for temporary symbols, allowing reuse. 
>> >> This change keeps new temporary symbols alive in a queue for a short time, allowing them to be re-used by subsequent operations. For example, when attempting to load a class we may call JVM_FindLoadedClass for multiple classloaders back-to-back, and all of them will create a TempNewSymbol for the same string. At present, each call will leave a dead symbol in the table and create a new one. Dead symbols add cleanup and lookup overhead, and insertion is also costly. With this change, the symbol from the first invocation will live for some time after it is used, and subsequent callers can find the symbol alive in the table - avoiding the extra work. >> >> The queue is bounded, and when full new entries displace the oldest entry. This means symbols are held for the time it takes for 100 new temp symbols to be created. 100 is chosen arbitrarily - the tradeoff is memory usage versus 'cache' hit rate. >> >> When concurrent symbol table cleanup runs, it also drains the queue. >> >> In my testing, this brings Dacapo pmd performance back to where it was before the leak was fixed. >> >> Thanks @shipilev , @coleenp and @MDBijman for helping with this fix. > > Oli Gillespie has updated the pull request incrementally with one additional commit since the last revision: > > Add copyright header for new file Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16398#pullrequestreview-1762033992 From mgronlun at openjdk.org Mon Dec 4 10:21:02 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 4 Dec 2023 10:21:02 GMT Subject: RFR: 8211238: @Deprecated JFR event [v9] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. 
> > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: relax assertion for native caller ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16931/files - new: https://git.openjdk.org/jdk/pull/16931/files/4e78e895..d67d2fcb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=07-08 Stats: 3 lines in 1 file changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From jbachorik at openjdk.org Mon Dec 4 10:21:03 2023 From: jbachorik at openjdk.org (Jaroslav Bachorik) Date: Mon, 4 Dec 2023 10:21:03 GMT Subject: RFR: 8211238: @Deprecated JFR event [v8] In-Reply-To: References: Message-ID: On Sun, 3 Dec 2023 16:44:28 GMT, Markus Grönlund wrote: >> A question about "level". Is the intention that the value can be anything, e.g. some new event next month might use the values "1", "2", "3"? Just asking because ordinarily deprecated vs. terminally deprecated is very specific to the manner in which a program element is deprecated and I assume you don't want this event grabbing the general name for a very specific event setting. > >> A question about "level". Is the intention that the value can be anything, e.g. some new event next month might use the values "1", "2", "3"? Just asking because ordinarily deprecated vs. terminally deprecated is very specific to the manner in which a program element is deprecated and I assume you don't want this event grabbing the general name for a very specific event setting. > > Yes, the design is generic. An event control/setting to be used also for other events.
Hi @mgronlun - sorry for opening a design discussion in PR :( I wonder - will this report every single invocation of a deprecated method conforming to the rules (JDK method called from non-JDK code)? Can this, potentially, flood the recording if the deprecated method gets called from a hot loop? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16931#issuecomment-1838232356 From mgronlun at openjdk.org Mon Dec 4 10:40:40 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 4 Dec 2023 10:40:40 GMT Subject: RFR: 8211238: @Deprecated JFR event [v8] In-Reply-To: References: Message-ID: On Sun, 3 Dec 2023 16:44:28 GMT, Markus Grönlund wrote: >> A question about "level". Is the intention that the value can be anything, e.g. some new event next month might use the values "1", "2", "3"? Just asking because ordinarily deprecated vs. terminally deprecated is very specific to the manner in which a program element is deprecated and I assume you don't want this event grabbing the general name for a very specific event setting. > >> A question about "level". Is the intention that the value can be anything, e.g. some new event next month might use the values "1", "2", "3"? Just asking because ordinarily deprecated vs. terminally deprecated is very specific to the manner in which a program element is deprecated and I assume you don't want this event grabbing the general name for a very specific event setting. > > Yes, the design is generic. An event control/setting to be used also for other events. > Hi @mgronlun - sorry for opening a design discussion in PR :( > > I wonder - will this report every single invocation of a deprecated method conforming to the rules (JDK method called from non-JDK code)? Can this, potentially, flood the recording if the deprecated method gets called from a hot loop? Hi @jbachorik, it will only report one event per unique call site, during link time. It's not a function of hotness, only unique edge discovery.
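A rough sketch of the "one event per unique call site" behaviour described in the reply above, assuming an edge is identified by a (caller, callee) pair that is recorded when the call is linked. `EdgeReporter` and `on_link` are hypothetical names used for illustration; this is not JFR's implementation.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>

// Remembers which caller -> deprecated-callee edges have been seen.
class EdgeReporter {
  std::set<std::pair<std::string, std::string>> _seen;

 public:
  // Called when a call site is linked. Returns true only the first time a
  // given edge is discovered, i.e. when an event would be emitted.
  bool on_link(const std::string& caller, const std::string& callee) {
    return _seen.insert(std::make_pair(caller, callee)).second;
  }

  std::size_t reported() const { return _seen.size(); }
};
```

Under this model a hot loop does not add events: the number of events is bounded by the number of distinct edges, not by the number of invocations.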
------------- PR Comment: https://git.openjdk.org/jdk/pull/16931#issuecomment-1838269120 From stuefe at openjdk.org Mon Dec 4 11:14:55 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 4 Dec 2023 11:14:55 GMT Subject: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11] In-Reply-To: References: Message-ID: <277y6BBHCLkqj7vleST0dY2hg_iZdREWXlQDWqo7dUQ=.700d1ad2-d1de-4f09-b732-32ef5436360a@github.com> On Mon, 4 Dec 2023 00:41:23 GMT, David Holmes wrote: > From the blog: > > > Yes! The methods are being deallocated for a class loader that is still alive. > > Okay so why does this happen and is it a reasonable thing to be happening? On the surface it sounds wrong to deallocate anything associated with a live classloader. This sounds odd to me too. I know that we deallocate the old *byte code* of re-transformed classes; but `Method*`? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1838413238 From tschatzl at openjdk.org Mon Dec 4 11:15:41 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 4 Dec 2023 11:15:41 GMT Subject: RFR: 8317809: Insertion of free code blobs into code cache can be very slow during class unloading [v4] In-Reply-To: References: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> <2_ONmN3qxsdTIEJMbQhE82nBn10l_RnZm5-DZAmQn2I=.9ad49c74-3721-4299-8a9e-b8c1973eb494@github.com> Message-ID: On Thu, 30 Nov 2023 14:36:19 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with three additional commits since the last revision: >> >> - remove trailing whitespace >> - fix indentation after recent commit >> - Address ayang/iwalulya review comments, remove inheritance in ClassUnloadingContext for now as unnecessary for this change, use iterators, other review comments > > src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 1686: > >> 1684: // Unload Klasses, String, Code Cache, etc.
>> 1685: if (ClassUnloadingWithConcurrentMark) { >> 1686: _g1h->unload_classes_and_code("Class Unloading", &is_alive, _gc_timer_cm); > > Kind of preexisting: I'd not expect to find "Class Unloading" inside `weak_refs_work()`. Its caller level is more reasonable, IMO. So what is your suggestion here? In some way CLDs and nmethods are weak references, so I can see the reason for the original placement, but since it's a fairly large chunk of functionality one might expect it at a higher level. If so, I can move it in a separate patch. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16759#discussion_r1413725314 From tschatzl at openjdk.org Mon Dec 4 11:15:46 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 4 Dec 2023 11:15:46 GMT Subject: RFR: 8317809: Insertion of free code blobs into code cache can be very slow during class unloading [v5] In-Reply-To: References: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> Message-ID: On Mon, 4 Dec 2023 09:08:29 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains ten commits: >> >> - Merge branch 'master' into 8317809-sorted-insertion-of-free-blobs >> - remove trailing whitespace >> - fix indentation after recent commit >> - Address ayang/iwalulya review comments, remove inheritance in ClassUnloadingContext for now as unnecessary for this change, use iterators, other review comments >> - Merge branch 'master' into mergeme >> - iwalulya review, naming >> - 8317809 Insert code blobs in a sorted fashion to exploit the finger-optimization when adding, making this procedure O(n) instead of O(n^2) >> >> Introduce a globally available ClassUnloadingContext that contains common methods pertaining to class and code unloading. >> GCs may use it to efficiently manage unlinked class loader datas and nmethods to allow use of common methods (unlink/merge).
>> >> The steps typically are registering a new to be unlinked CLD/nmethod, and then purge its memory later. STW collectors perform >> this work in one big chunk taking the CodeCache_lock, for the entire duration, while concurrent collectors lock/unlock for every >> insertion to allow for concurrent users for the lock to progress. >> >> Some care has been taken to stay consistent with an "unloading = unlinking + purge" scheme; however particularly the existing >> CLD handling API (still) mixes unlinking and purging in its CLD::unload() call. To simplify this change that is mostly geared >> towards separating nmethod unlinking from purging, to make code blob freeing O(n) instead of O(n^2). >> >> Upcoming changes will >> * separate nmethod unregistering from nmethod purging to allow doing that in bulk (for the STW collectors); that can significantly >> reduce code purging time for the STW collectors. >> * better name the second stage of unlinking (called "cleaning" throughout, e.g. the work done in `G1CollectedHeap::complete_cleaning`) >> * untangle CLD unlinking and what's called "cleaning" now to allow moving more stuff into the second unlinking stage for better >> parallelism >> * G1: move some signifcant tasks from the remark pause to concurrent (unregistering nmethods, freeing code blobs and cld/metaspace purging) >> * Maybe move Serial/Parallel GC metaspace purging closer to other unlinking/purging code to keep things local and allow easier logging. >> - Only run test case on debug VMs, sufficient >> ... > > src/hotspot/share/gc/shared/classUnloadingContext.hpp line 38: > >> 36: static ClassUnloadingContext* _context; >> 37: >> 38: ClassLoaderData* volatile _cld_head; > > I don't get why `ClassLoaderData* volatile _cld_head;` needs to be inside `ClassUnloadingContext`. What's the motivation for the change to files in `src/hotspot/share/classfile/`? 
Seems that they have nothing to do with other fields inside this class, which are for sort-before-free of nmethods. This change introduces a `ClassUnloadingContext`, not a `NmethodUnloadingContext`, and class (and nmethod) unloading consists of both (unloaded) class handling and (unloaded) nmethod handling. I did not want to first introduce a `NmethodUnloadingContext` that will soon be changed to the `ClassUnloadingContext` anyway as for easier parallel access to the CLDs a bit later. A `ClassUnloadingContext` without handling CLDs seems odd, and just fixing the nmethod stuff only seems just busywork for both the author and the reviewers later (for the sake of having minimal patches). However if you insist on changing this, I can remove the CLD handling and rename everything now and later again. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16759#discussion_r1413723381 From mgronlun at openjdk.org Mon Dec 4 11:36:27 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Mon, 4 Dec 2023 11:36:27 GMT Subject: RFR: 8211238: @Deprecated JFR event [v10] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. 
> > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: backpatching hook ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16931/files - new: https://git.openjdk.org/jdk/pull/16931/files/d67d2fcb..e5aeb3ee Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=08-09 Stats: 17 lines in 5 files changed: 11 ins; 0 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From mbaesken at openjdk.org Mon Dec 4 11:40:38 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Mon, 4 Dec 2023 11:40:38 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: <9ZTuSYp_utYFLxv7eDQFSimjFtY007yUurc1CUgfNnA=.20d65535-11e6-46d1-b3fe-710cd75ed18d@github.com> On Mon, 4 Dec 2023 09:06:04 GMT, Thomas Stuefe wrote: > I also think it seems odd to include it in JFR. > If you look at the JFR goal "Provide a low-overhead data collection framework for troubleshooting Java applications and the HotSpot JVM." (https://openjdk.org/jeps/328) I think this fits quite well into the "troubleshooting" part. Not saying that your idea > As an alternative, I propose to put a test for that condition into the JNI checker (`-Xcheck:jni`). The point of that option is to check third-party native code for problems, so it's a perfect fit. is bad; what do others think? My concern here is that the JNI checker (`-Xcheck:jni`) is probably even less used than JFR. But I might be wrong.
The cases where we directly dlopen/dlsym/fcn-call in the JDK codebase are probably not covered by the JNI checker, right? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1838452407 From mgronlun at openjdk.org Mon Dec 4 11:58:06 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 4 Dec 2023 11:58:06 GMT Subject: RFR: 8211238: @Deprecated JFR event [v11] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. > > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: JFR conditionals in backpatching code ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16931/files - new: https://git.openjdk.org/jdk/pull/16931/files/e5aeb3ee..12d9bcac Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=09-10 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From aph at openjdk.org Mon Dec 4 11:59:35 2023 From: aph at openjdk.org (Andrew Haley) Date: Mon, 4 Dec 2023 11:59:35 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Mon, 4 Dec 2023 07:56:12 GMT, Matthias Baesken wrote: > /label jfr > > I'm not sure I understand the issue, but adding a field to an event because of a GCC bug seems excessive. It's a nasty hard-to-find bug that breaks Java compatibility.
Some people have wondered if this is a real-world problem, and the answer is that it's happening, right now, in Oracle's CI testing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1838483238 From ihse at openjdk.org Mon Dec 4 12:01:47 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 4 Dec 2023 12:01:47 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: References: <3cK9QjVQNIgZoWWhrWKEb3XxfbLjprjRMBbStWegH7M=.6df92651-b97d-445a-aa42-302ea791bbea@github.com> Message-ID: On Fri, 1 Dec 2023 16:49:28 GMT, Andrew Haley wrote: >> Oh, and: >> >> If we can't trust SLEEF not to break the ABI we're using, we should not be using SLEEF. > >> @theRealAph You are making good points. >> >> You are basically saying: "we don't need any configure support for libsleef, we can just hard-code the names and dispatch to them directly to a dlopened library at runtime". > > Yep. > >> That is technically correct, but I'd still like to argue that the current setup have some merits: >> >> * It will check that there is no typo in the function names. I agree that the likelihood of getting this wrong is low, but it is still a good practice to use official include files to have the compiler help checking this. >> >> * If we would like to bundle libsleef.so with the JVM, now we have the possibility do do so easily. (Especially if it is like you say that it is not commonly installed). (If licenses allow etc) >> >> * If we want to incorporate/bundle the source code of libsleef into OpenJDK, and build it as part of our internal library, we will have a good starting position, compared to starting from a hard-coded assembly file in hotspot. (I thought I heard some noise about this prospect). >> >> >> So at this point, I am okay with the general approach of this PR. There are still some build issues to sort out, though, I'll address them separately. > > I see, OK. 
The question in my mind is whether the common builds of OpenJDK (Oracle, Adoptium, etc.) will support running with SLEEF. If by default SLEEF is not required, support won't be built, and (to an nth order approximation) no one will use it. But I guess it's better than nothing. > > Or is there likely to be a plan to e.g. build Oracle's releases with SLEEF support? @theRealAph > Or is there likely to be a plan to e.g. build Oracle's releases with SLEEF support? I can't say anything for sure, but I picked up some positive vibes from our internal chat. I think the idea was that libsleef could potentially cover vector math for all platforms that the current Intel lib solution is missing (basically, everything but linux+windows x64). So I think this can be seen as a bit of a trial balloon for whether it is worth doing a more complete integration of libsleef in the JDK. And I can't say anything at all about what Oracle JDKs are going to release with, but I am fully open towards adding libsleef to the GHA builds. And I'm guessing that Adoptium has no reason not to include libsleef, but then again, I cannot of course speak for them. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1838489588 From ihse at openjdk.org Mon Dec 4 12:04:45 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Mon, 4 Dec 2023 12:04:45 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: References: Message-ID: <_oX3Rqyygpj-612IlxZBZOZQnauEyTBjE4goHDa4Ov8=.0ccafde6-c95e-469d-aee6-77a2e75ee7a7@github.com> On Mon, 4 Dec 2023 08:31:17 GMT, Xiaohong Gong wrote: >> The final thing we need to resolve properly is the SVE compiler test. >> >> @theRealAph says: >>> arm_sve.h is part of GCC. It was added to GCC in 2019. >> >> A more relevant question is what version of gcc it was added, and if that also implies that the compiler knows about `-march=armv8-a+sve`. If so, then this test could basically be framed as a gcc version check.
>> >> I'm still leaning towards failing configure if the SVE code cannot be compiled. Under what circumstances can this test possibly fail, so SVE_CFLAGS would not be set? > >> The final thing we need to resolve properly is the SVE compiler test. >> >> @theRealAph says: >> >> > arm_sve.h is part of GCC. It was added to GCC in 2019. >> >> A more relevant question is what version of gcc it was added, and if that also implies that the compiler knows about `-march=armv8-a+sve`. If so, then this test could basically be framed as a gcc version check. >> >> I'm still leaning towards failing configure if the SVE code cannot be compiled. Under what circumstances can this test possibly fail, so SVE_CFLAGS would not be set? > > Yes, the SVE compiler test code could be treated as a gcc/clang version check. `arm_sve.h` which is included in `sleef.h` and then in `vect_math_sve.c` is the SVE ACLE (Arm C Language Extensions) header file. It was included in gcc start from version 10 (may not be exact, but gcc 8/9 would fail when compile c code including this header). We have to make sure the compiler supports the SVE ACLE before using it. Here are the different scenarios: > > 1. The SVE compiler test success, and `SVE_CFLAGS` is set to `-march=armv8-a+sve`. All symbols in `libvmath.so` are built successfully including NEON/SVE. Hence, the vector math operations with all kinds of vector size on both NEON/SVE machines will be improved as expected. > 2. The SVE compiler test fail, and `SVE_CFLAGS` is null. SVE symbols in `libvmath.so` cannot be built out. Only NEON symbols exist in `libvmath.so`. Hence, the enhancement for vector math operations with > 128-bit vector size on SVE machines are missing. @XiaohongGong If we are sure that the SVE test will always succeed when running on gcc 10 or higher, then I guess I don't really need a way to enforce SVE support -- you'll just have to make sure you use a recent enough gcc. But, then the entire test becomes a bit unnecessary. 
You can just replace it with a version check on gcc, or perhaps a FLAGS_COMPILER_CHECK_ARGUMENTS on `-march=armv8-a+sve`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1838495553 From ogillespie at openjdk.org Mon Dec 4 12:29:08 2023 From: ogillespie at openjdk.org (Oli Gillespie) Date: Mon, 4 Dec 2023 12:29:08 GMT Subject: Integrated: 8315559: Delay TempSymbol cleanup to avoid symbol table churn In-Reply-To: References: Message-ID: <9ZpZcqWprAXGxiR52cFFxIl2thtucvC7y82VshASTNQ=.a5f66dff-2550-46b7-9f87-962bf1605f3b@github.com> On Fri, 27 Oct 2023 10:55:16 GMT, Oli Gillespie wrote: > Attempt to fix regressions in class-loading performance caused by fixing a symbol table leak in [JDK-8313678](https://bugs.openjdk.org/browse/JDK-8313678). > > See lengthy discussion in https://bugs.openjdk.org/browse/JDK-8315559 for more background. In short, the leak was providing an accidental cache for temporary symbols, allowing reuse. > > This change keeps new temporary symbols alive in a queue for a short time, allowing them to be re-used by subsequent operations. For example, when attempting to load a class we may call JVM_FindLoadedClass for multiple classloaders back-to-back, and all of them will create a TempNewSymbol for the same string. At present, each call will leave a dead symbol in the table and create a new one. Dead symbols add cleanup and lookup overhead, and insertion is also costly. With this change, the symbol from the first invocation will live for some time after it is used, and subsequent callers can find the symbol alive in the table - avoiding the extra work. > > The queue is bounded, and when full new entries displace the oldest entry. This means symbols are held for the time it takes for 100 new temp symbols to be created. 100 is chosen arbitrarily - the tradeoff is memory usage versus 'cache' hit rate. > > When concurrent symbol table cleanup runs, it also drains the queue. 
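The bounded delay queue described above for JDK-8315559 can be sketched roughly as follows. This is a simplified, single-threaded model under stated assumptions: symbols are represented by `std::shared_ptr` so that "kept alive" means "reference still held", and `TempSymbolDelayQueue`, `add` and `drain` are hypothetical names; the integrated HotSpot change may be structured differently. The description above uses a capacity of 100.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <utility>

// Stand-in for a ref-counted symbol: it stays alive while anyone holds a reference.
using SymbolRef = std::shared_ptr<const std::string>;

// Hypothetical model of the bounded queue that delays temp symbol cleanup.
class TempSymbolDelayQueue {
 public:
  explicit TempSymbolDelayQueue(std::size_t capacity) : capacity_(capacity) {
    assert(capacity > 0);
  }

  // Keep `sym` alive a little longer. When the queue is full, the oldest
  // entry is displaced and its reference is released.
  void add(SymbolRef sym) {
    if (queue_.size() == capacity_) {
      queue_.pop_front();
    }
    queue_.push_back(std::move(sym));
  }

  // Release every held reference, as done when symbol table cleanup runs.
  void drain() { queue_.clear(); }

  std::size_t size() const { return queue_.size(); }

 private:
  std::size_t capacity_;
  std::deque<SymbolRef> queue_;
};
```

The capacity is the knob for the tradeoff mentioned in the description: a larger queue keeps more symbols alive (more memory) but raises the chance that a back-to-back lookup finds a live symbol.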
> > In my testing, this brings Dacapo pmd performance back to where it was before the leak was fixed. > > Thanks @shipilev , @coleenp and @MDBijman for helping with this fix. This pull request has now been integrated. Changeset: d23f4f12 Author: Oli Gillespie Committer: Coleen Phillimore URL: https://git.openjdk.org/jdk/commit/d23f4f12adf1ea26b8c340efe2c3854e50b68301 Stats: 134 lines in 4 files changed: 125 ins; 1 del; 8 mod 8315559: Delay TempSymbol cleanup to avoid symbol table churn Reviewed-by: coleenp, kbarrett, shade ------------- PR: https://git.openjdk.org/jdk/pull/16398 From aboldtch at openjdk.org Mon Dec 4 12:30:49 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 4 Dec 2023 12:30:49 GMT Subject: RFR: 8319773: Avoid inflating monitors when installing hash codes for LM_LIGHTWEIGHT [v11] In-Reply-To: References: Message-ID: > LM_LIGHTWEIGHT only uses the lock bits for its locking. This leaves the hashCode bits free when a monitor is not inflated. So instead of inflating when installing the hashCode on a fast locked object it can simply use the hashCode bits in the markWord. > > The mark word transitions Unlocked (0b01) <=> Locked (0b00) are done by retrying the CAS if it fails due to non-lock bit changes. > The mark word transitions Monitor (0b10) <=> Locked/Unlocked (0b0X) are the same as before, inflation already handles hash codes. This change does not interact with the mark word if it is in a Monitor (0b10) state, so the strong CAS which is used for deflation are still valid, and will not fail to any other reason than the cooperative race to help transition the mark word during deflation. > > This is dependent on JDK-8319778 simply because JDK-8319797 is dependent on both this and JDK-8319778. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 17 additional commits since the last revision: - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319773 - Merge remote-tracking branch 'upstream_jdk/pr/16602' into JDK-8319773 - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319778 - Fix copy paste typo. - Update src/hotspot/share/opto/library_call.cpp Co-authored-by: Tobias Hartmann - Add retry CAS comment - Use is_neutral over is_unlocked - Merge remote-tracking branch 'upstream_jdk/pr/16602' into JDK-8319773 - Merge remote-tracking branch 'upstream_jdk/master' into JDK-8319778 - ... and 7 more: https://git.openjdk.org/jdk/compare/8ea9fab1...1b907f90 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16603/files - new: https://git.openjdk.org/jdk/pull/16603/files/4508ef5a..1b907f90 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16603&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16603&range=09-10 Stats: 43169 lines in 1145 files changed: 23046 ins; 15733 del; 4390 mod Patch: https://git.openjdk.org/jdk/pull/16603.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16603/head:pull/16603 PR: https://git.openjdk.org/jdk/pull/16603 From coleenp at openjdk.org Mon Dec 4 12:33:12 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 4 Dec 2023 12:33:12 GMT Subject: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11] In-Reply-To: References: Message-ID: <5gROVIKM5P_J15BGoYq0DFgDBuwvdEgWz6KNERecqz4=.32869544-e767-4b72-bad4-7b1ee4055444@github.com> On Wed, 29 Nov 2023 11:49:31 GMT, Jaroslav Bachorik wrote: >> Please, review this fix for a corner case handling of `jmethodID` values. >> >> The issue is related to the interplay between `jmethodID` values and method redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` instance. 
Once that method gets redefined, the `jmethodID` is updated to point to the last `Method` version. >> Unless the method is still on stack/running, in which case the original `jmethodID` will be redirected to the latest `Method` version and at the same time the 'previous' `Method` version will receive a new `jmethodID` pointing to that previous version. >> >> If we happen to capture a stack trace via `GetStackTrace` or `GetAllStackTraces` JVMTI calls while this previous `Method` version is still on stack we will have the corresponding frame identified by a `jmethodID` pointing to that version. >> However, sooner or later the 'previous' class version becomes eligible for cleanup, at which point all contained `Method` instances are deallocated. The cleanup process will not perform the `jmethodID` pointer maintenance and we will end up with pointers to deallocated memory. >> This is caused by the fact that the `jmethodID` lifecycle is bound to the `ClassLoaderData` instance and all relevant `jmethodID`s will get batch-updated when the class loader is being released and all its classes are getting unloaded. >> >> This means that we need to make sure that if a `Method` instance is being deallocated, the associated `jmethodID` (if any) must not point to the deallocated instance once we are finished. Unfortunately, we can not just update the `jmethodID` values in bulk when purging an old class version - the per `InstanceKlass` jmethodID cache is present only for the main class version and contains `jmethodID` values for both the old and current method versions.
>> >> ~Therefore we need to perform `jmethodID` lookup when we are about to deallocate a `Method` instance and clean up the pointer only if that `jmethodID` is pointing to the `Method` instance which is being deallocated.~ >> >> Therefore, we need to perform `jmethodID` lookup for each method in an old class version that is getting purged, and null out the pointer of that `jmethodID` to break the link from `jmethodID` to the method instance that is about to get deallocated. >> >> _(For anyone interested, a much lengthier writeup is available in [my blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ > > Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: > > Restrict cleanup to obsolete methods only We deallocate the old Method* if nothing is referring to them (ie they're not running or being referenced for some other reason). Look at MetadataOnStackMark. The jmethodIDs to an obsolete method were a dangling pointer and we want to just null them out. The old Method* are attached to the scratch_version of the InstanceKlass so we essentially remove that and walk down to the Methods when none of them are found. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1838538568 From jkern at openjdk.org Mon Dec 4 12:33:26 2023 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 4 Dec 2023 12:33:26 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v2] In-Reply-To: References: Message-ID: > On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. > > This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). 
However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). > > We propose a different, cleaner way of handling this: > > - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. > - Cache dl handles; repeated opening of a library should return the cached handle. > - Increase handle-local ref counter on open, decrease it on close > - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). > > This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: improve handling of nonexisting files ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16920/files - new: https://git.openjdk.org/jdk/pull/16920/files/e756f496..0f6716db Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=00-01 Stats: 16 lines in 1 file changed: 4 ins; 5 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/16920.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16920/head:pull/16920 PR: https://git.openjdk.org/jdk/pull/16920 From aboldtch at openjdk.org Mon Dec 4 12:35:56 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 4 Dec 2023 12:35:56 GMT Subject: RFR: 8319797: Recursive lightweight locking: Runtime implementation [v9] In-Reply-To: References: Message-ID: > Implements the runtime part of JDK-8319796.
> The different CPU implementations are/will be created as dependent pull requests. > > This enhancement proposes introducing the ability for LM_LIGHTWEIGHT to handle consecutive recursive monitor enter. Limiting the implementation to only consecutive monitor enters allows for more efficient emitted code which only needs to look at the two top most entires on the lock stack to determine what to do in a monitor exit. > > A high level overview: > * Locking is still performed on the mark word > * Unlocked (0b01) <=> Locked (0b00) > * Monitor enter on Obj with mark word Unlocked (0b01) is the same > * Transition Obj's mark word Unlocked (0b01) => Locked (0b00) > * Push Obj onto the lock stack > * Success > * Monitor enter on Obj with mark word Locked (0b00) will check the top entry on the lock stack > * If top entry is Obj > * Push Obj on the lock stack > * Success > * If top entry is not Obj > * Inflate and call ObjectMonitor::enter > * Monitor exit on Obj with mark word Locked (0b00) will check the two top entries on the lock stack > * If just the top entry is Obj > * Transition Obj's mark word Locked (0b00) => Unlocked (0b01) > * Pop the entry > * Success > * If both entries are Obj > * Pop the top entry > * Success > * Any other case only occurs for unstructured locking, then just inflate and call ObjectMonitor::exit > * If the monitor has been inflated for object Obj which is owned by the current thread > * All corresponding entries for Obj is removed from the lock stack > * The monitor recursions is set to the number of removed entries - 1 > * The owner is changed from anonymous to the thread > * The regular ObjectMonitor::action is called. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. 
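The high level overview of recursive lightweight locking above can be sketched as a small single-threaded model. Everything here is an assumption made for illustration: `Obj`, `Mark`, `LockStack` and `Result` are hypothetical names, the mark word is reduced to a three-state enum, there is no CAS and no lock-stack capacity limit, and the slow path (inflate, then call into `ObjectMonitor`) is represented only by a return value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified mark word states from the overview: 0b01, 0b00, 0b10.
enum class Mark { Unlocked, Locked, Monitor };

struct Obj {
  Mark mark = Mark::Unlocked;
};

enum class Result { FastPath, SlowPath };

// Hypothetical per-thread lock stack handling consecutive recursive enters.
class LockStack {
 public:
  Result enter(Obj* o) {
    assert(o != nullptr);
    if (o->mark == Mark::Unlocked) {
      o->mark = Mark::Locked;          // Unlocked => Locked
      stack_.push_back(o);
      return Result::FastPath;
    }
    if (o->mark == Mark::Locked && !stack_.empty() && stack_.back() == o) {
      stack_.push_back(o);             // consecutive recursive enter
      return Result::FastPath;
    }
    return Result::SlowPath;           // would inflate and use the monitor
  }

  Result exit(Obj* o) {
    assert(o != nullptr);
    if (o->mark != Mark::Locked || stack_.empty() || stack_.back() != o) {
      return Result::SlowPath;         // inflated or unstructured locking
    }
    const std::size_t n = stack_.size();
    const bool recursive = n >= 2 && stack_[n - 2] == o;
    stack_.pop_back();
    if (!recursive) {
      o->mark = Mark::Unlocked;        // Locked => Unlocked on the last exit
    }
    return Result::FastPath;
  }

  std::size_t depth() const { return stack_.size(); }

 private:
  std::vector<Obj*> stack_;
};
```

Because only the top two entries are inspected on exit, a non-consecutive re-enter (for example A, B, then A again) falls to the slow path in this model, matching the "If top entry is not Obj" case in the overview.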
The pull request now contains 12 commits: - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 - Avoid copy from and to the same location - Fix typo - Update unstructured unlock comment - Fix bad indent after merge - Remove whitespace - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 - Merge remote-tracking branch 'upstream_jdk/pr/16603' into JDK-8319797 - Fix nit - ... and 2 more: https://git.openjdk.org/jdk/compare/1b907f90...56b04f58 ------------- Changes: https://git.openjdk.org/jdk/pull/16606/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16606&range=08 Stats: 676 lines in 10 files changed: 634 ins; 7 del; 35 mod Patch: https://git.openjdk.org/jdk/pull/16606.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16606/head:pull/16606 PR: https://git.openjdk.org/jdk/pull/16606 From coleenp at openjdk.org Mon Dec 4 12:36:58 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 4 Dec 2023 12:36:58 GMT Subject: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11] In-Reply-To: <277y6BBHCLkqj7vleST0dY2hg_iZdREWXlQDWqo7dUQ=.700d1ad2-d1de-4f09-b732-32ef5436360a@github.com> References: <277y6BBHCLkqj7vleST0dY2hg_iZdREWXlQDWqo7dUQ=.700d1ad2-d1de-4f09-b732-32ef5436360a@github.com> Message-ID: On Mon, 4 Dec 2023 11:11:27 GMT, Thomas Stuefe wrote: > Okay so why does this happen and is it a reasonable thing to be happening? On the surface it sounds wrong to deallocate anything associated with a live classloader. If we didn't deallocate these old methods, there would be a memory leak when using class redefinition. It would be a lot simpler if we didn't have to do this though. 
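The dangling `jmethodID` problem discussed in this thread can be illustrated with a simplified model, assuming a `jmethodID` behaves like a pointer to a slot that holds a `Method*`. `MethodIdTable`, `MethodIdModel`, `make_id`, `clear_ids_for` and `resolve` are hypothetical names for this sketch and do not mirror the actual HotSpot data structures or the PR's code.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Stand-in for a HotSpot Method; only what the sketch needs.
struct Method {
  std::string name;
  bool obsolete;
};

// Model: an ID is a pointer to a slot that holds a Method*.
using MethodIdModel = Method**;

class MethodIdTable {
 public:
  // Allocate a slot for `m` and hand out its address as the ID.
  MethodIdModel make_id(Method* m) {
    assert(m != nullptr);
    slots_.push_back(m);   // deque::push_back keeps existing element addresses valid
    return &slots_.back();
  }

  // Before `m` is deallocated, null out every slot that still refers to it,
  // so stale IDs resolve to nullptr instead of freed memory.
  void clear_ids_for(const Method* m) {
    for (Method*& slot : slots_) {
      if (slot == m) {
        slot = nullptr;
      }
    }
  }

 private:
  std::deque<Method*> slots_;
};

// Resolve an ID; returns nullptr for a null ID or a cleared slot.
inline Method* resolve(MethodIdModel id) {
  return id == nullptr ? nullptr : *id;
}
```

In this model, a stack trace that captured the ID of an old method version remains safe to resolve after that version is purged: the lookup yields `nullptr` rather than a pointer into deallocated memory.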
------------- PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1838543829 From tschatzl at openjdk.org Mon Dec 4 12:39:59 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 4 Dec 2023 12:39:59 GMT Subject: RFR: 8317809: Insertion of free code blobs into code cache can be very slow during class unloading [v6] In-Reply-To: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> References: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> Message-ID: > Insert code blobs in a sorted fashion to exploit the finger-optimization when adding, making this procedure O(n) instead of O(n^2) > > Introduces a globally available ClassUnloadingContext that contains common methods pertaining to class and code unloading. GCs may use it to efficiently manage unlinked class loader datas and nmethods to allow use of common methods (unlink/merge). > > The steps typically are registering a new to be unlinked CLD/nmethod, and then purge its memory later. STW collectors perform this work in one big chunk taking the CodeCache_lock, for the entire duration, while concurrent collectors lock/unlock for every insertion to allow for concurrent users for the lock to progress. > > Some care has been taken to stay consistent with an "unloading = unlinking + purge" scheme; however particularly the existing CLD handling API (still) mixes unlinking and purging in its CLD::unload() call. To simplify this change that is mostly geared towards separating nmethod unlinking from purging, to make code blob freeing O(n) instead of O(n^2). > > Upcoming changes will > * separate nmethod unregistering from nmethod purging to allow doing that in bulk (for the STW collectors); that can significantly reduce code purging time for the STW collectors. > * better name the second stage of unlinking (called "cleaning" throughout, e.g. 
the work done in `G1CollectedHeap::complete_cleaning`) > * untangle CLD unlinking and what's called "cleaning" now to allow moving more stuff into the second unlinking stage for better parallelism > * G1: move some significant tasks from the remark pause to concurrent (unregistering nmethods, freeing code blobs and cld/metaspace purging) > * Maybe move Serial/Parallel GC metaspace purging closer to other unlinking/purging code to keep things local and allow easier logging. > > Please also first looking into the (small) PR this depends on. > > The crash on linux-x86 is fixed by PR#16766 which I split out for quicker reviews. > > Testing: tier1-7 > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: ayang review: move class unloading outside of weak_refs_work ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16759/files - new: https://git.openjdk.org/jdk/pull/16759/files/f8ddc131..4afd996a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16759&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16759&range=04-05 Stats: 13 lines in 1 file changed: 7 ins; 6 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16759.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16759/head:pull/16759 PR: https://git.openjdk.org/jdk/pull/16759 From aboldtch at openjdk.org Mon Dec 4 12:40:54 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Mon, 4 Dec 2023 12:40:54 GMT Subject: RFR: 8319799: Recursive lightweight locking: x86 implementation [v7] In-Reply-To: References: Message-ID: > Implements the x86 port of JDK-8319796. > > There are two major parts for the port implementation. The C2 part, and the part shared by the interpreter, C1 and the native call wrapper. > > The biggest change for both parts is that we check the lock stack first and if it is a recursive lightweight [un]lock and in that case simply pop/push and finish successfully. 
> > Only if the recursive lightweight [un]lock fails does it look at the mark word. > > For the shared part if it is an unstructured exit, the monitor is inflated or the mark word transition fails it calls into the runtime. > > The C2 operates under a few more assumptions, that the locking is structured and balanced. This means that some checks can be elided. > > First this means that in C2 unlock if the obj is not on the top of the lock stack, it must be inflated. And reversely if we reach the inflated C2 unlock the obj is not on the lock stack. This second property makes it possible to avoid reading the owner (and checking if it is anonymous). Instead it can either just do an un-contended unlock by writing null to the owner, or if contention happens, simply write the thread to the owner and jump to the runtime. > > The x86 C2 port also has some extra oddities. > > The mark word read is done early as it showed better scaling in hyper-threaded scenarios on certain intel hardware, and no noticeable downside on other tested x86 hardware. > > The fast path is written to avoid going through conditional branches. This in combination with keeping the ZF output correct, the code does some actions eagerly, decrementing the held monitor count, popping from the lock stack. And jumps to a code stub if a slow path is required which restores the thread local state to a correct state before jumping to the runtime. > > The contended unlock was also moved to the code stub. Axel Boldt-Christmas has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 10 additional commits since the last revision: - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - top load adjustments - Merge remote-tracking branch 'upstream_jdk/pr/16606' into JDK-8319799 - Fix type - Move inflated check in fast_locked - Move top load - 8319799: Recursive lightweight locking: x86 implementation - Cleanup: C2 fast_lock/fast_unlock x86 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16607/files - new: https://git.openjdk.org/jdk/pull/16607/files/40d30882..13f32a39 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16607&range=05-06 Stats: 63352 lines in 1954 files changed: 37111 ins; 19274 del; 6967 mod Patch: https://git.openjdk.org/jdk/pull/16607.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16607/head:pull/16607 PR: https://git.openjdk.org/jdk/pull/16607 From ayang at openjdk.org Mon Dec 4 13:00:46 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Mon, 4 Dec 2023 13:00:46 GMT Subject: RFR: 8317809: Insertion of free code blobs into code cache can be very slow during class unloading [v6] In-Reply-To: References: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> Message-ID: On Mon, 4 Dec 2023 12:39:59 GMT, Thomas Schatzl wrote: >> Insert code blobs in a sorted fashion to exploit the finger-optimization when adding, making this procedure O(n) instead of O(n^2) >> >> Introduces a globally available ClassUnloadingContext that contains common methods pertaining to class and code unloading. GCs may use it to efficiently manage unlinked class loader datas and nmethods to allow use of common methods (unlink/merge). 
>> >> The steps typically are registering a new to be unlinked CLD/nmethod, and then purge its memory later. STW collectors perform this work in one big chunk taking the CodeCache_lock, for the entire duration, while concurrent collectors lock/unlock for every insertion to allow for concurrent users for the lock to progress. >> >> Some care has been taken to stay consistent with an "unloading = unlinking + purge" scheme; however particularly the existing CLD handling API (still) mixes unlinking and purging in its CLD::unload() call. To simplify this change that is mostly geared towards separating nmethod unlinking from purging, to make code blob freeing O(n) instead of O(n^2). >> >> Upcoming changes will >> * separate nmethod unregistering from nmethod purging to allow doing that in bulk (for the STW collectors); that can significantly reduce code purging time for the STW collectors. >> * better name the second stage of unlinking (called "cleaning" throughout, e.g. the work done in `G1CollectedHeap::complete_cleaning`) >> * untangle CLD unlinking and what's called "cleaning" now to allow moving more stuff into the second unlinking stage for better parallelism >> * G1: move some significant tasks from the remark pause to concurrent (unregistering nmethods, freeing code blobs and cld/metaspace purging) >> * Maybe move Serial/Parallel GC metaspace purging closer to other unlinking/purging code to keep things local and allow easier logging. >> >> Please also first looking into the (small) PR this depends on. >> >> The crash on linux-x86 is fixed by PR#16766 which I split out for quicker reviews. >> >> Testing: tier1-7 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > ayang review: move class unloading outside of weak_refs_work Only one minor & subjective comment. 
src/hotspot/share/gc/shared/classUnloadingContext.hpp line 69: > 67: void register_unlinked_nmethod(nmethod* nm); > 68: void purge_nmethods(); > 69: void free_code_blobs(); I feel this is exposing too much detail, especially when the adjacent API just combines them. ------------- Marked as reviewed by ayang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16759#pullrequestreview-1762337991 PR Review Comment: https://git.openjdk.org/jdk/pull/16759#discussion_r1413833775 From sjohanss at openjdk.org Mon Dec 4 13:10:59 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Mon, 4 Dec 2023 13:10:59 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v55] In-Reply-To: References: Message-ID: <1Pwrkk0q5RFSLTrUSYKVIXC1n21Ugot9M5jslG5O_B4=.1a364569-c869-4f4e-a747-2756a4873aa3@github.com> On Sat, 2 Dec 2023 07:37:26 GMT, Jonathan Joo wrote: >> 8315149: Add hsperf counters for CPU time of internal GC threads > > Jonathan Joo has updated the pull request incrementally with two additional commits since the last revision: > > - Only create CPUTimeCounters if supported > - Ensure TTTC is destructed before publishing Marked as reviewed by sjohanss (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15082#pullrequestreview-1762364095 From sjohanss at openjdk.org Mon Dec 4 13:11:01 2023 From: sjohanss at openjdk.org (Stefan Johansson) Date: Mon, 4 Dec 2023 13:11:01 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v48] In-Reply-To: References: <_lEBVrWV8wrVbmhOiu3AAqPJo_xBs718ZtA9V-VSzGM=.253c0ec8-256e-4dee-b125-90be6338e4b8@github.com> Message-ID: On Fri, 1 Dec 2023 22:37:58 GMT, Jonathan Joo wrote: >> I think the ideal approach to simplify this is to support Atomic operation on a `PerfCounter`. >> We could either introduce a `PerfAtomicCounter`/`PerfAtomicLongCounter` class, or perform `Atomic::add()` on the `PerfData::_valuep` pointer. 
There's already `PerfData::get_address()`, so we might be able to do: >> >> >> Atomic::add((volatile jlong *)(instance->get_counter(CPUTimeGroups::CPUTimeType::gc_total)->get_address()), net_cpu_time); >> >> >> However, a new class `PerfAtomicCounter` is likely cleaner. E.g., we may also want to make `PerfAtomicCounter::sample()` use a CAS. It is probably better to introduce `PerfAtomicCounter` in a separate RFE later. >> >> Would the `Atomic::add()` with `PerfData::get_address()` approach be OK for now, or would we rather introduce a lock, or leave the `gc_total` mechanism as-is and address the out-of-sync-ness in a follow-up RFE? >> >> IMO the out-of-sync-ness problem is minor, because users are likely to either look at a single `gc_total` counter, or look at each individual GC CPU counter and disregard `gc_total`. > > In the interest of the RDP1 deadline, should we leave improving the sync issues with gc_total to a separate RFE? (Especially given that a "correct" design may take some time to come up with, and that gc_total being slightly out of sync is not a major issue.) Albert and I discussed this again and we are ok with handling the `gc_total` sync issue as a follow-up. Please create the RFE for that. If that would include needing a `PerfAtomicCounter`, that would be its own RFE as well. Personally, I think a lock would be a good enough solution. From our point of view, having the counters out of sync for a long period of time (think a long concurrent mark cycle without any young collections updating the total) is not good since it shows that the counters are not incremented in sync. It would also be nice to avoid the two-step updating of the total time, so please try to find time to work on this.
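The two options weighed above for keeping a shared total consistent (an atomic add on the counter's storage versus a lock) can be sketched in standalone C++. This is only a model under stated assumptions: `std::atomic` and `std::mutex` stand in for HotSpot's `Atomic::add()` and its own locking primitives, and `AtomicTotal`, `LockedTotal` and `run_updaters` are hypothetical names that do not exist in the JDK sources.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a total counter updated with an atomic add.
struct AtomicTotal {
  std::atomic<std::int64_t> total{0};
  void add(std::int64_t delta) {
    // One indivisible read-modify-write, so concurrent deltas are not lost.
    total.fetch_add(delta, std::memory_order_relaxed);
  }
  std::int64_t get() { return total.load(std::memory_order_relaxed); }
};

// Hypothetical stand-in for a total counter protected by a lock.
struct LockedTotal {
  std::mutex lock;
  std::int64_t total = 0;
  void add(std::int64_t delta) {
    std::lock_guard<std::mutex> guard(lock);
    total += delta;
  }
  std::int64_t get() {
    std::lock_guard<std::mutex> guard(lock);
    return total;
  }
};

// Starts `threads` updaters that each add `delta` to one shared counter
// `updates_per_thread` times, then returns the final total.
template <typename Counter>
std::int64_t run_updaters(int threads, int updates_per_thread,
                          std::int64_t delta) {
  Counter counter;
  std::vector<std::thread> workers;
  for (int t = 0; t < threads; ++t) {
    workers.emplace_back([&counter, updates_per_thread, delta] {
      for (int i = 0; i < updates_per_thread; ++i) {
        counter.add(delta);
      }
    });
  }
  for (auto& worker : workers) {
    worker.join();
  }
  return counter.get();
}
```

In this model both variants end with the exact sum of all deltas. The lock variant additionally allows several related counters to be updated under the same critical section, which is one way to avoid a separate second step for the total.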
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1413848440 From eastigeevich at openjdk.org Mon Dec 4 13:40:44 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 4 Dec 2023 13:40:44 GMT Subject: RFR: 8321105: Enable UseCryptoPmullForCRC32 for Neoverse V2 Message-ID: UseCryptoPmullForCRC32 enables to use crypto pmull instructions in CRC32 implementation. It is set to true for Neoverse V1. As the performance of the instructions is the same on Neoverse V2, UseCryptoPmullForCRC32 should be set to true for V2. ------------- Commit messages: - 8321105: Enable UseCryptoPmullForCRC32 for Neoverse V2 Changes: https://git.openjdk.org/jdk/pull/16949/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16949&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321105 Stats: 9 lines in 1 file changed: 6 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16949.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16949/head:pull/16949 PR: https://git.openjdk.org/jdk/pull/16949 From eastigeevich at openjdk.org Mon Dec 4 13:40:45 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Mon, 4 Dec 2023 13:40:45 GMT Subject: RFR: 8321105: Enable UseCryptoPmullForCRC32 for Neoverse V2 In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 13:36:17 GMT, Evgeny Astigeevich wrote: > UseCryptoPmullForCRC32 enables to use crypto pmull instructions in CRC32 implementation. It is set to true for Neoverse V1. As the performance of the instructions is the same on Neoverse V2, UseCryptoPmullForCRC32 should be set to true for V2. @nick-arm, @shipilev, could you please have a look? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16949#issuecomment-1838662336 From tschatzl at openjdk.org Mon Dec 4 13:55:41 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Mon, 4 Dec 2023 13:55:41 GMT Subject: RFR: 8317809: Insertion of free code blobs into code cache can be very slow during class unloading [v6] In-Reply-To: References: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> Message-ID: On Mon, 4 Dec 2023 12:39:59 GMT, Thomas Schatzl wrote: >> Insert code blobs in a sorted fashion to exploit the finger-optimization when adding, making this procedure O(n) instead of O(n^2) >> >> Introduces a globally available ClassUnloadingContext that contains common methods pertaining to class and code unloading. GCs may use it to efficiently manage unlinked class loader datas and nmethods to allow use of common methods (unlink/merge). >> >> The steps typically are registering a new to be unlinked CLD/nmethod, and then purge its memory later. STW collectors perform this work in one big chunk taking the CodeCache_lock, for the entire duration, while concurrent collectors lock/unlock for every insertion to allow for concurrent users for the lock to progress. >> >> Some care has been taken to stay consistent with an "unloading = unlinking + purge" scheme; however particularly the existing CLD handling API (still) mixes unlinking and purging in its CLD::unload() call. To simplify this change that is mostly geared towards separating nmethod unlinking from purging, to make code blob freeing O(n) instead of O(n^2). >> >> Upcoming changes will >> * separate nmethod unregistering from nmethod purging to allow doing that in bulk (for the STW collectors); that can significantly reduce code purging time for the STW collectors. >> * better name the second stage of unlinking (called "cleaning" throughout, e.g. 
the work done in `G1CollectedHeap::complete_cleaning`) >> * untangle CLD unlinking and what's called "cleaning" now to allow moving more stuff into the second unlinking stage for better parallelism >> * G1: move some significant tasks from the remark pause to concurrent (unregistering nmethods, freeing code blobs and cld/metaspace purging) >> * Maybe move Serial/Parallel GC metaspace purging closer to other unlinking/purging code to keep things local and allow easier logging. >> >> Please also first looking into the (small) PR this depends on. >> >> The crash on linux-x86 is fixed by PR#16766 which I split out for quicker reviews. >> >> Testing: tier1-7 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > ayang review: move class unloading outside of weak_refs_work Fwiw, to put this change in a bit more context: it is part of a series of changes to improve class unloading performance back to pre-jdk21 levels (and better). The basic plan: * this change, [JDK-8317809](https://bugs.openjdk.org/browse/JDK-8317809), that improves nmethod sorting/free list handling (and introduces the ClassUnloadingContext) * [JDK-8317007](https://bugs.openjdk.org/browse/JDK-8317007) that allows bulk unregistering of nmethods instead of (slow) per-nmethod unregistering (also out for review) With the above two changes, Remark pause time should be <= before removal of the code root sweeper (lots of changes went in already that improved time taken for various parts of the class/code unloading). 
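As an aside on the "finger" optimization that the PR description relies on, here is a minimal standalone sketch, not the HotSpot code. It assumes a singly linked free list kept sorted by address, where the finger is simply a remembered pointer to the most recently inserted node; the names `Node`, `FreeList` and `ascending_insert_steps` are hypothetical. When blocks arrive in ascending address order, each search can resume at the finger instead of walking from the head.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of an address-sorted free list; not HotSpot's CodeHeap.
struct Node {
  std::uintptr_t addr;
  Node* next;
};

class FreeList {
 public:
  explicit FreeList(bool use_finger) : use_finger_(use_finger) {}
  FreeList(const FreeList&) = delete;
  FreeList& operator=(const FreeList&) = delete;
  ~FreeList() {
    while (head_ != nullptr) {
      Node* next = head_->next;
      delete head_;
      head_ = next;
    }
  }

  // Inserts addr while keeping the list sorted in ascending order.
  void insert(std::uintptr_t addr) {
    Node* node = new Node{addr, nullptr};
    if (head_ == nullptr || addr < head_->addr) {
      node->next = head_;
      head_ = node;
      finger_ = node;
      return;
    }
    // Start at the finger when it is known to lie at or before the
    // insertion point; otherwise fall back to walking from the head.
    Node* prev = head_;
    if (use_finger_ && finger_ != nullptr && finger_->addr <= addr) {
      prev = finger_;
    }
    while (prev->next != nullptr && prev->next->addr < addr) {
      prev = prev->next;
      ++steps_;  // count how far the search had to walk
    }
    node->next = prev->next;
    prev->next = node;
    finger_ = node;
  }

  std::size_t steps() const { return steps_; }

  bool is_sorted() const {
    for (Node* n = head_; n != nullptr && n->next != nullptr; n = n->next) {
      if (n->addr > n->next->addr) return false;
    }
    return true;
  }

 private:
  Node* head_ = nullptr;
  Node* finger_ = nullptr;  // most recently inserted node
  bool use_finger_;
  std::size_t steps_ = 0;
};

// Inserts n blocks in ascending address order and returns the walk steps.
std::size_t ascending_insert_steps(bool use_finger, std::size_t n) {
  FreeList list(use_finger);
  for (std::size_t i = 0; i < n; ++i) {
    list.insert(0x1000 + 64 * i);
  }
  assert(list.is_sorted());
  return list.steps();
}
```

For 1000 blocks inserted in ascending order, this model walks 0 steps with the finger and 498501 steps without it, which is the O(n) versus O(n^2) difference in miniature.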
I am planning the following follow-ups in the next few months (after FC, time will be spent on bugfixing, and holidays are coming up): * (for G1) move out several parts of class unloading into the concurrent phase, at least this will include - bulk nmethod unregistering ([JDK-8317007](https://bugs.openjdk.org/browse/JDK-8317007)) - nmethod code blob freeing (this change) - metaspace unloading Not necessarily in a single change; this basically halves G1 remark pause times again in my testing. * split up and parallelize ClassLoaderData unloading; currently with this change, when registering CLDs, CLD->unload() is immediately called as before. However, this is wasteful as most of that method can either be "obviously" parallelized or made so that other tasks can run in parallel. So the plan is that class unloading (`SystemDictionary::do_unloading`) will be split into a part that iterates only over the CLD list to determine dead ones, and a parallel part. There are no CR/PRs out for these latter two items, but hopefully this will, short of making everything concurrent, keep class/code unloading times low enough for some time. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16759#issuecomment-1838687260 From ogillespie at openjdk.org Mon Dec 4 14:00:02 2023 From: ogillespie at openjdk.org (Oli Gillespie) Date: Mon, 4 Dec 2023 14:00:02 GMT Subject: RFR: 8315559: Delay TempSymbol cleanup to avoid symbol table churn [v14] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 09:55:47 GMT, Oli Gillespie wrote: >> Attempt to fix regressions in class-loading performance caused by fixing a symbol table leak in [JDK-8313678](https://bugs.openjdk.org/browse/JDK-8313678). >> >> See lengthy discussion in https://bugs.openjdk.org/browse/JDK-8315559 for more background. In short, the leak was providing an accidental cache for temporary symbols, allowing reuse.
>> >> This change keeps new temporary symbols alive in a queue for a short time, allowing them to be re-used by subsequent operations. For example, when attempting to load a class we may call JVM_FindLoadedClass for multiple classloaders back-to-back, and all of them will create a TempNewSymbol for the same string. At present, each call will leave a dead symbol in the table and create a new one. Dead symbols add cleanup and lookup overhead, and insertion is also costly. With this change, the symbol from the first invocation will live for some time after it is used, and subsequent callers can find the symbol alive in the table - avoiding the extra work. >> >> The queue is bounded, and when full new entries displace the oldest entry. This means symbols are held for the time it takes for 100 new temp symbols to be created. 100 is chosen arbitrarily - the tradeoff is memory usage versus 'cache' hit rate. >> >> When concurrent symbol table cleanup runs, it also drains the queue. >> >> In my testing, this brings Dacapo pmd performance back to where it was before the leak was fixed. >> >> Thanks @shipilev , @coleenp and @MDBijman for helping with this fix. > > Oli Gillespie has updated the pull request incrementally with one additional commit since the last revision: > > Add copyright header for new file Thanks all for the help and reviews! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16398#issuecomment-1838693214 From shade at openjdk.org Mon Dec 4 14:00:58 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 4 Dec 2023 14:00:58 GMT Subject: RFR: 8321269: Require platforms to define DEFAULT_CACHE_LINE_SIZE Message-ID: Found it while doing new code that wants to know the cache line size. Currently, there is a fallback in `globalDefinitions.hpp` that defaults `DEFAULT_CACHE_LINE_SIZE` to `64` if platform does not define it. Instead of relying on default, force platform definitions to tell what is the reasonable default for the platform. 
This would simplify porting to other architectures, with less surprises for them. The actual sizes do not change. If any existing platform needs adjustments, those should be handled as separate issues. ------------- Commit messages: - Fix Changes: https://git.openjdk.org/jdk/pull/16948/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16948&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321269 Stats: 7 lines in 4 files changed: 6 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16948.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16948/head:pull/16948 PR: https://git.openjdk.org/jdk/pull/16948 From shade at openjdk.org Mon Dec 4 14:03:38 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 4 Dec 2023 14:03:38 GMT Subject: RFR: 8321105: Enable UseCryptoPmullForCRC32 for Neoverse V2 In-Reply-To: References: Message-ID: <2WnBFRDAWMLKR0QfxZM2XnLO6UsqNyzH_QE5l7rre5c=.05a9b638-2aa2-4010-bd34-dc05b355545c@github.com> On Mon, 4 Dec 2023 13:36:17 GMT, Evgeny Astigeevich wrote: > UseCryptoPmullForCRC32 enables to use crypto pmull instructions in CRC32 implementation. It is set to true for Neoverse V1. As the performance of the instructions is the same on Neoverse V2, UseCryptoPmullForCRC32 should be set to true for V2. Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16949#pullrequestreview-1762470043 From stuefe at openjdk.org Mon Dec 4 14:10:01 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 4 Dec 2023 14:10:01 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold Message-ID: We have `MallocLimit`, a way to trigger errors when reaching a given malloc load threshold. This PR proposes a complementary switch, `RSSLimit`, that does the same based on the Resident Set Size of the process. --- Motivation: The main usage for this option is to analyze situations that would lead to an OOM kill of the process. 
OOM kills can happen at various layers: the process may be either killed by the kernel OOM killer, or the whole container may get scrapped if it uses too much memory. In either case, one has little or no information to go on; often, one does not even know it was the OOM killer, or if the JVM was really responsible. In these situations, getting a voluntary abort *before* the process is killed can give us valuable information we would not get otherwise. Another use of this feature can be testing: specifying an envelope of "reasonable" RSS for testing to check the expected footprint of the JVM. Also useful for a global test-wide setting, to catch obvious footprint degradations early. Letting the JVM handle this Limit has many advantages: - since the limit is artificial, error reporting is not affected. Other mechanisms (e.g. ulimit) are likely to prevent effective error reporting. I usually get torn hs-err files when a limit restriction hits since error reporting needs dynamic memory (regrettably) and space on the stack to do its work. - Re-using the normal error reporting mechanism is powerful since: - hs-err files contain lots of information already: machine memory status, NMT summary, heap information etc. - Using `OnError`, that mechanism is expandable: we can run many further diagnostics like Metaspace or Compiler memory reports, detailed NMT reports, System memory maps, and even heap dumps. - Using `ErrorLogToStd(out|err)` will redirect the hs-err file and let us see what's happening in cloud situations where file systems are often ephemeral. ---- Usage: Limit is given either as an absolute number or as a relative percentage of the total memory of the machine or the container, e.g. `-XX:RssLimit=2G` or `-XX:RssLimit=80%`. If given as percent, JVM will also react to container limit updates. Example: we run the JVM inside a container as the sole payload process. 
Limit its RSS to 90% of the container limit, and in case we run into the limit, fire a heap dump: `java -XX:+UnlockDiagnosticVMOptions -XX:RssLimit=80% '-XX:OnError=jcmd %p GC.heap_dump my-dump' -Xlog:os+rss ` ---- Patch: Implemented for Linux, MacOS and Windows. Left out AIX since there we have a long-standing problem that RSS is not easily obtained since the normal memory usage numbers don't include system V shared memory, which is the Lion's share of JVM memory we use. ------------- Commit messages: - wip - wip - wip - RssLimit Changes: https://git.openjdk.org/jdk/pull/16938/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16938&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321266 Stats: 395 lines in 13 files changed: 394 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16938.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16938/head:pull/16938 PR: https://git.openjdk.org/jdk/pull/16938 From stefank at openjdk.org Mon Dec 4 14:54:36 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 4 Dec 2023 14:54:36 GMT Subject: RFR: 8321269: Require platforms to define DEFAULT_CACHE_LINE_SIZE In-Reply-To: References: Message-ID: <7rNgllKXGh5IPG141MHqNIXZKHgYO1Np9VogWIVCrXE=.c9f595df-7fbb-4e55-b428-6d5f92fba857@github.com> On Mon, 4 Dec 2023 12:42:45 GMT, Aleksey Shipilev wrote: > Found it while doing new code that wants to know the cache line size. Currently, there is a fallback in `globalDefinitions.hpp` that defaults `DEFAULT_CACHE_LINE_SIZE` to `64` if platform does not define it. Instead of relying on default, force platform definitions to tell what is the reasonable default for the platform. This would simplify porting to other architectures, with less surprises for them. > > The actual sizes do not change. If any existing platform needs adjustments, those should be handled as separate issues. Seems reasonable to me. ------------- Marked as reviewed by stefank (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16948#pullrequestreview-1762631833 From stuefe at openjdk.org Mon Dec 4 15:01:47 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 4 Dec 2023 15:01:47 GMT Subject: RFR: 8321269: Require platforms to define DEFAULT_CACHE_LINE_SIZE In-Reply-To: References: Message-ID: <2SyLtpVHXafn3wAjYDeYJqxSOPkNwnMz1Lg2MeetI-w=.07ba4d2d-9aaa-45db-9847-705471a95c7c@github.com> On Mon, 4 Dec 2023 12:42:45 GMT, Aleksey Shipilev wrote: > Found it while doing new code that wants to know the cache line size. Currently, there is a fallback in `globalDefinitions.hpp` that defaults `DEFAULT_CACHE_LINE_SIZE` to `64` if platform does not define it. Instead of relying on default, force platform definitions to tell what is the reasonable default for the platform. This would simplify porting to other architectures, with less surprises for them. > > The actual sizes do not change. If any existing platform needs adjustments, those should be handled as separate issues. +1 ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16948#pullrequestreview-1762651405 From jbachorik at openjdk.org Mon Dec 4 15:11:38 2023 From: jbachorik at openjdk.org (Jaroslav Bachorik) Date: Mon, 4 Dec 2023 15:11:38 GMT Subject: RFR: 8211238: @Deprecated JFR event [v8] In-Reply-To: References: Message-ID: <2yQLLCsc2Ux9NqSE0NTz6yWyg6sxO3LWlkMK8h6jyrk=.c61993f8-cdd5-4758-ab6e-a10e6505dd37@github.com> On Mon, 4 Dec 2023 10:15:32 GMT, Jaroslav Bachorik wrote: >>> A question about "level". Is the intention that the value can be anything, e.g. some new event next month might use the values "1", "2, "3"? Just asking because ordinarily deprecated vs. terminally deprecated is very specific to the manner in which a program element is deprecated and I assume you don't want this event grabbing the general name for a very specific event setting. >> >> Yes, the design is generic. An event control/setting to be used also for other events. 
> > Hi @mgronlun - sorry for opening a design discussion in PR :( > > I wonder - will this report each single one invocation of a deprecated method conforming to the rules (JDK method called from non-JDK code)? Can this, potentially, flood the recording if the deprecated method gets called from a hot loop? > Hi @jbachorik, it will only report one event per unique call site, during link time. Its not a function of hotness, only unique edge discovery. Excellent! Thanks! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16931#issuecomment-1838845171 From aph at openjdk.org Mon Dec 4 15:45:44 2023 From: aph at openjdk.org (Andrew Haley) Date: Mon, 4 Dec 2023 15:45:44 GMT Subject: RFR: JDK-8320892: AArch64: Restore FPU control state after JNI [v3] In-Reply-To: References: <-Jv5Xvre3lonydwQ5uzYN3QB8V0VIuORIhM1RtIdW5g=.06167df9-7268-4945-8e18-a04d19ee97e1@github.com> Message-ID: On Thu, 30 Nov 2023 09:58:28 GMT, Andrew Haley wrote: >> I was also going to suggest adding a new flag and creating an alias. The new flag will need a CSR request of course. > >> I was also going to suggest adding a new flag and creating an alias. The new flag will need a CSR request of course. > > Given that it's new and it's diagnostic flag I'm a bit surprised at that. I was trying for a quick fix. > > Anyway, how do you create an alias? I can't see any examples, and I haven't found a way through the maze of twisty `#define` passages. > @theRealAph the `RestoreMXCSROnJNICalls` flag is a product flag not diagnostic. Ah, thanks, > Aliased flags are setup in arguments.cpp by editing this: OK. How about we split this into two, this first part without a CSR, and the second part, which creates the generic alias, with one? That way we can mitigate a live problem in this release. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16851#issuecomment-1838915879 From aph-open at littlepinkcloud.com Mon Dec 4 15:48:09 2023 From: aph-open at littlepinkcloud.com (Andrew Haley) Date: Mon, 4 Dec 2023 15:48:09 +0000 Subject: Use of C++ dynamic global object initialization with thread guards In-Reply-To: <87fs0izasf.fsf@oldenburg.str.redhat.com> References: <87fs0izasf.fsf@oldenburg.str.redhat.com> Message-ID: <0f2be98c-f07b-4877-b78b-b0c38badfecf@littlepinkcloud.com> On 12/4/23 07:28, Florian Weimer wrote: > Furthermore, most uses of C++ dynamic initialization involve a > computation that is idempotent and have unused bit patterns in the > initialized value. This means that a separate guard variable is not > needed, and a simple atomic store/atomic load could be used. I used it in HotSpot code to trigger one-time resolution of some JDK classes. These classes were in an incubator module, so I did not want them to be loaded by default. I guess we could replace the C++ mechanism by, one of our own, but that doesn't seem to me to be much of a maintenance win. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From rehn at openjdk.org Mon Dec 4 15:50:45 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 4 Dec 2023 15:50:45 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v2] In-Reply-To: References: Message-ID: <-1ZEb9zsjqsg6L2Rb_teeZePsRwKAxrMGBzjmCUERvk=.a3ea277a-dc10-4e6e-a3f4-4bfe66d0bbf3@github.com> On Fri, 1 Dec 2023 19:47:22 GMT, Hamlin Li wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains four additional commits since the last revision: >> >> - Flag fixes >> - Merge branch 'master' into sha256 >> - Share code >> - SHA-2 > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3895: > >> 3893: __ enter(); >> 3894: >> 3895: __ push_reg(saved_regs, sp); > > Not sure if we need to push and pop `saved_regs`, as t2 is the only register in it, or maybe I am missing something? t2 is used by C2 as a general register, see R7 in riscv.ad. As this may be inlined directly into the graph IR, i.e. no call to get here, t2 may be a live register. saved_regs only contains t2 so there is just one spill and one restore. No? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1414101381 From rehn at openjdk.org Mon Dec 4 15:58:45 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 4 Dec 2023 15:58:45 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v2] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 19:19:35 GMT, Hamlin Li wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: >> >> - Flag fixes >> - Merge branch 'master' into sha256 >> - Share code >> - SHA-2 > > src/hotspot/cpu/riscv/vm_version_riscv.cpp line 159: > >> 157: } >> 158: >> 159: if (UseZvkn) { > > Maybe it's safe to move the code behind `#endif // COMPILER2` at line 291, as it depends on UseRVV. There are some issues here predating my code: in `void generate_compiler_stubs()`, chacha and md5 are generated in the COMPILER2_OR_JVMCI block, not the COMPILER2 block. So I put sha2 here also, which means it should be defined even if COMPILER2 is not defined. I think all of these should be moved into the COMPILER2 block as JVMCI cannot use them.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1414112339 From coleenp at openjdk.org Mon Dec 4 16:06:04 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Mon, 4 Dec 2023 16:06:04 GMT Subject: RFR: 8315559: Delay TempSymbol cleanup to avoid symbol table churn [v14] In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 13:56:35 GMT, Oli Gillespie wrote: >> Oli Gillespie has updated the pull request incrementally with one additional commit since the last revision: >> >> Add copyright header for new file > > Thanks all for the help and reviews! Nice work @olivergillespie ------------- PR Comment: https://git.openjdk.org/jdk/pull/16398#issuecomment-1838955878 From ngasson at openjdk.org Mon Dec 4 16:39:36 2023 From: ngasson at openjdk.org (Nick Gasson) Date: Mon, 4 Dec 2023 16:39:36 GMT Subject: RFR: 8321105: Enable UseCryptoPmullForCRC32 for Neoverse V2 In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 13:36:17 GMT, Evgeny Astigeevich wrote: > UseCryptoPmullForCRC32 enables to use crypto pmull instructions in CRC32 implementation. It is set to true for Neoverse V1. As the performance of the instructions is the same on Neoverse V2, UseCryptoPmullForCRC32 should be set to true for V2. Marked as reviewed by ngasson (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16949#pullrequestreview-1762900299 From aph at openjdk.org Mon Dec 4 16:57:11 2023 From: aph at openjdk.org (Andrew Haley) Date: Mon, 4 Dec 2023 16:57:11 GMT Subject: RFR: 8320709: AArch64: Vectorized Poly1305 intrinsics [v3] In-Reply-To: References: Message-ID: > Vectorizing Poly1305 is quite tricky. We already have a highly- > efficient scalar Poly1305 implementation that runs on the core integer > unit, but it's highly serialized, so it does not make good use of > the parallelism available. > > The scalar implementation takes advantage of some particular features > of the Poly1305 keys.
In particular, certain bits of r, the secret > key, are required to be 0. These make it possible to use a full > 64-bit-wide multiply-accumulate operation without needing to process > carries between partial products. > > While this works well for a serial implementation, a parallel > implementation cannot do this because rather than multiplying by r, > each step multiplies by some integer power of r, modulo > 2^130-5. > > In order to avoid processing carries between partial products we use a > redundant representation, in which each 130-bit integer is encoded > either as a 5-digit integer in base 2^26 or as a 3-digit integer in > base 2^52, depending on whether we are using a 32- or 64-bit > multiply-accumulate. > > In AArch64 Advanced SIMD, there is no 64-bit multiply-accumulate > operation available to us, so we must use 32*32 -> 64-bit operations. > > In order to achieve maximum performance we'd like to get close to the > processor's decode bandwidth, so that every clock cycle does something > useful. In a typical high-end AArch64 implementation, the core integer > unit has a fast 64-bit multiplier pipeline and the ASIMD unit has a > fast(ish) two-way 32-bit multiplier, which may be slower than the > core integer unit's. It is not at all obvious whether it's best to use > ASIMD or core instructions. > > Fortunately, if we have a wide-bandwidth instruction decode, we can do > both at the same time, by feeding alternating instructions to the core > and the ASIMD units. This also allows us to make good use of all of > the available core and ASIMD registers, in parallel. > > To do this we use generators, which here are a kind of iterator that > emits a group of instructions each time it is called. In this case we > use 4 parallel generators, and by calling them alternately we interleave > the ASIMD and the core instructions.
We also take care to ensure that > each generator finishes at about the same time, to maximize the > distance between instructions which generate and consume data. > > The results are pretty good, ranging from 2x to 3x speedup. It is > possible that a pure in-order processor (Raspberry Pi?) might be at > some disadvantage because more work is being done even though it is > highly parallel, b... Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: Add comment, cleanup. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16812/files - new: https://git.openjdk.org/jdk/pull/16812/files/f2e9c8e5..5cdb3630 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16812&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16812&range=01-02 Stats: 111 lines in 1 file changed: 55 ins; 6 del; 50 mod Patch: https://git.openjdk.org/jdk/pull/16812.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16812/head:pull/16812 PR: https://git.openjdk.org/jdk/pull/16812 From aph at openjdk.org Mon Dec 4 17:22:58 2023 From: aph at openjdk.org (Andrew Haley) Date: Mon, 4 Dec 2023 17:22:58 GMT Subject: RFR: 8320709: AArch64: Vectorized Poly1305 intrinsics [v4] In-Reply-To: References: Message-ID: > Vectorizing Poly1305 is quite tricky. We already have a highly- > efficient scalar Poly1305 implementation that runs on the core integer > unit, but it's highly serialized, so it does not make good use of > the parallelism available. > > The scalar implementation takes advantage of some particular features > of the Poly1305 keys. In particular, certain bits of r, the secret > key, are required to be 0.
These make it possible to use a full > 64-bit-wide multiply-accumulate operation without needing to process > carries between partial products. > > While this works well for a serial implementation, a parallel > implementation cannot do this because rather than multiplying by r, > each step multiplies by some integer power of r, modulo > 2^130-5. > > In order to avoid processing carries between partial products we use a > redundant representation, in which each 130-bit integer is encoded > either as a 5-digit integer in base 2^26 or as a 3-digit integer in > base 2^52, depending on whether we are using a 32- or 64-bit > multiply-accumulate. > > In AArch64 Advanced SIMD, there is no 64-bit multiply-accumulate > operation available to us, so we must use 32*32 -> 64-bit operations. > > In order to achieve maximum performance we'd like to get close to the > processor's decode bandwidth, so that every clock cycle does something > useful. In a typical high-end AArch64 implementation, the core integer > unit has a fast 64-bit multiplier pipeline and the ASIMD unit has a > fast(ish) two-way 32-bit multiplier, which may be slower than the > core integer unit's. It is not at all obvious whether it's best to use > ASIMD or core instructions. > > Fortunately, if we have a wide-bandwidth instruction decode, we can do > both at the same time, by feeding alternating instructions to the core > and the ASIMD units. This also allows us to make good use of all of > the available core and ASIMD registers, in parallel. > > To do this we use generators, which here are a kind of iterator that > emits a group of instructions each time it is called. In this case we > use 4 parallel generators, and by calling them alternately we interleave > the ASIMD and the core instructions. We also take care to ensure that > each generator finishes at about the same time, to maximize the > distance between instructions which generate and consume data.
> > The results are pretty good, ranging from 2x to 3x speedup. It is > possible that a pure in-order processor (Raspberry Pi?) might be at > some disadvantage because more work is being done even though it is > highly parallel, b... Andrew Haley has updated the pull request incrementally with two additional commits since the last revision: - Dead code - Whitespace only ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16812/files - new: https://git.openjdk.org/jdk/pull/16812/files/5cdb3630..584b081e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16812&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16812&range=02-03 Stats: 234 lines in 1 file changed: 36 ins; 39 del; 159 mod Patch: https://git.openjdk.org/jdk/pull/16812.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16812/head:pull/16812 PR: https://git.openjdk.org/jdk/pull/16812 From aph at openjdk.org Mon Dec 4 17:33:00 2023 From: aph at openjdk.org (Andrew Haley) Date: Mon, 4 Dec 2023 17:33:00 GMT Subject: RFR: 8320709: AArch64: Vectorized Poly1305 intrinsics [v5] In-Reply-To: References: Message-ID: <4ro4TX0p_iovP5lOIRrutPRy7OjpBPqk1j7SgqP-WZg=.e23b4bd3-a033-45fb-bd5b-507f275255b7@github.com> > Vectorizing Poly1305 is quite tricky. We already have a highly- > efficient scalar Poly1305 implementation that runs on the core integer > unit, but it's highly serialized, so it does not make good use of > the parallelism available. > > The scalar implementation takes advantage of some particular features > of the Poly1305 keys. In particular, certain bits of r, the secret > key, are required to be 0. These make it possible to use a full > 64-bit-wide multiply-accumulate operation without needing to process > carries between partial products. > > While this works well for a serial implementation, a parallel > implementation cannot do this because rather than multiplying by r, > each step multiplies by some integer power of r, modulo > 2^130-5.
> > In order to avoid processing carries between partial products we use a > redundant representation, in which each 130-bit integer is encoded > either as a 5-digit integer in base 2^26 or as a 3-digit integer in > base 2^52, depending on whether we are using a 32- or 64-bit > multiply-accumulate. > > In AArch64 Advanced SIMD, there is no 64-bit multiply-accumulate > operation available to us, so we must use 32*32 -> 64-bit operations. > > In order to achieve maximum performance we'd like to get close to the > processor's decode bandwidth, so that every clock cycle does something > useful. In a typical high-end AArch64 implementation, the core integer > unit has a fast 64-bit multiplier pipeline and the ASIMD unit has a > fast(ish) two-way 32-bit multiplier, which may be slower than the > core integer unit's. It is not at all obvious whether it's best to use > ASIMD or core instructions. > > Fortunately, if we have a wide-bandwidth instruction decode, we can do > both at the same time, by feeding alternating instructions to the core > and the ASIMD units. This also allows us to make good use of all of > the available core and ASIMD registers, in parallel. > > To do this we use generators, which here are a kind of iterator that > emits a group of instructions each time it is called. In this case we > use 4 parallel generators, and by calling them alternately we interleave > the ASIMD and the core instructions. We also take care to ensure that > each generator finishes at about the same time, to maximize the > distance between instructions which generate and consume data. > > The results are pretty good, ranging from 2x to 3x speedup. It is > possible that a pure in-order processor (Raspberry Pi?) might be at > some disadvantage because more work is being done even though it is > highly parallel, b...
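For readers following the description above, here is a minimal sketch of the 5-digit, base 2^26 redundant representation. It is an illustration only and not the code from the pull request: the class `LimbSketch` and all method names are hypothetical, it works on plain Java `long` values rather than ASIMD registers, and it deliberately omits the reduction modulo 2^130-5. What it shows is that 26-bit digits leave enough headroom for the column sums of a 5x5 schoolbook multiply to be accumulated in 64 bits with no carry handling between partial products.

```java
import java.math.BigInteger;

public class LimbSketch {
    static final int LIMB_BITS = 26;
    static final long MASK = (1L << LIMB_BITS) - 1;

    // Split a non-negative value below 2^130 into five 26-bit digits,
    // least significant digit first.
    public static long[] toLimbs(BigInteger x) {
        long[] limbs = new long[5];
        for (int i = 0; i < 5; i++) {
            limbs[i] = x.shiftRight(i * LIMB_BITS).longValue() & MASK;
        }
        return limbs;
    }

    // Recombine digits (which may be wider than 26 bits) into one integer
    // using Horner's rule in base 2^26.
    public static BigInteger fromLimbs(long[] limbs) {
        BigInteger x = BigInteger.ZERO;
        for (int i = limbs.length - 1; i >= 0; i--) {
            x = x.shiftLeft(LIMB_BITS).add(BigInteger.valueOf(limbs[i]));
        }
        return x;
    }

    // Schoolbook multiply. Each product is below 2^52 and a column holds at
    // most five of them, so every column sum stays below 2^55 and fits in a
    // long without any carries being propagated between columns.
    public static long[] mulColumns(long[] a, long[] b) {
        long[] cols = new long[9];
        for (int i = 0; i < 5; i++) {
            for (int j = 0; j < 5; j++) {
                cols[i + j] += a[i] * b[j];
            }
        }
        return cols;
    }

    public static void main(String[] args) {
        BigInteger a = BigInteger.ONE.shiftLeft(129).add(BigInteger.valueOf(12345));
        BigInteger b = BigInteger.ONE.shiftLeft(100).add(BigInteger.valueOf(7));
        BigInteger viaLimbs = fromLimbs(mulColumns(toLimbs(a), toLimbs(b)));
        System.out.println(viaLimbs.equals(a.multiply(b)));
    }
}
```

Carries are resolved only once, when the column sums are folded back together, which is the property that makes the representation attractive for a parallel implementation.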
Andrew Haley has updated the pull request incrementally with two additional commits since the last revision: - Whitespace - Whitespace ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16812/files - new: https://git.openjdk.org/jdk/pull/16812/files/584b081e..47024e3b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16812&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16812&range=03-04 Stats: 48 lines in 1 file changed: 0 ins; 0 del; 48 mod Patch: https://git.openjdk.org/jdk/pull/16812.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16812/head:pull/16812 PR: https://git.openjdk.org/jdk/pull/16812 From rriggs at openjdk.org Mon Dec 4 18:31:59 2023 From: rriggs at openjdk.org (Roger Riggs) Date: Mon, 4 Dec 2023 18:31:59 GMT Subject: Integrated: 8311906: Improve robustness of String constructors with mutable array inputs In-Reply-To: <6SKlGLh5MmxoEx07wHCCUc8KWbbhcspLJmcc1uxQ_FI=.ca33bfb4-fa5c-45f0-b49f-ee6c5c6b68b4@github.com> References: <6SKlGLh5MmxoEx07wHCCUc8KWbbhcspLJmcc1uxQ_FI=.ca33bfb4-fa5c-45f0-b49f-ee6c5c6b68b4@github.com> Message-ID: On Mon, 30 Oct 2023 18:34:44 GMT, Roger Riggs wrote: > Strings, after construction, are immutable but may be constructed from mutable arrays of bytes, characters, or integers. > The string constructors should guard against the effects of mutating the arrays during construction that might invalidate internal invariants for the correct behavior of operations on the resulting strings. In particular, a number of operations have optimizations for operations on pairs of latin1 strings and pairs of non-latin1 strings, while operations between latin1 and non-latin1 strings use a more general implementation. > > The changes include: > > - Adding a warning to each constructor with an array as an argument to indicate that the results are indeterminate > if the input array is modified before the constructor returns. 
> The resulting string may contain any combination of characters sampled from the input array. > > - Ensure that strings that are represented as non-latin1 contain at least one non-latin1 character. > For latin1 inputs, whether the arrays contain ASCII, ISO-8859-1, UTF8, or another encoding decoded to latin1 the scanning and compression is unchanged. > If a non-latin1 character is found, the string is represented as non-latin1 with the added verification that a non-latin1 character is present at the same index. > If that character is found to be latin1, then the input array has been modified and the result of the scan may be incorrect. > Though a ConcurrentModificationException could be thrown, the risk to an existing application of an unexpected exception should be avoided. > Instead, the non-latin1 copy of the input is re-scanned and compressed; that scan determines whether the latin1 or the non-latin1 representation is returned. > > - The methods that scan for non-latin1 characters and their intrinsic implementations are updated to return the index of the non-latin1 character. > > - String construction from StringBuilder and CharSequence must also be guarded as their contents may be modified during construction. This pull request has now been integrated. 
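The verify-then-rescan idea described above can be sketched roughly as follows. This is an assumption-laden illustration, not the JDK implementation: `RescanSketch`, `compress`, `encode` and `finish` are hypothetical names, and a `byte[]` result stands in for the latin1 representation while a `char[]` result stands in for the non-latin1 one.

```java
import java.util.Arrays;

public class RescanSketch {
    // Copy chars into dst until a non-latin1 char is seen.
    // Returns the index of that char, or src.length if none was found.
    // Each element of src is read exactly once.
    public static int compress(char[] src, byte[] dst) {
        for (int i = 0; i < src.length; i++) {
            char c = src[i];
            if (c > 0xFF) {
                return i;
            }
            dst[i] = (byte) c;
        }
        return src.length;
    }

    // Returns byte[] for a latin1 result, char[] for a non-latin1 result.
    public static Object encode(char[] input) {
        byte[] latin1 = new byte[input.length];
        int idx = compress(input, latin1);
        if (idx == input.length) {
            return latin1;
        }
        // Take a private snapshot; the caller can no longer change it.
        char[] snapshot = Arrays.copyOf(input, input.length);
        return finish(snapshot, idx);
    }

    // Verify that the snapshot still has a non-latin1 char at idx.
    // If not, the input was modified during the scan, so the snapshot is
    // re-scanned and that scan decides the representation.
    public static Object finish(char[] snapshot, int idx) {
        if (snapshot[idx] > 0xFF) {
            return snapshot;
        }
        byte[] latin1 = new byte[snapshot.length];
        int n = compress(snapshot, latin1);
        if (n == snapshot.length) {
            return latin1;
        }
        return snapshot;
    }

    public static void main(String[] args) {
        System.out.println(encode("abc".toCharArray()) instanceof byte[]);
        System.out.println(encode("a\u0100".toCharArray()) instanceof char[]);
    }
}
```

In this sketch a `char[]` result always contains at least one char above 0xFF, because it is only returned after a check against the private snapshot, which mirrors the invariant the change aims for without throwing an exception on concurrent modification.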
Changeset: 155abc57 Author: Roger Riggs URL: https://git.openjdk.org/jdk/commit/155abc576a0212932825485380d4e2a9c7dd2fdc Stats: 1415 lines in 15 files changed: 1162 ins; 110 del; 143 mod 8311906: Improve robustness of String constructors with mutable array inputs Co-authored-by: Damon Fenacci Co-authored-by: Claes Redestad Co-authored-by: Amit Kumar Co-authored-by: Martin Doerr Reviewed-by: rgiulietti, thartmann, redestad, dfenacci ------------- PR: https://git.openjdk.org/jdk/pull/16425 From duke at openjdk.org Mon Dec 4 19:16:58 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Mon, 4 Dec 2023 19:16:58 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v2] In-Reply-To: References: <9i5yHmpRi3-XqL5lw0-0IexhCDr2FOi5nT4dgY7cWao=.ab8a1d6e-c9fc-4108-820b-374ce7815463@github.com> Message-ID: On Tue, 14 Nov 2023 16:19:12 GMT, Hamlin Li wrote: >> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: >> >> Minor cosmetic fixes. > >> The code with usage of RVV instruction could be added as follow-up of > the patch or independently. > > Hey @ygaevsky, I can work on this real vectorized intrinsic implementation, please let me know how you think about it. > If you already had a solution or started working on it, please ignore my message. > > Thanks. @Hamlin-Li, @RealFYang: please take a look at the latest updates when you have time, thanks. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1839303026 From stuefe at openjdk.org Mon Dec 4 19:17:48 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 4 Dec 2023 19:17:48 GMT Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC [v2] In-Reply-To: <3ZiujwFod2l9RQKLDZTHOLpCFCqWdBmlSEMANllEDkQ=.6750f422-cc50-4684-b7d0-cf211ef91dd8@github.com> References: <3ZiujwFod2l9RQKLDZTHOLpCFCqWdBmlSEMANllEDkQ=.6750f422-cc50-4684-b7d0-cf211ef91dd8@github.com> Message-ID: <0oa4yGhunC8gZlAY-QsowrfoYXY3-ltghj2CQDWgmU4=.7c071152-f45b-4b03-aa3b-9bb2146d0eda@github.com> On Mon, 4 Dec 2023 09:26:40 GMT, Stefan Karlsson wrote: >> src/hotspot/os/linux/os_linux.cpp line 3722: >> >>> 3720: } >>> 3721: >>> 3722: log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system."); >> >> Would it be not clearer to define when to warn, as we do in warn_no_large_pages? >> >> Related to that, should we not warn if ZGC and +shmemthp configured but -anonymous thp? I am not sure the heap is the only part of the JVM that uses THP, and other parts would still use anon THP, or? E.g. Code heap. >> >> Also, maybe a better message for the poor admin that tries to setup. E.g.: >> >> >> bool requires_shmem_thp = UseTHP + UseZGC >> bool requires_anon_thp = UseTHP >> bool off = false; >> >> if (requires_shmem && !shmem configured) >> (log_warning "Shmem thp are not supported. Set /sys/kernel/mm/transparent_hugepage/shmem_enabled to advise to support shmem thp") >> off = true; >> >> if (requires_anonthp && !anon_thp configured) >> (log_warning "anonymous Thp are not supported. Set /sys/kernel/mm/transparent_hugepage/enabled to madvise") >> off = true; >> >> if (off) >> UseTHP = 0 >> log_warning(UseTHP disabled (see previous messages) >> >> >> if ZGC and !supports shmemthp or > >> Would it be not clearer to define when to warn, as we do in warn_no_large_pages? 
> > I don't understand what you are suggesting with this question / request, so I'm not sure exactly what you are looking for. Instead, I made my own version of the pseudo code you posted. > > This is the warnings I get with that change: > > Without ZGC: > > $ thp never never > always madvise [never] > always within_size advise [never] deny force > > $ java -XX:+UseTransparentHugePages -version > [0.002s][warning][pagesize] Anonymous transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' to enable them. > [0.002s][warning][pagesize] UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system. > ... > > $ thp never advise > always madvise [never] > always within_size [advise] never deny force > > java -XX:+UseTransparentHugePages -version > [0.002s][warning][pagesize] Anonymous transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' to enable them. > [0.002s][warning][pagesize] UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system. > ... > > $ thp madvise never > always [madvise] never > always within_size advise [never] deny force > > $ java -XX:+UseTransparentHugePages -version > ... > > $ thp madvise advise > always [madvise] never > always within_size [advise] never deny force > $ java -XX:+UseTransparentHugePages -version > ... > > With ZGC: > > $ thp never never > always madvise [never] > always within_size advise [never] deny force > > $ java -XX:+UseTransparentHugePages -XX:+UseZGC -version > [0.002s][warning][pagesize] Shared memory transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/shmem_enabled to 'advise' to enable them. > [0.002s][warning][pagesize] Anonymous transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' to enable them. 
> [0.002s][warning][pagesize] UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system. > ... > > $ thp never advise > always madvise [never] > always within_size [advise] never deny force > > $ java -XX:+UseTransparentHugePages -XX:+UseZGC -version > [0.001s][warning][pagesize] Anonymous transparent huge pages are not enabled in the OS. Set /sys/kernel/mm/transparent_hugepage/enabled to 'madvise' to enable them. > [0.001s][warning][pagesize] UseTransparentHugePages disabled, transparent huge pages are not sup... I think this is okay, what do you think? Too many messages? >> src/hotspot/os/linux/os_linux.cpp line 3736: >> >>> 3734: ls.print_cr(". Default large page size: " EXACTFMT ".", EXACTFMTARGS(os::large_page_size())); >>> 3735: } else { >>> 3736: ls.print("Large page support %sdisabled.", uses_zgc_shmem_thp() ? "partially " : ""); >> >> I wonder whether we could make our life simpler by not supporting mixes: we could require that for ZGC, to use THP, both shmem and anon thps have to be active. Would that be acceptable or do you think there are too many misconfigured systems out there? > > I would prefer to not force users to set both. Fair enough. It is better to be able to run efficiently on as many configurations as possible. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16690#discussion_r1414366736 PR Review Comment: https://git.openjdk.org/jdk/pull/16690#discussion_r1414367277 From mgronlun at openjdk.org Mon Dec 4 19:21:00 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 4 Dec 2023 19:21:00 GMT Subject: RFR: 8211238: @Deprecated JFR event [v12] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK.
> > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: reviewer feedback and fixes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16931/files - new: https://git.openjdk.org/jdk/pull/16931/files/12d9bcac..f1c8cd18 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=10-11 Stats: 277 lines in 14 files changed: 100 ins; 101 del; 76 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From stuefe at openjdk.org Mon Dec 4 19:21:39 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 4 Dec 2023 19:21:39 GMT Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC [v4] In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 09:32:03 GMT, Stefan Karlsson wrote: >> There is code in `os::large_page_init()` that checks `/sys/kernel/mm/transparent_hugepage/enabled` and forcefully turns off `UseTransparentHugePages` if anonymous THPs are disabled in the OS: >> >> >> if (UseTransparentHugePages && !HugePages::supports_thp()) { >> if (!FLAG_IS_DEFAULT(UseTransparentHugePages)) { >> log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system."); >> } >> UseLargePages = UseTransparentHugePages = false; >> return; >> } >> >> >> This is problematic because ZGC doesn't use the `/sys/kernel/mm/transparent_hugepage/enabled` THPs, but instead the `/sys/kernel/mm/transparent_hugepage/shmem_enabled` THPs. So, with the following settings: >> >> /sys/kernel/mm/transparent_hugepage/enabled: never >> /sys/kernel/mm/transparent_hugepage/shmem_enabled: advise >> >> >> the above code will force ZGC to run without THPs.
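As a small aside, the two sysfs files mentioned above report their active mode as the bracketed token in a line such as `always madvise [never]`. The sketch below shows one way to read that token and to tell the anonymous and shared-memory settings apart. It is only an illustration under stated assumptions: `ThpSetting` and its methods are hypothetical names, the input is passed in as a string instead of being read from `/sys`, and HotSpot's own logic lives in C++ and is not reproduced here.

```java
public class ThpSetting {
    // Returns the token enclosed in [...] or null if there is none.
    public static String selected(String sysfsLine) {
        int open = sysfsLine.indexOf('[');
        if (open < 0) {
            return null;
        }
        int close = sysfsLine.indexOf(']', open + 1);
        if (close < 0) {
            return null;
        }
        return sysfsLine.substring(open + 1, close);
    }

    // Anonymous THPs can be used when 'enabled' is 'always' or 'madvise'.
    public static boolean anonThpAvailable(String enabledLine) {
        String mode = selected(enabledLine);
        return "always".equals(mode) || "madvise".equals(mode);
    }

    // Shared memory THPs can be used unless 'shmem_enabled' is 'never' or 'deny'.
    public static boolean shmemThpAvailable(String shmemEnabledLine) {
        String mode = selected(shmemEnabledLine);
        return mode != null && !mode.equals("never") && !mode.equals("deny");
    }

    public static void main(String[] args) {
        String enabled = "always madvise [never]";
        String shmemEnabled = "always within_size [advise] never deny force";
        System.out.println("anon THP available:  " + anonThpAvailable(enabled));
        System.out.println("shmem THP available: " + shmemThpAvailable(shmemEnabled));
    }
}
```

With the settings quoted above (`enabled: never`, `shmem_enabled: advise`) the two checks disagree, which is the situation in which a check based only on the anonymous setting would turn THPs off for a heap that is backed by shared memory.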
>> >> This PR is a proposal for how to work around this in the ZGC code without disturbing the rest of the JVM too much. The patch: >> >> 1) remembers the initial values for UseLargePages and UseTransparentHugePages and saves those so that ZGC can continue using THPs even though they have been disabled for the rest of the JVM. >> >> 2) adds better logic to figure out if ZGC is actually going to get THPs for the heap or not. This is then used to more accurately log the current situation and allows for a precise usage of `madvise + MADV_HUGEPAGE`. >> >> 3) tweaks the generic pagesize logging to better reflect the situation when anonymous THPs are disabled but shared memory THPs are enabled and ZGC is used. >> >> The result of this change can be seen in these tables: >> >> ZGC large pages log output: >> >> E (T) = Enabled (Transparent) >> E (T, OS) = Enabled (Transparent, OS enforced) >> D = Disabled >> D (OS) = Disabled (OS enforced) >> >> -XX:+UseTransparentHugePages >> >> shmem \ anon | always | madvise | never >> ------------+--------+---------+------- >> always | E (T) | E (T) | E (T) >> within_size | E (T) | E (T) | E (T) >> advise | E (T) | E (T) | E (T) >> never | D (OS) | D (OS) | D (OS) >> deny | D (OS) | D (OS) | D (OS) >> force | E (T) | E (T) | E (T) >> >> -XX:-UseTransparentHugePages >> >> shmem \ anon | always | madvise | never >> ------------+-----------+-----------+------- >> always | E (T, OS) | E (T, OS) | E (T, OS) >> within_size | E (T, OS) | E (T, OS) | E (T, OS) >> advise | D ... > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > More precise THP warning messages I think this is good. Thank you. ------------- Marked as reviewed by stuefe (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16690#pullrequestreview-1763203936 From mgronlun at openjdk.org Mon Dec 4 22:25:23 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Mon, 4 Dec 2023 22:25:23 GMT Subject: RFR: 8211238: @Deprecated JFR event [v13] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. > > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: fix unloading ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16931/files - new: https://git.openjdk.org/jdk/pull/16931/files/f1c8cd18..91ec681a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=12 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=11-12 Stats: 32 lines in 6 files changed: 0 ins; 15 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From dholmes at openjdk.org Mon Dec 4 23:35:59 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 4 Dec 2023 23:35:59 GMT Subject: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11] In-Reply-To: References: Message-ID: On Wed, 29 Nov 2023 11:49:31 GMT, Jaroslav Bachorik wrote: >> Please, review this fix for a corner case handling of `jmethodID` values. >> >> The issue is related to the interplay between `jmethodID` values and method redefinitions. Each `jmethodID` value is effectively a pointer to a `Method` instance. Once that method gets redefined, the `jmethodID` is updated to point to the last `Method` version.
>> Unless the method is still on stack/running, in which case the original `jmethodID` will be redirected to the latest `Method` version and at the same time the 'previous' `Method` version will receive a new `jmethodID` pointing to that previous version. >> >> If we happen to capture a stack trace via `GetStackTrace` or `GetAllStackTraces` JVMTI calls while this previous `Method` version is still on stack we will have the corresponding frame identified by a `jmethodID` pointing to that version. >> However, sooner or later the 'previous' class version becomes eligible for cleanup, at which time all contained `Method` instances are deallocated. The cleanup process will not perform the `jmethodID` pointer maintenance and we will end up with pointers to deallocated memory. >> This is caused by the fact that the `jmethodID` lifecycle is bound to the `ClassLoaderData` instance and all relevant `jmethodID`s will get batch-updated when the class loader is being released and all its classes are getting unloaded. >> >> This means that we need to make sure that if a `Method` instance is being deallocated, the associated `jmethodID` (if any) must not point to the deallocated instance once we are finished. Unfortunately, we can not just update the `jmethodID` values in bulk when purging an old class version - the per `InstanceKlass` jmethodID cache is present only for the main class version and contains `jmethodID` values for both the old and current method versions. >> >> ~Therefore we need to perform `jmethodID` lookup when we are about to deallocate a `Method` instance and clean up the pointer only if that `jmethodID` is pointing to the `Method` instance which is being deallocated.~ >> >> Therefore, we need to perform `jmethodID` lookup for each method in an old class version that is getting purged, and null out the pointer of that `jmethodID` to break the link from `jmethodID` to the method instance that is about to get deallocated.
>> >> _(For anyone interested, a much lengthier writeup is available in [my blog](https://jbachorik.github.io/posts/mysterious-jmethodid))_ > > Jaroslav Bachorik has updated the pull request incrementally with one additional commit since the last revision: > > Restrict cleanup to obsolete methods only Just for the record Coleen's review is marked as "Review applies to [81e31dae](https://git.openjdk.org/jdk/pull/16662/files/81e31daeef4c68352368a90e3ab453ba9b33650c)" - which is not the final version. The skara tooling does not currently support our rules but it remains as always that non-trivial Hotspot changes require two reviewers. Thanks for the additional explanations Coleen. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1839721631 From manc at openjdk.org Mon Dec 4 23:37:50 2023 From: manc at openjdk.org (Man Cao) Date: Mon, 4 Dec 2023 23:37:50 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v48] In-Reply-To: References: <_lEBVrWV8wrVbmhOiu3AAqPJo_xBs718ZtA9V-VSzGM=.253c0ec8-256e-4dee-b125-90be6338e4b8@github.com> Message-ID: On Mon, 4 Dec 2023 13:07:15 GMT, Stefan Johansson wrote: >> In the interest of the RDP1 deadline, should we leave improving the sync issues with gc_total to a separate RFE? (Especially given that a "correct" design may take some time to come up with, and that gc_total being slightly out of sync is not a major issue.) > > Me and Albert discussed this again and we are ok with handling the `gc_total` sync issue as a follow up. Please create the RFE for that. If that would include needing a `PerfAtomicCounter`, that would a be its own RFE as well. For me I think a lock would be a good enough solution. > > From our point of view having the counters out of sync for a long period of time (think a long concurrent mark cycle without any young collections updating the total) is not good since it shows that the counters are not incremented in sync. 
It would also be nice to avoid the two-step updating of the total time, so please try to find time to work on this. Thanks, opened https://bugs.openjdk.org/browse/JDK-8321304. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1414634977 From manc at openjdk.org Tue Dec 5 00:22:50 2023 From: manc at openjdk.org (Man Cao) Date: Tue, 5 Dec 2023 00:22:50 GMT Subject: RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v55] In-Reply-To: References: Message-ID: On Sat, 2 Dec 2023 07:37:26 GMT, Jonathan Joo wrote: >> 8315149: Add hsperf counters for CPU time of internal GC threads > > Jonathan Joo has updated the pull request incrementally with two additional commits since the last revision: > > - Only create CPUTimeCounters if supported > - Ensure TTTC is destructed before publishing Marked as reviewed by manc (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/15082#pullrequestreview-1763690561 From sspitsyn at openjdk.org Tue Dec 5 00:29:55 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Dec 2023 00:29:55 GMT Subject: RFR: 8321219: runtime/jni/FastGetField: assert(is_interpreted_frame()) failed: interpreted frame expected Message-ID: The fix is for a regression caused by: [8308614](https://bugs.openjdk.org/browse/JDK-8308614) Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 The fix of 8308614 just triggered a known issue: [8316283](https://bugs.openjdk.org/browse/JDK-8316283) field watch events are not always posted with -Xcomp option The fix is just a work around with the extra checks of the `JvmtiExport::should_post_field_access()` and `JvmtiExport::should_post_field_modification()`. 
Testing: - In progress: Test with tiers 1-6 ------------- Commit messages: - 8321219: runtime/jni/FastGetField: assert(is_interpreted_frame()) failed: interpreted frame expected Changes: https://git.openjdk.org/jdk/pull/16961/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16961&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321219 Stats: 3 lines in 1 file changed: 2 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16961.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16961/head:pull/16961 PR: https://git.openjdk.org/jdk/pull/16961 From fyang at openjdk.org Tue Dec 5 00:57:40 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 5 Dec 2023 00:57:40 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v2] In-Reply-To: References: <9i5yHmpRi3-XqL5lw0-0IexhCDr2FOi5nT4dgY7cWao=.ab8a1d6e-c9fc-4108-820b-374ce7815463@github.com> Message-ID: On Tue, 14 Nov 2023 16:19:12 GMT, Hamlin Li wrote: >> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: >> >> Minor cosmetic fixes. > >> The code with usage of RVV instruction could be added as follow-up of > the patch or independently. > > Hey @ygaevsky, I can work on this real vectorized intrinsic implementation, please let me know how you think about it. > If you already had a solution or started working on it, please ignore my message. > > Thanks. > @Hamlin-Li, @RealFYang: please take a look at the latest updates when you have time, thanks. Having another look. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1839816165 From duke at openjdk.org Tue Dec 5 01:16:04 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Tue, 5 Dec 2023 01:16:04 GMT Subject: RFR: 8317831: compiler/codecache/CheckLargePages.java fails on OL 8.8 with unexpected memory string Message-ID: <59K9mUFzngrq77p0DVcyuoHADtmPN1SB2fZJToQnxJk=.394c23d2-f87b-4775-9416-3d384fc1817c@github.com> Test CheckLargePages was broken by the previous changes: [JDK-8261894](https://bugs.openjdk.org/browse/JDK-8261894) removes `UseHugeTLBFS`. It was also removed from `os::can_execute_large_page_memory`, and `CodeCache::page_size` cannot use huge pages anymore. [JDK-8310233](https://bugs.openjdk.org/browse/JDK-8310233) changes the pagesize logs from Usable page sizes: 4k, 1G to Large page support enabled. Usable page sizes: 4k, 1G. Default large page size: 1G. This change includes: - `os::can_execute_large_page_memory` uses `UseLargePages`, which could be implicitly enabled by `UseTransparentHugePages` as well. - The regular expression in CheckLargePages is updated to capture only the page sizes. - Test CheckLargePages is still kept in ProblemList.txt until [JDK-8319795](https://bugs.openjdk.org/browse/JDK-8319795)) is resolved. 
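As a rough illustration of "capture only the page sizes", here is a minimal C++ sketch. The actual test is a Java jtreg test, so this is not its code. The helper name `usable_page_sizes` and both regular expressions are assumptions made for the example. The two log formats are the ones quoted in the description above.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not part of the JDK test): pull the entries of the
// "Usable page sizes" list out of a pagesize log line. It accepts both the
// old line ("Usable page sizes: 4k, 1G") and the new, longer one.
static std::vector<std::string> usable_page_sizes(const std::string& line) {
  // Capture only the comma-separated list; a trailing ". Default large page
  // size: ..." does not match ", <size>" and is therefore left out.
  static const std::regex list_re(
      "Usable page sizes: ([0-9]+[kMG](?:, [0-9]+[kMG])*)");
  static const std::regex item_re("[0-9]+[kMG]");

  std::vector<std::string> sizes;
  std::smatch m;
  if (!std::regex_search(line, m, list_re)) {
    return sizes;  // no page size list on this line
  }
  const std::string list = m[1].str();
  for (std::sregex_iterator it(list.begin(), list.end(), item_re), end;
       it != end; ++it) {
    sizes.push_back(it->str());
  }
  return sizes;
}
```

Anchoring on the list itself rather than on the whole line is what keeps the match stable when text is added before or after it.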
------------- Commit messages: - Reserve ProblemList.txt - 8317831: compiler/codecache/CheckLargePages.java fails on OL 8.8 with unexpected memory string - 8317831: compiler/codecache/CheckLargePages.java fails on OL 8.8 with unexpected memory string Changes: https://git.openjdk.org/jdk/pull/16962/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16962&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8317831 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16962.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16962/head:pull/16962 PR: https://git.openjdk.org/jdk/pull/16962 From duke at openjdk.org Tue Dec 5 02:03:41 2023 From: duke at openjdk.org (Liming Liu) Date: Tue, 5 Dec 2023 02:03:41 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v12] In-Reply-To: References: Message-ID: On Mon, 30 Oct 2023 06:20:07 GMT, Liming Liu wrote: >> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). >> >> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: >> >> >> >> >> >> >> >> >> >> >> >>
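>> Kernel | -XX:-TransparentHugePages | -XX:+TransparentHugePages
>>        | Unpatched   | Patched     | Unpatched   | Patched
>> -------+-------------+-------------+-------------+------------
>> 4.18   | 11.30       | 11.30       | 0.25        | 0.25
>> 5.13   | 0.22        | 0.22        | 3.42        | 3.42
>> 6.1    | 0.27        | 0.33        | 3.54        | 0.33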
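The approach described above (ask the kernel to populate the range, and fall back when `MADV_POPULATE_WRITE` is unavailable) can be sketched roughly as below. This is not the patch's code. `pretouch_range` is a hypothetical name, and the fallback is a plain per-page write, not the atomic-add-0 that HotSpot uses, so it is only safe when nothing else writes the range concurrently.

```cpp
#include <sys/mman.h>
#include <cassert>
#include <cstddef>

// MADV_POPULATE_WRITE exists since Linux 5.14; older headers may not define it.
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

// Hypothetical helper: make sure every page in [start, start + len) is backed
// by memory. 'start' must be page aligned. Returns true if the kernel populated
// the range via madvise, false if the manual fallback was used.
static bool pretouch_range(char* start, size_t len, size_t page_size) {
  assert(page_size > 0);
  if (madvise(start, len, MADV_POPULATE_WRITE) == 0) {
    return true;  // populated in one call, contents unchanged
  }
  // Typically EINVAL on kernels that do not know the advice value.
  // Fallback: write one byte per page, preserving what is already there.
  for (size_t off = 0; off < len; off += page_size) {
    volatile char* p = start + off;
    *p = *p;
  }
  return false;
}
```

A caller would typically `mmap` an anonymous region, call `pretouch_range(base, len, page_size)` once, and ignore the return value unless it wants to log which path was taken.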
> > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Update the name of the method Ping! Hello! ------------- PR Comment: https://git.openjdk.org/jdk/pull/15781#issuecomment-1839871322 PR Comment: https://git.openjdk.org/jdk/pull/15781#issuecomment-1839872505 From xgong at openjdk.org Tue Dec 5 02:08:40 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Tue, 5 Dec 2023 02:08:40 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 08:31:17 GMT, Xiaohong Gong wrote: >> The final thing we need to resolve properly is the SVE compiler test. >> >> @theRealAph says: >>> arm_sve.h is part of GCC. It was added to GCC in 2019. >> >> A more relevant question is what version of gcc it was added, and if that also implies that the compiler knows about `-march=armv8-a+sve`. If so, then this test could basically be framed as a gcc version check. >> >> I'm still leaning towards failing configure if the SVE code cannot be compiled. Under what circumstances can this test possibly fail, so SVE_CFLAGS would not be set? > >> The final thing we need to resolve properly is the SVE compiler test. >> >> @theRealAph says: >> >> > arm_sve.h is part of GCC. It was added to GCC in 2019. >> >> A more relevant question is what version of gcc it was added, and if that also implies that the compiler knows about `-march=armv8-a+sve`. If so, then this test could basically be framed as a gcc version check. >> >> I'm still leaning towards failing configure if the SVE code cannot be compiled. Under what circumstances can this test possibly fail, so SVE_CFLAGS would not be set? > > Yes, the SVE compiler test code could be treated as a gcc/clang version check. `arm_sve.h` which is included in `sleef.h` and then in `vect_math_sve.c` is the SVE ACLE (Arm C Language Extensions) header file. 
It was included in gcc start from version 10 (may not be exact, but gcc 8/9 would fail when compile c code including this header). We have to make sure the compiler supports the SVE ACLE before using it. Here are the different scenarios: > > 1. The SVE compiler test success, and `SVE_CFLAGS` is set to `-march=armv8-a+sve`. All symbols in `libvmath.so` are built successfully including NEON/SVE. Hence, the vector math operations with all kinds of vector size on both NEON/SVE machines will be improved as expected. > 2. The SVE compiler test fail, and `SVE_CFLAGS` is null. SVE symbols in `libvmath.so` cannot be built out. Only NEON symbols exist in `libvmath.so`. Hence, the enhancement for vector math operations with > 128-bit vector size on SVE machines are missing. > @XiaohongGong If we are sure that the SVE test will always succeed when running on gcc 10 or higher, then I guess I don't really need a way to enforce SVE support -- you'll just have to make sure you use a recent enough gcc. > > But, then the entire test becomes a bit unnecessary. You can just replace it with a version check on gcc, or perhaps a FLAGS_COMPILER_CHECK_ARGUMENTS on `-march=armv8-a+sve`. Thanks for the suggestion @magicus ! Replacing with a version check for the c compiler seems fine. But I cannot see the advantange than current test. Here are the reasons: 1. `-march=armv8-a+sve` is the necessary cflag for the sve source, and only included start from some c compiler versions. The c compiler version check must happen before using it. So it should also happen in the make or configure stage? Hence, we still have to find a right place to check it (should be in `lib-sleef.m4` or otherwhere?). 2. We not only have to check the gcc version, but also have to check the clang version. Would this make the code more complex? Regarding to using `FLAGS_COMPILER_CHECK_ARGUMENTS on "-march=armv8-a+sve"`, it is not right as well. 
Because we have to make sure the c compiler supports SVE ACLE completely which contains the sve header `arm_sve.h`. The compiler that supports option `-march=armv8-a+sve` cannot make sure the SVE ACLE is supported as well.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1839875881

From fyang at openjdk.org Tue Dec 5 03:36:34 2023
From: fyang at openjdk.org (Fei Yang)
Date: Tue, 5 Dec 2023 03:36:34 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6]
In-Reply-To:
References:
Message-ID: <7GqjvcQqsvlQZw4-4iKvUElpeZz717Nf8uTd_YY_LBk=.1bfd1f5a-5670-4962-9620-9c93a192a033@github.com>

On Wed, 15 Nov 2023 15:44:47 GMT, Olga Mikhaltsova wrote:

>> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform.
>>
>> In the table below it is shown that NaN argument should be processed as a special case.
>>
>>                                          | RISC-V                  | Java
>>                                          | (FCVT.W.S) | (FCVT.L.D) | (long round(double a)) | (int round(float a))
>> Minimum valid input (after rounding)     | -2^31      | -2^63      | Long.MIN_VALUE         | Integer.MIN_VALUE
>> Maximum valid input (after rounding)     | 2^31 - 1   | 2^63 - 1   | Long.MAX_VALUE         | Integer.MAX_VALUE
>> Output for out-of-range negative input   | -2^31      | -2^63      | Long.MIN_VALUE         | Integer.MIN_VALUE
>> Output for -∞                            | -2^31      | -2^63      | Long.MIN_VALUE         | Integer.MIN_VALUE
>> Output for out-of-range positive input   | 2^31 - 1   | 2^63 - 1   | Long.MAX_VALUE         | Integer.MAX_VALUE
>> Output for +∞                            | 2^31 - 1   | 2^63 - 1   | Long.MAX_VALUE         | Integer.MAX_VALUE
>> Output for NaN                           | 2^31 - 1   | 2^63 - 1   | 0                      | 0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555 ± 0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760 ± 0.103  ops/ms
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956 ± 0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947 ± 0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
>   Replaced tmp with t0

Unfortunately, I witnessed performance regression on sifive unmatched board.

Before:

FpRoundingBenchmark.test_ceil                2048  thrpt   15   39.243 ± 0.506  ops/ms
FpRoundingBenchmark.test_floor               2048  thrpt   15   39.448 ± 0.076  ops/ms
FpRoundingBenchmark.test_rint                2048  thrpt   15   39.411 ± 0.134  ops/ms
FpRoundingBenchmark.test_round_double        2048  thrpt   15   31.329 ± 0.085  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15   31.328 ± 0.031  ops/ms

After:

FpRoundingBenchmark.test_ceil                2048  thrpt   15   39.375 ± 0.125  ops/ms
FpRoundingBenchmark.test_floor               2048  thrpt   15   39.407 ± 0.076  ops/ms
FpRoundingBenchmark.test_rint                2048  thrpt   15   39.387 ± 0.235  ops/ms
FpRoundingBenchmark.test_round_double        2048  thrpt   15   23.940 ± 0.025  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15   30.629 ± 0.021  ops/ms

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1839944647

From dholmes at openjdk.org Tue Dec 5 04:44:33 2023
From: dholmes at openjdk.org (David Holmes)
Date: Tue, 5 Dec 2023 04:44:33 GMT
Subject: RFR: 8321219: runtime/jni/FastGetField: assert(is_interpreted_frame()) failed: interpreted frame expected
In-Reply-To:
References:
Message-ID:

On Tue, 5 Dec 2023 00:23:45 GMT, Serguei Spitsyn wrote:

> The fix is for a regression caused by:
> [8308614](https://bugs.openjdk.org/browse/JDK-8308614) Enabling JVMTI ClassLoad event slows down vthread creation by factor 10
>
> The fix of 8308614 just triggered a known issue:
> [8316283](https://bugs.openjdk.org/browse/JDK-8316283) field watch events are not always posted with -Xcomp option
>
> The fix is just a work around with the extra checks with the `JvmtiExport::should_post_field_access()` and `JvmtiExport::should_post_field_modification()`.
>
> Testing:
> - The test `runtime/jni/FastGetField/FastGetField.java` does not fail anymore with this fix
> - In progress: Test with tiers 1-6

src/hotspot/share/prims/jvmtiThreadState.cpp line 562:

> 560: if (JvmtiThreadState::seen_interp_only_mode() ||
> 561: JvmtiExport::should_post_field_access() ||
> 562: JvmtiExport::should_post_field_modification()){

The comment needs updating to explain the extra checks.

Can't say I see the connection with [8316283](https://bugs.openjdk.org/browse/JDK-8316283) as no `-Xcomp` is involved in the current failures AFAICS.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16961#discussion_r1414867564

From dholmes at openjdk.org Tue Dec 5 05:52:43 2023
From: dholmes at openjdk.org (David Holmes)
Date: Tue, 5 Dec 2023 05:52:43 GMT
Subject: RFR: JDK-8321266: Add diagnostic RSS threshold
In-Reply-To:
References:
Message-ID:

On Sun, 3 Dec 2023 12:51:24 GMT, Thomas Stuefe wrote:

> We have `MallocLimit`, a way to trigger errors when reaching a given malloc load threshold. This PR proposes
This PR proposes > a complementary switch, `RSSLimit`, that does the same based on the Resident Set Size of the process. > > --- > > Motivation: > > The main usage for this option is to analyze OOM kills. OOM kills can happen at various layers: the process may be either killed by the kernel OOM killer, or the whole container may get scrapped if it uses too much memory. > > One rarely has any information on the nature of the OOM, or if there even was one, and if yes, if the JVM was the culprit or just an innocent bystander. In these situations, getting a voluntary abort *before* the process gets killed from outside can give us valuable information. > > Another use of this feature can be testing: specifying an envelope of "reasonable" RSS for testing to check the expected footprint of the JVM. Also useful for a global test-wide setting to catch obvious footprint degradations early. > > Letting the JVM handle this Limit has many advantages: > > - since the limit is artificial, error reporting is not affected. Other mechanisms (e.g. ulimit) are likely to prevent effective error reporting. I usually get torn hs-err files when a limit restriction hits since error reporting needs dynamic memory (regrettably) and space on the stack to do its work. > > - Re-using the normal error reporting mechanism is powerful since: > - hs-err files contain lots of information already: machine memory status, NMT summary, heap information etc. > - Using `OnError`, that mechanism is expandable: we can run many further diagnostics like Metaspace or Compiler memory reports, detailed NMT reports, System memory maps, and even heap dumps. > - Using `ErrorLogToStd(out|err)` will redirect the hs-err file and let us see what's happening in cloud situations where file systems are often ephemeral. > > ---- > > Usage: > > Limit is given either as an absolute number or as a relative percentage of the total memory of the machine or the container, e.g. > `-XX:RssLimit=2G` or `-XX:RssLimit=80%`. 
> > If given as percent, JVM will also react to container limit updates. > > Example: we run the JVM inside a container as the sole payload process. Limit its RSS to 90% of the container limit, and in case we run into the limit, fire a heap dump: > > `java -XX:+UnlockDiagnosticVMOptions -XX:RssLimit=80% '-XX:OnError=jcmd %p GC.heap_dump my-dump' -Xlog:os+rss ` > > ---- > > Patch: > > Implemented for Linux, MacOS and Windows. Left out AIX since there we have a long-... Hi Thomas, I've taken a first pass through this and it seems okay in principle. A number of initial comments/suggestions below. Thanks. src/hotspot/os/aix/os_aix.cpp line 1299: > 1297: } > 1298: > 1299: // Unimplemented Is this temporary or does AIX not support a way to get RSS? src/hotspot/os/bsd/os_bsd.cpp line 1473: > 1471: result = info.resident_size; > 1472: } > 1473: #endif // __APPLE__ Hmmm so no general BSD support either ... src/hotspot/share/runtime/globals.hpp line 1372: > 1370: "memory size (e.g. \"2G\") or as a percentage of " \ > 1371: "the total available memory on this machine or in this " \ > 1372: "container (e.g. \"-XX:RssLimit=80%%\"). A value of 0 (default) " \ It would be more usual to take this as a fraction of available memory e.g. 0.8. That simplifies the parsing and validation logic. src/hotspot/share/runtime/globals.hpp line 1378: > 1376: "If RssLimit is set, interval, in ms, at which the JVM will " \ > 1377: "check the process resident set size." \ > 1378: range(10, UINT_MAX)) \ Can we actually handle enrolling a periodic task with a UINT_MAX interval? src/hotspot/share/runtime/os.hpp line 774: > 772: > 773: // Returns the process working set size (rss); 0 if unsupported. > 774: static size_t get_rss(); Nit: as it is an acronym `get_RSS` would be better IMO - just for this accessor; no need to rename everything to RSS. 
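To make the size-or-percentage format under discussion concrete, here is a small C++ sketch of one way such a value could be parsed. It is an assumption-laden illustration, not the code in the PR: `RssLimitSpec`, `parse_rss_limit` and `effective_limit_bytes` are hypothetical names, only the K/M/G suffixes are handled, and a value of 0 is treated as "feature off", as the flag description states.

```cpp
#include <cassert>
#include <cctype>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <string>

// Hypothetical result type for the sketch.
struct RssLimitSpec {
  bool enabled;     // false when the value is 0, i.e. the limit is off
  bool is_percent;  // true for "80%", false for "2G" or a plain byte count
  uint64_t value;   // percent (1..100) or bytes
};

// Hypothetical parser: accepts "<n>", "<n>K", "<n>M", "<n>G" or "<n>%".
// Returns false on malformed input, on overflow, or on a percentage above 100.
static bool parse_rss_limit(const std::string& s, RssLimitSpec* out) {
  assert(out != nullptr);
  if (s.empty() || !std::isdigit(static_cast<unsigned char>(s[0]))) {
    return false;
  }
  errno = 0;
  char* end = nullptr;
  const unsigned long long n = std::strtoull(s.c_str(), &end, 10);
  if (errno != 0) {
    return false;
  }
  const std::string suffix(end);
  if (suffix == "%") {
    if (n > 100) return false;
    out->enabled = (n != 0);
    out->is_percent = true;
    out->value = static_cast<uint64_t>(n);
    return true;
  }
  unsigned shift = 0;
  if (suffix.empty())                       shift = 0;
  else if (suffix == "K" || suffix == "k")  shift = 10;
  else if (suffix == "M" || suffix == "m")  shift = 20;
  else if (suffix == "G" || suffix == "g")  shift = 30;
  else return false;
  if (n > (UINT64_MAX >> shift)) return false;  // would overflow 64 bits
  out->enabled = (n != 0);
  out->is_percent = false;
  out->value = static_cast<uint64_t>(n) << shift;
  return true;
}

// Resolve the spec against the current total memory (machine or container).
// The percentage is applied as total / 100 * percent, which rounds down.
static uint64_t effective_limit_bytes(const RssLimitSpec& spec, uint64_t total) {
  if (!spec.enabled) return 0;
  return spec.is_percent ? (total / 100) * spec.value : spec.value;
}
```

Keeping the percentage unresolved in the parsed spec, and resolving it on each check, is what would let a limit follow container limit updates as the description says it should.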
src/hotspot/share/runtime/threads.cpp line 775: > 773: if (RssLimit != nullptr) { > 774: RssWatcher::initialize(RssLimit); > 775: } So I think if we are on AIX or regular BSD then we should at least give a warning that the flag will be ignored, and actually ignore it. src/hotspot/share/services/rsswatch.cpp line 63: > 61: > 62: void update_limit() { > 63: const size_t limit_100 = os::physical_memory(); Can this change dynamically? src/hotspot/share/services/rsswatch.cpp line 113: > 111: } else { > 112: if (!parse_integer(s, (char**)&s, &limit) || limit == 0) { > 113: vm_exit_during_initialization("Failed to parse RssLimit", "Not a valid limit size"); You specified that zero turned the feature off src/hotspot/share/services/rsswatch.hpp line 2: > 1: /* > 2: * Copyright (c) 1999, 2023, Oracle and/or its affiliates. All rights reserved. Copyright should not include 1999. src/hotspot/share/services/rsswatch.hpp line 41: > 39: }; > 40: > 41: #endif // OS_LINUX_RSSWATCH_HPP Comment is wrong ------------- Changes requested by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16938#pullrequestreview-1763989601 PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1414907775 PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1414908253 PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1414904350 PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1414909274 PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1414896587 PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1414910546 PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1414901304 PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1414905726 PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1414906442 PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1414906583 From dholmes at openjdk.org Tue Dec 5 06:25:33 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 5 Dec 2023 06:25:33 GMT Subject: RFR: 8321269: Require platforms to define DEFAULT_CACHE_LINE_SIZE In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 12:42:45 GMT, Aleksey Shipilev wrote: > Found it while doing new code that wants to know the cache line size. Currently, there is a fallback in `globalDefinitions.hpp` that defaults `DEFAULT_CACHE_LINE_SIZE` to `64` if platform does not define it. Instead of relying on default, force platform definitions to tell what is the reasonable default for the platform. This would simplify porting to other architectures, with less surprises for them. > > The actual sizes do not change. If any existing platform needs adjustments, those should be handled as separate issues. Seems fine to me too. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16948#pullrequestreview-1764043641 From xgong at openjdk.org Tue Dec 5 07:24:42 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Tue, 5 Dec 2023 07:24:42 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5] In-Reply-To: References: <9VeMdTAJPaPZDg9ZW7FVJOf9XGl4gGqAS-2g8SFc9c0=.36792cd5-66d9-4abc-ba0c-aee3478627f4@github.com> Message-ID: On Fri, 1 Dec 2023 16:36:18 GMT, Magnus Ihse Bursie wrote: >> You need to expand this logic to cover more instances. See e.g. lib-ffi.m4 for inspiration. >> >> Basic flow: >> * if user has specified libsleef root with argument, check both lib/ and lib64/ under that root. >> * if user has not specified libsleef root, and we have no SYSROOT, try PKG_CHECK >> * Otherwise, look in well-known directories which is $SYSROOT/usr/[local/]lib[64]. > > also, ideally, you will add the corresponding specific overrides like in ffi: > > AC_ARG_WITH(libffi-include, [AS_HELP_STRING([--with-libffi-include], > [specify directory for the libffi include files])]) > AC_ARG_WITH(libffi-lib, [AS_HELP_STRING([--with-libffi-lib], > [specify directory for the libffi library])]) Thanks for the suggestion @magicus ! The check in current `lib-sleef.m4` is very common: - If user has specified libsleef root by '--with-libsleef', we assume it is the manually built sleef lib. So only `lib/` and `include/` is checked. And the flags are set based on that path. - If user has not specified the libsleef root, and no `SYSROOT` is set, we try `PKG_CHECK` (like what you suggested) - Otherwise, check `sleef.h` - We assume the sleef module is installed under one of the valid system paths if the header can be found. So just linking with `-lsleef` will success. It's an issue in current flow like what @theRealAph met. I will add the options like `--with-libsleef-lib` and `--with-libsleef-include` like ffi. 
Regarding to extending the check for`--with-libsleef`, I think we can just make it simple like what it is now. Or, we have to check all the potential valid lib paths like `lib/`, `lib64/`, or maybe `lib/aarch64-linux-gnu`. The same to the `include` part. @theRealAph @magicus , WDYT? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1414980861 From aboldtch at openjdk.org Tue Dec 5 08:07:39 2023 From: aboldtch at openjdk.org (Axel Boldt-Christmas) Date: Tue, 5 Dec 2023 08:07:39 GMT Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC [v4] In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 09:32:03 GMT, Stefan Karlsson wrote: >> There is code in `os::large_page_init()` that checks `/sys/kernel/mm/transparent_hugepage/enabled` and forcefully turns off `UseTransparentHugePages` if anonymous THPs are disabled in the OS: >> >> >> if (UseTransparentHugePages && !HugePages::supports_thp()) { >> if (!FLAG_IS_DEFAULT(UseTransparentHugePages)) { >> log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system."); >> } >> UseLargePages = UseTransparentHugePages = false; >> return; >> } >> >> >> This is problematic because ZGC doesn't use the `/sys/kernel/mm/transparent_hugepage/enabled` THPs, but instead the `/sys/kernel/mm/transparent_hugepage/shmem_enabled` THPs. So, with the following settings: >> >> /sys/kernel/mm/transparent_hugepage/enabled: never >> /sys/kernel/mm/transparent_hugepage/shmem_enabled: advise >> >> >> the above code will force ZGC to run without THPs. >> >> This PR is a proposal for how to work around this in the ZGC code without disturbing the the rest of the JVM too much. The patch: >> >> 1) remembers the initial values for UseLargePages and UseTransparentHugePages and saves those so that ZGC can continue using THPs even though they have been disabled for the rest of the JVM. 
>>
>> 2) adds better logic to figure out if ZGC is actually going to get THPs for the heap or not. This is then used to more accurately log the current situation and allows for a precise usage of `madvise + MADV_HUGEPAGE`.
>>
>> 3) tweaks the generic pagesize logging to better reflect the situation when anonymous THPs are disabled but shared memory THPs are enabled and ZGC is used.
>>
>> The result of this change can be seen in these tables:
>>
>> ZGC large pages log output:
>>
>> E (T) = Enabled (Transparent)
>> E (T, OS) = Enabled (Transparent, OS enforced)
>> D = Disabled
>> D (OS) = Disabled (OS enforced)
>>
>> -XX:+UseTransparentHugePages
>>
>> shmem \ anon | always | madvise | never
>> -------------+--------+---------+-------
>> always       | E (T)  | E (T)   | E (T)
>> within_size  | E (T)  | E (T)   | E (T)
>> advise       | E (T)  | E (T)   | E (T)
>> never        | D (OS) | D (OS)  | D (OS)
>> deny         | D (OS) | D (OS)  | D (OS)
>> force        | E (T)  | E (T)   | E (T)
>>
>> -XX:-UseTransparentHugePages
>>
>> shmem \ anon | always    | madvise   | never
>> -------------+-----------+-----------+-----------
>> always       | E (T, OS) | E (T, OS) | E (T, OS)
>> within_size  | E (T, OS) | E (T, OS) | E (T, OS)
>> advise       | D ...
>
> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:
>
>   More precise THP warning messages

lgtm.

-------------

Marked as reviewed by aboldtch (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16690#pullrequestreview-1764262418

From rehn at openjdk.org Tue Dec 5 08:29:35 2023
From: rehn at openjdk.org (Robbin Ehn)
Date: Tue, 5 Dec 2023 08:29:35 GMT
Subject: RFR: 8319716: RISC-V: Add SHA-2 [v3]
In-Reply-To:
References:
Message-ID: <4V_VYkuPcPj-XGIIH55usNwO9CT3zE6mNTIC1ZmWzWE=.a7d0692f-9abd-44ae-99f2-fce64916acf0@github.com>

> Hi, please consider.
>
> Main author is @luhenry, I only fixed some minor things and tested it.
> > Such as: > test/hotspot/jtreg/compiler/intrinsics/sha/ > test/jdk/java/security/MessageDigest/ > test/jdk/jdk/security/ > tier1 > > And still running some test. Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Removed template ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16562/files - new: https://git.openjdk.org/jdk/pull/16562/files/3b2aeec8..d5048756 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=01-02 Stats: 37 lines in 1 file changed: 0 ins; 6 del; 31 mod Patch: https://git.openjdk.org/jdk/pull/16562.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16562/head:pull/16562 PR: https://git.openjdk.org/jdk/pull/16562 From sspitsyn at openjdk.org Tue Dec 5 08:32:33 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Dec 2023 08:32:33 GMT Subject: RFR: 8321219: runtime/jni/FastGetField: assert(is_interpreted_frame()) failed: interpreted frame expected In-Reply-To: References: Message-ID: <3ON__gPCupcsrNpP0wxI9gi6ur3d0TpcvZ47cFjLM2k=.f9152d39-9a2f-4745-a869-4cd74973eb2f@github.com> On Tue, 5 Dec 2023 04:41:29 GMT, David Holmes wrote: >> The fix is for a regression caused by: >> [8308614](https://bugs.openjdk.org/browse/JDK-8308614) Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 >> >> The fix of 8308614 just triggered a known issue: >> [8316283](https://bugs.openjdk.org/browse/JDK-8316283) field watch events are not always posted with -Xcomp option >> >> The fix is just a work around with the extra checks with the `JvmtiExport::should_post_field_access()` and `JvmtiExport::should_post_field_modification()`. 
>> >> Testing: >> - The test `runtime/jni/FastGetField/FastGetField.java` does not fail anymore with this fix >> - In progress: Test with tiers 1-6 > > src/hotspot/share/prims/jvmtiThreadState.cpp line 562: > >> 560: if (JvmtiThreadState::seen_interp_only_mode() || >> 561: JvmtiExport::should_post_field_access() || >> 562: JvmtiExport::should_post_field_modification()){ > > The comment needs updating to explain the extra checks. > > Can't say I see the connection with [8316283](https://bugs.openjdk.org/browse/JDK-8316283) as no `-Xcomp` is involved in the current failures AFAICS. Thank you for the suggestion. I'll add a comment. The `-Xcomp` flag is not a root cause but a trigger. There is a general issue that a frame can be not deoptimized. The `-Xcomp` is a stress that helps to reproduce the issue as the `-Xcomp` option and `interp_only_mode` works in opposite directions. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16961#discussion_r1415125933 From stuefe at openjdk.org Tue Dec 5 09:33:35 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 5 Dec 2023 09:33:35 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold In-Reply-To: References: Message-ID: <5TQwGxIBEccxjD7OvdYnyrTEGlD396zHY4BaYnWMbvU=.d6ccf718-b45a-4948-a3c5-756b161a9925@github.com> On Tue, 5 Dec 2023 05:31:48 GMT, David Holmes wrote: >> We have `MallocLimit`, a way to trigger errors when reaching a given malloc load threshold. This PR proposes >> a complementary switch, `RSSLimit`, that does the same based on the Resident Set Size of the process. >> >> --- >> >> Motivation: >> >> The main usage for this option is to analyze OOM kills. OOM kills can happen at various layers: the process may be either killed by the kernel OOM killer, or the whole container may get scrapped if it uses too much memory. >> >> One rarely has any information on the nature of the OOM, or if there even was one, and if yes, if the JVM was the culprit or just an innocent bystander. 
In these situations, getting a voluntary abort *before* the process gets killed from outside can give us valuable information. >> >> Another use of this feature can be testing: specifying an envelope of "reasonable" RSS for testing to check the expected footprint of the JVM. Also useful for a global test-wide setting to catch obvious footprint degradations early. >> >> Letting the JVM handle this Limit has many advantages: >> >> - since the limit is artificial, error reporting is not affected. Other mechanisms (e.g. ulimit) are likely to prevent effective error reporting. I usually get torn hs-err files when a limit restriction hits since error reporting needs dynamic memory (regrettably) and space on the stack to do its work. >> >> - Re-using the normal error reporting mechanism is powerful since: >> - hs-err files contain lots of information already: machine memory status, NMT summary, heap information etc. >> - Using `OnError`, that mechanism is expandable: we can run many further diagnostics like Metaspace or Compiler memory reports, detailed NMT reports, System memory maps, and even heap dumps. >> - Using `ErrorLogToStd(out|err)` will redirect the hs-err file and let us see what's happening in cloud situations where file systems are often ephemeral. >> >> ---- >> >> Usage: >> >> Limit is given either as an absolute number or as a relative percentage of the total memory of the machine or the container, e.g. >> `-XX:RssLimit=2G` or `-XX:RssLimit=80%`. >> >> If given as percent, JVM will also react to container limit updates. >> >> Example: we run the JVM inside a container as the sole payload process. Limit its RSS to 90% of the container limit, and in case we run into the limit, fire a heap dump: >> >> `java -XX:+UnlockDiagnosticVMOptions -XX:RssLimit=80% '-XX:OnError=jcmd %p GC.heap_dump my-dump' -Xlog:os+rss ` >> >> ---- >> >> Patch: >> >> Im... 
> > src/hotspot/share/services/rsswatch.cpp line 63: > >> 61: >> 62: void update_limit() { >> 63: const size_t limit_100 = os::physical_memory(); > > Can this change dynamically? Yes; os::physical_memory() is, when run in a container, fed from the dynamically updated container limit. There is an existing issue (I think at least it is one) that affect this PR but should be solved separately: reading limits is expensive since we read the proc file system. But container limits only change very rarely. We currently have a timeout for container data, so we don't read the limits every time we access `os::physical_memory(),` but the timeout is hard coded and IIRC very short. Some milliseconds? This timeout should be a lot larger and possibly be configurable. Have not opened a JBS issue yet since I wanted feedback first from @jerboaa . ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1415233206 From stuefe at openjdk.org Tue Dec 5 09:37:36 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 5 Dec 2023 09:37:36 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 05:37:12 GMT, David Holmes wrote: >> We have `MallocLimit`, a way to trigger errors when reaching a given malloc load threshold. This PR proposes >> a complementary switch, `RSSLimit`, that does the same based on the Resident Set Size of the process. >> >> --- >> >> Motivation: >> >> The main usage for this option is to analyze OOM kills. OOM kills can happen at various layers: the process may be either killed by the kernel OOM killer, or the whole container may get scrapped if it uses too much memory. >> >> One rarely has any information on the nature of the OOM, or if there even was one, and if yes, if the JVM was the culprit or just an innocent bystander. In these situations, getting a voluntary abort *before* the process gets killed from outside can give us valuable information. 
>> >> Another use of this feature can be testing: specifying an envelope of "reasonable" RSS for testing to check the expected footprint of the JVM. Also useful for a global test-wide setting to catch obvious footprint degradations early. >> >> Letting the JVM handle this Limit has many advantages: >> >> - since the limit is artificial, error reporting is not affected. Other mechanisms (e.g. ulimit) are likely to prevent effective error reporting. I usually get torn hs-err files when a limit restriction hits since error reporting needs dynamic memory (regrettably) and space on the stack to do its work. >> >> - Re-using the normal error reporting mechanism is powerful since: >> - hs-err files contain lots of information already: machine memory status, NMT summary, heap information etc. >> - Using `OnError`, that mechanism is expandable: we can run many further diagnostics like Metaspace or Compiler memory reports, detailed NMT reports, System memory maps, and even heap dumps. >> - Using `ErrorLogToStd(out|err)` will redirect the hs-err file and let us see what's happening in cloud situations where file systems are often ephemeral. >> >> ---- >> >> Usage: >> >> Limit is given either as an absolute number or as a relative percentage of the total memory of the machine or the container, e.g. >> `-XX:RssLimit=2G` or `-XX:RssLimit=80%`. >> >> If given as percent, JVM will also react to container limit updates. >> >> Example: we run the JVM inside a container as the sole payload process. Limit its RSS to 90% of the container limit, and in case we run into the limit, fire a heap dump: >> >> `java -XX:+UnlockDiagnosticVMOptions -XX:RssLimit=80% '-XX:OnError=jcmd %p GC.heap_dump my-dump' -Xlog:os+rss ` >> >> ---- >> >> Patch: >> >> Im... > > src/hotspot/share/runtime/globals.hpp line 1372: > >> 1370: "memory size (e.g. \"2G\") or as a percentage of " \ >> 1371: "the total available memory on this machine or in this " \ >> 1372: "container (e.g. \"-XX:RssLimit=80%%\"). 
A value of 0 (default) " \ > > It would be more usual to take this as a fraction of available memory e.g. 0.8. > > That simplifies the parsing and validation logic. I think percent is easier to understand: - RSSLimit=100 - percent, I guess? Because such a small number makes no sense as limit? - RSSLimit=90 - percent, I guess? - RSSLimit=90000000 - not percent, because it is larger than 100? - RSSLimit=100.0 - percent because of the decimal point? Alternative would be a complementary "RssLimitPercent" switch, but I want to keep the number of switches at a minimum. > src/hotspot/share/services/rsswatch.cpp line 113: > >> 111: } else { >> 112: if (!parse_integer(s, (char**)&s, &limit) || limit == 0) { >> 113: vm_exit_during_initialization("Failed to parse RssLimit", "Not a valid limit size"); > > You specified that zero turned the feature off Ah, good catch. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1415239254 PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1415239749 From stuefe at openjdk.org Tue Dec 5 09:42:34 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 5 Dec 2023 09:42:34 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 05:43:05 GMT, David Holmes wrote: >> We have `MallocLimit`, a way to trigger errors when reaching a given malloc load threshold. This PR proposes >> a complementary switch, `RSSLimit`, that does the same based on the Resident Set Size of the process. >> >> --- >> >> Motivation: >> >> The main usage for this option is to analyze OOM kills. OOM kills can happen at various layers: the process may be either killed by the kernel OOM killer, or the whole container may get scrapped if it uses too much memory. >> >> One rarely has any information on the nature of the OOM, or if there even was one, and if yes, if the JVM was the culprit or just an innocent bystander. 
In these situations, getting a voluntary abort *before* the process gets killed from outside can give us valuable information. >> >> Another use of this feature can be testing: specifying an envelope of "reasonable" RSS for testing to check the expected footprint of the JVM. Also useful for a global test-wide setting to catch obvious footprint degradations early. >> >> Letting the JVM handle this Limit has many advantages: >> >> - since the limit is artificial, error reporting is not affected. Other mechanisms (e.g. ulimit) are likely to prevent effective error reporting. I usually get torn hs-err files when a limit restriction hits since error reporting needs dynamic memory (regrettably) and space on the stack to do its work. >> >> - Re-using the normal error reporting mechanism is powerful since: >> - hs-err files contain lots of information already: machine memory status, NMT summary, heap information etc. >> - Using `OnError`, that mechanism is expandable: we can run many further diagnostics like Metaspace or Compiler memory reports, detailed NMT reports, System memory maps, and even heap dumps. >> - Using `ErrorLogToStd(out|err)` will redirect the hs-err file and let us see what's happening in cloud situations where file systems are often ephemeral. >> >> ---- >> >> Usage: >> >> Limit is given either as an absolute number or as a relative percentage of the total memory of the machine or the container, e.g. >> `-XX:RssLimit=2G` or `-XX:RssLimit=80%`. >> >> If given as percent, JVM will also react to container limit updates. >> >> Example: we run the JVM inside a container as the sole payload process. Limit its RSS to 90% of the container limit, and in case we run into the limit, fire a heap dump: >> >> `java -XX:+UnlockDiagnosticVMOptions -XX:RssLimit=80% '-XX:OnError=jcmd %p GC.heap_dump my-dump' -Xlog:os+rss ` >> >> ---- >> >> Patch: >> >> Im... 
> > src/hotspot/os/aix/os_aix.cpp line 1299: > >> 1297: } >> 1298: >> 1299: // Unimplemented > > Is this temporary or does AIX not support a way to get RSS? When I was still more deeply involved, I did not find a way. There may be one now; we also could re-use NMT commit total for that, if NMT is enabled. So, I am not sure. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1415245869 From stuefe at openjdk.org Tue Dec 5 09:46:36 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 5 Dec 2023 09:46:36 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 05:43:53 GMT, David Holmes wrote: >> We have `MallocLimit`, a way to trigger errors when reaching a given malloc load threshold. This PR proposes >> a complementary switch, `RSSLimit`, that does the same based on the Resident Set Size of the process. >> >> --- >> >> Motivation: >> >> The main usage for this option is to analyze OOM kills. OOM kills can happen at various layers: the process may be either killed by the kernel OOM killer, or the whole container may get scrapped if it uses too much memory. >> >> One rarely has any information on the nature of the OOM, or if there even was one, and if yes, if the JVM was the culprit or just an innocent bystander. In these situations, getting a voluntary abort *before* the process gets killed from outside can give us valuable information. >> >> Another use of this feature can be testing: specifying an envelope of "reasonable" RSS for testing to check the expected footprint of the JVM. Also useful for a global test-wide setting to catch obvious footprint degradations early. >> >> Letting the JVM handle this Limit has many advantages: >> >> - since the limit is artificial, error reporting is not affected. Other mechanisms (e.g. ulimit) are likely to prevent effective error reporting. 
I usually get torn hs-err files when a limit restriction hits since error reporting needs dynamic memory (regrettably) and space on the stack to do its work. >> >> - Re-using the normal error reporting mechanism is powerful since: >> - hs-err files contain lots of information already: machine memory status, NMT summary, heap information etc. >> - Using `OnError`, that mechanism is expandable: we can run many further diagnostics like Metaspace or Compiler memory reports, detailed NMT reports, System memory maps, and even heap dumps. >> - Using `ErrorLogToStd(out|err)` will redirect the hs-err file and let us see what's happening in cloud situations where file systems are often ephemeral. >> >> ---- >> >> Usage: >> >> Limit is given either as an absolute number or as a relative percentage of the total memory of the machine or the container, e.g. >> `-XX:RssLimit=2G` or `-XX:RssLimit=80%`. >> >> If given as percent, JVM will also react to container limit updates. >> >> Example: we run the JVM inside a container as the sole payload process. Limit its RSS to 90% of the container limit, and in case we run into the limit, fire a heap dump: >> >> `java -XX:+UnlockDiagnosticVMOptions -XX:RssLimit=80% '-XX:OnError=jcmd %p GC.heap_dump my-dump' -Xlog:os+rss ` >> >> ---- >> >> Patch: >> >> Im... > > src/hotspot/os/bsd/os_bsd.cpp line 1473: > >> 1471: result = info.resident_size; >> 1472: } >> 1473: #endif // __APPLE__ > > Hmmm so no general BSD support either ... It is possible on BSD (e.g. using [kvm_getprocs](https://man.freebsd.org/cgi/man.cgi?query=kvm_getprocs&sektion=3&n=1) ) - however, I am no BSD expert and have no system to build BSD on. Note that in both cases - AIX and the BSDs - I am not better nor worse than other code that reads RSS, e.g. jfr_report_memory_info(). Which, btw, is a code unification possibility I plan on following up on in a separate RFE (reusing get_rss for use cases such as jfr_report_memory_info). 
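For contrast with the AIX and BSD gaps discussed above, reading RSS on Linux needs only procfs. What follows is a minimal sketch under stated assumptions, not the `get_RSS` code from this patch: it assumes Linux with `/proc` mounted, and the names `parse_statm_resident_pages` and `read_rss_bytes` are hypothetical. It returns 0 when the value cannot be determined, which is how an unimplemented platform would also look to the caller.

```cpp
#include <cstddef>
#include <cstdio>
#include <unistd.h>

// Hypothetical helper (not part of the patch): extracts the second field of a
// /proc/<pid>/statm line, which is the number of resident pages.
// Returns false if the line does not start with two unsigned numbers.
bool parse_statm_resident_pages(const char* line, unsigned long* resident_pages) {
  unsigned long total_pages = 0;
  return std::sscanf(line, "%lu %lu", &total_pages, resident_pages) == 2;
}

// Hypothetical helper (not part of the patch): returns the resident set size
// of the current process in bytes, or 0 if it cannot be determined.
size_t read_rss_bytes() {
  FILE* f = std::fopen("/proc/self/statm", "r");
  if (f == nullptr) {
    return 0;  // no procfs: behave like an unsupported platform
  }
  char line[256];
  const bool have_line = std::fgets(line, sizeof(line), f) != nullptr;
  std::fclose(f);

  unsigned long resident_pages = 0;
  if (!have_line || !parse_statm_resident_pages(line, &resident_pages)) {
    return 0;
  }
  const long page_size = sysconf(_SC_PAGESIZE);
  if (page_size <= 0) {
    return 0;
  }
  return static_cast<size_t>(resident_pages) * static_cast<size_t>(page_size);
}
```

Keeping the parsing separate from the file access makes the parsing testable without a live `/proc`, and a single zero return value gives the caller one place to decide whether to warn and ignore the flag.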
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1415250588 From aph at openjdk.org Tue Dec 5 09:47:35 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 5 Dec 2023 09:47:35 GMT Subject: RFR: JDK-8320892: AArch64: Restore FPU control state after JNI [v3] In-Reply-To: References: <-Jv5Xvre3lonydwQ5uzYN3QB8V0VIuORIhM1RtIdW5g=.06167df9-7268-4945-8e18-a04d19ee97e1@github.com> Message-ID: On Mon, 4 Dec 2023 15:43:23 GMT, Andrew Haley wrote: > > @theRealAph the `RestoreMXCSROnJNICalls` flag is a product flag not diagnostic. > > Ah, thanks, > > > Aliased flags are setup in arguments.cpp by editing this: > > OK. How about we split this into two, this first part without a CSR, and the second part, which creates the generic alias, with one? That way we can mitigate a live problem in this release. Please? One day left. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16851#issuecomment-1840394680 From fweimer at redhat.com Tue Dec 5 10:08:38 2023 From: fweimer at redhat.com (Florian Weimer) Date: Tue, 05 Dec 2023 11:08:38 +0100 Subject: Use of C++ dynamic global object initialization with thread guards In-Reply-To: <0f2be98c-f07b-4877-b78b-b0c38badfecf@littlepinkcloud.com> (Andrew Haley's message of "Mon, 4 Dec 2023 15:48:09 +0000") References: <87fs0izasf.fsf@oldenburg.str.redhat.com> <0f2be98c-f07b-4877-b78b-b0c38badfecf@littlepinkcloud.com> Message-ID: <87r0k1vu55.fsf@oldenburg.str.redhat.com> * Andrew Haley: > On 12/4/23 07:28, Florian Weimer wrote: >> Furthermore, most uses of C++ dynamic initialization involve a >> computation that is idempotent and have unused bit patterns in the >> initialized value. This means that a separate guard variable is not >> needed, and a simple atomic store/atomic load could be used. > > I used it in HotSpot code to trigger one-time resolution of some JDK > classes. These classes were in an incubator module, so I did not want > them to be loaded by default. 
I guess we could replace the > C++ mechanism by, one of our own, but that doesn't seem to me to be > much of a maintenance win. It pulls in a chunk of the C++ run-time, the complete C++ exception handling implementation is required. In GCC 12 and later, the C++ unwinder is no longer self-contained, so you pick up a dependency on glibc 2.35 or later if you link libstdc++ statically. Thanks, Florian From iwalulya at openjdk.org Tue Dec 5 10:17:36 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Tue, 5 Dec 2023 10:17:36 GMT Subject: RFR: 8317809: Insertion of free code blobs into code cache can be very slow during class unloading [v6] In-Reply-To: References: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> Message-ID: <6SIvBc11xioUfOBp7ywrO5T1wbE3yvwo6kCLXfYVz6c=.2284c684-38d3-436e-9bf0-34a662a5a73f@github.com> On Mon, 4 Dec 2023 12:39:59 GMT, Thomas Schatzl wrote: >> Insert code blobs in a sorted fashion to exploit the finger-optimization when adding, making this procedure O(n) instead of O(n^2) >> >> Introduces a globally available ClassUnloadingContext that contains common methods pertaining to class and code unloading. GCs may use it to efficiently manage unlinked class loader datas and nmethods to allow use of common methods (unlink/merge). >> >> The steps typically are registering a new to be unlinked CLD/nmethod, and then purge its memory later. STW collectors perform this work in one big chunk taking the CodeCache_lock, for the entire duration, while concurrent collectors lock/unlock for every insertion to allow for concurrent users for the lock to progress. >> >> Some care has been taken to stay consistent with an "unloading = unlinking + purge" scheme; however particularly the existing CLD handling API (still) mixes unlinking and purging in its CLD::unload() call. 
To simplify this change that is mostly geared towards separating nmethod unlinking from purging, to make code blob freeing O(n) instead of O(n^2). >> >> Upcoming changes will >> * separate nmethod unregistering from nmethod purging to allow doing that in bulk (for the STW collectors); that can significantly reduce code purging time for the STW collectors. >> * better name the second stage of unlinking (called "cleaning" throughout, e.g. the work done in `G1CollectedHeap::complete_cleaning`) >> * untangle CLD unlinking and what's called "cleaning" now to allow moving more stuff into the second unlinking stage for better parallelism >> * G1: move some significant tasks from the remark pause to concurrent (unregistering nmethods, freeing code blobs and cld/metaspace purging) >> * Maybe move Serial/Parallel GC metaspace purging closer to other unlinking/purging code to keep things local and allow easier logging. >> >> Please also first looking into the (small) PR this depends on. >> >> The crash on linux-x86 is fixed by PR#16766 which I split out for quicker reviews. >> >> Testing: tier1-7 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > ayang review: move class unloading outside of weak_refs_work Marked as reviewed by iwalulya (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16759#pullrequestreview-1764645335 From stuefe at openjdk.org Tue Dec 5 10:21:36 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 5 Dec 2023 10:21:36 GMT Subject: RFR: JDK-8320892: AArch64: Restore FPU control state after JNI [v3] In-Reply-To: References: <-Jv5Xvre3lonydwQ5uzYN3QB8V0VIuORIhM1RtIdW5g=.06167df9-7268-4945-8e18-a04d19ee97e1@github.com> Message-ID: <9vcITlhDNt3bnnN_iG8JXZGEIZplS6CoZtfYnZQj-wI=.068ee413-e703-4743-84f3-b34a5617cfbe@github.com> On Tue, 28 Nov 2023 15:58:04 GMT, Andrew Haley wrote: >> Some buggy libraries corrupt the floating-point control register. 
Provide something similar to the x86 RestoreMXCSROnJNICalls. >> >> I realize that using the x86ish name "RestoreMXCSROnJNICalls" might be a little controversial, but it is a _global_ flag, not a CPU-specific one. And it's clearly intended for this purpose. It might have been better if that flag had been given a better name twentyish years ago, but we can't change it now. > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > Fix thinko Looks good. I looked at https://help.totalview.io/previous_releases/2019/HTML/index.html#page/Reference_Guide%2FPowerFPSCRRegister_2.html%23, and there seems to be another exception bit, bit 15, the "Input denormal exception enable". That one is okay to be modified? ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16851#pullrequestreview-1764652762 From dholmes at openjdk.org Tue Dec 5 10:29:46 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 5 Dec 2023 10:29:46 GMT Subject: RFR: 8320530: has_resolved_ref_index flag not restored after resetting entry [v2] In-Reply-To: References: <4Bi8mWx5pxfYciHnXHUla1X_BzUt_56q8MoRkCYc0dk=.50072528-a5d9-4708-995b-c0bd49f8c74b@github.com> Message-ID: On Thu, 30 Nov 2023 13:39:36 GMT, Coleen Phillimore wrote: >> I see, I said the opposite. I don't like the #ifdef ASSERT blocks. 3 lines vs 1. > > I think each variable should have DEBUG_ONLY() around it, not grouped though. > 3 lines vs 1 Note I said multi-line DEBUG_ONLY. 
Single lines should use DEBUG_ONLY ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16769#discussion_r1415315978 From shade at openjdk.org Tue Dec 5 10:38:52 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 5 Dec 2023 10:38:52 GMT Subject: RFR: 8321269: Require platforms to define DEFAULT_CACHE_LINE_SIZE In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 12:42:45 GMT, Aleksey Shipilev wrote: > Found it while doing new code that wants to know the cache line size. Currently, there is a fallback in `globalDefinitions.hpp` that defaults `DEFAULT_CACHE_LINE_SIZE` to `64` if platform does not define it. Instead of relying on default, force platform definitions to tell what is the reasonable default for the platform. This would simplify porting to other architectures, with less surprises for them. > > The actual sizes do not change. If any existing platform needs adjustments, those should be handled as separate issues. Thanks all! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16948#issuecomment-1840477719 From shade at openjdk.org Tue Dec 5 10:38:54 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 5 Dec 2023 10:38:54 GMT Subject: Integrated: 8321269: Require platforms to define DEFAULT_CACHE_LINE_SIZE In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 12:42:45 GMT, Aleksey Shipilev wrote: > Found it while doing new code that wants to know the cache line size. Currently, there is a fallback in `globalDefinitions.hpp` that defaults `DEFAULT_CACHE_LINE_SIZE` to `64` if platform does not define it. Instead of relying on default, force platform definitions to tell what is the reasonable default for the platform. This would simplify porting to other architectures, with less surprises for them. > > The actual sizes do not change. If any existing platform needs adjustments, those should be handled as separate issues. This pull request has now been integrated. 
Changeset: a56286f7 Author: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/a56286f7ad9a8110026f48eb45f1d7a273b2f9fb Stats: 7 lines in 4 files changed: 6 ins; 0 del; 1 mod 8321269: Require platforms to define DEFAULT_CACHE_LINE_SIZE Reviewed-by: stefank, stuefe, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/16948 From tschatzl at openjdk.org Tue Dec 5 10:40:57 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 5 Dec 2023 10:40:57 GMT Subject: RFR: 8317809: Insertion of free code blobs into code cache can be very slow during class unloading [v6] In-Reply-To: References: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> Message-ID: <3gKHRtjSjtbzvd3oXVu6w5axjWfSg9CDLVO53OipxTM=.373a814d-fc5c-41e4-86af-b8f133b6e255@github.com> On Mon, 4 Dec 2023 12:57:57 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> ayang review: move class unloading outside of weak_refs_work > > Only one minor & subjective comment. 
Thanks @albertnetymk @walulyai for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/16759#issuecomment-1840480439 From tschatzl at openjdk.org Tue Dec 5 10:40:59 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 5 Dec 2023 10:40:59 GMT Subject: Integrated: 8317809: Insertion of free code blobs into code cache can be very slow during class unloading In-Reply-To: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> References: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> Message-ID: On Tue, 21 Nov 2023 11:03:12 GMT, Thomas Schatzl wrote: > Insert code blobs in a sorted fashion to exploit the finger-optimization when adding, making this procedure O(n) instead of O(n^2) > > Introduces a globally available ClassUnloadingContext that contains common methods pertaining to class and code unloading. GCs may use it to efficiently manage unlinked class loader datas and nmethods to allow use of common methods (unlink/merge). > > The steps typically are registering a new to be unlinked CLD/nmethod, and then purge its memory later. STW collectors perform this work in one big chunk taking the CodeCache_lock, for the entire duration, while concurrent collectors lock/unlock for every insertion to allow for concurrent users for the lock to progress. > > Some care has been taken to stay consistent with an "unloading = unlinking + purge" scheme; however particularly the existing CLD handling API (still) mixes unlinking and purging in its CLD::unload() call. To simplify this change that is mostly geared towards separating nmethod unlinking from purging, to make code blob freeing O(n) instead of O(n^2). > > Upcoming changes will > * separate nmethod unregistering from nmethod purging to allow doing that in bulk (for the STW collectors); that can significantly reduce code purging time for the STW collectors. 
> * better name the second stage of unlinking (called "cleaning" throughout, e.g. the work done in `G1CollectedHeap::complete_cleaning`) > * untangle CLD unlinking and what's called "cleaning" now to allow moving more stuff into the second unlinking stage for better parallelism > * G1: move some significant tasks from the remark pause to concurrent (unregistering nmethods, freeing code blobs and cld/metaspace purging) > * Maybe move Serial/Parallel GC metaspace purging closer to other unlinking/purging code to keep things local and allow easier logging. > > Please also first looking into the (small) PR this depends on. > > The crash on linux-x86 is fixed by PR#16766 which I split out for quicker reviews. > > Testing: tier1-7 > > Thanks, > Thomas This pull request has now been integrated. Changeset: 30817b74 Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/30817b742300f10f566e6aee3a8c1f8af4ab3083 Stats: 483 lines in 28 files changed: 352 ins; 87 del; 44 mod 8317809: Insertion of free code blobs into code cache can be very slow during class unloading Reviewed-by: iwalulya, ayang ------------- PR: https://git.openjdk.org/jdk/pull/16759 From stuefe at openjdk.org Tue Dec 5 10:41:13 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 5 Dec 2023 10:41:13 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v2] In-Reply-To: References: Message-ID: > We have `MallocLimit`, a way to trigger errors when reaching a given malloc load threshold. This PR proposes > a complementary switch, `RSSLimit`, that does the same based on the Resident Set Size of the process. > > --- > > Motivation: > > The main usage for this option is to analyze OOM kills. OOM kills can happen at various layers: the process may be either killed by the kernel OOM killer, or the whole container may get scrapped if it uses too much memory. 
> > One rarely has any information on the nature of the OOM, or if there even was one, and if yes, if the JVM was the culprit or just an innocent bystander. In these situations, getting a voluntary abort *before* the process gets killed from outside can give us valuable information. > > Another use of this feature can be testing: specifying an envelope of "reasonable" RSS for testing to check the expected footprint of the JVM. Also useful for a global test-wide setting to catch obvious footprint degradations early. > > Letting the JVM handle this Limit has many advantages: > > - since the limit is artificial, error reporting is not affected. Other mechanisms (e.g. ulimit) are likely to prevent effective error reporting. I usually get torn hs-err files when a limit restriction hits since error reporting needs dynamic memory (regrettably) and space on the stack to do its work. > > - Re-using the normal error reporting mechanism is powerful since: > - hs-err files contain lots of information already: machine memory status, NMT summary, heap information etc. > - Using `OnError`, that mechanism is expandable: we can run many further diagnostics like Metaspace or Compiler memory reports, detailed NMT reports, System memory maps, and even heap dumps. > - Using `ErrorLogToStd(out|err)` will redirect the hs-err file and let us see what's happening in cloud situations where file systems are often ephemeral. > > ---- > > Usage: > > Limit is given either as an absolute number or as a relative percentage of the total memory of the machine or the container, e.g. > `-XX:RssLimit=2G` or `-XX:RssLimit=80%`. > > If given as percent, JVM will also react to container limit updates. > > Example: we run the JVM inside a container as the sole payload process. 
Limit its RSS to 90% of the container limit, and in case we run into the limit, fire a heap dump: > > `java -XX:+UnlockDiagnosticVMOptions -XX:RssLimit=80% '-XX:OnError=jcmd %p GC.heap_dump my-dump' -Xlog:os+rss ` > > ---- > > Patch: > > Implemented for Linux, MacOS and Windows. Left out AIX since there we have a long-... Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: feedback david ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16938/files - new: https://git.openjdk.org/jdk/pull/16938/files/8db30011..f6f43ce4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16938&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16938&range=00-01 Stats: 16 lines in 7 files changed: 5 ins; 1 del; 10 mod Patch: https://git.openjdk.org/jdk/pull/16938.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16938/head:pull/16938 PR: https://git.openjdk.org/jdk/pull/16938 From stuefe at openjdk.org Tue Dec 5 10:41:17 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 5 Dec 2023 10:41:17 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v2] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 05:49:58 GMT, David Holmes wrote: > Hi Thomas, > > I've taken a first pass through this and it seems okay in principle. A number of initial comments/suggestions below. > > Thanks. Thanks a lot, David! Makes me happy to see this finds acceptance at least in principle. I changed: - get_rss to get_RSS - removed the "0 means off" text, since I assume passing 0 would be likely a user error. Instead, I also added an error check for percentage = 0.0. - added a warning if the OS does not support this feature > src/hotspot/share/runtime/globals.hpp line 1378: > >> 1376: "If RssLimit is set, interval, in ms, at which the JVM will " \ >> 1377: "check the process resident set size." 
\ >> 1378: range(10, UINT_MAX)) \ > > Can we actually handle enrolling a periodic task with a UINT_MAX interval? No, we can't; I'll correct the limit. > src/hotspot/share/runtime/threads.cpp line 775: > >> 773: if (RssLimit != nullptr) { >> 774: RssWatcher::initialize(RssLimit); >> 775: } > > So I think if we are on AIX or regular BSD then we should at least give a warning that the flag will be ignored, and actually ignore it. Done > src/hotspot/share/services/rsswatch.hpp line 41: > >> 39: }; >> 40: >> 41: #endif // OS_LINUX_RSSWATCH_HPP > > Comment is wrong fixed ------------- PR Comment: https://git.openjdk.org/jdk/pull/16938#issuecomment-1840480839 PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1415325628 PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1415331131 PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1415330645 From dholmes at openjdk.org Tue Dec 5 10:41:18 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 5 Dec 2023 10:41:18 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v2] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 09:34:27 GMT, Thomas Stuefe wrote: >> src/hotspot/share/runtime/globals.hpp line 1372: >> >>> 1370: "memory size (e.g. \"2G\") or as a percentage of " \ >>> 1371: "the total available memory on this machine or in this " \ >>> 1372: "container (e.g. \"-XX:RssLimit=80%%\"). A value of 0 (default) " \ >> >> It would be more usual to take this as a fraction of available memory e.g. 0.8. >> >> That simplifies the parsing and validation logic. > > I think percent is easier to understand: > > - RSSLimit=100 - percent, I guess? Because such a small number makes no sense as limit? > - RSSLimit=90 - percent, I guess? > - RSSLimit=90000000 - not percent, because it is larger than 100? > - RSSLimit=100.0 - percent because of the decimal point? 
> > Alternative would be a complementary "RssLimitPercent" switch, but I want to keep the number of switches at a minimum. It would be 0.9 == 90% etc. so 0.x to 1.0 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1415324803 From stuefe at openjdk.org Tue Dec 5 10:41:19 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 5 Dec 2023 10:41:19 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v2] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 10:33:21 GMT, Thomas Stuefe wrote: >> src/hotspot/share/runtime/globals.hpp line 1378: >> >>> 1376: "If RssLimit is set, interval, in ms, at which the JVM will " \ >>> 1377: "check the process resident set size." \ >>> 1378: range(10, UINT_MAX)) \ >> >> Can we actually handle enrolling a periodic task with a UINT_MAX interval? > > No, we can't; I'll correct the limit. fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1415330935 From fyang at openjdk.org Tue Dec 5 10:42:45 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 5 Dec 2023 10:42:45 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v8] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 22:17:11 GMT, Yuri Gaevsky wrote: >> Hello All, >> >> Please review these changes to support _vectorizedHashCode intrinsic on >> RISC-V platform. The patch adds the "scalar" code for the intrinsic without >> usage of any RVV instruction but provides manual unrolling of the appropriate >> loop. The code with usage of RVV instruction could be added as follow-up of >> the patch or independently. >> >> Thanks, >> -Yuri Gaevsky >> >> P.S. My OCA has been accepted recently (ygaevsky). >> >> ### Correctness checks >> >> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux. 
>>
>> ### Performance results (the numbers for non-ints are similar)
>>
>> #### StarFive JH7110 board:
>>
>>
>> ArraysHashCode:                  without intrinsic         with intrinsic
>> -------------------------------------------------------------------------------
>> Benchmark  (size)  Mode  Cnt       Score     Error        Score     Error  Units
>> -------------------------------------------------------------------------------
>> multiints       0  avgt   30       2.658 ±   0.001        2.661 ±   0.004  ns/op
>> multiints       1  avgt   30       4.881 ±   0.011        4.892 ±   0.015  ns/op
>> multiints       2  avgt   30      16.109 ±   0.041       10.451 ±   0.075  ns/op
>> multiints       3  avgt   30      14.873 ±   0.068       11.753 ±   0.024  ns/op
>> multiints       4  avgt   30      17.283 ±   0.078       13.176 ±   0.044  ns/op
>> multiints       5  avgt   30      19.691 ±   0.136       14.723 ±   0.046  ns/op
>> multiints       6  avgt   30      21.727 ±   0.166       15.463 ±   0.124  ns/op
>> multiints       7  avgt   30      23.790 ±   0.126       18.298 ±   0.059  ns/op
>> multiints       8  avgt   30      23.527 ±   0.116       18.267 ±   0.046  ns/op
>> multiints       9  avgt   30      27.981 ±   0.303       20.453 ±   0.069  ns/op
>> multiints      10  avgt   30      26.947 ±   0.215       20.541 ±   0.051  ns/op
>> multiints      50  avgt   30      95.373 ±   0.588       69.238 ±   0.208  ns/op
>> multiints     100  avgt   30     177.109 ±   0.525      137.852 ±   0.417  ns/op
>> multiints     200  avgt   30     341.074 ±   1.363      296.832 ±   0.725  ns/op
>> multiints     500  avgt   30     847.993 ±   1.713      752.415 ±   1.918  ns/op
>> multiints    1000  avgt   30    1610.199 ±   5.424     1426.112 ±   3.407  ns/op
>> multiints   10000  avgt   30   16234.260 ±  26.789    14447.936 ±  26.345  ns/op
>> multiints  100000  avgt   30  170726.025 ± 184.003   152587.649 ± 381.964  ns/op
>> ---------------------------------------...
>
> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:
>
> Removed comment and break clause from default switch case.

Thanks for the update. Several comments remain.
src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1507: > 1505: #define DO_ELEMENT_LOAD(reg, idx) \ > 1506: switch (eltype) { \ > 1507: case T_BOOLEAN: lb(reg, Address(ary, idx * elsize)); break; \ Since `T_BOOLEAN` is used to signify unsigned bytes [1], shouldn't we use `lbu` instead of `lb` here? Seems that the existing tests didn't cover this case? [1] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/util/ArraysSupport.java#L192 src/hotspot/cpu/riscv/riscv.ad line 10308: > 10306: instruct arrays_hashcode(iRegP_R11 ary, iRegI_R12 cnt, iRegI_R10 result, immI basic_type, > 10307: iRegLNoSp tmp1, iRegINoSp tmp2, > 10308: iRegINoSp tmp3, iRegLNoSp tmp4, rFlagsReg cr) Maybe declare `tmp2` and `tmp3` as `iRegLNoSp`? I see these two are used as 64-bit registers in `C2_MacroAssembler::arrays_hashcode`. src/hotspot/cpu/riscv/riscv.ad line 10312: > 10310: match(Set result (VectorizedHashCode (Binary ary cnt) (Binary result basic_type))); > 10311: effect(TEMP tmp1, TEMP tmp2, TEMP tmp3, TEMP tmp4, > 10312: USE_KILL ary, USE_KILL cnt, USE basic_type, KILL cr); I don't think we need to add `USE basic_type` to the effect list. It is an immediate input which is already there in the match rule. ------------- Changes requested by fyang (Reviewer). 
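The `lb` versus `lbu` question above comes down to sign extension versus zero extension of each byte before it enters the hash. The following is a minimal sketch in plain C++, not the RISC-V stub: the function name `hash_bytes`, its `unsigned_bytes` parameter, and the initial value of 1 are assumptions made for illustration. It uses the polynomial `result = 31 * result + element` and shows that the two load flavours only disagree once a byte has its top bit set, which may be why existing tests did not notice.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical illustration (not the intrinsic): polynomial hash
// result = 31 * result + element over a byte array.
// unsigned_bytes == true  models a zero-extending load (like lbu),
// unsigned_bytes == false models a sign-extending load (like lb).
int32_t hash_bytes(const uint8_t* data, size_t len, bool unsigned_bytes) {
  uint32_t result = 1;  // unsigned arithmetic so that overflow wraps
  for (size_t i = 0; i < len; i++) {
    const int32_t element =
        unsigned_bytes ? static_cast<int32_t>(data[i])
                       : static_cast<int32_t>(static_cast<int8_t>(data[i]));
    result = 31u * result + static_cast<uint32_t>(element);
  }
  return static_cast<int32_t>(result);
}
```

For the single byte `0x01` both variants agree; for `0x80` the zero-extending variant adds 128 and the sign-extending variant adds -128, so a test that wants to catch the wrong load instruction needs input bytes of 0x80 or above.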
PR Review: https://git.openjdk.org/jdk/pull/16629#pullrequestreview-1763733536 PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1415331676 PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1415332917 PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1415334167 From epeter at openjdk.org Tue Dec 5 10:49:52 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 5 Dec 2023 10:49:52 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v5] In-Reply-To: References: Message-ID: > I'm making sure that `allocate_bci_to_data` is only called when holding the `extra_data_lock`, so that no concurrent calls of it can ever occur. > > Testing: tier1-3 and stress. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: make lock not safepointing ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/83cda010..c126f340 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=03-04 Stats: 54 lines in 11 files changed: 22 ins; 8 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From duke at openjdk.org Tue Dec 5 11:02:45 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Tue, 5 Dec 2023 11:02:45 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v8] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 10:37:43 GMT, Fei Yang wrote: >> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: >> >> Removed comment and break clause from default switch case. 
> > src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1507: > >> 1505: #define DO_ELEMENT_LOAD(reg, idx) \ >> 1506: switch (eltype) { \ >> 1507: case T_BOOLEAN: lb(reg, Address(ary, idx * elsize)); break; \ > > Since `T_BOOLEAN` is used to signify unsigned bytes [1], shouldn't we use `lbu` instead of `lb` here? Seems that the existing tests didn't cover this case? > > [1] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/util/ArraysSupport.java#L192 Agreed. > src/hotspot/cpu/riscv/riscv.ad line 10308: > >> 10306: instruct arrays_hashcode(iRegP_R11 ary, iRegI_R12 cnt, iRegI_R10 result, immI basic_type, >> 10307: iRegLNoSp tmp1, iRegINoSp tmp2, >> 10308: iRegINoSp tmp3, iRegLNoSp tmp4, rFlagsReg cr) > > Maybe declare `tmp2` and `tmp3` as `iRegLNoSp`? I see these two are used as 64-bit registers in `C2_MacroAssembler::arrays_hashcode`. Agreed. > src/hotspot/cpu/riscv/riscv.ad line 10312: > >> 10310: match(Set result (VectorizedHashCode (Binary ary cnt) (Binary result basic_type))); >> 10311: effect(TEMP tmp1, TEMP tmp2, TEMP tmp3, TEMP tmp4, >> 10312: USE_KILL ary, USE_KILL cnt, USE basic_type, KILL cr); > > I don't think we need to add `USE basic_type` to the effect list. It is an immediate input which is already there in the match rule. 
I've borrowed that part from its X86 counterpart: instruct arrays_hashcode(rdi_RegP ary1, rdx_RegI cnt1, rbx_RegI result, immU8 basic_type, ..., legRegD tmp_vec13, rRegI tmp1, rRegI tmp2, rRegI tmp3, rFlagsReg cr) %{ predicate(UseAVX >= 2); match(Set result (VectorizedHashCode (Binary ary1 cnt1) (Binary result basic_type))); effect(..., USE basic_type, KILL cr); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1415365533 PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1415365678 PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1415365817 From mli at openjdk.org Tue Dec 5 11:15:34 2023 From: mli at openjdk.org (Hamlin Li) Date: Tue, 5 Dec 2023 11:15:34 GMT Subject: RFR: 8318227: RISC-V: C2 ConvHF2F [v3] In-Reply-To: <13Ot4D45ppGcgnXjlGP1xrYEcZ8LejbI5cxjRruUD4c=.4cd4ca6f-8e4f-4679-9706-59a86d867b6f@github.com> References: <13Ot4D45ppGcgnXjlGP1xrYEcZ8LejbI5cxjRruUD4c=.4cd4ca6f-8e4f-4679-9706-59a86d867b6f@github.com> Message-ID: On Mon, 4 Dec 2023 03:08:38 GMT, Fei Yang wrote: > Hi Hamlin, updated change looks good to me. Please wait a while for the kernel patch to land. Thanks. Sure, I will wait for the kernel patch to be merged. Thanks for reviewing! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16802#issuecomment-1840539541 From epeter at openjdk.org Tue Dec 5 11:30:48 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 5 Dec 2023 11:30:48 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v6] In-Reply-To: References: Message-ID: > I'm making sure that `allocate_bci_to_data` is only called when holding the `extra_data_lock`, so that no concurrent calls of it can ever occur. > > Testing: tier1-3 and stress.
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: missed a case where I need to lock ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/c126f340..465c3815 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=04-05 Stats: 5 lines in 1 file changed: 5 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From duke at openjdk.org Tue Dec 5 11:40:37 2023 From: duke at openjdk.org (Tom Shull) Date: Tue, 5 Dec 2023 11:40:37 GMT Subject: RFR: JDK-8320892: AArch64: Restore FPU control state after JNI [v3] In-Reply-To: References: <-Jv5Xvre3lonydwQ5uzYN3QB8V0VIuORIhM1RtIdW5g=.06167df9-7268-4945-8e18-a04d19ee97e1@github.com> Message-ID: <1lKNGatJErizF49_uuVYcQvMk8O98NYJDcXWjVf7GNI=.19ab54ad-b12e-4a5a-9b46-184176be2f6d@github.com> On Tue, 28 Nov 2023 15:58:04 GMT, Andrew Haley wrote: >> Some buggy libraries corrupt the floating-point control register. Provide something similar to the x86 RestoreMXCSROnJNICalls. >> >> I realize that using the x86ish name "RestoreMXCSROnJNICalls" might be a little controversial, but it is a _global_ flag, not a CPU-specific one. And it's clearly intended for this purpose. It might have been better if that flag had been given a better name twentyish years ago, but we can't change it now. > > Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: > > Fix thinko src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4439: > 4437: // don't want non-IEEE rounding modes or floating-point traps. 
> 4438: bfi(tmp1, zr, 22, 4); // Clear DN, FZ, and Rmode > 4439: bfi(tmp1, zr, 8, 5); // Clear exception-control bits (8-12) (Related to both this PR and https://github.com/openjdk/jdk/pull/16637) Shouldn't explicit flushing of inputs to zero, i.e. when AH:FIZ is (1:1), also be protected against? Also, is it necessary to clear DN? When looking at the spec, I think this should be allowed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16851#discussion_r1415447581 From tschatzl at openjdk.org Tue Dec 5 11:54:54 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 5 Dec 2023 11:54:54 GMT Subject: RFR: 8321369: Unproblemlist gc/cslocker/TestCSLocker.java Message-ID: Hi all, please review this fix to unproblemlist gc/cslocker/TestCSLocker.java; the CR to fix this, [JDK-8310480](https://bugs.openjdk.org/browse/JDK-8310480), has already been closed as a duplicate of [JDK-8318706](https://bugs.openjdk.org/browse/JDK-8318706), which removed the GCLocker for G1, the cause of the issue. Note that the test has only been problemlisted for linux-x64 previously, so it has already been run for a long time on other platforms and with other collectors; my testing therefore did not extensively try all the other platform/GC combinations (I did try with Serial and Parallel a few times with no issues; ZGC is excluded anyway in the test, and Shenandoah also does not use the GCLocker).
Testing: test case with g1, gha Thanks, Thomas ------------- Commit messages: - 8321369 unproblemlist testcslocker.java Changes: https://git.openjdk.org/jdk/pull/16970/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16970&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321369 Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16970.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16970/head:pull/16970 PR: https://git.openjdk.org/jdk/pull/16970 From jkern at openjdk.org Tue Dec 5 12:11:46 2023 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 5 Dec 2023 12:11:46 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v3] In-Reply-To: References: Message-ID: > On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. > > This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). > > We propose a different, cleaner way of handling this: > > - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. > - Cache dl handles; repeated opening of a library should return the cached handle. 
> - Increase handle-local ref counter on open, Decrease it on close > - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). > > This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: encapsulate everything in os::Aix::dlopen ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16920/files - new: https://git.openjdk.org/jdk/pull/16920/files/0f6716db..2d32c43b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=01-02 Stats: 175 lines in 2 files changed: 90 ins; 82 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16920.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16920/head:pull/16920 PR: https://git.openjdk.org/jdk/pull/16920 From duke at openjdk.org Tue Dec 5 12:57:05 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Tue, 5 Dec 2023 12:57:05 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v9] In-Reply-To: References: Message-ID: > Hello All, > > Please review these changes to support _vectorizedHashCode intrinsic on > RISC-V platform. The patch adds the "scalar" code for the intrinsic without > usage of any RVV instruction but provides manual unrolling of the appropriate > loop. The code with usage of RVV instruction could be added as follow-up of > the patch or independently. > > Thanks, > -Yuri Gaevsky > > P.S. My OCA has been accepted recently (ygaevsky). > > ### Correctness checks > > Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux. 
> > ### Performance results (the numbers for non-ints are similar) > > #### StarFive JH7110 board: > >
> ArraysHashCode:                 without intrinsic        with intrinsic
> -------------------------------------------------------------------------------
> Benchmark  (size)  Mode  Cnt       Score     Error       Score     Error  Units
> -------------------------------------------------------------------------------
> multiints       0  avgt   30       2.658 ±   0.001       2.661 ±   0.004  ns/op
> multiints       1  avgt   30       4.881 ±   0.011       4.892 ±   0.015  ns/op
> multiints       2  avgt   30      16.109 ±   0.041      10.451 ±   0.075  ns/op
> multiints       3  avgt   30      14.873 ±   0.068      11.753 ±   0.024  ns/op
> multiints       4  avgt   30      17.283 ±   0.078      13.176 ±   0.044  ns/op
> multiints       5  avgt   30      19.691 ±   0.136      14.723 ±   0.046  ns/op
> multiints       6  avgt   30      21.727 ±   0.166      15.463 ±   0.124  ns/op
> multiints       7  avgt   30      23.790 ±   0.126      18.298 ±   0.059  ns/op
> multiints       8  avgt   30      23.527 ±   0.116      18.267 ±   0.046  ns/op
> multiints       9  avgt   30      27.981 ±   0.303      20.453 ±   0.069  ns/op
> multiints      10  avgt   30      26.947 ±   0.215      20.541 ±   0.051  ns/op
> multiints      50  avgt   30      95.373 ±   0.588      69.238 ±   0.208  ns/op
> multiints     100  avgt   30     177.109 ±   0.525     137.852 ±   0.417  ns/op
> multiints     200  avgt   30     341.074 ±   1.363     296.832 ±   0.725  ns/op
> multiints     500  avgt   30     847.993 ±   1.713     752.415 ±   1.918  ns/op
> multiints    1000  avgt   30    1610.199 ±   5.424    1426.112 ±   3.407  ns/op
> multiints   10000  avgt   30   16234.260 ±  26.789   14447.936 ±  26.345  ns/op
> multiints  100000  avgt   30  170726.025 ± 184.003  152587.649 ± 381.964  ns/op
> -------------------------------------------------------------------------------
> > #### T-Head RVB-ICE board: > > > ArraysHashCode: ... Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: Changed lb-->lbu for T_BOOLEAN and iRegINoSp-->iRegLNoSp for tmp2/tmp3.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/16629/files - new: https://git.openjdk.org/jdk/pull/16629/files/a57afe9c..f955a061 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16629&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16629&range=07-08 Stats: 8 lines in 2 files changed: 0 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/16629.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16629/head:pull/16629 PR: https://git.openjdk.org/jdk/pull/16629 From duke at openjdk.org Tue Dec 5 12:57:08 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Tue, 5 Dec 2023 12:57:08 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v2] In-Reply-To: References: <9i5yHmpRi3-XqL5lw0-0IexhCDr2FOi5nT4dgY7cWao=.ab8a1d6e-c9fc-4108-820b-374ce7815463@github.com> Message-ID: On Tue, 5 Dec 2023 00:54:47 GMT, Fei Yang wrote: >>> The code with usage of RVV instruction could be added as follow-up of >> the patch or independently. >> >> Hey @ygaevsky, I can work on this real vectorized intrinsic implementation, please let me know how you think about it. >> If you already had a solution or started working on it, please ignore my message. >> >> Thanks. > >> @Hamlin-Li, @RealFYang: please take a look at the latest updates when you have time, thanks. > > Having another look. Thank you for your comments/suggestions, @RealFYang: I've updated the patch except riscv.ad part where I am not sure we need to do the suggested change for basic_type. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1840739121 From duke at openjdk.org Tue Dec 5 12:57:11 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Tue, 5 Dec 2023 12:57:11 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v8] In-Reply-To: References: Message-ID: <6wSifAiOMUtKIqRZ32xBEVO4qaMP0qkvR_c28s6H4ec=.507f7455-0928-47c5-b6e3-0a947eefc349@github.com> On Tue, 5 Dec 2023 10:59:48 GMT, Yuri Gaevsky wrote: >> src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1507: >> >>> 1505: #define DO_ELEMENT_LOAD(reg, idx) \ >>> 1506: switch (eltype) { \ >>> 1507: case T_BOOLEAN: lb(reg, Address(ary, idx * elsize)); break; \ >> >> Since `T_BOOLEAN` is used to signify unsigned bytes [1], shouldn't we use `lbu` instead of `lb` here? Seems that the existing tests didn't cover this case? >> >> [1] https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/jdk/internal/util/ArraysSupport.java#L192 > > Agreed. Fixed. >> src/hotspot/cpu/riscv/riscv.ad line 10308: >> >>> 10306: instruct arrays_hashcode(iRegP_R11 ary, iRegI_R12 cnt, iRegI_R10 result, immI basic_type, >>> 10307: iRegLNoSp tmp1, iRegINoSp tmp2, >>> 10308: iRegINoSp tmp3, iRegLNoSp tmp4, rFlagsReg cr) >> >> Maybe declare `tmp2` and `tmp3` as `iRegLNoSp`? I see these two are used as 64-bit registers in `C2_MacroAssembler::arrays_hashcode`. > > Agreed. Fixed. 
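For readers following the `lb` versus `lbu` point above, here is a minimal standalone sketch of why the choice of load matters. It is plain C++, not the HotSpot macro assembler code from the patch: `load_signed`, `load_unsigned` and `hash_bytes` are hypothetical names used only for illustration, and the 31-based polynomial is assumed to be the same recurrence the intrinsic computes. The only thing it relies on from the thread is that `T_BOOLEAN` is meant to signify unsigned bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Models RISC-V `lb`: the loaded byte is sign-extended to 32 bits.
int32_t load_signed(const uint8_t* p) { return static_cast<int8_t>(*p); }

// Models RISC-V `lbu`: the loaded byte is zero-extended to 32 bits.
int32_t load_unsigned(const uint8_t* p) { return *p; }

using Load = int32_t (*)(const uint8_t*);

// Polynomial hash h = 31 * h + element, with Java-style 32-bit wraparound.
// Unsigned arithmetic is used so that overflow is well defined in C++.
int32_t hash_bytes(const uint8_t* a, size_t n, int32_t initial, Load load) {
  assert(a != nullptr || n == 0);
  uint32_t h = static_cast<uint32_t>(initial);
  for (size_t i = 0; i < n; i++) {
    h = 31u * h + static_cast<uint32_t>(load(a + i));
  }
  return static_cast<int32_t>(h);
}
```

With the two-byte input `{0x01, 0x80}` and an initial value of 0, the zero-extending load yields 159 while the sign-extending load yields -97; the two only agree when every byte is below 0x80, which may be why a test that uses small values would not notice the difference.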
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1415559969 PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1415559639 From ihse at openjdk.org Tue Dec 5 13:02:47 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 5 Dec 2023 13:02:47 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 08:48:52 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64. Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error. But a warning is printed out if people specifies a nonexistent library explicitly. 
>> >> Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedbacks from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains ten additional commits since the last revision: > > - Separate neon and sve functions into two source files > - Merge branch 'jdk:master' into JDK-8312425 > - Rename vmath to sleef in configure > - Address review comments in build system > - Add a bundled native lib in jdk as a bridge to libsleef > - Merge 'jdk:master' into JDK-8312425 > - Disable sleef by default > - Merge 'jdk:master' into JDK-8312425 > - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF So you need to check both the flag and the header file? Oh well, then this is probably as good as it gets. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1840748314 From ihse at openjdk.org Tue Dec 5 13:05:48 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Tue, 5 Dec 2023 13:05:48 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5] In-Reply-To: References: <9VeMdTAJPaPZDg9ZW7FVJOf9XGl4gGqAS-2g8SFc9c0=.36792cd5-66d9-4abc-ba0c-aee3478627f4@github.com> Message-ID: On Tue, 5 Dec 2023 07:21:32 GMT, Xiaohong Gong wrote: >> also, ideally, you will add the corresponding specific overrides like in ffi: >> >> AC_ARG_WITH(libffi-include, [AS_HELP_STRING([--with-libffi-include], >> [specify directory for the libffi include files])]) >> AC_ARG_WITH(libffi-lib, [AS_HELP_STRING([--with-libffi-lib], >> [specify directory for the libffi library])]) > > Thanks for the suggestion @magicus ! > > The check in current `lib-sleef.m4` is very common: > > - If user has specified libsleef root by '--with-libsleef', we assume it is the manually built sleef lib. So only `lib/` and `include/` is checked. And the flags are set based on that path. > - If user has not specified the libsleef root, and no `SYSROOT` is set, we try `PKG_CHECK` (like what you suggested) > - Otherwise, check `sleef.h` > - We assume the sleef module is installed under one of the valid system paths if the header can be found. So just linking with `-lsleef` will success. > > It's an issue in current flow like what @theRealAph met. I will add the options like `--with-libsleef-lib` and `--with-libsleef-include` like ffi. Regarding to extending the check for`--with-libsleef`, I think we can just make it simple like what it is now. Or, we have to check all the potential valid lib paths like `lib/`, `lib64/`, or maybe `lib/aarch64-linux-gnu`. The same to the `include` part. @theRealAph @magicus , WDYT? I'm fine with adding just --with-libsleef-lib and --with-libsleef-include to specify them directly. 
This makes it at least possible to use, if not overly convenient, for people using a system like Andrew's. If it annoys someone too much, we can extend it later. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1415576201 From eastigeevich at openjdk.org Tue Dec 5 13:08:47 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 5 Dec 2023 13:08:47 GMT Subject: Integrated: 8321105: Enable UseCryptoPmullForCRC32 for Neoverse V2 In-Reply-To: References: Message-ID: <4MsYQRcfq9DUcqUpE9ffLz-NzlIefZoDghBnZ7ns8Fg=.727b2b97-18a3-4686-a4a6-a85455244574@github.com> On Mon, 4 Dec 2023 13:36:17 GMT, Evgeny Astigeevich wrote: > UseCryptoPmullForCRC32 enables to use crypto pmull instructions in CRC32 implementation. It is set to true for Neoverse V1. As the performance of the instructions is the same on Neoverse V2, UseCryptoPmullForCRC32 should be set to true for V2. This pull request has now been integrated. Changeset: 5b02188f Author: Evgeny Astigeevich URL: https://git.openjdk.org/jdk/commit/5b02188f723e0de3faf2d8150b676a4383e1f618 Stats: 9 lines in 1 file changed: 6 ins; 0 del; 3 mod 8321105: Enable UseCryptoPmullForCRC32 for Neoverse V2 Reviewed-by: shade, ngasson ------------- PR: https://git.openjdk.org/jdk/pull/16949 From mgronlun at openjdk.org Tue Dec 5 13:20:05 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Tue, 5 Dec 2023 13:20:05 GMT Subject: RFR: 8211238: @Deprecated JFR event [v14] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. 
> > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: max limit number of recorded edges ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16931/files - new: https://git.openjdk.org/jdk/pull/16931/files/91ec681a..b4766c39 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=12-13 Stats: 53 lines in 3 files changed: 28 ins; 16 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From eastigeevich at openjdk.org Tue Dec 5 13:22:46 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 5 Dec 2023 13:22:46 GMT Subject: RFR: 8321105: Enable UseCryptoPmullForCRC32 for Neoverse V2 In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 16:36:27 GMT, Nick Gasson wrote: >> UseCryptoPmullForCRC32 enables to use crypto pmull instructions in CRC32 implementation. It is set to true for Neoverse V1. As the performance of the instructions is the same on Neoverse V2, UseCryptoPmullForCRC32 should be set to true for V2. > > Marked as reviewed by ngasson (Reviewer). @nick-arm @shipilev Thank you for reviewing.
------------- PR Comment: https://git.openjdk.org/jdk/pull/16949#issuecomment-1840780001 From aph at openjdk.org Tue Dec 5 13:42:39 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 5 Dec 2023 13:42:39 GMT Subject: RFR: JDK-8320892: AArch64: Restore FPU control state after JNI [v3] In-Reply-To: <1lKNGatJErizF49_uuVYcQvMk8O98NYJDcXWjVf7GNI=.19ab54ad-b12e-4a5a-9b46-184176be2f6d@github.com> References: <-Jv5Xvre3lonydwQ5uzYN3QB8V0VIuORIhM1RtIdW5g=.06167df9-7268-4945-8e18-a04d19ee97e1@github.com> <1lKNGatJErizF49_uuVYcQvMk8O98NYJDcXWjVf7GNI=.19ab54ad-b12e-4a5a-9b46-184176be2f6d@github.com> Message-ID: On Tue, 5 Dec 2023 11:37:48 GMT, Tom Shull wrote: >> Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix thinko > > src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp line 4439: > >> 4437: // don't want non-IEEE rounding modes or floating-point traps. >> 4438: bfi(tmp1, zr, 22, 4); // Clear DN, FZ, and Rmode >> 4439: bfi(tmp1, zr, 8, 5); // Clear exception-control bits (8-12) > > (Related to both this PR and https://github.com/openjdk/jdk/pull/16637) > > Shouldn't also explicit flushing inputs to zero, i.e. when AH:FIZ is (1:1), be protected against? > > Also, is it necessary to clear DN? When looking at the spec, I think this should be allowed. > (Related to both this PR and #16637) > > Shouldn't also explicit flushing inputs to zero, i.e. when AH:FIZ is (1:1), be protected against? I've avoided touching ArmV8.7 ALT_FP, but you might be right. AH:FIZ are both RES0, so it is safe to do so. > Also, is it necessary to clear DN? When looking at the spec, I think this should be allowed. I think it's wise to clear DN. We save and restore all of the FPCR at entry to Java, clearing DN, and it's not unreasonable to expect a JNI call not to mess with FPCR. 
Also, I think replacing NaN payload bits with the default NaN is pointless, and violates the principle of least surprise if not the spec of `longBitsToDouble`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16851#discussion_r1415629993 From eastigeevich at openjdk.org Tue Dec 5 13:51:36 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 5 Dec 2023 13:51:36 GMT Subject: RFR: 8317831: compiler/codecache/CheckLargePages.java fails on OL 8.8 with unexpected memory string In-Reply-To: <59K9mUFzngrq77p0DVcyuoHADtmPN1SB2fZJToQnxJk=.394c23d2-f87b-4775-9416-3d384fc1817c@github.com> References: <59K9mUFzngrq77p0DVcyuoHADtmPN1SB2fZJToQnxJk=.394c23d2-f87b-4775-9416-3d384fc1817c@github.com> Message-ID: On Tue, 5 Dec 2023 01:08:10 GMT, Yi-Fan Tsai wrote: > Test CheckLargePages was broken by the previous changes: > > [JDK-8261894](https://bugs.openjdk.org/browse/JDK-8261894) removes `UseHugeTLBFS`. It was also removed from `os::can_execute_large_page_memory`, and `CodeCache::page_size` cannot use huge pages anymore. > > [JDK-8310233](https://bugs.openjdk.org/browse/JDK-8310233) changes the pagesize logs from > > Usable page sizes: 4k, 1G > > to > > Large page support enabled. Usable page sizes: 4k, 1G. Default large page size: 1G. > > > This change includes: > - `os::can_execute_large_page_memory` uses `UseLargePages`, which could be implicitly enabled by `UseTransparentHugePages` as well. > - The regular expression in CheckLargePages is updated to capture only the page sizes. > - Test CheckLargePages is still kept in ProblemList.txt until [JDK-8319795](https://bugs.openjdk.org/browse/JDK-8319795)) is resolved. Changes requested by eastigeevich (Committer). src/hotspot/os/linux/os_linux.cpp line 4007: > 4005: > 4006: bool os::can_execute_large_page_memory() { > 4007: return UseLargePages; Could you please exclude this change from the PR? The function will be removed by my PR. 
In `test/hotspot/jtreg/ProblemList.txt` could you please change compiler/codecache/CheckLargePages.java 8317831 linux-x64 to compiler/codecache/CheckLargePages.java 8319795 linux-x64 ------------- PR Review: https://git.openjdk.org/jdk/pull/16962#pullrequestreview-1765126244 PR Review Comment: https://git.openjdk.org/jdk/pull/16962#discussion_r1415642973 From stuefe at openjdk.org Tue Dec 5 13:55:51 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 5 Dec 2023 13:55:51 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v3] In-Reply-To: References: Message-ID: <7V0zHrWeOjnDyHJuq3DFsb-BvaQvZbwE5zIGyxWvGNE=.48a0fc72-c70d-4e21-891a-5f4714bac830@github.com> On Tue, 5 Dec 2023 12:11:46 GMT, Joachim Kern wrote: >> On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. >> >> This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). >> >> We propose a different, cleaner way of handling this: >> >> - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. >> - Cache dl handles; repeated opening of a library should return the cached handle. 
>> - Increase handle-local ref counter on open, Decrease it on close >> - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). >> >> This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > encapsulate everything in os::Aix::dlopen Excellent, this is how I have pictured a good solution. Very nice. A number of remarks, but nothing fundamental. src/hotspot/os/aix/os_aix.cpp line 1137: > 1135: if (ebuf != nullptr && ebuflen > 0) { > 1136: ::strncpy(ebuf, "dll_load: empty filename specified", ebuflen - 1); > 1137: } Are there any cases where we don't hand in the error buffer? If not, I would just assert ebuf and ebuflen. No need for this kind of flexibility. src/hotspot/os/aix/os_aix.cpp line 3051: > 3049: > 3050: // Simulate the library search algorithm of dlopen() (in os::dll_load) > 3051: int os::Aix::stat64x_via_LIBPATH(const char* path, struct stat64x* stat) { - no need to export this, make it file-scope static - please return bool, with false = error - please rename it to something like "search_file_in_LIBPATH" src/hotspot/os/aix/os_aix.cpp line 3055: > 3053: return -1; > 3054: > 3055: char *path2 = strdup (path); Please use os::strdup and os::free. If you really intend to use the plain libc versions, use `::strdup` and `::free` to make sure - and indicate to code readers - you use the global libc variants. src/hotspot/os/aix/os_aix.cpp line 3059: > 3057: int idx = strlen(path2) - 1; > 3058: if (path2[idx] == ')') { > 3059: while (path2[idx] != '(' && idx > 0) idx--; Why not `strrchr()`? src/hotspot/os/aix/os_aix.cpp line 3067: > 3065: if (path2[0] == '/' || > 3066: (path2[0] == '.' && (path2[1] == '/' || > 3067: (path2[1] == '.'
&& path2[2] == '/')))) { This complexity is not needed, nor is it sufficient, since it does not handle relative paths ("mydirectory/hallo.so") https://www.ibm.com/docs/en/aix/7.1?topic=d-dlopen-subroutine "If FilePath contains a slash character, FilePath is used directly, and no directories are searched. " So, just scan for a '/' - if you find one, its a path to be opened directly: const bool use_as_filepath = strchr(path2, '/'); src/hotspot/os/aix/os_aix.cpp line 3085: > 3083: strcpy(libpath, env); > 3084: for (token = strtok_r(libpath, ":", &saveptr); token != nullptr; token = strtok_r(nullptr, ":", &saveptr)) { > 3085: sprintf(combined, "%s/%s", token, path2); You can save a lot of pain and manual labor by using `stringStream` here. stringStream combined; combined.print("%s/%s", token, path2); const char* combined_path_string = combined.base(); no need for manual allocation and byte counting. src/hotspot/os/aix/os_aix.cpp line 3099: > 3097: // filled by os::dll_load(). This way we mimic dl handle equality for a library > 3098: // opened a second time, as it is implemented on other platforms. > 3099: void* os::Aix::dlopen(const char* filename, int Flags) { Does not need to be exported, nor does os::AIX::dlclose. Make file scope static. See my remarks in os_posix.cpp. src/hotspot/os/aix/os_aix.cpp line 3103: > 3101: struct stat64x libstat; > 3102: > 3103: if (os::Aix::stat64x_via_LIBPATH(filename, &libstat)) { Please return bool, not unix int -1, this hurts my brain :-) src/hotspot/os/aix/os_aix.cpp line 3108: > 3106: if (result != nullptr) { > 3107: assert(false, "dll_load: Could not stat() file %s, but dlopen() worked; Have to improve stat()", filename); > 3108: } Since this is just assert code, I'd wrap all this stuff in #ifdef ASSERT. No need for needless dlopens otherwise. src/hotspot/os/aix/os_aix.cpp line 3125: > 3123: } > 3124: if (i == g_handletable_used) { > 3125: // library not still loaded. 
Check if there is space left in array

s/still/yet

src/hotspot/os/aix/os_aix.cpp line 3131:

> 3129: pthread_mutex_unlock(&g_handletable_mutex);
> 3130: assert(false, "max_handletable reached");
> 3131: return nullptr;

Please, for the sake of release code, hand in an error buffer and fill it with something that makes sense, eg. "too many libraries loaded". The assert is still okay, I guess, since we don't expect it to fire during tests; if it does fire, it may indicate a problem in our handle table logic or a wrong assumption about handle equality.

src/hotspot/os/aix/os_aix.cpp line 3133:

> 3131: return nullptr;
> 3132: }
> 3133: // library not still loaded and still place in array, so load library

s/still/yet

src/hotspot/os/aix/os_aix.cpp line 3143:

> 3141: g_handletable[i].devid = libstat.st_dev;
> 3142: g_handletable[i].refcount = 1;
> 3143: }

Error handling: on error, call dlerror and return error string inside the error buffer you should hand in. All other platforms do this too.

src/hotspot/os/aix/os_aix.cpp line 3150:

> 3148: }
> 3149:
> 3150: int os::Aix::dlclose(void* lib) {

can we call lib something better, maybe "handle"?

src/hotspot/os/aix/os_aix.cpp line 3165:

> 3163: // refcount == 0, so we have to ::dlclose() the lib
> 3164: // and delete the entry from the array.
> 3165: res = ::dlclose(lib);

Handle dlclose error. We expect it to work; if it doesn't, it indicates that something is wrong with the handle logic, e.g. an invalid or already closed handle had been handed in. So, assert.

src/hotspot/os/aix/os_aix.hpp line 185:

> 183: // opened a second time, as it is implemented on other platforms.
> 184: static void* dlopen(const char* filename, int Flags);
> 185: static int dlclose(void* lib);

Remove; should not be exported.
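
For readers following the thread, the scheme under review (cache dl handles, bump a per-handle refcount on open, drop it on close, and only really close when it reaches zero) can be sketched roughly as below. This is a minimal illustration under assumptions, not the code in the PR: the names `HandleEntry`, `cached_open` and `cached_close` are hypothetical, the file identity (device id and inode) is passed in by the caller instead of being obtained via `stat64x`, and the real loader is replaced by function pointers so the sketch runs anywhere.

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical sketch; names are illustrative and not taken from os_aix.cpp.
struct HandleEntry {
  void*    handle;    // handle returned by the underlying loader
  uint64_t inode;     // identity of the file on disk
  uint64_t devid;
  int      refcount;  // number of outstanding opens
};

constexpr int kMaxHandles = 1024;
static HandleEntry g_table[kMaxHandles];
static int g_used = 0;
static std::mutex g_table_mutex;

using RawOpen  = void* (*)(const char*);
using RawClose = int (*)(void*);

// Returns the cached handle if (devid, inode) is already loaded; otherwise
// calls raw_open and records the new handle. Returns nullptr on failure.
void* cached_open(const char* path, uint64_t devid, uint64_t inode, RawOpen raw_open) {
  std::lock_guard<std::mutex> guard(g_table_mutex);
  for (int i = 0; i < g_used; i++) {
    if (g_table[i].devid == devid && g_table[i].inode == inode) {
      g_table[i].refcount++;
      return g_table[i].handle;
    }
  }
  if (g_used == kMaxHandles) {
    return nullptr;  // table full; a real implementation would report an error text
  }
  void* h = raw_open(path);
  if (h == nullptr) {
    return nullptr;
  }
  g_table[g_used++] = HandleEntry{h, inode, devid, 1};
  return h;
}

// Returns 0 on success and -1 for an unknown handle. raw_close is only
// called when the last reference is dropped.
int cached_close(void* handle, RawClose raw_close) {
  std::lock_guard<std::mutex> guard(g_table_mutex);
  for (int i = 0; i < g_used; i++) {
    if (g_table[i].handle == handle) {
      if (--g_table[i].refcount > 0) {
        return 0;
      }
      int res = raw_close(handle);
      g_table[i] = g_table[--g_used];  // compact: move last entry into the hole
      return res;
    }
  }
  return -1;
}
```

With this shape, two opens of the same file return the same handle, which is the "handle equality" other platforms give for free; error reporting through an error buffer, as requested in the review, is deliberately left out of the sketch.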
src/hotspot/os/posix/os_posix.cpp line 735: > 733: l_path = ""; > 734: } > 735: int res = AIX_ONLY(os::Aix)::dlclose(lib); Lets do this cleaner, and in the style of hotspot coding elsewhere: - introduce a new function "os::pd_dll_unload(handle, errorbuf, errbuflen)". Add it to os.hpp, but somewhere non-public. The implementations will live in os_aix.cpp, os_bsd.cpp and os_linux.cpp. - make os::Aix::dlclose -> os::pd_dll_unload; the only difference is that you should fill in error buffer with either ::dlerror or, if you have errors in handle table, a text describing that error - on all other posix platforms (os_bsd.cpp + os_linux.cpp), implement a minimal version of os::pd_dll_unload() that calls ::dlunload, and on error calls ::dlerror and copies the string into errbuf - Here, call os::pd_dll_unload instead of ::dlclose/os::aix::dlclose - change the JFR code below to not use ::dlerror but the string returned from the buffer ------------- Changes requested by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16920#pullrequestreview-1762354122 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415568694 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415577023 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415588986 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415577568 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415585089 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415594396 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415625152 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415595301 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415596399 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415597594 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415599081 PR Review Comment: 
https://git.openjdk.org/jdk/pull/16920#discussion_r1415601350 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415607920 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415612828 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415619700 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415625511 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415638462 From stuefe at openjdk.org Tue Dec 5 13:55:54 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 5 Dec 2023 13:55:54 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v2] In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 12:33:26 GMT, Joachim Kern wrote: >> On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. >> >> This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). >> >> We propose a different, cleaner way of handling this: >> >> - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. >> - Cache dl handles; repeated opening of a library should return the cached handle. 
>> - Increase handle-local ref counter on open, Decrease it on close >> - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). >> >> This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > improve handling of nonexisting files src/hotspot/os/aix/os_aix.cpp line 203: > 201: constexpr int max_handletable = 1024; > 202: static int g_handletable_used = 0; > 203: static struct handletableentry g_handletable[max_handletable] = {{0,0,0,0}}; style nits: - we usually write the * behind type, not before var name - `{{0,0}}` -> insert spaces src/hotspot/os/aix/os_aix.cpp line 1159: > 1157: result = ::dlopen(filename, dflags); > 1158: if (result != nullptr) { > 1159: assert(false, "dll_load: Could not stat() file %s, but dlopen() worked; Have to improve stat()", filename); use assert(result != nullptr) and remove condition ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1413843503 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1413846111 From stuefe at openjdk.org Tue Dec 5 13:55:56 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 5 Dec 2023 13:55:56 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v3] In-Reply-To: <7V0zHrWeOjnDyHJuq3DFsb-BvaQvZbwE5zIGyxWvGNE=.48a0fc72-c70d-4e21-891a-5f4714bac830@github.com> References: <7V0zHrWeOjnDyHJuq3DFsb-BvaQvZbwE5zIGyxWvGNE=.48a0fc72-c70d-4e21-891a-5f4714bac830@github.com> Message-ID: <4QEE7t_IctCIU9HwHs8qYxvjAxykfQDx00mYzavj9y0=.be9dc6ea-1dfc-4c7e-aa88-2cdc21a0e486@github.com> On Tue, 5 Dec 2023 13:21:35 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> encapsulate everything in 
os::Aix::dlopen > > src/hotspot/os/aix/os_aix.cpp line 3133: > >> 3131: return nullptr; >> 3132: } >> 3133: // library not still loaded and still place in array, so load library > > s/still/yet No need to be this verbose either, especially since the comment is somewhat misleading. "create entry at end of table" implies that we have a dynamically growing table and allocate new entries. Proposal: "Library not yet loaded; load it, then store its handle in handle table". ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1415605856 From shade at openjdk.org Tue Dec 5 13:57:45 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 5 Dec 2023 13:57:45 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes Message-ID: [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro. The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there. 
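
As a rough illustration of why the two quantities are worth separating, the sketch below keeps a cache line size and a padding size as distinct constants and pads a structure with the latter. It is only a sketch under assumptions: the constant names and the values 64 and 128 are hypothetical and are not the macros or values defined by the patch.

```cpp
#include <cstddef>

// Hypothetical constants for illustration; not the definitions from the patch.
constexpr std::size_t kCacheLineSize = 64;   // assumed hardware cache line size
constexpr std::size_t kPaddingSize   = 128;  // assumed padding, e.g. to cover adjacent-line prefetch

// A field that should not share a padding unit with its neighbours.
struct alignas(kPaddingSize) PaddedCounter {
  long value;
};

struct Counters {
  PaddedCounter producer;
  PaddedCounter consumer;
};

// Padding is driven by kPaddingSize, independent of kCacheLineSize.
static_assert(sizeof(PaddedCounter) == kPaddingSize, "one padding unit per counter");
static_assert(offsetof(Counters, consumer) - offsetof(Counters, producer) >= kPaddingSize,
              "counters are at least one padding unit apart");
static_assert(kPaddingSize % kCacheLineSize == 0, "padding is a whole number of cache lines");
```

Code that needs the actual line size (for example, for prefetch distances) would read the first constant, while layout code would use the second.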
Additional testing: - [x] Large build matrix of server/zero builds - [x] Linux AArch64 server fastdebug, `tier{1,2}` - [x] Linux x86_64 server fastdebug, `tier{1,2}` ------------- Commit messages: - Work Changes: https://git.openjdk.org/jdk/pull/16973/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16973&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8237842 Stats: 98 lines in 25 files changed: 27 ins; 16 del; 55 mod Patch: https://git.openjdk.org/jdk/pull/16973.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16973/head:pull/16973 PR: https://git.openjdk.org/jdk/pull/16973 From mgronlun at openjdk.org Tue Dec 5 14:05:55 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Tue, 5 Dec 2023 14:05:55 GMT Subject: RFR: 8211238: @Deprecated JFR event [v15] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. 
>
> Testing: jdk_jfr, CI 1-6, stress testing
>
> Thanks
> Markus

Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision:

  description and format specifier

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/16931/files
  - new: https://git.openjdk.org/jdk/pull/16931/files/b4766c39..0943b29d

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=14
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=13-14

  Stats: 13 lines in 2 files changed: 5 ins; 4 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/16931.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931

PR: https://git.openjdk.org/jdk/pull/16931

From mli at openjdk.org Tue Dec 5 14:20:40 2023
From: mli at openjdk.org (Hamlin Li)
Date: Tue, 5 Dec 2023 14:20:40 GMT
Subject: RFR: 8319716: RISC-V: Add SHA-2 [v2]
In-Reply-To: <-1ZEb9zsjqsg6L2Rb_teeZePsRwKAxrMGBzjmCUERvk=.a3ea277a-dc10-4e6e-a3f4-4bfe66d0bbf3@github.com>
References: <-1ZEb9zsjqsg6L2Rb_teeZePsRwKAxrMGBzjmCUERvk=.a3ea277a-dc10-4e6e-a3f4-4bfe66d0bbf3@github.com>
Message-ID:

On Mon, 4 Dec 2023 15:48:02 GMT, Robbin Ehn wrote:

>> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3895:
>>
>>> 3893: __ enter();
>>> 3894:
>>> 3895: __ push_reg(saved_regs, sp);
>>
>> Not sure if we need to push and pop `saved_regs `, as t2 is the only register in it, or maybe I miss something?
>
> t2 is used by C2 as general register, see R7 in riscv.ad.
> As this may be inlined directly into the graph IR, i.e. no call to get here, t2 may be a live register.
> saved_regs only contains t2 so there is just one spill and one restore.
>
> No?

I see, thanks for explanation!
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1415687071 From epeter at openjdk.org Tue Dec 5 14:25:50 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 5 Dec 2023 14:25:50 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v7] In-Reply-To: References: Message-ID: > I'm making sure that `allocate_bci_to_data` is only called when holding the `extra_data_lock`, so that no concurrent calls of it can ever occur. > > Testing: tier1-3 and stress. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: move a lock to earlier, to have order right with tty lock ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/465c3815..e0fc8d1b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=05-06 Stats: 17 lines in 1 file changed: 8 ins; 5 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From aph at openjdk.org Tue Dec 5 14:27:57 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 5 Dec 2023 14:27:57 GMT Subject: RFR: JDK-8320892: AArch64: Restore FPU control state after JNI [v4] In-Reply-To: <-Jv5Xvre3lonydwQ5uzYN3QB8V0VIuORIhM1RtIdW5g=.06167df9-7268-4945-8e18-a04d19ee97e1@github.com> References: <-Jv5Xvre3lonydwQ5uzYN3QB8V0VIuORIhM1RtIdW5g=.06167df9-7268-4945-8e18-a04d19ee97e1@github.com> Message-ID: > Some buggy libraries corrupt the floating-point control register. Provide something similar to the x86 RestoreMXCSROnJNICalls. > > I realize that using the x86ish name "RestoreMXCSROnJNICalls" might be a little controversial, but it is a _global_ flag, not a CPU-specific one. And it's clearly intended for this purpose. 
It might have been better if that flag had been given a better name twentyish years ago, but we can't change it now. Andrew Haley has updated the pull request incrementally with one additional commit since the last revision: Clear AH:FIZ ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16851/files - new: https://git.openjdk.org/jdk/pull/16851/files/02a7aaa0..d2afa417 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16851&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16851&range=02-03 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16851.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16851/head:pull/16851 PR: https://git.openjdk.org/jdk/pull/16851 From duke at openjdk.org Tue Dec 5 14:30:49 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Tue, 5 Dec 2023 14:30:49 GMT Subject: RFR: 8317831: compiler/codecache/CheckLargePages.java fails on OL 8.8 with unexpected memory string [v2] In-Reply-To: <59K9mUFzngrq77p0DVcyuoHADtmPN1SB2fZJToQnxJk=.394c23d2-f87b-4775-9416-3d384fc1817c@github.com> References: <59K9mUFzngrq77p0DVcyuoHADtmPN1SB2fZJToQnxJk=.394c23d2-f87b-4775-9416-3d384fc1817c@github.com> Message-ID: <14MnNbVh0nMXZJF8Pm8EwjABcX0cn_ZcXBylyIMbcJA=.b51aefd1-3399-4927-980b-fd960c61e73c@github.com> > Test CheckLargePages was broken by the previous changes: > > [JDK-8310233](https://bugs.openjdk.org/browse/JDK-8310233) changes the pagesize logs from > > Usable page sizes: 4k, 1G > > to > > Large page support enabled. Usable page sizes: 4k, 1G. Default large page size: 1G. > > > [JDK-8261894](https://bugs.openjdk.org/browse/JDK-8261894) removes `UseHugeTLBFS`. It was also removed from `os::can_execute_large_page_memory`, and `CodeCache::page_size` cannot use huge pages anymore. > > This change includes: > - The regular expression in CheckLargePages is updated to capture only the page sizes. 
> - The static huge page will be fixed by [JDK-8319795](https://bugs.openjdk.org/browse/JDK-8319795). Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: Reserve only regular expression ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16962/files - new: https://git.openjdk.org/jdk/pull/16962/files/580a2f93..54c763ee Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16962&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16962&range=00-01 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16962.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16962/head:pull/16962 PR: https://git.openjdk.org/jdk/pull/16962 From eastigeevich at openjdk.org Tue Dec 5 15:00:36 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 5 Dec 2023 15:00:36 GMT Subject: RFR: 8317831: compiler/codecache/CheckLargePages.java fails on OL 8.8 with unexpected memory string [v2] In-Reply-To: <14MnNbVh0nMXZJF8Pm8EwjABcX0cn_ZcXBylyIMbcJA=.b51aefd1-3399-4927-980b-fd960c61e73c@github.com> References: <59K9mUFzngrq77p0DVcyuoHADtmPN1SB2fZJToQnxJk=.394c23d2-f87b-4775-9416-3d384fc1817c@github.com> <14MnNbVh0nMXZJF8Pm8EwjABcX0cn_ZcXBylyIMbcJA=.b51aefd1-3399-4927-980b-fd960c61e73c@github.com> Message-ID: On Tue, 5 Dec 2023 14:30:49 GMT, Yi-Fan Tsai wrote: >> Test CheckLargePages was broken by the previous changes: >> >> [JDK-8310233](https://bugs.openjdk.org/browse/JDK-8310233) changes the pagesize logs from >> >> Usable page sizes: 4k, 1G >> >> to >> >> Large page support enabled. Usable page sizes: 4k, 1G. Default large page size: 1G. >> >> >> [JDK-8261894](https://bugs.openjdk.org/browse/JDK-8261894) removes `UseHugeTLBFS`. It was also removed from `os::can_execute_large_page_memory`, and `CodeCache::page_size` cannot use huge pages anymore. 
>> >> This change includes: >> - The regular expression in CheckLargePages is updated to capture only the page sizes. >> - The static huge page will be fixed by [JDK-8319795](https://bugs.openjdk.org/browse/JDK-8319795). > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Reserve only regular expression I tested my PR #16582 with the fixed test. The test passed as expected. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16962#issuecomment-1840961527 From eastigeevich at openjdk.org Tue Dec 5 15:04:36 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 5 Dec 2023 15:04:36 GMT Subject: RFR: 8317831: compiler/codecache/CheckLargePages.java fails on OL 8.8 with unexpected memory string [v2] In-Reply-To: <14MnNbVh0nMXZJF8Pm8EwjABcX0cn_ZcXBylyIMbcJA=.b51aefd1-3399-4927-980b-fd960c61e73c@github.com> References: <59K9mUFzngrq77p0DVcyuoHADtmPN1SB2fZJToQnxJk=.394c23d2-f87b-4775-9416-3d384fc1817c@github.com> <14MnNbVh0nMXZJF8Pm8EwjABcX0cn_ZcXBylyIMbcJA=.b51aefd1-3399-4927-980b-fd960c61e73c@github.com> Message-ID: On Tue, 5 Dec 2023 14:30:49 GMT, Yi-Fan Tsai wrote: >> Test CheckLargePages was broken by the previous changes: >> >> [JDK-8310233](https://bugs.openjdk.org/browse/JDK-8310233) changes the pagesize logs from >> >> Usable page sizes: 4k, 1G >> >> to >> >> Large page support enabled. Usable page sizes: 4k, 1G. Default large page size: 1G. >> >> >> [JDK-8261894](https://bugs.openjdk.org/browse/JDK-8261894) removes `UseHugeTLBFS`. It was also removed from `os::can_execute_large_page_memory`, and `CodeCache::page_size` cannot use huge pages anymore. >> >> This change includes: >> - The regular expression in CheckLargePages is updated to capture only the page sizes. >> - The static huge page will be fixed by [JDK-8319795](https://bugs.openjdk.org/browse/JDK-8319795). 
> > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Reserve only regular expression Hi @tstuefe, Could you please have a look? I'd like this to go before my #16582 because I'll be adding new test cases to `CheckLargePages.java`. I'll update `test/hotspot/jtreg/ProblemList.txt` in my #16582. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16962#issuecomment-1840970157 From coleenp at openjdk.org Tue Dec 5 15:50:59 2023 From: coleenp at openjdk.org (Coleen Phillimore) Date: Tue, 5 Dec 2023 15:50:59 GMT Subject: RFR: 8317809: Insertion of free code blobs into code cache can be very slow during class unloading [v6] In-Reply-To: References: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> Message-ID: On Mon, 4 Dec 2023 12:39:59 GMT, Thomas Schatzl wrote: >> Insert code blobs in a sorted fashion to exploit the finger-optimization when adding, making this procedure O(n) instead of O(n^2) >> >> Introduces a globally available ClassUnloadingContext that contains common methods pertaining to class and code unloading. GCs may use it to efficiently manage unlinked class loader datas and nmethods to allow use of common methods (unlink/merge). >> >> The steps typically are registering a new to be unlinked CLD/nmethod, and then purge its memory later. STW collectors perform this work in one big chunk taking the CodeCache_lock, for the entire duration, while concurrent collectors lock/unlock for every insertion to allow for concurrent users for the lock to progress. >> >> Some care has been taken to stay consistent with an "unloading = unlinking + purge" scheme; however particularly the existing CLD handling API (still) mixes unlinking and purging in its CLD::unload() call. To simplify this change that is mostly geared towards separating nmethod unlinking from purging, to make code blob freeing O(n) instead of O(n^2). 
>> >> Upcoming changes will >> * separate nmethod unregistering from nmethod purging to allow doing that in bulk (for the STW collectors); that can significantly reduce code purging time for the STW collectors. >> * better name the second stage of unlinking (called "cleaning" throughout, e.g. the work done in `G1CollectedHeap::complete_cleaning`) >> * untangle CLD unlinking and what's called "cleaning" now to allow moving more stuff into the second unlinking stage for better parallelism >> * G1: move some significant tasks from the remark pause to concurrent (unregistering nmethods, freeing code blobs and cld/metaspace purging) >> * Maybe move Serial/Parallel GC metaspace purging closer to other unlinking/purging code to keep things local and allow easier logging. >> >> Please also first looking into the (small) PR this depends on. >> >> The crash on linux-x86 is fixed by PR#16766 which I split out for quicker reviews. >> >> Testing: tier1-7 >> >> Thanks, >> Thomas > > Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: > > ayang review: move class unloading outside of weak_refs_work src/hotspot/share/gc/shared/classUnloadingContext.cpp line 91: > 89: cld->classes_do(f); > 90: } > 91: } I don't understand why CLDG specific methods were moved here. They should be unaware of nmethod purging. and these 4 methods don't have any nmethod purging in them either and are specific to the CLDG implementation. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16759#discussion_r1415844874 From mgronlun at openjdk.org Tue Dec 5 16:05:58 2023 From: mgronlun at openjdk.org (Markus =?UTF-8?B?R3LDtm5sdW5k?=) Date: Tue, 5 Dec 2023 16:05:58 GMT Subject: RFR: 8211238: @Deprecated JFR event [v16] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. 
>
> Testing: jdk_jfr, CI 1-6, stress testing
>
> Thanks
> Markus

Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision:

  update event description

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/16931/files
  - new: https://git.openjdk.org/jdk/pull/16931/files/0943b29d..9f6bc68a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=15
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=14-15

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/16931.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931

PR: https://git.openjdk.org/jdk/pull/16931

From jbachorik at openjdk.org Tue Dec 5 16:11:39 2023
From: jbachorik at openjdk.org (Jaroslav Bachorik)
Date: Tue, 5 Dec 2023 16:11:39 GMT
Subject: RFR: 8211238: @Deprecated JFR event [v15]
In-Reply-To:
References:
Message-ID:

On Tue, 5 Dec 2023 14:05:55 GMT, Markus Grönlund wrote:

>> Greetings,
>>
>> please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK.
>>
>> Testing: jdk_jfr, CI 1-6, stress testing
>>
>> Thanks
>> Markus
>
> Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision:
>
>   description and format specifier

src/hotspot/share/jfr/recorder/checkpoint/jfrCheckpointManager.cpp line 597:

> 595: // committed, because the JFR system is yet to be started.
> 596: // Therefore, the writer is cancelled before its destructor is run,
> 597: // to avoid writing unnecessary inforamation into the checkpoint system.

Suggestion:

    // to avoid writing unnecessary information into the checkpoint system.
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16931#discussion_r1415881909

From eastigeevich at openjdk.org Tue Dec 5 16:17:41 2023
From: eastigeevich at openjdk.org (Evgeny Astigeevich)
Date: Tue, 5 Dec 2023 16:17:41 GMT
Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v5]
In-Reply-To:
References: <81dXSHvLQMGj3s1BcBs8fmJUEoJpaU-5wBRSIjnztMM=.d53f8a2f-8353-49ec-8a9b-695b32f03d20@github.com>
Message-ID:

On Fri, 1 Dec 2023 21:25:19 GMT, Chris Plummer wrote:

>> Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Apply man changes
>
> src/hotspot/share/code/codeCache.cpp line 1809:
>
>> 1807: }
>> 1808:
>> 1809: void CodeCache::write_perf_map(const char* filename) {
>
> Why not have a `filename == nullptr` indicate that the default should be used. Then you don't need CodeCache::DefaultPerfMapFile. You can just have a private `CodeCache::defaultPerfmapFileName()` method.

Hi Chris,

The current design of `write_perf_map` provides a clean and explicit interface. The purpose of the function is evident from its signature: to write a perf map into a specified file. This explicitness makes the code more readable and self-documenting. It reduces the need for developers to read the implementation to figure out what `nullptr` means and where the filename will be taken from. It also serves as a contract between the caller and the function itself. By explicitly requiring a filename, the function sets clear expectations for the caller.

I think `CodeCache::write_default_perf_map` hiding the filename of the default perf map might not be a good idea, because it makes it impossible to get the filename used in it. I prefer either a method `CodeCache::defaultPerfmapFileName()` or a class `CodeCache::DefaultPerfmapFileName`. The class is simpler to implement than the method (like it was earlier).
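
The two API shapes being debated can be put side by side in a small sketch. Everything here is hypothetical and only meant to make the trade-off concrete: the function names are invented, the functions return the resolved file name instead of writing a file, and the default name `/tmp/perf-<pid>.map` is an assumption that is not stated in this thread.

```cpp
#include <string>

// Hypothetical stand-in for the default name; the real default is not shown in this thread.
std::string default_perf_map_file_name(int pid) {
  return "/tmp/perf-" + std::to_string(pid) + ".map";
}

// Option A (explicit): the caller always supplies the file name, and can ask
// for the default name separately if it wants it.
std::string resolve_explicit(const std::string& filename) {
  return filename;
}

// Option B (implicit): a null file name selects the default inside the callee.
std::string resolve_with_null_default(const char* filename, int pid) {
  return filename != nullptr ? std::string(filename) : default_perf_map_file_name(pid);
}
```

Under option A the call site reads `resolve_explicit(default_perf_map_file_name(pid))`, so the default is visible to the caller; under option B the same result comes from `resolve_with_null_default(nullptr, pid)`, which is shorter but hides where the name comes from.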
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15871#discussion_r1415892122

From asmehra at openjdk.org Tue Dec 5 16:28:37 2023
From: asmehra at openjdk.org (Ashutosh Mehra)
Date: Tue, 5 Dec 2023 16:28:37 GMT
Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v2]
In-Reply-To:
References:
Message-ID: <9G4hiTbrmRil3TIN5dX6HBAdgxzR0da-M9e5iSN-Wm0=.c61e5c75-0db2-4a1f-b7eb-4e0ff699b1ef@github.com>

On Tue, 5 Dec 2023 10:36:29 GMT, Thomas Stuefe wrote:

>> Hi Thomas,
>>
>> I've taken a first pass through this and it seems okay in principle. A number of initial comments/suggestions below.
>>
>> Thanks.
>
>> Hi Thomas,
>>
>> I've taken a first pass through this and it seems okay in principle. A number of initial comments/suggestions below.
>>
>> Thanks.
>
> Thanks a lot, David!
>
> Makes me happy to see this finds acceptance at least in principle.
>
> I changed:
> - get_rss to get_RSS
> - removed the "0 means off" text, since I assume passing 0 would be likely a user error. Instead, I also added an error check for percentage = 0.0.
> - added a warning if the OS does not support this feature

Hi @tstuefe, this looks like a useful feature and seems to provide a way to deal with the OOM killer in containers. If the user has set the container memory limit to 256MB, then the RssLimit can be set to around 200MB. This would let the JVM catch the OOM before it is handled by the kernel. But I have one concern. The effectiveness of this solution really depends on how frequently the check is done. If there is a sudden memory spike, it should, ideally, last longer than `RssLimitCheckInterval` for RssWatcher to take action. Flipping it the other way, we can say RssWatcher can catch memory spikes that last longer than `RssLimitCheckInterval`. Even then, it can catch the spike only as long as it is less than the container limit. This raises the question of determining the effective value of `RssLimit` and `RssLimitCheckInterval`.
For instance, compilations can induce memory spike which may last for few hundred milliseconds at the most, which is much lesser than the default value of 5 secs for `RssLimitCheckInterval`. What are your thoughts on this? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16938#issuecomment-1841141950 From ddong at openjdk.org Tue Dec 5 16:36:50 2023 From: ddong at openjdk.org (Denghui Dong) Date: Tue, 5 Dec 2023 16:36:50 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC Message-ID: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> Hi, Could I have a review of this patch? In the current implementation, HeapDumpBeforeFullGC/AfterFullGC will generate dumps for every FGC, increasing the risk of disk full. This patch introduces a new option 'FullGCHeapDumpLimit' to limit the number of dumps triggered by HeapDumpBeforeFullGC/AfterFullGC to enhance production-friendliness. Best, Denghui ------------- Commit messages: - 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC Changes: https://git.openjdk.org/jdk/pull/16976/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16976&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321404 Stats: 53 lines in 3 files changed: 51 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16976.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16976/head:pull/16976 PR: https://git.openjdk.org/jdk/pull/16976 From cslucas at openjdk.org Tue Dec 5 16:57:59 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 5 Dec 2023 16:57:59 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v4] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 03:07:33 GMT, Hao Sun wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains five commits: >> >> - Fix merge >> - Catch up with master branch. >> >> Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Some inst_mark fixes; Catch up with master. >> - Catch up with changes on master >> - Reuse same C2_MacroAssembler object to emit instructions. > > src/hotspot/share/jvmci/jvmciCodeInstaller.cpp line 1246: > >> 1244: if (_next_call_type == INVOKESTATIC || _next_call_type == INVOKESPECIAL) { >> 1245: // Need a static call stub for transitions from compiled to interpreted. >> 1246: C2_MacroAssembler masm(&buffer); > > Hi, I encountered one build failure: JDK build **without C2** fails on Linux/AArch64. > > The configure I used > > > --with-debug-level=release --with-jvm-features=-compiler2 --disable-precompiled-headers > > > The error log > > > === Output from failing command(s) repeated here === > * For target hotspot_variant-server_libjvm_gtest_objs_BUILD_GTEST_LIBJVM_link: > /usr/bin/ld: /tmp/build-release/hotspot/variant-server/libjvm/objs/jvmciCodeInstaller.o: in function `C2_MacroAssembler::C2_MacroAssembler(CodeBuffer*)': > make/hotspot/src/hotspot/share/opto/c2_MacroAssembler.hpp:38: undefined reference to `vtable for C2_MacroAssembler' > /usr/bin/ld: make/hotspot/src/hotspot/share/opto/c2_MacroAssembler.hpp:38: undefined reference to `vtable for C2_MacroAssembler' > collect2: error: ld returned 1 exit status > * For target hotspot_variant-server_libjvm_objs_BUILD_LIBJVM_link: > /usr/bin/ld: /tmp/build-release/hotspot/variant-server/libjvm/objs/jvmciCodeInstaller.o: in function `C2_MacroAssembler::C2_MacroAssembler(CodeBuffer*)': > make/hotspot/src/hotspot/share/opto/c2_MacroAssembler.hpp:38: undefined reference to `vtable for C2_MacroAssembler' > /usr/bin/ld: make/hotspot/src/hotspot/share/opto/c2_MacroAssembler.hpp:38: undefined reference to `vtable for C2_MacroAssembler' > collect2: error: ld returned 1 exit status > > * All command lines available in 
/tmp/build-release/make-support/failure-logs. > === End of repeated output === > > > I suggest making the following change: > > Suggestion: > > MacroAssembler masm(&buffer); Thank you for catch @shqking ! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16484#discussion_r1415982687 From cslucas at openjdk.org Tue Dec 5 16:57:55 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Tue, 5 Dec 2023 16:57:55 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v5] In-Reply-To: References: Message-ID: > # Description > > Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. > > Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. > > # Help Needed for Testing > > I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. > > # Testing status > > ## tier1 > > | | Win | Mac | Linux | > |----------|---------|---------|---------| > | ARM64 | | | | > | ARM32 | | | | > | x86 | | | | > | x64 | | | | > | PPC64 | | | | > | S390x | | | | > | RiscV | | | | Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: Fix build, copyright dates, m4 files. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/16484/files - new: https://git.openjdk.org/jdk/pull/16484/files/afe48fe7..d950806f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16484&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16484&range=03-04 Stats: 18 lines in 12 files changed: 0 ins; 3 del; 15 mod Patch: https://git.openjdk.org/jdk/pull/16484.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16484/head:pull/16484 PR: https://git.openjdk.org/jdk/pull/16484 From jbachorik at openjdk.org Tue Dec 5 17:19:40 2023 From: jbachorik at openjdk.org (Jaroslav Bachorik) Date: Tue, 5 Dec 2023 17:19:40 GMT Subject: RFR: 8211238: @Deprecated JFR event [v16] In-Reply-To: References: Message-ID: <_u_ldbTUnl3_Q8eKPU9K1glBV_XPQdW3l3X4tFHssfY=.e4e6494b-f301-4ba6-bee5-677232221429@github.com> On Tue, 5 Dec 2023 16:05:58 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. >> >> Testing: jdk_jfr, CI 1-6, stress testing >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > update event description src/jdk.jfr/share/classes/jdk/jfr/internal/Level.java line 41: > 39: * This settings is only supported for JVM events. > 40: * > 41: * @since 21 Should this be `@since 22` ?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16931#discussion_r1416023074 From jbachorik at openjdk.org Tue Dec 5 17:22:42 2023 From: jbachorik at openjdk.org (Jaroslav Bachorik) Date: Tue, 5 Dec 2023 17:22:42 GMT Subject: RFR: 8211238: @Deprecated JFR event [v16] In-Reply-To: References: Message-ID: <5YLPfwW7avOCq9kOGRkobZgkaorv6MoRRxWI3PoIMdI=.f74499b3-51d5-4a70-a235-ae63a04ab370@github.com> On Tue, 5 Dec 2023 16:05:58 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. >> >> Testing: jdk_jfr, CI 1-6, stress testing >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > update event description src/jdk.jfr/share/classes/jdk/jfr/internal/test/DeprecatedMethods.java line 25: > 23: package jdk.jfr.internal.test; > 24: > 25: public class DeprecatedMethods { Is this class (and also `DeprecatedThing`) supposed to be in the source tree rather than the test tree?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16931#discussion_r1416028262 From mgronlun at openjdk.org Tue Dec 5 17:42:41 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 5 Dec 2023 17:42:41 GMT Subject: RFR: 8211238: @Deprecated JFR event [v16] In-Reply-To: <_u_ldbTUnl3_Q8eKPU9K1glBV_XPQdW3l3X4tFHssfY=.e4e6494b-f301-4ba6-bee5-677232221429@github.com> References: <_u_ldbTUnl3_Q8eKPU9K1glBV_XPQdW3l3X4tFHssfY=.e4e6494b-f301-4ba6-bee5-677232221429@github.com> Message-ID: <-1Pk3qF-Z8v-Cj8tL6sRBw2iynjrOjGGYRNuSzKo2QQ=.abf6862b-47eb-44bf-8583-413a4eb9becd@github.com> On Tue, 5 Dec 2023 17:16:50 GMT, Jaroslav Bachorik wrote: >> Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: >> >> update event description > > src/jdk.jfr/share/classes/jdk/jfr/internal/Level.java line 41: > >> 39: * This settings is only supported for JVM events. >> 40: * >> 41: * @since 21 > > Should this be `@since 22` ? You are right. Thank you for spotting. > src/jdk.jfr/share/classes/jdk/jfr/internal/test/DeprecatedMethods.java line 25: > >> 23: package jdk.jfr.internal.test; >> 24: >> 25: public class DeprecatedMethods { > > Is this class (and also `DeprecatedThing`) supposed to be in the source tree rather than the test tree?
They have to go into the source tree because we only report deprecated methods located in the JDK :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16931#discussion_r1416053762 PR Review Comment: https://git.openjdk.org/jdk/pull/16931#discussion_r1416054856 From sgehwolf at openjdk.org Tue Dec 5 17:45:36 2023 From: sgehwolf at openjdk.org (Severin Gehwolf) Date: Tue, 5 Dec 2023 17:45:36 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v2] In-Reply-To: <5TQwGxIBEccxjD7OvdYnyrTEGlD396zHY4BaYnWMbvU=.d6ccf718-b45a-4948-a3c5-756b161a9925@github.com> References: <5TQwGxIBEccxjD7OvdYnyrTEGlD396zHY4BaYnWMbvU=.d6ccf718-b45a-4948-a3c5-756b161a9925@github.com> Message-ID: On Tue, 5 Dec 2023 09:29:39 GMT, Thomas Stuefe wrote: > This timeout should be a lot larger and possibly be configurable. Have not opened a JBS issue yet since I wanted feedback first from @jerboaa . There was some discussion around making the cache timeout configurable at some point but wasn't done, because it was not clear it was needed. See https://bugs.openjdk.org/browse/JDK-8296125. If there are more consumers and periodic polls of `os::physical_memory()` then it becomes more compelling to make this tunable (I haven't really looked at the patch yet). At the time, using `-XX:-UseDynamicNumberOfCompilerThreads` was sufficient. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1416058460 From jbachorik at openjdk.org Tue Dec 5 17:46:40 2023 From: jbachorik at openjdk.org (Jaroslav Bachorik) Date: Tue, 5 Dec 2023 17:46:40 GMT Subject: RFR: 8211238: @Deprecated JFR event [v16] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 16:05:58 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK.
>> >> Testing: jdk_jfr, CI 1-6, stress testing >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > update event description src/hotspot/share/jfr/support/jfrNativeLibraryLoadEvent.cpp line 37: > 35: if (_start_time != nullptr) { > 36: delete _start_time; > 37: } This should not be necessary. Delete should not fail on `nullptr`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16931#discussion_r1416060305 From mgronlun at openjdk.org Tue Dec 5 17:49:43 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 5 Dec 2023 17:49:43 GMT Subject: RFR: 8211238: @Deprecated JFR event [v16] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 17:43:41 GMT, Jaroslav Bachorik wrote: >> Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: >> >> update event description > > src/hotspot/share/jfr/support/jfrNativeLibraryLoadEvent.cpp line 37: > >> 35: if (_start_time != nullptr) { >> 36: delete _start_time; >> 37: } > > This should not be necessary. Delete should not fail on `nullptr`. start_time is a JfrCHeapObj, which is an allocator that hooks allocations and deallocations. With a direct delete, it will account for the size of the object even though its null. It should probably be fixed in the allocator, but I needed this to get the accounting back on track.
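Editorial sketch of the allocator-side fix mentioned in the reply above, under stated assumptions: `AccountedObj`, `StartTime` and `accounted_after_demo` are hypothetical names invented for illustration and are not the actual `JfrCHeapObj` code. The idea is that a class-specific sized `operator delete` may be invoked for a null operand (the C++ standard leaves that unspecified), so a null check inside the deallocation function would keep the byte accounting balanced without a guard at every call site.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for an accounting allocator base class; this is
// NOT the real JfrCHeapObj, only an illustration of the pattern.
class AccountedObj {
 public:
  static std::size_t live_bytes;  // bytes currently accounted as allocated

  static void* operator new(std::size_t size) {
    void* p = std::malloc(size);
    if (p == nullptr) std::abort();  // keep the sketch simple: no exceptions
    live_bytes += size;
    return p;
  }

  // Sized class-specific deallocation function. For a null operand it is
  // unspecified whether this function is called at all, so the null check
  // keeps the accounting correct in either case.
  static void operator delete(void* p, std::size_t size) {
    if (p == nullptr) return;  // nothing was allocated, nothing to account
    live_bytes -= size;
    std::free(p);
  }
};

std::size_t AccountedObj::live_bytes = 0;

// Hypothetical payload type standing in for the `_start_time` object.
struct StartTime : AccountedObj {
  long ticks = 0;
};

// Deletes a null pointer, then performs one balanced new/delete pair, and
// returns the accounted byte count afterwards; with the guard it is 0.
std::size_t accounted_after_demo() {
  StartTime* none = nullptr;
  delete none;  // safe: the guard ignores null
  StartTime* t = new StartTime();
  assert(AccountedObj::live_bytes == sizeof(StartTime));
  delete t;
  return AccountedObj::live_bytes;
}
```

With such a guard in the allocator, a call site could presumably use a plain `delete _start_time;` again; whether the real allocator can be changed this way is left to the follow-up mentioned in the thread.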
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16931#discussion_r1416064517 From jbachorik at openjdk.org Tue Dec 5 18:09:37 2023 From: jbachorik at openjdk.org (Jaroslav Bachorik) Date: Tue, 5 Dec 2023 18:09:37 GMT Subject: RFR: 8211238: @Deprecated JFR event [v16] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 17:46:58 GMT, Markus Grönlund wrote: >> src/hotspot/share/jfr/support/jfrNativeLibraryLoadEvent.cpp line 37: >> >>> 35: if (_start_time != nullptr) { >>> 36: delete _start_time; >>> 37: } >> >> This should not be necessary. Delete should not fail on `nullptr`. > > start_time is a JfrCHeapObj, which is an allocator that hooks allocations and deallocations. With a direct delete, it will account for the size of the object even though its null. It should probably be fixed in the allocator, but I needed this to get the accounting back on track. Thanks for the explanation. I got a bit confused ... >> src/jdk.jfr/share/classes/jdk/jfr/internal/test/DeprecatedMethods.java line 25: >> >>> 23: package jdk.jfr.internal.test; >>> 24: >>> 25: public class DeprecatedMethods { >> >> Is this class (and also `DeprecatedThing`) supposed to be in the source tree rather than the test tree? > > They have to go into the source tree because we only report deprecated methods located in the JDK :) Uff, magic :) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16931#discussion_r1416089579 PR Review Comment: https://git.openjdk.org/jdk/pull/16931#discussion_r1416088493 From mgronlun at openjdk.org Tue Dec 5 18:12:43 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Tue, 5 Dec 2023 18:12:43 GMT Subject: RFR: 8211238: @Deprecated JFR event [v16] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 18:06:32 GMT, Jaroslav Bachorik wrote: >> start_time is a JfrCHeapObj, which is an allocator that hooks allocations and deallocations.
With a direct delete, it will account for the size of the object even though its null. It should probably be fixed in the allocator, but I needed this to get the accounting back on track. > > Thanks for the explanation. I got a bit confused ... Thanks for noticing; I just checked that the allocator handles nullptr poorly. It should be easy to fix, but I will do it separately. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16931#discussion_r1416093433 From tschatzl at openjdk.org Tue Dec 5 18:49:52 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 5 Dec 2023 18:49:52 GMT Subject: RFR: 8317809: Insertion of free code blobs into code cache can be very slow during class unloading [v6] In-Reply-To: References: <_dcFF70_w7IXSjb6w-HuHCBkPyS3a6NlzejtqdfdYnM=.74e0f9df-eca3-49b0-be68-2d5824c16003@github.com> Message-ID: On Tue, 5 Dec 2023 15:47:57 GMT, Coleen Phillimore wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> ayang review: move class unloading outside of weak_refs_work > > src/hotspot/share/gc/shared/classUnloadingContext.cpp line 91: > >> 89: cld->classes_do(f); >> 90: } >> 91: } > > I don't understand why CLDG specific methods were moved here. They should be unaware of nmethod purging. and these 4 methods don't have any nmethod purging in them either and are specific to the CLDG implementation. The idea is to have GC take control how the unloading CLDs are stored/which data structure it is going to use to manage them to ultimately allow more control about class unloading for parallelization. Which on the one hand makes pauses shorter (for stw collectors), and on the other hand decreases the time the CLDG_lock is held (not sure it is nice that the concurrent collectors currently may hold that one for ~100ms in my test...). I believe having the linked list of unloading CLDs embedded in the CLDs for use by the GC not only seems wrong (i.e. 
it's a GC data structure located in runtime code) but is also very limiting (need to have one for all, fixed singly linked list). This change moves knowledge of how unloading CLDs are managed to GC area - runtime code just tells GC that a particular CLD is unloading. (Currently the `ClassUnloadingContext` also calls the `unload` method during registration to keep current functionality, but the plan is to separate the step of registration and actual unloading to allow custom handling of the second part; the registering, although it's still walking a singly linked list, is comparatively fast). These four methods provide a thin abstraction over the CLDs that are unloading (that runtime doesn't need and should not worry about imo). With that in place it is possible to slice the actual unloading work into phases according to dependencies (depending on GC if desired), potentially overlapping with other existing phases in collectors already allowing that (e.g. the parallel code unloading, but that is only an implementation detail to reduce overall parallel phases), or even moving some of that work sometime else (the `CLD::unload()` method unfortunately currently may do some memory freeing too). However most time is spent in notifying various components which can be parallelized (at least parallelize the different types of notifications). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16759#discussion_r1416139587 From jjoo at openjdk.org Tue Dec 5 19:47:02 2023 From: jjoo at openjdk.org (Jonathan Joo) Date: Tue, 5 Dec 2023 19:47:02 GMT Subject: Integrated: 8315149: Add hsperf counters for CPU time of internal GC threads In-Reply-To: References: Message-ID: On Mon, 31 Jul 2023 01:50:07 GMT, Jonathan Joo wrote: > 8315149: Add hsperf counters for CPU time of internal GC threads This pull request has now been integrated. 
Changeset: 9e570105 Author: Jonathan Joo Committer: Man Cao URL: https://git.openjdk.org/jdk/commit/9e570105c30a6e462d08931e2010cef9cd5a6031 Stats: 449 lines in 19 files changed: 444 ins; 4 del; 1 mod 8315149: Add hsperf counters for CPU time of internal GC threads Co-authored-by: Man Cao Co-authored-by: Stefan Johansson Reviewed-by: simonis, manc, sjohanss ------------- PR: https://git.openjdk.org/jdk/pull/15082 From cjplummer at openjdk.org Tue Dec 5 20:15:42 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Tue, 5 Dec 2023 20:15:42 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v5] In-Reply-To: References: <81dXSHvLQMGj3s1BcBs8fmJUEoJpaU-5wBRSIjnztMM=.d53f8a2f-8353-49ec-8a9b-695b32f03d20@github.com> Message-ID: On Tue, 5 Dec 2023 16:14:31 GMT, Evgeny Astigeevich wrote: >> src/hotspot/share/code/codeCache.cpp line 1809: >> >>> 1807: } >>> 1808: >>> 1809: void CodeCache::write_perf_map(const char* filename) { >> >> Why not have a `filename == nullptr` indicate that the default should be used. Then you don't need CodeCache::DefaultPerfMapFile. You can just have a private `CodeCache::defaultPerfmapFileName()` method. > > Hi Chris, > The current design of `write_perf_map` provides a clean and explicit interface. The purpose of the function is evident from its signature: to write a perf map into a specified file. This explicitness makes the code more readable and self-documenting. It reduces the need for developers to go to the implementation to figure out: what is the meaning of `nullptr`; where a filename will be taken from. It also serves as a contract between the caller and the function itself. By explicitly requiring a filename, the function sets clear expectations for the caller. > > I think `CodeCache::write_default_perf_map` hiding the filename of the default perf map might not be a good idea because it makes impossible to get the filename used in it. 
I prefer either method `CodeCache::defaultPerfmapFileName()` or class `CodeCache::DefaultPerfmapFileName`. The class is simpler to implement than the method (like it was earlier). The default filename was already "hidden" before these changes, so at the very least things are not being made any worse, but I don't see why any users `write_perf_map` would ever need the default filename. I just felt that adding and exporting a class whose only purpose is to provide the default name seemed like unnecessary overkill. I'm not so sure having a public CodeCache::defaultPerfmapFileName() API and two `write_perf_map` APIs isn't overkill also. There is nothing wrong with a null filename argument signally to use some default name. You can also have the filename arg default to `nullptr`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15871#discussion_r1416228456 From matsaave at openjdk.org Tue Dec 5 20:30:34 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 5 Dec 2023 20:30:34 GMT Subject: RFR: 8320935: Move CDS config initialization code to cdsConfig.cpp [v2] In-Reply-To: References: Message-ID: On Sat, 2 Dec 2023 00:38:58 GMT, Ioi Lam wrote: >> This is a simple clean up that moves the code for initializing the CDS config states from arguments.cpp to cdsConfig.cpp >> >> I renamed a few functions, but otherwise the code is unchanged. >> >> - `get_default_shared_archive_path()` -> `default_archive_path()` >> - `GetSharedArchivePath()` -> `static_archive_path()` >> - `GetSharedDynamicArchivePath()` -> `dynamic_archive_path()` >> >> There's also less `#if INCLUDE_CDS` since the entire cdsConfig.cpp file is compiled only if CDS is enabled. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > fixed indentation Thanks for addressing my comments. Approved! ------------- Marked as reviewed by matsaave (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/16868#pullrequestreview-1766049580 From eastigeevich at openjdk.org Tue Dec 5 20:57:33 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 5 Dec 2023 20:57:33 GMT Subject: RFR: 8317831: compiler/codecache/CheckLargePages.java fails on OL 8.8 with unexpected memory string [v2] In-Reply-To: <14MnNbVh0nMXZJF8Pm8EwjABcX0cn_ZcXBylyIMbcJA=.b51aefd1-3399-4927-980b-fd960c61e73c@github.com> References: <59K9mUFzngrq77p0DVcyuoHADtmPN1SB2fZJToQnxJk=.394c23d2-f87b-4775-9416-3d384fc1817c@github.com> <14MnNbVh0nMXZJF8Pm8EwjABcX0cn_ZcXBylyIMbcJA=.b51aefd1-3399-4927-980b-fd960c61e73c@github.com> Message-ID: On Tue, 5 Dec 2023 14:30:49 GMT, Yi-Fan Tsai wrote: >> Test CheckLargePages was broken by the previous changes: >> >> [JDK-8310233](https://bugs.openjdk.org/browse/JDK-8310233) changes the pagesize logs from >> >> Usable page sizes: 4k, 1G >> >> to >> >> Large page support enabled. Usable page sizes: 4k, 1G. Default large page size: 1G. >> >> >> [JDK-8261894](https://bugs.openjdk.org/browse/JDK-8261894) removes `UseHugeTLBFS`. It was also removed from `os::can_execute_large_page_memory`, and `CodeCache::page_size` cannot use huge pages anymore. >> >> This change includes: >> - The regular expression in CheckLargePages is updated to capture only the page sizes. >> - The static huge page will be fixed by [JDK-8319795](https://bugs.openjdk.org/browse/JDK-8319795). > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Reserve only regular expression lgtm ------------- Marked as reviewed by eastigeevich (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/16962#pullrequestreview-1766090663 From sspitsyn at openjdk.org Tue Dec 5 22:51:47 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Dec 2023 22:51:47 GMT Subject: RFR: 8321219: runtime/jni/FastGetField: assert(is_interpreted_frame()) failed: interpreted frame expected [v2] In-Reply-To: References: Message-ID: > This is a trivial fix for a regression caused by: > [8308614](https://bugs.openjdk.org/browse/JDK-8308614) Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 > > The fix of 8308614 just triggered a known issue: > [8316283](https://bugs.openjdk.org/browse/JDK-8316283) field watch events are not always posted with -Xcomp option > > The fix is just a work around with the extra checks with the `JvmtiExport::should_post_field_access()` and `JvmtiExport::should_post_field_modification()`. > > Testing: > - The test `runtime/jni/FastGetField/FastGetField.java` does not fail anymore with this fix > - In progress: Test with tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: extended comment to cover the watchpoint extra checks ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16961/files - new: https://git.openjdk.org/jdk/pull/16961/files/c08484ac..7d47fd12 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16961&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16961&range=00-01 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16961.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16961/head:pull/16961 PR: https://git.openjdk.org/jdk/pull/16961 From dcubed at openjdk.org Tue Dec 5 23:03:33 2023 From: dcubed at openjdk.org (Daniel D. 
Daugherty) Date: Tue, 5 Dec 2023 23:03:33 GMT Subject: RFR: 8321219: runtime/jni/FastGetField: assert(is_interpreted_frame()) failed: interpreted frame expected [v2] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 22:51:47 GMT, Serguei Spitsyn wrote: >> This is a trivial fix for a regression caused by: >> [8308614](https://bugs.openjdk.org/browse/JDK-8308614) Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 >> >> The fix of 8308614 just triggered a known issue: >> [8316283](https://bugs.openjdk.org/browse/JDK-8316283) field watch events are not always posted with -Xcomp option >> >> The fix is just a work around with the extra checks with the `JvmtiExport::should_post_field_access()` and `JvmtiExport::should_post_field_modification()`. >> >> Testing: >> - The test `runtime/jni/FastGetField/FastGetField.java` does not fail anymore with this fix >> - In progress: Test with tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: extended comment to cover the watchpoint extra checks Thumbs up. This is a trivial fix. You'll need to fix the whitespace complaint before integration. ------------- Marked as reviewed by dcubed (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16961#pullrequestreview-1766264031 From iklam at openjdk.org Tue Dec 5 23:13:11 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 5 Dec 2023 23:13:11 GMT Subject: RFR: 8320935: Move CDS config initialization code to cdsConfig.cpp [v3] In-Reply-To: References: Message-ID: > This is a simple clean up that moves the code for initializing the CDS config states from arguments.cpp to cdsConfig.cpp > > I renamed a few functions, but otherwise the code is unchanged. 
> > - `get_default_shared_archive_path()` -> `default_archive_path()` > - `GetSharedArchivePath()` -> `static_archive_path()` > - `GetSharedDynamicArchivePath()` -> `dynamic_archive_path()` > > There's also less `#if INCLUDE_CDS` since the entire cdsConfig.cpp file is compiled only if CDS is enabled. Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: - Merge branch 'master' into 8320935-move-cds-config-code-from-arguments-cpp - fixed indentation - code alignment - step4 - step3 - step2 - step1 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16868/files - new: https://git.openjdk.org/jdk/pull/16868/files/01dd47bc..a080edeb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16868&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16868&range=01-02 Stats: 84382 lines in 1756 files changed: 39063 ins; 38780 del; 6539 mod Patch: https://git.openjdk.org/jdk/pull/16868.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16868/head:pull/16868 PR: https://git.openjdk.org/jdk/pull/16868 From sspitsyn at openjdk.org Tue Dec 5 23:31:33 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Dec 2023 23:31:33 GMT Subject: RFR: 8321219: runtime/jni/FastGetField: assert(is_interpreted_frame()) failed: interpreted frame expected [v2] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 22:51:47 GMT, Serguei Spitsyn wrote: >> This is a trivial fix for a regression caused by: >> [8308614](https://bugs.openjdk.org/browse/JDK-8308614) Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 >> >> The fix of 8308614 just triggered a known issue: >> [8316283](https://bugs.openjdk.org/browse/JDK-8316283) field watch events are not always posted with -Xcomp option >> >> The fix is just a work around with the extra checks with the 
`JvmtiExport::should_post_field_access()` and `JvmtiExport::should_post_field_modification()`. >> >> Testing: >> - The test `runtime/jni/FastGetField/FastGetField.java` does not fail anymore with this fix >> - In progress: Test with tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: extended comment to cover the watchpoint extra checks Dan, thank you a lot for quick review! I'll fix the whitespace issue. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16961#issuecomment-1841792398 From sspitsyn at openjdk.org Tue Dec 5 23:36:46 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Dec 2023 23:36:46 GMT Subject: RFR: 8321219: runtime/jni/FastGetField: assert(is_interpreted_frame()) failed: interpreted frame expected [v3] In-Reply-To: References: Message-ID: <4ac4Z72qGn_-S7p43eNHZRgx4mmXxGNgXR1f7W06aQE=.8b328e1f-6378-4637-a76c-17c581f31a24@github.com> > This is a trivial fix for a regression caused by: > [8308614](https://bugs.openjdk.org/browse/JDK-8308614) Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 > > The fix of 8308614 just triggered a known issue: > [8316283](https://bugs.openjdk.org/browse/JDK-8316283) field watch events are not always posted with -Xcomp option > > The fix is just a work around with the extra checks with the `JvmtiExport::should_post_field_access()` and `JvmtiExport::should_post_field_modification()`. 
> > Testing: > - The test `runtime/jni/FastGetField/FastGetField.java` does not fail anymore with this fix > - In progress: Test with tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: fixed trailing whitespace ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16961/files - new: https://git.openjdk.org/jdk/pull/16961/files/7d47fd12..34da9c6e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16961&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16961&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16961.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16961/head:pull/16961 PR: https://git.openjdk.org/jdk/pull/16961 From sspitsyn at openjdk.org Tue Dec 5 23:44:41 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 5 Dec 2023 23:44:41 GMT Subject: Integrated: 8321219: runtime/jni/FastGetField: assert(is_interpreted_frame()) failed: interpreted frame expected In-Reply-To: References: Message-ID: <2dccnNqjKvOXXz9hLRAwDsrAt9wr-K6XwjH0yTNC4V0=.4343c65c-3829-45da-bf7e-c37f71451ead@github.com> On Tue, 5 Dec 2023 00:23:45 GMT, Serguei Spitsyn wrote: > This is a trivial fix for a regression caused by: > [8308614](https://bugs.openjdk.org/browse/JDK-8308614) Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 > > The fix of 8308614 just triggered a known issue: > [8316283](https://bugs.openjdk.org/browse/JDK-8316283) field watch events are not always posted with -Xcomp option > > The fix is just a work around with the extra checks with the `JvmtiExport::should_post_field_access()` and `JvmtiExport::should_post_field_modification()`. > > Testing: > - The test `runtime/jni/FastGetField/FastGetField.java` does not fail anymore with this fix > - In progress: Test with tiers 1-6 This pull request has now been integrated. 
Changeset: 905137d4 Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/905137d4065eb40bef6946bdc6bb688d6018a89d Stats: 5 lines in 1 file changed: 4 ins; 0 del; 1 mod 8321219: runtime/jni/FastGetField: assert(is_interpreted_frame()) failed: interpreted frame expected Reviewed-by: dcubed ------------- PR: https://git.openjdk.org/jdk/pull/16961 From dholmes at openjdk.org Wed Dec 6 01:14:40 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Dec 2023 01:14:40 GMT Subject: RFR: 8317831: compiler/codecache/CheckLargePages.java fails on OL 8.8 with unexpected memory string [v2] In-Reply-To: <14MnNbVh0nMXZJF8Pm8EwjABcX0cn_ZcXBylyIMbcJA=.b51aefd1-3399-4927-980b-fd960c61e73c@github.com> References: <59K9mUFzngrq77p0DVcyuoHADtmPN1SB2fZJToQnxJk=.394c23d2-f87b-4775-9416-3d384fc1817c@github.com> <14MnNbVh0nMXZJF8Pm8EwjABcX0cn_ZcXBylyIMbcJA=.b51aefd1-3399-4927-980b-fd960c61e73c@github.com> Message-ID: On Tue, 5 Dec 2023 14:30:49 GMT, Yi-Fan Tsai wrote: >> Test CheckLargePages was broken by the previous changes: >> >> [JDK-8310233](https://bugs.openjdk.org/browse/JDK-8310233) changes the pagesize logs from >> >> Usable page sizes: 4k, 1G >> >> to >> >> Large page support enabled. Usable page sizes: 4k, 1G. Default large page size: 1G. >> >> >> [JDK-8261894](https://bugs.openjdk.org/browse/JDK-8261894) removes `UseHugeTLBFS`. It was also removed from `os::can_execute_large_page_memory`, and `CodeCache::page_size` cannot use huge pages anymore. >> >> This change includes: >> - The regular expression in CheckLargePages is updated to capture only the page sizes. >> - The static huge page will be fixed by [JDK-8319795](https://bugs.openjdk.org/browse/JDK-8319795). 
> > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Reserve only regular expression I'm going to approve this test fix as the change to the regex itself seems reasonable, even though because the test remains excluded by [JDK-8319795](https://bugs.openjdk.org/browse/JDK-8319795) we won't actually be testing it. If any issues arise when the test is removed from the ProblemList it is expected that [JDK-8319795](https://bugs.openjdk.org/browse/JDK-8319795) will deal with that. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16962#pullrequestreview-1766409690 From xgong at openjdk.org Wed Dec 6 01:29:42 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Wed, 6 Dec 2023 01:29:42 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 13:00:04 GMT, Magnus Ihse Bursie wrote: > So you need to check both the flag and the header file? Oh well, then this is probably as good as it gets. Yes, we have to check both the flag and the header file. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1841926432 From dholmes at openjdk.org Wed Dec 6 02:28:45 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Dec 2023 02:28:45 GMT Subject: RFR: 8321219: runtime/jni/FastGetField: assert(is_interpreted_frame()) failed: interpreted frame expected [v2] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 23:01:20 GMT, Daniel D. Daugherty wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: extended comment to cover the watchpoint extra checks > > Thumbs up. This is a trivial fix. > > You'll need to fix the whitespace complaint before integration. 
@dcubed-ojdk I would not consider this a trivial fix at all - the need to add the additional conditions is not at all obvious! And even if they were, that would make this a small/simple fix, not "trivial" as defined for the "one review needed" rule. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16961#issuecomment-1841982060 From dholmes at openjdk.org Wed Dec 6 02:35:49 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Dec 2023 02:35:49 GMT Subject: RFR: 8321219: runtime/jni/FastGetField: assert(is_interpreted_frame()) failed: interpreted frame expected [v3] In-Reply-To: <4ac4Z72qGn_-S7p43eNHZRgx4mmXxGNgXR1f7W06aQE=.8b328e1f-6378-4637-a76c-17c581f31a24@github.com> References: <4ac4Z72qGn_-S7p43eNHZRgx4mmXxGNgXR1f7W06aQE=.8b328e1f-6378-4637-a76c-17c581f31a24@github.com> Message-ID: <64Mn3SR0rOVpZMb6CRVJwKfJIySscp5cBy9hyJkMEs4=.faea7c80-3c92-4d5c-9baf-d80e3b2714c0@github.com> On Tue, 5 Dec 2023 23:36:46 GMT, Serguei Spitsyn wrote: >> This is a trivial fix for a regression caused by: >> [8308614](https://bugs.openjdk.org/browse/JDK-8308614) Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 >> >> The fix of 8308614 just triggered a known issue: >> [8316283](https://bugs.openjdk.org/browse/JDK-8316283) field watch events are not always posted with -Xcomp option >> >> The fix is just a work around with the extra checks with the `JvmtiExport::should_post_field_access()` and `JvmtiExport::should_post_field_modification()`. >> >> Testing: >> - The test `runtime/jni/FastGetField/FastGetField.java` does not fail anymore with this fix >> - In progress: Test with tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > fixed trailing whitespace src/hotspot/share/prims/jvmtiThreadState.cpp line 561: > 559: // it is an important optimization to create JvmtiThreadState objects lazily. 
> 560: // This optimization is disabled when watchpoint capabilities are present. It is to > 561: // work around a bug with virtual thread frames which can be not deoptimized in time. Suggestion: "This optimization is *also* disabled when ..." The phrase "which can be not deoptimized in time." is unclear. Are we racing with deoptimization? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16961#discussion_r1416579588 From duke at openjdk.org Wed Dec 6 02:37:41 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Wed, 6 Dec 2023 02:37:41 GMT Subject: Integrated: 8317831: compiler/codecache/CheckLargePages.java fails on OL 8.8 with unexpected memory string In-Reply-To: <59K9mUFzngrq77p0DVcyuoHADtmPN1SB2fZJToQnxJk=.394c23d2-f87b-4775-9416-3d384fc1817c@github.com> References: <59K9mUFzngrq77p0DVcyuoHADtmPN1SB2fZJToQnxJk=.394c23d2-f87b-4775-9416-3d384fc1817c@github.com> Message-ID: On Tue, 5 Dec 2023 01:08:10 GMT, Yi-Fan Tsai wrote: > Test CheckLargePages was broken by the previous changes: > > [JDK-8310233](https://bugs.openjdk.org/browse/JDK-8310233) changes the pagesize logs from > > Usable page sizes: 4k, 1G > > to > > Large page support enabled. Usable page sizes: 4k, 1G. Default large page size: 1G. > > > [JDK-8261894](https://bugs.openjdk.org/browse/JDK-8261894) removes `UseHugeTLBFS`. It was also removed from `os::can_execute_large_page_memory`, and `CodeCache::page_size` cannot use huge pages anymore. > > This change includes: > - The regular expression in CheckLargePages is updated to capture only the page sizes. > - The static huge page will be fixed by [JDK-8319795](https://bugs.openjdk.org/browse/JDK-8319795). This pull request has now been integrated. 
Changeset: 86b27b78 Author: Yi-Fan Tsai Committer: David Holmes URL: https://git.openjdk.org/jdk/commit/86b27b784e20f7cdadd241f7feedd024482baa8f Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod 8317831: compiler/codecache/CheckLargePages.java fails on OL 8.8 with unexpected memory string Reviewed-by: eastigeevich, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/16962 From dholmes at openjdk.org Wed Dec 6 02:55:33 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Dec 2023 02:55:33 GMT Subject: RFR: 8321369: Unproblemlist gc/cslocker/TestCSLocker.java In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 10:19:02 GMT, Thomas Schatzl wrote: > Hi all, > > please review this fix to unproblemlist gc/cs/TestCSLocker.java; the CR to fix this [JDK-8310480](https://bugs.openjdk.org/browse/JDK-8310480) has already been closed as duplicate of [JDK-8318706](https://bugs.openjdk.org/browse/JDK-8318706) that removed the GCLocker for G1 which is the cause for the issue. > > Note that the test has only been problemlisted for linux-x64 previously, so it has already been run for a long time for other platforms and other collectors, so my testing did not extensively try all the other platforms/gc combinations (I did try with Serial and Parallel a few times with no issues; ZGC is excluded anyway in the test, and Shenandoah also does not use the GCLocker). > > Testing: test case with g1, gha > > Thanks, > Thomas Looks good and trivial. Thanks for cleaning up the PL. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16970#pullrequestreview-1766542397 From dholmes at openjdk.org Wed Dec 6 04:03:31 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Dec 2023 04:03:31 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC In-Reply-To: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> Message-ID: On Tue, 5 Dec 2023 16:31:24 GMT, Denghui Dong wrote: > Hi, > > Could I have a review of this patch? > > In the current implementation, HeapDumpBeforeFullGC/AfterFullGC will generate dumps for every FGC, increasing the risk of disk full. > > This patch introduces a new option 'FullGCHeapDumpLimit' to limit the number of dumps triggered by HeapDumpBeforeFullGC/AfterFullGC to enhance production-friendliness. > > Best, > Denghui @D-D-H adding a new manageable flag requires a CSR request to be approved. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16976#issuecomment-1842046395 From dholmes at openjdk.org Wed Dec 6 04:11:33 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Dec 2023 04:11:33 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC In-Reply-To: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> Message-ID: On Tue, 5 Dec 2023 16:31:24 GMT, Denghui Dong wrote: > Hi, > > Could I have a review of this patch? > > In the current implementation, HeapDumpBeforeFullGC/AfterFullGC will generate dumps for every FGC, increasing the risk of disk full. > > This patch introduces a new option 'FullGCHeapDumpLimit' to limit the number of dumps triggered by HeapDumpBeforeFullGC/AfterFullGC to enhance production-friendliness. 
> > Best, > Denghui Functional changes seem fine, but I think the test is in the wrong place as it is not a dcmd test. Perhaps just place it in test/hotspot/jtreg/serviceability/HeapDump? ------------- Changes requested by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16976#pullrequestreview-1766597251 From dholmes at openjdk.org Wed Dec 6 05:00:35 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Dec 2023 05:00:35 GMT Subject: RFR: JDK-8320892: AArch64: Restore FPU control state after JNI [v3] In-Reply-To: References: <-Jv5Xvre3lonydwQ5uzYN3QB8V0VIuORIhM1RtIdW5g=.06167df9-7268-4945-8e18-a04d19ee97e1@github.com> Message-ID: On Tue, 5 Dec 2023 09:44:24 GMT, Andrew Haley wrote: > OK. How about we split this into two, this first part without a CSR, and the second part, which creates the generic alias, with one? That way we can mitigate a live problem in this release. I'm fine with splitting it if needed, but I also re-evaluated this as a P3 so it can still go into 22 after RDP 1. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16851#issuecomment-1842086869 From dholmes at openjdk.org Wed Dec 6 05:04:33 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Dec 2023 05:04:33 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: <9ZTuSYp_utYFLxv7eDQFSimjFtY007yUurc1CUgfNnA=.20d65535-11e6-46d1-b3fe-710cd75ed18d@github.com> References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> <9ZTuSYp_utYFLxv7eDQFSimjFtY007yUurc1CUgfNnA=.20d65535-11e6-46d1-b3fe-710cd75ed18d@github.com> Message-ID: On Mon, 4 Dec 2023 11:37:25 GMT, Matthias Baesken wrote: > The cases where we directly dlopen/dlsym/fcn-call in the JDK codebase are probably not covered by the JNI checker, right ? Right. JNI checking is for checking actual JNI API functions, so I don't see where this would go. 
A call from Java code into a native method (from which native code could trigger the problem) is not a JNI call. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1842089745 From dholmes at openjdk.org Wed Dec 6 05:24:35 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Dec 2023 05:24:35 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v2] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 10:41:13 GMT, Thomas Stuefe wrote: >> We have `MallocLimit`, a way to trigger errors when reaching a given malloc load threshold. This PR proposes >> a complementary switch, `RSSLimit`, that does the same based on the Resident Set Size of the process. >> >> --- >> >> Motivation: >> >> The main usage for this option is to analyze OOM kills. OOM kills can happen at various layers: the process may be either killed by the kernel OOM killer, or the whole container may get scrapped if it uses too much memory. >> >> One rarely has any information on the nature of the OOM, or if there even was one, and if yes, if the JVM was the culprit or just an innocent bystander. In these situations, getting a voluntary abort *before* the process gets killed from outside can give us valuable information. >> >> Another use of this feature can be testing: specifying an envelope of "reasonable" RSS for testing to check the expected footprint of the JVM. Also useful for a global test-wide setting to catch obvious footprint degradations early. >> >> Letting the JVM handle this Limit has many advantages: >> >> - since the limit is artificial, error reporting is not affected. Other mechanisms (e.g. ulimit) are likely to prevent effective error reporting. I usually get torn hs-err files when a limit restriction hits since error reporting needs dynamic memory (regrettably) and space on the stack to do its work. 
>> >> - Re-using the normal error reporting mechanism is powerful since: >> - hs-err files contain lots of information already: machine memory status, NMT summary, heap information etc. >> - Using `OnError`, that mechanism is expandable: we can run many further diagnostics like Metaspace or Compiler memory reports, detailed NMT reports, System memory maps, and even heap dumps. >> - Using `ErrorLogToStd(out|err)` will redirect the hs-err file and let us see what's happening in cloud situations where file systems are often ephemeral. >> >> ---- >> >> Usage: >> >> Limit is given either as an absolute number or as a relative percentage of the total memory of the machine or the container, e.g. >> `-XX:RssLimit=2G` or `-XX:RssLimit=80%`. >> >> If given as percent, JVM will also react to container limit updates. >> >> Example: we run the JVM inside a container as the sole payload process. Limit its RSS to 90% of the container limit, and in case we run into the limit, fire a heap dump: >> >> `java -XX:+UnlockDiagnosticVMOptions -XX:RssLimit=80% '-XX:OnError=jcmd %p GC.heap_dump my-dump' -Xlog:os+rss ` >> >> ---- >> >> Patch: >> >> Im... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > feedback david src/hotspot/share/services/rsswatch.cpp line 91: > 89: int chars_read = 0; > 90: if (sscanf(s, "%lf%c%n", &v, &sign, &chars_read) >= 2 && sign == '%') { > 91: if (v > 100.0 || v == 0.0) { `v <= 0.0`? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1416679734 From iklam at openjdk.org Wed Dec 6 05:25:37 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 6 Dec 2023 05:25:37 GMT Subject: RFR: 8320935: Move CDS config initialization code to cdsConfig.cpp [v3] In-Reply-To: References: Message-ID: On Sat, 2 Dec 2023 03:36:05 GMT, Calvin Cheung wrote: >> Ioi Lam has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: >> >> - Merge branch 'master' into 8320935-move-cds-config-code-from-arguments-cpp >> - fixed indentation >> - code alignment >> - step4 >> - step3 >> - step2 >> - step1 > > Marked as reviewed by ccheung (Reviewer). Thanks @calvinccheung @matias9927 @tstuefe for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16868#issuecomment-1842107804 From dholmes at openjdk.org Wed Dec 6 05:27:35 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Dec 2023 05:27:35 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v2] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 10:41:13 GMT, Thomas Stuefe wrote: >> We have `MallocLimit`, a way to trigger errors when reaching a given malloc load threshold. This PR proposes >> a complementary switch, `RSSLimit`, that does the same based on the Resident Set Size of the process. >> >> --- >> >> Motivation: >> >> The main usage for this option is to analyze OOM kills. OOM kills can happen at various layers: the process may be either killed by the kernel OOM killer, or the whole container may get scrapped if it uses too much memory. >> >> One rarely has any information on the nature of the OOM, or if there even was one, and if yes, if the JVM was the culprit or just an innocent bystander. In these situations, getting a voluntary abort *before* the process gets killed from outside can give us valuable information. >> >> Another use of this feature can be testing: specifying an envelope of "reasonable" RSS for testing to check the expected footprint of the JVM. Also useful for a global test-wide setting to catch obvious footprint degradations early. >> >> Letting the JVM handle this Limit has many advantages: >> >> - since the limit is artificial, error reporting is not affected. Other mechanisms (e.g. 
ulimit) are likely to prevent effective error reporting. I usually get torn hs-err files when a limit restriction hits since error reporting needs dynamic memory (regrettably) and space on the stack to do its work. >> >> - Re-using the normal error reporting mechanism is powerful since: >> - hs-err files contain lots of information already: machine memory status, NMT summary, heap information etc. >> - Using `OnError`, that mechanism is expandable: we can run many further diagnostics like Metaspace or Compiler memory reports, detailed NMT reports, System memory maps, and even heap dumps. >> - Using `ErrorLogToStd(out|err)` will redirect the hs-err file and let us see what's happening in cloud situations where file systems are often ephemeral. >> >> ---- >> >> Usage: >> >> Limit is given either as an absolute number or as a relative percentage of the total memory of the machine or the container, e.g. >> `-XX:RssLimit=2G` or `-XX:RssLimit=80%`. >> >> If given as percent, JVM will also react to container limit updates. >> >> Example: we run the JVM inside a container as the sole payload process. Limit its RSS to 90% of the container limit, and in case we run into the limit, fire a heap dump: >> >> `java -XX:+UnlockDiagnosticVMOptions -XX:RssLimit=80% '-XX:OnError=jcmd %p GC.heap_dump my-dump' -Xlog:os+rss ` >> >> ---- >> >> Patch: >> >> Im... > > Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: > > feedback david src/hotspot/share/services/rsswatch.cpp line 120: > 118: log_warning(os, rss)("RssLimit specified, but not supported by the Operating System."); > 119: return; > 120: } This seems a strange place for this. I would expect it to be in the init function. 
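
The percentage form of `RssLimit` quoted above (`-XX:RssLimit=80%`) and the earlier review suggestion to reject `v <= 0.0` can be illustrated with a minimal sketch. This is not the code under review: `parse_percentage` is a hypothetical helper, and the trailing-character check is an added assumption. It accepts only values in the range (0, 100] followed by a single `%`.

```cpp
#include <cstdio>

// Hypothetical helper (not from the patch): parses strings such as "80%".
// On success stores the percentage in *out and returns true.
bool parse_percentage(const char* s, double* out) {
  double v = 0.0;
  char sign = 0;
  int chars_read = 0;
  // %n records how many characters were consumed; it does not count
  // towards the return value of sscanf.
  if (std::sscanf(s, "%lf%c%n", &v, &sign, &chars_read) >= 2 && sign == '%') {
    if (s[chars_read] != '\0') {
      return false;  // assumption: trailing characters are an error
    }
    // Written as a positive range check so that NaN is rejected as well.
    if (!(v > 0.0 && v <= 100.0)) {
      return false;
    }
    *out = v;
    return true;
  }
  return false;
}
```

With this shape, `"0%"`, negative values and values above 100 are all treated as user errors, while an absolute size such as `"2G"` simply fails the percentage parse and would be handled elsewhere.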
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1416681581 From iklam at openjdk.org Wed Dec 6 05:28:41 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 6 Dec 2023 05:28:41 GMT Subject: Integrated: 8320935: Move CDS config initialization code to cdsConfig.cpp In-Reply-To: References: Message-ID: On Tue, 28 Nov 2023 23:24:53 GMT, Ioi Lam wrote: > This is a simple clean up that moves the code for initializing the CDS config states from arguments.cpp to cdsConfig.cpp > > I renamed a few functions, but otherwise the code is unchanged. > > - `get_default_shared_archive_path()` -> `default_archive_path()` > - `GetSharedArchivePath()` -> `static_archive_path()` > - `GetSharedDynamicArchivePath()` -> `dynamic_archive_path()` > > There's also less `#if INCLUDE_CDS` since the entire cdsConfig.cpp file is compiled only if CDS is enabled. This pull request has now been integrated. Changeset: 4c96aac9 Author: Ioi Lam URL: https://git.openjdk.org/jdk/commit/4c96aac9c0aa450b0b6859ded8dfff856222ad58 Stats: 696 lines in 8 files changed: 346 ins; 327 del; 23 mod 8320935: Move CDS config initialization code to cdsConfig.cpp Reviewed-by: ccheung, matsaave, stuefe ------------- PR: https://git.openjdk.org/jdk/pull/16868 From dholmes at openjdk.org Wed Dec 6 05:34:36 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Dec 2023 05:34:36 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v2] In-Reply-To: References: Message-ID: <8ZnUiNc8Y_cgiFskoRHt7CArFObXeDdqDZL1cvAjk6s=.1c030123-6ff6-411f-b4a8-e6aa240a86f1@github.com> On Tue, 5 Dec 2023 09:43:21 GMT, Thomas Stuefe wrote: >> src/hotspot/os/bsd/os_bsd.cpp line 1473: >> >>> 1471: result = info.resident_size; >>> 1472: } >>> 1473: #endif // __APPLE__ >> >> Hmmm so no general BSD support either ... > > It is possible on BSD (e.g. 
using [kvm_getprocs](https://man.freebsd.org/cgi/man.cgi?query=kvm_getprocs&sektion=3&n=1) ) - however, I am no BSD expert and have no system to build BSD on. > > Note that in both cases - AIX and the BSDs - I am not better nor worse than other code that reads RSS, e.g. jfr_report_memory_info(). Which, btw, is a code unification possibility I plan on following up on in a separate RFE (reusing get_rss for use cases such as jfr_report_memory_info). I don't like seeing disparity in platform support like this but understand we can't always supply the details for every port on the initial integration. I would like to see RFE's filed to have the missing details provided on those platforms so that the respective port maintainers know there is something missing. I suspect that is not the case with `jfr_report_memory_info`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1416684558 From dholmes at openjdk.org Wed Dec 6 05:34:39 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Dec 2023 05:34:39 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v2] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 05:24:48 GMT, David Holmes wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> feedback david > > src/hotspot/share/services/rsswatch.cpp line 120: > >> 118: log_warning(os, rss)("RssLimit specified, but not supported by the Operating System."); >> 119: return; >> 120: } > > This seems a strange place for this. I would expect it to be in the init function. Sorry -ignore this. The git UI had me confused about where this was. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1416685476 From ddong at openjdk.org Wed Dec 6 05:50:02 2023 From: ddong at openjdk.org (Denghui Dong) Date: Wed, 6 Dec 2023 05:50:02 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC [v2] In-Reply-To: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> Message-ID: <11q5GvzTPts9R6r7B1-KNh0me5AJwnSuqPP-J-LTuRc=.66380e47-6f34-4803-b522-9b22395466cb@github.com> > Hi, > > Could I have a review of this patch? > > In the current implementation, HeapDumpBeforeFullGC/AfterFullGC will generate dumps for every FGC, increasing the risk of disk full. > > This patch introduces a new option 'FullGCHeapDumpLimit' to limit the number of dumps triggered by HeapDumpBeforeFullGC/AfterFullGC to enhance production-friendliness. 
> > Best, > Denghui Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: change the location of test ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16976/files - new: https://git.openjdk.org/jdk/pull/16976/files/d62a507f..442b7f47 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16976&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16976&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16976.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16976/head:pull/16976 PR: https://git.openjdk.org/jdk/pull/16976 From dholmes at openjdk.org Wed Dec 6 05:57:33 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 6 Dec 2023 05:57:33 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC [v2] In-Reply-To: <11q5GvzTPts9R6r7B1-KNh0me5AJwnSuqPP-J-LTuRc=.66380e47-6f34-4803-b522-9b22395466cb@github.com> References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> <11q5GvzTPts9R6r7B1-KNh0me5AJwnSuqPP-J-LTuRc=.66380e47-6f34-4803-b522-9b22395466cb@github.com> Message-ID: On Wed, 6 Dec 2023 05:50:02 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this patch? >> >> In the current implementation, HeapDumpBeforeFullGC/AfterFullGC will generate dumps for every FGC, increasing the risk of disk full. >> >> This patch introduces a new option 'FullGCHeapDumpLimit' to limit the number of dumps triggered by HeapDumpBeforeFullGC/AfterFullGC to enhance production-friendliness. >> >> Best, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > change the location of test Seems fine. Thanks ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16976#pullrequestreview-1766708495 From stuefe at openjdk.org Wed Dec 6 07:12:35 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 6 Dec 2023 07:12:35 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v2] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 10:36:29 GMT, Thomas Stuefe wrote: >> Hi Thomas, >> >> I've taken a first pass through this and it seems okay in principle. A number of initial comments/suggestions below. >> >> Thanks. > >> Hi Thomas, >> >> I've taken a first pass through this and it seems okay in principle. A number of initial comments/suggestions below. >> >> Thanks. > > Thanks a lot, David! > > Makes me happy to see this finds acceptance at least in principle. > > I changed: > - get_rss to get_RSS > - removed the "0 means off" text, since I assume passing 0 would be likely a user error. Instead, I also added an error check for percentage = 0.0. > - added a warning if the OS does not support this feature > Hi @tstuefe this looks useful feature and seems to provides a way to deal with OOM killer in containers. If the user has set container memory limit to 256MB, then the RssLimit can be set to around 200MB. This would let the JVM catch the OOM before it is handled by the kernel. But I have one concern. The effectiveness of this solution really depends on how frequently the check is done. If there is a sudden memory spike, it should, ideally, last longer than `RssLimitCheckInterval` for RssWatcher to take the action. Flipping it the other way, we can say RssWatcher can catch memory spikes that last longer than `RssLimitCheckInterval`. Even then, it can catch the spike only as long as it is less than the container limit. This raises the question of determining the effective value of `RssLimit` and `RssLimitCheckInterval`. 
For instance, compilations can induce a memory spike which may last for a few hundred milliseconds at the most, which is much less than the default value of 5 secs for `RssLimitCheckInterval`. What are your thoughts on this? Your concern is valid; there is no bullet-proof way to do this. I originally chose to make the default interval low since I feared that reading procfs would be too expensive. However, after some testing I see that, at this interval, so much caution is not necessary; I will lower the default interval to 1 second. Furthermore, I plan to make the interval adaptive: if we detect a large RSS spike or are within n% of the limit, I plan to lower the interval temporarily. Since that requires more testing and tuning, I will do this in a separate RFE. For compiler (and for hotspot-induced mallocs generally) we already have -XX:MallocLimit, which is independent of polling and works in real time.
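
The concern about spikes that are shorter than the check interval can be made concrete with a small simulation. This is only a sketch under stated assumptions: `poll_detects_breach` and the per-millisecond RSS trace are hypothetical and are not part of the patch. It shows how the same spike is seen or missed depending on the polling interval.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model: rss_per_ms[i] is the process RSS at millisecond i.
// Returns true if a watcher that samples every interval_ms milliseconds
// (starting at t = 0) observes a value at or above the limit.
bool poll_detects_breach(const std::vector<std::size_t>& rss_per_ms,
                         std::size_t limit,
                         std::size_t interval_ms) {
  if (interval_ms == 0) {
    return false;  // guard against a non-advancing loop
  }
  for (std::size_t t = 0; t < rss_per_ms.size(); t += interval_ms) {
    if (rss_per_ms[t] >= limit) {
      return true;
    }
  }
  return false;
}
```

For a trace with a 300 ms spike above the limit, a 5 second or 1 second interval can step right over it, while a 100 ms interval catches it. That is the trade-off between polling cost and detection that the adaptive interval is meant to address.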
In addition, for the compiler we have compilation memory limits via compile command. But in the end, there remains an unknown that is unsolvable. Any spike can be shorter than any interval-based check we do. This PR is an abridged form for something I did for SAP: https://stuefe.de/posts/vitals/sapmachine-high-memory-reports/ - there, I use a system of three "danger zones" that each trigger different actions. Would love to get such a solution upstream at some point. I wish the kernel would give us SIGDANGER like on AIX. That would be the real solution. When I originally implemented the SAP solution, I had also looked at container-intrinsic solutions, but did not find any that were reliable. Note that OOM-kills can also come from some framework just scrapping the whole container; so it may not even be the kernel that kills us, the whole VM may go away. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16938#issuecomment-1842210562 From mbaesken at openjdk.org Wed Dec 6 07:28:33 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 6 Dec 2023 07:28:33 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Mon, 4 Dec 2023 11:56:03 GMT, Andrew Haley wrote: > > /label jfr > > I'm not sure I understand the issue, but adding a field to an event because of a GCC bug seems excessive. > > It's a nasty hard-to-find bug that breaks Java compatibility. Some people have wondered if this is a real-world problem, and the answer is that it's happening, right now, in Oracle's CI testing. Interesting, do you have some details about the 'Oracle CI testing' occurrence ? If so, what lib caused it ? Do you think it would be beneficial to have it in the JFR for this particular case (maybe as a separate event if this is prefered over the current suggestion) ? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1842238320 From mbaesken at openjdk.org Wed Dec 6 07:33:33 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 6 Dec 2023 07:33:33 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> <9ZTuSYp_utYFLxv7eDQFSimjFtY007yUurc1CUgfNnA=.20d65535-11e6-46d1-b3fe-710cd75ed18d@github.com> Message-ID: On Wed, 6 Dec 2023 05:01:44 GMT, David Holmes wrote: > > The cases where we directly dlopen/dlsym/fcn-call in the JDK codebase are probably not covered by the JNI checker, right ? > > Right. JNI checking is for checking actual JNI API functions, so I don't see where this would go. A call from Java code into a native method (from which native code could trigger the problem) is not a JNI call. To be fair, those cases where we directly call dlopen/dlsym/fcn-call in the JDK codebase and trigger the issue, are not well covered anyway by the current HS coding because they are not going through the os::dll_load, so the check and error message is only observed in the next os::dll_load afterwards. And I think those cases currently cannot be corrected in os::dll_load because the fp env is already 'bad' before os::dll_load . 
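
The save-and-restore idea discussed here can be sketched in isolation. This is not the HotSpot `os::dll_load` code: `run_with_fenv_guard` is a hypothetical helper, the callback stands in for a `dlopen` call plus the library's initializers, and only the rounding mode is used as the observable, which is a simplification of what a real check would compare.

```cpp
#include <cfenv>
#include <functional>

// Hypothetical helper: runs `load` and, if the rounding mode differs
// afterwards, restores the floating-point environment saved beforehand.
// Returns true if a restore was necessary.
bool run_with_fenv_guard(const std::function<void()>& load) {
  std::fenv_t saved;
  std::fegetenv(&saved);                    // snapshot the whole environment
  const int rounding_before = std::fegetround();

  load();                                   // stand-in for dlopen and initializers

  if (std::fegetround() != rounding_before) {
    std::fesetenv(&saved);                  // put the saved environment back
    return true;
  }
  return false;
}
```

A guard like this only helps at the call sites that go through it. Code that calls `dlopen` directly bypasses it, so the corruption would only be noticed at the next guarded load, which is the limitation described above.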
------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1842250608 From stuefe at openjdk.org Wed Dec 6 07:43:37 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 6 Dec 2023 07:43:37 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v2] In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 05:40:50 GMT, David Holmes wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> feedback david > > src/hotspot/share/services/rsswatch.hpp line 2: > >> 1: /* >> 2: * Copyright (c) 1999, 2023, Oracle and/or its affiliates. All rights reserved. > > Copyright should not include 1999. Sorry, forgot to commit that change apparently. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1416823811 From stuefe at openjdk.org Wed Dec 6 08:00:34 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 6 Dec 2023 08:00:34 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v2] In-Reply-To: <8ZnUiNc8Y_cgiFskoRHt7CArFObXeDdqDZL1cvAjk6s=.1c030123-6ff6-411f-b4a8-e6aa240a86f1@github.com> References: <8ZnUiNc8Y_cgiFskoRHt7CArFObXeDdqDZL1cvAjk6s=.1c030123-6ff6-411f-b4a8-e6aa240a86f1@github.com> Message-ID: <6bYrd8h1JMJbI9YTGoQJNqsCI3-x_lPLG76I0n-kldM=.d60d1770-aa3f-48bf-856b-2efb91e0d2c8@github.com> On Wed, 6 Dec 2023 05:30:36 GMT, David Holmes wrote: >> It is possible on BSD (e.g. using [kvm_getprocs](https://man.freebsd.org/cgi/man.cgi?query=kvm_getprocs&sektion=3&n=1) ) - however, I am no BSD expert and have no system to build BSD on. >> >> Note that in both cases - AIX and the BSDs - I am not better nor worse than other code that reads RSS, e.g. jfr_report_memory_info(). Which, btw, is a code unification possibility I plan on following up on in a separate RFE (reusing get_rss for use cases such as jfr_report_memory_info).
> > I don't like seeing disparity in platform support like this but understand we can't always supply the details for every port on the initial integration. I would like to see RFE's filed to have the missing details provided on those platforms so that the respective port maintainers know there is something missing. I suspect that is not the case with `jfr_report_memory_info`. JBS is down for maintenance; will open bugs when its back up. Platform support is always an issue; BSD support, in particular, seems spotty. I am not even sure BSD is maintained by anyone in the head release. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16938#discussion_r1416852897 From stuefe at openjdk.org Wed Dec 6 08:09:32 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 6 Dec 2023 08:09:32 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> <9ZTuSYp_utYFLxv7eDQFSimjFtY007yUurc1CUgfNnA=.20d65535-11e6-46d1-b3fe-710cd75ed18d@github.com> Message-ID: On Wed, 6 Dec 2023 07:31:20 GMT, Matthias Baesken wrote: > > The cases where we directly dlopen/dlsym/fcn-call in the JDK codebase are probably not covered by the JNI checker, right ? > > Right. JNI checking is for checking actual JNI API functions, so I don't see where this would go. A call from Java code into a native method (from which native code could trigger the problem) is not a JNI call. It is a flag to enable checks on third-party JNI code, so in my mind it fits perfectly. It also enables periodic signal handler checks. So, what's preventing us to fatal out when we detect this kind of problem and if -Xcheck:jni is set? It would be possible to do this in a delayed fashion. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1842368279 From stuefe at openjdk.org Wed Dec 6 08:13:55 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 6 Dec 2023 08:13:55 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v3] In-Reply-To: References: Message-ID: > We have `MallocLimit`, a way to trigger errors when reaching a given malloc load threshold. This PR proposes > a complementary switch, `RSSLimit`, that does the same based on the Resident Set Size of the process. > > --- > > Motivation: > > The main usage for this option is to analyze OOM kills. OOM kills can happen at various layers: the process may be either killed by the kernel OOM killer, or the whole container may get scrapped if it uses too much memory. > > One rarely has any information on the nature of the OOM, or if there even was one, and if yes, if the JVM was the culprit or just an innocent bystander. In these situations, getting a voluntary abort *before* the process gets killed from outside can give us valuable information. > > Another use of this feature can be testing: specifying an envelope of "reasonable" RSS for testing to check the expected footprint of the JVM. Also useful for a global test-wide setting to catch obvious footprint degradations early. > > Letting the JVM handle this Limit has many advantages: > > - since the limit is artificial, error reporting is not affected. Other mechanisms (e.g. ulimit) are likely to prevent effective error reporting. I usually get torn hs-err files when a limit restriction hits since error reporting needs dynamic memory (regrettably) and space on the stack to do its work. > > - Re-using the normal error reporting mechanism is powerful since: > - hs-err files contain lots of information already: machine memory status, NMT summary, heap information etc. 
> - Using `OnError`, that mechanism is expandable: we can run many further diagnostics like Metaspace or Compiler memory reports, detailed NMT reports, System memory maps, and even heap dumps. > - Using `ErrorLogToStd(out|err)` will redirect the hs-err file and let us see what's happening in cloud situations where file systems are often ephemeral. > > ---- > > Usage: > > Limit is given either as an absolute number or as a relative percentage of the total memory of the machine or the container, e.g. > `-XX:RssLimit=2G` or `-XX:RssLimit=80%`. > > If given as percent, JVM will also react to container limit updates. > > Example: we run the JVM inside a container as the sole payload process. Limit its RSS to 90% of the container limit, and in case we run into the limit, fire a heap dump: > > `java -XX:+UnlockDiagnosticVMOptions -XX:RssLimit=80% '-XX:OnError=jcmd %p GC.heap_dump my-dump' -Xlog:os+rss ` > > ---- > > Patch: > > Implemented for Linux, MacOS and Windows. Left out AIX since there we have a long-... 
Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: Add specific percentage switch ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16938/files - new: https://git.openjdk.org/jdk/pull/16938/files/f6f43ce4..b4b6becd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16938&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16938&range=01-02 Stats: 105 lines in 5 files changed: 33 ins; 25 del; 47 mod Patch: https://git.openjdk.org/jdk/pull/16938.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16938/head:pull/16938 PR: https://git.openjdk.org/jdk/pull/16938 From stuefe at openjdk.org Wed Dec 6 08:21:35 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 6 Dec 2023 08:21:35 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v3] In-Reply-To: References: Message-ID: <3pfgWe1NIoMrOXlGqLsyJCsgPgMZ6AJtlxSy64o76o8=.ecc470d4-12c2-4b1b-9da9-1155ceb8329e@github.com> On Tue, 5 Dec 2023 05:49:58 GMT, David Holmes wrote: >> Thomas Stuefe has updated the pull request incrementally with one additional commit since the last revision: >> >> Add specific percentage switch > > Hi Thomas, > > I've taken a first pass through this and it seems okay in principle. A number of initial comments/suggestions below. > > Thanks. @dholmes-ora New Version: I attempted to change the percentage parsing as you wished (for RssLimit=fraction-of-1.0) but then decided not to. I don't have the percentage sign to tell percentage form apart from absolute form. I cannot just use `scanf` with `%f` since that would also parse values without decimal point that are meant to be absolute. More importantly, I dislike decimal points in arguments - it makes usage, tests, and documentation locale-specific. I opted, therefore, for a separate RssLimitPercent switch. We have many examples of switches that take percentage, and most of them are integers ranging 0..100. 
So, by adding this switch I followed standard procedure. I regret having a new switch, but on the bright side, I reuse the standard switch limit checks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16938#issuecomment-1842397114 From xgong at openjdk.org Wed Dec 6 09:14:58 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Wed, 6 Dec 2023 09:14:58 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v7] In-Reply-To: References: Message-ID: > Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64. Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). > > SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. > > To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error. But a warning is printed out if people specifies a nonexistent library explicitly. 
> > Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedbacks from more hotspot experts. > > [1] https://github.com/openjdk/jdk/pull/3638 > [2] https://sleef.org/ > [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ > [4] https://packages.debian.org/bookworm/libsleef3 > [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: Add "--with-libsleef-lib" and "--with-libsleef-include" options ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16234/files - new: https://git.openjdk.org/jdk/pull/16234/files/ee5caf6d..f3ff0672 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16234&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16234&range=05-06 Stats: 124 lines in 3 files changed: 67 ins; 33 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/16234.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16234/head:pull/16234 PR: https://git.openjdk.org/jdk/pull/16234 From xgong at openjdk.org Wed Dec 6 09:15:03 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Wed, 6 Dec 2023 09:15:03 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 16:26:02 GMT, Magnus Ihse Bursie wrote: >> Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains ten additional commits since the last revision: >> >> - Separate neon and sve functions into two source files >> - Merge branch 'jdk:master' into JDK-8312425 >> - Rename vmath to sleef in configure >> - Address review comments in build system >> - Add a bundled native lib in jdk as a bridge to libsleef >> - Merge 'jdk:master' into JDK-8312425 >> - Disable sleef by default >> - Merge 'jdk:master' into JDK-8312425 >> - 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF > > doc/building.md line 639: > >> 637: >> 638: libsleef, the [SIMD Library for Evaluating Elementary Functions]( >> 639: https://sleef.org/) is required when building libvmath.so on Linux/aarch64 > > This is incorrect. The library is not required, but if it is present, we will build libvmath with it. > > Edit: Or rather, this is misleading. Technically it is correct, since you state that it is required when building libvmath.so, but it is easy to mistake for being required for building the JDK. The reader presumably does not know what libvmath.so is or how it is used. > > Please rephrase this to so that it is clear that this is optional, but will provide performance benefits to the resulting JDK if present. You do not need to mention libvmath.so here, for no other dependency do we declare what parts of the JDK that require it -- it is not essential for this document. > > Also see if you can make this paragraph and the one at the end be a bit more tighter, not the last paragraph seems to be both repeat and contradict this one. Hi @magicus , the doc is updated. Thanks for your comment on this! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1416964362 From xgong at openjdk.org Wed Dec 6 09:15:05 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Wed, 6 Dec 2023 09:15:05 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5] In-Reply-To: References: <9VeMdTAJPaPZDg9ZW7FVJOf9XGl4gGqAS-2g8SFc9c0=.36792cd5-66d9-4abc-ba0c-aee3478627f4@github.com> Message-ID: <81tcSvaI8RdngSM-Nq051Vwl81u9NyEgVr8BCV44KYw=.74051087-23af-441f-825a-83dfa70c9426@github.com> On Tue, 5 Dec 2023 13:03:22 GMT, Magnus Ihse Bursie wrote: >> Thanks for the suggestion @magicus ! >> >> The check in current `lib-sleef.m4` is very common: >> >> - If user has specified libsleef root by '--with-libsleef', we assume it is the manually built sleef lib. So only `lib/` and `include/` is checked. And the flags are set based on that path. >> - If user has not specified the libsleef root, and no `SYSROOT` is set, we try `PKG_CHECK` (like what you suggested) >> - Otherwise, check `sleef.h` >> - We assume the sleef module is installed under one of the valid system paths if the header can be found. So just linking with `-lsleef` will success. >> >> It's an issue in current flow like what @theRealAph met. I will add the options like `--with-libsleef-lib` and `--with-libsleef-include` like ffi. Regarding to extending the check for`--with-libsleef`, I think we can just make it simple like what it is now. Or, we have to check all the potential valid lib paths like `lib/`, `lib64/`, or maybe `lib/aarch64-linux-gnu`. The same to the `include` part. @theRealAph @magicus , WDYT? > > I'm fine with adding just --with-libsleef-lib and --with-libsleef-include to specify them directly. This makes it at least possible to use, if not overly convenient, for people using a system like Andrew's. If it annoys someone too much, we can extend it later. Added these two options in latest commit. Thanks! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1416962901 From stefank at openjdk.org Wed Dec 6 09:18:35 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 6 Dec 2023 09:18:35 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 11:23:31 GMT, Aleksey Shipilev wrote: > [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro. > > The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there. > > Additional testing: > - [x] Large build matrix of server/zero builds > - [x] Linux AArch64 server fastdebug, `tier{1,2}` > - [x] Linux x86_64 server fastdebug, `tier{1,2}` The *2 has been a contentious point for a long time. This is what I think happened: In 'JDK-8049737: Contended Locking reorder and cache line bucket' DEFAULT_CACHE_LINE was increased from 64 to 128. This was done for the ObjectMonitors, but it had an effect on other data structures in other parts of the JVM (Esp. in the GC). There were performance measurements done to try to see if the 64 to 128 change made any difference, but AFAIK, no performance difference could be seen, but the *2 change was left in place anyways. And then 'JDK-8235931: add OM_CACHE_LINE_SIZE and use smaller size on SPARCv9 and X64' came along and changed the ObjectMonitor code to use its own define without the *2 but leaving the DEFAULT_CACHE_LINE_SIZE and padding code to still use *2! 
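To make the distinction under discussion concrete, here is a minimal sketch that keeps "expected cache line size" and "padding size" as two separate constants. The names `kCacheLineSize`, `kPaddingSize`, `Padded` and `Counters` are illustrative assumptions, not the HotSpot macros or types, and the values 64 and 2x are only examples of the historical x86 choice described above.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative stand-ins, not the HotSpot definitions:
// what we expect a cache line to be on the platform ...
constexpr std::size_t kCacheLineSize = 64;
// ... and how far apart we keep fields that must not interfere.
constexpr std::size_t kPaddingSize = 2 * kCacheLineSize;

// Wraps a value so that it occupies its own padding-sized, aligned slot.
template <typename T, std::size_t Pad>
struct alignas(Pad) Padded {
  T value;
};

using PaddedLong = Padded<long, kPaddingSize>;

// Two counters written by different threads; each gets its own slot.
struct Counters {
  PaddedLong a;
  PaddedLong b;
};

static_assert(sizeof(PaddedLong) == kPaddingSize,
              "each padded field fills exactly one padding slot");
static_assert(sizeof(Counters) == 2 * kPaddingSize,
              "fields are kPaddingSize bytes apart");
```

Changing the padding policy then only touches `kPaddingSize`; code that needs the cache line size for other purposes keeps using `kCacheLineSize`.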
We have asked Intel if we really should be padding with two cache lines as we say here:

// Hardware prefetchers on current implementations may pull 2 cache lines
// on access, therefore we pessimistically assume twice the cache line size
// for padding.
#define DEFAULT_PADDING_SIZE (DEFAULT_CACHE_LINE_SIZE*2)

and their answer was that their hardware stopped doing that over 10 years ago (this question was asked a few years ago).

For ZGC we tried to poke at this a bit but gave up and added our own:

const size_t ZPlatformCacheLineSize = 64;

So, with all that said. Do we really need to keep this *2 in the x64 code?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1842484452

From kim.barrett at oracle.com Wed Dec 6 09:21:15 2023
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 6 Dec 2023 09:21:15 +0000
Subject: Use of C++ dynamic global object initialization with thread guards
In-Reply-To: <87fs0izasf.fsf@oldenburg.str.redhat.com>
References: <87fs0izasf.fsf@oldenburg.str.redhat.com>
Message-ID: <514F5A61-E3C2-4B24-A567-EF19C4292989@oracle.com>

> On Dec 4, 2023, at 2:28 AM, Florian Weimer wrote:
>
> As far as I understand it, the Hotspot C++ style guide advises not to use
> C++ run-time library features.

Largely true.

> However, it seems that the use of
> dynamic initialization guards (that involve calls to __cxa_guard_acquire
> and __cxa_guard_release) has increased quite a bit over the years.

Also true. We weren't supposed to be using that at all before adopting C++11/14, which defined function scoped static initialization to be thread safe. But now that we have that feature, we're allowed to use it, and it's sometimes/often better than hand-rolled alternatives, at least as far as source code simplicity and readability.
> The implementation of __cxa_guard_acquire is not entirely trivial
> because it detects recursive initialization and throws
> __gnu_cxx::recursive_init_error, which means that it pulls in the C++
> unwinder (at least with a traditional GNU/Linux build of libstdc++.a).

Does it? Seems like it shouldn't. We build with -fno-exceptions, and the definition of throw_recursive_init_exception is conditionalized on __cpp_exceptions, only throwing when that macro is defined. It calls __builtin_trap() if that macro isn't defined. And that's the sort of thing I expected to find. For something from the runtime library to throw when under the influence of -fno-exceptions seems like a bug.

> Furthermore, most uses of C++ dynamic initialization involve a
> computation that is idempotent and have unused bit patterns in the
> initialized value. This means that a separate guard variable is not
> needed, and a simple atomic store/atomic load could be used.

That's the kind of complexity we'd really rather avoid if possible.

> In other cases, the use of global objects seems unnecessary. For
> example, src/hotspot/share/jfr/recorder/checkpoint/jfrCheckpointManager.cpp
> has a dynamically initialized static variable max_elem_size:
>
> [... code snipped ...]
> The min_element_size() member function is inline and just returns
> _min_element_size, which is declared const, so it cannot change over
> time. This means that caching that value is pointless, and the static
> should probably be removed.

I agree this seems like an inappropriate use of static. It only works here because the manager object being referenced is a singleton.

> Would it make sense to minimize the use of __cxa_guard_acquire and
> __cxa_guard_release? There are currently 400 such calls, but many of
> them appear in templated code, so I could get it down to ~80 calls with
> about a day of work.

I don't think so, since I currently don't think the described problem (pulling in exception unwinding) actually exists.
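For readers following the thread, the two styles being compared can be sketched as below. This is an illustration only, not code from HotSpot: `compute_elem_size`, `elem_size_guarded`, `elem_size_atomic` and the value 4096 are hypothetical. The first variant relies on thread-safe function-local static initialization (C++11 and later), for which compilers targeting the Itanium C++ ABI typically emit a guard variable; the second is the hand-rolled atomic alternative that uses 0 as the "not yet computed" bit pattern.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Counts how often the (hypothetical) initializer actually runs.
std::atomic<int> compute_calls{0};

// Hypothetical idempotent computation whose result is never 0.
std::size_t compute_elem_size() {
  compute_calls.fetch_add(1, std::memory_order_relaxed);
  return 4096;
}

// Variant 1: function-local static with dynamic initialization.
// The language guarantees the initializer runs once, even under races.
std::size_t elem_size_guarded() {
  static const std::size_t cached = compute_elem_size();
  return cached;
}

// Variant 2: hand-rolled caching without a guard variable.
// The atomic is constant-initialized to 0, meaning "not computed yet".
// Racing threads may compute the value more than once, which is harmless
// only because the computation is idempotent.
std::size_t elem_size_atomic() {
  static std::atomic<std::size_t> cached{0};
  std::size_t v = cached.load(std::memory_order_relaxed);
  if (v == 0) {
    v = compute_elem_size();
    cached.store(v, std::memory_order_relaxed);
  }
  return v;
}
```

The second variant avoids the guard but puts the correctness argument (idempotence, a spare bit pattern, memory ordering) on the author, which is the extra complexity the reply above prefers to avoid.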
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: Message signed with OpenPGP
URL:

From fyang at openjdk.org Wed Dec 6 09:27:46 2023
From: fyang at openjdk.org (Fei Yang)
Date: Wed, 6 Dec 2023 09:27:46 GMT
Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v9]
In-Reply-To: References: Message-ID:

On Tue, 5 Dec 2023 12:57:05 GMT, Yuri Gaevsky wrote:

>> Hello All,
>>
>> Please review these changes to support _vectorizedHashCode intrinsic on
>> RISC-V platform. The patch adds the "scalar" code for the intrinsic without
>> usage of any RVV instruction but provides manual unrolling of the appropriate
>> loop. The code with usage of RVV instruction could be added as follow-up of
>> the patch or independently.
>>
>> Thanks,
>> -Yuri Gaevsky
>>
>> P.S. My OCA has been accepted recently (ygaevsky).
>>
>> ### Correctness checks
>>
>> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux.
>>
>> ### Performance results (the numbers for non-ints are similar)
>>
>> #### StarFive JH7110 board:
>>
>>
>> ArraysHashCode:                  without intrinsic          with intrinsic
>> -------------------------------------------------------------------------------
>> Benchmark  (size)  Mode  Cnt       Score     Error        Score     Error  Units
>> -------------------------------------------------------------------------------
>> multiints       0  avgt   30       2.658 ±   0.001        2.661 ±   0.004  ns/op
>> multiints       1  avgt   30       4.881 ±   0.011        4.892 ±   0.015  ns/op
>> multiints       2  avgt   30      16.109 ±   0.041       10.451 ±   0.075  ns/op
>> multiints       3  avgt   30      14.873 ±   0.068       11.753 ±   0.024  ns/op
>> multiints       4  avgt   30      17.283 ±   0.078       13.176 ±   0.044  ns/op
>> multiints       5  avgt   30      19.691 ±   0.136       14.723 ±   0.046  ns/op
>> multiints       6  avgt   30      21.727 ±   0.166       15.463 ±   0.124  ns/op
>> multiints       7  avgt   30      23.790 ±   0.126       18.298 ±   0.059  ns/op
>> multiints       8  avgt   30      23.527 ±   0.116       18.267 ±   0.046  ns/op
>> multiints       9  avgt   30      27.981 ±   0.303       20.453 ±   0.069  ns/op
>> multiints      10  avgt   30      26.947 ±   0.215       20.541 ±   0.051  ns/op
>> multiints      50  avgt   30      95.373 ±   0.588       69.238 ±   0.208  ns/op
>> multiints     100  avgt   30     177.109 ±   0.525      137.852 ±   0.417  ns/op
>> multiints     200  avgt   30     341.074 ±   1.363      296.832 ±   0.725  ns/op
>> multiints     500  avgt   30     847.993 ±   1.713      752.415 ±   1.918  ns/op
>> multiints    1000  avgt   30    1610.199 ±   5.424     1426.112 ±   3.407  ns/op
>> multiints   10000  avgt   30   16234.260 ±  26.789    14447.936 ±  26.345  ns/op
>> multiints  100000  avgt   30  170726.025 ± 184.003   152587.649 ± 381.964  ns/op
>> ---------------------------------------...
>
> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:
>
> Changed lb-->lbu for T_BOOLEAN and iRegINoSp-->iRegLNoSp for tmp2/tmp3.

So I tried this on sifive unmatched. Unfortunately, I see some performance regressions with this change.

Before:

Benchmark                   (size)  Mode  Cnt      Score      Error  Units
ArraysHashCode.bytes             1  avgt   15     19.737 ±    5.405  ns/op
ArraysHashCode.bytes            10  avgt   15     56.102 ±    3.191  ns/op
ArraysHashCode.bytes           100  avgt   15    317.126 ±    3.452  ns/op
ArraysHashCode.bytes         10000  avgt   15  28380.470 ±   20.709  ns/op
ArraysHashCode.chars             1  avgt   15     15.532 ±    2.623  ns/op
ArraysHashCode.chars            10  avgt   15     59.603 ±    2.440  ns/op
ArraysHashCode.chars           100  avgt   15    333.995 ±    3.834  ns/op
ArraysHashCode.chars         10000  avgt   15  29464.768 ±   16.751  ns/op
ArraysHashCode.ints              1  avgt   15     16.031 ±    2.820  ns/op
ArraysHashCode.ints             10  avgt   15     59.506 ±    3.980  ns/op
ArraysHashCode.ints            100  avgt   15    335.514 ±    4.695  ns/op
ArraysHashCode.ints          10000  avgt   15  33966.175 ±  929.859  ns/op
ArraysHashCode.multibytes        1  avgt   15      7.840 ±    0.110  ns/op
ArraysHashCode.multibytes       10  avgt   15     34.727 ±    0.547  ns/op
ArraysHashCode.multibytes      100  avgt   15    193.085 ±    0.814  ns/op
ArraysHashCode.multibytes    10000  avgt   15  16610.239 ±   27.290  ns/op
ArraysHashCode.multichars        1  avgt   15      7.853 ±    0.092  ns/op
ArraysHashCode.multichars       10  avgt   15     35.059 ±    0.241  ns/op
ArraysHashCode.multichars      100  avgt   15    203.483 ±    0.413  ns/op
ArraysHashCode.multichars    10000  avgt   15  18819.804 ±   75.487  ns/op
ArraysHashCode.multiints         1  avgt   15      7.878 ±    0.104  ns/op
ArraysHashCode.multiints        10  avgt   15     35.232 ±    0.196  ns/op
ArraysHashCode.multiints       100  avgt   15    211.087 ±    1.914  ns/op
ArraysHashCode.multiints     10000  avgt   15  30172.693 ± 1447.757  ns/op
ArraysHashCode.multishorts       1  avgt   15      7.788 ±    0.046  ns/op
ArraysHashCode.multishorts      10  avgt   15     35.504 ±    0.465  ns/op
ArraysHashCode.multishorts     100  avgt   15    203.530 ±    0.342  ns/op
ArraysHashCode.multishorts   10000  avgt   15  18801.799 ±   77.159  ns/op
ArraysHashCode.shorts            1  avgt   15     19.685 ±    5.413  ns/op
ArraysHashCode.shorts           10  avgt   15     59.583 ±    4.684  ns/op
ArraysHashCode.shorts          100  avgt   15    333.170 ±    5.367  ns/op
ArraysHashCode.shorts        10000  avgt   15  29455.665 ±   13.302  ns/op

After:

Benchmark                   (size)  Mode  Cnt      Score      Error  Units
ArraysHashCode.bytes             1  avgt   15     18.575 ±    3.780  ns/op
ArraysHashCode.bytes            10  avgt   15     55.394 ±    4.610  ns/op
ArraysHashCode.bytes           100  avgt   15    340.807 ±    3.387  ns/op
ArraysHashCode.bytes         10000  avgt   15  31506.478 ±   27.694  ns/op
ArraysHashCode.chars             1  avgt   15     15.966 ±    2.291  ns/op
ArraysHashCode.chars            10  avgt   15     56.524 ±    4.301  ns/op
ArraysHashCode.chars           100  avgt   15    343.389 ±    3.272  ns/op
ArraysHashCode.chars         10000  avgt   15  31520.717 ±   13.290  ns/op
ArraysHashCode.ints              1  avgt   15     16.078 ±    3.977  ns/op
ArraysHashCode.ints             10  avgt   15     55.467 ±    2.845  ns/op
ArraysHashCode.ints            100  avgt   15    344.500 ±    3.531  ns/op
ArraysHashCode.ints          10000  avgt   15  36234.542 ±   39.191  ns/op
ArraysHashCode.multibytes        1  avgt   15      7.816 ±    0.072  ns/op
ArraysHashCode.multibytes       10  avgt   15     29.617 ±    0.257  ns/op
ArraysHashCode.multibytes      100  avgt   15    183.986 ±    0.236  ns/op
ArraysHashCode.multibytes    10000  avgt   15  18349.268 ±   28.711  ns/op
ArraysHashCode.multichars        1  avgt   15      7.821 ±    0.050  ns/op
ArraysHashCode.multichars       10  avgt   15     29.293 ±    0.273  ns/op
ArraysHashCode.multichars      100  avgt   15    186.538 ±    0.404  ns/op
ArraysHashCode.multichars    10000  avgt   15  20149.487 ±   87.300  ns/op
ArraysHashCode.multiints         1  avgt   15      7.847 ±    0.044  ns/op
ArraysHashCode.multiints        10  avgt   15     29.765 ±    1.082  ns/op
ArraysHashCode.multiints       100  avgt   15    193.887 ±    0.360  ns/op
ArraysHashCode.multiints     10000  avgt   15  30997.145 ±  420.328  ns/op
ArraysHashCode.multishorts       1  avgt   15      7.856 ±    0.128  ns/op
ArraysHashCode.multishorts      10  avgt   15     29.231 ±    0.434  ns/op
ArraysHashCode.multishorts     100  avgt   15    187.044 ±    0.289  ns/op
ArraysHashCode.multishorts   10000  avgt   15  20146.327 ±   89.985  ns/op
ArraysHashCode.shorts            1  avgt   15     15.162 ±    4.191  ns/op
ArraysHashCode.shorts           10  avgt   15     54.279 ±    2.661  ns/op
ArraysHashCode.shorts          100  avgt   15    343.085 ±    4.204  ns/op
ArraysHashCode.shorts        10000  avgt   15  31536.455 ±   23.874  ns/op

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1842496746

From stefank at openjdk.org Wed Dec 6 09:28:09 2023
From: stefank at openjdk.org (Stefan Karlsson)
Date: Wed, 6 Dec 2023 09:28:09 GMT
Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC [v5]
In-Reply-To: References: Message-ID:

> There is code in `os::large_page_init()` that checks `/sys/kernel/mm/transparent_hugepage/enabled` and forcefully turns off `UseTransparentHugePages` if anonymous THPs are disabled in the OS:
>
>
> if (UseTransparentHugePages && !HugePages::supports_thp()) {
>   if (!FLAG_IS_DEFAULT(UseTransparentHugePages)) {
>     log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.");
>   }
>   UseLargePages = UseTransparentHugePages = false;
>   return;
> }
>
>
> This is problematic because ZGC doesn't use the `/sys/kernel/mm/transparent_hugepage/enabled` THPs, but instead the `/sys/kernel/mm/transparent_hugepage/shmem_enabled` THPs. So, with the following settings:
>
> /sys/kernel/mm/transparent_hugepage/enabled: never
> /sys/kernel/mm/transparent_hugepage/shmem_enabled: advise
>
>
> the above code will force ZGC to run without THPs.
>
> This PR is a proposal for how to work around this in the ZGC code without disturbing the rest of the JVM too much. The patch:
>
> 1) remembers the initial values for UseLargePages and UseTransparentHugePages and saves those so that ZGC can continue using THPs even though they have been disabled for the rest of the JVM.
>
> 2) adds better logic to figure out if ZGC is actually going to get THPs for the heap or not. This is then used to more accurately log the current situation and allows for a precise usage of `madvise + MADV_HUGEPAGE`.
>
> 3) tweaks the generic pagesize logging to better reflect the situation when anonymous THPs are disabled but shared memory THPs are enabled and ZGC is used.
>
> The result of this change can be seen in these tables:
>
> ZGC large pages log output:
>
> E (T) = Enabled (Transparent)
> E (T, OS) = Enabled (Transparent, OS enforced)
> D = Disabled
> D (OS) = Disabled (OS enforced)
>
> -XX:+UseTransparentHugePages
>
> shmem \ anon | always | madvise | never
> -------------+--------+---------+-------
> always       | E (T)  | E (T)   | E (T)
> within_size  | E (T)  | E (T)   | E (T)
> advise       | E (T)  | E (T)   | E (T)
> never        | D (OS) | D (OS)  | D (OS)
> deny         | D (OS) | D (OS)  | D (OS)
> force        | E (T)  | E (T)   | E (T)
>
> -XX:-UseTransparentHugePages
>
> shmem \ anon | always    | madvise   | never
> -------------+-----------+-----------+----------
> always       | E (T, OS) | E (T, OS) | E (T, OS)
> within_size  | E (T, OS) | E (T, OS) | E (T, OS)
> advise       | D         | D         | D
> never        | D         | D         | D
> deny         | D         | D         | D
> force ...

Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase.
The pull request contains five additional commits since the last revision: - Merge remote-tracking branch 'upstream/master' into 8319969_zgc_thp_workaround - More precise THP warning messages - Move _thp_requested out from HugePages - Small tweaks - 8319969: os::large_page_init() turns off THPs for ZGC ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16690/files - new: https://git.openjdk.org/jdk/pull/16690/files/ec05ba30..532ecdd3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16690&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16690&range=03-04 Stats: 765672 lines in 3725 files changed: 156377 ins; 534341 del; 74954 mod Patch: https://git.openjdk.org/jdk/pull/16690.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16690/head:pull/16690 PR: https://git.openjdk.org/jdk/pull/16690 From iwalulya at openjdk.org Wed Dec 6 09:38:33 2023 From: iwalulya at openjdk.org (Ivan Walulya) Date: Wed, 6 Dec 2023 09:38:33 GMT Subject: RFR: 8319313: G1: Rename G1EvacFailureInjector appropriately In-Reply-To: References: Message-ID: On Thu, 30 Nov 2023 15:29:22 GMT, Thomas Schatzl wrote: > Hi all, > > please review this rename of `G1EvacFailureInjector` and associated options to `G1AllocationFailureInjector` according to the results of the discussion for the review of [JDK-8318706](https://bugs.openjdk.org/browse/JDK-8318706). > > To facilitate review the first commit implements the renaming changes, the second moves the affected files only. > > Testing: gha, local gc/g1 tests > > Thanks, > Thomas Marked as reviewed by iwalulya (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/16905#pullrequestreview-1767101961 From shade at openjdk.org Wed Dec 6 09:44:35 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 6 Dec 2023 09:44:35 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 09:15:35 GMT, Stefan Karlsson wrote: > So, with all that said. Do we really need to keep this *2 in the x64 code? Maybe, maybe not. Anyhow, let's not change multiple things at the same time. I am provisionally good with dropping `*2`, as long as we have a targeted study on its effects or the absence of them, and/or Intel and AMD folks agree this should go. What this change does is clearly separating the notions of "this is what we expect the cache line to be on given platform" (`DEFAULT_CACHE_LINE_SIZE`) and "this is how by how much we should pad" (`DEFAULT_PADDING_SIZE`). While I agree both might be the same value for known platforms, they convey different _intents_. This is also why, AFAICS, C++ spec calls it `hardware_{constructive,destructive}_interference_size`, not just "cache line size". This PR effectively renames `DEFAULT_CACHE_LINE_SIZE` -> `DEFAULT_PADDING_SIZE`. We leave `DEFAULT_CACHE_LINE_SIZE` for cases where we want to know CL size outside the padding contexts. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1842526833 From stefank at openjdk.org Wed Dec 6 10:01:35 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 6 Dec 2023 10:01:35 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes In-Reply-To: References: Message-ID: <501CMWdxAr9LoeE8uLipvQDWmf9g1hfV0M8Tm7f0iOU=.0122fdac-61ed-4ff4-91a4-0495e980d929@github.com> On Tue, 5 Dec 2023 11:23:31 GMT, Aleksey Shipilev wrote: > [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. 
At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro. > > The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there. > > Additional testing: > - [x] Large build matrix of server/zero builds > - [x] Linux AArch64 server fastdebug, `tier{1,2}` > - [x] Linux x86_64 server fastdebug, `tier{1,2}` Sure. It think it is a good patch. However, given that you added this commment: // Hardware prefetchers on current implementations may pull 2 cache lines // on access, therefore we pessimistically assume twice the cache line size // for padding. Do you have anything that backs up the claim that this is the case for "current implementations"? Maybe @sviswa7 can help answering if this is still the case for Intel hardware? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1842555270 From shade at openjdk.org Wed Dec 6 10:10:35 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 6 Dec 2023 10:10:35 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes In-Reply-To: <501CMWdxAr9LoeE8uLipvQDWmf9g1hfV0M8Tm7f0iOU=.0122fdac-61ed-4ff4-91a4-0495e980d929@github.com> References: <501CMWdxAr9LoeE8uLipvQDWmf9g1hfV0M8Tm7f0iOU=.0122fdac-61ed-4ff4-91a4-0495e980d929@github.com> Message-ID: On Wed, 6 Dec 2023 09:58:42 GMT, Stefan Karlsson wrote: > Sure. It think it is a good patch. However, given that you added this commment: > > ``` > // Hardware prefetchers on current implementations may pull 2 cache lines > // on access, therefore we pessimistically assume twice the cache line size > // for padding. 
> ``` > > Do you have anything that backs up the claim that this is the case for "current implementations"? I was merely trying to explain why `*2` is even there. The common explanation is referring to a "common wisdom" about adjacent cache line prefetchers. Granted, that might have been true only a decade ago, and it might not hold true anymore. I can change "current" to "some" or "some old" if you think that is more neutral. Again, I don't think this PR should be discussing `*2` thing. It should be a separate deep dive and clear few-liner change later. That reminds me we would probably change the default for `ContendedPaddingWidth`, probably by tying it to `DEFAULT_PADDING_SIZE` once this lands. We should not be doing it here, though, because it would have footprint implications on platforms with large cache lines, like S390X. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1842566889 From stefank at openjdk.org Wed Dec 6 10:16:32 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 6 Dec 2023 10:16:32 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 11:23:31 GMT, Aleksey Shipilev wrote: > [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro. > > The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there. 
> > Additional testing: > - [x] Large build matrix of server/zero builds > - [x] Linux AArch64 server fastdebug, `tier{1,2}` > - [x] Linux x86_64 server fastdebug, `tier{1,2}` Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16973#pullrequestreview-1767183316 From stefank at openjdk.org Wed Dec 6 10:16:34 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 6 Dec 2023 10:16:34 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes In-Reply-To: References: <501CMWdxAr9LoeE8uLipvQDWmf9g1hfV0M8Tm7f0iOU=.0122fdac-61ed-4ff4-91a4-0495e980d929@github.com> Message-ID: On Wed, 6 Dec 2023 10:06:40 GMT, Aleksey Shipilev wrote: > > Sure. It think it is a good patch. However, given that you added this commment: > > ``` > > // Hardware prefetchers on current implementations may pull 2 cache lines > > // on access, therefore we pessimistically assume twice the cache line size > > // for padding. > > ``` > > > > > > > > > > > > > > > > > > > > > > > > Do you have anything that backs up the claim that this is the case for "current implementations"? > > I was merely trying to explain why `*2` is even there. The common explanation is referring to a "common wisdom" about adjacent cache line prefetchers. Granted, that might have been true only a decade ago, and it might not hold true anymore. I can change "current" to "some" or "some old" if you think that is more neutral. > > Again, I don't think this PR should be discussing `*2` thing. It should be a separate deep dive and clear few-liner change later. That reminds me we would probably change the default for `ContendedPaddingWidth`, probably by tying it to `DEFAULT_PADDING_SIZE` once this lands. We should not be doing it here, though, because it would have footprint implications on platforms with large cache lines, like S390X. 
Well I bring this up because you are adding a comment that further sediments the understanding of the need to use *2. As I said I think the patch looks good and I'm just taking the opportunity to talk about this now that someone is yet again juggling around this *2 value. Changing to "some" is probably a good idea. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1842577610 From shade at openjdk.org Wed Dec 6 10:28:00 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 6 Dec 2023 10:28:00 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes In-Reply-To: References: <501CMWdxAr9LoeE8uLipvQDWmf9g1hfV0M8Tm7f0iOU=.0122fdac-61ed-4ff4-91a4-0495e980d929@github.com> Message-ID: On Wed, 6 Dec 2023 10:13:38 GMT, Stefan Karlsson wrote: > Well I bring this up because you are adding a comment that further sediments the understanding of the need to use *2. As I said I think the patch looks good and I'm just taking the opportunity to talk about this now that someone is yet again juggling around this *2 value. Changing to "some" is probably a good idea. Yes, true. I rewrote the comment to hopefully much weaker wording that now implies we are not actually that sure `*2` adjustment is needed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1842594038 From shade at openjdk.org Wed Dec 6 10:27:58 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Wed, 6 Dec 2023 10:27:58 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v2] In-Reply-To: References: Message-ID: > [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. 
I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro. > > The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there. > > Additional testing: > - [x] Large build matrix of server/zero builds > - [x] Linux AArch64 server fastdebug, `tier{1,2}` > - [x] Linux x86_64 server fastdebug, `tier{1,2}` Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: - Better verbiage for *2 adjustment for x86_64 - Merge branch 'master' into JDK-8237842-cache-line-padding-defs - Work ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16973/files - new: https://git.openjdk.org/jdk/pull/16973/files/33c1d249..0e8929a5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16973&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16973&range=00-01 Stats: 31530 lines in 320 files changed: 9149 ins; 21344 del; 1037 mod Patch: https://git.openjdk.org/jdk/pull/16973.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16973/head:pull/16973 PR: https://git.openjdk.org/jdk/pull/16973 From stefank at openjdk.org Wed Dec 6 10:33:35 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Wed, 6 Dec 2023 10:33:35 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v2] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 10:27:58 GMT, Aleksey Shipilev wrote: >> [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. 
Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro. >> >> The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there. >> >> Additional testing: >> - [x] Large build matrix of server/zero builds >> - [x] Linux AArch64 server fastdebug, `tier{1,2}` >> - [x] Linux x86_64 server fastdebug, `tier{1,2}` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Better verbiage for *2 adjustment for x86_64 > - Merge branch 'master' into JDK-8237842-cache-line-padding-defs > - Work Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16973#pullrequestreview-1767217881 From tschatzl at openjdk.org Wed Dec 6 10:37:43 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 6 Dec 2023 10:37:43 GMT Subject: RFR: 8321369: Unproblemlist gc/cslocker/TestCSLocker.java In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 02:52:29 GMT, David Holmes wrote: >> Hi all, >> >> please review this fix to unproblemlist gc/cs/TestCSLocker.java; the CR to fix this [JDK-8310480](https://bugs.openjdk.org/browse/JDK-8310480) has already been closed as duplicate of [JDK-8318706](https://bugs.openjdk.org/browse/JDK-8318706) that removed the GCLocker for G1 which is the cause for the issue. 
>> >> Note that the test has only been problemlisted for linux-x64 previously, so it has already been run for a long time for other platforms and other collectors, so my testing did not extensively try all the other platforms/gc combinations (I did try with Serial and Parallel a few times with no issues; ZGC is excluded anyway in the test, and Shenandoah also does not use the GCLocker). >> >> Testing: test case with g1, gha >> >> Thanks, >> Thomas > > Looks good and trivial. Thanks for cleaning up the PL. thanks @dholmes-ora for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16970#issuecomment-1842606045 From tschatzl at openjdk.org Wed Dec 6 10:37:45 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 6 Dec 2023 10:37:45 GMT Subject: Integrated: 8321369: Unproblemlist gc/cslocker/TestCSLocker.java In-Reply-To: References: Message-ID: On Tue, 5 Dec 2023 10:19:02 GMT, Thomas Schatzl wrote: > Hi all, > > please review this fix to unproblemlist gc/cs/TestCSLocker.java; the CR to fix this [JDK-8310480](https://bugs.openjdk.org/browse/JDK-8310480) has already been closed as duplicate of [JDK-8318706](https://bugs.openjdk.org/browse/JDK-8318706) that removed the GCLocker for G1 which is the cause for the issue. > > Note that the test has only been problemlisted for linux-x64 previously, so it has already been run for a long time for other platforms and other collectors, so my testing did not extensively try all the other platforms/gc combinations (I did try with Serial and Parallel a few times with no issues; ZGC is excluded anyway in the test, and Shenandoah also does not use the GCLocker). > > Testing: test case with g1, gha > > Thanks, > Thomas This pull request has now been integrated. 
Changeset: 7fbfb3b7 Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/7fbfb3b74a283261027e6c293e1a5dbc354cf0af Stats: 1 line in 1 file changed: 0 ins; 1 del; 0 mod 8321369: Unproblemlist gc/cslocker/TestCSLocker.java Reviewed-by: dholmes ------------- PR: https://git.openjdk.org/jdk/pull/16970 From jbachorik at openjdk.org Wed Dec 6 10:38:53 2023 From: jbachorik at openjdk.org (Jaroslav Bachorik) Date: Wed, 6 Dec 2023 10:38:53 GMT Subject: RFR: 8313816: Accessing jmethodID might lead to spurious crashes [v11] In-Reply-To: References: Message-ID: On Mon, 4 Dec 2023 23:32:52 GMT, David Holmes wrote: > The skara tooling does not currently support our rules but it remains as always that non-trivial Hotspot changes require two reviewers. Thanks, I will keep this in mind. And I apologise for not following the process, though not intentionally. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16662#issuecomment-1842611183 From fweimer at redhat.com Wed Dec 6 10:51:36 2023 From: fweimer at redhat.com (Florian Weimer) Date: Wed, 06 Dec 2023 11:51:36 +0100 Subject: Use of C++ dynamic global object initialization with thread guards In-Reply-To: <514F5A61-E3C2-4B24-A567-EF19C4292989@oracle.com> (Kim Barrett's message of "Wed, 6 Dec 2023 09:21:15 +0000") References: <87fs0izasf.fsf@oldenburg.str.redhat.com> <514F5A61-E3C2-4B24-A567-EF19C4292989@oracle.com> Message-ID: <87lea7d2o7.fsf@oldenburg.str.redhat.com> * Kim Barrett: >> The implementation of __cxa_guard_acquire is not entirely trivial >> because it detects recursive initialization and throws >> __gnu_cxx::recursive_init_error, which means that it pulls in the C++ >> unwinder (at least with a traditional GNU/Linux build of libstdc++.a). > > Does it? Seems like it shouldn't. We build with -fno-exceptions, and > the definition of throw_recursive_init_exception is conditionalized on > __cpp_exceptions, only throwing when that macro is defined. It calls > __builtin_trap() if that macro isn't defined.
With upstream GCC (and presumably most distributions), there's one libstdc++.a with one implementation of __cxa_guard_acquire, and it's built with exception support. It's supposed to be possible to build libstdc++ without exception support, but upstream GCC doesn't do this automatically for you if the target supports exception handling.

In principle, the GCC specs mechanism allows you to treat -fno-exceptions as a linker flag and link against a custom no-exceptions build of libstdc++.a. Maybe this is what your toolchain is doing if you don't see the unwinder symbols in your builds?

It should be easy enough to check if you have a build with a symbol table: look for a call to __cxa_throw in the disassembly of __cxa_guard_acquire.cold or __cxa_guard_acquire. One of our builds looks like this:

00000000002997df <__cxa_guard_acquire.cold>:
  2997df:  bf 08 00 00 00        mov    $0x8,%edi
  2997e4:  e8 77 c0 e9 00        callq  1135860 <__cxa_allocate_exception>
  2997e9:  48 89 c7              mov    %rax,%rdi
  2997ec:  48 89 c5              mov    %rax,%rbp
  2997ef:  e8 7c b6 e9 00        callq  1134e70 <_ZN9__gnu_cxx20recursive_init_errorC1Ev>
  2997f4:  48 8d 15 35 b6 e9 00  lea    0xe9b635(%rip),%rdx   # 1134e30 <_ZN9__gnu_cxx20recursive_init_errorD1Ev>
  2997fb:  48 8d 35 be e5 24 01  lea    0x124e5be(%rip),%rsi  # 14e7dc0 <_ZTIN9__gnu_cxx20recursive_init_errorE>
  299802:  48 89 ef              mov    %rbp,%rdi
  299805:  e8 46 b4 e9 00        callq  1134c50 <__cxa_throw>

Arguably this is a gap in what GCC provides. There are three ways to address it: ship a no-exceptions variant of libstdc++.a, introduce a non-throwing symbol that replaces __cxa_guard_acquire, or stop throwing altogether on reentrant initialization (which probably isn't required by the C++ standard). The libstdc++.a variant does not address the dynamically linked version of libstdc++, which conceptually has the same problem. But I don't expect libstdc++ to duplicate all potentially throwing functions to accommodate the non-exception case.
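
For readers less familiar with where the guard comes from, the following is a minimal sketch (not code from HotSpot or libstdc++) of a function-local static with a dynamic initializer, which is the construct for which GCC normally emits calls to __cxa_guard_acquire and __cxa_guard_release. The names `init_count`, `compute_initial_value`, `cached_value` and `demo_lazy_init` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical counter, used only to observe how often the initializer runs.
static int init_count = 0;

// Hypothetical initializer with a side effect, so the compiler cannot turn
// the initialization into a compile-time constant.
static int compute_initial_value() {
  ++init_count;
  return 42;
}

// Function-local static with a dynamic initializer. With GCC and the Itanium
// C++ ABI, the first-call initialization is normally protected by
// __cxa_guard_acquire/__cxa_guard_release (unless -fno-threadsafe-statics
// is given).
int cached_value() {
  static int value = compute_initial_value();
  return value;
}

// Calls the accessor twice and checks that the initializer ran only once.
void demo_lazy_init() {
  int first = cached_value();
  int second = cached_value();
  assert(first == 42 && second == 42);
  assert(init_count == 1);
}
```

Disassembling an object file built from a snippet like this is one way to see whether the guard calls, and the throwing cold path shown above, are present with a given toolchain.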
With increasing use of libstdc++ facilities in Hotspot, the libstdc++.a variant may be the only feasible long-term approach that is both maintainable on the GCC side and truly avoids an unwinder dependency. (I don't want to turn this into a Restaurant Sketch scenario: there is non-trivial libstdc++ usage beyond __cxa_guard_acquire in Hotspot. I just wanted to start with a fairly simple example.) Thanks, Florian From mgronlun at openjdk.org Wed Dec 6 11:00:06 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Wed, 6 Dec 2023 11:00:06 GMT Subject: RFR: 8211238: @Deprecated JFR event [v17] In-Reply-To: References: Message-ID: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. > > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: reviewer feedback ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16931/files - new: https://git.openjdk.org/jdk/pull/16931/files/9f6bc68a..db96aebb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16931&range=15-16 Stats: 3 lines in 3 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16931.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16931/head:pull/16931 PR: https://git.openjdk.org/jdk/pull/16931 From ihse at openjdk.org Wed Dec 6 11:48:44 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 6 Dec 2023 11:48:44 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v7] In-Reply-To: References: Message-ID: <0ya82eFBzsE0U96QMoP7OKmd7PAvW7GFXYP_iD_HTqE=.f12ca572-4a4e-4fbd-947b-e11f0aad81a1@github.com> On Wed, 6 Dec 2023 09:14:58 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like
`VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64. Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error. But a warning is printed out if people specifies a nonexistent library explicitly. >> >> Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedbacks from more hotspot experts. 
>> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Add "--with-libsleef-lib" and "--with-libsleef-include" options make/modules/jdk.incubator.vector/Lib.gmk line 45: > 43: $(eval $(call SetupJdkLibrary, BUILD_LIBVMATH, \ > 44: NAME := vmath, \ > 45: CFLAGS := $(CFLAGS_JDKLIB) $(LIBSLEEF_CFLAGS) -fvisibility=default, \ Why `-fvisibility=default`? (Sorry, only noticed this now) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1417156921 From ihse at openjdk.org Wed Dec 6 11:53:42 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Wed, 6 Dec 2023 11:53:42 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v7] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 09:14:58 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64. Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. 
Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error. But a warning is printed out if people specifies a nonexistent library explicitly. >> >> Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedbacks from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Add "--with-libsleef-lib" and "--with-libsleef-include" options All the makefile changes we've discussed previously now look good. However, I just noticed the additional -f flag. Why are you not exporting the functions from source code instead? That is the way we normally do it in JDK libraries. In your case, it seems like you only need to add the export to the macro. 
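
A minimal sketch of what exporting from source could look like, assuming a GCC-compatible compiler and a build that hides symbols by default (for example with `-fvisibility=hidden`). The macro and function names below (`DEMO_EXPORT`, `DEFINE_DEMO_WRAPPER`, `demo_double`, `demo_square`, `demo_check_wrappers`) are hypothetical and are not the names used in the PR:

```cpp
#include <cassert>

// Hypothetical export annotation: marks a symbol as visible outside the
// shared library even when the default visibility is hidden.
#if defined(__GNUC__)
#define DEMO_EXPORT __attribute__((visibility("default")))
#else
#define DEMO_EXPORT
#endif

// Hypothetical wrapper-generating macro. Putting the export annotation here
// exports every generated wrapper without a global compiler flag.
#define DEFINE_DEMO_WRAPPER(name, expr) \
  extern "C" DEMO_EXPORT double name(double x) { return (expr); }

DEFINE_DEMO_WRAPPER(demo_double, 2.0 * x)
DEFINE_DEMO_WRAPPER(demo_square, x * x)

// Small self-check that the generated wrappers behave as ordinary functions.
void demo_check_wrappers() {
  assert(demo_double(1.5) == 3.0);
  assert(demo_square(3.0) == 9.0);
}
```

With this shape, only the annotated wrappers would be exported and everything else in the library keeps the default hidden visibility, which is the reason to prefer it over passing `-fvisibility=default` for the whole library.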
------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1842720243 From tschatzl at openjdk.org Wed Dec 6 12:19:13 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 6 Dec 2023 12:19:13 GMT Subject: RFR: 8319313: G1: Rename G1EvacFailureInjector appropriately [v2] In-Reply-To: References: Message-ID: > Hi all, > > please review this rename of `G1EvacFailureInjector` and associated options to `G1AllocationFailureInjector` according to the results of the discussion for the review of [JDK-8318706](https://bugs.openjdk.org/browse/JDK-8318706). > > To facilitate review the first commit implements the renaming changes, the second moves the affected files only. > > Testing: gha, local gc/g1 tests > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: some additional minor change of "evac_failure" to "allocation_failure" ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16905/files - new: https://git.openjdk.org/jdk/pull/16905/files/0c3c1bc2..a161d1fb Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16905&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16905&range=00-01 Stats: 3 lines in 2 files changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16905.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16905/head:pull/16905 PR: https://git.openjdk.org/jdk/pull/16905 From duke at openjdk.org Wed Dec 6 12:39:45 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Wed, 6 Dec 2023 12:39:45 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v9] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 09:23:28 GMT, Fei Yang wrote: >> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: >> >> Changed lb-->lbu for T_BOOLEAN and iRegINoSp-->iRegLNoSp for tmp2/tmp3. > > So I tried this on sifive unmatched. 
Unfortunately, I see some performance regressions with this change. > Before: > > Benchmark (size) Mode Cnt Score Error Units > ArraysHashCode.bytes 1 avgt 15 19.737 ? 5.405 ns/op > ArraysHashCode.bytes 10 avgt 15 56.102 ? 3.191 ns/op > ArraysHashCode.bytes 100 avgt 15 317.126 ? 3.452 ns/op > ArraysHashCode.bytes 10000 avgt 15 28380.470 ? 20.709 ns/op > ArraysHashCode.chars 1 avgt 15 15.532 ? 2.623 ns/op > ArraysHashCode.chars 10 avgt 15 59.603 ? 2.440 ns/op > ArraysHashCode.chars 100 avgt 15 333.995 ? 3.834 ns/op > ArraysHashCode.chars 10000 avgt 15 29464.768 ? 16.751 ns/op > ArraysHashCode.ints 1 avgt 15 16.031 ? 2.820 ns/op > ArraysHashCode.ints 10 avgt 15 59.506 ? 3.980 ns/op > ArraysHashCode.ints 100 avgt 15 335.514 ? 4.695 ns/op > ArraysHashCode.ints 10000 avgt 15 33966.175 ? 929.859 ns/op > ArraysHashCode.multibytes 1 avgt 15 7.840 ? 0.110 ns/op > ArraysHashCode.multibytes 10 avgt 15 34.727 ? 0.547 ns/op > ArraysHashCode.multibytes 100 avgt 15 193.085 ? 0.814 ns/op > ArraysHashCode.multibytes 10000 avgt 15 16610.239 ? 27.290 ns/op > ArraysHashCode.multichars 1 avgt 15 7.853 ? 0.092 ns/op > ArraysHashCode.multichars 10 avgt 15 35.059 ? 0.241 ns/op > ArraysHashCode.multichars 100 avgt 15 203.483 ? 0.413 ns/op > ArraysHashCode.multichars 10000 avgt 15 18819.804 ? 75.487 ns/op > ArraysHashCode.multiints 1 avgt 15 7.878 ? 0.104 ns/op > ArraysHashCode.multiints 10 avgt 15 35.232 ? 0.196 ns/op > ArraysHashCode.multiints 100 avgt 15 211.087 ? 1.914 ns/op > ArraysHashCode.multiints 10000 avgt 15 30172.693 ? 1447.757 ns/op > ArraysHashCode.multishorts 1 avgt 15 7.788 ? 0.046 ns/op > ArraysHashCode.multishorts 10 avgt 15 35.504 ? 0.465 ns/op > ArraysHashCode.multishorts 100 avgt 15 203.530 ? 0.342 ns/op > ArraysHashCode.multishorts 10000 avgt 15 18801.799 ? 77.159 ns/op > ArraysHashCode.shorts 1 avgt 15 19.685 ? 5.413 ns/op > ArraysHashCode.shorts 10 avgt 15 59.583 ? 4.684 ns/op > ArraysHashCode.shorts 100 avgt 15 333.170 ? ... 
@RealFYang - many thanks for checking my results. I can confirm the regressions found by you on '_Sifive Unmatched_' and '_StarFive JH7110_' but not on '_T-Head RVB-ICE_'.

Sifive Unmatched:

Benchmark (size) Mode Cnt Score Error Score Error Units
ArraysHashCode.bytes 10 avgt 15 64.318 ± 2.035 54.062 ± 5.229 ns/op
ArraysHashCode.bytes 100 avgt 15 317.512 ± 1.944 352.153 ± 2.802 ns/op
ArraysHashCode.bytes 1000 avgt 15 2875.723 ± 5.773 3185.938 ± 11.533 ns/op
ArraysHashCode.bytes 10000 avgt 15 28386.639 ± 42.689 31602.956 ± 101.607 ns/op
ArraysHashCode.chars 10 avgt 15 61.976 ± 5.562 52.773 ± 0.572 ns/op
ArraysHashCode.chars 100 avgt 15 332.949 ± 5.650 347.923 ± 2.549 ns/op
ArraysHashCode.chars 1000 avgt 15 2991.818 ± 4.396 3186.808 ± 3.248 ns/op
ArraysHashCode.chars 10000 avgt 15 29528.339 ± 30.613 31580.887 ± 24.868 ns/op
ArraysHashCode.ints 10 avgt 15 62.328 ± 5.592 57.784 ± 5.973 ns/op
ArraysHashCode.ints 100 avgt 15 328.240 ± 0.556 342.637 ± 4.174 ns/op
ArraysHashCode.ints 1000 avgt 15 2984.384 ± 0.865 3175.596 ± 5.490 ns/op
ArraysHashCode.ints 10000 avgt 15 33830.448 ± 55.105 36310.776 ± 39.960 ns/op
ArraysHashCode.multibytes 10 avgt 15 34.593 ± 0.318 29.497 ± 0.164 ns/op
ArraysHashCode.multibytes 100 avgt 15 193.281 ± 0.548 184.273 ± 0.293 ns/op
ArraysHashCode.multibytes 1000 avgt 15 1651.863 ± 7.210 1816.328 ± 15.627 ns/op
ArraysHashCode.multibytes 10000 avgt 15 16653.946 ± 42.054 18371.511 ± 63.512 ns/op
ArraysHashCode.multichars 10 avgt 15 35.104 ± 0.228 29.222 ± 0.165 ns/op
ArraysHashCode.multichars 100 avgt 15 203.873 ± 0.170 187.397 ± 0.249 ns/op
ArraysHashCode.multichars 1000 avgt 15 1900.463 ± 5.472 2018.710 ± 7.485 ns/op
ArraysHashCode.multichars 10000 avgt 15 18991.838 ± 124.180 20257.406 ± 236.705 ns/op
ArraysHashCode.multiints 10 avgt 15 35.232 ± 0.159 29.439 ± 0.199 ns/op
ArraysHashCode.multiints 100 avgt 15 211.201 ± 0.330 197.014 ± 0.242 ns/op
ArraysHashCode.multiints 1000 avgt 15 2225.082 ± 13.261 2363.767 ± 7.716 ns/op
ArraysHashCode.multiints 10000 avgt 15 31441.156 ± 574.230 33215.921 ± 392.809 ns/op
ArraysHashCode.multishorts 10 avgt 15 35.068 ± 0.164 29.237 ± 0.103 ns/op
ArraysHashCode.multishorts 100 avgt 15 203.904 ± 0.393 186.776 ± 0.481 ns/op
ArraysHashCode.multishorts 1000 avgt 15 1897.351 ± 7.325 2019.786 ± 6.929 ns/op
ArraysHashCode.multishorts 10000 avgt 15 18907.910 ± 77.537 20349.999 ± 363.546 ns/op
ArraysHashCode.shorts 10 avgt 15 56.984 ± 0.414 54.080 ± 2.779 ns/op
ArraysHashCode.shorts 100 avgt 15 327.769 ± 2.124 347.356 ± 3.553 ns/op
ArraysHashCode.shorts 1000 avgt 15 2988.008 ± 8.788 3175.180 ± 1.255 ns/op
ArraysHashCode.shorts 10000 avgt 15 29520.151 ± 58.051 31577.025 ± 66.598 ns/op

T-Head RVB-ICE:

Benchmark (size) Mode Cnt Score Error Score Error Units
ArraysHashCode.bytes 10 avgt 15 51.762 ± 0.378 46.688 ± 0.274 ns/op
ArraysHashCode.bytes 100 avgt 15 282.922 ± 0.891 228.657 ± 1.265 ns/op
ArraysHashCode.bytes 1000 avgt 15 2550.826 ± 3.773 1790.795 ± 3.891 ns/op
ArraysHashCode.bytes 10000 avgt 15 25165.387 ± 72.982 17510.131 ± 30.293 ns/op
ArraysHashCode.chars 10 avgt 15 53.831 ± 0.225 46.425 ± 0.231 ns/op
ArraysHashCode.chars 100 avgt 15 285.555 ± 0.813 237.081 ± 1.216 ns/op
ArraysHashCode.chars 1000 avgt 15 2558.697 ± 6.253 1893.147 ± 3.187 ns/op
ArraysHashCode.chars 10000 avgt 15 25162.198 ± 67.487 18558.521 ± 49.710 ns/op
ArraysHashCode.ints 10 avgt 15 52.046 ± 0.170 46.364 ± 0.240 ns/op
ArraysHashCode.ints 100 avgt 15 285.322 ± 0.924 238.531 ± 0.901 ns/op
ArraysHashCode.ints 1000 avgt 15 2557.016 ± 6.214 1892.898 ± 3.588 ns/op
ArraysHashCode.ints 10000 avgt 15 25272.506 ± 116.785 18599.428 ± 45.966 ns/op
ArraysHashCode.multibytes 10 avgt 15 26.190 ± 0.129 18.942 ± 0.103 ns/op
ArraysHashCode.multibytes 100 avgt 15 160.660 ± 0.400 116.278 ± 0.287 ns/op
ArraysHashCode.multibytes 1000 avgt 15 1366.136 ± 9.128 908.505 ± 2.456 ns/op
ArraysHashCode.multibytes 10000 avgt 15 13290.571 ± 21.784 8975.082 ± 18.118 ns/op
ArraysHashCode.multichars 10 avgt 15 26.630 ± 0.153 19.760 ± 0.178 ns/op
ArraysHashCode.multichars 100 avgt 15 164.371 ± 0.438 118.326 ± 0.334 ns/op
ArraysHashCode.multichars 1000 avgt 15 1399.774 ± 3.369 1031.653 ± 2.678 ns/op
ArraysHashCode.multichars 10000 avgt 15 13321.995 ± 27.186 9653.414 ± 44.167 ns/op
ArraysHashCode.multiints 10 avgt 15 25.878 ± 0.093 19.145 ± 0.217 ns/op
ArraysHashCode.multiints 100 avgt 15 169.129 ± 0.551 126.062 ± 0.239 ns/op
ArraysHashCode.multiints 1000 avgt 15 1405.709 ± 8.575 1046.932 ± 3.110 ns/op
ArraysHashCode.multiints 10000 avgt 15 13712.716 ± 28.671 10414.196 ± 19.044 ns/op
ArraysHashCode.multishorts 10 avgt 15 26.614 ± 0.240 19.742 ± 0.169 ns/op
ArraysHashCode.multishorts 100 avgt 15 164.488 ± 0.296 119.336 ± 0.434 ns/op
ArraysHashCode.multishorts 1000 avgt 15 1396.339 ± 3.062 1032.205 ± 4.604 ns/op
ArraysHashCode.multishorts 10000 avgt 15 13475.962 ± 27.035 9694.721 ± 38.618 ns/op
ArraysHashCode.shorts 10 avgt 15 52.145 ± 0.671 50.525 ± 0.560 ns/op
ArraysHashCode.shorts 100 avgt 15 284.563 ± 0.663 236.216 ± 1.160 ns/op
ArraysHashCode.shorts 1000 avgt 15 2565.527 ± 5.313 1895.385 ± 3.632 ns/op
ArraysHashCode.shorts 10000 avgt 15 25160.261 ± 78.896 18562.613 ± 62.708 ns/op

StarFive JH7110:

Benchmark (size) Mode Cnt Score Error Score Error Units
ArraysHashCode.bytes 10 avgt 15 40.824 ± 0.148 39.565 ± 0.069 ns/op
ArraysHashCode.bytes 100 avgt 15 250.700 ± 1.212 268.188 ± 0.864 ns/op
ArraysHashCode.bytes 1000 avgt 15 2290.265 ± 5.880 2524.649 ± 5.811 ns/op
ArraysHashCode.bytes 10000 avgt 15 22593.288 ± 4.222 25114.007 ± 41.732 ns/op
ArraysHashCode.chars 10 avgt 15 45.736 ± 0.356 40.116 ± 0.043 ns/op
ArraysHashCode.chars 100 avgt 15 261.146 ± 0.383 270.978 ± 0.697 ns/op
ArraysHashCode.chars 1000 avgt 15 2371.921 ± 0.679 2526.244 ± 6.915 ns/op
ArraysHashCode.chars 10000 avgt 15 23421.998 ± 2.523 25095.058 ± 71.641 ns/op
ArraysHashCode.ints 10 avgt 15 45.419 ± 0.007 40.214 ± 0.171 ns/op
ArraysHashCode.ints 100 avgt 15 262.007 ± 1.421 270.151 ± 0.429 ns/op
ArraysHashCode.ints 1000 avgt 15 2371.512 ± 0.402 2525.866 ± 1.160 ns/op
ArraysHashCode.ints 10000 avgt 15 29589.412 ± 13.383 31285.973 ± 163.943 ns/op
ArraysHashCode.multibytes 10 avgt 15 27.079 ± 0.172 22.170 ± 0.114 ns/op
ArraysHashCode.multibytes 100 avgt 15 155.937 ± 0.745 146.639 ± 0.165 ns/op
ArraysHashCode.multibytes 1000 avgt 15 1291.860 ± 2.052 1415.878 ± 2.299 ns/op
ArraysHashCode.multibytes 10000 avgt 15 12777.667 ± 2.627 14123.991 ± 68.454 ns/op
ArraysHashCode.multichars 10 avgt 15 28.005 ± 0.136 22.282 ± 0.077 ns/op
ArraysHashCode.multichars 100 avgt 15 166.277 ± 0.593 151.245 ± 0.524 ns/op
ArraysHashCode.multichars 1000 avgt 15 1432.254 ± 5.252 1513.058 ± 5.612 ns/op
ArraysHashCode.multichars 10000 avgt 15 14125.547 ± 52.368 15030.018 ± 89.649 ns/op
ArraysHashCode.multiints 10 avgt 15 26.678 ± 0.220 22.650 ± 0.260 ns/op
ArraysHashCode.multiints 100 avgt 15 179.278 ± 0.316 165.724 ± 0.606 ns/op
ArraysHashCode.multiints 1000 avgt 15 1605.150 ± 2.885 1684.205 ± 2.334 ns/op
ArraysHashCode.multiints 10000 avgt 15 16517.998 ± 71.869 17220.866 ± 65.161 ns/op
ArraysHashCode.multishorts 10 avgt 15 27.507 ± 0.133 22.270 ± 0.056 ns/op
ArraysHashCode.multishorts 100 avgt 15 166.159 ± 0.377 152.968 ± 0.696 ns/op
ArraysHashCode.multishorts 1000 avgt 15 1430.556 ± 1.569 1509.756 ± 1.608 ns/op
ArraysHashCode.multishorts 10000 avgt 15 14114.634 ± 5.609 14992.519 ± 11.684 ns/op
ArraysHashCode.shorts 10 avgt 15 45.606 ± 0.289 40.080 ± 0.006 ns/op
ArraysHashCode.shorts 100 avgt 15 261.538 ± 0.954 270.748 ± 3.619 ns/op
ArraysHashCode.shorts 1000 avgt 15 2385.091 ± 2.082 2527.360 ± 8.349 ns/op
ArraysHashCode.shorts 10000 avgt 15 23420.647 ± 3.081 25066.912 ± 4.099 ns/op

Let me check which updates for the patch caused such results.
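
For readers following the numbers: the `ArraysHashCode` benchmarks time the polynomial hash `h = 31*h + a[i]`. What follows is only a minimal C++ sketch of what manually unrolling that loop by four looks like. It is not the RISC-V stub from the patch; the unroll factor and the names `hash_simple` and `hash_unrolled4` are assumptions made for illustration, and unsigned arithmetic stands in for Java's wrapping `int`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference loop: h = 31*h + a[i] with 32-bit wraparound.
// The initial value 1 matches java.util.Arrays.hashCode(int[]).
inline int32_t hash_simple(const std::vector<int32_t>& a, int32_t initial = 1) {
    uint32_t h = static_cast<uint32_t>(initial);
    for (int32_t v : a) {
        h = 31u * h + static_cast<uint32_t>(v);
    }
    return static_cast<int32_t>(h);
}

// Hypothetical 4-way unrolled variant (illustration only).
// Four steps of h = 31*h + a[i] collapse into
//   h = 31^4*h + 31^3*a0 + 31^2*a1 + 31*a2 + a3
inline int32_t hash_unrolled4(const std::vector<int32_t>& a, int32_t initial = 1) {
    uint32_t h = static_cast<uint32_t>(initial);
    const std::size_t n = a.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        // All loads are issued first, then the multiply-add chain.
        const uint32_t a0 = static_cast<uint32_t>(a[i]);
        const uint32_t a1 = static_cast<uint32_t>(a[i + 1]);
        const uint32_t a2 = static_cast<uint32_t>(a[i + 2]);
        const uint32_t a3 = static_cast<uint32_t>(a[i + 3]);
        h = 923521u * h + 29791u * a0 + 961u * a1 + 31u * a2 + a3;
    }
    // Tail: at most three remaining elements.
    for (; i < n; ++i) {
        h = 31u * h + static_cast<uint32_t>(a[i]);
    }
    return static_cast<int32_t>(h);
}
```

Both functions return the same value for any input. Whether the unrolled form is actually faster depends on the core, which would be consistent with the boards above disagreeing with each other.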
-------------

PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1842778944

From aph at openjdk.org Wed Dec 6 15:35:52 2023
From: aph at openjdk.org (Andrew Haley)
Date: Wed, 6 Dec 2023 15:35:52 GMT
Subject: Integrated: JDK-8320892: AArch64: Restore FPU control state after JNI
In-Reply-To: <-Jv5Xvre3lonydwQ5uzYN3QB8V0VIuORIhM1RtIdW5g=.06167df9-7268-4945-8e18-a04d19ee97e1@github.com>
References: <-Jv5Xvre3lonydwQ5uzYN3QB8V0VIuORIhM1RtIdW5g=.06167df9-7268-4945-8e18-a04d19ee97e1@github.com>
Message-ID:

On Tue, 28 Nov 2023 14:26:11 GMT, Andrew Haley wrote:

> Some buggy libraries corrupt the floating-point control register. Provide something similar to the x86 RestoreMXCSROnJNICalls.
>
> I realize that using the x86ish name "RestoreMXCSROnJNICalls" might be a little controversial, but it is a _global_ flag, not a CPU-specific one. And it's clearly intended for this purpose. It might have been better if that flag had been given a better name twentyish years ago, but we can't change it now.

This pull request has now been integrated.

Changeset: 50f31240
Author: Andrew Haley
URL: https://git.openjdk.org/jdk/commit/50f31240555888018f0f496ab29c8a5932dce459
Stats: 28 lines in 5 files changed: 26 ins; 0 del; 2 mod

8320892: AArch64: Restore FPU control state after JNI

Reviewed-by: adinn, stuefe

-------------

PR: https://git.openjdk.org/jdk/pull/16851

From egahlin at openjdk.org Wed Dec 6 16:12:41 2023
From: egahlin at openjdk.org (Erik Gahlin)
Date: Wed, 6 Dec 2023 16:12:41 GMT
Subject: RFR: 8211238: @Deprecated JFR event [v17]
In-Reply-To:
References:
Message-ID:

On Wed, 6 Dec 2023 11:00:06 GMT, Markus Grönlund wrote:

>> Greetings,
>>
>> please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK.
>>
>> Testing: jdk_jfr, CI 1-6, stress testing
>>
>> Thanks
>> Markus
>
> Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision:
>
> reviewer feedback

Marked as reviewed by egahlin (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/16931#pullrequestreview-1768015541

From tschatzl at openjdk.org Wed Dec 6 16:38:02 2023
From: tschatzl at openjdk.org (Thomas Schatzl)
Date: Wed, 6 Dec 2023 16:38:02 GMT
Subject: RFR: 8319313: G1: Rename G1EvacFailureInjector appropriately [v3]
In-Reply-To:
References:
Message-ID:

> Hi all,
>
> please review this rename of `G1EvacFailureInjector` and associated options to `G1AllocationFailureInjector` according to the results of the discussion for the review of [JDK-8318706](https://bugs.openjdk.org/browse/JDK-8318706).
>
> To facilitate review the first commit implements the renaming changes, the second moves the affected files only.
>
> Testing: gha, local gc/g1 tests
>
> Thanks,
> Thomas

Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:

rename flags after internal discussion

-------------

Changes:
- all: https://git.openjdk.org/jdk/pull/16905/files
- new: https://git.openjdk.org/jdk/pull/16905/files/a161d1fb..b23c057d

Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=16905&range=02
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=16905&range=01-02

Stats: 49 lines in 9 files changed: 0 ins; 0 del; 49 mod
Patch: https://git.openjdk.org/jdk/pull/16905.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/16905/head:pull/16905

PR: https://git.openjdk.org/jdk/pull/16905

From kvn at openjdk.org Wed Dec 6 16:50:41 2023
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Wed, 6 Dec 2023 16:50:41 GMT
Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v2]
In-Reply-To:
References:
Message-ID: <2pWVRM2IPJeoOtjkcRvGK8GWwiIFr35yHfgCFKM-9to=.499f3689-5e10-49d2-992c-e86e28b6477a@github.com>

On Wed, 6 Dec 2023 10:27:58 GMT, Aleksey Shipilev wrote:

>> [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro.
>>
>> The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there.
>>
>> Additional testing:
>> - [x] Large build matrix of server/zero builds
>> - [x] Linux AArch64 server fastdebug, `tier{1,2}`
>> - [x] Linux x86_64 server fastdebug, `tier{1,2}`
>
> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>
> - Better verbiage for *2 adjustment for x86_64
> - Merge branch 'master' into JDK-8237842-cache-line-padding-defs
> - Work

My few cents about `x2` ;^)
Looking at the [JDK-8049737](https://bugs.openjdk.org/browse/JDK-8049737) changes, it is used to make sure that the padding is greater than a cache line.

In `padded.hpp` we have several macros, `DEFINE_PAD_MINUS_SIZE` and `PADDED_END_SIZE`, which calculate the offset as (default_padding - size).
-------------

PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1843276529

From ayang at openjdk.org Wed Dec 6 17:19:34 2023
From: ayang at openjdk.org (Albert Mingkun Yang)
Date: Wed, 6 Dec 2023 17:19:34 GMT
Subject: RFR: 8319313: G1: Rename G1EvacFailureInjector appropriately [v3]
In-Reply-To:
References:
Message-ID:

On Wed, 6 Dec 2023 16:38:02 GMT, Thomas Schatzl wrote:

>> Hi all,
>>
>> please review this rename of `G1EvacFailureInjector` and associated options to `G1AllocationFailureInjector` according to the results of the discussion for the review of [JDK-8318706](https://bugs.openjdk.org/browse/JDK-8318706).
>>
>> To facilitate review the first commit implements the renaming changes, the second moves the affected files only.
>>
>> Testing: gha, local gc/g1 tests
>>
>> Thanks,
>> Thomas
>
> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision:
>
> rename flags after internal discussion

Marked as reviewed by ayang (Reviewer).

-------------

PR Review: https://git.openjdk.org/jdk/pull/16905#pullrequestreview-1768187550

From shade at openjdk.org Wed Dec 6 18:25:34 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 6 Dec 2023 18:25:34 GMT
Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v2]
In-Reply-To: <2pWVRM2IPJeoOtjkcRvGK8GWwiIFr35yHfgCFKM-9to=.499f3689-5e10-49d2-992c-e86e28b6477a@github.com>
References: <2pWVRM2IPJeoOtjkcRvGK8GWwiIFr35yHfgCFKM-9to=.499f3689-5e10-49d2-992c-e86e28b6477a@github.com>
Message-ID:

On Wed, 6 Dec 2023 16:47:59 GMT, Vladimir Kozlov wrote:

> My few cents about `x2` ;^) Looking at the [JDK-8049737](https://bugs.openjdk.org/browse/JDK-8049737) changes, it is used to make sure that the padding is greater than a cache line.

Yes, I think the intent was the same: cater for weird prefetcher interactions.

> In `padded.hpp` we have several macros, `DEFINE_PAD_MINUS_SIZE` and `PADDED_END_SIZE`, which calculate the offset as (default_padding - size).

Sure, I think this PR covers these paths too, or am I missing something there?

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1843440916

From kvn at openjdk.org Wed Dec 6 18:39:37 2023
From: kvn at openjdk.org (Vladimir Kozlov)
Date: Wed, 6 Dec 2023 18:39:37 GMT
Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v2]
In-Reply-To:
References:
Message-ID: <84Xg6o3F8qlJqU2JQAkTNNq_NibWVWsM1jQ-jy1HMlM=.fbf9d29a-e3a5-43a6-976c-e0c556afbfad@github.com>

On Wed, 6 Dec 2023 10:27:58 GMT, Aleksey Shipilev wrote:

>> [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro.
>>
>> The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there.
>>
>> Additional testing:
>> - [x] Large build matrix of server/zero builds
>> - [x] Linux AArch64 server fastdebug, `tier{1,2}`
>> - [x] Linux x86_64 server fastdebug, `tier{1,2}`
>
> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>
> - Better verbiage for *2 adjustment for x86_64
> - Merge branch 'master' into JDK-8237842-cache-line-padding-defs
> - Work

If the intention of the original changes (adding padding) was to avoid false sharing, why do we do it only for x64? From [JDK-8049737](https://bugs.openjdk.org/browse/JDK-8049737):

    INFO: offset(_header)=0
    INFO: offset(_owner)=24
    WARNING: the _header and _owner fields are closer than a cache line which permits false sharing.
    WARNING: ObjectMonitor size is not a multiple of a cache line which permits false sharing.

The original changes also use double size for SPARCv9.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1843465065
PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1843466117

From stefank at openjdk.org Wed Dec 6 19:10:47 2023
From: stefank at openjdk.org (Stefan Karlsson)
Date: Wed, 6 Dec 2023 19:10:47 GMT
Subject: RFR: 8319969: os::large_page_init() turns off THPs for ZGC [v5]
In-Reply-To:
References:
Message-ID:

On Wed, 6 Dec 2023 09:28:09 GMT, Stefan Karlsson wrote:

>> There is code in `os::large_page_init()` that checks `/sys/kernel/mm/transparent_hugepage/enabled` and forcefully turns off `UseTransparentHugePages` if anonymous THPs are disabled in the OS:
>>
>>
>>     if (UseTransparentHugePages && !HugePages::supports_thp()) {
>>       if (!FLAG_IS_DEFAULT(UseTransparentHugePages)) {
>>         log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.");
>>       }
>>       UseLargePages = UseTransparentHugePages = false;
>>       return;
>>     }
>>
>>
>> This is problematic because ZGC doesn't use the `/sys/kernel/mm/transparent_hugepage/enabled` THPs, but instead the `/sys/kernel/mm/transparent_hugepage/shmem_enabled` THPs. So, with the following settings:
>>
>>     /sys/kernel/mm/transparent_hugepage/enabled: never
>>     /sys/kernel/mm/transparent_hugepage/shmem_enabled: advise
>>
>>
>> the above code will force ZGC to run without THPs.
>>
>> This PR is a proposal for how to work around this in the ZGC code without disturbing the rest of the JVM too much. The patch:
>>
>> 1) remembers the initial values for UseLargePages and UseTransparentHugePages and saves those so that ZGC can continue using THPs even though they have been disabled for the rest of the JVM.
>>
>> 2) adds better logic to figure out if ZGC is actually going to get THPs for the heap or not. This is then used to more accurately log the current situation and allows for a precise usage of `madvise + MADV_HUGEPAGE`.
>>
>> 3) tweaks the generic pagesize logging to better reflect the situation when anonymous THPs are disabled but shared memory THPs are enabled and ZGC is used.
>>
>> The result of this change can be seen in these tables:
>>
>> ZGC large pages log output:
>>
>>     E (T)     = Enabled (Transparent)
>>     E (T, OS) = Enabled (Transparent, OS enforced)
>>     D         = Disabled
>>     D (OS)    = Disabled (OS enforced)
>>
>> -XX:+UseTransparentHugePages
>>
>>     shem \ anon | always | madvise | never
>>     ------------+--------+---------+-------
>>     always      | E (T)  | E (T)   | E (T)
>>     within_size | E (T)  | E (T)   | E (T)
>>     advise      | E (T)  | E (T)   | E (T)
>>     never       | D (OS) | D (OS)  | D (OS)
>>     deny        | D (OS) | D (OS)  | D (OS)
>>     force       | E (T)  | E (T)   | E (T)
>>
>> -XX:-UseTransparentHugePages
>>
>>     shem \ anon | always    | madvise   | never
>>     ------------+-----------+-----------+-------
>>     always      | E (T, OS) | E (T, OS) | E (T, OS)
>>     within_size | E (T, OS) | E (T, OS) | E (T, OS)
>>     advise      | D ...
>
> Stefan Karlsson has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision:
>
> - Merge remote-tracking branch 'upstream/master' into 8319969_zgc_thp_workaround
> - More precise THP warning messages
> - Move _thp_requested out from HugePages
> - Small tweaks
> - 8319969: os::large_page_init() turns off THPs for ZGC

Thanks all for reviewing! I've run this through our tier1-tier5 testing.

-------------

PR Review: https://git.openjdk.org/jdk/pull/16690#pullrequestreview-1768432046

From stefank at openjdk.org Wed Dec 6 19:10:49 2023
From: stefank at openjdk.org (Stefan Karlsson)
Date: Wed, 6 Dec 2023 19:10:49 GMT
Subject: Integrated: 8319969: os::large_page_init() turns off THPs for ZGC
In-Reply-To:
References:
Message-ID:

On Thu, 16 Nov 2023 13:30:48 GMT, Stefan Karlsson wrote:

> There is code in `os::large_page_init()` that checks `/sys/kernel/mm/transparent_hugepage/enabled` and forcefully turns off `UseTransparentHugePages` if anonymous THPs are disabled in the OS:
>
>
>     if (UseTransparentHugePages && !HugePages::supports_thp()) {
>       if (!FLAG_IS_DEFAULT(UseTransparentHugePages)) {
>         log_warning(pagesize)("UseTransparentHugePages disabled, transparent huge pages are not supported by the operating system.");
>       }
>       UseLargePages = UseTransparentHugePages = false;
>       return;
>     }
>
>
> This is problematic because ZGC doesn't use the `/sys/kernel/mm/transparent_hugepage/enabled` THPs, but instead the `/sys/kernel/mm/transparent_hugepage/shmem_enabled` THPs. So, with the following settings:
>
>     /sys/kernel/mm/transparent_hugepage/enabled: never
>     /sys/kernel/mm/transparent_hugepage/shmem_enabled: advise
>
>
> the above code will force ZGC to run without THPs.
>
> This PR is a proposal for how to work around this in the ZGC code without disturbing the rest of the JVM too much. The patch:
>
> 1) remembers the initial values for UseLargePages and UseTransparentHugePages and saves those so that ZGC can continue using THPs even though they have been disabled for the rest of the JVM.
>
> 2) adds better logic to figure out if ZGC is actually going to get THPs for the heap or not. This is then used to more accurately log the current situation and allows for a precise usage of `madvise + MADV_HUGEPAGE`.
>
> 3) tweaks the generic pagesize logging to better reflect the situation when anonymous THPs are disabled but shared memory THPs are enabled and ZGC is used.
>
> The result of this change can be seen in these tables:
>
> ZGC large pages log output:
>
>     E (T)     = Enabled (Transparent)
>     E (T, OS) = Enabled (Transparent, OS enforced)
>     D         = Disabled
>     D (OS)    = Disabled (OS enforced)
>
> -XX:+UseTransparentHugePages
>
>     shem \ anon | always | madvise | never
>     ------------+--------+---------+-------
>     always      | E (T)  | E (T)   | E (T)
>     within_size | E (T)  | E (T)   | E (T)
>     advise      | E (T)  | E (T)   | E (T)
>     never       | D (OS) | D (OS)  | D (OS)
>     deny        | D (OS) | D (OS)  | D (OS)
>     force       | E (T)  | E (T)   | E (T)
>
> -XX:-UseTransparentHugePages
>
>     shem \ anon | always    | madvise   | never
>     ------------+-----------+-----------+-------
>     always      | E (T, OS) | E (T, OS) | E (T, OS)
>     within_size | E (T, OS) | E (T, OS) | E (T, OS)
>     advise      | D         | D         | D
>     never       | D         | D         | D
>     deny        | D         | D         | D
>     force ...

This pull request has now been integrated.

Changeset: f4822605
Author: Stefan Karlsson
URL: https://git.openjdk.org/jdk/commit/f4822605af44f63e5928f2f279df3f76c01a25a2
Stats: 338 lines in 10 files changed: 291 ins; 13 del; 34 mod

8319969: os::large_page_init() turns off THPs for ZGC

Reviewed-by: stuefe, aboldtch

-------------

PR: https://git.openjdk.org/jdk/pull/16690

From shade at openjdk.org Wed Dec 6 19:37:35 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 6 Dec 2023 19:37:35 GMT
Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v2]
In-Reply-To: <84Xg6o3F8qlJqU2JQAkTNNq_NibWVWsM1jQ-jy1HMlM=.fbf9d29a-e3a5-43a6-976c-e0c556afbfad@github.com>
References: <84Xg6o3F8qlJqU2JQAkTNNq_NibWVWsM1jQ-jy1HMlM=.fbf9d29a-e3a5-43a6-976c-e0c556afbfad@github.com>
Message-ID:

On Wed, 6 Dec 2023 18:36:23 GMT, Vladimir Kozlov wrote:

> If the intention of the original changes (adding padding) was to avoid false sharing, why do we do it only for x64? From [JDK-8049737](https://bugs.openjdk.org/browse/JDK-8049737):

I don't understand the question. We add padding for all architectures to avoid false sharing. Avoiding false sharing _nominally_ requires padding by cache line size. That is why the default for `DEFAULT_PADDING_SIZE` is `DEFAULT_CACHE_LINE_SIZE`. x86 has (had?) a peculiarity with adjacent cache line hardware prefetchers that did require padding for *twice* the cache line size to avoid false sharing in those unusual circumstances. I _guess_ SPARCv9 had a similar problem?
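
To make the distinction above concrete, here is a minimal C++ sketch of padding a field out to either one cache line or two. It is not HotSpot's `padded.hpp`: the 64-byte line size and the names `kCacheLineSize`, `kNominalPadding`, `kDoubledPadding` and `PaddedEnd` are assumptions chosen for illustration only.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed cache line size; real platforms define their own value.
constexpr std::size_t kCacheLineSize = 64;

// Nominal rule: padding size equals the cache line size.
constexpr std::size_t kNominalPadding = kCacheLineSize;

// Doubled rule: pad to twice the cache line size, as described for x86
// with adjacent-line hardware prefetchers.
constexpr std::size_t kDoubledPadding = 2 * kCacheLineSize;

// Hypothetical helper: a payload followed by (Padding - sizeof(T)) filler
// bytes, so that consecutive array elements are Padding bytes apart.
template <typename T, std::size_t Padding>
struct PaddedEnd {
    static_assert(sizeof(T) < Padding,
                  "sketch only handles payloads smaller than the padding");
    T value;
    char pad[Padding - sizeof(T)];
};

static_assert(sizeof(PaddedEnd<std::int64_t, kNominalPadding>) == 64,
              "one cache line per element");
static_assert(sizeof(PaddedEnd<std::int64_t, kDoubledPadding>) == 128,
              "two cache lines per element");
```

With the doubled rule, two neighbouring `value` fields in an array are 128 bytes apart, so they never land in the same pair of adjacent 64-byte lines that a prefetcher might pull in together.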

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1843566239

From shade at openjdk.org Wed Dec 6 19:52:38 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Wed, 6 Dec 2023 19:52:38 GMT
Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 6 Dec 2023 10:27:58 GMT, Aleksey Shipilev wrote:

>> [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro.
>>
>> The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there.
>>
>> Additional testing:
>> - [x] Large build matrix of server/zero builds
>> - [x] Linux AArch64 server fastdebug, `tier{1,2}`
>> - [x] Linux x86_64 server fastdebug, `tier{1,2}`
>
> Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision:
>
> - Better verbiage for *2 adjustment for x86_64
> - Merge branch 'master' into JDK-8237842-cache-line-padding-defs
> - Work

I submitted the RFE for lifting the `*2` padding for x86_64 here: https://bugs.openjdk.org/browse/JDK-8321481

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1843588110

From sspitsyn at openjdk.org Wed Dec 6 19:53:44 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Wed, 6 Dec 2023 19:53:44 GMT
Subject: RFR: 8321219: runtime/jni/FastGetField: assert(is_interpreted_frame()) failed: interpreted frame expected [v2]
In-Reply-To:
References:
Message-ID:

On Tue, 5 Dec 2023 23:01:20 GMT, Daniel D. Daugherty wrote:

>> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision:
>>
>> review: extended comment to cover the watchpoint extra checks
>
> Thumbs up. This is a trivial fix.
>
> You'll need to fix the whitespace complaint before integration.

> @dcubed-ojdk I would not consider this a trivial fix at all - the need to add the additional conditions is not at all obvious!
> And even if they were, that would make this a small/simple fix, not "trivial" as defined for the "one review needed" rule.

Sorry, David. It was kind of obvious to me, as this tweak is a workaround for the field watch related regression.
Of course, it would be better to wait for you to finish the review, but it was not clear if you were going to complete it or not.
As you know, it is very uncomfortable to do a last-minute push even if it is trivial.
:(

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16961#issuecomment-1843587781

From sspitsyn at openjdk.org Wed Dec 6 20:18:47 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Wed, 6 Dec 2023 20:18:47 GMT
Subject: RFR: 8321219: runtime/jni/FastGetField: assert(is_interpreted_frame()) failed: interpreted frame expected [v3]
In-Reply-To: <64Mn3SR0rOVpZMb6CRVJwKfJIySscp5cBy9hyJkMEs4=.faea7c80-3c92-4d5c-9baf-d80e3b2714c0@github.com>
References: <4ac4Z72qGn_-S7p43eNHZRgx4mmXxGNgXR1f7W06aQE=.8b328e1f-6378-4637-a76c-17c581f31a24@github.com> <64Mn3SR0rOVpZMb6CRVJwKfJIySscp5cBy9hyJkMEs4=.faea7c80-3c92-4d5c-9baf-d80e3b2714c0@github.com>
Message-ID:

On Wed, 6 Dec 2023 02:33:21 GMT, David Holmes wrote:

>> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision:
>>
>> fixed trailing whitespace
>
> src/hotspot/share/prims/jvmtiThreadState.cpp line 561:
>
>> 559: // it is an important optimization to create JvmtiThreadState objects lazily.
>> 560: // This optimization is disabled when watchpoint capabilities are present. It is to
>> 561: // work around a bug with virtual thread frames which can be not deoptimized in time.
>
> Suggestion: "This optimization is *also* disabled when ..."
>
> The phrase "which can be not deoptimized in time." is unclear. Are we racing with deoptimization?

Thank you for the suggestion.
In fact, I did not finish with the scalability related optimizations and will continue in 23.
Will correct this comment as you suggest when I get a chance.

> The phrase "which can be not deoptimized in time." is unclear. Are we racing with deoptimization?

Yes, this comment may not be fully correct, as you noted. I do not fully understand the optimization vs deoptimization mechanisms.
I've already spent a lot of time trying to isolate this deoptimization issue but still need to continue this work in 23. The good news is that it is reliably reproducible, but only with a full run of any of tiers 4-6.
It is not reproducible locally yet. My understanding is that the deoptimization needs some time to happen. We mark frames as needed to deoptimize and they are really deoptimized upon return of the execution control. However, there can be some subtle details depending on the execution path. There can be more then one bug in this area. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16961#discussion_r1417929323 From duke at openjdk.org Wed Dec 6 21:58:55 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Wed, 6 Dec 2023 21:58:55 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: References: Message-ID: > Hello All, > > Please review these changes to support _vectorizedHashCode intrinsic on > RISC-V platform. The patch adds the "scalar" code for the intrinsic without > usage of any RVV instruction but provides manual unrolling of the appropriate > loop. The code with usage of RVV instruction could be added as follow-up of > the patch or independently. > > Thanks, > -Yuri Gaevsky > > P.S. My OCA has been accepted recently (ygaevsky). > > ### Correctness checks > > Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux. > > ### Performance results (the numbers for non-ints are similar) > > #### StarFive JH7110 board: > > > ArraysHashCode: without intrinsic with intrinsic > ------------------------------------------------------------------------------- > Benchmark (size) Mode Cnt Score Error Score Error Units > ------------------------------------------------------------------------------- > multiints 0 avgt 30 2.658 ? 0.001 2.661 ? 0.004 ns/op > multiints 1 avgt 30 4.881 ? 0.011 4.892 ? 0.015 ns/op > multiints 2 avgt 30 16.109 ? 0.041 10.451 ? 0.075 ns/op > multiints 3 avgt 30 14.873 ? 0.068 11.753 ? 0.024 ns/op > multiints 4 avgt 30 17.283 ? 0.078 13.176 ? 0.044 ns/op > multiints 5 avgt 30 19.691 ? 0.136 14.723 ? 0.046 ns/op > multiints 6 avgt 30 21.727 ? 0.166 15.463 ? 
0.124 ns/op > multiints 7 avgt 30 23.790 ? 0.126 18.298 ? 0.059 ns/op > multiints 8 avgt 30 23.527 ? 0.116 18.267 ? 0.046 ns/op > multiints 9 avgt 30 27.981 ? 0.303 20.453 ? 0.069 ns/op > multiints 10 avgt 30 26.947 ? 0.215 20.541 ? 0.051 ns/op > multiints 50 avgt 30 95.373 ? 0.588 69.238 ? 0.208 ns/op > multiints 100 avgt 30 177.109 ? 0.525 137.852 ? 0.417 ns/op > multiints 200 avgt 30 341.074 ? 1.363 296.832 ? 0.725 ns/op > multiints 500 avgt 30 847.993 ? 1.713 752.415 ? 1.918 ns/op > multiints 1000 avgt 30 1610.199 ? 5.424 1426.112 ? 3.407 ns/op > multiints 10000 avgt 30 16234.260 ? 26.789 14447.936 ? 26.345 ns/op > multiints 100000 avgt 30 170726.025 ? 184.003 152587.649 ? 381.964 ns/op > ------------------------------------------------------------------------------- > > #### T-Head RVB-ICE board: > > > ArraysHashCode: ... Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: Added two temp registers for loads; all loads in wide loop has been moved to the start of the loop. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16629/files - new: https://git.openjdk.org/jdk/pull/16629/files/f955a061..99f91d04 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16629&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16629&range=08-09 Stats: 45 lines in 3 files changed: 13 ins; 6 del; 26 mod Patch: https://git.openjdk.org/jdk/pull/16629.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16629/head:pull/16629 PR: https://git.openjdk.org/jdk/pull/16629 From duke at openjdk.org Wed Dec 6 22:09:40 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Wed, 6 Dec 2023 22:09:40 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 21:58:55 GMT, Yuri Gaevsky wrote: >> Hello All, >> >> Please review these changes to support _vectorizedHashCode intrinsic on >> RISC-V platform. 
The patch adds the "scalar" code for the intrinsic without >> usage of any RVV instruction but provides manual unrolling of the appropriate >> loop. The code with usage of RVV instruction could be added as follow-up of >> the patch or independently. >> >> Thanks, >> -Yuri Gaevsky >> >> P.S. My OCA has been accepted recently (ygaevsky). >> >> ### Correctness checks >> >> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux. >> >> ### Performance results (the numbers for non-ints are similar) >> >> #### StarFive JH7110 board: >> >> >> ArraysHashCode: without intrinsic with intrinsic >> ------------------------------------------------------------------------------- >> Benchmark (size) Mode Cnt Score Error Score Error Units >> ------------------------------------------------------------------------------- >> multiints 0 avgt 30 2.658 ? 0.001 2.661 ? 0.004 ns/op >> multiints 1 avgt 30 4.881 ? 0.011 4.892 ? 0.015 ns/op >> multiints 2 avgt 30 16.109 ? 0.041 10.451 ? 0.075 ns/op >> multiints 3 avgt 30 14.873 ? 0.068 11.753 ? 0.024 ns/op >> multiints 4 avgt 30 17.283 ? 0.078 13.176 ? 0.044 ns/op >> multiints 5 avgt 30 19.691 ? 0.136 14.723 ? 0.046 ns/op >> multiints 6 avgt 30 21.727 ? 0.166 15.463 ? 0.124 ns/op >> multiints 7 avgt 30 23.790 ? 0.126 18.298 ? 0.059 ns/op >> multiints 8 avgt 30 23.527 ? 0.116 18.267 ? 0.046 ns/op >> multiints 9 avgt 30 27.981 ? 0.303 20.453 ? 0.069 ns/op >> multiints 10 avgt 30 26.947 ? 0.215 20.541 ? 0.051 ns/op >> multiints 50 avgt 30 95.373 ? 0.588 69.238 ? 0.208 ns/op >> multiints 100 avgt 30 177.109 ? 0.525 137.852 ? 0.417 ns/op >> multiints 200 avgt 30 341.074 ? 1.363 296.832 ? 0.725 ns/op >> multiints 500 avgt 30 847.993 ? 1.713 752.415 ? 1.918 ns/op >> multiints 1000 avgt 30 1610.199 ? 5.424 1426.112 ? 3.407 ns/op >> multiints 10000 avgt 30 16234.260 ? 26.789 14447.936 ? 26.345 ns/op >> multiints 100000 avgt 30 170726.025 ? 184.003 152587.649 ? 381.964 ns/op >> ---------------------------------------... 
>
> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:
>
>   Added two temp registers for loads; all loads in wide loop has been moved to the start of the loop.

The results of commit 99f91d0 are below.

Sifive Unmatched:

Benchmark                  (size)  Mode  Cnt      Score     Error      Score     Error  Units
ArraysHashCode.bytes           10  avgt   15     65.190 ±   0.954     45.527 ±   1.771  ns/op
ArraysHashCode.bytes          100  avgt   15    321.443 ±   5.586    258.807 ±   4.922  ns/op
ArraysHashCode.bytes         1000  avgt   15   2878.206 ±   9.105   2347.219 ±   8.947  ns/op
ArraysHashCode.bytes        10000  avgt   15  28421.840 ±  35.467  23160.425 ±  30.340  ns/op
ArraysHashCode.chars           10  avgt   15     64.544 ±   1.713     50.808 ±   2.629  ns/op
ArraysHashCode.chars          100  avgt   15    338.919 ±   1.623    265.971 ±   4.874  ns/op
ArraysHashCode.chars         1000  avgt   15   2986.972 ±   4.009   2336.699 ±   2.537  ns/op
ArraysHashCode.chars        10000  avgt   15  29474.441 ±  14.634  23161.582 ±  29.067  ns/op
ArraysHashCode.ints            10  avgt   15     57.104 ±   2.517     46.034 ±   0.602  ns/op
ArraysHashCode.ints           100  avgt   15    330.264 ±   4.543    258.327 ±   1.517  ns/op
ArraysHashCode.ints          1000  avgt   15   2995.208 ±   3.188   2339.664 ±   6.849  ns/op
ArraysHashCode.ints         10000  avgt   15  33855.312 ± 115.319  27836.954 ±  27.304  ns/op
ArraysHashCode.multibytes      10  avgt   15     34.378 ±   0.230     27.076 ±   0.108  ns/op
ArraysHashCode.multibytes     100  avgt   15    193.131 ±   0.370    141.907 ±   0.244  ns/op
ArraysHashCode.multibytes    1000  avgt   15   1651.909 ±   7.812   1377.842 ±  10.299  ns/op
ArraysHashCode.multibytes   10000  avgt   15  16620.685 ±  37.854  13960.556 ±  43.473  ns/op
ArraysHashCode.multichars      10  avgt   15     35.104 ±   0.195     26.308 ±   0.127  ns/op
ArraysHashCode.multichars     100  avgt   15    204.391 ±   0.233    144.662 ±   0.337  ns/op
ArraysHashCode.multichars    1000  avgt   15   1902.088 ±   6.922   1579.549 ±   7.266  ns/op
ArraysHashCode.multichars   10000  avgt   15  18905.923 ±  79.263  15952.155 ±  68.664  ns/op
ArraysHashCode.multiints       10  avgt   15     35.111 ±   0.093     26.551 ±   0.264  ns/op
ArraysHashCode.multiints      100  avgt   15    211.251 ±   0.550    153.683 ±   0.208  ns/op
ArraysHashCode.multiints     1000  avgt   15   2223.176 ±   8.982   1927.689 ±   7.075  ns/op
ArraysHashCode.multiints    10000  avgt   15  31567.767 ± 249.609  29463.762 ± 186.245  ns/op
ArraysHashCode.multishorts     10  avgt   15     35.311 ±   0.313     26.372 ±   0.116  ns/op
ArraysHashCode.multishorts    100  avgt   15    203.294 ±   0.241    144.988 ±   0.494  ns/op
ArraysHashCode.multishorts   1000  avgt   15   1898.485 ±   6.704   1579.381 ±   5.600  ns/op
ArraysHashCode.multishorts  10000  avgt   15  18855.850 ±  66.545  15718.005 ±  75.154  ns/op
ArraysHashCode.shorts          10  avgt   15     56.418 ±   0.186     47.488 ±   2.261  ns/op
ArraysHashCode.shorts         100  avgt   15    337.844 ±   1.202    256.671 ±   0.761  ns/op
ArraysHashCode.shorts        1000  avgt   15   2988.457 ±   6.158   2337.570 ±   2.510  ns/op
ArraysHashCode.shorts       10000  avgt   15  29506.107 ±  41.616  23148.772 ±  40.625  ns/op

T-Head RVB-ICE:

Benchmark                  (size)  Mode  Cnt      Score     Error      Score     Error  Units
ArraysHashCode.bytes           10  avgt   15     53.463 ±   0.274     46.625 ±   0.247  ns/op
ArraysHashCode.bytes          100  avgt   15    280.976 ±   1.478    225.197 ±   1.141  ns/op
ArraysHashCode.bytes         1000  avgt   15   2553.393 ±   4.925   1818.613 ±   3.789  ns/op
ArraysHashCode.bytes        10000  avgt   15  25138.794 ±  39.992  16787.514 ±  59.261  ns/op
ArraysHashCode.chars           10  avgt   15     52.075 ±   0.246     45.924 ±   0.561  ns/op
ArraysHashCode.chars          100  avgt   15    283.441 ±   0.743    237.660 ±   1.074  ns/op
ArraysHashCode.chars         1000  avgt   15   2562.833 ±   3.370   1915.665 ±   4.166  ns/op
ArraysHashCode.chars        10000  avgt   15  25168.219 ±  94.226  18843.917 ±  51.859  ns/op
ArraysHashCode.ints            10  avgt   15     52.126 ±   0.382     46.739 ±   0.366  ns/op
ArraysHashCode.ints           100  avgt   15    283.643 ±   0.901    242.191 ±   0.776  ns/op
ArraysHashCode.ints          1000  avgt   15   2556.508 ±   6.937   1913.271 ±   2.920  ns/op
ArraysHashCode.ints         10000  avgt   15  25171.578 ±  51.725  18835.638 ±  49.785  ns/op
ArraysHashCode.multibytes      10  avgt   15     26.432 ±   0.157     18.762 ±   0.184  ns/op
ArraysHashCode.multibytes     100  avgt   15    160.788 ±   0.484    117.339 ±   0.285  ns/op
ArraysHashCode.multibytes    1000  avgt   15   1366.697 ±   9.217    923.814 ±   4.709  ns/op
ArraysHashCode.multibytes   10000  avgt   15  13360.445 ±  22.830   9350.136 ±  18.251  ns/op
ArraysHashCode.multichars      10  avgt   15     26.732 ±   0.181     19.234 ±   0.136  ns/op
ArraysHashCode.multichars     100  avgt   15    164.043 ±   0.310    117.900 ±   0.386  ns/op
ArraysHashCode.multichars    1000  avgt   15   1398.259 ±   2.765   1030.563 ±   2.701  ns/op
ArraysHashCode.multichars   10000  avgt   15  13331.460 ±  21.356   9749.817 ±  23.566  ns/op
ArraysHashCode.multiints       10  avgt   15     25.972 ±   0.135     18.745 ±   0.155  ns/op
ArraysHashCode.multiints      100  avgt   15    169.487 ±   0.357    125.620 ±   0.330  ns/op
ArraysHashCode.multiints     1000  avgt   15   1399.977 ±   9.000   1036.132 ±   3.237  ns/op
ArraysHashCode.multiints    10000  avgt   15  13760.907 ±  23.137  10324.485 ±  18.437  ns/op
ArraysHashCode.multishorts     10  avgt   15     26.541 ±   0.223     19.389 ±   0.151  ns/op
ArraysHashCode.multishorts    100  avgt   15    163.990 ±   0.301    117.797 ±   0.419  ns/op
ArraysHashCode.multishorts   1000  avgt   15   1402.545 ±   3.285   1031.649 ±   7.023  ns/op
ArraysHashCode.multishorts  10000  avgt   15  13349.611 ±  25.599   9778.011 ±  19.135  ns/op
ArraysHashCode.shorts          10  avgt   15     52.037 ±   0.265     46.881 ±   0.636  ns/op
ArraysHashCode.shorts         100  avgt   15    285.775 ±   0.702    244.200 ±   1.012  ns/op
ArraysHashCode.shorts        1000  avgt   15   2553.894 ±   5.309   1926.098 ±   3.496  ns/op
ArraysHashCode.shorts       10000  avgt   15  25201.063 ±  95.129  18843.485 ±  73.870  ns/op

StarFive JH7110

Benchmark                  (size)  Mode  Cnt      Score     Error      Score     Error  Units
ArraysHashCode.bytes           10  avgt   15     41.093 ±   0.541     34.051 ±   0.032  ns/op
ArraysHashCode.bytes          100  avgt   15    250.250 ±   0.846    201.460 ±   0.631  ns/op
ArraysHashCode.bytes         1000  avgt   15   2283.792 ±   0.293   1855.048 ±   0.337  ns/op
ArraysHashCode.bytes        10000  avgt   15  22613.649 ±  85.647  18454.512 ±  93.310  ns/op
ArraysHashCode.chars           10  avgt   15     45.441 ±   0.108     34.747 ±   0.008  ns/op
ArraysHashCode.chars          100  avgt   15    261.762 ±   1.081    203.169 ±   0.118  ns/op
ArraysHashCode.chars         1000  avgt   15   2372.976 ±   1.541   1856.964 ±   4.764  ns/op
ArraysHashCode.chars        10000  avgt   15  23429.722 ±   6.530  18390.679 ±   2.956  ns/op
ArraysHashCode.ints            10  avgt   15     45.530 ±   0.284     34.744 ±   0.005  ns/op
ArraysHashCode.ints           100  avgt   15    261.117 ±   0.721    203.332 ±   0.218  ns/op
ArraysHashCode.ints          1000  avgt   15   2373.573 ±   3.175   1856.836 ±   0.223  ns/op
ArraysHashCode.ints         10000  avgt   15  29624.472 ±  44.767  24626.598 ±  54.767  ns/op
ArraysHashCode.multibytes      10  avgt   15     26.975 ±   0.259     19.854 ±   0.114  ns/op
ArraysHashCode.multibytes     100  avgt   15    156.220 ±   0.247    113.744 ±   0.366  ns/op
ArraysHashCode.multibytes    1000  avgt   15   1296.236 ±   7.541   1073.224 ±   4.383  ns/op
ArraysHashCode.multibytes   10000  avgt   15  12779.460 ±   2.007  10593.835 ±   5.531  ns/op
ArraysHashCode.multichars      10  avgt   15     27.520 ±   0.102     19.992 ±   0.054  ns/op
ArraysHashCode.multichars     100  avgt   15    166.026 ±   0.695    117.982 ±   0.639  ns/op
ArraysHashCode.multichars    1000  avgt   15   1430.447 ±   1.517   1165.180 ±   5.783  ns/op
ArraysHashCode.multichars   10000  avgt   15  14134.839 ±   6.270  11499.764 ±  37.546  ns/op
ArraysHashCode.multiints       10  avgt   15     26.872 ±   0.066     20.127 ±   0.083  ns/op
ArraysHashCode.multiints      100  avgt   15    178.919 ±   0.245    132.377 ±   0.484  ns/op
ArraysHashCode.multiints     1000  avgt   15   1607.719 ±   2.903   1339.118 ±   8.704  ns/op
ArraysHashCode.multiints    10000  avgt   15  16390.804 ±  49.820  13706.741 ±  11.994  ns/op
ArraysHashCode.multishorts     10  avgt   15     27.749 ±   0.165     20.011 ±   0.096  ns/op
ArraysHashCode.multishorts    100  avgt   15    166.625 ±   0.592    119.115 ±   0.324  ns/op
ArraysHashCode.multishorts   1000  avgt   15   1429.682 ±   1.607   1165.839 ±   6.013  ns/op
ArraysHashCode.multishorts  10000  avgt   15  14199.682 ±   6.483  11493.880 ±   6.484  ns/op
ArraysHashCode.shorts          10  avgt   15     45.878 ±   0.348     34.768 ±   0.116  ns/op
ArraysHashCode.shorts         100  avgt   15    260.598 ±   0.079    203.937 ±   0.078  ns/op
ArraysHashCode.shorts        1000  avgt   15   2374.712 ±   7.961   1857.542 ±   0.248  ns/op
ArraysHashCode.shorts       10000  avgt   15  23428.899 ±   4.859  18433.195 ±  34.224  ns/op

The improvements on SiFive/StarFive came after the move of all memory loads up to the start of the loop. Differences between Out-of-Order versus In-Order CPUs?
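
For readers who want the shape of that change, here is a minimal Java sketch of a manually unrolled hash loop in which all loads of one wide iteration are issued before any arithmetic uses them. It is only an illustration, not the RISC-V assembler from the patch: the class name `UnrolledHashSketch`, the method `hashUnrolled` and the unroll factor of 4 are assumptions, and the JIT is free to reorder the loads, so the source order shown here is not a guarantee about emitted code.

```java
import java.util.Arrays;

// Illustrative sketch only; class and method names are hypothetical.
final class UnrolledHashSketch {
    // Same polynomial as Arrays.hashCode(int[]): h = 31 * h + a[i], starting from 1.
    static int hashUnrolled(int[] a) {
        int h = 1;
        int i = 0;
        // Wide loop: four elements per iteration (assumed unroll factor).
        for (; i + 4 <= a.length; i += 4) {
            // All loads are grouped at the start of the iteration.
            int x0 = a[i];
            int x1 = a[i + 1];
            int x2 = a[i + 2];
            int x3 = a[i + 3];
            // Powers of 31: 31^4 = 923521, 31^3 = 29791, 31^2 = 961.
            h = h * 923521 + x0 * 29791 + x1 * 961 + x2 * 31 + x3;
        }
        // Tail loop for the remaining 0..3 elements.
        for (; i < a.length; i++) {
            h = 31 * h + a[i];
        }
        return h;
    }

    public static void main(String[] args) {
        int[] data = {3, -7, 11, 0, 42, 5, 9};
        // Compare against the library implementation.
        System.out.println(hashUnrolled(data) == Arrays.hashCode(data));
    }
}
```

Grouping the loads gives an in-order core a chance to overlap memory latency with the multiply-add chain, which is one possible reading of the difference between the boards; the numbers above do not prove it.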
-------------

PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1843766166

From sgibbons at openjdk.org Wed Dec 6 22:23:47 2023
From: sgibbons at openjdk.org (Scott Gibbons)
Date: Wed, 6 Dec 2023 22:23:47 GMT
Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v3]
In-Reply-To:
References:
Message-ID:

> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. The benchmark numbers:
>
>
> Benchmark                                  Score      Latest
> StringIndexOf.advancedWithMediumSub      343.573     317.934  0.925375393x
> StringIndexOf.advancedWithShortSub1     1039.081     1053.96  1.014319384x
> StringIndexOf.advancedWithShortSub2       55.828     110.541  1.980027943x
> StringIndexOf.constantPattern              9.361      11.906  1.271872663x
> StringIndexOf.searchCharLongSuccess        4.216       4.218  1.000474383x
> StringIndexOf.searchCharMediumSuccess      3.133       3.216  1.02649218x
> StringIndexOf.searchCharShortSuccess        3.76       3.761  1.000265957x
> StringIndexOf.success                      9.186       9.713  1.057369911x
> StringIndexOf.successBig                  14.341      46.343  3.231504079x
> StringIndexOfChar.latin1_AVX2_String    6220.918    12154.52  1.953814533x
> StringIndexOfChar.latin1_AVX2_char      5503.556    5540.044  1.006629895x
> StringIndexOfChar.latin1_SSE4_String    6978.854    6818.689  0.977049957x
> StringIndexOfChar.latin1_SSE4_char      5657.499    5474.624  0.967675646x
> StringIndexOfChar.latin1_Short_String   7132.541    6863.359  0.962260014x
> StringIndexOfChar.latin1_Short_char    16013.389   16162.437  1.009307711x
> StringIndexOfChar.latin1_mixed_String   7386.123   14771.622  1.999915517x
> StringIndexOfChar.latin1_mixed_char     9901.671    9782.245  0.987938803

Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:

  Support UU IndexOf

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/16753/files
  - new: https://git.openjdk.org/jdk/pull/16753/files/e614b86f..5e03173e

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=02
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=01-02 Stats: 20 lines in 1 file changed: 6 ins; 0 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/16753.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16753/head:pull/16753 PR: https://git.openjdk.org/jdk/pull/16753 From kvn at openjdk.org Wed Dec 6 22:37:36 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 6 Dec 2023 22:37:36 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v2] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 10:27:58 GMT, Aleksey Shipilev wrote: >> [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro. >> >> The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there. >> >> Additional testing: >> - [x] Large build matrix of server/zero builds >> - [x] Linux AArch64 server fastdebug, `tier{1,2}` >> - [x] Linux x86_64 server fastdebug, `tier{1,2}` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Better verbiage for *2 adjustment for x86_64 > - Merge branch 'master' into JDK-8237842-cache-line-padding-defs > - Work @dcubed-ojdk, as author of [JDK-8049737](https://bugs.openjdk.org/browse/JDK-8049737) changes, do you remember why we use double cacheline for padding? 
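
To make the distinction in the quoted description concrete, here is a minimal Java sketch that keeps a cache line size and a padding size as two separate constants. The names `CACHE_LINE_SIZE`, `PADDING_SIZE`, `alignUp` and `paddedSize`, and the values 64 and 2 * 64, are assumptions for illustration only; they are not the HotSpot macros or the values settled on in the patch.

```java
// Illustrative sketch only; names and values are hypothetical.
final class PaddingSketch {
    // Assumed size of one hardware cache line, in bytes.
    static final int CACHE_LINE_SIZE = 64;
    // Assumed padding granularity: twice the cache line size.
    static final int PADDING_SIZE = 2 * CACHE_LINE_SIZE;

    // Rounds size up to the next multiple of alignment (alignment must be a power of two).
    static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & -alignment;
    }

    // Bytes reserved for a payload so that the next datum starts on a padding boundary.
    static long paddedSize(long payloadBytes) {
        return alignUp(payloadBytes, PADDING_SIZE);
    }

    public static void main(String[] args) {
        System.out.println(alignUp(8, CACHE_LINE_SIZE)); // alignment to one cache line
        System.out.println(paddedSize(8));               // padding for contended data
    }
}
```

Keeping the two constants apart means code that only needs alignment can use the real line size, while the padding value can be tuned on its own.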
-------------

PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1843796580

From sviswanathan at openjdk.org Wed Dec 6 23:07:37 2023
From: sviswanathan at openjdk.org (Sandhya Viswanathan)
Date: Wed, 6 Dec 2023 23:07:37 GMT
Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes
In-Reply-To: <501CMWdxAr9LoeE8uLipvQDWmf9g1hfV0M8Tm7f0iOU=.0122fdac-61ed-4ff4-91a4-0495e980d929@github.com>
References: <501CMWdxAr9LoeE8uLipvQDWmf9g1hfV0M8Tm7f0iOU=.0122fdac-61ed-4ff4-91a4-0495e980d929@github.com>
Message-ID:

On Wed, 6 Dec 2023 09:58:42 GMT, Stefan Karlsson wrote:

> Sure. I think it is a good patch. However, given that you added this comment:
>
> ```
> // Hardware prefetchers on current implementations may pull 2 cache lines
> // on access, therefore we pessimistically assume twice the cache line size
> // for padding.
> ```
>
> Do you have anything that backs up the claim that this is the case for "current implementations"? Maybe @sviswa7 can help answer whether this is still the case for Intel hardware?

From my understanding:
Padding to 64 bytes is needed to avoid cache line false sharing.
Padding to 256 bytes is recommended for heavily accessed contended data to avoid false sharing induced by prefetchers.
Padding to 128 bytes may be sufficient for general contended data.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1843830251

From xgong at openjdk.org Thu Dec 7 06:41:16 2023
From: xgong at openjdk.org (Xiaohong Gong)
Date: Thu, 7 Dec 2023 06:41:16 GMT
Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v8]
In-Reply-To:
References:
Message-ID: <_hHpYHZtmwFRLVj4waIAI1iWq8AINtdwx2Wtp-ZztrM=.c7f07b96-914f-4f67-a2a7-761be6e36e92@github.com>

> Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64.
Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). > > SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. > > To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error. But a warning is printed out if people specifies a nonexistent library explicitly. > > Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedbacks from more hotspot experts. > > [1] https://github.com/openjdk/jdk/pull/3638 > [2] https://sleef.org/ > [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ > [4] https://packages.debian.org/bookworm/libsleef3 > [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html Xiaohong Gong has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 12 additional commits since the last revision: - Remove -fvisibility in makefile and add the attribute in source code - Merge branch 'jdk:master' into JDK-8312425 - Add "--with-libsleef-lib" and "--with-libsleef-include" options - Separate neon and sve functions into two source files - Merge branch 'jdk:master' into JDK-8312425 - Rename vmath to sleef in configure - Address review comments in build system - Add a bundled native lib in jdk as a bridge to libsleef - Merge 'jdk:master' into JDK-8312425 - Disable sleef by default - ... and 2 more: https://git.openjdk.org/jdk/compare/6feb6794...c55357b6 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16234/files - new: https://git.openjdk.org/jdk/pull/16234/files/f3ff0672..c55357b6 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16234&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16234&range=06-07 Stats: 83778 lines in 1591 files changed: 38309 ins; 39305 del; 6164 mod Patch: https://git.openjdk.org/jdk/pull/16234.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16234/head:pull/16234 PR: https://git.openjdk.org/jdk/pull/16234 From sspitsyn at openjdk.org Thu Dec 7 07:08:47 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 7 Dec 2023 07:08:47 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable Message-ID: This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame. It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. The deadlocking scenario is well described by Patricio in a bug report comment. In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. 
The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about the beginning and end of the critical section. This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. Some of the new notifications made with the `notifyJvmtiSync()` method are on a performance-critical path, which is why this method has been intrinsified.

A new test was developed by Patricio: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock`
The test is very nice, as it reliably reproduces the deadlock (100% of the time) without the fix. The test never fails with this fix.

Testing:
 - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock`
 - tested with mach5 tiers 1-6

-------------

Commit messages:
 - added @summary to new test SuspendWithInterruptLock.java
 - add new test SuspendWithInterruptLock.java
 - 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable

Changes: https://git.openjdk.org/jdk/pull/17011/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17011&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8311218
  Stats: 183 lines in 15 files changed: 178 ins; 0 del; 5 mod
  Patch: https://git.openjdk.org/jdk/pull/17011.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17011/head:pull/17011

PR: https://git.openjdk.org/jdk/pull/17011

From stefank at openjdk.org Thu Dec 7 07:41:35 2023
From: stefank at openjdk.org (Stefan Karlsson)
Date: Thu, 7 Dec 2023 07:41:35 GMT
Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes
In-Reply-To:
References: <501CMWdxAr9LoeE8uLipvQDWmf9g1hfV0M8Tm7f0iOU=.0122fdac-61ed-4ff4-91a4-0495e980d929@github.com>
Message-ID:

On Wed, 6 Dec 2023 23:05:14 GMT, Sandhya Viswanathan wrote:

> > Sure. I think it is a good patch. However, given that you added this comment:
> > ```
> > // Hardware prefetchers on current implementations may pull 2 cache lines
> > // on access, therefore we pessimistically assume twice the cache line size
> > // for padding.
> > ```
> >
> > Do you have anything that backs up the claim that this is the case for "current implementations"? Maybe @sviswa7 can help answer whether this is still the case for Intel hardware?
>
> From my understanding: Padding to 64 bytes is needed to avoid cache line false sharing. Padding to 256 bytes is recommended for heavily accessed contended data to avoid false sharing induced by prefetchers. Padding to 128 bytes may be sufficient for general contended data.

Thanks, Sandhya.

I think I found the statement that I was remembering:
https://mail.openjdk.org/pipermail/zgc-dev/2018-March/000184.html

> FWIW, adjacent cache line prefetching is usually enabled for clients (desktops, laptops) and disabled for servers. It has been this way for a long time. For servers, the bandwidth penalty of adjacent cache line prefetching was likely the determining factor in this difference.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1844820964

From shade at openjdk.org Thu Dec 7 09:01:39 2023
From: shade at openjdk.org (Aleksey Shipilev)
Date: Thu, 7 Dec 2023 09:01:39 GMT
Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v2]
In-Reply-To:
References:
Message-ID:

On Wed, 6 Dec 2023 10:27:58 GMT, Aleksey Shipilev wrote:

>> [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro.
>> >> The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there. >> >> Additional testing: >> - [x] Large build matrix of server/zero builds >> - [x] Linux AArch64 server fastdebug, `tier{1,2}` >> - [x] Linux x86_64 server fastdebug, `tier{1,2}` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Better verbiage for *2 adjustment for x86_64 > - Merge branch 'master' into JDK-8237842-cache-line-padding-defs > - Work Folks, I submitted https://bugs.openjdk.org/browse/JDK-8321481 yesterday to figure out the padding situation on x86_64. Please migrate your comments about `*2` there, so they are not lost. I transplanted some of the comments from this PR there. Let's not allow to scope creep here. This PR is very specifically for splitting the definitions _without_ the actual value changes, and it allows to clearly set the padding size if we decide it should not really match the cache line size. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1844940457 From tschatzl at openjdk.org Thu Dec 7 09:07:17 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 7 Dec 2023 09:07:17 GMT Subject: RFR: 8319313: G1: Rename G1EvacFailureInjector appropriately [v4] In-Reply-To: References: Message-ID: <6lYTb7oNHXnVk8ftqiDv5TSK0vSS6FJUhvzhE0VdwEQ=.7894dfb8-71a3-41ae-88b7-1fdbd7aae1a5@github.com> > Hi all, > > please review this rename of `G1EvacFailureInjector` and associated options to `G1AllocationFailureInjector` according to the results of the discussion for the review of [JDK-8318706](https://bugs.openjdk.org/browse/JDK-8318706). 
> > To facilitate review the first commit implements the renaming changes, the second moves the affected files only. > > Testing: gha, local gc/g1 tests > > Thanks, > Thomas Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: Fix typos in test :( ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16905/files - new: https://git.openjdk.org/jdk/pull/16905/files/b23c057d..6d1191c5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16905&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16905&range=02-03 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16905.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16905/head:pull/16905 PR: https://git.openjdk.org/jdk/pull/16905 From mli at openjdk.org Thu Dec 7 09:22:41 2023 From: mli at openjdk.org (Hamlin Li) Date: Thu, 7 Dec 2023 09:22:41 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: References: Message-ID: <3Yo55KvqIwqYNK2Mj73eKJpwh-qkfS4P-0xkWD6rn5A=.efa8ef37-2968-4131-bdb1-0d03f64ad930@github.com> On Wed, 6 Dec 2023 21:58:55 GMT, Yuri Gaevsky wrote: >> Hello All, >> >> Please review these changes to support _vectorizedHashCode intrinsic on >> RISC-V platform. The patch adds the "scalar" code for the intrinsic without >> usage of any RVV instruction but provides manual unrolling of the appropriate >> loop. The code with usage of RVV instruction could be added as follow-up of >> the patch or independently. >> >> Thanks, >> -Yuri Gaevsky >> >> P.S. My OCA has been accepted recently (ygaevsky). >> >> ### Correctness checks >> >> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux. 
>>
>> ### Performance results (the numbers for non-ints are similar)
>>
>> #### StarFive JH7110 board:
>>
>>
>> ArraysHashCode:               without intrinsic          with intrinsic
>> -------------------------------------------------------------------------------
>> Benchmark  (size)  Mode  Cnt       Score     Error       Score     Error  Units
>> -------------------------------------------------------------------------------
>> multiints       0  avgt   30       2.658 ±   0.001       2.661 ±   0.004  ns/op
>> multiints       1  avgt   30       4.881 ±   0.011       4.892 ±   0.015  ns/op
>> multiints       2  avgt   30      16.109 ±   0.041      10.451 ±   0.075  ns/op
>> multiints       3  avgt   30      14.873 ±   0.068      11.753 ±   0.024  ns/op
>> multiints       4  avgt   30      17.283 ±   0.078      13.176 ±   0.044  ns/op
>> multiints       5  avgt   30      19.691 ±   0.136      14.723 ±   0.046  ns/op
>> multiints       6  avgt   30      21.727 ±   0.166      15.463 ±   0.124  ns/op
>> multiints       7  avgt   30      23.790 ±   0.126      18.298 ±   0.059  ns/op
>> multiints       8  avgt   30      23.527 ±   0.116      18.267 ±   0.046  ns/op
>> multiints       9  avgt   30      27.981 ±   0.303      20.453 ±   0.069  ns/op
>> multiints      10  avgt   30      26.947 ±   0.215      20.541 ±   0.051  ns/op
>> multiints      50  avgt   30      95.373 ±   0.588      69.238 ±   0.208  ns/op
>> multiints     100  avgt   30     177.109 ±   0.525     137.852 ±   0.417  ns/op
>> multiints     200  avgt   30     341.074 ±   1.363     296.832 ±   0.725  ns/op
>> multiints     500  avgt   30     847.993 ±   1.713     752.415 ±   1.918  ns/op
>> multiints    1000  avgt   30    1610.199 ±   5.424    1426.112 ±   3.407  ns/op
>> multiints   10000  avgt   30   16234.260 ±  26.789   14447.936 ±  26.345  ns/op
>> multiints  100000  avgt   30  170726.025 ± 184.003  152587.649 ± 381.964  ns/op
>> ---------------------------------------...
>
> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:
>
>   Added two temp registers for loads; all loads in wide loop has been moved to the start of the loop.

This is interesting, as in another patch (https://github.com/openjdk/jdk/pull/16453), a small performance improvement is achieved in the reverse way.
------------- PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1844976434 From xgong at openjdk.org Thu Dec 7 09:30:01 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Thu, 7 Dec 2023 09:30:01 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v9] In-Reply-To: References: Message-ID: > Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64. Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). > > SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. > > To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error. But a warning is printed out if people specifies a nonexistent library explicitly. > > Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedbacks from more hotspot experts. 
> > [1] https://github.com/openjdk/jdk/pull/3638 > [2] https://sleef.org/ > [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ > [4] https://packages.debian.org/bookworm/libsleef3 > [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: Fix potential attribute issue ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16234/files - new: https://git.openjdk.org/jdk/pull/16234/files/c55357b6..7a4be736 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16234&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16234&range=07-08 Stats: 5 lines in 1 file changed: 4 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16234.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16234/head:pull/16234 PR: https://git.openjdk.org/jdk/pull/16234 From xgong at openjdk.org Thu Dec 7 09:30:05 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Thu, 7 Dec 2023 09:30:05 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v7] In-Reply-To: <0ya82eFBzsE0U96QMoP7OKmd7PAvW7GFXYP_iD_HTqE=.f12ca572-4a4e-4fbd-947b-e11f0aad81a1@github.com> References: <0ya82eFBzsE0U96QMoP7OKmd7PAvW7GFXYP_iD_HTqE=.f12ca572-4a4e-4fbd-947b-e11f0aad81a1@github.com> Message-ID: On Wed, 6 Dec 2023 11:46:03 GMT, Magnus Ihse Bursie wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: >> >> Add "--with-libsleef-lib" and "--with-libsleef-include" options > > make/modules/jdk.incubator.vector/Lib.gmk line 45: > >> 43: $(eval $(call SetupJdkLibrary, BUILD_LIBVMATH, \ >> 44: NAME := vmath, \ >> 45: CFLAGS := $(CFLAGS_JDKLIB) $(LIBSLEEF_CFLAGS) -fvisibility=default, \ > > Why `-fvisibility=default`? (Sorry, only noticed this now) Yeah. 
Considering all the symbols in this lib are global and need to be exported, I added this flag here instead of the source code. I'v removed this in latest commit, and added the attribute visibility in source code like other jdk code. Please help to review again. Thanks a lot! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16234#discussion_r1418458207 From ihse at openjdk.org Thu Dec 7 09:42:45 2023 From: ihse at openjdk.org (Magnus Ihse Bursie) Date: Thu, 7 Dec 2023 09:42:45 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v9] In-Reply-To: References: Message-ID: On Thu, 7 Dec 2023 09:30:01 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64. Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. >> >> To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error. 
But a warning is printed out if people specifies a nonexistent library explicitly. >> >> Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedbacks from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Fix potential attribute issue Build changes finally look good. Great, actually! Thanks for persisting, despite the many rounds of review. You will still need the 2 hotspot reviews for the hotspot part of the patch. ------------- Marked as reviewed by ihse (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16234#pullrequestreview-1769660206 From avoitylov at openjdk.org Thu Dec 7 10:31:53 2023 From: avoitylov at openjdk.org (Aleksei Voitylov) Date: Thu, 7 Dec 2023 10:31:53 GMT Subject: RFR: 8321515: ARM32: Move method resolution information out of the cpCache properly Message-ID: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> Thanks to @matias9927, JDK-8320278 fixed the JDK build for ARM32 after JDK-8301997. This PR introduces some additional fixes that enable the ARM32 port to actually work. 
------------- Commit messages: - JDK-8321515 implementation Changes: https://git.openjdk.org/jdk/pull/17017/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17017&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321515 Stats: 50 lines in 3 files changed: 24 ins; 6 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/17017.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17017/head:pull/17017 PR: https://git.openjdk.org/jdk/pull/17017 From jbachorik at openjdk.org Thu Dec 7 10:43:37 2023 From: jbachorik at openjdk.org (Jaroslav Bachorik) Date: Thu, 7 Dec 2023 10:43:37 GMT Subject: RFR: 8211238: @Deprecated JFR event [v17] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 11:00:06 GMT, Markus Grönlund wrote: >> Greetings, >> >> please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. >> >> Testing: jdk_jfr, CI 1-6, stress testing >> >> Thanks >> Markus > > Markus Grönlund has updated the pull request incrementally with one additional commit since the last revision: > > reviewer feedback Marked as reviewed by jbachorik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16931#pullrequestreview-1769813935 From mgronlun at openjdk.org Thu Dec 7 10:48:46 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 7 Dec 2023 10:48:46 GMT Subject: RFR: 8211238: @Deprecated JFR event [v8] In-Reply-To: <2yQLLCsc2Ux9NqSE0NTz6yWyg6sxO3LWlkMK8h6jyrk=.c61993f8-cdd5-4758-ab6e-a10e6505dd37@github.com> References: <2yQLLCsc2Ux9NqSE0NTz6yWyg6sxO3LWlkMK8h6jyrk=.c61993f8-cdd5-4758-ab6e-a10e6505dd37@github.com> Message-ID: On Mon, 4 Dec 2023 15:08:55 GMT, Jaroslav Bachorik wrote: >> Hi @mgronlun - sorry for opening a design discussion in PR :( >> >> I wonder - will this report each single one invocation of a deprecated method conforming to the rules (JDK method called from non-JDK code)?
Can this, potentially, flood the recording if the deprecated method gets called from a hot loop? > >> Hi @jbachorik, it will only report one event per unique call site, during link time. It's not a function of hotness, only unique edge discovery. > > Excellent! Thanks! Many thanks, @jbachorik and @egahlin, for your reviews - much appreciated! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16931#issuecomment-1845110765 From mgronlun at openjdk.org Thu Dec 7 10:48:51 2023 From: mgronlun at openjdk.org (Markus Grönlund) Date: Thu, 7 Dec 2023 10:48:51 GMT Subject: Integrated: 8211238: @Deprecated JFR event In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 21:15:49 GMT, Markus Grönlund wrote: > Greetings, > > please help review this enhancement to add a JFR event to report runtime invocations of methods that have been declared deprecated in the JDK. > > Testing: jdk_jfr, CI 1-6, stress testing > > Thanks > Markus This pull request has now been integrated. Changeset: 49fff013 Author: Markus Grönlund URL: https://git.openjdk.org/jdk/commit/49fff0132bb470d8ae28355e4d5f4789a1b6d54d Stats: 2517 lines in 68 files changed: 2156 ins; 250 del; 111 mod 8211238: @Deprecated JFR event Reviewed-by: egahlin, jbachorik ------------- PR: https://git.openjdk.org/jdk/pull/16931 From redestad at openjdk.org Thu Dec 7 10:59:41 2023 From: redestad at openjdk.org (Claes Redestad) Date: Thu, 7 Dec 2023 10:59:41 GMT Subject: RFR: 8321468: Remove StringUTF16::equals Message-ID: https://bugs.openjdk.org/browse/JDK-8215017 removed the only use of `StringUTF16::equals`. At the time I did some performance verification focused on x86 showing that simplifying and only using `StringLatin1::equals` was either neutral or a win.
I repeated this experiment recently, adding some focused tests on aarch64 where the code generation actually tries to take advantage and generate slightly more efficient code for `StringUTF16::equals`: https://github.com/openjdk/jdk/pull/16933#discussion_r1414118658 The indication here is that disabling use of `StringUTF16::equals` was the right choice: any effect from the low-level optimization (one less branch at the tail end) was offset by the `isLatin1()` branch and added code generation (that all gets inlined). In a `-XX:-CompactStrings` configuration the slightly improved code generation in `StringUTF16::equals` might help, since the `isLatin1()` test and subsequent call to `StringLatin1::equals` would be DCEd. To get the best of both worlds the code in `String::equals` _could_ be sharpened so that we statically pick the best implementation based on `CompactStrings` mode (see comment below). This shows a tiny win (up to -0.2ns/op per `String::equals` on M1; neutral on x86). But is all this complexity worth it for a gain that will get lost in the noise on anything realistic? This PR instead proposes removing `StringUTF16::equals` and simplifying the mechanisms to support the `StringLatin1/UTF16::equals` pair of intrinsics in hotspot.
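To illustrate why a single byte-wise comparison can serve both encodings, here is a minimal sketch. It is not the JDK code: `MiniString`, `LATIN1`, `UTF16` and `contentEquals` are hypothetical names, and the byte layout (ISO-8859-1 or big-endian UTF-16) is an assumption chosen only for the illustration. The idea is that once the two coders are known to match, equality reduces to comparing the backing byte arrays, whichever encoding they hold.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-in for a (value, coder) string representation; not JDK code.
final class MiniString {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    final byte[] value;
    final byte coder;

    MiniString(String s) {
        // Use the compact one-byte form only if every char fits in 8 bits.
        boolean latin1 = s.chars().allMatch(c -> c <= 0xFF);
        this.coder = latin1 ? LATIN1 : UTF16;
        this.value = latin1
                ? s.getBytes(StandardCharsets.ISO_8859_1)
                : s.getBytes(StandardCharsets.UTF_16BE);
    }

    // Same coder + same bytes means same content, so one array comparison
    // is enough regardless of which encoding the arrays hold.
    boolean contentEquals(MiniString other) {
        return coder == other.coder && Arrays.equals(value, other.value);
    }
}
```

The strings used here mirror the `test8`/`test9`/`test10` values from the benchmark: `new MiniString("12\u01FE").contentEquals(new MiniString("12\u01FF"))` is false, while comparing two `"12\u01FE"` instances is true.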
------------- Commit messages: - Fix and further cleanup RISC - Remove StringUTF16::equals Changes: https://git.openjdk.org/jdk/pull/16995/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16995&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321468 Stats: 138 lines in 14 files changed: 0 ins; 113 del; 25 mod Patch: https://git.openjdk.org/jdk/pull/16995.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16995/head:pull/16995 PR: https://git.openjdk.org/jdk/pull/16995 From redestad at openjdk.org Thu Dec 7 10:59:42 2023 From: redestad at openjdk.org (Claes Redestad) Date: Thu, 7 Dec 2023 10:59:42 GMT Subject: RFR: 8321468: Remove StringUTF16::equals In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 14:20:14 GMT, Claes Redestad wrote: > https://bugs.openjdk.org/browse/JDK-8215017 removed the only use of `StringUTF16::equals`. At the time I did some performance verification focused on x86 showing that simplifying and only using `StringLatin1::equals` was either neutral or a win. > > I repeated this experiment recently, adding some focused tests on aarch64 where the code generation actually tries to take advantage and generate slightly more efficient code for `StringUTF16::equals`: > https://github.com/openjdk/jdk/pull/16933#discussion_r1414118658 > > The indication here is that disabling use of `StringUTF16::equals` was the right choice: any effect from the low-level optimization (one less branch at the tail end) was offset by the `isLatin1()` branch and added code generation (that all gets inlined). > > In a `-XX:-CompactStrings` configuration the slightly improved code generation in `StringUTF16::equals` might help, since the `isLatin1()` test and subsequent call to `StringLatin1::equals` would be DCEd. To get the best of both worlds the code in `String::equals` _could_ be sharpened so that we statically pick the best implementation based on `CompactStrings` mode (see comment below). 
This shows a tiny win (up to -0.2ns/op per `String::equals` on M1; neutral on x86). But is all this complexity worth it for a gain that will get lost in the noise on anything realistic? > > This PR instead proposes removing `StringUTF16::equals` and simplifying the mechanisms to support the `StringLatin1/UTF16::equals` pair of intrinsics in hotspot. For reference these are the microbenchmarks used in the JDK-8215017 verification experiment: diff --git a/test/micro/org/openjdk/bench/java/lang/StringEquals.java b/test/micro/org/openjdk/bench/java/lang/StringEquals.java index b0db6a7037e..effe855c228 100644 --- a/test/micro/org/openjdk/bench/java/lang/StringEquals.java +++ b/test/micro/org/openjdk/bench/java/lang/StringEquals.java @@ -43,6 +43,9 @@ public class StringEquals { public String test5 = new String(test4); // equal to test4, but not same public String test6 = new String("0123456780"); public String test7 = new String("0123\u01FE"); + public String test8 = new String("12\u01FE"); + public String test9 = new String("12\u01FF"); + public String test10 = new String("12\u01FE"); @Benchmark public boolean different() { @@ -73,5 +76,15 @@ public boolean differentCoders() { public boolean equalsUTF16() { return test5.equals(test4); } + + @Benchmark + public boolean equalsUTF16_3_NE() { + return test8.equals(test9); + } + + @Benchmark + public boolean equalsUTF16_3_EQ() { + return test8.equals(test10); + } } And this is the change to `String` to get it back to pre-JDK-8215017 state: diff --git a/src/java.base/share/classes/java/lang/String.java b/src/java.base/share/classes/java/lang/String.java index 5869e086191..18ad0e85d33 100644 --- a/src/java.base/share/classes/java/lang/String.java +++ b/src/java.base/share/classes/java/lang/String.java @@ -1900,9 +1900,13 @@ public boolean equals(Object anObject) { if (this == anObject) { return true; } - return (anObject instanceof String aString) - && (!COMPACT_STRINGS || this.coder == aString.coder) - &&
StringLatin1.equals(value, aString.value); + if (anObject instanceof String aString) { + if (coder() == aString.coder()) { + return isLatin1() ? StringLatin1.equals(value, aString.value) + : StringUTF16.equals(value, aString.value); + } + } + return false; } /** M1 experiment with `-XX:-CompactStrings` and a patch to choose the `StringUTF16::equals` if `COMPACT_STRINGS` is false[1]: Name Cnt Base Error Test Error Unit Change StringEquals.equalsUTF16_3_EQ 5 1,799 ± 0,008 1,585 ± 0,009 ns/op 1,14x (p = 0,000*) StringEquals.equalsUTF16_3_NE 5 1,622 ± 0,007 1,541 ± 0,011 ns/op 1,05x (p = 0,000*) * = significant [1] diff --git a/src/java.base/share/classes/java/lang/String.java b/src/java.base/share/classes/java/lang/String.java index 5869e086191..10451bda83f 100644 --- a/src/java.base/share/classes/java/lang/String.java +++ b/src/java.base/share/classes/java/lang/String.java @@ -1900,9 +1900,14 @@ public boolean equals(Object anObject) { if (this == anObject) { return true; } - return (anObject instanceof String aString) - && (!COMPACT_STRINGS || this.coder == aString.coder) - && StringLatin1.equals(value, aString.value); + if (anObject instanceof String aString) { + if (COMPACT_STRINGS) { + return this.coder == aString.coder && StringLatin1.equals(value, aString.value); + } else { + return StringUTF16.equals(value, aString.value); + } + } + return false; } /** As expected this shows a tiny but measurable win on non-x86 platforms for `-CompactStrings` when retaining `StringUTF16::equals` and selecting it statically. For `+CompactStrings` this is performance neutral with the baseline.
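As a quick sanity check of the table above, the "Change" column appears to be the Base time divided by the Test time (the figures use decimal commas). A small sketch of that arithmetic, with `SpeedupCheck` and `ratio` as hypothetical helper names:

```java
import java.util.Locale;

// Hypothetical helper; recomputes the "Change" column as Base / Test.
final class SpeedupCheck {
    static double ratio(double baseNsPerOp, double testNsPerOp) {
        return baseNsPerOp / testNsPerOp;
    }

    // Formats a ratio with two decimals and a trailing "x", using '.' as separator.
    static String format(double ratio) {
        return String.format(Locale.ROOT, "%.2fx", ratio);
    }
}
```

With the numbers from the table, `SpeedupCheck.format(SpeedupCheck.ratio(1.799, 1.585))` gives `1.14x` and `SpeedupCheck.format(SpeedupCheck.ratio(1.622, 1.541))` gives `1.05x`, which corresponds to the roughly 0.2 ns/op and 0.08 ns/op differences between the Base and Test columns.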
------------- PR Comment: https://git.openjdk.org/jdk/pull/16995#issuecomment-1845096784 PR Comment: https://git.openjdk.org/jdk/pull/16995#issuecomment-1845110919 From duke at openjdk.org Thu Dec 7 11:11:42 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Thu, 7 Dec 2023 11:11:42 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 21:58:55 GMT, Yuri Gaevsky wrote: >> Hello All, >> >> Please review these changes to support _vectorizedHashCode intrinsic on >> RISC-V platform. The patch adds the "scalar" code for the intrinsic without >> usage of any RVV instruction but provides manual unrolling of the appropriate >> loop. The code with usage of RVV instruction could be added as follow-up of >> the patch or independently. >> >> Thanks, >> -Yuri Gaevsky >> >> P.S. My OCA has been accepted recently (ygaevsky). >> >> ### Correctness checks >> >> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux. >> >> ### Performance results (the numbers for non-ints are similar) >> >> #### StarFive JH7110 board: >> >> >> ArraysHashCode: without intrinsic with intrinsic >> ------------------------------------------------------------------------------- >> Benchmark (size) Mode Cnt Score Error Score Error Units >> ------------------------------------------------------------------------------- >> multiints 0 avgt 30 2.658 ± 0.001 2.661 ± 0.004 ns/op >> multiints 1 avgt 30 4.881 ± 0.011 4.892 ± 0.015 ns/op >> multiints 2 avgt 30 16.109 ± 0.041 10.451 ± 0.075 ns/op >> multiints 3 avgt 30 14.873 ± 0.068 11.753 ± 0.024 ns/op >> multiints 4 avgt 30 17.283 ± 0.078 13.176 ± 0.044 ns/op >> multiints 5 avgt 30 19.691 ± 0.136 14.723 ± 0.046 ns/op >> multiints 6 avgt 30 21.727 ± 0.166 15.463 ± 0.124 ns/op >> multiints 7 avgt 30 23.790 ± 0.126 18.298 ± 0.059 ns/op >> multiints 8 avgt 30 23.527 ± 0.116 18.267 ± 0.046 ns/op >> multiints 9 avgt 30 27.981 ± 0.303 20.453 ±
0.069 ns/op >> multiints 10 avgt 30 26.947 ± 0.215 20.541 ± 0.051 ns/op >> multiints 50 avgt 30 95.373 ± 0.588 69.238 ± 0.208 ns/op >> multiints 100 avgt 30 177.109 ± 0.525 137.852 ± 0.417 ns/op >> multiints 200 avgt 30 341.074 ± 1.363 296.832 ± 0.725 ns/op >> multiints 500 avgt 30 847.993 ± 1.713 752.415 ± 1.918 ns/op >> multiints 1000 avgt 30 1610.199 ± 5.424 1426.112 ± 3.407 ns/op >> multiints 10000 avgt 30 16234.260 ± 26.789 14447.936 ± 26.345 ns/op >> multiints 100000 avgt 30 170726.025 ± 184.003 152587.649 ± 381.964 ns/op >> ---------------------------------------... > > Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: > > Added two temp registers for loads; all loads in wide loop has been moved to the start of the loop. > This is interesting, as in another patch (#16453), a little perf improvement is achieved in the reverse way. Just for clarity, yesterday at first I added two additional temp registers for memory loads to avoid register dependencies, and that didn't help at all. After that I moved all loads to the start of the "wide" loop, and exactly that step fixed the performance regressions on SiFive/StarFive, so now the performance numbers on them are consistent with the THead ones. By the way, could you please provide more details for the #16453 perf improvement mentioned above? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1845147318 From mli at openjdk.org Thu Dec 7 14:19:39 2023 From: mli at openjdk.org (Hamlin Li) Date: Thu, 7 Dec 2023 14:19:39 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 21:58:55 GMT, Yuri Gaevsky wrote: >> Hello All, >> >> Please review these changes to support _vectorizedHashCode intrinsic on >> RISC-V platform. The patch adds the "scalar" code for the intrinsic without >> usage of any RVV instruction but provides manual unrolling of the appropriate >> loop.
The code with usage of RVV instruction could be added as follow-up of >> the patch or independently. >> >> Thanks, >> -Yuri Gaevsky >> >> P.S. My OCA has been accepted recently (ygaevsky). >> >> ### Correctness checks >> >> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux. >> >> ### Performance results (the numbers for non-ints are similar) >> >> #### StarFive JH7110 board: >> >> >> ArraysHashCode: without intrinsic with intrinsic >> ------------------------------------------------------------------------------- >> Benchmark (size) Mode Cnt Score Error Score Error Units >> ------------------------------------------------------------------------------- >> multiints 0 avgt 30 2.658 ± 0.001 2.661 ± 0.004 ns/op >> multiints 1 avgt 30 4.881 ± 0.011 4.892 ± 0.015 ns/op >> multiints 2 avgt 30 16.109 ± 0.041 10.451 ± 0.075 ns/op >> multiints 3 avgt 30 14.873 ± 0.068 11.753 ± 0.024 ns/op >> multiints 4 avgt 30 17.283 ± 0.078 13.176 ± 0.044 ns/op >> multiints 5 avgt 30 19.691 ± 0.136 14.723 ± 0.046 ns/op >> multiints 6 avgt 30 21.727 ± 0.166 15.463 ± 0.124 ns/op >> multiints 7 avgt 30 23.790 ± 0.126 18.298 ± 0.059 ns/op >> multiints 8 avgt 30 23.527 ± 0.116 18.267 ± 0.046 ns/op >> multiints 9 avgt 30 27.981 ± 0.303 20.453 ± 0.069 ns/op >> multiints 10 avgt 30 26.947 ± 0.215 20.541 ± 0.051 ns/op >> multiints 50 avgt 30 95.373 ± 0.588 69.238 ± 0.208 ns/op >> multiints 100 avgt 30 177.109 ± 0.525 137.852 ± 0.417 ns/op >> multiints 200 avgt 30 341.074 ± 1.363 296.832 ± 0.725 ns/op >> multiints 500 avgt 30 847.993 ± 1.713 752.415 ± 1.918 ns/op >> multiints 1000 avgt 30 1610.199 ± 5.424 1426.112 ± 3.407 ns/op >> multiints 10000 avgt 30 16234.260 ± 26.789 14447.936 ± 26.345 ns/op >> multiints 100000 avgt 30 170726.025 ± 184.003 152587.649 ± 381.964 ns/op >> ---------------------------------------...
> > Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: > > Added two temp registers for loads; all loads in wide loop has been moved to the start of the loop. It's all in the pr https://github.com/openjdk/jdk/pull/16453, I don't have much other information. The major perf opt method of that pr is `do the loads from the buffer more incrementally instead of all in one go`, which I think is the opposite you're doing here. Seems the major difference between this and that pr is, in that pr it has lots of more work to do between the `incremental load`. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1845420124 From dchuyko at openjdk.org Thu Dec 7 16:31:22 2023 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 7 Dec 2023 16:31:22 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v12] In-Reply-To: References: Message-ID: <6NtI6fuDxn2uXl9aud0ztXfS5VSMd7_g6_rQrLyeWP4=.cabafba9-06c0-4c49-a39a-48e61e46ee4b@github.com> > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. 
For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > Natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method letting compile broker to perform it taking new directives stack into account. Re-compilation helps to prevent hot methods from execution in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 30 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - jcheck - Unnecessary import - force_update->refresh - ... and 20 more: https://git.openjdk.org/jdk/compare/a7f60164...21fe715e ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=11 Stats: 372 lines in 15 files changed: 339 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From tschatzl at openjdk.org Thu Dec 7 16:47:50 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 7 Dec 2023 16:47:50 GMT Subject: RFR: 8319313: G1: Rename G1EvacFailureInjector appropriately [v3] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 17:17:06 GMT, Albert Mingkun Yang wrote: >> Thomas Schatzl has updated the pull request incrementally with one additional commit since the last revision: >> >> rename flags after internal discussion > > Marked as reviewed by ayang (Reviewer). 
Thanks @albertnetymk @walulyai @Hamlin-Li for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/16905#issuecomment-1845679175 From tschatzl at openjdk.org Thu Dec 7 16:47:53 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 7 Dec 2023 16:47:53 GMT Subject: Integrated: 8319313: G1: Rename G1EvacFailureInjector appropriately In-Reply-To: References: Message-ID: On Thu, 30 Nov 2023 15:29:22 GMT, Thomas Schatzl wrote: > Hi all, > > please review this rename of `G1EvacFailureInjector` and associated options to `G1AllocationFailureInjector` according to the results of the discussion for the review of [JDK-8318706](https://bugs.openjdk.org/browse/JDK-8318706). > > To facilitate review the first commit implements the renaming changes, the second moves the affected files only. > > Testing: gha, local gc/g1 tests > > Thanks, > Thomas This pull request has now been integrated. Changeset: 86f9b3f5 Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/86f9b3f52a0675be4dd8096da0c65d6bda442f7b Stats: 741 lines in 19 files changed: 343 ins; 335 del; 63 mod 8319313: G1: Rename G1EvacFailureInjector appropriately Reviewed-by: mli, iwalulya, ayang ------------- PR: https://git.openjdk.org/jdk/pull/16905 From duke at openjdk.org Thu Dec 7 18:19:59 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Thu, 7 Dec 2023 18:19:59 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: References: Message-ID: On Thu, 7 Dec 2023 14:16:33 GMT, Hamlin Li wrote: >> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: >> >> Added two temp registers for loads; all loads in wide loop has been moved to the start of the loop. > > It's all in the pr https://github.com/openjdk/jdk/pull/16453, I don't have much other information. 
> The major perf opt method of that pr is `do the loads from the buffer more incrementally instead of all in one go`, which I think is the opposite you're doing here. Seems the major difference between this and that pr is, in that pr it has lots of more work to do between the `incremental load`. Thanks @Hamlin-Li, I see now what you mean, that's interesting. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1845877661 From kvn at openjdk.org Thu Dec 7 19:26:24 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 7 Dec 2023 19:26:24 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v2] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 10:27:58 GMT, Aleksey Shipilev wrote: >> [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro. >> >> The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there. >> >> Additional testing: >> - [x] Large build matrix of server/zero builds >> - [x] Linux AArch64 server fastdebug, `tier{1,2}` >> - [x] Linux x86_64 server fastdebug, `tier{1,2}` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Better verbiage for *2 adjustment for x86_64 > - Merge branch 'master' into JDK-8237842-cache-line-padding-defs > - Work Agree. 
------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16973#pullrequestreview-1770888708 From eastigeevich at openjdk.org Thu Dec 7 19:57:21 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Thu, 7 Dec 2023 19:57:21 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v5] In-Reply-To: References: <81dXSHvLQMGj3s1BcBs8fmJUEoJpaU-5wBRSIjnztMM=.d53f8a2f-8353-49ec-8a9b-695b32f03d20@github.com> Message-ID: On Tue, 5 Dec 2023 20:13:11 GMT, Chris Plummer wrote: >> Hi Chris, >> The current design of `write_perf_map` provides a clean and explicit interface. The purpose of the function is evident from its signature: to write a perf map into a specified file. This explicitness makes the code more readable and self-documenting. It reduces the need for developers to go to the implementation to figure out: what is the meaning of `nullptr`; where a filename will be taken from. It also serves as a contract between the caller and the function itself. By explicitly requiring a filename, the function sets clear expectations for the caller. >> >> I think `CodeCache::write_default_perf_map` hiding the filename of the default perf map might not be a good idea because it makes impossible to get the filename used in it. I prefer either method `CodeCache::defaultPerfmapFileName()` or class `CodeCache::DefaultPerfmapFileName`. The class is simpler to implement than the method (like it was earlier). > > The default filename was already "hidden" before these changes, so at the very least things are not being made any worse, but I don't see why any users `write_perf_map` would ever need the default filename. I just felt that adding and exporting a class whose only purpose is to provide the default name seemed like unnecessary overkill. I'm not so sure having a public CodeCache::defaultPerfmapFileName() API and two `write_perf_map` APIs isn't overkill also. 
There is nothing wrong with a null filename argument signaling to use some default name. You can also have the filename arg default to `nullptr`. Ok, let's have: void CodeCache::write_perf_map(const char* filename = nullptr); without any additional classes or functions. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15871#discussion_r1419537894 From omikhaltcova at openjdk.org Thu Dec 7 21:35:16 2023 From: omikhaltcova at openjdk.org (Olga Mikhaltsova) Date: Thu, 7 Dec 2023 21:35:16 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6] In-Reply-To: <7GqjvcQqsvlQZw4-4iKvUElpeZz717Nf8uTd_YY_LBk=.1bfd1f5a-5670-4962-9620-9c93a192a033@github.com> References: <7GqjvcQqsvlQZw4-4iKvUElpeZz717Nf8uTd_YY_LBk=.1bfd1f5a-5670-4962-9620-9c93a192a033@github.com> Message-ID: On Tue, 5 Dec 2023 03:33:52 GMT, Fei Yang wrote: >> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: >> >> Replaced tmp with t0 > > Unfortunately, I witnessed performance regression on sifive unmatched board. > > Before: > > FpRoundingBenchmark.test_ceil 2048 thrpt 15 39.243 ± 0.506 ops/ms > FpRoundingBenchmark.test_floor 2048 thrpt 15 39.448 ± 0.076 ops/ms > FpRoundingBenchmark.test_rint 2048 thrpt 15 39.411 ± 0.134 ops/ms > FpRoundingBenchmark.test_round_double 2048 thrpt 15 31.329 ± 0.085 ops/ms > FpRoundingBenchmark.test_round_float 2048 thrpt 15 31.328 ± 0.031 ops/ms > > After: > > FpRoundingBenchmark.test_ceil 2048 thrpt 15 39.375 ± 0.125 ops/ms > FpRoundingBenchmark.test_floor 2048 thrpt 15 39.407 ± 0.076 ops/ms > FpRoundingBenchmark.test_rint 2048 thrpt 15 39.387 ± 0.235 ops/ms > FpRoundingBenchmark.test_round_double 2048 thrpt 15 23.940 ± 0.025 ops/ms > FpRoundingBenchmark.test_round_float 2048 thrpt 15 30.629 ± 0.021 ops/ms @RealFYang I've reproduced this performance regression on VisionFive 2.
The results are as follows: Before Benchmark (TESTSIZE) Mode Cnt Score Error Units FpRoundingBenchmark.test_round_double 2048 thrpt 15 39.335 ± 0.122 ops/ms FpRoundingBenchmark.test_round_float 2048 thrpt 15 39.327 ± 0.138 ops/ms After FpRoundingBenchmark.test_round_double 2048 thrpt 15 30.004 ± 0.192 ops/ms FpRoundingBenchmark.test_round_float 2048 thrpt 15 38.489 ± 0.120 ops/ms ------------- PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1846144148 From duke at openjdk.org Thu Dec 7 22:49:55 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Thu, 7 Dec 2023 22:49:55 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v7] In-Reply-To: References: Message-ID: > `jcmd Compiler.perfmap` uses the hard-coded file name for a perf map: `/tmp/perf-%d.map`. This change adds an optional argument for specifying a file name. > > `jcmd PID help Compiler.perfmap` shows the following usage. > > > Compiler.perfmap > Write map file for Linux perf tool. > > Impact: Low > > Syntax : Compiler.perfmap [<filename>] > > Arguments: > filename : [optional] Name of the map file (STRING, no default value) > > > The following section of man page is also updated. (`man -l src/jdk.jcmd/share/man/jcmd.1`) > > > Compiler.perfmap [arguments] (Linux only) > Write map file for Linux perf tool. > > Impact: Low > > arguments: > > • filename: (Optional) Name of the map file (STRING, no default value) > > If filename is not specified, a default file name is chosen using the pid of the target JVM process. For example, if the pid is 12345, then > the default filename will be /tmp/perf-12345.map.
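For tools that need to locate the map file when no filename is passed, the default name described above can be reconstructed from the pid. A minimal sketch, assuming only the `/tmp/perf-<pid>.map` pattern quoted from the man page; `PerfMapName` and `defaultFor` are hypothetical names and not part of the JDK:

```java
// Hypothetical helper; mirrors the documented default "/tmp/perf-<pid>.map" pattern.
final class PerfMapName {
    // Returns the default perf map path for the given target JVM pid.
    static String defaultFor(long pid) {
        return String.format("/tmp/perf-%d.map", pid);
    }

    // Convenience overload for the current process.
    static String defaultForCurrentProcess() {
        return defaultFor(ProcessHandle.current().pid());
    }
}
```

For example, `PerfMapName.defaultFor(12345)` yields `/tmp/perf-12345.map`, matching the example in the man page text.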
Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: use default argument of write_perf_map ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15871/files - new: https://git.openjdk.org/jdk/pull/15871/files/6a854920..dbe223c5 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15871&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15871&range=05-06 Stats: 24 lines in 4 files changed: 8 ins; 13 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/15871.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15871/head:pull/15871 PR: https://git.openjdk.org/jdk/pull/15871 From duke at openjdk.org Thu Dec 7 23:13:48 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Thu, 7 Dec 2023 23:13:48 GMT Subject: RFR: 8316197: Make tracing of inline cache available in unified logging Message-ID: This removes develop flag `TraceICs` and makes the logs available via `-Xlog`. Example: % java -Xlog:inlinecache=trace -version [0.061s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739001d17: set_to_interpreted java.lang.StringLatin1.hashCode([B)I [0.078s][trace][inlinecache] IC at 0x00007f3739004a87: monomorphic to compiled (rcvr klass = nullptr) [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739005dff: set_to_interpreted jdk.internal.util.ArraysSupport.vectorizedHashCode(Ljava/lang/Object;IIII)I [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900502f: set_to_interpreted jdk.internal.org.objectweb.asm.ByteVector.enlarge(I)V [0.079s][trace][inlinecache] IC at 0x00007f373900502f: monomorphic to interpreter: {method} {0x00007f36f03e6318} 'enlarge' '(I)V' in 'jdk/internal/org/objectweb/asm/ByteVector' [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739006b0f: set_to_compiled 0x00007f3739002120 [0.083s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900928f: set_to_interpreted java.lang.AbstractStringBuilder.newCapacity(I)I 
[0.083s][trace][inlinecache] IC at 0x00007f373900928f: monomorphic to interpreter: {method} {0x00007f36f00cd170} 'newCapacity' '(I)I' in 'java/lang/AbstractStringBuilder' ... ------------- Commit messages: - Revert removing TraceInlineCacheClearing - Add logging headers - Fix tag order - 8316197: Make tracing of inline cache available in unified logging Changes: https://git.openjdk.org/jdk/pull/17026/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17026&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8316197 Stats: 33 lines in 9 files changed: 6 ins; 3 del; 24 mod Patch: https://git.openjdk.org/jdk/pull/17026.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17026/head:pull/17026 PR: https://git.openjdk.org/jdk/pull/17026 From xgong at openjdk.org Fri Dec 8 00:53:27 2023 From: xgong at openjdk.org (Xiaohong Gong) Date: Fri, 8 Dec 2023 00:53:27 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v9] In-Reply-To: References: Message-ID: On Thu, 7 Dec 2023 09:30:01 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, which causes large performance gap on AArch64. Note that those APIs are optimized by C2 compiler on X86 platforms by calling Intel's SVML code [1]. To close the gap, we would like to optimize these APIs for AArch64 by calling a third-party vector library called libsleef [2], which are available in mainstream Linux distros (e.g. [3] [4]). >> >> SLEEF supports multiple accuracies. To match Vector API's requirement and implement the math ops on AArch64, we 1) call 1.0 ULP accuracy with FMA instructions used stubs in libsleef for most of the operations by default, and 2) add the vector calling convention to apply with the runtime calls to stub code in libsleef. Note that for those APIs that libsleef does not support 1.0 ULP, we choose 0.5 ULP instead. 
>> >> To help loading the expected libsleef library, this patch also adds an experimental JVM option (i.e. `-XX:UseSleefLib`) for AArch64 platforms. People can use it to denote the libsleef path/name explicitly. By default, it points to the system installed library. If the library does not exist or the dynamic loading of it in runtime fails, the math vector ops will fall-back to use the default scalar version without error. But a warning is printed out if people specifies a nonexistent library explicitly. >> >> Note that this is a part of the original proposed patch in panama-dev [5], just with some initial review comments addressed. And now we'd like to get some wider feedbacks from more hotspot experts. >> >> [1] https://github.com/openjdk/jdk/pull/3638 >> [2] https://sleef.org/ >> [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ >> [4] https://packages.debian.org/bookworm/libsleef3 >> [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html > > Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision: > > Fix potential attribute issue > Build changes finally look good. Great, actually! Thanks for persisting, despite the many rounds of review. > > You will still need the 2 hotspot reviews for the hotspot part of the patch. > > /reviewers 3 Thanks for the review and all the comments! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1846330893 From sspitsyn at openjdk.org Fri Dec 8 01:16:52 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 8 Dec 2023 01:16:52 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v2] In-Reply-To: References: Message-ID: > This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame. 
> It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. > The deadlocking scenario is well described by Patricio in a bug report comment. > In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. > > The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about the begin/end of a critical section. > This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. > > Some of the new notifications with the `notifyJvmtiSync()` method are on a performance-critical path, which is why this method has been intrinsified. > > A new test was developed by Patricio: > `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > The test is very nice as it reliably reproduces the deadlock 100% of the time without the fix. > The test never fails with this fix. > > Testing: > - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains four commits: - Resolved merge conflict in VirtualThread.java - added @summary to new test SuspendWithInterruptLock.java - add new test SuspendWithInterruptLock.java - 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable ------------- Changes: https://git.openjdk.org/jdk/pull/17011/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17011&range=01 Stats: 183 lines in 15 files changed: 178 ins; 0 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/17011.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17011/head:pull/17011 PR: https://git.openjdk.org/jdk/pull/17011 From fyang at openjdk.org Fri Dec 8 04:00:22 2023 From: fyang at openjdk.org (Fei Yang) Date: Fri, 8 Dec 2023 04:00:22 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 21:58:55 GMT, Yuri Gaevsky wrote: >> Hello All, >> >> Please review these changes to support _vectorizedHashCode intrinsic on >> RISC-V platform. The patch adds the "scalar" code for the intrinsic without >> usage of any RVV instruction but provides manual unrolling of the appropriate >> loop. The code with usage of RVV instruction could be added as follow-up of >> the patch or independently. >> >> Thanks, >> -Yuri Gaevsky >> >> P.S. My OCA has been accepted recently (ygaevsky). >> >> ### Correctness checks >> >> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux. >> >> ### Performance results (the numbers for non-ints are similar) >> >> #### StarFive JH7110 board: >>
>>
>> ArraysHashCode:                    without intrinsic      with intrinsic
>> -------------------------------------------------------------------------------
>> Benchmark  (size)  Mode  Cnt       Score     Error       Score     Error  Units
>> -------------------------------------------------------------------------------
>> multiints       0  avgt   30       2.658 ±   0.001       2.661 ±   0.004  ns/op
>> multiints       1  avgt   30       4.881 ±   0.011       4.892 ±   0.015  ns/op
>> multiints       2  avgt   30      16.109 ±   0.041      10.451 ±   0.075  ns/op
>> multiints       3  avgt   30      14.873 ±   0.068      11.753 ±   0.024  ns/op
>> multiints       4  avgt   30      17.283 ±   0.078      13.176 ±   0.044  ns/op
>> multiints       5  avgt   30      19.691 ±   0.136      14.723 ±   0.046  ns/op
>> multiints       6  avgt   30      21.727 ±   0.166      15.463 ±   0.124  ns/op
>> multiints       7  avgt   30      23.790 ±   0.126      18.298 ±   0.059  ns/op
>> multiints       8  avgt   30      23.527 ±   0.116      18.267 ±   0.046  ns/op
>> multiints       9  avgt   30      27.981 ±   0.303      20.453 ±   0.069  ns/op
>> multiints      10  avgt   30      26.947 ±   0.215      20.541 ±   0.051  ns/op
>> multiints      50  avgt   30      95.373 ±   0.588      69.238 ±   0.208  ns/op
>> multiints     100  avgt   30     177.109 ±   0.525     137.852 ±   0.417  ns/op
>> multiints     200  avgt   30     341.074 ±   1.363     296.832 ±   0.725  ns/op
>> multiints     500  avgt   30     847.993 ±   1.713     752.415 ±   1.918  ns/op
>> multiints    1000  avgt   30    1610.199 ±   5.424    1426.112 ±   3.407  ns/op
>> multiints   10000  avgt   30   16234.260 ±  26.789   14447.936 ±  26.345  ns/op
>> multiints  100000  avgt   30  170726.025 ± 184.003  152587.649 ± 381.964  ns/op
>> ---------------------------------------...
> > Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: > > Added two temp registers for loads; all loads in wide loop has been moved to the start of the loop. Hi, glad to see the performance numbers are back to normal. Would you mind two more tweaks? Thanks. src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1504: > 1502: andi(cnt, cnt, stride-1); // don't forget about tail! > 1503: > 1504: #define DO_ELEMENT_LOAD(reg, idx) \ Why not turn the `DO_ELEMENT_LOAD` macro into a small function? Say `C2_MacroAssembler::arrays_hashcode_elload`. We can put it after `C2_MacroAssembler::arrays_hashcode_elsize`. src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1541: > 1539: > 1540: bind(TAIL); > 1541: beqz(cnt, DONE); `cnt` is non-zero when we reach here from L1498, so this `beqz` check seems redundant in that case. Maybe move this `beqz` check immediately after L1538?
------------- PR Review: https://git.openjdk.org/jdk/pull/16629#pullrequestreview-1771480770 PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1419899410 PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1419891737 From duke at openjdk.org Fri Dec 8 08:34:22 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Fri, 8 Dec 2023 08:34:22 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: References: Message-ID: <1DDeK4GVrVkxpfVPZOLZOVDOP0C-ecKBrM322AJuX5U=.3c60fa2e-f243-4fbd-9356-fea6f4cd3f08@github.com> On Fri, 8 Dec 2023 03:51:49 GMT, Fei Yang wrote: >> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: >> >> Added two temp registers for loads; all loads in wide loop has been moved to the start of the loop. > > src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1504: > >> 1502: andi(cnt, cnt, stride-1); // don't forget about tail! >> 1503: >> 1504: #define DO_ELEMENT_LOAD(reg, idx) \ > > Why not turn `DO_ELEMENT_LOAD` macro into a small function? Say `C2_MacroAssembler::arrays_hashcode_elload`. We can put it after `C2_MacroAssembler::arrays_hashcode_elsize`. Good idea, will do, thanks. > src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1541: > >> 1539: >> 1540: bind(TAIL); >> 1541: beqz(cnt, DONE); > > `cnt` is non-zero we reach here from L1498, so this `beqz` check seems redundant in that case. Maybe move this `beqz` check immediate after L1538? We need this check because after wide "unrolling" loop the cnt could be 0,1,2 or 3, see "don't forget about tail" comment at the line 1502. 
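The role of the tail check can be illustrated with a scalar Java sketch of a 4-way unrolled polynomial hash. This is not the RISC-V intrinsic itself: the class and method names are made up for the example, and a stride of 4 is assumed from the remark that `cnt` can be 0, 1, 2 or 3 after the wide loop.

```java
import java.util.Arrays;

public class UnrolledHashSketch {
    static final int STRIDE = 4; // assumed unroll factor

    // Computes the same value as Arrays.hashCode(int[]) for a non-null array,
    // processing STRIDE elements per iteration of the wide loop.
    static int hash(int[] a) {
        int result = 1;
        int i = 0;
        int wideEnd = a.length & ~(STRIDE - 1); // largest multiple of STRIDE
        for (; i < wideEnd; i += STRIDE) {
            // Four steps of result = 31 * result + a[k], folded into one:
            // 31^4 = 923521, 31^3 = 29791, 31^2 = 961.
            result = 923521 * result
                   + 29791 * a[i]
                   + 961 * a[i + 1]
                   + 31 * a[i + 2]
                   + a[i + 3];
        }
        // Tail: cnt is 0..STRIDE-1 here, so it may well be zero.
        // The loop condition plays the role of the `beqz(cnt, DONE)` check.
        int cnt = a.length & (STRIDE - 1);
        while (cnt > 0) {
            result = 31 * result + a[i++];
            cnt--;
        }
        return result;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 9; n++) {
            int[] a = new int[n];
            for (int k = 0; k < n; k++) {
                a[k] = k * 7 - 3;
            }
            System.out.println(n + ": " + (hash(a) == Arrays.hashCode(a)));
        }
    }
}
```

For lengths that are multiples of 4 the tail loop body never runs, which is the case the zero check has to cover after the wide loop.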
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1420098364 PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1420097330 From aph at openjdk.org Fri Dec 8 08:55:19 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 8 Dec 2023 08:55:19 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6] In-Reply-To: References: Message-ID: On Wed, 15 Nov 2023 15:44:47 GMT, Olga Mikhaltsova wrote: >> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform. >> >> In the table below it is shown that NaN argument should be processed as a special case. >>
>>                                          RISC-V                   Java
>>                                          (FCVT.W.S)  (FCVT.L.D)   (long round(double a))  (int round(float a))
>> Minimum valid input (after rounding)     -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
>> Maximum valid input (after rounding)     2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for out-of-range negative input   -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for -∞                            -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for out-of-range positive input   2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for +∞                            2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for NaN                           2^31 - 1    2^63 - 1     0                       0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark (TESTSIZE) Mode Cnt Score Error Units
>> FpRoundingBenchmark.test_round_double 2048 thrpt 15 59.555 0.179 ops/ms
>> FpRoundingBenchmark.test_round_float 2048 thrpt 15 49.760 0.103 ops/ms
>>
>>
>> **After**
>>
>> Benchmark (TESTSIZE) Mode Cnt Score Error Units
>> FpRoundingBenchmark.test_round_double 2048 thrpt 15 110.956 0.186 ops/ms
>> FpRoundingBenchmark.test_round_float 2048 thrpt 15 115.947 0.122 ops/ms
> > Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: > > Replaced tmp with t0 src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4292: > 4290: // if +/-0, +/-subnormal numbers, signaling/quiet NaN > 4291: andi(t0, t0, fclass_mask::nan | fclass_mask::zero | fclass_mask::subnorm); > 4292: bnez(t0, done); What is this subnorm test for? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1420119491 From aph at openjdk.org Fri Dec 8 09:04:16 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 8 Dec 2023 09:04:16 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6] In-Reply-To: <7GqjvcQqsvlQZw4-4iKvUElpeZz717Nf8uTd_YY_LBk=.1bfd1f5a-5670-4962-9620-9c93a192a033@github.com> References: <7GqjvcQqsvlQZw4-4iKvUElpeZz717Nf8uTd_YY_LBk=.1bfd1f5a-5670-4962-9620-9c93a192a033@github.com> Message-ID: On Tue, 5 Dec 2023 03:33:52 GMT, Fei Yang wrote: >> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: >> >> Replaced tmp with t0 > > Unfortunately, I witnessed performance regression on sifive unmatched board. > > Before: > > FpRoundingBenchmark.test_ceil 2048 thrpt 15 39.243 ± 0.506 ops/ms > FpRoundingBenchmark.test_floor 2048 thrpt 15 39.448 ± 0.076 ops/ms > FpRoundingBenchmark.test_rint 2048 thrpt 15 39.411 ±
0.134 ops/ms > FpRoundingBenchmark.test_round_double 2048 thrpt 15 31.329 ± 0.085 ops/ms > FpRoundingBenchmark.test_round_float 2048 thrpt 15 31.328 ± 0.031 ops/ms > > After: > > FpRoundingBenchmark.test_ceil 2048 thrpt 15 39.375 ± 0.125 ops/ms > FpRoundingBenchmark.test_floor 2048 thrpt 15 39.407 ± 0.076 ops/ms > FpRoundingBenchmark.test_rint 2048 thrpt 15 39.387 ± 0.235 ops/ms > FpRoundingBenchmark.test_round_double 2048 thrpt 15 23.940 ± 0.025 ops/ms > FpRoundingBenchmark.test_round_float 2048 thrpt 15 30.629 ± 0.021 ops/ms > @RealFYang I've reproduced this performance regression on VisionFive 2. The results are as follows: > > ``` > Before > Benchmark (TESTSIZE) Mode Cnt Score Error Units > FpRoundingBenchmark.test_round_double 2048 thrpt 15 39.335 ± 0.122 ops/ms > FpRoundingBenchmark.test_round_float 2048 thrpt 15 39.327 ± 0.138 ops/ms > After > FpRoundingBenchmark.test_round_double 2048 thrpt 15 30.004 ± 0.192 ops/ms > FpRoundingBenchmark.test_round_float 2048 thrpt 15 38.489 ± 0.120 ops/ms > ``` That is, to say the very least, surprising. I'd use -prof:perfasm to find out why. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1846798680 From aph at openjdk.org Fri Dec 8 09:16:17 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 8 Dec 2023 09:16:17 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6] In-Reply-To: References: Message-ID: On Fri, 8 Dec 2023 08:52:35 GMT, Andrew Haley wrote: >> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: >> >> Replaced tmp with t0 > > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4292: > >> 4290: // if +/-0, +/-subnormal numbers, signaling/quiet NaN >> 4291: andi(t0, t0, fclass_mask::nan | fclass_mask::zero | fclass_mask::subnorm); >> 4292: bnez(t0, done); > > What is this subnorm test for? It looks to me like RoundTests.java isn't testing denormals.
But I guess you tested the entire 32-bit range against the Java code, right? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1420152675 From vkempik at openjdk.org Fri Dec 8 10:02:18 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Fri, 8 Dec 2023 10:02:18 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6] In-Reply-To: <7GqjvcQqsvlQZw4-4iKvUElpeZz717Nf8uTd_YY_LBk=.1bfd1f5a-5670-4962-9620-9c93a192a033@github.com> References: <7GqjvcQqsvlQZw4-4iKvUElpeZz717Nf8uTd_YY_LBk=.1bfd1f5a-5670-4962-9620-9c93a192a033@github.com> Message-ID: On Tue, 5 Dec 2023 03:33:52 GMT, Fei Yang wrote: >> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: >> >> Replaced tmp with t0 > > Unfortunately, I witnessed performance regression on sifive unmatched board. > > Before: > > FpRoundingBenchmark.test_ceil 2048 thrpt 15 39.243 ± 0.506 ops/ms > FpRoundingBenchmark.test_floor 2048 thrpt 15 39.448 ± 0.076 ops/ms > FpRoundingBenchmark.test_rint 2048 thrpt 15 39.411 ± 0.134 ops/ms > FpRoundingBenchmark.test_round_double 2048 thrpt 15 31.329 ± 0.085 ops/ms > FpRoundingBenchmark.test_round_float 2048 thrpt 15 31.328 ± 0.031 ops/ms > > After: > > FpRoundingBenchmark.test_ceil 2048 thrpt 15 39.375 ± 0.125 ops/ms > FpRoundingBenchmark.test_floor 2048 thrpt 15 39.407 ± 0.076 ops/ms > FpRoundingBenchmark.test_rint 2048 thrpt 15 39.387 ± 0.235 ops/ms > FpRoundingBenchmark.test_round_double 2048 thrpt 15 23.940 ± 0.025 ops/ms > FpRoundingBenchmark.test_round_float 2048 thrpt 15 30.629 ± 0.021 ops/ms > > @RealFYang I've reproduced this performance regression on VisionFive 2. The results are as follows: > > ``` > > Before > > Benchmark (TESTSIZE) Mode Cnt Score Error Units > > FpRoundingBenchmark.test_round_double 2048 thrpt 15 39.335 ± 0.122 ops/ms > > FpRoundingBenchmark.test_round_float 2048 thrpt 15 39.327 ±
0.138 ops/ms > > After > > FpRoundingBenchmark.test_round_double 2048 thrpt 15 30.004 ± 0.192 ops/ms > > FpRoundingBenchmark.test_round_float 2048 thrpt 15 38.489 ± 0.120 ops/ms > > ``` > > That is, to say the very least, surprising. I'd use -prof:perfasm to find out why. -prof:perfasm doesn't work on u74 boards(hifive and visionfive2) as is, some problems with cycles event. This works: -prof perfasm:"events=cpu-clock" but it's s/w event, still better than nothing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1846888863 From aph at openjdk.org Fri Dec 8 10:31:19 2023 From: aph at openjdk.org (Andrew Haley) Date: Fri, 8 Dec 2023 10:31:19 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6] In-Reply-To: References: <7GqjvcQqsvlQZw4-4iKvUElpeZz717Nf8uTd_YY_LBk=.1bfd1f5a-5670-4962-9620-9c93a192a033@github.com> Message-ID: On Fri, 8 Dec 2023 09:59:16 GMT, Vladimir Kempik wrote: > > That is, to say the very least, surprising. I'd use -prof:perfasm to find out why. > > -prof:perfasm doesn't work on u74 boards(hifive and visionfive2) as is, some problems with cycles event. This works: -prof perfasm:"events=cpu-clock" but it's s/w event, still better than nothing. It is. We should not simply accept something like this without trying to understand the reason. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1846927197 From duke at openjdk.org Fri Dec 8 10:34:56 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Fri, 8 Dec 2023 10:34:56 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v11] In-Reply-To: References: Message-ID: > Hello All, > > Please review these changes to support _vectorizedHashCode intrinsic on > RISC-V platform. The patch adds the "scalar" code for the intrinsic without > usage of any RVV instruction but provides manual unrolling of the appropriate > loop. The code with usage of RVV instruction could be added as follow-up of > the patch or independently.
> > Thanks, > -Yuri Gaevsky > > P.S. My OCA has been accepted recently (ygaevsky). > > ### Correctness checks > > Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux. > > ### Performance results (the numbers for non-ints are similar) > > #### StarFive JH7110 board: >
>
> ArraysHashCode:                    without intrinsic      with intrinsic
> -------------------------------------------------------------------------------
> Benchmark  (size)  Mode  Cnt       Score     Error       Score     Error  Units
> -------------------------------------------------------------------------------
> multiints       0  avgt   30       2.658 ±   0.001       2.661 ±   0.004  ns/op
> multiints       1  avgt   30       4.881 ±   0.011       4.892 ±   0.015  ns/op
> multiints       2  avgt   30      16.109 ±   0.041      10.451 ±   0.075  ns/op
> multiints       3  avgt   30      14.873 ±   0.068      11.753 ±   0.024  ns/op
> multiints       4  avgt   30      17.283 ±   0.078      13.176 ±   0.044  ns/op
> multiints       5  avgt   30      19.691 ±   0.136      14.723 ±   0.046  ns/op
> multiints       6  avgt   30      21.727 ±   0.166      15.463 ±   0.124  ns/op
> multiints       7  avgt   30      23.790 ±   0.126      18.298 ±   0.059  ns/op
> multiints       8  avgt   30      23.527 ±   0.116      18.267 ±   0.046  ns/op
> multiints       9  avgt   30      27.981 ±   0.303      20.453 ±   0.069  ns/op
> multiints      10  avgt   30      26.947 ±   0.215      20.541 ±   0.051  ns/op
> multiints      50  avgt   30      95.373 ±   0.588      69.238 ±   0.208  ns/op
> multiints     100  avgt   30     177.109 ±   0.525     137.852 ±   0.417  ns/op
> multiints     200  avgt   30     341.074 ±   1.363     296.832 ±   0.725  ns/op
> multiints     500  avgt   30     847.993 ±   1.713     752.415 ±   1.918  ns/op
> multiints    1000  avgt   30    1610.199 ±   5.424    1426.112 ±   3.407  ns/op
> multiints   10000  avgt   30   16234.260 ±  26.789   14447.936 ±  26.345  ns/op
> multiints  100000  avgt   30  170726.025 ± 184.003  152587.649 ± 381.964  ns/op
> -------------------------------------------------------------------------------
> > #### T-Head RVB-ICE board: > > > ArraysHashCode: ... Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: replaced macro definition with function, fixed whitespaces/comments.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/16629/files - new: https://git.openjdk.org/jdk/pull/16629/files/99f91d04..b5bb8d3d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16629&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16629&range=09-10 Stats: 39 lines in 2 files changed: 14 ins; 18 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/16629.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16629/head:pull/16629 PR: https://git.openjdk.org/jdk/pull/16629 From duke at openjdk.org Fri Dec 8 10:34:56 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Fri, 8 Dec 2023 10:34:56 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: References: Message-ID: <-9VPFQKrWumrHIW9FQxIZ57MB9jAwJ1FsTJ0jGHyaIU=.13a738e7-16e2-40ad-8de3-2c20082ef908@github.com> On Fri, 8 Dec 2023 03:58:01 GMT, Fei Yang wrote: > Hi, glad to see the performance numbers are back to normal. Would you mind two more tweaks? Thanks. Thank you for the suggestions, fixed, please take a look. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1846930327 From duke at openjdk.org Fri Dec 8 10:34:57 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Fri, 8 Dec 2023 10:34:57 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: <1DDeK4GVrVkxpfVPZOLZOVDOP0C-ecKBrM322AJuX5U=.3c60fa2e-f243-4fbd-9356-fea6f4cd3f08@github.com> References: <1DDeK4GVrVkxpfVPZOLZOVDOP0C-ecKBrM322AJuX5U=.3c60fa2e-f243-4fbd-9356-fea6f4cd3f08@github.com> Message-ID: On Fri, 8 Dec 2023 08:30:10 GMT, Yuri Gaevsky wrote: >> src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1504: >> >>> 1502: andi(cnt, cnt, stride-1); // don't forget about tail! >>> 1503: >>> 1504: #define DO_ELEMENT_LOAD(reg, idx) \ >> >> Why not turn `DO_ELEMENT_LOAD` macro into a small function? Say `C2_MacroAssembler::arrays_hashcode_elload`. We can put it after `C2_MacroAssembler::arrays_hashcode_elsize`. > > Good idea, will do, thanks. Fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1420247372 From duke at openjdk.org Fri Dec 8 11:26:30 2023 From: duke at openjdk.org (Lei Zaakjyu) Date: Fri, 8 Dec 2023 11:26:30 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v4] In-Reply-To: References: Message-ID: > 8234502: Merge GenCollectedHeap and SerialHeap Lei Zaakjyu has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains four commits: - Merge branch 'master' of https://git.openjdk.org/jdk into JDK-8234502 - restore comment - line-break for EOF - merge 'CollectedHeap' and 'SerialHeap' ------------- Changes: https://git.openjdk.org/jdk/pull/16927/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=03 Stats: 2910 lines in 15 files changed: 1431 ins; 1465 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/16927.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16927/head:pull/16927 PR: https://git.openjdk.org/jdk/pull/16927 From dchuyko at openjdk.org Fri Dec 8 11:42:44 2023 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Fri, 8 Dec 2023 11:42:44 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v13] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. 
This is not an error, and it happens in long-running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive, but this does not affect the application behavior. In such a case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it while taking the new directives stack into account. Re-compilation helps to prevent hot methods from executing in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives` has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 31 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - jcheck - Unnecessary import - ... and 21 more: https://git.openjdk.org/jdk/compare/701bc3bb...1a01cf1c ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=12 Stats: 372 lines in 15 files changed: 339 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From sspitsyn at openjdk.org Fri Dec 8 11:54:40 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 8 Dec 2023 11:54:40 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3] In-Reply-To: References: Message-ID: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> > This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame. > It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. > The deadlocking scenario is well described by Patricio in a bug report comment. > In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. 
> > The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about the begin/end of a critical section. > This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. > > Some of the new notifications with the `notifyJvmtiSync()` method are on a performance-critical path, which is why this method has been intrinsified. > > A new test was developed by Patricio: > `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > The test is very nice as it reliably reproduces the deadlock 100% of the time without the fix. > The test never fails with this fix. > > Testing: > - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: (1) rename notifyJvmti method; (2) add try-final statements to VirtualThread methods ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17011/files - new: https://git.openjdk.org/jdk/pull/17011/files/ccba940d..18f1752e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17011&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17011&range=01-02 Stats: 80 lines in 9 files changed: 25 ins; 7 del; 48 mod Patch: https://git.openjdk.org/jdk/pull/17011.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17011/head:pull/17011 PR: https://git.openjdk.org/jdk/pull/17011 From alanb at openjdk.org Fri Dec 8 12:09:15 2023 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 8 Dec 2023 12:09:15 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3] In-Reply-To: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> References:
<-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> Message-ID: On Fri, 8 Dec 2023 11:54:40 GMT, Serguei Spitsyn wrote: >> This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame. >> It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. >> The deadlocking scenario is well described by Patricio in a bug report comment. >> In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. >> >> The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about begin/end of critical section. >> This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. >> >> Some of new notifications with `notifyJvmtiSync()` method is on a performance critical path. It is why this method has been intrincified. >> >> New test was developed by Patricio: >> `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> The test is very nice as it reliably in 100% reproduces the deadlock without the fix. >> The test is never failing with this fix. >> >> Testing: >> - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: (1) rename notifyJvmti method; (2) add try-final statements to VirtualThread methods I chatted briefly with @sspitsyn about this. 
A couple of points: - It shouldn't be necessary to touch mount/unmount as the thread identity is the carrier, not the virtual thread, when executing the "critical code". - toggle_is_in_critical_section needs to detect reentrancy; otherwise it is too early to refactor the Java code, e.g. to call threadState while holding the interrupt lock. - All the use sites will need to use try-finally to more reliably revert the critical section flag when rewinding. - The naming is very problematic; we'll need to replace it with methods that are clearly named enter and exit critical section. Ongoing work in this area to support monitors has to introduce some temporary pinning, so there will be enter/exitCriticalSection methods; that's a better place for the JVMTI hooks. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17011#issuecomment-1847063362 From duke at openjdk.org Fri Dec 8 13:01:55 2023 From: duke at openjdk.org (Lei Zaakjyu) Date: Fri, 8 Dec 2023 13:01:55 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v5] In-Reply-To: References: Message-ID: > 8234502: Merge GenCollectedHeap and SerialHeap Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: resolve conflict ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16927/files - new: https://git.openjdk.org/jdk/pull/16927/files/fef7f8af..e6d7dfed Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=03-04 Stats: 4 lines in 1 file changed: 4 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16927.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16927/head:pull/16927 PR: https://git.openjdk.org/jdk/pull/16927 From ayang at openjdk.org Fri Dec 8 13:43:14 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Fri, 8 Dec 2023 13:43:14 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v5] In-Reply-To: References: Message-ID: On
Fri, 8 Dec 2023 13:01:55 GMT, Lei Zaakjyu wrote: >> 8234502: Merge GenCollectedHeap and SerialHeap > > Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: > > resolve conflict As you pointed out previously, some files in `src/jdk.hotspot.agent/share/classes/sun/jvm/hotspot/` need to be changed (before or inside this PR). The bottom line is that `GenCollectedHeap` should not exist in the codebase after this PR. src/hotspot/share/gc/shared/space.cpp line 131: > 129: #if INCLUDE_SERIALGC > 130: cp->gen = SerialHeap::heap()->young_gen(); > 131: #endif // INCLUDE_SERIALGC This doesn't look right. `INCLUDE_SERIALGC` and its counterparts in other GCs are there to support release builds without certain GCs. IOW, when this is `false`, it should still build. I believe https://github.com/openjdk/jdk/pull/16842 should make this part of the change obsolete. ------------- PR Review: https://git.openjdk.org/jdk/pull/16927#pullrequestreview-1772359002 PR Review Comment: https://git.openjdk.org/jdk/pull/16927#discussion_r1420454416 From fyang at openjdk.org Fri Dec 8 13:46:25 2023 From: fyang at openjdk.org (Fei Yang) Date: Fri, 8 Dec 2023 13:46:25 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: <1DDeK4GVrVkxpfVPZOLZOVDOP0C-ecKBrM322AJuX5U=.3c60fa2e-f243-4fbd-9356-fea6f4cd3f08@github.com> References: <1DDeK4GVrVkxpfVPZOLZOVDOP0C-ecKBrM322AJuX5U=.3c60fa2e-f243-4fbd-9356-fea6f4cd3f08@github.com> Message-ID: On Fri, 8 Dec 2023 08:29:00 GMT, Yuri Gaevsky wrote: >> src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp line 1541: >> >>> 1539: >>> 1540: bind(TAIL); >>> 1541: beqz(cnt, DONE); >> >> `cnt` is non-zero we reach here from L1498, so this `beqz` check seems redundant in that case. Maybe move this `beqz` check immediate after L1538? > > We need this check because after wide "unrolling" loop the cnt could be 0,1,2 or 3, see "don't forget about tail" comment at the line 1502.
The control flow will be directed to `DONE` by the `beqz` check at L1493 when `cnt` is zero. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1420461406 From duke at openjdk.org Fri Dec 8 14:17:23 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Fri, 8 Dec 2023 14:17:23 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: References: <1DDeK4GVrVkxpfVPZOLZOVDOP0C-ecKBrM322AJuX5U=.3c60fa2e-f243-4fbd-9356-fea6f4cd3f08@github.com> Message-ID: <0pxHhnoibLujMzLffPATnaVtzRmWfk1jA7shvSxdUiY=.991ce23e-8b5c-4d72-b766-be6f07e787c9@github.com> On Fri, 8 Dec 2023 13:43:33 GMT, Fei Yang wrote: >> We need this check because after wide "unrolling" loop the cnt could be 0,1,2 or 3, see "don't forget about tail" comment at the line 1502. > > The control flow will be directed to `DONE` by the `beqz` check at L1493 when `cnt` is zero. I meant the other case(s): if `cnt` is equal to 4, 8, ... then we pass the initial `cnt == 0` check at L1493 and go further into the wide loop, where `cnt` is zero (L1501); that is why that check is needed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1420507338 From eastigeevich at openjdk.org Fri Dec 8 15:19:44 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Fri, 8 Dec 2023 15:19:44 GMT Subject: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v8] In-Reply-To: References: Message-ID: On Thu, 24 Jun 2021 17:02:03 GMT, Scott Gibbons wrote: >> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. Also allows for performance improvement for non-AVX-512 enabled platforms. Due to the nature of MIME-encoded inputs, modify the intrinsic signature to accept an additional parameter (isMIME) for fast-path MIME decoding. >> >> A change was made to the signature of DecodeBlock in Base64.java to provide the intrinsic information as to whether MIME decoding was being done.
This allows for the intrinsic to bypass the expensive setup of zmm registers from AVX tables, knowing there may be invalid Base64 characters every 76 characters or so. A change was also made here removing the restriction that the intrinsic must return an even multiple of 3 bytes decoded. This implementation handles the pad characters at the end of the string and will return the actual number of characters decoded. >> >> The AVX portion of this code will decode in blocks of 256 bytes per loop iteration, then in chunks of 64 bytes, followed by end fixup decoding. The non-AVX code is an assembly-optimized version of the java DecodeBlock and behaves identically. >> >> Running the Base64Decode benchmark, this change increases decode performance by an average of 2.6x with a maximum 19.7x for buffers > ~20k. The numbers are given in the table below. >> >> **Base Score** is without intrinsic support, **Optimized Score** is using this intrinsic, and **Gain** is **Base** / **Optimized**. >> >> >> Benchmark Name | Base Score | Optimized Score | Gain >> -- | -- | -- | -- >> testBase64Decode size 1 | 15.36 | 15.32 | 1.00 >> testBase64Decode size 3 | 17.00 | 16.72 | 1.02 >> testBase64Decode size 7 | 20.60 | 18.82 | 1.09 >> testBase64Decode size 32 | 34.21 | 26.77 | 1.28 >> testBase64Decode size 64 | 54.43 | 38.35 | 1.42 >> testBase64Decode size 80 | 66.40 | 48.34 | 1.37 >> testBase64Decode size 96 | 73.16 | 52.90 | 1.38 >> testBase64Decode size 112 | 84.93 | 51.82 | 1.64 >> testBase64Decode size 512 | 288.81 | 32.04 | 9.01 >> testBase64Decode size 1000 | 560.48 | 40.79 | 13.74 >> testBase64Decode size 20000 | 9530.28 | 483.37 | 19.72 >> testBase64Decode size 50000 | 24552.24 | 1735.07 | 14.15 >> testBase64MIMEDecode size 1 | 22.87 | 21.36 | 1.07 >> testBase64MIMEDecode size 3 | 27.79 | 25.32 | 1.10 >> testBase64MIMEDecode size 7 | 44.74 | 43.81 | 1.02 >> testBase64MIMEDecode size 32 | 142.69 | 129.56 | 1.10 >> testBase64MIMEDecode size 64 | 256.90 | 243.80 | 1.05 >> 
testBase64MIMEDecode size 80 | 311.60 | 310.80 | 1.00 >> testBase64MIMEDecode size... > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Fixed Windows register stomping. We found this optimization causes https://bugs.openjdk.org/browse/JDK-8321599 ------------- PR Comment: https://git.openjdk.org/jdk/pull/4368#issuecomment-1847357495 From stevenschlansker at gmail.com Fri Dec 8 17:56:39 2023 From: stevenschlansker at gmail.com (Steven Schlansker) Date: Fri, 8 Dec 2023 09:56:39 -0800 Subject: CDS can archive classpath entries more than once when a JAR manifest has Class-Path attributes Message-ID: Hi hotspot-dev, Recently, we started experiencing JVM crashes [1] and inexplicable IncompatibleClassChangeErrors in our testing environment. We use custom classloaders, NMT, and app-CDS. # Internal Error (virtualMemoryTracker.cpp:403), pid=20, tid=128 # Error: ShouldNotReachHere() # # JRE version: OpenJDK Runtime Environment (Red_Hat-21.0.1.0.12-2) (21.0.1+12) (build 21.0.1+12-LTS) # Java VM: OpenJDK 64-Bit Server VM (Red_Hat-21.0.1.0.12-2) (21.0.1+12-LTS, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) # Problematic frame: # V [libjvm.so+0x104a06c] VirtualMemoryTracker::add_reserved_region(unsigned char*, unsigned long, NativeCallStack const&, MEMFLAGS)+0x6fc and, java.lang.IncompatibleClassChangeError: com.paywholesail.components.util.ByteBuffers and com.paywholesail.components.util.ByteBuffers$ByteBufferPuttable disagree on InnerClasses attribute (I checked with javap, and it looks the same to me...) At least for the ShouldNotReachHere, it looked like a definite JVM bug, so I have been trying to create a reproducing test case to make a good error report. I noticed that the crash only happens when NMT is combined with Class Data Sharing. 
At this point, I read the logs closely, and noticed: [0.139s][warning][cds ] shared class paths mismatch [0.151s][warning][cds,dynamic] Unable to use shared archive. The top archive failed to load: /.../prebake.jsa So, I compared the expected and actual class path as printed by the JVM. In both cases, we run with `-cp lib/*` with a fixed set of library jars. Imagine my surprise when I find that the only difference is that the expected (archive-time) classpath includes lib/stax-ex-1.8.jar *twice*. By running the generated shared archive file with `strings | grep`, I am able to verify that the `lib/stax-ex-1.8.jar` entry indeed is present in the archive twice. I fixed up my JDK build environment and started sprinkling new logging and assertions through the archive creation code. It looks like ClassLoader::add_to_app_classpath_entries can either check for duplicated classpath entries, or trust that the caller knows the element is new. This list of entries is built in part by ClassLoader::setup_app_search_path, which enumerates the classpath and adds entries one by one. In this case, duplicate checks are skipped, presumably because we trust the initial classpath not to have duplicates. When an element is added in add_to_app_classpath_entries, for each jar, it calls process_jar_manifest. Among other things, this reads the MANIFEST.MF and looks for Class-Path entries, and loads those too. Indeed, our `jaxb-runtime` has such an entry for `stax-ex`. In this case, it does guard against duplicate entries. I think there is a bug here: if a jar is added by a manifest's Class-Path from a jar *before* we finish processing the initial app class path, it can get added twice - first with a duplicate check via the manifest, and then a second time without checking for duplicates from the app classpath. 
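To make the suspected ordering problem concrete, here is a minimal model in plain Java. It is only a sketch and not HotSpot code: the class `ClasspathDuplicateSketch`, the methods `buildEntries` and `addEntry`, and the `checkTopLevel` switch are all hypothetical names invented for illustration. The assumption, taken from the description above, is that entries from the app classpath are appended without a duplicate check while entries pulled in through a manifest `Class-Path` are checked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ClasspathDuplicateSketch {

    // Hypothetical stand-in for walking the app classpath one entry at a time.
    // checkTopLevel == false models "trust the caller, skip the duplicate check".
    static List<String> buildEntries(List<String> appClasspath,
                                     Map<String, List<String>> manifestClassPath,
                                     boolean checkTopLevel) {
        List<String> entries = new ArrayList<>();
        for (String jar : appClasspath) {
            addEntry(entries, jar, checkTopLevel, manifestClassPath);
        }
        return entries;
    }

    // Hypothetical stand-in for adding one entry and then processing its manifest.
    static void addEntry(List<String> entries, String jar, boolean checkForDuplicates,
                         Map<String, List<String>> manifestClassPath) {
        if (checkForDuplicates && entries.contains(jar)) {
            return; // already recorded, nothing to do
        }
        entries.add(jar);
        // Manifest Class-Path dependencies are always added with a duplicate check.
        for (String dep : manifestClassPath.getOrDefault(jar, List.of())) {
            addEntry(entries, dep, true, manifestClassPath);
        }
    }

    public static void main(String[] args) {
        // A.jar's manifest names B.jar, as in the reproducer in this mail.
        Map<String, List<String>> manifests = Map.of("lib/A.jar", List.of("lib/B.jar"));
        System.out.println(buildEntries(List.of("lib/B.jar", "lib/A.jar"), manifests, false));
        System.out.println(buildEntries(List.of("lib/A.jar", "lib/B.jar"), manifests, false));
        System.out.println(buildEntries(List.of("lib/A.jar", "lib/B.jar"), manifests, true));
    }
}
```

In this model only the ordering in which the manifest dependency arrives before its own top-level entry produces a duplicate, and checking top-level entries as well removes it. Whether the real code paths behave exactly this way is the open question of this mail.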
I believe this is reproducible on latest 21.0.1+12 with the following code and steps: A.java: class A { static { System.err.println("A"); } } class B { public static void main(String[] args) { System.err.println("hi!"); new A(); } } MANIFEST.MF: Manifest-Version: 1.0 Class-Path: B.jar % mkdir lib % javac A.java % javac B.java % jar -m META-INF/MANIFEST.MF -c -f lib/A.jar A.class % jar cf lib/B.jar B.class % java -cp lib/B.jar:lib/A.jar -XX:ArchiveClassesAtExit=shared.jsa -XX:NativeMemoryTracking=summary B % strings shared.jsa| grep lib/ lib/B.jar lib/A.jar % java -cp lib/A.jar:lib/B.jar -XX:ArchiveClassesAtExit=shared2.jsa -XX:NativeMemoryTracking=summary B % strings shared2.jsa| grep lib/ lib/A.jar lib/B.jar lib/B.jar When A.jar is loaded first, the Class-Path manifest entry adds B.jar. Then, B.jar is added *again*, unconditionally. When B.jar is loaded first, the app classpath entry is created first. Then, the manifest entry is checked and since it is a duplicate, only one entry is added. At this point I felt like I collected enough information to ask for some expert advice. Am I on the right track here, that this could be a bug resulting in duplicate classpath entries in the archive classpath, if a dependent jar comes in via a manifest class-path entry before the app classpath finishes processing? Could that possibly be the source of our assertion failures and IncompatibleClassChangeErrors? As a related question, this makes me worry that using `-cp lib/*` might implicitly embed the filesystem enumeration order in the archive. Maybe the classpath order is not important when verifying, but at the very least, the wildcard enumeration order influences the build in a way I did not expect. If my analysis sounds plausible, I can submit it via the Java bug system. Thank you for any consideration and advice. 
Best, Steven [1] https://gist.github.com/stevenschlansker/12d1eaeb363ae135c88a965048353b0e From kvn at openjdk.org Fri Dec 8 18:10:15 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 8 Dec 2023 18:10:15 GMT Subject: RFR: 8316197: Make tracing of inline cache available in unified logging In-Reply-To: References: Message-ID: On Thu, 7 Dec 2023 23:07:21 GMT, Yi-Fan Tsai wrote: > This removes develop flag `TraceICs` and makes the logs available via `-Xlog`. > > Example: > > % java -Xlog:inlinecache=trace -version > [0.061s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739001d17: set_to_interpreted java.lang.StringLatin1.hashCode([B)I > [0.078s][trace][inlinecache] IC at 0x00007f3739004a87: monomorphic to compiled (rcvr klass = nullptr) > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739005dff: set_to_interpreted jdk.internal.util.ArraysSupport.vectorizedHashCode(Ljava/lang/Object;IIII)I > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900502f: set_to_interpreted jdk.internal.org.objectweb.asm.ByteVector.enlarge(I)V > [0.079s][trace][inlinecache] IC at 0x00007f373900502f: monomorphic to interpreter: {method} {0x00007f36f03e6318} 'enlarge' '(I)V' in 'jdk/internal/org/objectweb/asm/ByteVector' > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739006b0f: set_to_compiled 0x00007f3739002120 > [0.083s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900928f: set_to_interpreted java.lang.AbstractStringBuilder.newCapacity(I)I > [0.083s][trace][inlinecache] IC at 0x00007f373900928f: monomorphic to interpreter: {method} {0x00007f36f00cd170} 'newCapacity' '(I)I' in 'java/lang/AbstractStringBuilder' > ... You missed `compiler/arguments/TestTraceICs.java` test. ------------- Changes requested by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17026#pullrequestreview-1772915213 From duke at openjdk.org Fri Dec 8 18:33:27 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Fri, 8 Dec 2023 18:33:27 GMT Subject: RFR: 8316197: Make tracing of inline cache available in unified logging [v2] In-Reply-To: References: Message-ID: <2LU9NAlTsuhqmPAA5Fp6f36NcRZ2Lma5yyjdRcG5kAI=.63b87ce3-74a1-4d49-9078-094a2dc2b566@github.com> > This removes develop flag `TraceICs` and makes the logs available via `-Xlog`. > > Example: > > % java -Xlog:inlinecache=trace -version > [0.061s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739001d17: set_to_interpreted java.lang.StringLatin1.hashCode([B)I > [0.078s][trace][inlinecache] IC at 0x00007f3739004a87: monomorphic to compiled (rcvr klass = nullptr) > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739005dff: set_to_interpreted jdk.internal.util.ArraysSupport.vectorizedHashCode(Ljava/lang/Object;IIII)I > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900502f: set_to_interpreted jdk.internal.org.objectweb.asm.ByteVector.enlarge(I)V > [0.079s][trace][inlinecache] IC at 0x00007f373900502f: monomorphic to interpreter: {method} {0x00007f36f03e6318} 'enlarge' '(I)V' in 'jdk/internal/org/objectweb/asm/ByteVector' > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739006b0f: set_to_compiled 0x00007f3739002120 > [0.083s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900928f: set_to_interpreted java.lang.AbstractStringBuilder.newCapacity(I)I > [0.083s][trace][inlinecache] IC at 0x00007f373900928f: monomorphic to interpreter: {method} {0x00007f36f00cd170} 'newCapacity' '(I)I' in 'java/lang/AbstractStringBuilder' > ... 
Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: Update TestTraceICs ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17026/files - new: https://git.openjdk.org/jdk/pull/17026/files/540d8868..f7602f44 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17026&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17026&range=00-01 Stats: 2 lines in 1 file changed: 0 ins; 1 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17026.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17026/head:pull/17026 PR: https://git.openjdk.org/jdk/pull/17026 From duke at openjdk.org Fri Dec 8 18:37:21 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Fri, 8 Dec 2023 18:37:21 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v11] In-Reply-To: References: Message-ID: On Fri, 8 Dec 2023 10:34:56 GMT, Yuri Gaevsky wrote: >> Hello All, >> >> Please review these changes to support _vectorizedHashCode intrinsic on >> RISC-V platform. The patch adds the "scalar" code for the intrinsic without >> usage of any RVV instruction but provides manual unrolling of the appropriate >> loop. The code with usage of RVV instruction could be added as follow-up of >> the patch or independently. >> >> Thanks, >> -Yuri Gaevsky >> >> P.S. My OCA has been accepted recently (ygaevsky). >> >> ### Correctness checks >> >> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux. >> >> ### Performance results (the numbers for non-ints are similar) >> >> #### StarFive JH7110 board: >> >> >> ArraysHashCode: without intrinsic with intrinsic >> ------------------------------------------------------------------------------- >> Benchmark (size) Mode Cnt Score Error Score Error Units >> ------------------------------------------------------------------------------- >> multiints 0 avgt 30 2.658 ? 0.001 2.661 ? 0.004 ns/op >> multiints 1 avgt 30 4.881 ? 0.011 4.892 ? 
0.015 ns/op >> multiints 2 avgt 30 16.109 ? 0.041 10.451 ? 0.075 ns/op >> multiints 3 avgt 30 14.873 ? 0.068 11.753 ? 0.024 ns/op >> multiints 4 avgt 30 17.283 ? 0.078 13.176 ? 0.044 ns/op >> multiints 5 avgt 30 19.691 ? 0.136 14.723 ? 0.046 ns/op >> multiints 6 avgt 30 21.727 ? 0.166 15.463 ? 0.124 ns/op >> multiints 7 avgt 30 23.790 ? 0.126 18.298 ? 0.059 ns/op >> multiints 8 avgt 30 23.527 ? 0.116 18.267 ? 0.046 ns/op >> multiints 9 avgt 30 27.981 ? 0.303 20.453 ? 0.069 ns/op >> multiints 10 avgt 30 26.947 ? 0.215 20.541 ? 0.051 ns/op >> multiints 50 avgt 30 95.373 ? 0.588 69.238 ? 0.208 ns/op >> multiints 100 avgt 30 177.109 ? 0.525 137.852 ? 0.417 ns/op >> multiints 200 avgt 30 341.074 ? 1.363 296.832 ? 0.725 ns/op >> multiints 500 avgt 30 847.993 ? 1.713 752.415 ? 1.918 ns/op >> multiints 1000 avgt 30 1610.199 ? 5.424 1426.112 ? 3.407 ns/op >> multiints 10000 avgt 30 16234.260 ? 26.789 14447.936 ? 26.345 ns/op >> multiints 100000 avgt 30 170726.025 ? 184.003 152587.649 ? 381.964 ns/op >> ---------------------------------------... > > Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: > > replaced macro definition with function, fixed whitespaces/comments. The failures above seem unrelated. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1847648514 From kvn at openjdk.org Fri Dec 8 18:46:17 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 8 Dec 2023 18:46:17 GMT Subject: RFR: 8316197: Make tracing of inline cache available in unified logging [v2] In-Reply-To: <2LU9NAlTsuhqmPAA5Fp6f36NcRZ2Lma5yyjdRcG5kAI=.63b87ce3-74a1-4d49-9078-094a2dc2b566@github.com> References: <2LU9NAlTsuhqmPAA5Fp6f36NcRZ2Lma5yyjdRcG5kAI=.63b87ce3-74a1-4d49-9078-094a2dc2b566@github.com> Message-ID: On Fri, 8 Dec 2023 18:33:27 GMT, Yi-Fan Tsai wrote: >> This removes develop flag `TraceICs` and makes the logs available via `-Xlog`. 
>> >> Example: >> >> % java -Xlog:inlinecache=trace -version >> [0.061s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739001d17: set_to_interpreted java.lang.StringLatin1.hashCode([B)I >> [0.078s][trace][inlinecache] IC at 0x00007f3739004a87: monomorphic to compiled (rcvr klass = nullptr) >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739005dff: set_to_interpreted jdk.internal.util.ArraysSupport.vectorizedHashCode(Ljava/lang/Object;IIII)I >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900502f: set_to_interpreted jdk.internal.org.objectweb.asm.ByteVector.enlarge(I)V >> [0.079s][trace][inlinecache] IC at 0x00007f373900502f: monomorphic to interpreter: {method} {0x00007f36f03e6318} 'enlarge' '(I)V' in 'jdk/internal/org/objectweb/asm/ByteVector' >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739006b0f: set_to_compiled 0x00007f3739002120 >> [0.083s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900928f: set_to_interpreted java.lang.AbstractStringBuilder.newCapacity(I)I >> [0.083s][trace][inlinecache] IC at 0x00007f373900928f: monomorphic to interpreter: {method} {0x00007f36f00cd170} 'newCapacity' '(I)I' in 'java/lang/AbstractStringBuilder' >> ... > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Update TestTraceICs test/hotspot/jtreg/compiler/arguments/TestTraceICs.java line 28: > 26: * @bug 8217447 > 27: * @summary Test running TraceICs enabled. > 28: * @run main/othervm compiler.arguments.TestTraceICs I think you need to add `-Xlog:inlinecache=trace` flag to keep previous test's behavior and check your new code. 
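As a rough illustration of the suggestion, a jtreg test that runs with the new logging enabled might look like the sketch below. This is not the actual `TestTraceICs.java`: the class name `TraceICsLoggingSketch`, its summary line, and its workload are hypothetical, and it omits the package declaration and license header the real test has. Only the `-Xlog:inlinecache=trace` option and the `@run main/othervm` form are taken from this thread.

```java
/*
 * @test
 * @summary Sketch: run a small workload with inline cache logging enabled.
 * @run main/othervm -Xlog:inlinecache=trace TraceICsLoggingSketch
 */
public class TraceICsLoggingSketch {

    // Two receiver types so that the call site below sees more than one class.
    interface Shape {
        int area();
    }

    static final class Square implements Shape {
        private final int side;
        Square(int side) { this.side = side; }
        public int area() { return side * side; }
    }

    static final class Rect implements Shape {
        private final int width;
        private final int height;
        Rect(int width, int height) { this.width = width; this.height = height; }
        public int area() { return width * height; }
    }

    // Hypothetical workload: repeated interface calls that the JIT may compile.
    static int workload(int iterations) {
        Shape[] shapes = { new Square(2), new Rect(2, 3) };
        int sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += shapes[i % shapes.length].area();
        }
        return sum;
    }

    public static void main(String[] args) {
        int result = workload(100_000);
        if (result != 500_000) {
            throw new RuntimeException("unexpected result: " + result);
        }
    }
}
```

The point of the sketch is only that the `-Xlog` option goes on the `@run` line, so the logging code is actually exercised when the test runs.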
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17026#discussion_r1420893410 From duke at openjdk.org Fri Dec 8 18:50:38 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Fri, 8 Dec 2023 18:50:38 GMT Subject: RFR: 8316197: Make tracing of inline cache available in unified logging [v3] In-Reply-To: References: Message-ID: > This removes develop flag `TraceICs` and makes the logs available via `-Xlog`. > > Example: > > % java -Xlog:inlinecache=trace -version > [0.061s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739001d17: set_to_interpreted java.lang.StringLatin1.hashCode([B)I > [0.078s][trace][inlinecache] IC at 0x00007f3739004a87: monomorphic to compiled (rcvr klass = nullptr) > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739005dff: set_to_interpreted jdk.internal.util.ArraysSupport.vectorizedHashCode(Ljava/lang/Object;IIII)I > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900502f: set_to_interpreted jdk.internal.org.objectweb.asm.ByteVector.enlarge(I)V > [0.079s][trace][inlinecache] IC at 0x00007f373900502f: monomorphic to interpreter: {method} {0x00007f36f03e6318} 'enlarge' '(I)V' in 'jdk/internal/org/objectweb/asm/ByteVector' > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739006b0f: set_to_compiled 0x00007f3739002120 > [0.083s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900928f: set_to_interpreted java.lang.AbstractStringBuilder.newCapacity(I)I > [0.083s][trace][inlinecache] IC at 0x00007f373900928f: monomorphic to interpreter: {method} {0x00007f36f00cd170} 'newCapacity' '(I)I' in 'java/lang/AbstractStringBuilder' > ... 
Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: Enable inlinecache logs in TestTraceICs ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17026/files - new: https://git.openjdk.org/jdk/pull/17026/files/f7602f44..5d782f75 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17026&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17026&range=01-02 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17026.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17026/head:pull/17026 PR: https://git.openjdk.org/jdk/pull/17026 From kvn at openjdk.org Fri Dec 8 19:31:14 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 8 Dec 2023 19:31:14 GMT Subject: RFR: 8316197: Make tracing of inline cache available in unified logging [v3] In-Reply-To: References: Message-ID: On Fri, 8 Dec 2023 18:50:38 GMT, Yi-Fan Tsai wrote: >> This removes develop flag `TraceICs` and makes the logs available via `-Xlog`. 
>> >> Example: >> >> % java -Xlog:inlinecache=trace -version >> [0.061s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739001d17: set_to_interpreted java.lang.StringLatin1.hashCode([B)I >> [0.078s][trace][inlinecache] IC at 0x00007f3739004a87: monomorphic to compiled (rcvr klass = nullptr) >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739005dff: set_to_interpreted jdk.internal.util.ArraysSupport.vectorizedHashCode(Ljava/lang/Object;IIII)I >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900502f: set_to_interpreted jdk.internal.org.objectweb.asm.ByteVector.enlarge(I)V >> [0.079s][trace][inlinecache] IC at 0x00007f373900502f: monomorphic to interpreter: {method} {0x00007f36f03e6318} 'enlarge' '(I)V' in 'jdk/internal/org/objectweb/asm/ByteVector' >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739006b0f: set_to_compiled 0x00007f3739002120 >> [0.083s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900928f: set_to_interpreted java.lang.AbstractStringBuilder.newCapacity(I)I >> [0.083s][trace][inlinecache] IC at 0x00007f373900928f: monomorphic to interpreter: {method} {0x00007f36f00cd170} 'newCapacity' '(I)I' in 'java/lang/AbstractStringBuilder' >> ... > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Enable inlinecache logs in TestTraceICs Good. I will submit our testing. ------------- PR Review: https://git.openjdk.org/jdk/pull/17026#pullrequestreview-1773036125 From calvin.cheung at oracle.com Fri Dec 8 19:35:32 2023 From: calvin.cheung at oracle.com (Calvin Cheung) Date: Fri, 8 Dec 2023 11:35:32 -0800 Subject: CDS can archive classpath entries more than once when a JAR manifest has Class-Path attributes In-Reply-To: References: Message-ID: Hi Steven, I've only investigated the duplicate app class path issue and could reproduce it with JDK 21. 
I couldn't reproduce it with my mainline repo, which includes fixes in the upcoming JDK 22. I think the following fix in JDK 22 addresses the duplicate class path issue: https://bugs.openjdk.org/browse/JDK-8304292 We can look into backporting the fix to JDK 21. I'll let other folks comment on the remaining issues. Thanks, Calvin On 12/8/23 9:56 AM, Steven Schlansker wrote: > Hi hotspot-dev, > > Recently, we started experiencing JVM crashes [1] and inexplicable > IncompatibleClassChangeErrors in our testing environment. We use > custom classloaders, NMT, and app-CDS. > > # Internal Error (virtualMemoryTracker.cpp:403), pid=20, tid=128 > # Error: ShouldNotReachHere() > # > > # JRE version: OpenJDK Runtime Environment (Red_Hat-21.0.1.0.12-2) > (21.0.1+12) (build 21.0.1+12-LTS) > # Java VM: OpenJDK 64-Bit Server VM (Red_Hat-21.0.1.0.12-2) > (21.0.1+12-LTS, mixed mode, sharing, tiered, compressed oops, > compressed class ptrs, g1 gc, linux-amd64) > # Problematic frame: > # V [libjvm.so+0x104a06c] > VirtualMemoryTracker::add_reserved_region(unsigned char*, unsigned > long, NativeCallStack const&, MEMFLAGS)+0x6fc > > and, > > java.lang.IncompatibleClassChangeError: > com.paywholesail.components.util.ByteBuffers and > com.paywholesail.components.util.ByteBuffers$ByteBufferPuttable > disagree on InnerClasses attribute > (I checked with javap, and it looks the same to me...) > > At least for the ShouldNotReachHere, it looked like a definite JVM > bug, so I have been trying to create a reproducing test case to make a > good error report. I noticed that the crash only happens when NMT is > combined with Class Data Sharing. At this point, I read the logs > closely, and noticed: > > [0.139s][warning][cds ] shared class paths mismatch > [0.151s][warning][cds,dynamic] Unable to use shared archive. The top > archive failed to load: /.../prebake.jsa > > So, I compared the expected and actual class path as printed by the > JVM.
In both cases, we run with `-cp lib/*` with a fixed set of > library jars. Imagine my surprise when I find that the only difference > is that the expected (archive-time) classpath includes > lib/stax-ex-1.8.jar *twice*. > > By running the generated shared archive file with `strings | grep`, I > am able to verify that the `lib/stax-ex-1.8.jar` entry indeed is > present in the archive twice. > > I fixed up my JDK build environment and started sprinkling new logging > and assertions through the archive creation code. > > It looks like ClassLoader::add_to_app_classpath_entries can either > check for duplicated classpath entries, or trust that the caller knows > the element is new. > This list of entries is built in part by > ClassLoader::setup_app_search_path, which enumerates the classpath and > adds entries one by one. In this case, duplicate checks are skipped, > presumably because we trust the initial classpath not to have > duplicates. > > When an element is added in add_to_app_classpath_entries, for each > jar, it calls process_jar_manifest. Among other things, this reads the > MANIFEST.MF and looks for Class-Path entries, and loads those too. > Indeed, our `jaxb-runtime` has such an entry for `stax-ex`. In this > case, it does guard against duplicate entries. > > I think there is a bug here: if a jar is added by a manifest's > Class-Path from a jar *before* we finish processing the initial app > class path, it can get added twice - first with a duplicate check via > the manifest, and then a second time without checking for duplicates > from the app classpath. 
>
> I believe this is reproducible on latest 21.0.1+12 with the following
> code and steps:
>
> A.java:
> class A {
>     static {
>         System.err.println("A");
>     }
> }
>
> class B {
>     public static void main(String[] args) {
>         System.err.println("hi!");
>         new A();
>     }
> }
>
> MANIFEST.MF:
> Manifest-Version: 1.0
> Class-Path: B.jar
>
>
> % mkdir lib
> % javac A.java
> % javac B.java
> % jar -m META-INF/MANIFEST.MF -c -f lib/A.jar A.class
> % jar cf lib/B.jar B.class
>
> % java -cp lib/B.jar:lib/A.jar -XX:ArchiveClassesAtExit=shared.jsa
> -XX:NativeMemoryTracking=summary B
> % strings shared.jsa| grep lib/
> lib/B.jar
> lib/A.jar
>
> % java -cp lib/A.jar:lib/B.jar -XX:ArchiveClassesAtExit=shared2.jsa
> -XX:NativeMemoryTracking=summary B
> % strings shared2.jsa| grep lib/
> lib/A.jar
> lib/B.jar
> lib/B.jar
>
> When A.jar is loaded first, the Class-Path manifest entry adds B.jar.
> Then, B.jar is added *again*, unconditionally.
> When B.jar is loaded first, the app classpath entry is created first.
> Then, the manifest entry is checked and since it is a duplicate, only
> one entry is added.
>
> At this point I felt like I collected enough information to ask for
> some expert advice.
> Am I on the right track here, that this could be a bug resulting in
> duplicate classpath entries in the archive classpath, if a dependent
> jar comes in via a manifest class-path entry before the app classpath
> finishes processing? Could that possibly be the source of our
> assertion failures and IncompatibleClassChangeErrors?
>
> As a related question, this makes me worry that using `-cp lib/*`
> might implicitly embed the filesystem enumeration order in the
> archive. Maybe the classpath order is not important when verifying,
> but at the very least, the wildcard enumeration order influences the
> build in a way I did not expect.
>
> If my analysis sounds plausible, I can submit it via the Java bug system.
>
> Thank you for any consideration and advice.
Best, > Steven > > [1] https://gist.github.com/stevenschlansker/12d1eaeb363ae135c88a965048353b0e From jvernee at openjdk.org Fri Dec 8 19:43:28 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Fri, 8 Dec 2023 19:43:28 GMT Subject: RFR: 8320886: Unsafe_SetMemory0 is not guarded [v3] In-Reply-To: <5kRdxpEyFZLzxlyHpdHju1w9qLbm4OA6UkVZMr17nt0=.339b7543-574c-4a06-84e9-2ffb9d9a345a@github.com> References: <5kRdxpEyFZLzxlyHpdHju1w9qLbm4OA6UkVZMr17nt0=.339b7543-574c-4a06-84e9-2ffb9d9a345a@github.com> Message-ID: > See JBS issue. > > Guard the memory access done in Unsafe_SetMemory0 to prevent a SIGBUS error from crashing the VM when a truncated memory mapped file is accessed. > > Testing: local `InternalErrorTest`, Tier 1-5 Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision: adjust whitespace ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16848/files - new: https://git.openjdk.org/jdk/pull/16848/files/e9a5247e..7c258f07 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16848&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16848&range=01-02 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16848.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16848/head:pull/16848 PR: https://git.openjdk.org/jdk/pull/16848 From kvn at openjdk.org Fri Dec 8 21:31:13 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Fri, 8 Dec 2023 21:31:13 GMT Subject: RFR: 8316197: Make tracing of inline cache available in unified logging [v3] In-Reply-To: References: Message-ID: On Fri, 8 Dec 2023 18:50:38 GMT, Yi-Fan Tsai wrote: >> This removes develop flag `TraceICs` and makes the logs available via `-Xlog`. 
>> >> Example: >> >> % java -Xlog:inlinecache=trace -version >> [0.061s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739001d17: set_to_interpreted java.lang.StringLatin1.hashCode([B)I >> [0.078s][trace][inlinecache] IC at 0x00007f3739004a87: monomorphic to compiled (rcvr klass = nullptr) >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739005dff: set_to_interpreted jdk.internal.util.ArraysSupport.vectorizedHashCode(Ljava/lang/Object;IIII)I >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900502f: set_to_interpreted jdk.internal.org.objectweb.asm.ByteVector.enlarge(I)V >> [0.079s][trace][inlinecache] IC at 0x00007f373900502f: monomorphic to interpreter: {method} {0x00007f36f03e6318} 'enlarge' '(I)V' in 'jdk/internal/org/objectweb/asm/ByteVector' >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739006b0f: set_to_compiled 0x00007f3739002120 >> [0.083s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900928f: set_to_interpreted java.lang.AbstractStringBuilder.newCapacity(I)I >> [0.083s][trace][inlinecache] IC at 0x00007f373900928f: monomorphic to interpreter: {method} {0x00007f36f00cd170} 'newCapacity' '(I)I' in 'java/lang/AbstractStringBuilder' >> ... > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Enable inlinecache logs in TestTraceICs My testing tier and tier2(which runs the test) passed. You need second review. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17026#pullrequestreview-1773195355

From omikhaltcova at openjdk.org Fri Dec 8 22:34:17 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Fri, 8 Dec 2023 22:34:17 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6]
In-Reply-To:
References:
Message-ID:

On Fri, 8 Dec 2023 09:13:44 GMT, Andrew Haley wrote:

>> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4292:
>>
>>> 4290:   // if +/-0, +/-subnormal numbers, signaling/quiet NaN
>>> 4291:   andi(t0, t0, fclass_mask::nan | fclass_mask::zero | fclass_mask::subnorm);
>>> 4292:   bnez(t0, done);
>>
>> What is this subnorm test for?
>
> It looks to me like RoundTests.java isn't testing denormals. But I guess you tested the entire 32-bit range against the Java code, right?

Subnormal numbers can be identified after the fclass call. They were added here as well to avoid redundant operations further on, since they also round to 0.
Yes, that's right: I verified the output of this algorithm against the current Java implementation over the full 32-bit range.
Thanks for the advice below!

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1421076577

From omikhaltcova at openjdk.org Fri Dec 8 22:46:45 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Fri, 8 Dec 2023 22:46:45 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v7]
In-Reply-To:
References:
Message-ID:

> Please review this implementation of the roundD/roundF intrinsics for the RISC-V platform.
>
> The table below shows that a NaN argument must be processed as a special case.
>
>                                          RISC-V                    Java
>                                          (FCVT.W.S)   (FCVT.L.D)   (long round(double a))   (int round(float a))
> Minimum valid input (after rounding)     -2^31        -2^63        Long.MIN_VALUE           Integer.MIN_VALUE
> Maximum valid input (after rounding)     2^31 - 1     2^63 - 1     Long.MAX_VALUE           Integer.MAX_VALUE
> Output for out-of-range negative input   -2^31        -2^63        Long.MIN_VALUE           Integer.MIN_VALUE
> Output for -∞                            -2^31        -2^63        Long.MIN_VALUE           Integer.MIN_VALUE
> Output for out-of-range positive input   2^31 - 1     2^63 - 1     Long.MAX_VALUE           Integer.MAX_VALUE
> Output for +∞                            2^31 - 1     2^63 - 1     Long.MAX_VALUE           Integer.MAX_VALUE
> Output for NaN                           2^31 - 1     2^63 - 1     0                        0
>
> Running the benchmark with the second, fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>
> **Before**
>
> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555   0.179  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760   0.103  ops/ms
>
>
> **After**
>
> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956   0.186  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947   0.122  ops/ms

Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:

  Optimization against regression on SiFive

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/16382/files
  - new: https://git.openjdk.org/jdk/pull/16382/files/fed920ea..d60488fa

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=05-06

  Stats: 22 lines in 1 file changed: 4 ins; 12 del; 6 mod
  Patch: https://git.openjdk.org/jdk/pull/16382.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16382/head:pull/16382

PR: https://git.openjdk.org/jdk/pull/16382

From omikhaltcova at openjdk.org Fri Dec 8 23:03:15 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Fri, 8 Dec 2023 23:03:15 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6]
In-Reply-To: <7GqjvcQqsvlQZw4-4iKvUElpeZz717Nf8uTd_YY_LBk=.1bfd1f5a-5670-4962-9620-9c93a192a033@github.com>
References:
<7GqjvcQqsvlQZw4-4iKvUElpeZz717Nf8uTd_YY_LBk=.1bfd1f5a-5670-4962-9620-9c93a192a033@github.com>
Message-ID: <---Vaqud515sWsUjJhkv1kONUP3Qon_R8fRwbO07f28=.290ba1f7-50f5-4178-899f-4d3aaf0cff6c@github.com>

On Tue, 5 Dec 2023 03:33:52 GMT, Fei Yang wrote:

>> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Replaced tmp with t0
>
> Unfortunately, I witnessed a performance regression on the SiFive Unmatched board.
>
> Before:
>
> FpRoundingBenchmark.test_ceil          2048  thrpt   15  39.243 ± 0.506  ops/ms
> FpRoundingBenchmark.test_floor         2048  thrpt   15  39.448 ± 0.076  ops/ms
> FpRoundingBenchmark.test_rint          2048  thrpt   15  39.411 ± 0.134  ops/ms
> FpRoundingBenchmark.test_round_double  2048  thrpt   15  31.329 ± 0.085  ops/ms
> FpRoundingBenchmark.test_round_float   2048  thrpt   15  31.328 ± 0.031  ops/ms
>
> After:
>
> FpRoundingBenchmark.test_ceil          2048  thrpt   15  39.375 ± 0.125  ops/ms
> FpRoundingBenchmark.test_floor         2048  thrpt   15  39.407 ± 0.076  ops/ms
> FpRoundingBenchmark.test_rint          2048  thrpt   15  39.387 ± 0.235  ops/ms
> FpRoundingBenchmark.test_round_double  2048  thrpt   15  23.940 ± 0.025  ops/ms
> FpRoundingBenchmark.test_round_float   2048  thrpt   15  30.629 ± 0.021  ops/ms

@RealFYang Thanks for pointing out this regression! Some optimization has been done. Please take a look at the results below!

**VisionFive 2**

Benchmark                              (TESTSIZE)   Mode  Cnt   Score   Error   Units
FpRoundingBenchmark.test_round_double        2048  thrpt   15  39.351 ± 0.150  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15  39.323 ± 0.192  ops/ms

After

FpRoundingBenchmark.test_round_double        2048  thrpt   15  36.812 ± 0.171  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15  50.179 ± 0.143  ops/ms

**T-Head**

Before

Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.853   0.227  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.889   0.145  ops/ms

After

FpRoundingBenchmark.test_round_double        2048  thrpt   15  119.493   1.591  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15  123.546   0.329  ops/ms

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1847949976

From duke at openjdk.org Sat Dec 9 01:10:47 2023
From: duke at openjdk.org (Lei Zaakjyu)
Date: Sat, 9 Dec 2023 01:10:47 GMT
Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v6]
In-Reply-To:
References:
Message-ID:

> 8234502: Merge GenCollectedHeap and SerialHeap

Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision:

  remove 'GenCollectedHeap' from 'jdk.hotspot.agent'

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/16927/files
  - new: https://git.openjdk.org/jdk/pull/16927/files/e6d7dfed..f6ca5177

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=05
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=04-05

  Stats: 293 lines in 6 files changed: 89 ins; 190 del; 14 mod
  Patch: https://git.openjdk.org/jdk/pull/16927.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16927/head:pull/16927

PR: https://git.openjdk.org/jdk/pull/16927

From duke at openjdk.org Sat Dec 9 01:36:39 2023
From: duke at openjdk.org (Lei Zaakjyu)
Date: Sat, 9 Dec 2023 01:36:39 GMT
Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v7]
In-Reply-To:
References:
Message-ID:

> 8234502: Merge GenCollectedHeap and SerialHeap

Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision:

  fix import statement

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/16927/files
  - new: https://git.openjdk.org/jdk/pull/16927/files/f6ca5177..6d8381cb

Webrevs:
 - full:
https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=05-06 Stats: 1 line in 1 file changed: 1 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16927.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16927/head:pull/16927 PR: https://git.openjdk.org/jdk/pull/16927 From fyang at openjdk.org Sat Dec 9 01:48:18 2023 From: fyang at openjdk.org (Fei Yang) Date: Sat, 9 Dec 2023 01:48:18 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: <0pxHhnoibLujMzLffPATnaVtzRmWfk1jA7shvSxdUiY=.991ce23e-8b5c-4d72-b766-be6f07e787c9@github.com> References: <1DDeK4GVrVkxpfVPZOLZOVDOP0C-ecKBrM322AJuX5U=.3c60fa2e-f243-4fbd-9356-fea6f4cd3f08@github.com> <0pxHhnoibLujMzLffPATnaVtzRmWfk1jA7shvSxdUiY=.991ce23e-8b5c-4d72-b766-be6f07e787c9@github.com> Message-ID: On Fri, 8 Dec 2023 14:14:46 GMT, Yuri Gaevsky wrote: >> The control flow will be directed to `DONE` by the `beqz` check at L1493 when `cnt` is zero. > > I meant other case(s): if '`cnt`' is equal to 4/8/... then after the initial `cnt==zero` check at L1493 the control flow _**doesn't jump**_ to `DONE ` but continue execution, _zeroed_ by `andi()` _before_ the wide loop, so after the loop the 'cnt' is zero (L1501) - that's why the check is needed. Yes, I know what you mean. And that's also what I am suggesting in my initial comment: move this `beqz` check immediately after the loop (that is after L1522 in your latest version). 
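
For readers following the control-flow argument, the placement of the zero check can be modelled in plain Java. This is only a sketch under stated assumptions, not the RISC-V code from the PR: the class name `UnrolledHashSketch` and its local variable names are hypothetical, the stride of 4 is taken from the quoted comments, and the seed of 1 is chosen so the result can be compared with `Arrays.hashCode(int[])`.

```java
import java.util.Arrays;

// Scalar model of a 4-way unrolled polynomial hash (h = 31 * h + element).
// All names here are illustrative; they are not taken from the HotSpot sources.
final class UnrolledHashSketch {
    static int hash(int[] ary) {
        final int stride = 4;
        final int pow31_2 = 31 * 31;
        final int pow31_3 = pow31_2 * 31;
        final int pow31_4 = pow31_3 * 31;

        int result = 1;                    // same seed as Arrays.hashCode
        int cnt = ary.length;
        if (cnt == 0) return result;       // initial zero check, before any work

        int i = 0;
        int chunks = cnt & ~(stride - 1);  // number of elements handled by the wide loop
        if (chunks != 0) {
            cnt &= stride - 1;             // leftover elements for the tail
            do {                           // wide loop: four elements per iteration
                result = pow31_4 * result
                       + pow31_3 * ary[i]
                       + pow31_2 * ary[i + 1]
                       + 31 * ary[i + 2]
                       + ary[i + 3];
                i += stride;
            } while (i != chunks);
            if (cnt == 0) return result;   // zero check placed right after the wide loop
        }

        // Tail: cnt is known to be non-zero on every path that reaches this point,
        // so no further check is needed at the top of the tail.
        int end = i + cnt;
        do {
            result = 31 * result + ary[i];
            i++;
        } while (i != end);
        return result;
    }

    public static void main(String[] args) {
        for (int n = 0; n <= 9; n++) {
            int[] a = new int[n];
            for (int j = 0; j < n; j++) a[j] = 7 * j - 3;
            System.out.println(n + ": " + (hash(a) == Arrays.hashCode(a)));
        }
    }
}
```

When the wide loop is skipped, the count is non-zero and smaller than the stride, so the tail always has work to do; only the path that leaves the wide loop can arrive with a zero count, which is why a single check at that exit is sufficient in this model.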
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1421202679 From duke at openjdk.org Sat Dec 9 02:02:43 2023 From: duke at openjdk.org (Lei Zaakjyu) Date: Sat, 9 Dec 2023 02:02:43 GMT Subject: RFR: 8234502: Merge GenCollectedHeap and SerialHeap [v8] In-Reply-To: References: Message-ID: > 8234502: Merge GenCollectedHeap and SerialHeap Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: fix import statement ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16927/files - new: https://git.openjdk.org/jdk/pull/16927/files/6d8381cb..992b7ab4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16927&range=06-07 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16927.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16927/head:pull/16927 PR: https://git.openjdk.org/jdk/pull/16927 From bulasevich at openjdk.org Sat Dec 9 05:01:12 2023 From: bulasevich at openjdk.org (Boris Ulasevich) Date: Sat, 9 Dec 2023 05:01:12 GMT Subject: RFR: 8321515: ARM32: Move method resolution information out of the cpCache properly In-Reply-To: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> References: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> Message-ID: On Thu, 7 Dec 2023 10:25:05 GMT, Aleksei Voitylov wrote: > Thanks to @matias9927, JDK-8320278 fixed the JDK build for ARM32 after JDK-8301997. This PR introduces some additional fixes that enable the ARM32 port to actually work. Thanks for fixing the ARM32 build! I see the change fixes wrong offset and register usage and does some cleanup. It is OK for me. Thanks again! 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/17017#issuecomment-1848232235

From gli at openjdk.org Sat Dec 9 08:04:29 2023
From: gli at openjdk.org (Guoxiong Li)
Date: Sat, 9 Dec 2023 08:04:29 GMT
Subject: RFR: 8321631: Fix comments in access.hpp
Message-ID:

Hi all,

This trivial patch fixes the comments about `atomic_xchg` and `atomic_xchg_at` in `access.hpp`.
It also removes the obsolete text about `INSTANTIATE_HPP_ACCESS`,
which was already removed in [JDK-8230808](https://bugs.openjdk.org/browse/JDK-8230808).

Thanks for the review.

Best Regards,
-- Guoxiong

-------------

Commit messages:
 - JDK-8321631

Changes: https://git.openjdk.org/jdk/pull/17042/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17042&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8321631
  Stats: 5 lines in 1 file changed: 0 ins; 1 del; 4 mod
  Patch: https://git.openjdk.org/jdk/pull/17042.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17042/head:pull/17042

PR: https://git.openjdk.org/jdk/pull/17042

From eosterlund at openjdk.org Sat Dec 9 21:53:11 2023
From: eosterlund at openjdk.org (Erik Österlund)
Date: Sat, 9 Dec 2023 21:53:11 GMT
Subject: RFR: 8321631: Fix comments in access.hpp
In-Reply-To:
References:
Message-ID:

On Sat, 9 Dec 2023 07:59:02 GMT, Guoxiong Li wrote:

> Hi all,
>
> This trivial patch fixes the comments about `atomic_xchg` and `atomic_xchg_at` in `access.hpp`.
> It also removes the obsolete text about `INSTANTIATE_HPP_ACCESS`,
> which was already removed in [JDK-8230808](https://bugs.openjdk.org/browse/JDK-8230808).
>
> Thanks for the review.
>
> Best Regards,
> -- Guoxiong

Marked as reviewed by eosterlund (Reviewer).
-------------

PR Review: https://git.openjdk.org/jdk/pull/17042#pullrequestreview-1773861463

From gli at openjdk.org Sun Dec 10 05:28:28 2023
From: gli at openjdk.org (Guoxiong Li)
Date: Sun, 10 Dec 2023 05:28:28 GMT
Subject: RFR: 8321640: Move the method barrier_stubs_init from BarrierSetAssembler to BarrierSet
Message-ID:

Hi all,

This patch moves the method `barrier_stubs_init` from `BarrierSetAssembler` to `BarrierSet`.
`BarrierSetAssembler` is an assembler, similar to `MacroAssembler`,
whereas `barrier_stubs_init` generates and stores stubs, which is what `StubGenerator` does.
So `BarrierSetAssembler` is not a good place for it.

Thanks for taking the time to review.

Best Regards,
-- Guoxiong

-------------

Commit messages:
 - JDK-8321640

Changes: https://git.openjdk.org/jdk/pull/17044/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17044&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8321640
  Stats: 19 lines in 8 files changed: 5 ins; 12 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/17044.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17044/head:pull/17044

PR: https://git.openjdk.org/jdk/pull/17044

From duke at openjdk.org Sun Dec 10 11:19:19 2023
From: duke at openjdk.org (Yuri Gaevsky)
Date: Sun, 10 Dec 2023 11:19:19 GMT
Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10]
In-Reply-To:
References: <1DDeK4GVrVkxpfVPZOLZOVDOP0C-ecKBrM322AJuX5U=.3c60fa2e-f243-4fbd-9356-fea6f4cd3f08@github.com> <0pxHhnoibLujMzLffPATnaVtzRmWfk1jA7shvSxdUiY=.991ce23e-8b5c-4d72-b766-be6f07e787c9@github.com>
Message-ID:

On Sat, 9 Dec 2023 01:45:12 GMT, Fei Yang wrote:

>> I meant other case(s): if '`cnt`' is equal to 4/8/... then after the initial `cnt==zero` check at L1493 the control flow _**doesn't jump**_ to `DONE ` but continue execution, _zeroed_ by `andi()` _before_ the wide loop, so after the loop the 'cnt' is zero (L1501) - that's why the check is needed.
>
> Yes, I know what you mean.
And that's also what I am suggesting in my initial comment: move this `beqz` check immediately after the loop (that is after L1522 in your latest version). Like following add-on change:
>
>
> diff --git a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
> index 2d93d36a37f..11cbcaa48a1 100644
> --- a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
> +++ b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
> @@ -1521,9 +1521,9 @@ void C2_MacroAssembler::arrays_hashcode(Register ary, Register cnt, Register res
>                                           // + 31^^1 * ary[i+2] + 31^^0 * ary[i+3]
>    addi(ary, ary, elsize * stride);
>    bne(ary, chunks_end, WIDE_LOOP);
> +  beqz(cnt, DONE);
>
>    bind(TAIL);
> -  beqz(cnt, DONE);
>    slli(chunks_end, cnt, chunks_end_shift);
>    add(chunks_end, ary, chunks_end);

Ah, got it finally. Nice catch, thanks!

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1421730744

From duke at openjdk.org Sun Dec 10 12:14:35 2023
From: duke at openjdk.org (Yuri Gaevsky)
Date: Sun, 10 Dec 2023 12:14:35 GMT
Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v12]
In-Reply-To:
References:
Message-ID:

> Hello All,
>
> Please review these changes to support _vectorizedHashCode intrinsic on
> RISC-V platform. The patch adds the "scalar" code for the intrinsic without
> usage of any RVV instruction but provides manual unrolling of the appropriate
> loop. The code with usage of RVV instruction could be added as follow-up of
> the patch or independently.
>
> Thanks,
> -Yuri Gaevsky
>
> P.S. My OCA has been accepted recently (ygaevsky).
>
> ### Correctness checks
>
> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux.
>
> ### Performance results (the numbers for non-ints are similar)
>
> #### StarFive JH7110 board:
>
>
> ArraysHashCode:                    without intrinsic        with intrinsic
> -------------------------------------------------------------------------------
> Benchmark  (size)  Mode  Cnt       Score     Error          Score     Error  Units
> -------------------------------------------------------------------------------
> multiints       0  avgt   30       2.658 ±   0.001          2.661 ±   0.004  ns/op
> multiints       1  avgt   30       4.881 ±   0.011          4.892 ±   0.015  ns/op
> multiints       2  avgt   30      16.109 ±   0.041         10.451 ±   0.075  ns/op
> multiints       3  avgt   30      14.873 ±   0.068         11.753 ±   0.024  ns/op
> multiints       4  avgt   30      17.283 ±   0.078         13.176 ±   0.044  ns/op
> multiints       5  avgt   30      19.691 ±   0.136         14.723 ±   0.046  ns/op
> multiints       6  avgt   30      21.727 ±   0.166         15.463 ±   0.124  ns/op
> multiints       7  avgt   30      23.790 ±   0.126         18.298 ±   0.059  ns/op
> multiints       8  avgt   30      23.527 ±   0.116         18.267 ±   0.046  ns/op
> multiints       9  avgt   30      27.981 ±   0.303         20.453 ±   0.069  ns/op
> multiints      10  avgt   30      26.947 ±   0.215         20.541 ±   0.051  ns/op
> multiints      50  avgt   30      95.373 ±   0.588         69.238 ±   0.208  ns/op
> multiints     100  avgt   30     177.109 ±   0.525        137.852 ±   0.417  ns/op
> multiints     200  avgt   30     341.074 ±   1.363        296.832 ±   0.725  ns/op
> multiints     500  avgt   30     847.993 ±   1.713        752.415 ±   1.918  ns/op
> multiints    1000  avgt   30    1610.199 ±   5.424       1426.112 ±   3.407  ns/op
> multiints   10000  avgt   30   16234.260 ±  26.789      14447.936 ±  26.345  ns/op
> multiints  100000  avgt   30  170726.025 ± 184.003     152587.649 ± 381.964  ns/op
> -------------------------------------------------------------------------------
>
> #### T-Head RVB-ICE board:
>
>
> ArraysHashCode: ...

Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:

  Moved zero check for cnt before TAIL per @RealFYang suggestion.
------------- Changes: - all: https://git.openjdk.org/jdk/pull/16629/files - new: https://git.openjdk.org/jdk/pull/16629/files/b5bb8d3d..94ad47e2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16629&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16629&range=10-11 Stats: 2 lines in 1 file changed: 1 ins; 1 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16629.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16629/head:pull/16629 PR: https://git.openjdk.org/jdk/pull/16629 From duke at openjdk.org Sun Dec 10 12:14:37 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Sun, 10 Dec 2023 12:14:37 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: References: <1DDeK4GVrVkxpfVPZOLZOVDOP0C-ecKBrM322AJuX5U=.3c60fa2e-f243-4fbd-9356-fea6f4cd3f08@github.com> <0pxHhnoibLujMzLffPATnaVtzRmWfk1jA7shvSxdUiY=.991ce23e-8b5c-4d72-b766-be6f07e787c9@github.com> Message-ID: On Sun, 10 Dec 2023 11:16:09 GMT, Yuri Gaevsky wrote: >> Yes, I know what you mean. And that's also what I am suggesting in my initial comment: move this `beqz` check immediately after the loop (that is after L1522 in your latest version). Like following add-on change: >> >> >> diff --git a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp >> index 2d93d36a37f..11cbcaa48a1 100644 >> --- a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp >> +++ b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp >> @@ -1521,9 +1521,9 @@ void C2_MacroAssembler::arrays_hashcode(Register ary, Register cnt, Register res >> // + 31^^1 * ary[i+2] + 31^^0 * ary[i+3] >> addi(ary, ary, elsize * stride); >> bne(ary, chunks_end, WIDE_LOOP); >> + beqz(cnt, DONE); >> >> bind(TAIL); >> - beqz(cnt, DONE); >> slli(chunks_end, cnt, chunks_end_shift); >> add(chunks_end, ary, chunks_end); > > Ah, got it finally. Nice catch, thanks! Fixed. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16629#discussion_r1421738968 From dholmes at openjdk.org Mon Dec 11 04:49:14 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 11 Dec 2023 04:49:14 GMT Subject: RFR: 8316197: Make tracing of inline cache available in unified logging [v3] In-Reply-To: References: Message-ID: On Fri, 8 Dec 2023 18:50:38 GMT, Yi-Fan Tsai wrote: >> This removes develop flag `TraceICs` and makes the logs available via `-Xlog`. >> >> Example: >> >> % java -Xlog:inlinecache=trace -version >> [0.061s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739001d17: set_to_interpreted java.lang.StringLatin1.hashCode([B)I >> [0.078s][trace][inlinecache] IC at 0x00007f3739004a87: monomorphic to compiled (rcvr klass = nullptr) >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739005dff: set_to_interpreted jdk.internal.util.ArraysSupport.vectorizedHashCode(Ljava/lang/Object;IIII)I >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900502f: set_to_interpreted jdk.internal.org.objectweb.asm.ByteVector.enlarge(I)V >> [0.079s][trace][inlinecache] IC at 0x00007f373900502f: monomorphic to interpreter: {method} {0x00007f36f03e6318} 'enlarge' '(I)V' in 'jdk/internal/org/objectweb/asm/ByteVector' >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739006b0f: set_to_compiled 0x00007f3739002120 >> [0.083s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900928f: set_to_interpreted java.lang.AbstractStringBuilder.newCapacity(I)I >> [0.083s][trace][inlinecache] IC at 0x00007f373900928f: monomorphic to interpreter: {method} {0x00007f36f00cd170} 'newCapacity' '(I)I' in 'java/lang/AbstractStringBuilder' >> ... > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Enable inlinecache logs in TestTraceICs Code changes look fine to me. I think the test can go but that is up to @vnkozlov . 
Thanks test/hotspot/jtreg/compiler/arguments/TestTraceICs.java line 27: > 25: * @test > 26: * @bug 8217447 > 27: * @summary Test running TraceICs enabled. The summary is no longer applicable. But really this test seems some what pointless. It basically checks that `Xlog:inlinecache` is a valid log setting, but doesn't actually check anything interesting. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17026#pullrequestreview-1774275135 PR Review Comment: https://git.openjdk.org/jdk/pull/17026#discussion_r1421938428 From dholmes at openjdk.org Mon Dec 11 06:05:16 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 11 Dec 2023 06:05:16 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v3] In-Reply-To: <3pfgWe1NIoMrOXlGqLsyJCsgPgMZ6AJtlxSy64o76o8=.ecc470d4-12c2-4b1b-9da9-1155ceb8329e@github.com> References: <3pfgWe1NIoMrOXlGqLsyJCsgPgMZ6AJtlxSy64o76o8=.ecc470d4-12c2-4b1b-9da9-1155ceb8329e@github.com> Message-ID: On Wed, 6 Dec 2023 08:18:36 GMT, Thomas Stuefe wrote: > I cannot just use scanf with %f since that would also parse values without decimal point that are meant to be absolute. 0.0 -.999... == % else absolute ? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16938#issuecomment-1849378802 From fyang at openjdk.org Mon Dec 11 07:29:22 2023 From: fyang at openjdk.org (Fei Yang) Date: Mon, 11 Dec 2023 07:29:22 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v12] In-Reply-To: References: Message-ID: On Sun, 10 Dec 2023 12:14:35 GMT, Yuri Gaevsky wrote: >> Hello All, >> >> Please review these changes to support _vectorizedHashCode intrinsic on >> RISC-V platform. The patch adds the "scalar" code for the intrinsic without >> usage of any RVV instruction but provides manual unrolling of the appropriate >> loop. The code with usage of RVV instruction could be added as follow-up of >> the patch or independently. >> >> Thanks, >> -Yuri Gaevsky >> >> P.S. 
My OCA has been accepted recently (ygaevsky). >> >> ### Correctness checks >> >> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux. >> >> ### Performance results (the numbers for non-ints are similar) >> >> #### StarFive JH7110 board: >> >> >> ArraysHashCode: without intrinsic with intrinsic >> ------------------------------------------------------------------------------- >> Benchmark (size) Mode Cnt Score Error Score Error Units >> ------------------------------------------------------------------------------- >> multiints 0 avgt 30 2.658 ? 0.001 2.661 ? 0.004 ns/op >> multiints 1 avgt 30 4.881 ? 0.011 4.892 ? 0.015 ns/op >> multiints 2 avgt 30 16.109 ? 0.041 10.451 ? 0.075 ns/op >> multiints 3 avgt 30 14.873 ? 0.068 11.753 ? 0.024 ns/op >> multiints 4 avgt 30 17.283 ? 0.078 13.176 ? 0.044 ns/op >> multiints 5 avgt 30 19.691 ? 0.136 14.723 ? 0.046 ns/op >> multiints 6 avgt 30 21.727 ? 0.166 15.463 ? 0.124 ns/op >> multiints 7 avgt 30 23.790 ? 0.126 18.298 ? 0.059 ns/op >> multiints 8 avgt 30 23.527 ? 0.116 18.267 ? 0.046 ns/op >> multiints 9 avgt 30 27.981 ? 0.303 20.453 ? 0.069 ns/op >> multiints 10 avgt 30 26.947 ? 0.215 20.541 ? 0.051 ns/op >> multiints 50 avgt 30 95.373 ? 0.588 69.238 ? 0.208 ns/op >> multiints 100 avgt 30 177.109 ? 0.525 137.852 ? 0.417 ns/op >> multiints 200 avgt 30 341.074 ? 1.363 296.832 ? 0.725 ns/op >> multiints 500 avgt 30 847.993 ? 1.713 752.415 ? 1.918 ns/op >> multiints 1000 avgt 30 1610.199 ? 5.424 1426.112 ? 3.407 ns/op >> multiints 10000 avgt 30 16234.260 ? 26.789 14447.936 ? 26.345 ns/op >> multiints 100000 avgt 30 170726.025 ? 184.003 152587.649 ? 381.964 ns/op >> ---------------------------------------... > > Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: > > Moved zero check for cnt before TAIL per @RealFYang suggestion. Thanks for the update. So I gave it a second try and some tunning. 
I see up to 7%+ extra improvement on the licheepi-4a board (T-Head C910) with the following small add-on change (no obvious change on the Unmatched board).
This materializes the powers of 31 with direct `mv` instructions and avoids loading them from the `_arrays_hashcode_powers_of_31` array, which would involve calculating the array address.
We could then also remove the `_arrays_hashcode_powers_of_31` array.


diff --git a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
index 11cbcaa48a1..fe82b7a4e74 100644
--- a/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
+++ b/src/hotspot/cpu/riscv/c2_MacroAssembler_riscv.cpp
@@ -1493,16 +1493,16 @@ void C2_MacroAssembler::arrays_hashcode(Register ary, Register cnt, Register res

   beqz(cnt, DONE);

-  addiw(pow31_2, zr, 961); // [31^^2]
   andi(chunks, cnt, ~(stride-1));
   beqz(chunks, TAIL);

+  mv(pow31_4, 923521); // [31^^4]
+  mv(pow31_3, 29791); // [31^^3]
+  mv(pow31_2, 961); // [31^^2]
+
   slli(chunks_end, chunks, chunks_end_shift);
   add(chunks_end, ary, chunks_end);
   andi(cnt, cnt, stride-1); // don't forget about tail!

-  ld(pow31_4, ExternalAddress(StubRoutines::riscv::arrays_hashcode_powers_of_31()
-                              + 0 * sizeof(jint))); // [31^^3:31^^4]
-  srli(pow31_3, pow31_4, 32);

   bind(WIDE_LOOP);
   mulw(result, result, pow31_4); // 31^^4 * h


1. licheepi-4a / without add-on change:

Benchmark                  (size)  Mode  Cnt      Score    Error  Units
ArraysHashCode.bytes           1  avgt   15     21.327 ±  0.035  ns/op
ArraysHashCode.bytes          10  avgt   15     33.195 ±  0.166  ns/op
ArraysHashCode.bytes         100  avgt   15    154.175 ±  3.433  ns/op
ArraysHashCode.bytes       10000  avgt   15  12318.680 ± 25.131  ns/op
ArraysHashCode.chars           1  avgt   15     20.965 ±  0.598  ns/op
ArraysHashCode.chars          10  avgt   15     33.097 ±  0.117  ns/op
ArraysHashCode.chars         100  avgt   15    153.510 ±  0.280  ns/op
ArraysHashCode.chars       10000  avgt   15  11881.690 ± 44.507  ns/op
ArraysHashCode.ints            1  avgt   15     21.330 ±  0.070  ns/op
ArraysHashCode.ints           10  avgt   15     33.409 ±  0.225  ns/op
ArraysHashCode.ints          100  avgt   15    154.254 ±  0.650  ns/op
ArraysHashCode.ints        10000  avgt   15  11833.894 ± 73.945  ns/op
ArraysHashCode.multibytes      1  avgt   15      3.468 ±  0.046  ns/op
ArraysHashCode.multibytes     10  avgt   15     12.412 ±  0.126  ns/op
ArraysHashCode.multibytes    100  avgt   15     75.963 ±  0.267  ns/op
ArraysHashCode.multibytes  10000  avgt   15   6587.068 ± 53.064  ns/op
ArraysHashCode.multichars      1  avgt   15      3.437 ±  0.042  ns/op
ArraysHashCode.multichars     10  avgt   15     13.019 ±  0.118  ns/op
ArraysHashCode.multichars    100  avgt   15     82.657 ±  0.244  ns/op
ArraysHashCode.multichars  10000  avgt   15   6743.844 ± 80.474  ns/op
ArraysHashCode.multiints       1  avgt   15      3.409 ±  0.036  ns/op
ArraysHashCode.multiints      10  avgt   15     13.102 ±  0.140  ns/op
ArraysHashCode.multiints     100  avgt   15     82.864 ±  1.002  ns/op
ArraysHashCode.multiints   10000  avgt   15   7107.843 ± 69.506  ns/op
ArraysHashCode.multishorts     1  avgt   15      3.475 ±  0.033  ns/op
ArraysHashCode.multishorts    10  avgt   15     12.923 ±  0.108  ns/op
ArraysHashCode.multishorts   100  avgt   15     82.498 ±  0.450  ns/op
ArraysHashCode.multishorts 10000  avgt   15   6744.477 ± 22.576  ns/op
ArraysHashCode.shorts          1  avgt   15     21.337 ±  0.077  ns/op
ArraysHashCode.shorts         10  avgt   15     33.236 ±  0.114  ns/op
ArraysHashCode.shorts        100  avgt   15    154.099 ±  0.421  ns/op
ArraysHashCode.shorts      10000  avgt   15  11876.918 ± 41.767  ns/op


2. licheepi-4a / with add-on change:

Benchmark                  (size)  Mode  Cnt      Score    Error  Units
ArraysHashCode.bytes           1  avgt   15     21.311 ±  0.036  ns/op
ArraysHashCode.bytes          10  avgt   15     32.113 ±  0.124  ns/op
ArraysHashCode.bytes         100  avgt   15    150.476 ±  0.635  ns/op
ArraysHashCode.bytes       10000  avgt   15  11639.521 ± 16.383  ns/op
ArraysHashCode.chars           1  avgt   15     21.329 ±  0.041  ns/op
ArraysHashCode.chars          10  avgt   15     32.315 ±  0.466  ns/op
ArraysHashCode.chars         100  avgt   15    151.996 ±  1.008  ns/op
ArraysHashCode.chars       10000  avgt   15  10957.449 ± 23.898  ns/op
ArraysHashCode.ints            1  avgt   15     21.323 ±  0.035  ns/op
ArraysHashCode.ints           10  avgt   15     32.416 ±  0.170  ns/op
ArraysHashCode.ints          100  avgt   15    152.277 ±  0.555  ns/op
ArraysHashCode.ints        10000  avgt   15  11019.286 ± 53.589  ns/op
ArraysHashCode.multibytes      1  avgt   15      3.450 ±  0.026  ns/op
ArraysHashCode.multibytes     10  avgt   15     12.204 ±  0.171  ns/op
ArraysHashCode.multibytes    100  avgt   15     78.433 ±  0.357  ns/op
ArraysHashCode.multibytes  10000  avgt   15   6654.488 ± 19.664  ns/op
ArraysHashCode.multichars      1  avgt   15      3.443 ±  0.043  ns/op
ArraysHashCode.multichars     10  avgt   15     12.364 ±  0.087  ns/op
ArraysHashCode.multichars    100  avgt   15     78.246 ±  0.540  ns/op
ArraysHashCode.multichars  10000  avgt   15   6455.363 ± 30.115  ns/op
ArraysHashCode.multiints       1  avgt   15      3.441 ±  0.019  ns/op
ArraysHashCode.multiints      10  avgt   15     12.493 ±  0.063  ns/op
ArraysHashCode.multiints     100  avgt   15     78.485 ±  0.587  ns/op
ArraysHashCode.multiints   10000  avgt   15   6843.608 ± 82.197  ns/op
ArraysHashCode.multishorts     1  avgt   15      3.466 ±  0.029  ns/op
ArraysHashCode.multishorts    10  avgt   15     12.369 ±  0.144  ns/op
ArraysHashCode.multishorts   100  avgt   15     78.172 ±  0.580  ns/op
ArraysHashCode.multishorts 10000  avgt   15   6446.791 ± 13.104  ns/op
ArraysHashCode.shorts          1  avgt   15     20.971 ±  0.574  ns/op
ArraysHashCode.shorts         10  avgt   15     32.002 ±  0.642  ns/op
ArraysHashCode.shorts        100  avgt   15    152.359 ±  0.692  ns/op
ArraysHashCode.shorts      10000  avgt   15  10968.816 ± 31.404  ns/op

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1849459695

From duke at openjdk.org Mon Dec 11 07:40:26 2023
From: duke at openjdk.org (Yi-Fan Tsai)
Date: Mon, 11 Dec 2023 07:40:26 GMT
Subject: RFR: 8316197: Make tracing of inline cache available in unified logging [v4]
In-Reply-To:
References:
Message-ID: <1LJAteddFM6_ZpS0HmPDRy_gu4vORyR_yFVs5XhF71E=.475e4ca2-f9e7-4edc-889d-98ed26c0c9eb@github.com>

> This removes the develop flag `TraceICs` and makes the logs available via `-Xlog`.
> > Example: > > % java -Xlog:inlinecache=trace -version > [0.061s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739001d17: set_to_interpreted java.lang.StringLatin1.hashCode([B)I > [0.078s][trace][inlinecache] IC at 0x00007f3739004a87: monomorphic to compiled (rcvr klass = nullptr) > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739005dff: set_to_interpreted jdk.internal.util.ArraysSupport.vectorizedHashCode(Ljava/lang/Object;IIII)I > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900502f: set_to_interpreted jdk.internal.org.objectweb.asm.ByteVector.enlarge(I)V > [0.079s][trace][inlinecache] IC at 0x00007f373900502f: monomorphic to interpreter: {method} {0x00007f36f03e6318} 'enlarge' '(I)V' in 'jdk/internal/org/objectweb/asm/ByteVector' > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739006b0f: set_to_compiled 0x00007f3739002120 > [0.083s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900928f: set_to_interpreted java.lang.AbstractStringBuilder.newCapacity(I)I > [0.083s][trace][inlinecache] IC at 0x00007f373900928f: monomorphic to interpreter: {method} {0x00007f36f00cd170} 'newCapacity' '(I)I' in 'java/lang/AbstractStringBuilder' > ... 

Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:

  Update summary of TestTraceICs

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/17026/files
  - new: https://git.openjdk.org/jdk/pull/17026/files/5d782f75..23e4e739

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=17026&range=03
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17026&range=02-03

  Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/17026.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17026/head:pull/17026

PR: https://git.openjdk.org/jdk/pull/17026

From duke at openjdk.org Mon Dec 11 07:49:17 2023
From: duke at openjdk.org (Yi-Fan Tsai)
Date: Mon, 11 Dec 2023 07:49:17 GMT
Subject: RFR: 8316197: Make tracing of inline cache available in unified logging [v3]
In-Reply-To:
References:
Message-ID:

On Mon, 11 Dec 2023 04:46:11 GMT, David Holmes wrote:

>> Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Enable inlinecache logs in TestTraceICs
>
> test/hotspot/jtreg/compiler/arguments/TestTraceICs.java line 27:
>
>> 25:  * @test
>> 26:  * @bug 8217447
>> 27:  * @summary Test running TraceICs enabled.
>
> The summary is no longer applicable.
>
> But really this test seems somewhat pointless. It basically checks that `Xlog:inlinecache` is a valid log setting, but doesn't actually check anything interesting.

Updated. TestLogJIT is similar in only enabling the log setting without checking anything.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/17026#discussion_r1422043325

From stefank at openjdk.org Mon Dec 11 09:21:21 2023
From: stefank at openjdk.org (Stefan Karlsson)
Date: Mon, 11 Dec 2023 09:21:21 GMT
Subject: RFR: 8321713: Harmonize executeTestJvm with create[Limited]TestJavaProcessBuilder
Message-ID:

[JDK-8315097](https://bugs.openjdk.org/browse/JDK-8315097): 'Rename createJavaProcessBuilder' changed the name of the ProcessTools helper functions used to create `ProcessBuilder`s used to spawn new java test processes.

We now have `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. The former prepends JVM options from jtreg, while the latter doesn't.

With these functions it is common to see the following pattern in tests:

    ProcessBuilder pb = ProcessTools.createTestJavaProcessBuilder(...);
    OutputAnalyzer output = executeProcess(pb);


We have a couple of thin wrappers in `ProcessTools` that do exactly this, so that the code can be written as a one-liner:

    OutputAnalyzer output = ProcessTools.executeTestJvm();


I propose that we name these functions using the same naming scheme we used for `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. That is, we change `executeTestJvm` to `executeTestJava` and add a new `executeLimitedTestJava` function.
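
For readers who have not used these helpers, here is a minimal, self-contained sketch of the create/execute split described above. It does not use the real `ProcessTools` or `OutputAnalyzer` classes: `ProcessWrapperSketch`, `HARNESS_OPTS` and all method names are hypothetical stand-ins, and `-Xmx64m` is only a placeholder for whatever options a harness such as jtreg would pass along.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class ProcessWrapperSketch {
    // Placeholder for the options a test harness would prepend (assumption).
    static final List<String> HARNESS_OPTS = List.of("-Xmx64m");

    // "Test" flavour: harness options are prepended to the caller's arguments.
    static ProcessBuilder createTestBuilder(String... args) {
        return build(true, args);
    }

    // "Limited" flavour: only the caller's arguments are used.
    static ProcessBuilder createLimitedBuilder(String... args) {
        return build(false, args);
    }

    private static ProcessBuilder build(boolean withHarnessOpts, String... args) {
        List<String> cmd = new ArrayList<>();
        // Launch the same JDK that is running this code.
        cmd.add(Path.of(System.getProperty("java.home"), "bin", "java").toString());
        if (withHarnessOpts) {
            cmd.addAll(HARNESS_OPTS);
        }
        cmd.addAll(List.of(args));
        return new ProcessBuilder(cmd);
    }

    // Runs the process and returns its combined stdout/stderr as a string.
    static String execute(ProcessBuilder pb) throws IOException, InterruptedException {
        Process p = pb.redirectErrorStream(true).start();
        String out = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        p.waitFor();
        return out;
    }

    // Thin one-line wrappers, named to mirror the matching create* methods.
    static String executeTest(String... args) throws IOException, InterruptedException {
        return execute(createTestBuilder(args));
    }

    static String executeLimited(String... args) throws IOException, InterruptedException {
        return execute(createLimitedBuilder(args));
    }
}
```

With this shape, `ProcessWrapperSketch.executeTest("-version")` replaces the two-step create-then-execute pattern, and the `execute*` names line up with the `create*` names, which is the point of the proposed rename.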
------------- Commit messages: - 8321713: Harmonize executeTestJvm with create[Limited]TestJavaProcessBuilder Changes: https://git.openjdk.org/jdk/pull/17049/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17049&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321713 Stats: 217 lines in 88 files changed: 28 ins; 1 del; 188 mod Patch: https://git.openjdk.org/jdk/pull/17049.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17049/head:pull/17049 PR: https://git.openjdk.org/jdk/pull/17049 From stefank at openjdk.org Mon Dec 11 10:18:14 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Mon, 11 Dec 2023 10:18:14 GMT Subject: RFR: 8321631: Fix comments in access.hpp In-Reply-To: References: Message-ID: On Sat, 9 Dec 2023 07:59:02 GMT, Guoxiong Li wrote: > Hi all, > > This trivial patch fixes the comments about `atomic_xchg` and `atomic_xchg_at` in `access.hpp`. > And it removes the unnecessary content about `INSTANTIATE_HPP_ACCESS` > which has been aleready removed in [JDK-8230808](https://bugs.openjdk.org/browse/JDK-8230808). > > Thanks for the review. > > Best Regards, > -- Guoxiong Marked as reviewed by stefank (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17042#pullrequestreview-1774768137 From duke at openjdk.org Mon Dec 11 10:39:36 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Mon, 11 Dec 2023 10:39:36 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v13] In-Reply-To: References: Message-ID: > Hello All, > > Please review these changes to support _vectorizedHashCode intrinsic on > RISC-V platform. The patch adds the "scalar" code for the intrinsic without > usage of any RVV instruction but provides manual unrolling of the appropriate > loop. The code with usage of RVV instruction could be added as follow-up of > the patch or independently. > > Thanks, > -Yuri Gaevsky > > P.S. My OCA has been accepted recently (ygaevsky). 
>
> ### Correctness checks
>
> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux.
>
> ### Performance results (the numbers for non-ints are similar)
>
> #### StarFive JH7110 board:
>
>
> ArraysHashCode:                  without intrinsic        with intrinsic
> -------------------------------------------------------------------------------
> Benchmark (size)  Mode  Cnt       Score     Error       Score     Error  Units
> -------------------------------------------------------------------------------
> multiints      0  avgt   30       2.658 ±   0.001       2.661 ±   0.004  ns/op
> multiints      1  avgt   30       4.881 ±   0.011       4.892 ±   0.015  ns/op
> multiints      2  avgt   30      16.109 ±   0.041      10.451 ±   0.075  ns/op
> multiints      3  avgt   30      14.873 ±   0.068      11.753 ±   0.024  ns/op
> multiints      4  avgt   30      17.283 ±   0.078      13.176 ±   0.044  ns/op
> multiints      5  avgt   30      19.691 ±   0.136      14.723 ±   0.046  ns/op
> multiints      6  avgt   30      21.727 ±   0.166      15.463 ±   0.124  ns/op
> multiints      7  avgt   30      23.790 ±   0.126      18.298 ±   0.059  ns/op
> multiints      8  avgt   30      23.527 ±   0.116      18.267 ±   0.046  ns/op
> multiints      9  avgt   30      27.981 ±   0.303      20.453 ±   0.069  ns/op
> multiints     10  avgt   30      26.947 ±   0.215      20.541 ±   0.051  ns/op
> multiints     50  avgt   30      95.373 ±   0.588      69.238 ±   0.208  ns/op
> multiints    100  avgt   30     177.109 ±   0.525     137.852 ±   0.417  ns/op
> multiints    200  avgt   30     341.074 ±   1.363     296.832 ±   0.725  ns/op
> multiints    500  avgt   30     847.993 ±   1.713     752.415 ±   1.918  ns/op
> multiints   1000  avgt   30    1610.199 ±   5.424    1426.112 ±   3.407  ns/op
> multiints  10000  avgt   30   16234.260 ±  26.789   14447.936 ±  26.345  ns/op
> multiints 100000  avgt   30  170726.025 ± 184.003  152587.649 ± 381.964  ns/op
> -------------------------------------------------------------------------------
>
> #### T-Head RVB-ICE board:
>
>
> ArraysHashCode: ...

Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:

  Replaces memory loads for 31^^4 and 31^^3 constants with mv() instructions.
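
To make the role of the 31^^4, 31^^3 and 31^^2 constants concrete, the sketch below shows the arithmetic behind a 4-way unrolled hash loop in plain Java. It is an illustration only and not the intrinsic's code: the class and method names are made up, a stride of 4 is assumed from the constants named in the commit message, and the initial value 1 follows `java.util.Arrays.hashCode(int[])`.

```java
final class UnrolledHashSketch {
    // Powers of 31 held as immediates rather than loaded from a table.
    private static final int POW31_2 = 961;     // 31^2
    private static final int POW31_3 = 29791;   // 31^3
    private static final int POW31_4 = 923521;  // 31^4

    // Computes the same value as the sequential loop h = 31 * h + a[i],
    // but consumes four elements per iteration of the wide loop.
    static int hash(int[] a) {
        int h = 1;
        int i = 0;
        int chunks = a.length & ~3; // largest multiple of the stride (4)
        for (; i < chunks; i += 4) {
            // Four steps of h = 31 * h + x folded into one expression;
            // int overflow wraps identically in both formulations.
            h = POW31_4 * h
              + POW31_3 * a[i]
              + POW31_2 * a[i + 1]
              + 31 * a[i + 2]
              + a[i + 3];
        }
        for (; i < a.length; i++) { // tail: fewer than four elements remain
            h = 31 * h + a[i];
        }
        return h;
    }
}
```

Because the constants are fixed, the only question for the generated code is how they reach registers, which is what the change from memory loads to `mv()` addresses.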

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/16629/files
  - new: https://git.openjdk.org/jdk/pull/16629/files/94ad47e2..cf988f0a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16629&range=12
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16629&range=11-12

  Stats: 21 lines in 3 files changed: 4 ins; 17 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/16629.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16629/head:pull/16629

PR: https://git.openjdk.org/jdk/pull/16629

From duke at openjdk.org Mon Dec 11 10:42:24 2023
From: duke at openjdk.org (Yuri Gaevsky)
Date: Mon, 11 Dec 2023 10:42:24 GMT
Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v13]
In-Reply-To:
References:
Message-ID: <7DzdRCQRCTGiWBR0IoXgRRFS5lcTAxnMDIG7ocnujss=.8bebcecd-3912-4a7f-86e1-fbc2bdf16579@github.com>

On Mon, 11 Dec 2023 10:39:36 GMT, Yuri Gaevsky wrote:

>> Hello All,
>>
>> Please review these changes to support _vectorizedHashCode intrinsic on
>> RISC-V platform. The patch adds the "scalar" code for the intrinsic without
>> usage of any RVV instruction but provides manual unrolling of the appropriate
>> loop. The code with usage of RVV instruction could be added as follow-up of
>> the patch or independently.
>>
>> Thanks,
>> -Yuri Gaevsky
>>
>> P.S. My OCA has been accepted recently (ygaevsky).
>>
>> ### Correctness checks
>>
>> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux.
>>
>> ### Performance results (the numbers for non-ints are similar)
>>
>> #### StarFive JH7110 board:
>>
>>
>> ArraysHashCode:                  without intrinsic        with intrinsic
>> -------------------------------------------------------------------------------
>> Benchmark (size)  Mode  Cnt       Score     Error       Score     Error  Units
>> -------------------------------------------------------------------------------
>> multiints      0  avgt   30       2.658 ±   0.001       2.661 ±   0.004  ns/op
>> multiints      1  avgt   30       4.881 ±   0.011       4.892 ±   0.015  ns/op
>> multiints      2  avgt   30      16.109 ±   0.041      10.451 ±   0.075  ns/op
>> multiints      3  avgt   30      14.873 ±   0.068      11.753 ±   0.024  ns/op
>> multiints      4  avgt   30      17.283 ±   0.078      13.176 ±   0.044  ns/op
>> multiints      5  avgt   30      19.691 ±   0.136      14.723 ±   0.046  ns/op
>> multiints      6  avgt   30      21.727 ±   0.166      15.463 ±   0.124  ns/op
>> multiints      7  avgt   30      23.790 ±   0.126      18.298 ±   0.059  ns/op
>> multiints      8  avgt   30      23.527 ±   0.116      18.267 ±   0.046  ns/op
>> multiints      9  avgt   30      27.981 ±   0.303      20.453 ±   0.069  ns/op
>> multiints     10  avgt   30      26.947 ±   0.215      20.541 ±   0.051  ns/op
>> multiints     50  avgt   30      95.373 ±   0.588      69.238 ±   0.208  ns/op
>> multiints    100  avgt   30     177.109 ±   0.525     137.852 ±   0.417  ns/op
>> multiints    200  avgt   30     341.074 ±   1.363     296.832 ±   0.725  ns/op
>> multiints    500  avgt   30     847.993 ±   1.713     752.415 ±   1.918  ns/op
>> multiints   1000  avgt   30    1610.199 ±   5.424    1426.112 ±   3.407  ns/op
>> multiints  10000  avgt   30   16234.260 ±  26.789   14447.936 ±  26.345  ns/op
>> multiints 100000  avgt   30  170726.025 ± 184.003  152587.649 ± 381.964  ns/op
>> ---------------------------------------...
>
> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:
>
>   Replaces memory loads for 31^^4 and 31^^3 constants with mv() instructions.

Avoiding memory accesses is always a good idea, so your suggestion makes perfect sense, thanks, fixed. I've re-checked that the performance is at least not worse on THead/SiFive/StarFive.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1849782158

From fyang at openjdk.org Mon Dec 11 10:50:26 2023
From: fyang at openjdk.org (Fei Yang)
Date: Mon, 11 Dec 2023 10:50:26 GMT
Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v13]
In-Reply-To:
References:
Message-ID:

On Mon, 11 Dec 2023 10:39:36 GMT, Yuri Gaevsky wrote:

>> Hello All,
>>
>> Please review these changes to support _vectorizedHashCode intrinsic on
>> RISC-V platform. The patch adds the "scalar" code for the intrinsic without
>> usage of any RVV instruction but provides manual unrolling of the appropriate
>> loop. The code with usage of RVV instruction could be added as follow-up of
>> the patch or independently.
>>
>> Thanks,
>> -Yuri Gaevsky
>>
>> P.S. My OCA has been accepted recently (ygaevsky).
>>
>> ### Correctness checks
>>
>> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux.
>>
>> ### Performance results (the numbers for non-ints are similar)
>>
>> #### StarFive JH7110 board:
>>
>>
>> ArraysHashCode:                  without intrinsic        with intrinsic
>> -------------------------------------------------------------------------------
>> Benchmark (size)  Mode  Cnt       Score     Error       Score     Error  Units
>> -------------------------------------------------------------------------------
>> multiints      0  avgt   30       2.658 ±   0.001       2.661 ±   0.004  ns/op
>> multiints      1  avgt   30       4.881 ±   0.011       4.892 ±   0.015  ns/op
>> multiints      2  avgt   30      16.109 ±   0.041      10.451 ±   0.075  ns/op
>> multiints      3  avgt   30      14.873 ±   0.068      11.753 ±   0.024  ns/op
>> multiints      4  avgt   30      17.283 ±   0.078      13.176 ±   0.044  ns/op
>> multiints      5  avgt   30      19.691 ±   0.136      14.723 ±   0.046  ns/op
>> multiints      6  avgt   30      21.727 ±   0.166      15.463 ±   0.124  ns/op
>> multiints      7  avgt   30      23.790 ±   0.126      18.298 ±   0.059  ns/op
>> multiints      8  avgt   30      23.527 ±   0.116      18.267 ±   0.046  ns/op
>> multiints      9  avgt   30      27.981 ±   0.303      20.453 ±   0.069  ns/op
>> multiints     10  avgt   30      26.947 ±   0.215      20.541 ±   0.051  ns/op
>> multiints     50  avgt   30      95.373 ±   0.588      69.238 ±   0.208  ns/op
>> multiints    100  avgt   30     177.109 ±   0.525     137.852 ±   0.417  ns/op
>> multiints    200  avgt   30     341.074 ±   1.363     296.832 ±   0.725  ns/op
>> multiints    500  avgt   30     847.993 ±   1.713     752.415 ±   1.918  ns/op
>> multiints   1000  avgt   30    1610.199 ±   5.424    1426.112 ±   3.407  ns/op
>> multiints  10000  avgt   30   16234.260 ±  26.789   14447.936 ±  26.345  ns/op
>> multiints 100000  avgt   30  170726.025 ± 184.003  152587.649 ± 381.964  ns/op
>> ---------------------------------------...
>
> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision:
>
>   Replaces memory loads for 31^^4 and 31^^3 constants with mv() instructions.

Updated change looks good to me. Thanks for your patience.

-------------

Marked as reviewed by fyang (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16629#pullrequestreview-1774849521

From rehn at openjdk.org Mon Dec 11 10:51:16 2023
From: rehn at openjdk.org (Robbin Ehn)
Date: Mon, 11 Dec 2023 10:51:16 GMT
Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg
In-Reply-To:
References:
Message-ID:

On Thu, 30 Nov 2023 17:48:11 GMT, Ludovic Henry wrote:

> 8315856: RISC-V: Use Zacas extension for cmpxchg

I think this was what we had somewhere :)

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 2645:

> 2643:
> 2644:   bind(nope);
> 2645:   membar(AnyAny);

Suggestion:



src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 2647:

> 2645:   membar(AnyAny);
> 2646: }
> 2647:

Suggestion:

    membar(AnyAny);

-------------

PR Review: https://git.openjdk.org/jdk/pull/16910#pullrequestreview-1774842971
PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1422286487
PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1422286394

From duke at openjdk.org Mon Dec 11 11:04:21 2023
From: duke at openjdk.org (Yuri Gaevsky)
Date: Mon, 11 Dec 2023 11:04:21 GMT
Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v13]
In-Reply-To:
References:
Message-ID: <1prlr6fuJsdM7jGEeRDOKwsW4-VUMSZeXuDuW5OMhvQ=.393df56b-5f0b-4f0e-8ed5-8de6d3295fbd@github.com>

On Mon, 11 Dec 2023 10:47:52 GMT, Fei Yang wrote:

> Updated change looks good to me. Thanks for your patience.

Thank you for the thorough review and suggestions, @RealFYang.
------------- PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1849835967 From luhenry at openjdk.org Mon Dec 11 11:20:30 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Mon, 11 Dec 2023 11:20:30 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v2] In-Reply-To: References: Message-ID: > 8315856: RISC-V: Use Zacas extension for cmpxchg Ludovic Henry has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: - review - Merge branch 'master' of github.com:openjdk/jdk into upstream-zacas - 8315856: RISC-V: Use Zacas extension for cmpxchg ------------- Changes: https://git.openjdk.org/jdk/pull/16910/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16910&range=01 Stats: 197 lines in 5 files changed: 166 ins; 4 del; 27 mod Patch: https://git.openjdk.org/jdk/pull/16910.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16910/head:pull/16910 PR: https://git.openjdk.org/jdk/pull/16910 From rehn at openjdk.org Mon Dec 11 11:30:19 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Mon, 11 Dec 2023 11:30:19 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v2] In-Reply-To: References: Message-ID: On Mon, 11 Dec 2023 11:20:30 GMT, Ludovic Henry wrote: >> 8315856: RISC-V: Use Zacas extension for cmpxchg > > Ludovic Henry has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: > > - review > - Merge branch 'master' of github.com:openjdk/jdk into upstream-zacas > - 8315856: RISC-V: Use Zacas extension for cmpxchg src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 1079: > 1077: Assembler::Aqrl acquire = Assembler::relaxed, Assembler::Aqrl release = Assembler::relaxed); > 1078: > 1079: static bool far_branches() { Is this related ? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1422331784 From luhenry at openjdk.org Mon Dec 11 11:35:43 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Mon, 11 Dec 2023 11:35:43 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v3] In-Reply-To: References: Message-ID: > 8315856: RISC-V: Use Zacas extension for cmpxchg Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: fix merge conflict mistake ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16910/files - new: https://git.openjdk.org/jdk/pull/16910/files/b9d86703..16b42595 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16910&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16910&range=01-02 Stats: 6 lines in 1 file changed: 0 ins; 4 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16910.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16910/head:pull/16910 PR: https://git.openjdk.org/jdk/pull/16910 From luhenry at openjdk.org Mon Dec 11 11:35:46 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Mon, 11 Dec 2023 11:35:46 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v2] In-Reply-To: References: Message-ID: On Mon, 11 Dec 2023 11:27:04 GMT, Robbin Ehn wrote: >> Ludovic Henry has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits: >> >> - review >> - Merge branch 'master' of github.com:openjdk/jdk into upstream-zacas >> - 8315856: RISC-V: Use Zacas extension for cmpxchg > > src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 1079: > >> 1077: Assembler::Aqrl acquire = Assembler::relaxed, Assembler::Aqrl release = Assembler::relaxed); >> 1078: >> 1079: static bool far_branches() { > > Is this related ? Not at all, it's a merge-conflict-fix gone wrong. Let me fix right now. 

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1422335509

From aph at openjdk.org Mon Dec 11 13:58:17 2023
From: aph at openjdk.org (Andrew Haley)
Date: Mon, 11 Dec 2023 13:58:17 GMT
Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v3]
In-Reply-To: <6My9uP_jRGDPa31RMLN07O7DpheOFp6d2KgLSDieWU8=.bb4a52a3-b766-4167-9078-08dd9af564cf@github.com>
References: <6My9uP_jRGDPa31RMLN07O7DpheOFp6d2KgLSDieWU8=.bb4a52a3-b766-4167-9078-08dd9af564cf@github.com>
Message-ID:

On Mon, 4 Dec 2023 07:22:21 GMT, Robbin Ehn wrote:

>> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 2630:
>>
>>> 2628:     mv(tmp, oldv);
>>> 2629:     atomic_cas(tmp, newv, addr, Assembler::int64, Assembler::aq, Assembler::rl);
>>> 2630:     beq(tmp, oldv, succeed);
>>
>> The Zacas spec says: `The memory operation performed by an AMOCAS.W/D/Q, when not successful, has acquire semantics if aq bit is 1 but does not have release semantics, regardless of rl.`
>>
>> So when the CAS fails, I think we are lacking the needed semantics which is enforced at L2645 for the else block. Seems that we should place a `membar(AnyAny);` after the `beq` instruction, like our aarch64 counterpart [1].
>>
>> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L2758
>
> Good! Yea, we discussed that internally and I thought we fixed that, those changes seem to have been lost, thanks!

That `dmb` is present in the AArch64 port not because we want a release when the CAS fails: if it fails nothing was stored, so there is literally nothing for a subsequent load from that address to synchronize with.
It's there because of this re-ordering:


    // atomic_op (B)
    1:  ldaxr x0, [B]        // Exclusive load with acquire
        stlxr w1, x0, [B]    // Exclusive store with release
        cbnz  w1, 1b


It doesn't forbid orderings such as

Load [B] -> Load [C] -> Store [A] -> Store [B]

[See here](https://mail.openjdk.org/pipermail/aarch64-port-dev/2014-February/000706.html)

The Arm memory model has been strengthened, and this reasoning looks a bit shaky today. At the time we did not know if any of the usages of `cmpxchgptr` required "full barrier" semantics, so we put a full barrier in for safety's sake.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1422501212

From stefank at openjdk.org Mon Dec 11 14:03:36 2023
From: stefank at openjdk.org (Stefan Karlsson)
Date: Mon, 11 Dec 2023 14:03:36 GMT
Subject: RFR: 8321713: Harmonize executeTestJvm with create[Limited]TestJavaProcessBuilder [v2]
In-Reply-To:
References:
Message-ID:

> [JDK-8315097](https://bugs.openjdk.org/browse/JDK-8315097): 'Rename createJavaProcessBuilder' changed the name of the ProcessTools helper functions used to create `ProcessBuilder`s used to spawn new java test processes.
>
> We now have `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. The former prepends JVM options from jtreg, while the latter doesn't.
>
> With these functions it is common to see the following pattern in tests:
>
>     ProcessBuilder pb = ProcessTools.createTestJavaProcessBuilder(...);
>     OutputAnalyzer output = executeProcess(pb);
>
>
> We have a couple of thin wrappers in `ProcessTools` that do exactly this, so that the code can be written as a one-liner:
>
>     OutputAnalyzer output = ProcessTools.executeTestJvm();
>
>
> I propose that we name these functions using the same naming scheme we used for `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. That is, we change `executeTestJvm` to `executeTestJava` and add a new `executeLimitedTestJava` function.

Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:

  Fix impl and add test

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/17049/files
  - new: https://git.openjdk.org/jdk/pull/17049/files/080caef5..ad072e06

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=17049&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17049&range=00-01

  Stats: 54 lines in 2 files changed: 52 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/17049.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17049/head:pull/17049

PR: https://git.openjdk.org/jdk/pull/17049

From stefank at openjdk.org Mon Dec 11 14:06:43 2023
From: stefank at openjdk.org (Stefan Karlsson)
Date: Mon, 11 Dec 2023 14:06:43 GMT
Subject: RFR: 8321713: Harmonize executeTestJvm with create[Limited]TestJavaProcessBuilder [v3]
In-Reply-To:
References:
Message-ID:

> [JDK-8315097](https://bugs.openjdk.org/browse/JDK-8315097): 'Rename createJavaProcessBuilder' changed the name of the ProcessTools helper functions used to create `ProcessBuilder`s used to spawn new java test processes.
>
> We now have `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. The former prepends JVM options from jtreg, while the latter doesn't.
>
> With these functions it is common to see the following pattern in tests:
>
>     ProcessBuilder pb = ProcessTools.createTestJavaProcessBuilder(...);
>     OutputAnalyzer output = executeProcess(pb);
>
>
> We have a couple of thin wrappers in `ProcessTools` that do exactly this, so that the code can be written as a one-liner:
>
>     OutputAnalyzer output = ProcessTools.executeTestJvm();
>
>
> I propose that we name these functions using the same naming scheme we used for `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. That is, we change `executeTestJvm` to `executeTestJava` and add a new `executeLimitedTestJava` function.

Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:

  Test cleanup

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/17049/files
  - new: https://git.openjdk.org/jdk/pull/17049/files/ad072e06..5d488f42

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=17049&range=02
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17049&range=01-02

  Stats: 10 lines in 1 file changed: 1 ins; 8 del; 1 mod
  Patch: https://git.openjdk.org/jdk/pull/17049.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17049/head:pull/17049

PR: https://git.openjdk.org/jdk/pull/17049

From lkorinth at openjdk.org Mon Dec 11 14:23:19 2023
From: lkorinth at openjdk.org (Leo Korinth)
Date: Mon, 11 Dec 2023 14:23:19 GMT
Subject: RFR: 8321713: Harmonize executeTestJvm with create[Limited]TestJavaProcessBuilder [v3]
In-Reply-To:
References:
Message-ID:

On Mon, 11 Dec 2023 14:06:43 GMT, Stefan Karlsson wrote:

>> [JDK-8315097](https://bugs.openjdk.org/browse/JDK-8315097): 'Rename createJavaProcessBuilder' changed the name of the ProcessTools helper functions used to create `ProcessBuilder`s used to spawn new java test processes.
>>
>> We now have `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. The former prepends JVM options from jtreg, while the latter doesn't.
>>
>> With these functions it is common to see the following pattern in tests:
>>
>>     ProcessBuilder pb = ProcessTools.createTestJavaProcessBuilder(...);
>>     OutputAnalyzer output = executeProcess(pb);
>>
>>
>> We have a couple of thin wrappers in `ProcessTools` that do exactly this, so that the code can be written as a one-liner:
>>
>>     OutputAnalyzer output = ProcessTools.executeTestJvm();
>>
>>
>> I propose that we name these functions using the same naming scheme we used for `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. That is, we change `executeTestJvm` to `executeTestJava` and add a new `executeLimitedTestJava` function.
>
> Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision:
>
>   Test cleanup

Looks good to me.

-------------

Marked as reviewed by lkorinth (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/17049#pullrequestreview-1775250269

From avoitylov at openjdk.org Mon Dec 11 15:34:14 2023
From: avoitylov at openjdk.org (Aleksei Voitylov)
Date: Mon, 11 Dec 2023 15:34:14 GMT
Subject: RFR: 8321515: ARM32: Move method resolution information out of the cpCache properly
In-Reply-To:
References: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com>
Message-ID:

On Sat, 9 Dec 2023 04:59:02 GMT, Boris Ulasevich wrote:

>> Thanks to @matias9927, JDK-8320278 fixed the JDK build for ARM32 after JDK-8301997. This PR introduces some additional fixes that enable the ARM32 port to actually work.
>
> Thanks for fixing the ARM32 build!
> I see the change fixes wrong offset and register usage and does some cleanup. It is OK for me.
> Thanks again!

Thanks, @bulasevich.
Any Reviewers, please? I'd like to get the ARM32 port to work again.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/17017#issuecomment-1850315153

From mli at openjdk.org Mon Dec 11 15:53:25 2023
From: mli at openjdk.org (Hamlin Li)
Date: Mon, 11 Dec 2023 15:53:25 GMT
Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v13]
In-Reply-To:
References:
Message-ID:

On Mon, 11 Dec 2023 10:39:36 GMT, Yuri Gaevsky wrote:

>> Hello All,
>>
>> Please review these changes to support _vectorizedHashCode intrinsic on
>> RISC-V platform. The patch adds the "scalar" code for the intrinsic without
>> usage of any RVV instruction but provides manual unrolling of the appropriate
>> loop. The code with usage of RVV instruction could be added as follow-up of
>> the patch or independently.
>>
>> Thanks,
>> -Yuri Gaevsky
>>
>> P.S. My OCA has been accepted recently (ygaevsky).
>> >> ### Correctness checks >> >> Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux. >> >> ### Performance results (the numbers for non-ints are similar) >> >> #### StarFive JH7110 board: >> >>
>> ArraysHashCode:               without intrinsic         with intrinsic
>> -------------------------------------------------------------------------------
>> Benchmark  (size)  Mode  Cnt       Score     Error       Score     Error  Units
>> -------------------------------------------------------------------------------
>> multiints       0  avgt   30       2.658 ±   0.001       2.661 ±   0.004  ns/op
>> multiints       1  avgt   30       4.881 ±   0.011       4.892 ±   0.015  ns/op
>> multiints       2  avgt   30      16.109 ±   0.041      10.451 ±   0.075  ns/op
>> multiints       3  avgt   30      14.873 ±   0.068      11.753 ±   0.024  ns/op
>> multiints       4  avgt   30      17.283 ±   0.078      13.176 ±   0.044  ns/op
>> multiints       5  avgt   30      19.691 ±   0.136      14.723 ±   0.046  ns/op
>> multiints       6  avgt   30      21.727 ±   0.166      15.463 ±   0.124  ns/op
>> multiints       7  avgt   30      23.790 ±   0.126      18.298 ±   0.059  ns/op
>> multiints       8  avgt   30      23.527 ±   0.116      18.267 ±   0.046  ns/op
>> multiints       9  avgt   30      27.981 ±   0.303      20.453 ±   0.069  ns/op
>> multiints      10  avgt   30      26.947 ±   0.215      20.541 ±   0.051  ns/op
>> multiints      50  avgt   30      95.373 ±   0.588      69.238 ±   0.208  ns/op
>> multiints     100  avgt   30     177.109 ±   0.525     137.852 ±   0.417  ns/op
>> multiints     200  avgt   30     341.074 ±   1.363     296.832 ±   0.725  ns/op
>> multiints     500  avgt   30     847.993 ±   1.713     752.415 ±   1.918  ns/op
>> multiints    1000  avgt   30    1610.199 ±   5.424    1426.112 ±   3.407  ns/op
>> multiints   10000  avgt   30   16234.260 ±  26.789   14447.936 ±  26.345  ns/op
>> multiints  100000  avgt   30  170726.025 ± 184.003  152587.649 ± 381.964  ns/op
>> ---------------------------------------... > > Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: > > Replaces memory loads for 31^^4 and 31^^3 constants with mv() instructions. LGTM. Thanks! ------------- Marked as reviewed by mli (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16629#pullrequestreview-1775502190 From shade at openjdk.org Mon Dec 11 15:59:17 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 11 Dec 2023 15:59:17 GMT Subject: RFR: 8321515: ARM32: Move method resolution information out of the cpCache properly In-Reply-To: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> References: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> Message-ID: On Thu, 7 Dec 2023 10:25:05 GMT, Aleksei Voitylov wrote: > Thanks to @matias9927, JDK-8320278 fixed the JDK build for ARM32 after JDK-8301997. This PR introduces some additional fixes that enable the ARM32 port to actually work. Looks okay, but I have suggestions. Also changed target-version to 23 in JBS, so that we don't create the backports when we push. You might want to pull and merge the new master to get new mainline with jcheck config for 23. src/hotspot/cpu/arm/interp_masm_arm.cpp line 311: > 309: get_index_at_bcp(index, bcp_offset, cache /* as tmp */, sizeof(u2)); > 310: > 311: if (is_power_of_2(sizeof(ResolvedMethodEntry))) { I usually dislike introducing split like these, because one of the branches is effectively dead. Which also means it is effectively untested. Given this interpreter code, can we just leave the generic version unconditionally? src/hotspot/cpu/arm/interp_masm_arm.cpp line 318: > 316: add(cache, cache, AsmOperand(index, lsl, log2i_exact(sizeof(ResolvedMethodEntry)))); > 317: } > 318: else { Suggestion: } else { ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17017#pullrequestreview-1775511070 PR Comment: https://git.openjdk.org/jdk/pull/17017#issuecomment-1850366263 PR Review Comment: https://git.openjdk.org/jdk/pull/17017#discussion_r1422707417 PR Review Comment: https://git.openjdk.org/jdk/pull/17017#discussion_r1422707792 From duke at openjdk.org Mon Dec 11 16:22:25 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Mon, 11 Dec 2023 16:22:25 GMT Subject: RFR: 8318217: RISC-V: C2 VectorizedHashCode [v10] In-Reply-To: References: Message-ID: On Thu, 7 Dec 2023 14:16:33 GMT, Hamlin Li wrote: >> Yuri Gaevsky has updated the pull request incrementally with one additional commit since the last revision: >> >> Added two temp registers for loads; all loads in wide loop has been moved to the start of the loop. > > It's all in the pr https://github.com/openjdk/jdk/pull/16453, I don't have much other information. > The major perf opt method of that pr is `do the loads from the buffer more incrementally instead of all in one go`, which I think is the opposite you're doing here. Seems the major difference between this and that pr is, in that pr it has lots of more work to do between the `incremental load`. Thanks for your review/suggestions, @Hamlin-Li! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16629#issuecomment-1850410673 From kvn at openjdk.org Mon Dec 11 16:54:23 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Mon, 11 Dec 2023 16:54:23 GMT Subject: RFR: 8316197: Make tracing of inline cache available in unified logging [v3] In-Reply-To: References: Message-ID: On Mon, 11 Dec 2023 07:47:00 GMT, Yi-Fan Tsai wrote: >> test/hotspot/jtreg/compiler/arguments/TestTraceICs.java line 27: >> >>> 25: * @test >>> 26: * @bug 8217447 >>> 27: * @summary Test running TraceICs enabled. >> >> The summary is no longer applicable. >> >> But really this test seems some what pointless. 
It basically checks that `Xlog:inlinecache` is a valid log setting, but doesn't actually check anything interesting. > > Updated. TestLogJIT is similar in only enabling the log setting without checking anything. Yes, these tests could check the output, but that is a separate issue. Let's keep them and update them later. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17026#discussion_r1422799726 From rkennke at openjdk.org Mon Dec 11 17:05:34 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Mon, 11 Dec 2023 17:05:34 GMT Subject: RFR: 8305898: Alternative self-forwarding mechanism [v23] In-Reply-To: References: Message-ID: > Currently, the Serial, Parallel and G1 GCs store a pointer to self into object headers, when compaction fails, to indicate that the object has been looked at, but failed compaction into to-space. This is problematic for compact object headers ([JDK-8294992](https://bugs.openjdk.org/browse/JDK-8294992)) because it would (temporarily) over-write the crucial class information, which we need for heap parsing. I would like to propose an alternative: use the bit #3 (previously biased-locking bit) to indicate that an object is 'self-forwarded'. That preserves the crucial class information in the upper bits of the header until the full header gets restored. Roman Kennke has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 35 commits: - Merge branch 'JDK-8305896' into JDK-8305898 - Merge branch 'JDK-8305896' into JDK-8305898 - Merge branch 'JDK-8305896' into JDK-8305898 - Merge branch 'JDK-8305896' into JDK-8305898 - Merge branch 'JDK-8305896' into JDK-8305898 - Merge branch 'JDK-8305896' into JDK-8305898 - Merge branch 'JDK-8305896' into JDK-8305898 - Update comment about mark-word layout - Merge branch 'JDK-8305896' into JDK-8305898 - Fix tests on 32bit builds - ...
and 25 more: https://git.openjdk.org/jdk/compare/d67b42f5...1a4dda11 ------------- Changes: https://git.openjdk.org/jdk/pull/13779/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=13779&range=22 Stats: 101 lines in 8 files changed: 85 ins; 2 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/13779.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/13779/head:pull/13779 PR: https://git.openjdk.org/jdk/pull/13779 From luhenry at openjdk.org Mon Dec 11 18:28:44 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Mon, 11 Dec 2023 18:28:44 GMT Subject: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6] In-Reply-To: References: <3cK9QjVQNIgZoWWhrWKEb3XxfbLjprjRMBbStWegH7M=.6df92651-b97d-445a-aa42-302ea791bbea@github.com> Message-ID: On Mon, 4 Dec 2023 11:58:55 GMT, Magnus Ihse Bursie wrote: > I can't say anything for sure, but I picked up some positive vibes from our internal chat. I think the idea was that libsleef could potentially cover up vector math for all platforms that the current Intel lib solution is missing (basically, everything but linux+windows x64). So I this can be seen as a bit of a trial balloon if it is worth a more complete integration of libsleef in the JDK. I can add that we are interested to use that for Linux + RISC-V support given the RISC-V support was recently merged into sleef upstream. 
https://github.com/shibatch/sleef/pull/477 ------------- PR Comment: https://git.openjdk.org/jdk/pull/16234#issuecomment-1850636984 From jvernee at openjdk.org Mon Dec 11 18:38:55 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Mon, 11 Dec 2023 18:38:55 GMT Subject: RFR: 8320310: CompiledMethod::has_monitors flag can be incorrect [v4] In-Reply-To: <4efExybeWDkEbcsckI1Qdz8kpYFqd-Rbmt7oiWz5qlo=.d8d38d0e-affa-48dc-b963-45f958041c4e@github.com> References: <4efExybeWDkEbcsckI1Qdz8kpYFqd-Rbmt7oiWz5qlo=.d8d38d0e-affa-48dc-b963-45f958041c4e@github.com> Message-ID: > Currently, the `CompiledMethod::has_monitors` flag is set when either a `monitorenter` is parsed by C1, and `monitorexit` is parsed by C1 or C2 during method compilation. However, not necessarily every bytecode of a method is parsed, which means that we could miss all `monitorenter`/`monitorexit` byte codes in a method, while it actually does use monitors. This can lead to situations where a thread holds a monitor, but `has_monitors` for all frames is set to `false`, leading to an assertion failure in 'freeze_internal' in continuationFreezeThaw.cpp: > > assert(monitors_on_stack(current) == ((current->held_monitor_count() - current->jni_monitor_count()) > 0), > "Held monitor count and locks on stack invariant: " INT64_FORMAT " JNI: " INT64_FORMAT, (int64_t)current->held_monitor_count(), (int64_t)current->jni_monitor_count()); > > The proposed fix is to rely on `Method::has_monitor_bytecodes` to set the `has_monitors` flag when compiling, which is immune to issues where not all byte codes of a method are parsed during compilation. We can follow the pattern established for `has_reserved_stack_access`, which is similar. > > Note that this PR is based on: https://github.com/openjdk/jdk/pull/16416 which disables the assertion. The goal of this PR is to fix the issue, and then re-enable the assertion. 
> > Testing: Tier 1-4, `java/lang/Thread/virtual/stress/PinALot.java` Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision: re-enable assert again ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16799/files - new: https://git.openjdk.org/jdk/pull/16799/files/85b2d662..eb7f0f5a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16799&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16799&range=02-03 Stats: 17 lines in 1 file changed: 0 ins; 0 del; 17 mod Patch: https://git.openjdk.org/jdk/pull/16799.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16799/head:pull/16799 PR: https://git.openjdk.org/jdk/pull/16799 From lkorinth at openjdk.org Mon Dec 11 18:47:34 2023 From: lkorinth at openjdk.org (Leo Korinth) Date: Mon, 11 Dec 2023 18:47:34 GMT Subject: RFR: 8320750: Allow a testcase to run with muliple -Xlog In-Reply-To: References: Message-ID: On Mon, 27 Nov 2023 13:32:52 GMT, Leo Korinth wrote: > Running a testcase with muliple -Xlog crashes JTREG test cases. This is because `Collector.toMap` is not given a merge strategy. > > When the same argument is passed multiple times, I have added a merge strategy to use the latter value. This is similar to how it is implemented for `vm.opt.*` in JTREG. > > If the flag tested is `-Xlog`, replace the value part with a dummy value "NONEMPTY_TEST_SENTINEL". This is because in the case of multiple `-Xlog` all values are used, and JTREG does not give a satisfactory way to represent them. This dummy value should make it hard to try to `@require` on specific values by mistake. > > Tested with: > > @requires vm.opt.x.Xlog == "NONEMPTY_TEST_SENTINEL" > @requires vm.opt.x.Xlog == "NONEMPTY_TEST_SENTINELXXX" > @requires vm.opt.x.Xms == "3g" > > and > > JAVA_OPTIONS=-Xms3g -Xms4g > JAVA_OPTIONS=-Xms4g -Xms3g > JAVA_OPTIONS=-Xlog:gc* -Xlog:gc* > ``` > > Running tier1 Hi again and sorry for taking so much time. 
I have been thinking about this for a while, and have done some code searching inside jtreg etc. I have not really come to a conclusion, but let me try to summarize some of it. First I want to say that the idea (in the beginning) was not to test for the final value to use but to test that certain flags do not collide/conflict with flags added by the test case. For example, the different flags that choose a GC collide with each other, and I wanted to make similar checks for short options. I was about to try to change lots of GC test cases to use test APIs that propagate VM flags and I thought I would need that functionality. It does not help me to check the *final* VM flag values if the flags have conflicted before that. It also somewhat irritates me that jtreg has a mechanism to test for only `-XX` flags. *However*, after this review and after starting to look at certain flags, it seems that it is in /general/ alright to combine flags that obviously conflict. There seems to be no problem in telling java to use the interpreter and then later telling it to use the compiler (quite different from telling it to use serial GC followed by parallel, which is not allowed). Another thing I have discovered is that it seems to me that VM flags are *prepended* and not appended when using `@run` and when spawning a new test VM using `createTestJavaProcessBuilder`. It was the opposite of what I would have guessed. *It could be that these two observations make it easy enough to skip require flags* and just rely on user flags being prepended and test flags being appended, so that the test flags *override*. If it is also the case that we can mix and match all short flags (*I need your input on this*), it might make it much easier to convert test cases. I could remove the short flag detection in VMProps, but I would not be happy if I later see that certain of these flags *do* conflict. It might also be that it is good, for other reasons, to test against these flags with `@requires` lines.
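The "latter value wins" merge strategy from the PR description, together with the `-Xlog` sentinel, can be sketched in isolation. This is a minimal sketch and not the actual VMProps code: the class name `FlagMergeSketch`, the method `collect`, and the assumption that flags arrive already split into name/value pairs are all hypothetical. Only `Collectors.toMap` with a merge function and the sentinel string come from the discussion.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration class; the real parsing lives elsewhere.
final class FlagMergeSketch {
    static final String SENTINEL = "NONEMPTY_TEST_SENTINEL";

    // Input is assumed to be pre-split (name, value) pairs in command-line order.
    static Map<String, String> collect(List<Map.Entry<String, String>> flags) {
        return flags.stream().collect(Collectors.toMap(
                Map.Entry::getKey,
                // All -Xlog values are in effect at once, so hide them behind a dummy value.
                e -> e.getKey().equals("Xlog") ? SENTINEL : e.getValue(),
                // Without this merge function, a repeated key throws IllegalStateException.
                (earlier, later) -> later));
    }

    public static void main(String[] args) {
        Map<String, String> m = collect(List.of(
                Map.entry("Xms", "3g"),
                Map.entry("Xms", "4g"),
                Map.entry("Xlog", "gc*"),
                Map.entry("Xlog", "gc*")));
        System.out.println(m.get("Xms"));
        System.out.println(m.get("Xlog"));
    }
}
```

Keeping the later value mirrors the "appended flags override" behaviour discussed above, which is why the order of the input list matters.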
It is also an unfortunate consequence of prepending VM flags that it becomes extremely hard to know whether VM flags will be active in a test case (let's test whether this test case works with a 2-byte heap --- ooh, it does --- because the test case sets the heap size as well, and that overrides). But I digress. I am willing to remove the parsing of short flags if you know that it will not be useful; I think it might be better to just fix the bug, as I suspect that this functionality is useful. I also want to say that I am a bit conflicted and not really sure; I do like removing unneeded code. Feedback on whether/how short flags conflict would be valuable to me. Feedback on whether I have understood the prepending of VM flags correctly, both in jtreg and in our test framework, would be welcome as well. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16824#issuecomment-1850676602 From fparain at openjdk.org Mon Dec 11 18:52:36 2023 From: fparain at openjdk.org (Frederic Parain) Date: Mon, 11 Dec 2023 18:52:36 GMT Subject: RFR: 8320886: Unsafe_SetMemory0 is not guarded [v3] In-Reply-To: References: <5kRdxpEyFZLzxlyHpdHju1w9qLbm4OA6UkVZMr17nt0=.339b7543-574c-4a06-84e9-2ffb9d9a345a@github.com> Message-ID: On Fri, 8 Dec 2023 19:43:28 GMT, Jorn Vernee wrote: >> See JBS issue. >> >> Guard the memory access done in Unsafe_SetMemory0 to prevent a SIGBUS error from crashing the VM when a truncated memory mapped file is accessed. >> >> Testing: local `InternalErrorTest`, Tier 1-5 > > Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision: > > adjust whitespace Marked as reviewed by fparain (Reviewer).
------------- PR Review: https://git.openjdk.org/jdk/pull/16848#pullrequestreview-1775948001 From jvernee at openjdk.org Mon Dec 11 19:08:41 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Mon, 11 Dec 2023 19:08:41 GMT Subject: Integrated: 8320886: Unsafe_SetMemory0 is not guarded In-Reply-To: <5kRdxpEyFZLzxlyHpdHju1w9qLbm4OA6UkVZMr17nt0=.339b7543-574c-4a06-84e9-2ffb9d9a345a@github.com> References: <5kRdxpEyFZLzxlyHpdHju1w9qLbm4OA6UkVZMr17nt0=.339b7543-574c-4a06-84e9-2ffb9d9a345a@github.com> Message-ID: On Tue, 28 Nov 2023 12:09:12 GMT, Jorn Vernee wrote: > See JBS issue. > > Guard the memory access done in Unsafe_SetMemory0 to prevent a SIGBUS error from crashing the VM when a truncated memory mapped file is accessed. > > Testing: local `InternalErrorTest`, Tier 1-5 This pull request has now been integrated. Changeset: ce4b257f Author: Jorn Vernee URL: https://git.openjdk.org/jdk/commit/ce4b257fa539d35a7d14bba2d5d3342093d714e1 Stats: 22 lines in 3 files changed: 12 ins; 0 del; 10 mod 8320886: Unsafe_SetMemory0 is not guarded Reviewed-by: dholmes, fparain ------------- PR: https://git.openjdk.org/jdk/pull/16848 From jvernee at openjdk.org Mon Dec 11 19:25:39 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Mon, 11 Dec 2023 19:25:39 GMT Subject: [jdk22] RFR: 8320886: Unsafe_SetMemory0 is not guarded Message-ID: Hi all, This pull request contains a backport of commit [ce4b257f](https://github.com/openjdk/jdk/commit/ce4b257fa539d35a7d14bba2d5d3342093d714e1) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Jorn Vernee on 11 Dec 2023 and was reviewed by David Holmes and Frederic Parain. Thanks! 
------------- Commit messages: - Backport ce4b257fa539d35a7d14bba2d5d3342093d714e1 Changes: https://git.openjdk.org/jdk22/pull/8/files Webrev: https://webrevs.openjdk.org/?repo=jdk22&pr=8&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8320886 Stats: 22 lines in 3 files changed: 12 ins; 0 del; 10 mod Patch: https://git.openjdk.org/jdk22/pull/8.diff Fetch: git fetch https://git.openjdk.org/jdk22.git pull/8/head:pull/8 PR: https://git.openjdk.org/jdk22/pull/8 From dchuyko at openjdk.org Mon Dec 11 20:22:15 2023 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Mon, 11 Dec 2023 20:22:15 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v14] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. 
> > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it while taking the new directives stack into account. Re-compilation helps to prevent hot methods from executing in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 32 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - jcheck - ... and 22 more: https://git.openjdk.org/jdk/compare/ce4b257f...8b8b30ae ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=13 Stats: 372 lines in 15 files changed: 339 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From cjplummer at openjdk.org Mon Dec 11 21:43:36 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 11 Dec 2023 21:43:36 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v7] In-Reply-To: References: Message-ID: On Thu, 7 Dec 2023 22:49:55 GMT, Yi-Fan Tsai wrote: >> `jcmd Compiler.perfmap` uses the hard-coded file name for a perf map: `/tmp/perf-%d.map`. This change adds an optional argument for specifying a file name. >> >> `jcmd PID help Compiler.perfmap` shows the following usage. >> >> >> Compiler.perfmap >> Write map file for Linux perf tool. >> >> Impact: Low >> >> Syntax : Compiler.perfmap [] >> >> Arguments: >> filename : [optional] Name of the map file (STRING, no default value) >> >> >> The following section of man page is also updated. 
(`man -l src/jdk.jcmd/share/man/jcmd.1`) >> >> >> Compiler.perfmap [arguments] (Linux only) >> Write map file for Linux perf tool. >> >> Impact: Low >> >> arguments: >> >> • filename: (Optional) Name of the map file (STRING, no default value) >> >> If filename is not specified, a default file name is chosen using the pid of the target JVM process. For example, if the pid is 12345, then >> the default filename will be /tmp/perf-12345.map. > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > use default argument of write_perf_map The test needs a copyright update. Otherwise the changes look good. ------------- Marked as reviewed by cjplummer (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/15871#pullrequestreview-1776216918 From duke at openjdk.org Mon Dec 11 22:41:56 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Mon, 11 Dec 2023 22:41:56 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v8] In-Reply-To: References: Message-ID: > `jcmd Compiler.perfmap` uses the hard-coded file name for a perf map: `/tmp/perf-%d.map`. This change adds an optional argument for specifying a file name. > > `jcmd PID help Compiler.perfmap` shows the following usage. > > > Compiler.perfmap > Write map file for Linux perf tool. > > Impact: Low > > Syntax : Compiler.perfmap [] > > Arguments: > filename : [optional] Name of the map file (STRING, no default value) > > > The following section of man page is also updated. (`man -l src/jdk.jcmd/share/man/jcmd.1`) > > > Compiler.perfmap [arguments] (Linux only) > Write map file for Linux perf tool. > > Impact: Low > > arguments: > > • filename: (Optional) Name of the map file (STRING, no default value) > > If filename is not specified, a default file name is chosen using the pid of the target JVM process. For example, if the pid is 12345, then > the default filename will be /tmp/perf-12345.map.
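The default-name rule described above (use the given file name if present, otherwise derive one from the pid with the `/tmp/perf-%d.map` pattern) can be sketched as follows. This is an illustration only: the real logic is C++ inside HotSpot, and the class `PerfMapNameSketch` and method `perfMapFileName` are hypothetical names.

```java
// Hypothetical illustration class; the actual implementation is in HotSpot (C++).
final class PerfMapNameSketch {
    // Returns the requested name if one was given, else the pid-based default.
    static String perfMapFileName(long pid, String requested) {
        if (requested != null && !requested.isEmpty()) {
            return requested;
        }
        return String.format("/tmp/perf-%d.map", pid);
    }

    public static void main(String[] args) {
        // Default name for the current JVM process.
        System.out.println(perfMapFileName(ProcessHandle.current().pid(), null));
    }
}
```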
Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: Update copyright of PerfMapTest ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15871/files - new: https://git.openjdk.org/jdk/pull/15871/files/dbe223c5..e1b0b162 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15871&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15871&range=06-07 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/15871.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15871/head:pull/15871 PR: https://git.openjdk.org/jdk/pull/15871 From vlivanov at openjdk.org Tue Dec 12 00:17:14 2023 From: vlivanov at openjdk.org (Vladimir Ivanov) Date: Tue, 12 Dec 2023 00:17:14 GMT Subject: RFR: 8320310: CompiledMethod::has_monitors flag can be incorrect [v4] In-Reply-To: References: <4efExybeWDkEbcsckI1Qdz8kpYFqd-Rbmt7oiWz5qlo=.d8d38d0e-affa-48dc-b963-45f958041c4e@github.com> Message-ID: On Mon, 11 Dec 2023 18:38:55 GMT, Jorn Vernee wrote: >> Currently, the `CompiledMethod::has_monitors` flag is set when either a `monitorenter` is parsed by C1, and `monitorexit` is parsed by C1 or C2 during method compilation. However, not necessarily every bytecode of a method is parsed, which means that we could miss all `monitorenter`/`monitorexit` byte codes in a method, while it actually does use monitors. 
This can lead to situations where a thread holds a monitor, but `has_monitors` for all frames is set to `false`, leading to an assertion failure in 'freeze_internal' in continuationFreezeThaw.cpp: >> >> assert(monitors_on_stack(current) == ((current->held_monitor_count() - current->jni_monitor_count()) > 0), >> "Held monitor count and locks on stack invariant: " INT64_FORMAT " JNI: " INT64_FORMAT, (int64_t)current->held_monitor_count(), (int64_t)current->jni_monitor_count()); >> >> The proposed fix is to rely on `Method::has_monitor_bytecodes` to set the `has_monitors` flag when compiling, which is immune to issues where not all byte codes of a method are parsed during compilation. We can follow the pattern established for `has_reserved_stack_access`, which is similar. >> >> Note that this PR is based on: https://github.com/openjdk/jdk/pull/16416 which disables the assertion. The goal of this PR is to fix the issue, and then re-enable the assertion. >> >> Testing: Tier 1-4, `java/lang/Thread/virtual/stress/PinALot.java` > > Jorn Vernee has updated the pull request incrementally with one additional commit since the last revision: > > re-enable assert again Looks good. ------------- Marked as reviewed by vlivanov (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16799#pullrequestreview-1776382808 From jnimeh at openjdk.org Tue Dec 12 01:09:40 2023 From: jnimeh at openjdk.org (Jamil Nimeh) Date: Tue, 12 Dec 2023 01:09:40 GMT Subject: RFR: 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes Message-ID: This fix corrects an oversight in the ChaCha20 intrinsics delivered by JDK-8247645. An ifdef guard is now part of the x86 ChaCha20 intrinsics code which disables them by default on 32-bit platforms, as this architecture was not part of the feature delivery. 
------------- Commit messages: - Adjust 32-bit warning text - C2: Missing ChaCha20 stub for x86_32 leads to crashes Changes: https://git.openjdk.org/jdk/pull/17072/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17072&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321542 Stats: 8 lines in 1 file changed: 8 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17072.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17072/head:pull/17072 PR: https://git.openjdk.org/jdk/pull/17072 From jnimeh at openjdk.org Tue Dec 12 01:09:40 2023 From: jnimeh at openjdk.org (Jamil Nimeh) Date: Tue, 12 Dec 2023 01:09:40 GMT Subject: RFR: 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes In-Reply-To: References: Message-ID: On Tue, 12 Dec 2023 01:02:59 GMT, Jamil Nimeh wrote: > This fix corrects an oversight in the ChaCha20 intrinsics delivered by JDK-8247645. An ifdef guard is now part of the x86 ChaCha20 intrinsics code which disables them by default on 32-bit platforms, as this architecture was not part of the feature delivery. The bug on 32-bit without this fix is easily reproducible by running any of the ChaCha20 microbenchmarks, or via the AEADBufferTest microbenchmark. Likewise, once the fix is applied these benchmarks no longer crash the JVM. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17072#issuecomment-1851140657 From fyang at openjdk.org Tue Dec 12 01:20:32 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 12 Dec 2023 01:20:32 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v3] In-Reply-To: References: <6My9uP_jRGDPa31RMLN07O7DpheOFp6d2KgLSDieWU8=.bb4a52a3-b766-4167-9078-08dd9af564cf@github.com> Message-ID: On Mon, 11 Dec 2023 13:55:39 GMT, Andrew Haley wrote: >> Good! Yea, we discussed that internally and I thought we fixed that, those changes seems to have been lost, thanks! 
> > That `dmb` is not present in the AArch64 port because we want a release when the CAS fails, because if it fails nothing was stored, so there is literally nothing for a subsequent load from that address to synchronize with. > > It's there because of this re-ordering: > > > > > > // atomic_op (B) > 1: ldaxr x0, [B] // Exclusive load with acquire > > stlxr w1, x0, [B] // Exclusive store with release > cbnz w1, 1b > > > > It doesn't forbid orderings such as > > Load [B] -> Load [C] -> Store [A] -> Store [B] > > > [See here](https://mail.openjdk.org/pipermail/aarch64-port-dev/2014-February/000706.html) > > The Arm memory model has been strengthened, and this reasoning looks a bit shaky today. At the time we did not know if any of the usages of `cmpxchgptr`required "full barrier" semantics, so we put a full barrier in for safety's sake. Wow, thanks for finding that history. It's very helpful for us to understand the existence of this barrier. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1423296081 From duke at openjdk.org Tue Dec 12 02:25:32 2023 From: duke at openjdk.org (Liming Liu) Date: Tue, 12 Dec 2023 02:25:32 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v12] In-Reply-To: References: Message-ID: On Mon, 30 Oct 2023 06:47:15 GMT, Thomas Stuefe wrote: >> Liming Liu has updated the pull request incrementally with one additional commit since the last revision: >> >> Update the name of the method > > I don't have time to look at this for now and step back to wait for others. But I don't want to hold up this patch - if there are enough reviewers that ok it, please don't wait for me. > > As I wrote earlier, the patch itself is mechanically fine. The test has aspects I don't understand, and I am worried about concurrent usage of the about-to-be-pretouched area by other threads. Hi, @tstuefe , @kimbarrett , @dholmes-ora & @jdksjolen . Could you please continue to review this? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/15781#issuecomment-1851199350 From fyang at openjdk.org Tue Dec 12 03:11:28 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 12 Dec 2023 03:11:28 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v3] In-Reply-To: References: Message-ID: On Mon, 11 Dec 2023 11:35:43 GMT, Ludovic Henry wrote: >> 8315856: RISC-V: Use Zacas extension for cmpxchg > > Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: > > fix merge conflict mistake Changes requested by fyang (Reviewer). src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 2868: > 2866: > 2867: atomic_cas(old, tmp, aligned_addr, size, acquire, release); > 2868: bne(tmp, old, retry); Does it make sense to use `atomic_cas` here for this case? I don't see the benefit. The input `size` is supposed to be `int8` or `int16`, but the newly-added `atomic_cas` can only handle `int64`, `int32` and `uint32`. src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 2929: > 2927: > 2928: atomic_cas(tmp, new_val, addr, size, acquire, release); > 2929: bne(tmp, old, fail); Similar question as above here for this case. src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3009: > 3007: Register result) { > 3008: if (UseZacas) { > 3009: cmpxchg(addr, expected, new_val, size, acquire, release, result, false); The logic emitted by the original `MacroAssembler::cmpxchg_weak` returns a boolean value in register `result`. So shouldn't we set `result_as_bool` param to `true` when calling `cmpxchg` here? 
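The distinction raised in the last comment above, between a compare-and-swap that reports success as a boolean and one that hands back the value found in memory, can be illustrated with a small sketch. It uses `std::atomic` rather than the RISC-V `MacroAssembler`, and the names `cas_value_sketch` and `cas_bool_sketch` are hypothetical, chosen only for illustration; it says nothing about how the patch itself encodes the result register.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Value-returning convention: the caller gets back whatever was observed in
// memory and has to compare it with `expected` to learn whether the swap took
// place. (Hypothetical helper, not HotSpot code.)
int64_t cas_value_sketch(std::atomic<int64_t>& cell, int64_t expected, int64_t new_val) {
  // On failure compare_exchange_strong writes the observed value into
  // `expected`; on success `expected` already equals the old value.
  cell.compare_exchange_strong(expected, new_val);
  return expected;
}

// Boolean convention: the caller only learns whether the swap took place.
// (Hypothetical helper, not HotSpot code.)
bool cas_bool_sketch(std::atomic<int64_t>& cell, int64_t expected, int64_t new_val) {
  return cell.compare_exchange_strong(expected, new_val);
}
```

A caller written against one convention misreads the other: an old value of 0 returned by a successful swap would look like "failed" to code expecting a boolean.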
------------- PR Review: https://git.openjdk.org/jdk/pull/16910#pullrequestreview-1776497580 PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1423350176 PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1423352211 PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1423338735 From fyang at openjdk.org Tue Dec 12 03:27:32 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 12 Dec 2023 03:27:32 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6] In-Reply-To: References: Message-ID: On Fri, 8 Dec 2023 22:31:07 GMT, Olga Mikhaltsova wrote: > Yes, that's right, I verified output of this algorithm against the current java implementation on the full 32-bit range. Hi, can you share with us your test? I am also trying to understand how this works. Thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1423360084 From duke at openjdk.org Tue Dec 12 06:38:45 2023 From: duke at openjdk.org (Yuri Gaevsky) Date: Tue, 12 Dec 2023 06:38:45 GMT Subject: Integrated: 8318217: RISC-V: C2 VectorizedHashCode In-Reply-To: References: Message-ID: On Mon, 13 Nov 2023 12:32:51 GMT, Yuri Gaevsky wrote: > Hello All, > > Please review these changes to support _vectorizedHashCode intrinsic on > RISC-V platform. The patch adds the "scalar" code for the intrinsic without > usage of any RVV instruction but provides manual unrolling of the appropriate > loop. The code with usage of RVV instruction could be added as follow-up of > the patch or independently. > > Thanks, > -Yuri Gaevsky > > P.S. My OCA has been accepted recently (ygaevsky). > > ### Correctness checks > > Testing: tier1 tests successfully passed on a RISC-V StarFive JH7110 board with Linux. 
>
> ### Performance results (the numbers for non-ints are similar)
>
> #### StarFive JH7110 board:
>
>
> ArraysHashCode:                 without intrinsic      with intrinsic
> -------------------------------------------------------------------------------
> Benchmark  (size)  Mode  Cnt       Score     Error       Score     Error  Units
> -------------------------------------------------------------------------------
> multiints       0  avgt   30       2.658 ±   0.001       2.661 ±   0.004  ns/op
> multiints       1  avgt   30       4.881 ±   0.011       4.892 ±   0.015  ns/op
> multiints       2  avgt   30      16.109 ±   0.041      10.451 ±   0.075  ns/op
> multiints       3  avgt   30      14.873 ±   0.068      11.753 ±   0.024  ns/op
> multiints       4  avgt   30      17.283 ±   0.078      13.176 ±   0.044  ns/op
> multiints       5  avgt   30      19.691 ±   0.136      14.723 ±   0.046  ns/op
> multiints       6  avgt   30      21.727 ±   0.166      15.463 ±   0.124  ns/op
> multiints       7  avgt   30      23.790 ±   0.126      18.298 ±   0.059  ns/op
> multiints       8  avgt   30      23.527 ±   0.116      18.267 ±   0.046  ns/op
> multiints       9  avgt   30      27.981 ±   0.303      20.453 ±   0.069  ns/op
> multiints      10  avgt   30      26.947 ±   0.215      20.541 ±   0.051  ns/op
> multiints      50  avgt   30      95.373 ±   0.588      69.238 ±   0.208  ns/op
> multiints     100  avgt   30     177.109 ±   0.525     137.852 ±   0.417  ns/op
> multiints     200  avgt   30     341.074 ±   1.363     296.832 ±   0.725  ns/op
> multiints     500  avgt   30     847.993 ±   1.713     752.415 ±   1.918  ns/op
> multiints    1000  avgt   30    1610.199 ±   5.424    1426.112 ±   3.407  ns/op
> multiints   10000  avgt   30   16234.260 ±  26.789   14447.936 ±  26.345  ns/op
> multiints  100000  avgt   30  170726.025 ± 184.003  152587.649 ± 381.964  ns/op
> -------------------------------------------------------------------------------
>
> #### T-Head RVB-ICE board:
>
>
> ArraysHashCode: ...

This pull request has now been integrated.
Changeset: 6359b4ec Author: Yuri Gaevsky Committer: Vladimir Kempik URL: https://git.openjdk.org/jdk/commit/6359b4ec2303e9cd81f3cbcfdf1c3e015278cb7b Stats: 139 lines in 4 files changed: 139 ins; 0 del; 0 mod 8318217: RISC-V: C2 VectorizedHashCode Reviewed-by: mli, fyang ------------- PR: https://git.openjdk.org/jdk/pull/16629 From dholmes at openjdk.org Tue Dec 12 06:51:24 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 12 Dec 2023 06:51:24 GMT Subject: RFR: 8316197: Make tracing of inline cache available in unified logging [v4] In-Reply-To: <1LJAteddFM6_ZpS0HmPDRy_gu4vORyR_yFVs5XhF71E=.475e4ca2-f9e7-4edc-889d-98ed26c0c9eb@github.com> References: <1LJAteddFM6_ZpS0HmPDRy_gu4vORyR_yFVs5XhF71E=.475e4ca2-f9e7-4edc-889d-98ed26c0c9eb@github.com> Message-ID: On Mon, 11 Dec 2023 07:40:26 GMT, Yi-Fan Tsai wrote: >> This removes develop flag `TraceICs` and makes the logs available via `-Xlog`. >> >> Example: >> >> % java -Xlog:inlinecache=trace -version >> [0.061s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739001d17: set_to_interpreted java.lang.StringLatin1.hashCode([B)I >> [0.078s][trace][inlinecache] IC at 0x00007f3739004a87: monomorphic to compiled (rcvr klass = nullptr) >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739005dff: set_to_interpreted jdk.internal.util.ArraysSupport.vectorizedHashCode(Ljava/lang/Object;IIII)I >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900502f: set_to_interpreted jdk.internal.org.objectweb.asm.ByteVector.enlarge(I)V >> [0.079s][trace][inlinecache] IC at 0x00007f373900502f: monomorphic to interpreter: {method} {0x00007f36f03e6318} 'enlarge' '(I)V' in 'jdk/internal/org/objectweb/asm/ByteVector' >> [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739006b0f: set_to_compiled 0x00007f3739002120 >> [0.083s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900928f: set_to_interpreted java.lang.AbstractStringBuilder.newCapacity(I)I >> 
[0.083s][trace][inlinecache] IC at 0x00007f373900928f: monomorphic to interpreter: {method} {0x00007f36f00cd170} 'newCapacity' '(I)I' in 'java/lang/AbstractStringBuilder' >> ... > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Update summary of TestTraceICs Marked as reviewed by dholmes (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17026#pullrequestreview-1776760125 From dholmes at openjdk.org Tue Dec 12 07:13:41 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 12 Dec 2023 07:13:41 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v12] In-Reply-To: References: Message-ID: <2FLPtM9O7UhPPmfaHjc1dL38kXR94ODWiDvQd_SdvPo=.98133b3c-5ac9-4ccd-8c68-52b46cd1356b@github.com> On Tue, 12 Dec 2023 02:22:12 GMT, Liming Liu wrote: >> I don't have time to look at this for now and step back to wait for others. But I don't want to hold up this patch - if there are enough reviewers that ok it, please don't wait for me. >> >> As I wrote earlier, the patch itself is mechanically fine. The test has aspects I don't understand, and I am worried about concurrent usage of the about-to-be-pretouched area by other threads. > > Hi, @tstuefe , @kimbarrett , @dholmes-ora & @jdksjolen . Could you please continue to review this? Sorry @limingliu-ampere , as I stated originally I can't comment on the actual use of MADV_POPULATE_WRITE so can't Review this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15781#issuecomment-1851419463 From gli at openjdk.org Tue Dec 12 07:22:43 2023 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 12 Dec 2023 07:22:43 GMT Subject: RFR: 8321631: Fix comments in access.hpp In-Reply-To: References: Message-ID: On Sat, 9 Dec 2023 21:50:10 GMT, Erik Österlund wrote: >> Hi all, >> >> This trivial patch fixes the comments about `atomic_xchg` and `atomic_xchg_at` in `access.hpp`.
>> And it removes the unnecessary content about `INSTANTIATE_HPP_ACCESS` >> which has been already removed in [JDK-8230808](https://bugs.openjdk.org/browse/JDK-8230808). >> >> Thanks for the review. >> >> Best Regards, >> -- Guoxiong > > Marked as reviewed by eosterlund (Reviewer). @fisk @stefank Thanks for the reviews. Integrating. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17042#issuecomment-1851427675 From gli at openjdk.org Tue Dec 12 07:22:43 2023 From: gli at openjdk.org (Guoxiong Li) Date: Tue, 12 Dec 2023 07:22:43 GMT Subject: Integrated: 8321631: Fix comments in access.hpp In-Reply-To: References: Message-ID: On Sat, 9 Dec 2023 07:59:02 GMT, Guoxiong Li wrote: > Hi all, > > This trivial patch fixes the comments about `atomic_xchg` and `atomic_xchg_at` in `access.hpp`. > And it removes the unnecessary content about `INSTANTIATE_HPP_ACCESS` > which has been already removed in [JDK-8230808](https://bugs.openjdk.org/browse/JDK-8230808). > > Thanks for the review. > > Best Regards, > -- Guoxiong This pull request has now been integrated. Changeset: 973bcdab Author: Guoxiong Li URL: https://git.openjdk.org/jdk/commit/973bcdab81f754671de4c55656b8fb921bba4f61 Stats: 5 lines in 1 file changed: 0 ins; 1 del; 4 mod 8321631: Fix comments in access.hpp Reviewed-by: eosterlund, stefank ------------- PR: https://git.openjdk.org/jdk/pull/17042 From rehn at openjdk.org Tue Dec 12 08:34:23 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 12 Dec 2023 08:34:23 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v3] In-Reply-To: References: <6My9uP_jRGDPa31RMLN07O7DpheOFp6d2KgLSDieWU8=.bb4a52a3-b766-4167-9078-08dd9af564cf@github.com> Message-ID: On Tue, 12 Dec 2023 01:17:33 GMT, Fei Yang wrote: >> That `dmb` is not present in the AArch64 port because we want a release when the CAS fails, because if it fails nothing was stored, so there is literally nothing for a subsequent load from that address to synchronize with.
>> >> It's there because of this re-ordering: >> >> >> >> >> >> // atomic_op (B) >> 1: ldaxr x0, [B] // Exclusive load with acquire >> >> stlxr w1, x0, [B] // Exclusive store with release >> cbnz w1, 1b >> >> >> >> It doesn't forbid orderings such as >> >> Load [B] -> Load [C] -> Store [A] -> Store [B] >> >> >> [See here](https://mail.openjdk.org/pipermail/aarch64-port-dev/2014-February/000706.html) >> >> The Arm memory model has been strengthened, and this reasoning looks a bit shaky today. At the time we did not know if any of the usages of `cmpxchgptr`required "full barrier" semantics, so we put a full barrier in for safety's sake. > > Wow, thanks for finding that history. It's very helpful for us to understand the existence of this barrier. I have seen code, do not find it anymore in code base, which use cmpxchg for Atomic::release_store_fence(). (which is "new"); Atomic::store(_x, true); Atomic::store_release_fence(_one_way_barrier, true); bool z = Atomic::load(_z); But it is written as: Atomic::store(_x, true); Atomic::cmpxchg(_one_way_barrier, false, true); bool z = Atomic::load(_z); Where load Z happens after store X. But without release the store may happens after. As we say: "All of the atomic operations that imply a read-modify-write action guarantee a two-way memory barrier across that operation." I have look over the cmpxchg uses and not releasing 'seems' okay at first glance. But I think before this line changes we should release on failed CAS. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1423631241 From rehn at openjdk.org Tue Dec 12 08:44:34 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 12 Dec 2023 08:44:34 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v3] In-Reply-To: References: <6My9uP_jRGDPa31RMLN07O7DpheOFp6d2KgLSDieWU8=.bb4a52a3-b766-4167-9078-08dd9af564cf@github.com> Message-ID: On Tue, 12 Dec 2023 08:31:25 GMT, Robbin Ehn wrote: >> Wow, thanks for finding that history. 
It's very helpful for us to understand the existence of this barrier. > > I have seen code, do not find it anymore in code base, which use cmpxchg for Atomic::release_store_fence(). > (which is "new"); > > Atomic::store(_x, true); > Atomic::store_release_fence(_one_way_barrier, true); > bool z = Atomic::load(_z); > > But it is written as: > > Atomic::store(_x, true); > Atomic::cmpxchg(_one_way_barrier, false, true); > bool z = Atomic::load(_z); > > Where load Z happens after store X. But without release the store may happens after. > > As we say: > "All of the atomic operations that imply a read-modify-write action guarantee a two-way memory barrier across that operation." > > I have look over the cmpxchg uses and not releasing 'seems' okay at first glance. > But I think before this line changes we should release on failed CAS. Note I mention the hotspot runtime atomic here because cmpxchgptr is used to manipulate the markword which follows hotspot atomics, not Java volatiles. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1423644955 From chagedorn at openjdk.org Tue Dec 12 08:58:24 2023 From: chagedorn at openjdk.org (Christian Hagedorn) Date: Tue, 12 Dec 2023 08:58:24 GMT Subject: RFR: 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes In-Reply-To: References: Message-ID: On Tue, 12 Dec 2023 01:02:59 GMT, Jamil Nimeh wrote: > This fix corrects an oversight in the ChaCha20 intrinsics delivered by JDK-8247645. An ifdef guard is now part of the x86 ChaCha20 intrinsics code which disables them by default on 32-bit platforms, as this architecture was not part of the feature delivery. Marked as reviewed by chagedorn (Reviewer). 
src/hotspot/cpu/x86/vm_version_x86.cpp line 1152: > 1150: // No support currently for ChaCha20 intrinsics on 32-bit platforms > 1151: if (UseChaCha20Intrinsics) { > 1152: warning("Support for ChaCha20 intrinsics not available on this CPU."); Maybe we can adapt the message to follow the convention of the other warning messages for intrinsics like AES https://github.com/openjdk/jdk/blob/2611a49ea13ee7a07f22692c3a4b32856ec5898f/src/hotspot/cpu/x86/vm_version_x86.cpp#L1062-L1065 or CRC32C https://github.com/openjdk/jdk/blob/2611a49ea13ee7a07f22692c3a4b32856ec5898f/src/hotspot/cpu/x86/vm_version_x86.cpp#L1116-L1118 Suggestion: warning("ChaCha20 intrinsics are not available on this CPU."); Though I see that not all existing warning messages are consistent. Anyway, the bailout looks good to me. ------------- PR Review: https://git.openjdk.org/jdk/pull/17072#pullrequestreview-1776952817 PR Review Comment: https://git.openjdk.org/jdk/pull/17072#discussion_r1423662348 From shade at openjdk.org Tue Dec 12 09:28:34 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 12 Dec 2023 09:28:34 GMT Subject: RFR: 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes In-Reply-To: References: Message-ID: On Tue, 12 Dec 2023 01:02:59 GMT, Jamil Nimeh wrote: > This fix corrects an oversight in the ChaCha20 intrinsics delivered by JDK-8247645. An ifdef guard is now part of the x86 ChaCha20 intrinsics code which disables them by default on 32-bit platforms, as this architecture was not part of the feature delivery. Agreed with message suggestion, otherwise good. ------------- Marked as reviewed by shade (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17072#pullrequestreview-1777014512 From shade at openjdk.org Tue Dec 12 09:28:35 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 12 Dec 2023 09:28:35 GMT Subject: RFR: 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes In-Reply-To: References: Message-ID: On Tue, 12 Dec 2023 08:53:35 GMT, Christian Hagedorn wrote: >> This fix corrects an oversight in the ChaCha20 intrinsics delivered by JDK-8247645. An ifdef guard is now part of the x86 ChaCha20 intrinsics code which disables them by default on 32-bit platforms, as this architecture was not part of the feature delivery. > > src/hotspot/cpu/x86/vm_version_x86.cpp line 1152: > >> 1150: // No support currently for ChaCha20 intrinsics on 32-bit platforms >> 1151: if (UseChaCha20Intrinsics) { >> 1152: warning("Support for ChaCha20 intrinsics not available on this CPU."); > > Maybe we can adapt the message to follow the convention of the other warning messages for intrinsics like AES > https://github.com/openjdk/jdk/blob/2611a49ea13ee7a07f22692c3a4b32856ec5898f/src/hotspot/cpu/x86/vm_version_x86.cpp#L1062-L1065 > or CRC32C > https://github.com/openjdk/jdk/blob/2611a49ea13ee7a07f22692c3a4b32856ec5898f/src/hotspot/cpu/x86/vm_version_x86.cpp#L1116-L1118 > > Suggestion: > > warning("ChaCha20 intrinsics are not available on this CPU."); > > > Though I see that not all existing warning messages are consistent. > > Anyway, the bailout looks good to me. +1 on changing the message to "ChaCha20 intrinsics are not available on this CPU." 
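The bailout under review can be sketched in standalone C++ as follows. This is only an approximation of the behaviour described in the thread (warn and switch the intrinsic off when it was requested on a 32-bit build); `FlagsSketch`, `disable_chacha20_on_32bit_sketch` and the `is_64bit` parameter are hypothetical stand-ins rather than HotSpot code, and the warning text is the wording suggested in the review.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the VM flag discussed in the thread.
struct FlagsSketch {
  bool UseChaCha20Intrinsics;
};

// Returns the warning that would be printed, or an empty string if none.
// On a 64-bit build the flag is left untouched. On a 32-bit build a requested
// intrinsic is switched off, so compiled code never reaches for a stub that
// was not generated for that platform.
std::string disable_chacha20_on_32bit_sketch(FlagsSketch& flags, bool is_64bit) {
  if (is_64bit) {
    return "";
  }
  if (flags.UseChaCha20Intrinsics) {
    flags.UseChaCha20Intrinsics = false;
    return "ChaCha20 intrinsics are not available on this CPU.";
  }
  return "";
}
```

In the real patch the choice is made at compile time with an ifdef guard, as the PR description says; the runtime `is_64bit` parameter here only keeps the sketch testable on any host.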
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17072#discussion_r1423706642 From shade at openjdk.org Tue Dec 12 09:41:37 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 12 Dec 2023 09:41:37 GMT Subject: [jdk22] RFR: 8320886: Unsafe_SetMemory0 is not guarded In-Reply-To: References: Message-ID: On Mon, 11 Dec 2023 19:19:59 GMT, Jorn Vernee wrote: > Hi all, > > This pull request contains a backport of commit [ce4b257f](https://github.com/openjdk/jdk/commit/ce4b257fa539d35a7d14bba2d5d3342093d714e1) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Jorn Vernee on 11 Dec 2023 and was reviewed by David Holmes and Frederic Parain. > > Thanks! Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk22/pull/8#pullrequestreview-1777042832 From rehn at openjdk.org Tue Dec 12 10:27:35 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 12 Dec 2023 10:27:35 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr In-Reply-To: References: Message-ID: <4E1XPkx4Hgc2np3DGhr2YzKIWXBxhYBUv9E4aYQqpbc=.265c4fba-e076-4fc2-b4cd-ebedac8b3e73@github.com> On Wed, 29 Nov 2023 11:58:31 GMT, Gui Cao wrote: > MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header is non-trivial on linux-riscv64 platform. Passing t0 (aka x5) as temporary register to these functions can also be error prone. As a reserved scratch register, t0 is implicitly clobbered by various assembler functions. @robehn can you help review this PR? > This issue is used to track avoiding t0 as a temporary register in the following cases: > 1. avoid passing t0 as temp register to MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header. > 2. avoid passing t0 as temp register to x_load_barrier and x_load_barrier_slow_path function in x_riscv.ad > 3.
avoid passing t0 as temp register to z_store_barrier and z_color function in z_riscv.ad > > Note that I didn't touch MacroAssembler::cmpxchg because it seems to me that this function is designed that it allows t0 to be used as the result register. As the result register will be set on exits, there should be no risk when using t0 for receiving the result. > https://github.com/openjdk/jdk/blob/e44d4b24ed794957c47c140ab6f15544efa2b278/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L2910-L2925 > > ### Testing: > - [x] Run tier1-3 tests on qemu 8.1.50 with UseRVV (release) > - [x] Run tier1-3 tests with SiFive unmatched (release) I think we should not use any fixed regs, just add them to the nodes, like PPC have done. src/hotspot/cpu/riscv/gc/x/x_riscv.ad line 55: > 53: > 54: // Load Pointer > 55: instruct xLoadP(iRegPNoSp dst, memory mem, rFlagsReg cr) Add a tmpReg here, as t1 is also used by masm. ppc have added tempregs for all these I think we should also do that as both t0 and t1 can be used by masm. 
src/hotspot/cpu/riscv/gc/x/x_riscv.ad line 142: > 140: %} > 141: > 142: instruct xCompareAndExchangeP(iRegPNoSp res, indirect mem, iRegP oldval, iRegP newval, rFlagsReg cr) %{ So we fix them like this: diff --git a/src/hotspot/cpu/riscv/gc/x/x_riscv.ad b/src/hotspot/cpu/riscv/gc/x/x_riscv.ad index d90539123ff..e5383a695c5 100644 --- a/src/hotspot/cpu/riscv/gc/x/x_riscv.ad +++ b/src/hotspot/cpu/riscv/gc/x/x_riscv.ad @@ -74 +74 @@ instruct xLoadP(iRegPNoSp dst, memory mem, rFlagsReg cr) -instruct xCompareAndSwapP(iRegINoSp res, indirect mem, iRegP oldval, iRegP newval, rFlagsReg cr) %{ +instruct xCompareAndSwapP(iRegINoSp res, indirect mem, iRegP oldval, iRegP newval, iRegINoSp tmpreg, rFlagsReg cr) %{ @@ -78 +78 @@ instruct xCompareAndSwapP(iRegINoSp res, indirect mem, iRegP oldval, iRegP newva - effect(KILL cr, TEMP_DEF res); + effect(KILL cr, TEMP_DEF res, TEMP tmpreg); @@ -89,2 +89,2 @@ instruct xCompareAndSwapP(iRegINoSp res, indirect mem, iRegP oldval, iRegP newva - Assembler::relaxed /* acquire */, Assembler::rl /* release */, t1); - __ sub(t0, t1, $oldval$$Register); + Assembler::relaxed /* acquire */, Assembler::rl /* release */, $tmpreg$$Register); + __ sub(t0, $tmpreg$$Register, $oldval$$Register); @@ -95 +95 @@ instruct xCompareAndSwapP(iRegINoSp res, indirect mem, iRegP oldval, iRegP newva - __ andr(t0, t0, t1); + __ andr(t0, t0, $tmpreg$$Register); @@ -97 +97 @@ instruct xCompareAndSwapP(iRegINoSp res, indirect mem, iRegP oldval, iRegP newva - x_load_barrier_slow_path(_masm, this, Address($mem$$Register), t1 /* ref */, $res$$Register /* tmp */); + x_load_barrier_slow_path(_masm, this, Address($mem$$Register), $tmpreg$$Register /* ref */, $res$$Register /* tmp */); src/hotspot/cpu/riscv/gc/z/z_riscv.ad line 143: > 141: guarantee($mem$$disp == 0, "impossible encoding"); > 142: Address ref_addr($mem$$Register); > 143: z_color(_masm, this, $oldval_tmp$$Register, $oldval$$Register, t1); Here you can use newval_tmp instead t1, no? 
Don't need to add additional regs. ------------- PR Review: https://git.openjdk.org/jdk/pull/16880#pullrequestreview-1777113868 PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1423769200 PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1423784938 PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1423779627 From johan.sjolen at oracle.com Tue Dec 12 10:59:20 2023 From: johan.sjolen at oracle.com (Johan Sjolen) Date: Tue, 12 Dec 2023 10:59:20 +0000 Subject: CDS can archive classpath entries more than once when a JAR manifest has Class-Path attributes In-Reply-To: References: Message-ID: <0b9334db-6140-4bfe-8395-bc32de509d9a@oracle.com> Hi Steven, The issue you saw in NMT is an indication that something is calling NMT's API in an unexpected manner. This is probably a symptom of the CDS bug, so when that bug is fixed NMT will no longer crash the JVM. There are plans to make NMT more resilient against these sorts of misuses, but those changes are planned for JDK-23. Thank you for the in-depth research on this issue. All the best, Johan On 2023-12-08 18:56, Steven Schlansker wrote: > Hi hotspot-dev, > > Recently, we started experiencing JVM crashes [1] and inexplicable > IncompatibleClassChangeErrors in our testing environment. We use > custom classloaders, NMT, and app-CDS. 
> > # Internal Error (virtualMemoryTracker.cpp:403), pid=20, tid=128 > # Error: ShouldNotReachHere() > # > > # JRE version: OpenJDK Runtime Environment (Red_Hat-21.0.1.0.12-2) > (21.0.1+12) (build 21.0.1+12-LTS) > # Java VM: OpenJDK 64-Bit Server VM (Red_Hat-21.0.1.0.12-2) > (21.0.1+12-LTS, mixed mode, sharing, tiered, compressed oops, > compressed class ptrs, g1 gc, linux-amd64) > # Problematic frame: > # V [libjvm.so+0x104a06c] > VirtualMemoryTracker::add_reserved_region(unsigned char*, unsigned > long, NativeCallStack const&, MEMFLAGS)+0x6fc > > and, > > java.lang.IncompatibleClassChangeError: > com.paywholesail.components.util.ByteBuffers and > com.paywholesail.components.util.ByteBuffers$ByteBufferPuttable > disagree on InnerClasses attribute > (I checked with javap, and it looks the same to me...) > > At least for the ShouldNotReachHere, it looked like a definite JVM > bug, so I have been trying to create a reproducing test case to make a > good error report. I noticed that the crash only happens when NMT is > combined with Class Data Sharing. At this point, I read the logs > closely, and noticed: > > [0.139s][warning][cds ] shared class paths mismatch > [0.151s][warning][cds,dynamic] Unable to use shared archive. The top > archive failed to load: /.../prebake.jsa > > So, I compared the expected and actual class path as printed by the > JVM. In both cases, we run with `-cp lib/*` with a fixed set of > library jars. Imagine my surprise when I find that the only difference > is that the expected (archive-time) classpath includes > lib/stax-ex-1.8.jar *twice*. > > By running the generated shared archive file with `strings | grep`, I > am able to verify that the `lib/stax-ex-1.8.jar` entry indeed is > present in the archive twice. > > I fixed up my JDK build environment and started sprinkling new logging > and assertions through the archive creation code. 
> > It looks like ClassLoader::add_to_app_classpath_entries can either > check for duplicated classpath entries, or trust that the caller knows > the element is new. > This list of entries is built in part by > ClassLoader::setup_app_search_path, which enumerates the classpath and > adds entries one by one. In this case, duplicate checks are skipped, > presumably because we trust the initial classpath not to have > duplicates. > > When an element is added in add_to_app_classpath_entries, for each > jar, it calls process_jar_manifest. Among other things, this reads the > MANIFEST.MF and looks for Class-Path entries, and loads those too. > Indeed, our `jaxb-runtime` has such an entry for `stax-ex`. In this > case, it does guard against duplicate entries. > > I think there is a bug here: if a jar is added by a manifest's > Class-Path from a jar *before* we finish processing the initial app > class path, it can get added twice - first with a duplicate check via > the manifest, and then a second time without checking for duplicates > from the app classpath. > > I believe this is reproducible on latest 21.0.1+12 with the following > code and steps: > > A.java: > class A { > static { > System.err.println("A"); > } > } > > class B { > public static void main(String[] args) { > System.err.println("hi!"); > new A(); > } > } > > MANIFEST.MF: > Manifest-Version: 1.0 > Class-Path: B.jar > > > % mkdir lib > % javac A.java > % javac B.java > % jar -m META-INF/MANIFEST.MF -c -f lib/A.jar A.class > % jar cf lib/B.jar B.class > > % java -cp lib/B.jar:lib/A.jar -XX:ArchiveClassesAtExit=shared.jsa > -XX:NativeMemoryTracking=summary B > % strings shared.jsa| grep lib/ > lib/B.jar > lib/A.jar > > % java -cp lib/A.jar:lib/B.jar -XX:ArchiveClassesAtExit=shared2.jsa > -XX:NativeMemoryTracking=summary B > % strings shared2.jsa| grep lib/ > lib/A.jar > lib/B.jar > lib/B.jar > > When A.jar is loaded first, the Class-Path manifest entry adds B.jar. 
> Then, B.jar is added *again*, unconditionally. > When B.jar is loaded first, the app classpath entry is created first. > Then, the manifest entry is checked and since it is a duplicate, only > one entry is added. > > At this point I felt like I collected enough information to ask for > some expert advice. > Am I on the right track here, that this could be a bug resulting in > duplicate classpath entries in the archive classpath, if a dependent > jar comes in via a manifest class-path entry before the app classpath > finishes processing? Could that possibly be the source of our > assertion failures and IncompatibleClassChangeErrors? > > As a related question, this makes me worry that using `-cp lib/*` > might implicitly embed the filesystem enumeration order in the > archive. Maybe the classpath order is not important when verifying, > but at the very least, the wildcard enumeration order influences the > build in a way I did not expect. > > If my analysis sounds plausible, I can submit it via the Java bug system. > > Thank you for any consideration and advice. Best, > Steven > > [1] https://gist.github.com/stevenschlansker/12d1eaeb363ae135c88a965048353b0e From avoitylov at openjdk.org Tue Dec 12 11:36:03 2023 From: avoitylov at openjdk.org (Aleksei Voitylov) Date: Tue, 12 Dec 2023 11:36:03 GMT Subject: RFR: 8321515: ARM32: Move method resolution information out of the cpCache properly [v2] In-Reply-To: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> References: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> Message-ID: > Thanks to @matias9927, JDK-8320278 fixed the JDK build for ARM32 after JDK-8301997. This PR introduces some additional fixes that enable the ARM32 port to actually work. 
Aleksei Voitylov has updated the pull request incrementally with one additional commit since the last revision: review comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17017/files - new: https://git.openjdk.org/jdk/pull/17017/files/79aa66ec..f1f7243d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17017&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17017&range=00-01 Stats: 19 lines in 1 file changed: 1 ins; 9 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/17017.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17017/head:pull/17017 PR: https://git.openjdk.org/jdk/pull/17017 From avoitylov at openjdk.org Tue Dec 12 12:02:49 2023 From: avoitylov at openjdk.org (Aleksei Voitylov) Date: Tue, 12 Dec 2023 12:02:49 GMT Subject: RFR: 8321515: ARM32: Move method resolution information out of the cpCache properly [v3] In-Reply-To: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> References: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> Message-ID: > Thanks to @matias9927, JDK-8320278 fixed the JDK build for ARM32 after JDK-8301997. This PR introduces some additional fixes that enable the ARM32 port to actually work. Aleksei Voitylov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge branch 'openjdk:master' into JDK-8321515 - review comments - JDK-8321515 implementation ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17017/files - new: https://git.openjdk.org/jdk/pull/17017/files/f1f7243d..df046137 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17017&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17017&range=01-02 Stats: 21273 lines in 342 files changed: 17769 ins; 2780 del; 724 mod Patch: https://git.openjdk.org/jdk/pull/17017.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17017/head:pull/17017 PR: https://git.openjdk.org/jdk/pull/17017 From avoitylov at openjdk.org Tue Dec 12 12:10:33 2023 From: avoitylov at openjdk.org (Aleksei Voitylov) Date: Tue, 12 Dec 2023 12:10:33 GMT Subject: RFR: 8321515: ARM32: Move method resolution information out of the cpCache properly [v3] In-Reply-To: References: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> Message-ID: On Mon, 11 Dec 2023 15:54:04 GMT, Aleksey Shipilev wrote: >> Aleksei Voitylov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into JDK-8321515 >> - review comments >> - JDK-8321515 implementation > > src/hotspot/cpu/arm/interp_masm_arm.cpp line 311: > >> 309: get_index_at_bcp(index, bcp_offset, cache /* as tmp */, sizeof(u2)); >> 310: >> 311: if (is_power_of_2(sizeof(ResolvedMethodEntry))) { > > I usually dislike introducing split like these, because one of the branches is effectively dead. Which also means it is effectively untested. Given this interpreter code, can we just leave the generic version unconditionally? Yes, it's gone in the new version. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17017#discussion_r1423899184 From jsjolen at openjdk.org Tue Dec 12 12:37:32 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 12 Dec 2023 12:37:32 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v12] In-Reply-To: References: Message-ID: On Mon, 30 Oct 2023 06:20:07 GMT, Liming Liu wrote: >> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). >> >> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: >> >> >> >> >> >> >> >> >> >> >> >>
Kernel   -XX:-TransparentHugePages   -XX:+TransparentHugePages
         Unpatched   Patched         Unpatched   Patched
4.18     11.30       11.30           0.25        0.25
5.13     0.22        0.22            3.42        3.42
6.1      0.27        0.33            3.54        0.33
> > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Update the name of the method Hi, I'll take a look. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15781#issuecomment-1851952120 From shade at openjdk.org Tue Dec 12 13:46:35 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Tue, 12 Dec 2023 13:46:35 GMT Subject: RFR: 8321515: ARM32: Move method resolution information out of the cpCache properly [v3] In-Reply-To: References: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> Message-ID: On Tue, 12 Dec 2023 12:02:49 GMT, Aleksei Voitylov wrote: >> Thanks to @matias9927, JDK-8320278 fixed the JDK build for ARM32 after JDK-8301997. This PR introduces some additional fixes that enable the ARM32 port to actually work. > > Aleksei Voitylov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'openjdk:master' into JDK-8321515 > - review comments > - JDK-8321515 implementation Still good. ------------- Marked as reviewed by shade (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17017#pullrequestreview-1777504712 From jsjolen at openjdk.org Tue Dec 12 13:48:31 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Tue, 12 Dec 2023 13:48:31 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v12] In-Reply-To: References: Message-ID: On Mon, 30 Oct 2023 06:20:07 GMT, Liming Liu wrote: >> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). 
>> >> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: >> >> >> >> >> >> >> >> >> >> >> >>
Kernel   -XX:-TransparentHugePages   -XX:+TransparentHugePages
         Unpatched   Patched         Unpatched   Patched
4.18     11.30       11.30           0.25        0.25
5.13     0.22        0.22            3.42        3.42
6.1      0.27        0.33            3.54        0.33
> > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Update the name of the method I'm on board with this change, I think it's a good idea. My only concern is the same as Thomas, what are the guarantees given when concurrently touching the memory and `madvise`:ing? From reading the Linux kernel code, this seems fine, but it doesn't seem like it's guaranteed by anything to work correctly. AFAIK, the concurrent pre-touch + usage pattern doesn't exist in the OpenJDK yet, so is this just a wishlist item anyway? @kimbarrett, could you provide some clarity on this situation? ------------- PR Comment: https://git.openjdk.org/jdk/pull/15781#issuecomment-1852068777 From jkern at openjdk.org Tue Dec 12 14:05:48 2023 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 12 Dec 2023 14:05:48 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: > On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. > > This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). 
> > We propose a different, cleaner way of handling this: > > - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. > - Cache dl handles; repeated opening of a library should return the cached handle. > - Increase handle-local ref counter on open, Decrease it on close > - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). > > This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: followed the proposals ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16920/files - new: https://git.openjdk.org/jdk/pull/16920/files/2d32c43b..b7676822 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=02-03 Stats: 485 lines in 6 files changed: 329 ins; 149 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/16920.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16920/head:pull/16920 PR: https://git.openjdk.org/jdk/pull/16920 From jnimeh at openjdk.org Tue Dec 12 14:24:34 2023 From: jnimeh at openjdk.org (Jamil Nimeh) Date: Tue, 12 Dec 2023 14:24:34 GMT Subject: RFR: 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes In-Reply-To: References: Message-ID: On Tue, 12 Dec 2023 09:24:48 GMT, Aleksey Shipilev wrote: >> src/hotspot/cpu/x86/vm_version_x86.cpp line 1152: >> >>> 1150: // No support currently for ChaCha20 intrinsics on 32-bit platforms >>> 1151: if (UseChaCha20Intrinsics) { >>> 1152: warning("Support for ChaCha20 intrinsics not available on this CPU."); >> >> Maybe we can adapt the message to follow the convention of the other warning messages for intrinsics like AES >> 
https://github.com/openjdk/jdk/blob/2611a49ea13ee7a07f22692c3a4b32856ec5898f/src/hotspot/cpu/x86/vm_version_x86.cpp#L1062-L1065 >> or CRC32C >> https://github.com/openjdk/jdk/blob/2611a49ea13ee7a07f22692c3a4b32856ec5898f/src/hotspot/cpu/x86/vm_version_x86.cpp#L1116-L1118 >> >> Suggestion: >> >> warning("ChaCha20 intrinsics are not available on this CPU."); >> >> >> Though I see that not all existing warning messages are consistent. >> >> Anyway, the bailout looks good to me. > > +1 on changing the message to "ChaCha20 intrinsics are not available on this CPU." Makes sense to me. Will fix. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17072#discussion_r1424063190 From jvernee at openjdk.org Tue Dec 12 14:29:32 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Tue, 12 Dec 2023 14:29:32 GMT Subject: [jdk22] RFR: 8320886: Unsafe_SetMemory0 is not guarded In-Reply-To: References: Message-ID: On Mon, 11 Dec 2023 19:19:59 GMT, Jorn Vernee wrote: > Hi all, > > This pull request contains a backport of commit [ce4b257f](https://github.com/openjdk/jdk/commit/ce4b257fa539d35a7d14bba2d5d3342093d714e1) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Jorn Vernee on 11 Dec 2023 and was reviewed by David Holmes and Frederic Parain. > > Thanks! Thanks for the review. 
------------- PR Comment: https://git.openjdk.org/jdk22/pull/8#issuecomment-1852137563 From jvernee at openjdk.org Tue Dec 12 14:29:33 2023 From: jvernee at openjdk.org (Jorn Vernee) Date: Tue, 12 Dec 2023 14:29:33 GMT Subject: [jdk22] Integrated: 8320886: Unsafe_SetMemory0 is not guarded In-Reply-To: References: Message-ID: <4eHqACJpYEaznzcI_7NqmtGyf828xStq-kStUPhbQmc=.062b16dc-662c-439b-806a-30601a4922b5@github.com> On Mon, 11 Dec 2023 19:19:59 GMT, Jorn Vernee wrote: > Hi all, > > This pull request contains a backport of commit [ce4b257f](https://github.com/openjdk/jdk/commit/ce4b257fa539d35a7d14bba2d5d3342093d714e1) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Jorn Vernee on 11 Dec 2023 and was reviewed by David Holmes and Frederic Parain. > > Thanks! This pull request has now been integrated. Changeset: 9f0469b9 Author: Jorn Vernee URL: https://git.openjdk.org/jdk22/commit/9f0469b94a97886e4ac0ee6cb870763430a1e487 Stats: 22 lines in 3 files changed: 12 ins; 0 del; 10 mod 8320886: Unsafe_SetMemory0 is not guarded Reviewed-by: shade Backport-of: ce4b257fa539d35a7d14bba2d5d3342093d714e1 ------------- PR: https://git.openjdk.org/jdk22/pull/8 From eastigeevich at openjdk.org Tue Dec 12 14:38:22 2023 From: eastigeevich at openjdk.org (Evgeny Astigeevich) Date: Tue, 12 Dec 2023 14:38:22 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v8] In-Reply-To: References: Message-ID: <0g42xTuuwIpZZ8aF4IMVWt6T1tI-ExyeSY31pI6Wv80=.df942a73-d418-4548-b6e1-f344e4896bed@github.com> On Mon, 11 Dec 2023 22:41:56 GMT, Yi-Fan Tsai wrote: >> `jcmd Compiler.perfmap` uses the hard-coded file name for a perf map: `/tmp/perf-%d.map`. This change adds an optional argument for specifying a file name. >> >> `jcmd PID help Compiler.perfmap` shows the following usage. >> >> >> Compiler.perfmap >> Write map file for Linux perf tool. 
>> >> Impact: Low >> >> Syntax : Compiler.perfmap [] >> >> Arguments: >> filename : [optional] Name of the map file (STRING, no default value) >> >> >> The following section of man page is also updated. (`man -l src/jdk.jcmd/share/man/jcmd.1`) >> >> >> Compiler.perfmap [arguments] (Linux only) >> Write map file for Linux perf tool. >> >> Impact: Low >> >> arguments: >> >> ? filename: (Optional) Name of the map file (STRING, no default value) >> >> If filename is not specified, a default file name is chosen using the pid of the target JVM process. For example, if the pid is 12345, then >> the default filename will be /tmp/perf-12345.map. > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright of PerfMapTest lgtm ------------- Marked as reviewed by eastigeevich (Committer). PR Review: https://git.openjdk.org/jdk/pull/15871#pullrequestreview-1777620332 From jnimeh at openjdk.org Tue Dec 12 14:41:07 2023 From: jnimeh at openjdk.org (Jamil Nimeh) Date: Tue, 12 Dec 2023 14:41:07 GMT Subject: RFR: 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes [v2] In-Reply-To: References: Message-ID: > This fix corrects an oversight in the ChaCha20 intrinsics delivered by JDK-8247645. An ifdef guard is now part of the x86 ChaCha20 intrinsics code which disables them by default on 32-bit platforms, as this architecture was not part of the feature delivery. 
Jamil Nimeh has updated the pull request incrementally with one additional commit since the last revision: Modify warning message on 32-bit ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17072/files - new: https://git.openjdk.org/jdk/pull/17072/files/fe0d4461..ff55249e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17072&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17072&range=00-01 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17072.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17072/head:pull/17072 PR: https://git.openjdk.org/jdk/pull/17072 From jnimeh at openjdk.org Tue Dec 12 14:41:09 2023 From: jnimeh at openjdk.org (Jamil Nimeh) Date: Tue, 12 Dec 2023 14:41:09 GMT Subject: Integrated: 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes In-Reply-To: References: Message-ID: On Tue, 12 Dec 2023 01:02:59 GMT, Jamil Nimeh wrote: > This fix corrects an oversight in the ChaCha20 intrinsics delivered by JDK-8247645. An ifdef guard is now part of the x86 ChaCha20 intrinsics code which disables them by default on 32-bit platforms, as this architecture was not part of the feature delivery. This pull request has now been integrated. 
Changeset: 5718039a Author: Jamil Nimeh URL: https://git.openjdk.org/jdk/commit/5718039a46ae51fa9b7042fe7163e3637e981b05 Stats: 8 lines in 1 file changed: 8 ins; 0 del; 0 mod 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes Reviewed-by: chagedorn, shade ------------- PR: https://git.openjdk.org/jdk/pull/17072 From luhenry at openjdk.org Tue Dec 12 15:17:41 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Tue, 12 Dec 2023 15:17:41 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v3] In-Reply-To: References: Message-ID: On Tue, 12 Dec 2023 03:03:00 GMT, Fei Yang wrote: >> Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: >> >> fix merge conflict mistake > > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 2868: > >> 2866: >> 2867: atomic_cas(old, tmp, aligned_addr, size, acquire, release); >> 2868: bne(tmp, old, retry); > > Does it make sense to use `atomic_cas` here for this case? I don't see the benefit. The input `size` is supposed to be `int8` or `int16`, but the newly-added `atomic_cas` can only handle `int64`, `int32` and `uint32`. It does make sense to use `atomic_cas` here all to avoid `lr/sc`, for all the reasons where amocas behaves better than lr/sc. The manipulation needed for sub-word atomics is the same for both amocas and lr/sc anyway, just a slightly different order. 
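The remark about sub-word atomics can be illustrated outside of HotSpot. The following is a minimal Java sketch, not the RISC-V assembler code from the PR: it emulates a compare-and-set on a single byte using only a word-wide CAS, which is the same masking-and-retry idea regardless of whether the word-wide primitive is amocas or lr/sc. The class name `NarrowCasSketch`, the method `casByte`, and the little-endian lane numbering are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class NarrowCasSketch {
    /**
     * Emulates compare-and-set of one byte inside a 32-bit word using only
     * a word-wide CAS. byteIndex 0 is the least significant byte
     * (an assumption of this sketch, not something taken from the PR).
     */
    static boolean casByte(AtomicInteger word, int byteIndex, byte expected, byte update) {
        int shift = byteIndex * 8;
        int mask = 0xFF << shift;
        while (true) {
            int old = word.get();
            // Fail if the target byte does not hold the expected value.
            if (((old & mask) >>> shift) != (expected & 0xFF)) {
                return false;
            }
            // Splice the new byte into the otherwise unchanged word.
            int replacement = (old & ~mask) | ((update & 0xFF) << shift);
            if (word.compareAndSet(old, replacement)) {
                return true;
            }
            // The word changed concurrently (possibly only a neighbouring
            // byte), so reload and retry.
        }
    }

    public static void main(String[] args) {
        AtomicInteger word = new AtomicInteger(0x11223344);
        System.out.println(casByte(word, 1, (byte) 0x33, (byte) 0xAA)); // true
        System.out.println(Integer.toHexString(word.get()));            // 1122aa44
        System.out.println(casByte(word, 0, (byte) 0x00, (byte) 0x01)); // false
    }
}
```

The retry loop is needed even with a true CAS primitive, because a concurrent update to a neighbouring byte makes the word-wide comparison fail although the target byte still matches.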
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1424152061 From luhenry at openjdk.org Tue Dec 12 15:20:56 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Tue, 12 Dec 2023 15:20:56 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v4] In-Reply-To: References: Message-ID: > 8315856: RISC-V: Use Zacas extension for cmpxchg Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16910/files - new: https://git.openjdk.org/jdk/pull/16910/files/16b42595..942cb11f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16910&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16910&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16910.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16910/head:pull/16910 PR: https://git.openjdk.org/jdk/pull/16910 From aph at openjdk.org Tue Dec 12 17:36:33 2023 From: aph at openjdk.org (Andrew Haley) Date: Tue, 12 Dec 2023 17:36:33 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v4] In-Reply-To: References: <6My9uP_jRGDPa31RMLN07O7DpheOFp6d2KgLSDieWU8=.bb4a52a3-b766-4167-9078-08dd9af564cf@github.com> Message-ID: <_Tfd12F6NBCQihio_aanAWM_ZHwo5f6QyTzfDhViBzo=.a934f999-36dc-43d7-b4d9-527fdea443f5@github.com> On Tue, 12 Dec 2023 08:41:21 GMT, Robbin Ehn wrote: >> I have seen code, do not find it anymore in code base, which use cmpxchg for Atomic::release_store_fence(). >> (which is "new"); >> >> Atomic::store(_x, true); >> Atomic::store_release_fence(_one_way_barrier, true); >> bool z = Atomic::load(_z); >> >> But it is written as: >> >> Atomic::store(_x, true); >> Atomic::cmpxchg(_one_way_barrier, false, true); >> bool z = Atomic::load(_z); >> >> Where load Z happens after store X. But without release the store may happens after. 
>> >> As we say: >> "All of the atomic operations that imply a read-modify-write action guarantee a two-way memory barrier across that operation." >> >> I have look over the cmpxchg uses and not releasing 'seems' okay at first glance. >> But I think before this line changes we should release on failed CAS. > > Note I mention the hotspot runtime atomic here because cmpxchgptr is used to manipulate the markword which follows hotspot atomics, not Java volatiles. > I have seen code, do not find it anymore in code base, which use cmpxchg for Atomic::release_store_fence(). (which is "new"); > > ``` > Atomic::store(_x, true); > Atomic::store_release_fence(_one_way_barrier, true); > bool z = Atomic::load(_z); > ``` > > But it is written as: > > ``` > Atomic::store(_x, true); > Atomic::cmpxchg(_one_way_barrier, false, true); > bool z = Atomic::load(_z); > ``` > > Where load Z happens after store X. But without release the store may happens after. Yuck. I hope that is never going to return. I think all such usages of cmpxchg were replaced by an atomic add of zero into the stack, which works on x86. > As we say: "All of the atomic operations that imply a read-modify-write action guarantee a two-way memory barrier across that operation." Well, that's for the HotSpot internal atomics, rather than for Java code. And given that this particular version of `cmpxchg` is only used in a few places it suffices to see what they need. > I have look over the cmpxchg uses and not releasing 'seems' okay at first glance. But I think before this line changes we should release on failed CAS. I have never agreed with that thinking. We should find the callers of CAS that depend on such a side effect and fix them. The problem is that no-one wants to do it. There may no longer be any such cases. 
-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1424352158

From stevenschlansker at gmail.com Tue Dec 12 19:28:38 2023
From: stevenschlansker at gmail.com (Steven Schlansker)
Date: Tue, 12 Dec 2023 11:28:38 -0800
Subject: CDS can archive classpath entries more than once when a JAR manifest has Class-Path attributes
In-Reply-To: <0b9334db-6140-4bfe-8395-bc32de509d9a@oracle.com>
References: <0b9334db-6140-4bfe-8395-bc32de509d9a@oracle.com>
Message-ID:

Thank you very much Calvin and Johan.

We have applied a workaround that seems to work for us: construct a -cp by sorting all jar entries with a manifest class-path to the end.

While this bug is obscure, the symptoms are painful and difficult to diagnose. I think end-users would appreciate a jdk21 backport if it is feasible.

Looking forward to future work in 22, 23, and beyond. It's a great time to be in the Java ecosystem :)

On Tue, Dec 12, 2023 at 2:59 AM Johan Sjolen wrote:
>
> Hi Steven,
>
> The issue you saw in NMT is an indication that something is calling
> NMT's API in an unexpected manner. This is probably a symptom of the CDS
> bug, so when that bug is fixed NMT will no longer crash the JVM. There
> are plans to make NMT more resilient against these sorts of misuses, but
> those changes are planned for JDK-23.
>
> Thank you for the in-depth research on this issue.
>
> All the best,
> Johan
>
> On 2023-12-08 18:56, Steven Schlansker wrote:
> > Hi hotspot-dev,
> >
> > Recently, we started experiencing JVM crashes [1] and inexplicable
> > IncompatibleClassChangeErrors in our testing environment. We use
> > custom classloaders, NMT, and app-CDS.
> > > > # Internal Error (virtualMemoryTracker.cpp:403), pid=20, tid=128 > > # Error: ShouldNotReachHere() > > # > > > > # JRE version: OpenJDK Runtime Environment (Red_Hat-21.0.1.0.12-2) > > (21.0.1+12) (build 21.0.1+12-LTS) > > # Java VM: OpenJDK 64-Bit Server VM (Red_Hat-21.0.1.0.12-2) > > (21.0.1+12-LTS, mixed mode, sharing, tiered, compressed oops, > > compressed class ptrs, g1 gc, linux-amd64) > > # Problematic frame: > > # V [libjvm.so+0x104a06c] > > VirtualMemoryTracker::add_reserved_region(unsigned char*, unsigned > > long, NativeCallStack const&, MEMFLAGS)+0x6fc > > > > and, > > > > java.lang.IncompatibleClassChangeError: > > com.paywholesail.components.util.ByteBuffers and > > com.paywholesail.components.util.ByteBuffers$ByteBufferPuttable > > disagree on InnerClasses attribute > > (I checked with javap, and it looks the same to me...) > > > > At least for the ShouldNotReachHere, it looked like a definite JVM > > bug, so I have been trying to create a reproducing test case to make a > > good error report. I noticed that the crash only happens when NMT is > > combined with Class Data Sharing. At this point, I read the logs > > closely, and noticed: > > > > [0.139s][warning][cds ] shared class paths mismatch > > [0.151s][warning][cds,dynamic] Unable to use shared archive. The top > > archive failed to load: /.../prebake.jsa > > > > So, I compared the expected and actual class path as printed by the > > JVM. In both cases, we run with `-cp lib/*` with a fixed set of > > library jars. Imagine my surprise when I find that the only difference > > is that the expected (archive-time) classpath includes > > lib/stax-ex-1.8.jar *twice*. > > > > By running the generated shared archive file with `strings | grep`, I > > am able to verify that the `lib/stax-ex-1.8.jar` entry indeed is > > present in the archive twice. > > > > I fixed up my JDK build environment and started sprinkling new logging > > and assertions through the archive creation code. 
> > > > It looks like ClassLoader::add_to_app_classpath_entries can either > > check for duplicated classpath entries, or trust that the caller knows > > the element is new. > > This list of entries is built in part by > > ClassLoader::setup_app_search_path, which enumerates the classpath and > > adds entries one by one. In this case, duplicate checks are skipped, > > presumably because we trust the initial classpath not to have > > duplicates. > > > > When an element is added in add_to_app_classpath_entries, for each > > jar, it calls process_jar_manifest. Among other things, this reads the > > MANIFEST.MF and looks for Class-Path entries, and loads those too. > > Indeed, our `jaxb-runtime` has such an entry for `stax-ex`. In this > > case, it does guard against duplicate entries. > > > > I think there is a bug here: if a jar is added by a manifest's > > Class-Path from a jar *before* we finish processing the initial app > > class path, it can get added twice - first with a duplicate check via > > the manifest, and then a second time without checking for duplicates > > from the app classpath. 
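The ordering dependence described in this analysis can be modelled in a few lines. The following Java sketch is a toy model only, not HotSpot code: the class `ClassPathDupSketch`, the `MANIFEST_CLASS_PATH` map and the `add` helper are hypothetical names, and the model assumes exactly the behaviour described above (entries from the app classpath are appended without a duplicate check, entries pulled in through a manifest `Class-Path` are appended with one).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ClassPathDupSketch {
    // Hypothetical model: jar name -> Class-Path entries from its manifest.
    static final Map<String, List<String>> MANIFEST_CLASS_PATH = Map.of(
            "lib/A.jar", List.of("lib/B.jar"),
            "lib/B.jar", List.of());

    // Walks the app classpath in order, trusting it to be duplicate-free.
    static List<String> buildEntries(List<String> appClassPath) {
        List<String> entries = new ArrayList<>();
        for (String jar : appClassPath) {
            add(entries, jar, false);
        }
        return entries;
    }

    // Appends a jar, then processes its manifest Class-Path entries,
    // which are the only ones checked against the existing list.
    static void add(List<String> entries, String jar, boolean checkForDuplicates) {
        if (checkForDuplicates && entries.contains(jar)) {
            return;
        }
        entries.add(jar);
        for (String dep : MANIFEST_CLASS_PATH.getOrDefault(jar, List.of())) {
            add(entries, dep, true);
        }
    }

    public static void main(String[] args) {
        // prints [lib/B.jar, lib/A.jar]
        System.out.println(buildEntries(List.of("lib/B.jar", "lib/A.jar")));
        // prints [lib/A.jar, lib/B.jar, lib/B.jar]
        System.out.println(buildEntries(List.of("lib/A.jar", "lib/B.jar")));
    }
}
```

In this model the duplicate only appears when the jar carrying the `Class-Path` attribute precedes its dependency on the command line, which is consistent with the workaround of sorting such jars to the end of the classpath.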
> >
> > I believe this is reproducible on latest 21.0.1+12 with the following
> > code and steps:
> >
> > A.java:
> > class A {
> >     static {
> >         System.err.println("A");
> >     }
> > }
> >
> > class B {
> >     public static void main(String[] args) {
> >         System.err.println("hi!");
> >         new A();
> >     }
> > }
> >
> > MANIFEST.MF:
> > Manifest-Version: 1.0
> > Class-Path: B.jar
> >
> >
> > % mkdir lib
> > % javac A.java
> > % javac B.java
> > % jar -m META-INF/MANIFEST.MF -c -f lib/A.jar A.class
> > % jar cf lib/B.jar B.class
> >
> > % java -cp lib/B.jar:lib/A.jar -XX:ArchiveClassesAtExit=shared.jsa
> > -XX:NativeMemoryTracking=summary B
> > % strings shared.jsa| grep lib/
> > lib/B.jar
> > lib/A.jar
> >
> > % java -cp lib/A.jar:lib/B.jar -XX:ArchiveClassesAtExit=shared2.jsa
> > -XX:NativeMemoryTracking=summary B
> > % strings shared2.jsa| grep lib/
> > lib/A.jar
> > lib/B.jar
> > lib/B.jar
> >
> > When A.jar is loaded first, the Class-Path manifest entry adds B.jar.
> > Then, B.jar is added *again*, unconditionally.
> > When B.jar is loaded first, the app classpath entry is created first.
> > Then, the manifest entry is checked and since it is a duplicate, only
> > one entry is added.
> >
> > At this point I felt like I collected enough information to ask for
> > some expert advice.
> > Am I on the right track here, that this could be a bug resulting in
> > duplicate classpath entries in the archive classpath, if a dependent
> > jar comes in via a manifest class-path entry before the app classpath
> > finishes processing? Could that possibly be the source of our
> > assertion failures and IncompatibleClassChangeErrors?
> >
> > As a related question, this makes me worry that using `-cp lib/*`
> > might implicitly embed the filesystem enumeration order in the
> > archive. Maybe the classpath order is not important when verifying,
> > but at the very least, the wildcard enumeration order influences the
> > build in a way I did not expect.
> > > > If my analysis sounds plausible, I can submit it via the Java bug system. > > > > Thank you for any consideration and advice. Best, > > Steven > > > > [1] https://gist.github.com/stevenschlansker/12d1eaeb363ae135c88a965048353b0e From matsaave at openjdk.org Tue Dec 12 19:38:24 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 12 Dec 2023 19:38:24 GMT Subject: RFR: 8321515: ARM32: Move method resolution information out of the cpCache properly In-Reply-To: References: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> Message-ID: On Mon, 11 Dec 2023 15:31:45 GMT, Aleksei Voitylov wrote: >> Thanks for fixing the ARM32 build! >> I see the change fixes wrong offset and register usage and does some cleanup. It is OK for me. >> Thanks again! > > thanks @bulasevich. Any Reviewers, please? I'd like to to get the ARM32 port to work again. Thanks for this fix @voitylov! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17017#issuecomment-1852683953 From luhenry at openjdk.org Tue Dec 12 21:09:45 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Tue, 12 Dec 2023 21:09:45 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v5] In-Reply-To: References: Message-ID: > 8315856: RISC-V: Use Zacas extension for cmpxchg Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: Fix narrow compxchg ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16910/files - new: https://git.openjdk.org/jdk/pull/16910/files/942cb11f..0b0363da Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16910&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16910&range=03-04 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16910.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16910/head:pull/16910 PR: https://git.openjdk.org/jdk/pull/16910 From lmesnik at openjdk.org Tue Dec 
12 22:55:22 2023 From: lmesnik at openjdk.org (Leonid Mesnik) Date: Tue, 12 Dec 2023 22:55:22 GMT Subject: RFR: 8321713: Harmonize executeTestJvm with create[Limited]TestJavaProcessBuilder [v3] In-Reply-To: References: Message-ID: On Mon, 11 Dec 2023 14:06:43 GMT, Stefan Karlsson wrote: >> [JDK-8315097](https://bugs.openjdk.org/browse/JDK-8315097): 'Rename createJavaProcessBuilder' changed the name of the ProcessTools helper functions used to create `ProcessBuilder`s used to spawn new java test processes. >> >> We now have `createTestJavaProcessBuilder` and `createLimitedTestJavaProcess`. The former prepends jvm options from jtreg, while the latter doesn't. >> >> With these functions it is common to see the following pattern in tests: >> >> ProcessBuilder pb = ProcessTools.createTestJavaProcessBuilder(...); >> OutputAnalyzer output = executeProcess(pb); >> >> >> We have a couple of thin wrapper in `ProcessTools` that does exactly this, so that the code can be written as a one-liner: >> >> OutputAnalyzer output = ProcessTools.executeTestJvm(); >> >> >> I propose that we name this functions using the same naming scheme we used for `createTestJavaProcessBuilder` and `createLimitedTestJavaProcessBuilder`. That is, we change `executeTestJvm` to `executeTestJava` and add a new `executeLimitedTestJava` function. > > Stefan Karlsson has updated the pull request incrementally with one additional commit since the last revision: > > Test cleanup Marked as reviewed by lmesnik (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/17049#pullrequestreview-1778566038 From omikhaltcova at openjdk.org Tue Dec 12 23:42:50 2023 From: omikhaltcova at openjdk.org (Olga Mikhaltsova) Date: Tue, 12 Dec 2023 23:42:50 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6] In-Reply-To: References: Message-ID: On Tue, 12 Dec 2023 03:24:40 GMT, Fei Yang wrote: >> Subnormal numbers can be distinguished after fclass call, they were also added here in order not to do redundant operations further, they results to 0 as well. >> >> Yes, that's right, I verified output of this algorithm against the current java implementation on the full 32-bit range. >> >> Thanks for the advice below! > >> Yes, that's right, I verified output of this algorithm against the current java implementation on the full 32-bit range. > > Hi, can you share with us your test? I am also trying to understand how this works. Thanks. @RealFYang Here it is my draft. I made very simple check. The java Math.round() was copy-pasted into the test for getting an etalon value to compare. In addition as a paranoid check, I made a special build with a stub on Math.round() returning a fake value = 777, in order to be absolutely sure that the intrincified method is executing and all values in the range are checked. Also I added the option "-Xcomp" while running the test. 
    public class MathTest_RangeRun {
        public static final int PRECISION = 24;
        public static final int SIZE = 32;
        public static final int SIGNIFICAND_WIDTH = PRECISION;
        public static final int EXP_BIT_MASK = ((1 << (SIZE - SIGNIFICAND_WIDTH)) - 1) << (SIGNIFICAND_WIDTH - 1);
        public static final int EXP_BIAS = (1 << (SIZE - SIGNIFICAND_WIDTH - 1)) - 1; // 127
        public static final int SIGNIF_BIT_MASK = (1 << (SIGNIFICAND_WIDTH - 1)) - 1;

        public static int java_round(float a) {
            int intBits = Float.floatToRawIntBits(a);
            int biasedExp = (intBits & EXP_BIT_MASK) >> (SIGNIFICAND_WIDTH - 1);
            int shift = (SIGNIFICAND_WIDTH - 2 + EXP_BIAS) - biasedExp;
            if ((shift & -32) == 0) { // shift >= 0 && shift < 32
                // a is a finite number such that pow(2,-32) <= ulp(a) < 1
                int r = ((intBits & SIGNIF_BIT_MASK) | (SIGNIF_BIT_MASK + 1));
                if (intBits < 0) {
                    r = -r;
                }
                // In the comments below each Java expression evaluates to the value
                // the corresponding mathematical expression:
                // (r) evaluates to a / ulp(a)
                // (r >> shift) evaluates to floor(a * 2)
                // ((r >> shift) + 1) evaluates to floor((a + 1/2) * 2)
                // (((r >> shift) + 1) >> 1) evaluates to floor(a + 1/2)
                return ((r >> shift) + 1) >> 1;
            } else {
                // a is either
                // - a finite number with abs(a) < exp(2,FloatConsts.SIGNIFICAND_WIDTH-32) < 1/2
                // - a finite number with ulp(a) >= 1 and hence a is a mathematical integer
                // - an infinity or NaN
                return (int) a;
            }
        }

        public static void test(int start, int end) {
            long beg_ms = System.currentTimeMillis();
            boolean warm_up = true;
            for (int j = 0; j < 2; ++j) {
                for (int x = start; ; ++x) {
                    float src = Float.intBitsToFloat(x);
                    int dst = Math.round(src);
                    int etalon = java_round(src);
                    if (warm_up && dst != 777) {
                        System.out.printf("END warm up: j = %d x = %d \n", j, x);
                        warm_up = false;
                        break;
                    }
                    if (dst != etalon) {
                        System.out.printf("ERROR: x = %d src = %f dst = %d etalon = %d \n", x, src, dst, etalon);
                    }
                    if (x == end) break;
                }
            }
            System.out.printf("END test: time_elapsed = %d ms \n", System.currentTimeMillis() - beg_ms);
        }

        public static void main(String[] args) {
            int start = 0;
            int end = 0xFFFFFFFF;
            test(start, end);
        }
    }

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1424686631

From lmesnik at openjdk.org Wed Dec 13 00:04:44 2023
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Wed, 13 Dec 2023 00:04:44 GMT
Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3]
In-Reply-To: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com>
References: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com>
Message-ID:

On Fri, 8 Dec 2023 11:54:40 GMT, Serguei Spitsyn wrote:

>> This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame.
>> It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms.
>> The deadlocking scenario is well described by Patricio in a bug report comment.
>> In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections.
>>
>> The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about begin/end of critical section.
>> This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section.
>>
>> Some of new notifications with `notifyJvmtiSync()` method is on a performance critical path. It is why this method has been intrincified.
>>
>> New test was developed by Patricio:
>> `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock`
>> The test is very nice as it reliably in 100% reproduces the deadlock without the fix.
>> The test is never failing with this fix. >> >> Testing: >> - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: (1) rename notifyJvmti method; (2) add try-final statements to VirtualThread methods Changes requested by lmesnik (Reviewer). src/hotspot/share/prims/jvm.cpp line 4013: > 4011: // Notification from VirtualThread about entering/exiting sync critical section. > 4012: // Needed to avoid deadlocks with JVMTI suspend mechanism. > 4013: JVM_ENTRY(void, JVM_VirtualThreadCriticalLock(JNIEnv* env, jobject vthread, jboolean enter)) the jobject vthread is not used. Can't be the method made static to reduce the number of arguments? It is the performance-critical code, I don't know if it is optimized by C2. src/hotspot/share/runtime/javaThread.hpp line 320: > 318: bool _is_in_VTMS_transition; // thread is in virtual thread mount state transition > 319: bool _is_in_tmp_VTMS_transition; // thread is in temporary virtual thread mount state transition > 320: bool _is_in_critical_section; // thread is in a locking critical section might make sense to add a comment, that his variable Is changed/read only by current thread and no sync is needed. src/java.base/share/classes/java/lang/VirtualThread.java line 1164: > 1162: > 1163: @IntrinsicCandidate > 1164: private native void notifyJvmtiCriticalLock(boolean enter); The name is confusing to me, the CriticalLock looks like it is the section is critical and might be taken by a single thread only. Or it's just unclear what is critical here. However, the purpose is to disable suspend Wouldn't be 'notifyJvmtiSuspendLock notifyJvmtiDisableSuspend' better name here? or comment what critical means here. 
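
For readers following the naming discussion above, here is a minimal sketch in Java of the enter/exit bracketing that the "add try-final statements" revision note refers to. It is an illustration only, not the patch: `notifyDisableSuspend` is a hypothetical stand-in that just records calls (the real method is native and intrinsified), and `stateUnderLock` is an invented example method.

```java
import java.util.ArrayList;
import java.util.List;

public class CriticalSectionSketch {
    // Hypothetical stand-in for the native notification discussed in the review.
    // It only records the enter/exit calls so the pairing can be observed.
    static final List<Boolean> events = new ArrayList<>();

    static void notifyDisableSuspend(boolean enter) {
        events.add(enter);
    }

    // Plays the role of the interruptLock mentioned in the PR description.
    static final Object interruptLock = new Object();

    // Invented example method: suspension is "disabled" for exactly the
    // duration of the locked region, and the exit notification is issued
    // even if the body throws, because it sits in the finally block.
    static String stateUnderLock() {
        notifyDisableSuspend(true);
        try {
            synchronized (interruptLock) {
                return "RUNNABLE";
            }
        } finally {
            notifyDisableSuspend(false);
        }
    }

    public static void main(String[] args) {
        System.out.println(stateUnderLock() + " " + events);
    }
}
```

The try/finally shape is the point here: every "enter" is matched by an "exit" on all paths, which is what makes a name describing the effect (suspend disabled) arguably easier to read than one describing the lock.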
test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock/SuspendWithInterruptLock.java line 30: > 28: * @requires vm.continuations > 29: * @library /testlibrary > 30: * @run main/othervm -Xint SuspendWithInterruptLock Doesn't it make sense to add a testcase without -Xint also? Just to give stress testing with compilation. test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock/SuspendWithInterruptLock.java line 36: > 34: > 35: public class SuspendWithInterruptLock { > 36: static boolean done; done is accessed from different threads, should be volatile. test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock/SuspendWithInterruptLock.java line 54: > 52: Thread.yield(); > 53: } > 54: done = true; I think it is better to use done to stop all threads and set it to true in the main thread after some time. So you could be sure that the yielder hadn't been completed before the suspender started. But it is just proposal. ------------- PR Review: https://git.openjdk.org/jdk/pull/17011#pullrequestreview-1778571090 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1424694672 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1424697179 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1424687810 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1424662055 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1424663078 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1424683585 From omikhaltcova at openjdk.org Wed Dec 13 00:31:46 2023 From: omikhaltcova at openjdk.org (Olga Mikhaltsova) Date: Wed, 13 Dec 2023 00:31:46 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v7] In-Reply-To: References: Message-ID: On Fri, 8 Dec 2023 22:46:45 GMT, Olga Mikhaltsova wrote: >> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform. 
>>
>> In the table below it is shown that NaN argument should be processed as a special case.
>>
>>                                          RISC-V                    Java
>>                                          (FCVT.W.S)  (FCVT.L.D)   (long round(double a))  (int round(float a))
>> Minimum valid input (after rounding)     -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
>> Maximum valid input (after rounding)     2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for out-of-range negative input   -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for -∞                            -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for out-of-range positive input   2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for +∞                            2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for NaN                           2^31 - 1    2^63 - 1     0                       0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555 ± 0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760 ± 0.103  ops/ms
>>
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956 ± 0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947 ± 0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
>   Optimization against regression on SiFive

A slight performance improvement on SiFive is possible by changing the instruction sequence (feq before fmv), but some performance is then lost on T-Head:

**VisionFive 2**

Benchmark                              (TESTSIZE)   Mode  Cnt    Score    Error   Units
FpRoundingBenchmark.test_round_double        2048  thrpt   15   38.890 ±  0.129  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15   50.300 ±  0.017  ops/ms

**T-Head**

Benchmark                              (TESTSIZE)   Mode  Cnt    Score    Error   Units
FpRoundingBenchmark.test_round_double        2048  thrpt   15   95.119 ± 19.104  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15  100.212 ± 19.395  ops/ms

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1853075129

From gunnar at wagenknecht.org  Wed Dec 13 06:09:00 2023
From: gunnar at wagenknecht.org (Gunnar Wagenknecht)
Date: Wed, 13 Dec 2023 07:09:00 +0100
Subject: Too many open files problem on MacOS 14.1
Message-ID: <6A33F43E-7121-475C-9DB2-4081D27AD6DF@wagenknecht.org>

Greetings,

I'm reaching out because of an issue with the JVM on MacOS that is hitting us at large scale. It started in MacOS 13 but got really worse in 14.1. We basically now need to use -XX:-MaxFDLimit on MacOS for everything with a classpath of more than 10k jars (monolith). That applies to the Java app itself as well as any Java based IDE (Eclipse, IntelliJ) or build tool.

The reason is that the MaxFDLimit implementation on Mac is broken in the JVM. The JVM is applying a lower limit to itself. We discovered the -XX:-MaxFDLimit solution after our old workarounds of increasing the open files on MacOS stopped working. We discovered it in the Bazel repository:

https://github.com/bazelbuild/bazel/blob/699208763906fbd4b5e46e445b637436ee2293aa/src/main/cpp/startup_options.cc#L589-L597

// Disable the JVM's own unlimiting of file descriptors. We do this
// ourselves in blaze.cc so we want our setting to propagate to the JVM.
//
// The reason to do this is that the JVM's unlimiting is suboptimal on
// macOS. Under that platform, the JVM limits the open file descriptors
// to the OPEN_MAX constant... which is much lower than the per-process
// kernel allowed limit of kern.maxfilesperproc (which is what we set
// ourselves to).

In older versions of macOS we used a launch daemon to increase the limit:

$ cat /Library/LaunchDaemons/limit.maxfiles.plist
Label limit.maxfiles ProgramArguments /bin/bash -c launchctl limit maxfiles unlimited unlimited ; launchctl limit maxfiles 1000000 2147483647 RunAtLoad ServiceIPC

This stopped working in MacOS 14.1:

$ sudo launchctl limit maxfiles 1000000 2147483647
Could not set resource limits: 150: Operation not permitted while System Integrity Protection is engaged

We got some interesting information from Apple:

Defaults
Soft - 65535
Hard - unlimited

Per Apple - The unlimited keyword is now an alias for 2147483647 (INT32_MAX) where before it was an alias for 10240 (OPEN_MAX). With SIP enabled, the soft limit can be raised, but not lowered once it's been raised. This is by design and will generate the "Could not set resource limits".

So perhaps the code in the JVM should be changed to INT32_MAX?

The thing is, we cannot add the flag uniquely. For example, on older Linux installations (e.g., Ubuntu 20) we noticed the -XX:-MaxFDLimit is harmful, i.e. we actually do need the JVM to increase its file handles.

I found a couple of old bugs but nothing actionable. Can this issue be revisited? I am not able to open a JDK bug, though.

Thanks a lot!

-Gunnar

--
Gunnar Wagenknecht
gunnar at wagenknecht.org, http://guw.io/

From aph-open at littlepinkcloud.com  Wed Dec 13 09:24:57 2023
From: aph-open at littlepinkcloud.com (Andrew Haley)
Date: Wed, 13 Dec 2023 09:24:57 +0000
Subject: Too many open files problem on MacOS 14.1
In-Reply-To: <6A33F43E-7121-475C-9DB2-4081D27AD6DF@wagenknecht.org>
References: <6A33F43E-7121-475C-9DB2-4081D27AD6DF@wagenknecht.org>
Message-ID:

On 12/13/23 06:09, Gunnar Wagenknecht wrote:
> Greetings,
>
> I'm reaching out because of an issue with the JVM on MacOS that is hitting us at large scale. It started in MacOS 13 but got really worse in 14.1.
We basically now need to use -XX:-MaxFDLimit on MacOS for everything with a classpath of more then 10k jars (monolith). That applies to the Java app itself as well as any Java based IDE (Eclipse, IntelliJ) or build tool. > > The reason is that the MaxFDLimit implementation on Mac is broken in the JVM. The JVM is applying a lower limit to itself. We discovered the -XX:-MaxFDLimit solution after our old workarounds of increasing the open files on MacOS stopped working. Have a look at https://bugs.openjdk.org/browse/JDK-8291060, and the problem discussed there. Gerard Ziemski will read this, and I expect he'd like to comment. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From gcao at openjdk.org Wed Dec 13 09:24:58 2023 From: gcao at openjdk.org (Gui Cao) Date: Wed, 13 Dec 2023 09:24:58 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: References: Message-ID: > MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header is non-trivial on linux-riscv64 platform. Passing t0(aka x5) as temporary register to this functions can also be error prone. As a reserved scratch register, t0 is implicitly clobberred by various assembler functions. @robehn can you help review this PR? > This issue is used to track avoid passing t0 as a temporary register in the following cases: > 1. avoid passing t0 as temp register to MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header. > 2. avoid passing t0 as temp register to x_load_barrier and x_load_barrier_slow_path function in x_riscv.ad > 3. avoid passing t0 as temp register to z_store_barrier and z_color function in z_riscv.ad > > Note that I didn't touch MacroAssembler::cmpxchg because it seems to me that this function is designed that it allows t0 to be used as the result register. 
As the result register will be set on exits, there should be no risk when using t0 for receiving the result. > https://github.com/openjdk/jdk/blob/e44d4b24ed794957c47c140ab6f15544efa2b278/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L2910-L2925 > > ### Testing: > - [x] Run tier1-3 tests on qemu 8.1.50 with UseRVV (release) > - [x] Run tier1-3 tests with SiFive unmatched (release) Gui Cao has updated the pull request incrementally with one additional commit since the last revision: Add tmpReg to replace the t1 register ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16880/files - new: https://git.openjdk.org/jdk/pull/16880/files/cc041d22..a97d3627 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16880&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16880&range=00-01 Stats: 31 lines in 2 files changed: 0 ins; 0 del; 31 mod Patch: https://git.openjdk.org/jdk/pull/16880.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16880/head:pull/16880 PR: https://git.openjdk.org/jdk/pull/16880 From gcao at openjdk.org Wed Dec 13 09:28:39 2023 From: gcao at openjdk.org (Gui Cao) Date: Wed, 13 Dec 2023 09:28:39 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: <4E1XPkx4Hgc2np3DGhr2YzKIWXBxhYBUv9E4aYQqpbc=.265c4fba-e076-4fc2-b4cd-ebedac8b3e73@github.com> References: <4E1XPkx4Hgc2np3DGhr2YzKIWXBxhYBUv9E4aYQqpbc=.265c4fba-e076-4fc2-b4cd-ebedac8b3e73@github.com> Message-ID: On Tue, 12 Dec 2023 10:22:29 GMT, Robbin Ehn wrote: > I think we should not use any fixed regs, just add them to the nodes, like PPC have done. Thanks for your review, your comment Great, it's fixed. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16880#issuecomment-1853551997 From rehn at openjdk.org Wed Dec 13 10:22:41 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 13 Dec 2023 10:22:41 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: References: Message-ID: On Wed, 13 Dec 2023 09:24:58 GMT, Gui Cao wrote: >> MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header is non-trivial on linux-riscv64 platform. Passing t0(aka x5) as temporary register to this functions can also be error prone. As a reserved scratch register, t0 is implicitly clobberred by various assembler functions. @robehn can you help review this PR? >> This issue is used to track avoid passing t0 as a temporary register in the following cases: >> 1. avoid passing t0 as temp register to MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header. >> 2. avoid passing t0 as temp register to x_load_barrier and x_load_barrier_slow_path function in x_riscv.ad >> 3. avoid passing t0 as temp register to z_store_barrier and z_color function in z_riscv.ad >> >> Note that I didn't touch MacroAssembler::cmpxchg because it seems to me that this function is designed that it allows t0 to be used as the result register. As the result register will be set on exits, there should be no risk when using t0 for receiving the result. >> https://github.com/openjdk/jdk/blob/e44d4b24ed794957c47c140ab6f15544efa2b278/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L2910-L2925 >> >> ### Testing: >> - [x] Run tier1-3 tests on qemu 8.1.50 with UseRVV (release) >> - [x] Run tier1-3 tests with SiFive unmatched (release) > > Gui Cao has updated the pull request incrementally with one additional commit since the last revision: > > Add tmpReg to replace the t1 register Thank you! Looks good to me! Can you do a tier1 on latest to make sure? ------------- Marked as reviewed by rehn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16880#pullrequestreview-1779302583 From avoitylov at openjdk.org Wed Dec 13 11:02:46 2023 From: avoitylov at openjdk.org (Aleksei Voitylov) Date: Wed, 13 Dec 2023 11:02:46 GMT Subject: RFR: 8321515: ARM32: Move method resolution information out of the cpCache properly [v3] In-Reply-To: References: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> Message-ID: <2fRKPSoDDMqwBOZTzXdqfnbnE0d9GBn6_sFVbehaNWM=.cec3b74c-c694-46e0-b6b5-9f8bc5f00513@github.com> On Tue, 12 Dec 2023 12:02:49 GMT, Aleksei Voitylov wrote: >> Thanks to @matias9927, JDK-8320278 fixed the JDK build for ARM32 after JDK-8301997. This PR introduces some additional fixes that enable the ARM32 port to actually work. > > Aleksei Voitylov has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Merge branch 'openjdk:master' into JDK-8321515 > - review comments > - JDK-8321515 implementation Thanks again Boris, Matias and Aleksey! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17017#issuecomment-1853700632 From avoitylov at openjdk.org Wed Dec 13 11:06:49 2023 From: avoitylov at openjdk.org (Aleksei Voitylov) Date: Wed, 13 Dec 2023 11:06:49 GMT Subject: Integrated: 8321515: ARM32: Move method resolution information out of the cpCache properly In-Reply-To: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> References: <0k-NWGbtvd5O9OMdqwpYl04jtEi31xM3F9fe35g7Fkk=.d2e4e4d4-a529-4d87-9f4e-91de3c4b638f@github.com> Message-ID: On Thu, 7 Dec 2023 10:25:05 GMT, Aleksei Voitylov wrote: > Thanks to @matias9927, JDK-8320278 fixed the JDK build for ARM32 after JDK-8301997. This PR introduces some additional fixes that enable the ARM32 port to actually work. 
This pull request has now been integrated. Changeset: f573f6d2 Author: Aleksei Voitylov Committer: Aleksey Shipilev URL: https://git.openjdk.org/jdk/commit/f573f6d233d5ea1657018c3c806fee0fac382ac3 Stats: 33 lines in 3 files changed: 14 ins; 4 del; 15 mod 8321515: ARM32: Move method resolution information out of the cpCache properly Reviewed-by: shade ------------- PR: https://git.openjdk.org/jdk/pull/17017 From duke at openjdk.org Wed Dec 13 12:23:57 2023 From: duke at openjdk.org (duke) Date: Wed, 13 Dec 2023 12:23:57 GMT Subject: Withdrawn: 8189088: Add intrusive doubly-linked list utility In-Reply-To: References: Message-ID: On Mon, 25 Sep 2023 06:09:49 GMT, Kim Barrett wrote: > Please review this new facility, providing a general mechanism for intrusive > doubly-linked lists. A class supports inclusion in a list by having an > IntrusiveListEntry member, and providing structured information about how to > access that member. A class supports inclusion in multiple lists by having > multiple IntrusiveListEntry members, with different lists specified to use > different members. > > The IntrusiveList class template provides the list management. It is modelled > on bidirectional containers such as std::list and boost::intrusive::list, > providing many of the expected member types and functions. (Note that the > member types use the Standard's naming conventions.) (Not all standard > container requirements are met; some operations are not presently supported > because they haven't been needed yet.) This includes iteration support using > (mostly) standard-conforming iterator types (they are presently missing > iterator_category member types, pending being able to include so we > can use std::bidirectional_iterator_tag). > > This change only provides the new facility, and doesn't include any uses of > it. It is intended to replace the 4-5 (or maybe more) competing intrusive > doubly-linked lists presently in HotSpot. Unlike most (or perhaps all?) 
of > those alterantives, this proposal provides a suite of unit tests. > > An example of a place that I think might benefit from this is G1's region > handling. There are various places where G1 iterates over all regions in order > to do something with those which satisfy some property (humongous regions, > regions in the collection set, &etc). If it were trivial to create new region > sublists (and this facility makes that easy), some of these could be turned > into direct iteration over only the regions of interest. > > Some specific points to consider when reviewing this proposal: > > (1) This proposal follows Standard Library API conventions, which differ from > HotSpot in various ways. > > (1a) Lists and iterators provide various type members, with names per the > Standard Library. There has been discussion of using some parts of the > Standard Library eventually, in which case this would be important. But for > now some of the naming choices are atypical for HotSpot. > > (1b) Some of the function signatures follow the Standard Library APIs even > though the reasons for that form might not apply to HotSpot. For example, the > list pop operations don't return the removed value. For node-based containers > in Standard Library that would introduce exception... This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/15896 From gcao at openjdk.org Wed Dec 13 13:10:38 2023 From: gcao at openjdk.org (Gui Cao) Date: Wed, 13 Dec 2023 13:10:38 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: References: Message-ID: On Wed, 13 Dec 2023 10:19:51 GMT, Robbin Ehn wrote: > Thank you! Looks good to me! > > Can you do a tier1 on latest to make sure? 
Yes, I ran the tier1 test on qemu 8.1.50 with UseRVV(release) ------------- PR Comment: https://git.openjdk.org/jdk/pull/16880#issuecomment-1853887978 From duke at openjdk.org Wed Dec 13 19:21:57 2023 From: duke at openjdk.org (Steven Schlansker) Date: Wed, 13 Dec 2023 19:21:57 GMT Subject: RFR: 8321892: Typo in log message logged by src/hotspot/share/nmt/virtualMemoryTracker.cpp Message-ID: Discovered while deep in an InternalError debugging session... ------------- Commit messages: - virtualMemoryTracker: typo 'resvered' Changes: https://git.openjdk.org/jdk/pull/17021/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17021&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321892 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17021.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17021/head:pull/17021 PR: https://git.openjdk.org/jdk/pull/17021 From jpai at openjdk.org Wed Dec 13 19:21:58 2023 From: jpai at openjdk.org (Jaikiran Pai) Date: Wed, 13 Dec 2023 19:21:58 GMT Subject: RFR: 8321892: Typo in log message logged by src/hotspot/share/nmt/virtualMemoryTracker.cpp In-Reply-To: References: Message-ID: On Thu, 7 Dec 2023 18:22:32 GMT, Steven Schlansker wrote: > Discovered while deep in an InternalError debugging session... Hello Steven, I've created a JDK issue to track this change https://bugs.openjdk.org/browse/JDK-8321892. Please update the title of this PR to `8321892: Typo in log message logged by src/hotspot/share/nmt/virtualMemoryTracker.cpp` so that it then triggers the official review process. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17021#issuecomment-1851734466 From dlong at openjdk.org Wed Dec 13 20:54:45 2023 From: dlong at openjdk.org (Dean Long) Date: Wed, 13 Dec 2023 20:54:45 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v8] In-Reply-To: References: Message-ID: On Mon, 11 Dec 2023 22:41:56 GMT, Yi-Fan Tsai wrote: >> `jcmd Compiler.perfmap` uses the hard-coded file name for a perf map: `/tmp/perf-%d.map`. This change adds an optional argument for specifying a file name. >> >> `jcmd PID help Compiler.perfmap` shows the following usage. >> >> >> Compiler.perfmap >> Write map file for Linux perf tool. >> >> Impact: Low >> >> Syntax : Compiler.perfmap [] >> >> Arguments: >> filename : [optional] Name of the map file (STRING, no default value) >> >> >> The following section of man page is also updated. (`man -l src/jdk.jcmd/share/man/jcmd.1`) >> >> >> Compiler.perfmap [arguments] (Linux only) >> Write map file for Linux perf tool. >> >> Impact: Low >> >> arguments: >> >> ? filename: (Optional) Name of the map file (STRING, no default value) >> >> If filename is not specified, a default file name is chosen using the pid of the target JVM process. For example, if the pid is 12345, then >> the default filename will be /tmp/perf-12345.map. > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright of PerfMapTest The man page says "no default value" but then right below describes the default value, which is confusing. I would remove "no default value". The code already deals with patterns, so why not allow a pattern like /dir/perf-%x.map and document that the platform-specific process id will be passed to String.format() to expand any formatting tokens in the string? 
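
The pattern idea above can be sketched in a few lines of Java. This is an illustration only, under stated assumptions: `perfMapFileName` is a hypothetical helper, not code from the PR (the actual jcmd handling lives in HotSpot and is not shown in this thread). The default pattern `/tmp/perf-%d.map` is the one quoted from the man page text.

```java
public class PerfMapName {
    // Hypothetical helper, not part of the patch: expands a user-supplied
    // pattern with the process id, falling back to the documented default
    // when no file name is given.
    static String perfMapFileName(String pattern, long pid) {
        if (pattern == null || pattern.isEmpty()) {
            pattern = "/tmp/perf-%d.map"; // default name quoted in the man page
        }
        // String.format ignores the pid argument if the pattern contains no
        // format specifier, so a plain file name passes through unchanged.
        return String.format(pattern, pid);
    }

    public static void main(String[] args) {
        long pid = ProcessHandle.current().pid();
        System.out.println(perfMapFileName(null, pid));
        System.out.println(perfMapFileName("/dir/perf-%x.map", pid));
    }
}
```

One consequence of treating the argument as a format string is that a literal percent sign in a file name would have to be written as `%%`, which the documentation would need to mention.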
------------- PR Comment: https://git.openjdk.org/jdk/pull/15871#issuecomment-1854683037 From cjplummer at openjdk.org Wed Dec 13 21:20:45 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Wed, 13 Dec 2023 21:20:45 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v8] In-Reply-To: References: Message-ID: On Wed, 13 Dec 2023 20:51:36 GMT, Dean Long wrote: > The man page says "no default value" but then right below describes the default value, which is confusing. I would remove "no default value". This was copied from VM.cds, which pretty much does the same thing (says "no default value" and then explains the default value below). Since the default is based on the pid, and we probably don't want a long description here, maybe just say "/tmp/perf-.map". Or it can say "(STRING, system-generated default name)" as I see in one other jcmd. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15871#issuecomment-1854714496 From amenkov at openjdk.org Wed Dec 13 21:38:54 2023 From: amenkov at openjdk.org (Alex Menkov) Date: Wed, 13 Dec 2023 21:38:54 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field Message-ID: FieldStream/FilteredFieldStream classes from reflectionUtils.hpp iterate class fields in the reverse order and use field indexes to access instead of forward iteration. This is performance ineffective (see [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692) for details). The change introduces new class FilteredJavaFieldStream as a replacement for FilteredFieldStream. It uses the same FilteredField/FilteredFieldsMap stuff as FilteredJavaFieldStream does. 
FieldStream/FilteredFieldStream are still used by heap walking API, will be cleaned by [JDK-8317636](https://bugs.openjdk.org/browse/JDK-8317636) Testing: - tier1..3 - all tests which calls GetClassFields: open/test/hotspot/jtreg/serviceability/jvmti,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/IterateThroughHeap,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/IsSynthetic including - test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields/getclfld007.java - tests that GetClassFields returns fields in correct order; - test/hotspot/jtreg/serviceability/jvmti/GetClassFields/FilteredFields/FilteredFieldsTest.java - test that GetClassFields filters out field like reflection. ------------- Commit messages: - GetClassFields Changes: https://git.openjdk.org/jdk/pull/17094/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17094&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8318563 Stats: 51 lines in 2 files changed: 36 ins; 8 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/17094.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17094/head:pull/17094 PR: https://git.openjdk.org/jdk/pull/17094 From ddong at openjdk.org Thu Dec 14 02:08:42 2023 From: ddong at openjdk.org (Denghui Dong) Date: Thu, 14 Dec 2023 02:08:42 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC In-Reply-To: References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> Message-ID: On Wed, 6 Dec 2023 04:00:51 GMT, David Holmes wrote: >> Hi, >> >> Could I have a review of this patch? >> >> In the current implementation, HeapDumpBeforeFullGC/AfterFullGC will generate dumps for every FGC, increasing the risk of disk full. >> >> This patch introduces a new option 'FullGCHeapDumpLimit' to limit the number of dumps triggered by HeapDumpBeforeFullGC/AfterFullGC to enhance production-friendliness. 
>> >> Best, >> Denghui > > @D-D-H adding a new manageable flag requires a CSR request to be approved. @dholmes-ora Thanks for the review. Could anyone in the serviceability area help review CSR and this patch? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16976#issuecomment-1854994491 From duke at openjdk.org Thu Dec 14 02:44:16 2023 From: duke at openjdk.org (Liming Liu) Date: Thu, 14 Dec 2023 02:44:16 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v13] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
> Kernel   -XX:-TransparentHugePages   -XX:+TransparentHugePages
>          Unpatched   Patched         Unpatched   Patched
> 4.18     11.30       11.30           0.25        0.25
> 5.13     0.22        0.22            3.42        3.42
> 6.1      0.27        0.33            3.54        0.33
>
Liming Liu has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 19 commits: - Update the name of the method - Remove the unneccessary class - Use address to find the mapping of the heap - Make the test use a smaller heap and exit properly - Make the jtreg test check the usage of THP - Untabify - Improve the use of madvise for pretouching: 1. use madvise when THP is actually used; 2. remove the need of modifing page_size; 3. log the failure of madvise rather than warn. - Cuddle ptr-operators in pretouch_memory_common - Use pointer_delta to calculate the distance - Add a sanity check for MADV_POPULATE_WRITE - ... and 9 more: https://git.openjdk.org/jdk/compare/cf948548...8d7af152 ------------- Changes: https://git.openjdk.org/jdk/pull/15781/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=12 Stats: 178 lines in 8 files changed: 166 ins; 6 del; 6 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From sspitsyn at openjdk.org Thu Dec 14 02:52:36 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 14 Dec 2023 02:52:36 GMT Subject: RFR: JDK-8318563: GetClassFields should not use random access to field In-Reply-To: References: Message-ID: <73S5mXiweVRQDjHUz3dHncaRgYfQb3gzSiwqVshk_Ak=.d15ac98f-a569-40b6-9023-01d7f01ba27c@github.com> On Wed, 13 Dec 2023 21:32:50 GMT, Alex Menkov wrote: > FieldStream/FilteredFieldStream classes from reflectionUtils.hpp iterate class fields in the reverse order and use field indexes to access instead of forward iteration. This is performance ineffective (see [JDK-8317692](https://bugs.openjdk.org/browse/JDK-8317692) for details). > The change introduces new class FilteredJavaFieldStream as a replacement for FilteredFieldStream. > It uses the same FilteredField/FilteredFieldsMap stuff as FilteredJavaFieldStream does. 
> > FieldStream/FilteredFieldStream are still used by heap walking API, will be cleaned by [JDK-8317636](https://bugs.openjdk.org/browse/JDK-8317636) > > Testing: > - tier1..3 > - all tests which calls GetClassFields: open/test/hotspot/jtreg/serviceability/jvmti,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/IterateThroughHeap,open/test/hotspot/jtreg/vmTestbase/nsk/jvmti/unit/IsSynthetic > including > - test/hotspot/jtreg/vmTestbase/nsk/jvmti/GetClassFields/getclfld007.java - tests that GetClassFields returns fields in correct order; > - test/hotspot/jtreg/serviceability/jvmti/GetClassFields/FilteredFields/FilteredFieldsTest.java - test that GetClassFields filters out field like reflection. This looks good in general. Will make one more pass tomorrow. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17094#issuecomment-1855032678 From fyang at openjdk.org Thu Dec 14 02:55:42 2023 From: fyang at openjdk.org (Fei Yang) Date: Thu, 14 Dec 2023 02:55:42 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: References: Message-ID: On Wed, 13 Dec 2023 09:24:58 GMT, Gui Cao wrote: >> MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header is non-trivial on linux-riscv64 platform. Passing t0(aka x5) as temporary register to this functions can also be error prone. As a reserved scratch register, t0 is implicitly clobberred by various assembler functions. @robehn can you help review this PR? >> This issue is used to track avoid passing t0 as a temporary register in the following cases: >> 1. avoid passing t0 as temp register to MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header. >> 2. avoid passing t0 as temp register to x_load_barrier and x_load_barrier_slow_path function in x_riscv.ad >> 3. 
avoid passing t0 as temp register to z_store_barrier and z_color function in z_riscv.ad >> >> Note that I didn't touch MacroAssembler::cmpxchg because it seems to me that this function is designed that it allows t0 to be used as the result register. As the result register will be set on exits, there should be no risk when using t0 for receiving the result. >> https://github.com/openjdk/jdk/blob/e44d4b24ed794957c47c140ab6f15544efa2b278/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L2910-L2925 >> >> ### Testing: >> - [x] Run tier1-3 tests on qemu 8.1.50 with UseRVV (release) >> - [x] Run tier1-3 tests with SiFive unmatched (release) > > Gui Cao has updated the pull request incrementally with one additional commit since the last revision: > > Add tmpReg to replace the t1 register src/hotspot/cpu/riscv/gc/x/x_riscv.ad line 78: > 76: match(Set res (WeakCompareAndSwapP mem (Binary oldval newval))); > 77: predicate(UseZGC && !ZGenerational && !needs_acquiring_load_reserved(n) && n->as_LoadStore()->barrier_data() == XLoadBarrierStrong); > 78: effect(KILL cr, TEMP_DEF res, TEMP tmp); You might want to remove `KILL cr` from `effect` at the same time if `cr` (aka `t1`) is not used anymore after this change. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1426113364 From dholmes at openjdk.org Thu Dec 14 05:51:45 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 14 Dec 2023 05:51:45 GMT Subject: RFR: 8322065: Initial nroff manpage generation for JDK 23 Message-ID: Updated the version to 23-ea and year to 2024. 
This initial generation also picks up the unpublished changes from: - [JDK-8302233](https://bugs.openjdk.org/browse/JDK-8302233) (keytool & jarsigner) - [JDK-8290702](https://bugs.openjdk.org/browse/JDK-8290702) (javadoc) (JDK 23 backport) - [JDK-8321384](https://bugs.openjdk.org/browse/JDK-8321384) (javadoc) In addition this includes the updates for - [JDK-8309981](https://bugs.openjdk.org/browse/8309981) Remove expired flags in JDK 23 Thanks ------------- Commit messages: - 8322065: Initial nroff manpage generation for JDK 23 - 8309981: Remove expired flags in JDK 23 Changes: https://git.openjdk.org/jdk/pull/17101/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17101&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322065 Stats: 216 lines in 29 files changed: 47 ins; 61 del; 108 mod Patch: https://git.openjdk.org/jdk/pull/17101.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17101/head:pull/17101 PR: https://git.openjdk.org/jdk/pull/17101 From rehn at openjdk.org Thu Dec 14 07:44:45 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 14 Dec 2023 07:44:45 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 02:52:34 GMT, Fei Yang wrote: >> Gui Cao has updated the pull request incrementally with one additional commit since the last revision: >> >> Add tmpReg to replace the t1 register > > src/hotspot/cpu/riscv/gc/x/x_riscv.ad line 78: > >> 76: match(Set res (WeakCompareAndSwapP mem (Binary oldval newval))); >> 77: predicate(UseZGC && !ZGenerational && !needs_acquiring_load_reserved(n) && n->as_LoadStore()->barrier_data() == XLoadBarrierStrong); >> 78: effect(KILL cr, TEMP_DEF res, TEMP tmp); > > You might want to remove `KILL cr` from `effect` at the same time if `cr` (aka `t1`) is not used anymore after this change. I don't *think* that matters, as masm may use t1 so this 'may' kill cr. 
But since cr never contains any actual information and is never read (as cr), it's unclear to me whether we can ever get an unwanted side effect from either missing a "KILL cr" or having one too many, you know? As we have fused cmp+branch, I would think the graph would be better off believing we always have a live CR. We don't need to re-do the cmp to get back the killed cr; we can pretend it is still alive, since the fused branch will act as if it is alive, no? Also, I have a patch which cleans up the registers in riscv.ad; I'll try to get it out as an RFC in the form of a PR (i.e. my suggestions). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1426349293 From duke at openjdk.org Thu Dec 14 07:47:02 2023 From: duke at openjdk.org (Liming Liu) Date: Thu, 14 Dec 2023 07:47:02 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v14] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` in the table below; the patch improves the time when the system call is supported and does no harm when it is not:
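| Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
|---|---|---|---|---|
| 4.18 | 11.30 | 11.30 | 0.25 | 0.25 |
| 5.13 | 0.22 | 0.22 | 3.42 | 3.42 |
| 6.1 | 0.27 | 0.33 | 3.54 | 0.33 |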
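A minimal sketch of the pretouch strategy this description refers to, assuming Linux and a hypothetical helper named `pretouch_range` (it is not the HotSpot function, and the real patch may differ): try `madvise` with `MADV_POPULATE_WRITE` first, and fall back to an atomic add of zero on one byte per page when the kernel rejects the advice, as kernels before 5.14 do. The fallback relies on the GCC/Clang `__atomic_fetch_add` builtin, which is an assumption about the toolchain.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23  // Linux uapi value; the advice exists since kernel 5.14
#endif

// Hypothetical helper for illustration only.
// Returns true when madvise populated the range, false when the
// per-page fallback was used instead.
bool pretouch_range(char* start, size_t bytes, size_t page_size) {
  // Ask the kernel to fault in the whole range as writable in one call.
  if (::madvise(start, bytes, MADV_POPULATE_WRITE) == 0) {
    return true;
  }
  // Fallback: touch one byte per page with an atomic add of zero, so the
  // page is faulted in for writing without changing its contents.
  for (size_t off = 0; off < bytes; off += page_size) {
    __atomic_fetch_add(start + off, 0, __ATOMIC_RELAXED);
  }
  return false;
}
```

Either path leaves the contents of the range unchanged, which is what allows pretouching to run while other threads may already be using the memory.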
Liming Liu has updated the pull request incrementally with one additional commit since the last revision: Try to add a testcase to cover concurrent pretouch ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/8d7af152..3de05073 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=13 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=12-13 Stats: 26 lines in 1 file changed: 26 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From duke at openjdk.org Thu Dec 14 07:59:15 2023 From: duke at openjdk.org (Liming Liu) Date: Thu, 14 Dec 2023 07:59:15 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v15] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
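| Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
|---|---|---|---|---|
| 4.18 | 11.30 | 11.30 | 0.25 | 0.25 |
| 5.13 | 0.22 | 0.22 | 3.42 | 3.42 |
| 6.1 | 0.27 | 0.33 | 3.54 | 0.33 |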
Liming Liu has updated the pull request incrementally with two additional commits since the last revision: - Use char* instead - Fix the function arguments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/3de05073..16dd9a5d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=14 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=13-14 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From fyang at openjdk.org Thu Dec 14 08:00:43 2023 From: fyang at openjdk.org (Fei Yang) Date: Thu, 14 Dec 2023 08:00:43 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 07:41:51 GMT, Robbin Ehn wrote: >> src/hotspot/cpu/riscv/gc/x/x_riscv.ad line 78: >> >>> 76: match(Set res (WeakCompareAndSwapP mem (Binary oldval newval))); >>> 77: predicate(UseZGC && !ZGenerational && !needs_acquiring_load_reserved(n) && n->as_LoadStore()->barrier_data() == XLoadBarrierStrong); >>> 78: effect(KILL cr, TEMP_DEF res, TEMP tmp); >> >> You might want to remove `KILL cr` from `effect` at the same time if `cr` (aka `t1`) is not used anymore after this change. > > I don't *think* that matters, as masm may use t1 so this 'may' kill cr. > But since cr never contains any actual information and is never read (as cr), it's unclear to me, if we ever can get unwanted side-effect if either missing a "KILL cr" or have one to many, you know? > > As we have fused cmp+branch I would think the graph would be better of believing we always have a live CR. > We don't need to re-do the cmp to get back the killed cr, we can pretend it is still alive, since the fused branch will act as it is alive, no? 
> > Also I have patch which cleans-up the registers in riscv.ad, I'll try to get it out as RFC in form of a PR. (i.e. my suggestions) > I don't _think_ that matters, as masm may use t1 so this 'may' kill cr. But since cr never contains any actual information and is never read (as cr), it's unclear to me, if we ever can get unwanted side-effect if either missing a "KILL cr" or have one to many, you know? > > As we have fused cmp+branch I would think the graph would be better of believing we always have a live CR. We don't need to re-do the cmp to get back the killed cr, we can pretend it is still alive, since the fused branch will act as it is alive, no? The C2 JIT does expect something in a CR register. Check the nodes in file riscv.ad like `cmpFastLock`, `cmpFastUnlock` and `partialSubtypeCheckVsZero` [1]. That's way we emulated the CR register on riscv using the `t1` general-purpose register. I don't see another way without changing the C2's assumption about CR. [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/riscv.ad#L10434 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1426363410 From duke at openjdk.org Thu Dec 14 08:05:24 2023 From: duke at openjdk.org (Liming Liu) Date: Thu, 14 Dec 2023 08:05:24 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v16] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. 
Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
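| Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
|---|---|---|---|---|
| 4.18 | 11.30 | 11.30 | 0.25 | 0.25 |
| 5.13 | 0.22 | 0.22 | 3.42 | 3.42 |
| 6.1 | 0.27 | 0.33 | 3.54 | 0.33 |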
Liming Liu has updated the pull request incrementally with one additional commit since the last revision: Fix the typo ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/16dd9a5d..609d3979 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=15 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=14-15 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From rehn at openjdk.org Thu Dec 14 08:05:41 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 14 Dec 2023 08:05:41 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 07:57:48 GMT, Fei Yang wrote: >> I don't *think* that matters, as masm may use t1 so this 'may' kill cr. >> But since cr never contains any actual information and is never read (as cr), it's unclear to me, if we ever can get unwanted side-effect if either missing a "KILL cr" or have one to many, you know? >> >> As we have fused cmp+branch I would think the graph would be better of believing we always have a live CR. >> We don't need to re-do the cmp to get back the killed cr, we can pretend it is still alive, since the fused branch will act as it is alive, no? >> >> Also I have patch which cleans-up the registers in riscv.ad, I'll try to get it out as RFC in form of a PR. (i.e. my suggestions) > >> I don't _think_ that matters, as masm may use t1 so this 'may' kill cr. But since cr never contains any actual information and is never read (as cr), it's unclear to me, if we ever can get unwanted side-effect if either missing a "KILL cr" or have one to many, you know? 
>> >> As we have fused cmp+branch I would think the graph would be better of believing we always have a live CR. We don't need to re-do the cmp to get back the killed cr, we can pretend it is still alive, since the fused branch will act as it is alive, no? > > The C2 JIT compiler does expect something in a CR register. Check the nodes in file riscv.ad like `cmpFastLock`, `cmpFastUnlock` and `partialSubtypeCheckVsZero` [1]. That's way we emulated the CR register on riscv using the `t1` general-purpose register. I don't see another way without changing the C2's assumption about CR. > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/riscv.ad#L10434 Sorry, I'm not saying we should remove cr, I'm just saying I don't think need to KILL cr. C2 will think cr is valid all the time. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1426367765 From duke at openjdk.org Thu Dec 14 08:12:22 2023 From: duke at openjdk.org (Liming Liu) Date: Thu, 14 Dec 2023 08:12:22 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v17] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
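| Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
|---|---|---|---|---|
| 4.18 | 11.30 | 11.30 | 0.25 | 0.25 |
| 5.13 | 0.22 | 0.22 | 3.42 | 3.42 |
| 6.1 | 0.27 | 0.33 | 3.54 | 0.33 |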
Liming Liu has updated the pull request incrementally with one additional commit since the last revision: Replace to char* when type casting ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/609d3979..e1f844f8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=16 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=15-16 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From rehn at openjdk.org Thu Dec 14 08:29:38 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 14 Dec 2023 08:29:38 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 08:02:39 GMT, Robbin Ehn wrote: >>> I don't _think_ that matters, as masm may use t1 so this 'may' kill cr. But since cr never contains any actual information and is never read (as cr), it's unclear to me, if we ever can get unwanted side-effect if either missing a "KILL cr" or have one to many, you know? >>> >>> As we have fused cmp+branch I would think the graph would be better of believing we always have a live CR. We don't need to re-do the cmp to get back the killed cr, we can pretend it is still alive, since the fused branch will act as it is alive, no? >> >> The C2 JIT compiler does expect something in a CR register. Check the nodes in file riscv.ad like `cmpFastLock`, `cmpFastUnlock` and `partialSubtypeCheckVsZero` [1]. That's way we emulated the CR register on riscv using the `t1` general-purpose register. I don't see another way without changing the C2's assumption about CR. 
>> >> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/riscv.ad#L10434 > > Sorry, I'm not saying we should remove cr, I'm just saying I don't think we need to KILL cr. > C2 will think cr is valid all the time. I tested removing all "KILL cr" from all riscv ad files. The compiler tests in c2/codegen work fine (not 100% done yet). That matches my theory: as we never consume or produce values in 'cr', we can see cr as always having correct values, which is nothing. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1426390300 From gcao at openjdk.org Thu Dec 14 08:37:01 2023 From: gcao at openjdk.org (Gui Cao) Date: Thu, 14 Dec 2023 08:37:01 GMT Subject: RFR: 8321972: test runtime/Unsafe/InternalErrorTest.java timeout on linux-riscv64 platform Message-ID: <9rOe1C_eoD2fz22nqlfzaK5kJ_gxYZBsVVRe4hQwhaw=.abf30b11-de1f-456e-baa4-208755e136ee@github.com> As described in the JBS issue, JDK-8320886 extended InternalErrorTest.java with an extra test for Unsafe_SetMemory0 that tries to access the next page after truncation. This triggers a SIGBUS error, and control flow is transferred to the JVM signal handler [1]. But the current logic doesn't consider 16-bit compressed instructions when calculating next_pc. It always adds NativeCall::instruction_size, which is 4, to pc and uses the result as next_pc. This is not correct, as the memset invoked in this case contains compressed instructions, and it is those instructions that trigger the SIGBUS error. The proposed fix is similar to what other platforms with variable-length instruction encoding, like x86, do. The encoding of the instruction triggering the SIGBUS error is checked to see whether it is a compressed instruction, and next_pc is calculated based on that. The test case now passes with this fix.
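The next_pc calculation described above can be sketched as follows. This is an illustration only, not the code from the PR: the helper names `instruction_size_at` and `next_pc_after` are hypothetical, and the sketch handles only the 16-bit and 32-bit encodings. It relies on the RISC-V rule that an instruction whose two lowest bits are not both set is a 16-bit compressed instruction.

```cpp
#include <cstdint>

// Hypothetical helper: size in bytes of the RISC-V instruction at pc.
// Only 16-bit (compressed) and 32-bit encodings are distinguished here.
inline int instruction_size_at(const uint8_t* pc) {
  // Instructions are stored little-endian, so the two lowest bits of the
  // encoding are in the first byte. 0b11 marks a 32-bit instruction.
  return ((pc[0] & 0x3) == 0x3) ? 4 : 2;
}

// Hypothetical helper: address at which to resume after skipping the
// instruction that faulted at pc.
inline const uint8_t* next_pc_after(const uint8_t* pc) {
  return pc + instruction_size_at(pc);
}
```

Adding a fixed 4 to pc after a fault in a compressed instruction would resume in the middle of the following instruction, which is the failure mode the description attributes to the old logic.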
[1] https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/linux_riscv/os_linux_riscv.cpp#L274 ### Testing: - [ ] Run tier1-3 tests on qemu 8.1.50 with UseRVV (release) ------------- Commit messages: - 8321972: test runtime/Unsafe/InternalErrorTest.java timeout on linux-riscv64 platform Changes: https://git.openjdk.org/jdk/pull/17103/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17103&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321972 Stats: 13 lines in 2 files changed: 11 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17103.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17103/head:pull/17103 PR: https://git.openjdk.org/jdk/pull/17103 From fyang at openjdk.org Thu Dec 14 08:48:38 2023 From: fyang at openjdk.org (Fei Yang) Date: Thu, 14 Dec 2023 08:48:38 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 08:26:02 GMT, Robbin Ehn wrote: >> Sorry, I'm not saying we should remove cr, I'm just saying I don't think we need to KILL cr. >> C2 will think cr is valid all the time. > > I tested removing all "KILL cr" from all riscv ad files. > CORRECTION: > Running compiler tests c2/codegen: something blows up :) > > My theory was that, as we never consume or produce values in 'cr', we can see cr as always having correct values, which is nothing. > > It seems like we need to KILL cr, so as not to confuse C2 too much :) Hi @robehn, I am still a bit confused. Are you suggesting we keep the `KILL cr` in `effect`? My first comment suggested we remove that. Of course, it will still work functionally if we simply keep it there. But I am a bit worried that this may affect performance in some way, as our CR could contain a live value in nodes, as I mentioned in my previous comment. A `KILL cr` in `effect`, as in this case, may pose an unnecessary constraint on C2 code motion. That's just my concern.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1426410697 From alanb at openjdk.org Thu Dec 14 09:03:38 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 14 Dec 2023 09:03:38 GMT Subject: RFR: 8322065: Initial nroff manpage generation for JDK 23 In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 05:46:01 GMT, David Holmes wrote: > Updated the version to 23-ea and year to 2024. > > This initial generation also picks up the unpublished changes from: > > - [JDK-8302233](https://bugs.openjdk.org/browse/JDK-8302233) (keytool & jarsigner) > - [JDK-8290702](https://bugs.openjdk.org/browse/JDK-8290702) (javadoc) (JDK 23 backport) > - [JDK-8321384](https://bugs.openjdk.org/browse/JDK-8321384) (javadoc) > > > In addition this includes the updates for > > - [JDK-8309981](https://bugs.openjdk.org/browse/8309981) Remove expired flags in JDK 23 > > Thanks Initially I wondered if JDK-8309981 should be separated but include keeps things in sync so I think okay. ------------- Marked as reviewed by alanb (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17101#pullrequestreview-1781343785 From prappo at openjdk.org Thu Dec 14 09:19:37 2023 From: prappo at openjdk.org (Pavel Rappo) Date: Thu, 14 Dec 2023 09:19:37 GMT Subject: RFR: 8322065: Initial nroff manpage generation for JDK 23 In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 05:46:01 GMT, David Holmes wrote: > Updated the version to 23-ea and year to 2024. 
> > This initial generation also picks up the unpublished changes from: > > - [JDK-8302233](https://bugs.openjdk.org/browse/JDK-8302233) (keytool & jarsigner) > - [JDK-8290702](https://bugs.openjdk.org/browse/JDK-8290702) (javadoc) (JDK 23 backport) > - [JDK-8321384](https://bugs.openjdk.org/browse/JDK-8321384) (javadoc) > > > In addition this includes the updates for > > - [JDK-8309981](https://bugs.openjdk.org/browse/8309981) Remove expired flags in JDK 23 > > Thanks > Updated the version to 23-ea and year to 2024. > > This initial generation also picks up the unpublished changes from: > > * [JDK-8321384](https://bugs.openjdk.org/browse/JDK-8321384) (javadoc) Thanks for doing this, David. I only note that the changes for JDK-8321384 were published in [JDK-8308715](https://bugs.openjdk.org/browse/JDK-8308715), which was integrated in the mainline before JDK 22 RDP 1. So they are already present in the mainline. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17101#issuecomment-1855467435 From rehn at openjdk.org Thu Dec 14 09:44:39 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 14 Dec 2023 09:44:39 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 08:45:34 GMT, Fei Yang wrote: >> I tested removing all "KILL cr" from all riscv ad files. >> CORRECTION: >> Running compiler tests c2/codegen: something blow up :) >> >> My theory was, as we never consume or produce values in 'cr' we can see cr as always having correct values, which is nothing. >> >> It seem like we need to KILL cr, not to confuse C2 to much :) > > Hi @robehn, I am still a bit confused. Are you suggesting we keep the `KILL cr` in `effect`? My first comment is suggesting we remove that. Of couse, it will still work in functionality if we simply keep it there. 
But I am a bit worried that this may affect performance in some way, as our CR could contain a live value in nodes, as I mentioned in my previous comment. A `KILL cr` in `effect`, as in this case, may pose an unnecessary constraint on C2 code motion. That's just my concern. This is not related to this bug (yes, we can remove KILL cr): Ah, this is very problematic, as t1 is used by masm, which means calling masm may destroy cr. For example, C2 may call: `MacroAssembler::zero_words(Register ptr, Register cnt)`, as it does not clobber t1, but if we would call: `MacroAssembler::zero_words(Register base, uint64_t cnt)`, which does clobber t1, suddenly we need to add KILL cr. Which means if someone just makes a small change like: - masm->zero_words(reg1, reg2); + masm->zero_words(reg1, 64); we can suddenly get very subtle bugs. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1426475012 From fyang at openjdk.org Thu Dec 14 09:53:38 2023 From: fyang at openjdk.org (Fei Yang) Date: Thu, 14 Dec 2023 09:53:38 GMT Subject: RFR: 8321972: test runtime/Unsafe/InternalErrorTest.java timeout on linux-riscv64 platform In-Reply-To: <9rOe1C_eoD2fz22nqlfzaK5kJ_gxYZBsVVRe4hQwhaw=.abf30b11-de1f-456e-baa4-208755e136ee@github.com> References: <9rOe1C_eoD2fz22nqlfzaK5kJ_gxYZBsVVRe4hQwhaw=.abf30b11-de1f-456e-baa4-208755e136ee@github.com> Message-ID: On Thu, 14 Dec 2023 08:28:42 GMT, Gui Cao wrote: > As described in the JBS issue, JDK-8320886 extended InternalErrorTest.java with an extra test for Unsafe_SetMemory0 that tries to access the next page after truncation. This triggers a SIGBUS error, and control flow is transferred to the JVM signal handler [1]. But the current logic doesn't consider 16-bit compressed instructions when calculating next_pc. It always adds NativeCall::instruction_size, which is 4, to pc and uses the result as next_pc.
This is not correct as the memset invoked in this case contains compressed instructions and it is those instructions that are triggering the SIGBUS error. > > The proposed fix is similar with other platform with variable-length instruction encoding like x86. > The encoding of the instruction triggering the SIGBUS error is checked to see if it is a compressed instruction and then calculate next_pc based on that. The test case can now pass normally with this fix. > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/linux_riscv/os_linux_riscv.cpp#L274 > > ### Testing: > - [ ] Run tier1-3 tests on qemu 8.1.50 with UseRVV (release) That make sense to me. I find that the native GNU compiler toolchain on both my unmatched and licheepi-4a boards are compiling with RVC by default which means native JDK builds on those hardware platforms will also have compressed instructions. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17103#pullrequestreview-1781437272 From fyang at openjdk.org Thu Dec 14 10:09:38 2023 From: fyang at openjdk.org (Fei Yang) Date: Thu, 14 Dec 2023 10:09:38 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: References: Message-ID: <5qbvK3wdtVka-kI4XsN7Q40R2K2_6Uppz_pPsN7-834=.4a1c4819-dc1c-4e1d-8edb-c8e9112fb4dd@github.com> On Thu, 14 Dec 2023 09:41:32 GMT, Robbin Ehn wrote: >> Hi @robehn, I am still a bit confused. Are you suggesting we keep the `KILL cr` in `effect`? My first comment is suggesting we remove that. Of couse, it will still work in functionality if we simply keep it there. But I am a bit worried that this may affect performance in some way as our CR could contain a live value in nodes as I mentioned in my previous comment. A `KILL cr` in `effect` as in this case may pose some unnecessary constraint on C2 code motion. That's just my concern. 
> > This is not related to this enhancement (yes, we can remove KILL cr): > > Ah, this is very problematic, as t1 is used by masm, which means calling masm may destroy cr. > > For example, C2 may call: > `MacroAssembler::zero_words(Register ptr, Register cnt)`, as it does not clobber t1, but if we would call: > `MacroAssembler::zero_words(Register base, uint64_t cnt)`, which does clobber t1, suddenly we need to add KILL cr. > > Which means if someone just makes a small change like: > > - masm->zero_words(reg1, reg2); > + masm->zero_words(reg1, 64); > > > we can suddenly get very subtle bugs. Yeah, I agree it is error-prone. But it is the same for the other platforms which have a real CR register, isn't it? They should have the same issue. I guess it is not worth keeping `t1` as a dedicated CR register that is not touched elsewhere by other masm assemblers. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1426505395 From rehn at openjdk.org Thu Dec 14 10:29:37 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 14 Dec 2023 10:29:37 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: <5qbvK3wdtVka-kI4XsN7Q40R2K2_6Uppz_pPsN7-834=.4a1c4819-dc1c-4e1d-8edb-c8e9112fb4dd@github.com> References: <5qbvK3wdtVka-kI4XsN7Q40R2K2_6Uppz_pPsN7-834=.4a1c4819-dc1c-4e1d-8edb-c8e9112fb4dd@github.com> Message-ID: On Thu, 14 Dec 2023 10:06:39 GMT, Fei Yang wrote: >> This is not related to this enhancement (yes, we can remove KILL cr): >> >> Ah, this is very problematic, as t1 is used by masm, which means calling masm may destroy cr. >> >> For example, C2 may call: >> `MacroAssembler::zero_words(Register ptr, Register cnt)`, as it does not clobber t1, but if we would call: >> `MacroAssembler::zero_words(Register base, uint64_t cnt)`, which does clobber t1, suddenly we need to add KILL cr.
>> >> Which means if someone just makes a small change like: >> >> - masm->zero_words(reg1, reg2); >> + masm->zero_words(reg1, 64); >> >> >> we can suddenly get very subtle bugs. > > Yeah, I agree it is error-prone. But it is the same for the other platforms which have a real CR register, isn't it? They should have the same issue. I guess it is not worth keeping `t1` as a dedicated CR register that is not touched elsewhere by other masm assemblers. I think we can pretty easily add a scoped object, similar to the uncompressed region, where we can white-list registers. So the ad file would add, for example, to fast_lock: { AllowAdditionalRegister(t1); // As we kill cr fastlock may use t1/(cr) in addition to t0 and passed in regs. ___ fastlock(...); } ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1426526889 From rehn at openjdk.org Thu Dec 14 10:29:38 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 14 Dec 2023 10:29:38 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: References: <5qbvK3wdtVka-kI4XsN7Q40R2K2_6Uppz_pPsN7-834=.4a1c4819-dc1c-4e1d-8edb-c8e9112fb4dd@github.com> Message-ID: On Thu, 14 Dec 2023 10:24:42 GMT, Robbin Ehn wrote: >> Yeah, I agree it is error-prone. But it is the same for the other platforms which have a real CR register, isn't it? They should have the same issue. I guess it is not worth keeping `t1` as a dedicated CR register that is not touched elsewhere by other masm assemblers. > > I think we can pretty easily add a scoped object, similar to the uncompressed region, where we can white-list registers. > > So the ad file would add, for example, to fast_lock: > > > { > AllowAdditionalRegister(t1); // As we kill cr fastlock may use t1/(cr) in addition to t0 and passed in regs. > ___ fastlock(...); > } Anyhow, something to think about.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1426529344 From jsjolen at openjdk.org Thu Dec 14 11:16:47 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Thu, 14 Dec 2023 11:16:47 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v17] In-Reply-To: References: Message-ID: <-yGcrNxBa91rrdyLb4zNbgz_VRuht7MXBpnel_-WWxg=.6eec01fb-03e7-42d4-b07c-d5617f34bdc2@github.com> On Thu, 14 Dec 2023 08:12:22 GMT, Liming Liu wrote: >> As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). >> >> Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: >> >> >> >> >> >> >> >> >> >> >> >>
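>> | Kernel | -XX:-TransparentHugePages, Unpatched | -XX:-TransparentHugePages, Patched | -XX:+TransparentHugePages, Unpatched | -XX:+TransparentHugePages, Patched |
>> |---|---|---|---|---|
>> | 4.18 | 11.30 | 11.30 | 0.25 | 0.25 |
>> | 5.13 | 0.22 | 0.22 | 3.42 | 3.42 |
>> | 6.1 | 0.27 | 0.33 | 3.54 | 0.33 |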
> > Liming Liu has updated the pull request incrementally with one additional commit since the last revision: > > Replace to char* when type casting test/hotspot/gtest/runtime/test_os_linux.cpp line 377: > 375: EXPECT_TRUE(os::release_memory(heap, 1 * G)); > 376: UseTransparentHugePages = useThp; > 377: } This seems like it's concurrently running `madvise(..., MADV_POPULATE_WRITE)`, correct? This is not what I meant. What I meant was having at least 2 threads, where one thread is running `os::pretouch_memory` and another using the memory for something. For example, 1 thread pretouching, the other thread filling out the memory with an incrementing integer array `[0,1,2,3,4,...]`. I think this is what Kim meant also, or am I the one misunderstanding him? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1426582129 From fyang at openjdk.org Thu Dec 14 11:34:39 2023 From: fyang at openjdk.org (Fei Yang) Date: Thu, 14 Dec 2023 11:34:39 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: References: <5qbvK3wdtVka-kI4XsN7Q40R2K2_6Uppz_pPsN7-834=.4a1c4819-dc1c-4e1d-8edb-c8e9112fb4dd@github.com> Message-ID: <448jme9oP8X7BtNKBAOUkkN1bJZy1mkWdZDQot1i_t4=.6cfc3581-2712-4a25-a527-177f462031d9@github.com> On Thu, 14 Dec 2023 10:26:46 GMT, Robbin Ehn wrote: >> I think we can pretty easy add a scoped object, similar to uncompressed region, where we can white-list registers. >> >> So the ad file would add for example to fast_lock: >> >> >> { >> AllowAdditionalRegister(t1); // As we kill cr fastlock may use t1/(cr) in addition to t0 and passed in regs. >> ___ fastlock(...); >> } > > Anyhow something to think about. Your proposal sounds interesting. It's like a way of self-checking/assertion about `t1` register usage. The masm could stop when it sees usage of `t1` in a scope without an `AllowAdditionalRegister(t1)`. 
But I am not sure if there is a good place for us to add checking for that purpose. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1426599022 From sspitsyn at openjdk.org Thu Dec 14 12:14:40 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 14 Dec 2023 12:14:40 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3] In-Reply-To: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> References: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> Message-ID: On Fri, 8 Dec 2023 11:54:40 GMT, Serguei Spitsyn wrote: >> This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame. >> It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. >> The deadlocking scenario is well described by Patricio in a bug report comment. >> In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. >> >> The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about begin/end of critical section. >> This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. >> >> Some of new notifications with `notifyJvmtiSync()` method is on a performance critical path. It is why this method has been intrincified. >> >> New test was developed by Patricio: >> `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> The test is very nice as it reliably in 100% reproduces the deadlock without the fix. >> The test is never failing with this fix. 
>> >> Testing: >> - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: (1) rename notifyJvmti method; (2) add try-final statements to VirtualThread methods @AlanBateman Thank you for reviewing an the comment. > It shouldn't be necessary to touch mount/unmount as the thread identity is the carrier, not the virtual thread, when executing the "critical code". Carrier thread also can be suspended when executing the "critical code". Why do you think it can't be a problem? Do you think the deadlocking scenario described in the bug report is not possible? > toggle_is_in_critical_section needs to detect reentrancy, it is otherwise too easy to refactor the Java code, e.g. call threadState while holding the interrupt lock. Is your concern a recursive `interruptLock` enter? I was also thinking if this scenario is possible, so a counter can be used instead of boolean. > All the use-sides will need to use try-finally to more reliably revert the critical section flag when rewinding. Right, thanks. It is already done. > The naming is very problematic, we'll need to replace with methods that are clearly named enter and exit critical section. Ongoing work in this area to support monitors has to introduce some temporary pinning so there will be enter/exitCriticalSection methods, that's a better place for the JVMTI hooks. Okay. What about the Leonid's suggestion to name it `notifyJvmtiDisableSuspend()` ? 
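One way the counter idea mentioned above could look, as a sketch only. `ThreadState` and `CriticalSectionMark` are hypothetical names, not HotSpot code, and the sketch assumes the counter is touched only by the owning thread, so it uses no synchronization.

```cpp
#include <cassert>

// Hypothetical per-thread state; stands in for a flag such as
// _is_in_critical_section, but as a depth counter instead of a bool.
class ThreadState {
 public:
  void enter_critical_section() { ++_critical_depth; }

  void exit_critical_section() {
    // An unbalanced exit is a programming error; catch it in debug builds.
    assert(_critical_depth > 0 && "exit without matching enter");
    --_critical_depth;
  }

  bool is_in_critical_section() const { return _critical_depth > 0; }
  int critical_depth() const { return _critical_depth; }

 private:
  int _critical_depth = 0;  // assumed to be accessed by the owning thread only
};

// RAII guard: the C++ analogue of wrapping the section in try/finally,
// so the counter is reverted on every exit path.
class CriticalSectionMark {
 public:
  explicit CriticalSectionMark(ThreadState& ts) : _ts(ts) {
    _ts.enter_critical_section();
  }
  ~CriticalSectionMark() { _ts.exit_critical_section(); }

  CriticalSectionMark(const CriticalSectionMark&) = delete;
  CriticalSectionMark& operator=(const CriticalSectionMark&) = delete;

 private:
  ThreadState& _ts;
};
```

With a boolean toggle, a nested enter followed by the inner exit would clear the flag while the outer section is still active; the counter keeps `is_in_critical_section()` true until the outermost exit. If nesting is meant to be forbidden instead, the same structure could assert `_critical_depth == 0` on entry.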
------------- PR Comment: https://git.openjdk.org/jdk/pull/17011#issuecomment-1855730274 From sspitsyn at openjdk.org Thu Dec 14 12:14:45 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 14 Dec 2023 12:14:45 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3] In-Reply-To: References: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> Message-ID: On Tue, 12 Dec 2023 23:42:07 GMT, Leonid Mesnik wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: (1) rename notifyJvmti method; (2) add try-final statements to VirtualThread methods > > src/java.base/share/classes/java/lang/VirtualThread.java line 1164: > >> 1162: >> 1163: @IntrinsicCandidate >> 1164: private native void notifyJvmtiCriticalLock(boolean enter); > > The name is confusing to me, the CriticalLock looks like it is the section is critical and might be taken by a single thread only. Or it's just unclear what is critical here. > However, the purpose is to disable suspend > Wouldn't be 'notifyJvmtiSuspendLock notifyJvmtiDisableSuspend' better name here? > or comment what critical means here. Okay, thanks. I like your name suggestion but let's check with Alan first. > test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock/SuspendWithInterruptLock.java line 30: > >> 28: * @requires vm.continuations >> 29: * @library /testlibrary >> 30: * @run main/othervm -Xint SuspendWithInterruptLock > > Doesn't it make sense to add a testcase without -Xint also? Just to give stress testing with compilation. Thanks. I was also thinking about this. Will add a sub-test without -Xint. 
> test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock/SuspendWithInterruptLock.java line 36: > >> 34: >> 35: public class SuspendWithInterruptLock { >> 36: static boolean done; > > done is accessed from different threads, should be volatile. Good suggestion, thanks. > test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock/SuspendWithInterruptLock.java line 54: > >> 52: Thread.yield(); >> 53: } >> 54: done = true; > > I think it is better to use done to stop all threads and set it to true in the main thread after some time. So you could be sure that the yielder hadn't been completed before the suspender started. But it is just proposal. Thank you. Will consider this. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1426638981 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1426635613 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1426636196 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1426637200 From sspitsyn at openjdk.org Thu Dec 14 12:19:43 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 14 Dec 2023 12:19:43 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3] In-Reply-To: References: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> Message-ID: On Tue, 12 Dec 2023 23:54:43 GMT, Leonid Mesnik wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: (1) rename notifyJvmti method; (2) add try-final statements to VirtualThread methods > > src/hotspot/share/prims/jvm.cpp line 4013: > >> 4011: // Notification from VirtualThread about entering/exiting sync critical section. >> 4012: // Needed to avoid deadlocks with JVMTI suspend mechanism. 
>> 4013: JVM_ENTRY(void, JVM_VirtualThreadCriticalLock(JNIEnv* env, jobject vthread, jboolean enter)) > > the jobject vthread is not used. Can't be the method made static to reduce the number of arguments? > It is the performance-critical code, I don't know if it is optimized by C2. Good question. In general, I'd like to keep this unified with the other `notiftJvmti` methods. Let me double check how it fits together. Also, I'm not sure how is going to impact the intrinsification. > src/hotspot/share/runtime/javaThread.hpp line 320: > >> 318: bool _is_in_VTMS_transition; // thread is in virtual thread mount state transition >> 319: bool _is_in_tmp_VTMS_transition; // thread is in temporary virtual thread mount state transition >> 320: bool _is_in_critical_section; // thread is in a locking critical section > > might make sense to add a comment, that his variable Is changed/read only by current thread and no sync is needed. Good suggestion, thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1426643218 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1426643663 From alanb at openjdk.org Thu Dec 14 12:22:40 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 14 Dec 2023 12:22:40 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3] In-Reply-To: References: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> Message-ID: On Thu, 14 Dec 2023 12:06:41 GMT, Serguei Spitsyn wrote: > Carrier thread also can be suspended when executing the "critical code". Why do you think it can't be a problem? Do you think the deadlocking scenario described in the bug report is not possible? It's a different scenario. When mounting, the coordination of the interrupt status is done before the thread identity is changed. Similarly, when unmounting, the coordination is done after reverting the thread identity to the carrier. 
So if there is an agent randomly suspending threads, then it shouldn't be an issue here. > > toggle_is_in_critical_section needs to detect reentrancy, it is otherwise too easy to refactor the Java code, e.g. call threadState while holding the interrupt lock. > > Is your concern a recursive `interruptLock` enter? I was also thinking if this scenario is possible, so a counter can be used instead of boolean. Minimally an assert. A counter might be needed later. > Okay. What about the Leonid's suggestion to name it `notifyJvmtiDisableSuspend()` ? We have changes in the works that require pinning during some critical sections so I think I prefer to use that terminology. We can move the notification to JVMTI to the enter/leave methods. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17011#issuecomment-1855748841 From dholmes at openjdk.org Thu Dec 14 12:26:40 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 14 Dec 2023 12:26:40 GMT Subject: RFR: 8322065: Initial nroff manpage generation for JDK 23 In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 09:17:05 GMT, Pavel Rappo wrote: > Thanks for doing this, David. I only note that the changes for JDK-8321384 were published in [JDK-8308715](https://bugs.openjdk.org/browse/JDK-8308715), which was integrated in the mainline before JDK 22 RDP 1. So they are already present in the mainline. Ah I see. Thanks for correcting that, I will update the PR and JBS issue. And thanks for looking at this @pavelrappo . ------------- PR Comment: https://git.openjdk.org/jdk/pull/17101#issuecomment-1855755042 From dholmes at openjdk.org Thu Dec 14 12:30:38 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 14 Dec 2023 12:30:38 GMT Subject: RFR: 8322065: Initial nroff manpage generation for JDK 23 In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 09:01:17 GMT, Alan Bateman wrote: > Initially I wondered if JDK-8309981 should be separated but include keeps things in sync so I think okay.
Thanks for the review @AlanBateman . Yeah I was in two minds there myself. I started fixing [JDK-8309981](https://bugs.openjdk.org/browse/JDK-8309981) only to discover that the start of release updates had not been done as part of the start of release, so I figured I may as well fix it all together given I'd generated all the updated files anyway. But I'm still a little unsure ... in fact I think I will remove it in the morning. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17101#issuecomment-1855761906 From dholmes at openjdk.org Thu Dec 14 12:35:52 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 14 Dec 2023 12:35:52 GMT Subject: RFR: 8322065: Initial nroff manpage generation for JDK 23 [v2] In-Reply-To: References: Message-ID: > Updated the version to 23-ea and year to 2024. > > This initial generation also picks up the unpublished changes from: > > - [JDK-8302233](https://bugs.openjdk.org/browse/JDK-8302233) (keytool & jarsigner) > - [JDK-8290702](https://bugs.openjdk.org/browse/JDK-8290702) (javadoc) (JDK 23 backport) > > Thanks David Holmes has updated the pull request incrementally with one additional commit since the last revision: Revert "8309981: Remove expired flags in JDK 23" This reverts commit 0324a90e936ae01e42ae099e7235156326cc318a. 
------------- Changes: - all: https://git.openjdk.org/jdk/pull/17101/files - new: https://git.openjdk.org/jdk/pull/17101/files/65a8c9ed..8b052141 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17101&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17101&range=00-01 Stats: 23 lines in 2 files changed: 10 ins; 11 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17101.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17101/head:pull/17101 PR: https://git.openjdk.org/jdk/pull/17101 From dholmes at openjdk.org Thu Dec 14 12:43:46 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 14 Dec 2023 12:43:46 GMT Subject: RFR: 8309981: Remove expired flags in JDK 23 Message-ID: All expired flags are removed from the flags table. The two documented flags have their documentation moved from the obsolete section to the removed section. The link to JDK 22 removed flags is added to the list. Thanks. ------------- Commit messages: - Revert "8322065: Initial nroff manpage generation for JDK 23" - Revert "Revert "8309981: Remove expired flags in JDK 23"" - Revert "8309981: Remove expired flags in JDK 23" - 8322065: Initial nroff manpage generation for JDK 23 - 8309981: Remove expired flags in JDK 23 Changes: https://git.openjdk.org/jdk/pull/17107/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17107&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8309981 Stats: 23 lines in 2 files changed: 11 ins; 10 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17107.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17107/head:pull/17107 PR: https://git.openjdk.org/jdk/pull/17107 From luhenry at openjdk.org Thu Dec 14 13:34:50 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Thu, 14 Dec 2023 13:34:50 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v2] In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 19:09:46 GMT, Hamlin Li wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. 
The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains four additional commits since the last revision: >> >> - Flag fixes >> - Merge branch 'master' into sha256 >> - Share code >> - SHA-2 > > src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 1359: > >> 1357: } >> 1358: >> 1359: inline void vmsltu_vi(VectorRegister Vd, VectorRegister Vs2, int32_t imm, VectorMask vm = unmasked) { > > Seems this function is not used in the code? > And, when `imm` == 0, seems it will output unexpected value? We can keep it as it's a good complement to existing methods and the cost is very low. Good point on `imm == 0`, we should probably have an assert for that case or have a `splat(false)` operation. > src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 1363: > >> 1361: } >> 1362: >> 1363: inline void vmsgeu_vi(VectorRegister Vd, VectorRegister Vs2, int32_t imm, VectorMask vm = unmasked) { > > Same comments as `vmsltu_vi ` above. It's used at https://github.com/openjdk/jdk/pull/16562/files#diff-97f199af6d1c8c17b2fa4f50eb1bbc0081858cc59a899f32792a2d31f933ccc4R3945. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1426722994 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1426721667 From gcao at openjdk.org Thu Dec 14 14:35:57 2023 From: gcao at openjdk.org (Gui Cao) Date: Thu, 14 Dec 2023 14:35:57 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v3] In-Reply-To: References: Message-ID: > MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header is non-trivial on linux-riscv64 platform. Passing t0(aka x5) as temporary register to this functions can also be error prone. As a reserved scratch register, t0 is implicitly clobberred by various assembler functions. @robehn can you help review this PR? > This issue is used to track avoid passing t0 as a temporary register in the following cases: > 1. 
avoid passing t0 as temp register to MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header. > 2. avoid passing t0 as temp register to x_load_barrier and x_load_barrier_slow_path function in x_riscv.ad > 3. avoid passing t0 as temp register to z_store_barrier and z_color function in z_riscv.ad > > Note that I didn't touch MacroAssembler::cmpxchg because it seems to me that this function is designed that it allows t0 to be used as the result register. As the result register will be set on exits, there should be no risk when using t0 for receiving the result. > https://github.com/openjdk/jdk/blob/e44d4b24ed794957c47c140ab6f15544efa2b278/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L2910-L2925 > > ### Testing: > - [x] Run tier1-3 tests on qemu 8.1.50 with UseRVV (release) > - [x] Run tier1-3 tests with SiFive unmatched (release) Gui Cao has updated the pull request incrementally with one additional commit since the last revision: Remove unneeded kill cr ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16880/files - new: https://git.openjdk.org/jdk/pull/16880/files/a97d3627..cffb639a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16880&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16880&range=01-02 Stats: 46 lines in 2 files changed: 4 ins; 0 del; 42 mod Patch: https://git.openjdk.org/jdk/pull/16880.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16880/head:pull/16880 PR: https://git.openjdk.org/jdk/pull/16880 From gcao at openjdk.org Thu Dec 14 14:49:38 2023 From: gcao at openjdk.org (Gui Cao) Date: Thu, 14 Dec 2023 14:49:38 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v2] In-Reply-To: References: <5qbvK3wdtVka-kI4XsN7Q40R2K2_6Uppz_pPsN7-834=.4a1c4819-dc1c-4e1d-8edb-c8e9112fb4dd@github.com> Message-ID: On Thu, 14 Dec 2023 10:26:46 GMT, Robbin Ehn wrote: >> I think we can pretty easy add a scoped object, similar to uncompressed region, where we 
can white-list registers. >> >> So the ad file would add for example to fast_lock: >> >> >> { >> AllowAdditionalRegister(t1); // As we kill cr fastlock may use t1/(cr) in addition to t0 and passed in regs. >> ___ fastlock(...); >> } > > Anyhow something to think about. @robehn @RealFYang I have done a check and no cr is being used and have removed the extra kill cr. please take another look, thanks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16880#discussion_r1426817191 From rehn at openjdk.org Thu Dec 14 14:52:44 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 14 Dec 2023 14:52:44 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v3] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 14:35:57 GMT, Gui Cao wrote: >> MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header is non-trivial on linux-riscv64 platform. Passing t0(aka x5) as temporary register to this functions can also be error prone. As a reserved scratch register, t0 is implicitly clobberred by various assembler functions. @robehn can you help review this PR? >> This issue is used to track avoid passing t0 as a temporary register in the following cases: >> 1. avoid passing t0 as temp register to MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header. >> 2. avoid passing t0 as temp register to x_load_barrier and x_load_barrier_slow_path function in x_riscv.ad >> 3. avoid passing t0 as temp register to z_store_barrier and z_color function in z_riscv.ad >> >> Note that I didn't touch MacroAssembler::cmpxchg because it seems to me that this function is designed that it allows t0 to be used as the result register. As the result register will be set on exits, there should be no risk when using t0 for receiving the result. 
>> https://github.com/openjdk/jdk/blob/e44d4b24ed794957c47c140ab6f15544efa2b278/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L2910-L2925 >> >> ### Testing: >> - [x] Run tier1-3 tests on qemu 8.1.50 with UseRVV (release) >> - [x] Run tier1-3 tests with SiFive unmatched (release) > > Gui Cao has updated the pull request incrementally with one additional commit since the last revision: > > Remove unneeded kill cr Thank again! ------------- Marked as reviewed by rehn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16880#pullrequestreview-1781986614 From rehn at openjdk.org Thu Dec 14 15:11:02 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 14 Dec 2023 15:11:02 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v4] In-Reply-To: References: Message-ID: > Hi, please consider. > > Main author is @luhenry, I only fixed some minor things and tested it. > > Such as: > test/hotspot/jtreg/compiler/intrinsics/sha/ > test/jdk/java/security/MessageDigest/ > test/jdk/jdk/security/ > tier1 > > And still running some test. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains seven additional commits since the last revision: - Merge branch 'master' into sha256 - Materialize constants address once - Removed template - Flag fixes - Merge branch 'master' into sha256 - Share code - SHA-2 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16562/files - new: https://git.openjdk.org/jdk/pull/16562/files/d5048756..fdb17d1b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=02-03 Stats: 109037 lines in 2172 files changed: 58482 ins; 42477 del; 8078 mod Patch: https://git.openjdk.org/jdk/pull/16562.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16562/head:pull/16562 PR: https://git.openjdk.org/jdk/pull/16562 From dchuyko at openjdk.org Thu Dec 14 15:29:06 2023 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 14 Dec 2023 15:29:06 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v15] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. 
For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > Natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method letting compile broker to perform it taking new directives stack into account. Re-compilation helps to prevent hot methods from execution in the interpreter. > > A new flag `-r` has beed introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 33 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - ... and 23 more: https://git.openjdk.org/jdk/compare/fde5b168...44d680cd ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=14 Stats: 372 lines in 15 files changed: 339 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From mbaesken at openjdk.org Thu Dec 14 15:32:40 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 14 Dec 2023 15:32:40 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 09:05:22 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. 
>> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust macOS coding the description of ` -Xcheck:jni ` says https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/clopts002.html "The -Xcheck:jni Option This option is useful in diagnosing problems with applications that use the Java Native Interface (JNI). Sometimes bugs in the native code can cause the HotSpot VM to crash or behave incorrectly." Not sure if this IEEE conformance issue fits perfectly well into what is said above, but looking at the discussion in this PR, it seems to be controversial . ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1856061793 From sroy at openjdk.org Thu Dec 14 16:21:40 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Thu, 14 Dec 2023 16:21:40 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v3] In-Reply-To: <7V0zHrWeOjnDyHJuq3DFsb-BvaQvZbwE5zIGyxWvGNE=.48a0fc72-c70d-4e21-891a-5f4714bac830@github.com> References: <7V0zHrWeOjnDyHJuq3DFsb-BvaQvZbwE5zIGyxWvGNE=.48a0fc72-c70d-4e21-891a-5f4714bac830@github.com> Message-ID: On Tue, 5 Dec 2023 13:48:11 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> encapsulate everything in os::Aix::dlopen > > Excellent, this is how I have pictured a good solution. Very nice. > > A number of remarks, but nothing fundamental. @tstuefe Sorry to tag you. Can you review the code. Once this code goes in I can push in my changes. We are targeting the fix for January. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16920#issuecomment-1856145815 From alanb at openjdk.org Thu Dec 14 16:28:38 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 14 Dec 2023 16:28:38 GMT Subject: RFR: 8309981: Remove expired flags in JDK 23 In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 12:38:48 GMT, David Holmes wrote: > All expired flags are removed from the flags table. > > The two documented flags have their documentation moved from the obsolete section to the removed section. > > The link to JDK 22 removed flags is added to the list. > > Thanks. Marked as reviewed by alanb (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17107#pullrequestreview-1782197847 From sspitsyn at openjdk.org Thu Dec 14 16:59:40 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 14 Dec 2023 16:59:40 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3] In-Reply-To: References: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> Message-ID: On Thu, 14 Dec 2023 12:11:42 GMT, Serguei Spitsyn wrote: >> src/java.base/share/classes/java/lang/VirtualThread.java line 1164: >> >>> 1162: >>> 1163: @IntrinsicCandidate >>> 1164: private native void notifyJvmtiCriticalLock(boolean enter); >> >> The name is confusing to me: CriticalLock reads as if the section is critical and might be taken by a single thread only. Or it's just unclear what is critical here. >> However, the purpose is to disable suspend. >> Wouldn't 'notifyJvmtiSuspendLock' or 'notifyJvmtiDisableSuspend' be a better name here? >> Or add a comment on what critical means here. > > Okay, thanks. I like your name suggestion but let's check with Alan first. Implemented this renaming suggestion. Let's wait to see if Alan is okay with it.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1426990736 From alanb at openjdk.org Thu Dec 14 17:08:40 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 14 Dec 2023 17:08:40 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3] In-Reply-To: References: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> Message-ID: On Thu, 14 Dec 2023 16:57:25 GMT, Serguei Spitsyn wrote: > Implemented this renaming suggestion. Let's wait to see if Alan is okay with it. Are you planning to drop the changes to mount/unmount too? They shouldn't be needed. notifyJvmtiCriticalLock(boolean) is okay for now but needs to be called before the try, not in the block. We have changes coming that will require moving these hooks to critical section enter/exit methods, so the naming will be less important then. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427000950 From sspitsyn at openjdk.org Thu Dec 14 17:30:54 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 14 Dec 2023 17:30:54 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v4] In-Reply-To: References: Message-ID: > This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame. > It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. > The deadlocking scenario is well described by Patricio in a bug report comment. > In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections.
> > The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about begin/end of critical section. > This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. > > Some of the new notifications with the `notifyJvmtiSync()` method are on a performance-critical path. That is why this method has been intrinsified. > > A new test was developed by Patricio: > `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > The test is very nice as it reliably reproduces the deadlock 100% of the time without the fix. > The test never fails with this fix. > > Testing: > - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: 1) replace CriticalLock with DisableSuspend; 2) minor tweaks ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17011/files - new: https://git.openjdk.org/jdk/pull/17011/files/18f1752e..4e5f6447 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17011&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17011&range=02-03 Stats: 68 lines in 14 files changed: 9 ins; 10 del; 49 mod Patch: https://git.openjdk.org/jdk/pull/17011.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17011/head:pull/17011 PR: https://git.openjdk.org/jdk/pull/17011 From jnimeh at openjdk.org Thu Dec 14 17:33:50 2023 From: jnimeh at openjdk.org (Jamil Nimeh) Date: Thu, 14 Dec 2023 17:33:50 GMT Subject: [jdk22] RFR: 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes Message-ID: This is the JDK 22 backport of JDK-8321542 ------------- Commit messages: - Backport 5718039a46ae51fa9b7042fe7163e3637e981b05 Changes: https://git.openjdk.org/jdk22/pull/14/files Webrev:
https://webrevs.openjdk.org/?repo=jdk22&pr=14&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8321542 Stats: 8 lines in 1 file changed: 8 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk22/pull/14.diff Fetch: git fetch https://git.openjdk.org/jdk22.git pull/14/head:pull/14 PR: https://git.openjdk.org/jdk22/pull/14 From sspitsyn at openjdk.org Thu Dec 14 17:37:39 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 14 Dec 2023 17:37:39 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3] In-Reply-To: References: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> Message-ID: On Thu, 14 Dec 2023 17:06:05 GMT, Alan Bateman wrote: >> Implemented this renaming suggestion. Let's wait to see if Alan is okay with it. > >> Implemented this renaming suggestion. Let's wait to see if Alan is okay with it. > > Are you planning to drop the changes to mount/unmount too? They shouldn't be needed. > > notifyJvmtiCriticalLock(boolean) is okay for now but needs to be called before the try, not in the block. We have changes coming that will require moving these hooks to critical section enter/exit methods, so the naming will be less important then. Yes, I've dropped the changes in the mount/unmount methods. I've already done the renaming to `notifyJvmtiDisableSuspend(boolean)`. Let's see if it is okay with you. It is not a problem to rename it back to `notifyJvmtiCriticalLock(boolean)`.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427032721 From alanb at openjdk.org Thu Dec 14 18:06:40 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 14 Dec 2023 18:06:40 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3] In-Reply-To: References: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> Message-ID: <0IfR_dHcE18gNj1SXT6J0RppMZ2bNU7U51iqi-25bh0=.728701de-31f3-4a6e-a454-a4395f4effe5@github.com> On Thu, 14 Dec 2023 12:19:43 GMT, Alan Bateman wrote: > Okay. What about the Leonid's suggestion to name it `notifyJvmtiDisableSuspend()` ? Okay with me. We'll need to move the notifyJvmtiDisableSuspend(true) to before the try in all cases, I've pointed out the cases that we missed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17011#issuecomment-1856339508 From alanb at openjdk.org Thu Dec 14 18:06:47 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 14 Dec 2023 18:06:47 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v4] In-Reply-To: References: Message-ID: <7LcxdLPaxGRqXIER2MYIoCk6yk0CCnzLk-CtSn7A800=.1de3bfc7-4dc0-443f-bc48-983c22f24766@github.com> On Thu, 14 Dec 2023 17:30:54 GMT, Serguei Spitsyn wrote: >> This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame. >> It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. >> The deadlocking scenario is well described by Patricio in a bug report comment. >> In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. 
>> >> The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about begin/end of critical section. >> This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. >> >> Some of the new notifications with the `notifyJvmtiSync()` method are on a performance-critical path. That is why this method has been intrinsified. >> >> A new test was developed by Patricio: >> `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> The test is very nice as it reliably reproduces the deadlock 100% of the time without the fix. >> The test never fails with this fix. >> >> Testing: >> - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: 1) replace CriticalLock with DisableSuspend; 2) minor tweaks src/java.base/share/classes/java/lang/VirtualThread.java line 746: > 744: } else if ((s == PINNED) || (s == TIMED_PINNED)) { > 745: try { > 746: notifyJvmtiDisableSuspend(true); Move to before the try. src/java.base/share/classes/java/lang/VirtualThread.java line 853: > 851: checkAccess(); > 852: try { > 853: notifyJvmtiDisableSuspend(true); This one also needs to be before the try. src/java.base/share/classes/java/lang/VirtualThread.java line 886: > 884: if (oldValue) { > 885: try { > 886: notifyJvmtiDisableSuspend(true); This one also needs to be before the try. src/java.base/share/classes/java/lang/VirtualThread.java line 917: > 915: case RUNNING: > 916: try { > 917: notifyJvmtiDisableSuspend(true); This one also needs to be before the try. src/java.base/share/classes/java/lang/VirtualThread.java line 1042: > 1040: if (carrier != null) { > 1041: try { > 1042: notifyJvmtiDisableSuspend(true); This one too.
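
For readers following the thread, the "before the try" shape requested above can be sketched in isolation. This is a minimal stand-alone illustration, not the `VirtualThread` code: `SuspendGuard`, its `depth` counter and the `failOnEnter` switch are hypothetical names invented for the example, and `notifyDisableSuspend` only plays the role of `notifyJvmtiDisableSuspend(boolean)`. The review does not state the motivation; one common reason for the idiom is that the exit call in `finally` should run only if the matching enter call actually completed.

```java
// Hypothetical stand-in for the enter/exit notifications discussed in the review.
final class SuspendGuard {
    // Nesting depth of "suspend disabled" regions (stands in for VM-side state).
    private int depth;
    // When true, the enter notification fails, to show the difference between the two shapes.
    private boolean failOnEnter;

    void setFailOnEnter(boolean failOnEnter) {
        this.failOnEnter = failOnEnter;
    }

    int depth() {
        return depth;
    }

    // Plays the role of notifyJvmtiDisableSuspend(boolean) in this sketch.
    void notifyDisableSuspend(boolean enter) {
        if (enter) {
            if (failOnEnter) {
                throw new IllegalStateException("enter failed");
            }
            depth++;
        } else {
            depth--;
        }
    }

    // Requested shape: enter before the try, exit in finally.
    void enterBeforeTry(Runnable body) {
        notifyDisableSuspend(true);
        try {
            body.run();
        } finally {
            notifyDisableSuspend(false);
        }
    }

    // Shape being replaced: enter inside the try, so finally runs even if enter failed.
    void enterInsideTry(Runnable body) {
        try {
            notifyDisableSuspend(true);
            body.run();
        } finally {
            notifyDisableSuspend(false);
        }
    }

    public static void main(String[] args) {
        SuspendGuard inside = new SuspendGuard();
        inside.setFailOnEnter(true);
        try {
            inside.enterInsideTry(() -> { });
        } catch (IllegalStateException e) {
            // The exit ran without a matching enter, so the depth is unbalanced.
            System.out.println("inside try, depth = " + inside.depth());
        }

        SuspendGuard before = new SuspendGuard();
        before.setFailOnEnter(true);
        try {
            before.enterBeforeTry(() -> { });
        } catch (IllegalStateException e) {
            // The try block was never entered, so no exit was issued.
            System.out.println("before try, depth = " + before.depth());
        }
    }
}
```

On the normal path both shapes leave the counter balanced; they differ only when the enter call itself does not complete.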
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427080057 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427080394 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427080484 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427080704 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427080811 From kvn at openjdk.org Thu Dec 14 18:11:45 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 14 Dec 2023 18:11:45 GMT Subject: [jdk22] RFR: 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 17:27:03 GMT, Jamil Nimeh wrote: > This is the JDK 22 backport of JDK-8321542 Good. Why does it need a review if the backport is clean? ------------- Marked as reviewed by kvn (Reviewer). PR Review: https://git.openjdk.org/jdk22/pull/14#pullrequestreview-1782408397 From jnimeh at openjdk.org Thu Dec 14 18:11:45 2023 From: jnimeh at openjdk.org (Jamil Nimeh) Date: Thu, 14 Dec 2023 18:11:45 GMT Subject: [jdk22] RFR: 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 17:27:03 GMT, Jamil Nimeh wrote: > This is the JDK 22 backport of JDK-8321542 Not really sure, but I saw in the checkboxes above that it required a reviewer. Perhaps because JDK 22 is in RDP1? ------------- PR Comment: https://git.openjdk.org/jdk22/pull/14#issuecomment-1856343750 From jnimeh at openjdk.org Thu Dec 14 18:11:46 2023 From: jnimeh at openjdk.org (Jamil Nimeh) Date: Thu, 14 Dec 2023 18:11:46 GMT Subject: [jdk22] Integrated: 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 17:27:03 GMT, Jamil Nimeh wrote: > This is the JDK 22 backport of JDK-8321542 This pull request has now been integrated.
Changeset: d7b592ab Author: Jamil Nimeh URL: https://git.openjdk.org/jdk22/commit/d7b592ab21fb268120059b466a100f70e8c279b9 Stats: 8 lines in 1 file changed: 8 ins; 0 del; 0 mod 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes Reviewed-by: kvn Backport-of: 5718039a46ae51fa9b7042fe7163e3637e981b05 ------------- PR: https://git.openjdk.org/jdk22/pull/14 From sspitsyn at openjdk.org Thu Dec 14 18:26:55 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 14 Dec 2023 18:26:55 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v5] In-Reply-To: References: Message-ID: > This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame. > It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. > The deadlocking scenario is well described by Patricio in a bug report comment. > In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. > > The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about begin/end of critical section. > This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. > > Some of the new notifications with the `notifyJvmtiSync()` method are on a performance-critical path. That is why this method has been intrinsified. > > A new test was developed by Patricio: > `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > The test is very nice as it reliably reproduces the deadlock 100% of the time without the fix. > The test never fails with this fix.
> > Testing: > - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: moved notifyJvmtiDisableSuspend(true) out of try-block ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17011/files - new: https://git.openjdk.org/jdk/pull/17011/files/4e5f6447..ad990422 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17011&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17011&range=03-04 Stats: 10 lines in 1 file changed: 5 ins; 5 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17011.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17011/head:pull/17011 PR: https://git.openjdk.org/jdk/pull/17011 From sspitsyn at openjdk.org Thu Dec 14 18:26:56 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 14 Dec 2023 18:26:56 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3] In-Reply-To: <0IfR_dHcE18gNj1SXT6J0RppMZ2bNU7U51iqi-25bh0=.728701de-31f3-4a6e-a454-a4395f4effe5@github.com> References: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> <0IfR_dHcE18gNj1SXT6J0RppMZ2bNU7U51iqi-25bh0=.728701de-31f3-4a6e-a454-a4395f4effe5@github.com> Message-ID: On Thu, 14 Dec 2023 18:04:02 GMT, Alan Bateman wrote: > Okay with me. We'll need to move the notifyJvmtiDisableSuspend(true) to before the try in all cases, I've pointed out the cases that we missed. Thank you, Alan. Fixed now. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17011#issuecomment-1856366484 From sspitsyn at openjdk.org Thu Dec 14 18:26:57 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 14 Dec 2023 18:26:57 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3] In-Reply-To: References: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> Message-ID: On Thu, 14 Dec 2023 12:16:34 GMT, Serguei Spitsyn wrote: >> src/hotspot/share/runtime/javaThread.hpp line 320: >> >>> 318: bool _is_in_VTMS_transition; // thread is in virtual thread mount state transition >>> 319: bool _is_in_tmp_VTMS_transition; // thread is in temporary virtual thread mount state transition >>> 320: bool _is_in_critical_section; // thread is in a locking critical section >> >> It might make sense to add a comment that this variable is changed/read only by the current thread and no sync is needed. > > Good suggestion, thanks. Fixed now.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427099325 From sspitsyn at openjdk.org Thu Dec 14 18:27:00 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 14 Dec 2023 18:27:00 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v4] In-Reply-To: <7LcxdLPaxGRqXIER2MYIoCk6yk0CCnzLk-CtSn7A800=.1de3bfc7-4dc0-443f-bc48-983c22f24766@github.com> References: <7LcxdLPaxGRqXIER2MYIoCk6yk0CCnzLk-CtSn7A800=.1de3bfc7-4dc0-443f-bc48-983c22f24766@github.com> Message-ID: On Thu, 14 Dec 2023 18:03:00 GMT, Alan Bateman wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: 1) replace CriticalLock with DisableSuspend; 2) minor tweaks > > src/java.base/share/classes/java/lang/VirtualThread.java line 1042: > >> 1040: if (carrier != null) { >> 1041: try { >> 1042: notifyJvmtiDisableSuspend(true); > > this one too. Thanks. All cases fixed now. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427095561 From kvn at openjdk.org Thu Dec 14 19:09:38 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Thu, 14 Dec 2023 19:09:38 GMT Subject: RFR: 8309981: Remove expired flags in JDK 23 In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 12:38:48 GMT, David Holmes wrote: > All expired flags are removed from the flags table. > > The two documented flags have their documentation moved from the obsolete section to the removed section. > > The link to JDK 22 removed flags is added to the list. > > Thanks. Good. ------------- Marked as reviewed by kvn (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17107#pullrequestreview-1782526555 From cslucas at openjdk.org Thu Dec 14 19:44:01 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 14 Dec 2023 19:44:01 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v6] In-Reply-To: References: Message-ID: > ### Description > > Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves the Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges. > > Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP. > > The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing. > > ### Benchmarking > > **Note:** In some of these tests no reduction happens. I left them in to validate that no performance regression happens in that case. > **Note 2:** Margin of error was negligible.
>
> | Benchmark                            | No RAM (ms/op)   | Yes RAM (ms/op)   |
> |--------------------------------------|------------------|-------------------|
> | TestTrapAfterMerge                   | 19.515           | 13.386            |
> | TestArgEscape                        | 33.165           | 33.254            |
> | TestCallTwoSide                      | 70.547           | 69.427            |
> | TestCmpAfterMerge                    | 16.400           | 2.984             |
> | TestCmpMergeWithNull_Second          | 27.204           | 27.293            |
> | TestCmpMergeWithNull                 | 8.248            | 4.920             |
> | TestCondAfterMergeWithAllocate       | 12.890           | 5.252             |
> | TestCondAfterMergeWithNull           | 6.265            | 5.078             |
> | TestCondLoadAfterMerge               | 12.713           | 5.163             |
> | TestConsecutiveSimpleMerge           | 30.863           | 4.068             |
> | TestDoubleIfElseMerge                | 16.069           | 2.444             |
> | TestEscapeInCallAfterMerge           | 23.111           | 22.924            |
> | TestGlobalEscape                     | 14.459           | 14.425            |
> | TestIfElseInLoop                     | 246.061          | 42.786            |
> | TestLoadAfterLoopAlias               | 45.808           | 45.812            |
> | TestLoadAfterTrap                    | 28.370           | ...

Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits:

 - Merge with origin/master
 - Update test/micro/org/openjdk/bench/vm/compiler/AllocationMerges.java Co-authored-by: Andrey Turbanov
 - Ammend previous fix & add repro tests.
 - Fix to prevent reducing already reduced Phi
 - Fix to prevent creating NULL ConNKlass constants.
 - Refrain from RAM of arrays and Phis controlled by Loop nodes.
 - Fix typo in test.
 - Fix build after merge.
 - Fix merge
 - Support for reducing nullable allocation merges.
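
As a rough illustration of what the PR description means by a nullable allocation merge, the shape below joins a fresh allocation with `null` and then compares the merged reference. It is a hypothetical sketch: `NullableMergeExample`, `Point` and `mergeWithNull` are made-up names, not code from the PR or from the `AllocationMerges` benchmark, and whether C2 actually reduces such a merge depends on the compiler's own analysis.

```java
// Hypothetical example of the source-level shape behind a nullable allocation merge.
final class NullableMergeExample {
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // 'p' joins a new allocation with null, i.e. a nullable merge of two control paths.
    static int mergeWithNull(boolean cond, int x, int y) {
        Point p = cond ? new Point(x, y) : null;
        // The null check is a pointer comparison on the merged value.
        if (p != null) {
            return p.x + p.y;
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println(mergeWithNull(true, 2, 3));
        System.out.println(mergeWithNull(false, 2, 3));
    }
}
```

If the merge is reduced, the object never needs to be allocated on either path, since only its fields and its nullness are observed.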
------------- Changes: https://git.openjdk.org/jdk/pull/15825/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=15825&range=05 Stats: 2431 lines in 13 files changed: 2181 ins; 90 del; 160 mod Patch: https://git.openjdk.org/jdk/pull/15825.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15825/head:pull/15825 PR: https://git.openjdk.org/jdk/pull/15825
From cslucas at openjdk.org Thu Dec 14 19:44:09 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Thu, 14 Dec 2023 19:44:09 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6] In-Reply-To: References: Message-ID:

> # Description
>
> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit.
>
> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large.
>
> # Help Needed for Testing
>
> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`.
>
> # Testing status
>
> ## tier1
>
> |          | Win     | Mac     | Linux   |
> |----------|---------|---------|---------|
> | ARM64    |         |         |         |
> | ARM32    |         |         |         |
> | x86      |         |         |         |
> | x64      |         |         |         |
> | PPC64    |         |         |         |
> | S390x    |         |         |         |
> | RiscV    |         |         |         |

Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits:

 - Merge with origin/master
 - Fix build, copyright dates, m4 files.
 - Fix merge
 - Catch up with master branch.

   Merge remote-tracking branch 'origin/master' into reuse-macroasm
 - Some inst_mark fixes; Catch up with master.
 - Catch up with changes on master
 - Reuse same C2_MacroAssembler object to emit instructions.
------------- Changes: https://git.openjdk.org/jdk/pull/16484/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16484&range=05 Stats: 2446 lines in 61 files changed: 106 ins; 434 del; 1906 mod Patch: https://git.openjdk.org/jdk/pull/16484.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16484/head:pull/16484 PR: https://git.openjdk.org/jdk/pull/16484 From alanb at openjdk.org Thu Dec 14 19:53:40 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 14 Dec 2023 19:53:40 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v3] In-Reply-To: References: <-3HkgdDQk9480AnHQeIxaNsqK0nVzjaf6UZJ5E9LGHo=.bdff2ed6-db98-4a77-832f-6accad54245f@github.com> <0IfR_dHcE18gNj1SXT6J0RppMZ2bNU7U51iqi-25bh0=.728701de-31f3-4a6e-a454-a4395f4effe5@github.com> Message-ID: On Thu, 14 Dec 2023 18:24:16 GMT, Serguei Spitsyn wrote: > Thank you, Alan. Fixed now. I believe, all your suggestions have been addressed now. Thanks, it looks much better now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17011#issuecomment-1856485757 From alanb at openjdk.org Thu Dec 14 19:53:44 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 14 Dec 2023 19:53:44 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v5] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 18:26:55 GMT, Serguei Spitsyn wrote: >> This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame. >> It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. >> The deadlocking scenario is well described by Patricio in a bug report comment. >> In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. 
>> >> The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about begin/end of critical section. >> This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. >> >> Some of the new notifications with the `notifyJvmtiSync()` method are on a performance-critical path. That is why this method has been intrinsified. >> >> A new test was developed by Patricio: >> `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> The test is very nice as it reliably reproduces the deadlock 100% of the time without the fix. >> The test never fails with this fix. >> >> Testing: >> - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: moved notifyJvmtiDisableSuspend(true) out of try-block src/java.base/share/classes/java/lang/VirtualThread.java line 918: > 916: notifyJvmtiDisableSuspend(true); > 917: try { > 918: // if mounted then return state of carrier thread Can you move this comment line to before the notifyJvmtiDisableSuspend(true)? src/java.base/share/classes/java/lang/VirtualThread.java line 1043: > 1041: notifyJvmtiDisableSuspend(true); > 1042: try { > 1043: // include the carrier thread state and name when mounted This one too, can you move the comment to before the notifyJvmtiDisableSuspend.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427198296 PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427198673 From dholmes at openjdk.org Thu Dec 14 21:26:50 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 14 Dec 2023 21:26:50 GMT Subject: RFR: 8309981: Remove expired flags in JDK 23 In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 16:25:41 GMT, Alan Bateman wrote: >> All expired flags are removed from the flags table. >> >> The two documented flags have their documentation moved from the obsolete section to the removed section. >> >> The link to JDK 22 removed flags is added to the list. >> >> Thanks. > > Marked as reviewed by alanb (Reviewer). Thanks for the reviews @AlanBateman and @vnkozlov . ------------- PR Comment: https://git.openjdk.org/jdk/pull/17107#issuecomment-1856633394 From dholmes at openjdk.org Thu Dec 14 21:26:52 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 14 Dec 2023 21:26:52 GMT Subject: Integrated: 8309981: Remove expired flags in JDK 23 In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 12:38:48 GMT, David Holmes wrote: > All expired flags are removed from the flags table. > > The two documented flags have their documentation moved from the obsolete section to the removed section. > > The link to JDK 22 removed flags is added to the list. > > Thanks. This pull request has now been integrated. 
Changeset: d02bc873 Author: David Holmes URL: https://git.openjdk.org/jdk/commit/d02bc873f806c90754da10c8a052e32836e895fd Stats: 23 lines in 2 files changed: 11 ins; 10 del; 2 mod 8309981: Remove expired flags in JDK 23 Reviewed-by: alanb, kvn ------------- PR: https://git.openjdk.org/jdk/pull/17107 From dholmes at openjdk.org Thu Dec 14 21:28:47 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 14 Dec 2023 21:28:47 GMT Subject: Integrated: 8322065: Initial nroff manpage generation for JDK 23 In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 05:46:01 GMT, David Holmes wrote: > Updated the version to 23-ea and year to 2024. > > This initial generation also picks up the unpublished changes from: > > - [JDK-8302233](https://bugs.openjdk.org/browse/JDK-8302233) (keytool & jarsigner) > - [JDK-8290702](https://bugs.openjdk.org/browse/JDK-8290702) (javadoc) (JDK 23 backport) > > Thanks This pull request has now been integrated. Changeset: 692be577 Author: David Holmes URL: https://git.openjdk.org/jdk/commit/692be577385844bf00a01ff10e390e014191569f Stats: 193 lines in 27 files changed: 36 ins; 51 del; 106 mod 8322065: Initial nroff manpage generation for JDK 23 Reviewed-by: alanb ------------- PR: https://git.openjdk.org/jdk/pull/17101 From sspitsyn at openjdk.org Thu Dec 14 22:37:42 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 14 Dec 2023 22:37:42 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v5] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 19:50:00 GMT, Alan Bateman wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: moved notifyJvmtiDisableSuspend(true) out of try-block > > src/java.base/share/classes/java/lang/VirtualThread.java line 1043: > >> 1041: notifyJvmtiDisableSuspend(true); >> 1042: try { >> 1043: // include the carrier thread state and name when mounted > > This one too, can 
you move the comment to before the notifyJvmtiDisableSuspend. Moved both comments out of try blocks. What about this one (it seems we would want to do the same)?

    notifyJvmtiDisableSuspend(true);
    try {
        // unpark carrier thread when pinned
        synchronized (carrierThreadAccessLock()) {
            Thread carrier = carrierThread;
            if (carrier != null && ((s = state()) == PINNED || s == TIMED_PINNED)) {
                U.unpark(carrier);
            }
        }
    } finally {
        notifyJvmtiDisableSuspend(false);
    }

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427373522 From sspitsyn at openjdk.org Thu Dec 14 22:57:53 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 14 Dec 2023 22:57:53 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v6] In-Reply-To: References: Message-ID: > This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame. > It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. > The deadlocking scenario is well described by Patricio in a bug report comment. > In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. > > The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about begin/end of critical section. > This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. > > Some of the new notifications with the `notifyJvmtiSync()` method are on a performance-critical path. That is why this method has been intrinsified.
> > New test was developed by Patricio: > `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > The test is very nice as it reliably in 100% reproduces the deadlock without the fix. > The test is never failing with this fix. > > Testing: > - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: moved a couple of comments out of try blocks ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17011/files - new: https://git.openjdk.org/jdk/pull/17011/files/ad990422..917dc724 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17011&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17011&range=04-05 Stats: 6 lines in 1 file changed: 3 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17011.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17011/head:pull/17011 PR: https://git.openjdk.org/jdk/pull/17011 From sspitsyn at openjdk.org Thu Dec 14 22:57:55 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 14 Dec 2023 22:57:55 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v5] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 22:35:18 GMT, Serguei Spitsyn wrote: >> src/java.base/share/classes/java/lang/VirtualThread.java line 1043: >> >>> 1041: notifyJvmtiDisableSuspend(true); >>> 1042: try { >>> 1043: // include the carrier thread state and name when mounted >> >> This one too, can you move the comment to before the notifyJvmtiDisableSuspend. > > Moved both comments out of try blocks. > What about this one (it seems we would wont to do the same) ? 
: > > notifyJvmtiDisableSuspend(true); > try { > // unpark carrier thread when pinned > synchronized (carrierThreadAccessLock()) { > Thread carrier = carrierThread; > if (carrier != null && ((s = state()) == PINNED || s == TIMED_PINNED)) { > U.unpark(carrier); > } > } > } finally { > notifyJvmtiDisableSuspend(false); > } Moved 3 comments out of try blocks. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427386103 From luhenry at openjdk.org Fri Dec 15 01:49:41 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Fri, 15 Dec 2023 01:49:41 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 19:44:09 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. >> >> # Testing status >> >> ## tier1 >> >> | | Win | Mac | Linux | >> |----------|---------|---------|---------| >> | ARM64 | | | | >> | ARM32 | | | | >> | x86 | | | | >> | x64 | | | | >> | PPC64 | | | | >> | S390x | | | | >> | RiscV | | | | > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Merge with origin/master > - Fix build, copyright dates, m4 files. > - Fix merge > - Catch up with master branch. > > Merge remote-tracking branch 'origin/master' into reuse-macroasm > - Some inst_mark fixes; Catch up with master. 
> - Catch up with changes on master > - Reuse same C2_MacroAssembler object to emit instructions. I've verified it works on riscv64, passing hotspot tier1 and tier2 tests. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-1857140962 From fyang at openjdk.org Fri Dec 15 02:29:43 2023 From: fyang at openjdk.org (Fei Yang) Date: Fri, 15 Dec 2023 02:29:43 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v3] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 14:35:57 GMT, Gui Cao wrote: >> MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header is non-trivial on linux-riscv64 platform. Passing t0(aka x5) as temporary register to this functions can also be error prone. As a reserved scratch register, t0 is implicitly clobberred by various assembler functions. @robehn can you help review this PR? >> This issue is used to track avoid passing t0 as a temporary register in the following cases: >> 1. avoid passing t0 as temp register to MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header. >> 2. avoid passing t0 as temp register to x_load_barrier and x_load_barrier_slow_path function in x_riscv.ad >> 3. avoid passing t0 as temp register to z_store_barrier and z_color function in z_riscv.ad >> >> Note that I didn't touch MacroAssembler::cmpxchg because it seems to me that this function is designed that it allows t0 to be used as the result register. As the result register will be set on exits, there should be no risk when using t0 for receiving the result. 
>> https://github.com/openjdk/jdk/blob/e44d4b24ed794957c47c140ab6f15544efa2b278/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L2910-L2925 >> >> ### Testing: >> - [x] Run tier1-3 tests on qemu 8.1.50 with UseRVV (release) >> - [x] Run tier1-3 tests with SiFive unmatched (release) > > Gui Cao has updated the pull request incrementally with one additional commit since the last revision: > > Remove unneeded kill cr Updated changes looks good. Thanks. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16880#pullrequestreview-1783034477 From gcao at openjdk.org Fri Dec 15 04:01:53 2023 From: gcao at openjdk.org (Gui Cao) Date: Fri, 15 Dec 2023 04:01:53 GMT Subject: RFR: 8322154: RISC-V: JDK-8315743 missed change in MacroAssembler::load_reserved Message-ID: The fix for JDK-8315743 touches MacroAssembler::load_reserved replacing `t0` with `dst`. But it missed change for the third case (that is `uint32`) if the switch in this assember function. We should also replace `t0` used in `zero_extend` with `dst`. @robehn can you help confirm this? 
- [ ] Run tier1 tests on qemu 8.1.50 with UseRVV (release) ------------- Commit messages: - 8322154: RISC-V: JDK-8315743 missed change in MacroAssembler::load_reserved Changes: https://git.openjdk.org/jdk/pull/17117/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17117&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322154 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17117.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17117/head:pull/17117 PR: https://git.openjdk.org/jdk/pull/17117 From duke at openjdk.org Fri Dec 15 04:03:13 2023 From: duke at openjdk.org (Liming Liu) Date: Fri, 15 Dec 2023 04:03:13 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v18] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
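> | Kernel | `-XX:-TransparentHugePages`, unpatched | `-XX:-TransparentHugePages`, patched | `-XX:+TransparentHugePages`, unpatched | `-XX:+TransparentHugePages`, patched |
> |--------|---------|---------|---------|---------|
> | 4.18 | 11.30 | 11.30 | 0.25 | 0.25 |
> | 5.13 | 0.22 | 0.22 | 3.42 | 3.42 |
> | 6.1 | 0.27 | 0.33 | 3.54 | 0.33 |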
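The pretouch strategy in the quoted summary (ask the kernel to populate the range with `MADV_POPULATE_WRITE`, and touch pages by hand when the call is not supported) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the pull request: `pretouch_range` is a hypothetical helper, the fallback uses a plain volatile read/write rather than HotSpot's atomic add of zero, and the constant is defined by hand only for build hosts whose headers predate it.

```cpp
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>

// Older libc headers may not define the constant; 23 is its value in the
// Linux UAPI headers. On kernels before 5.14 the call simply fails.
#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23
#endif

// Hypothetical helper (not HotSpot code). `start` must be page aligned.
// Returns true if the kernel populated the range via madvise, false if the
// per-page fallback loop was used instead.
static bool pretouch_range(void* start, size_t bytes) {
  if (madvise(start, bytes, MADV_POPULATE_WRITE) == 0) {
    return true;
  }
  // madvise failed (for example EINVAL on an older kernel): touch one byte
  // per page. Writing back the value just read keeps the contents unchanged
  // but still forces the page to be backed by writable memory.
  // Assumption: no other thread writes to the range concurrently, since this
  // read/write pair is not atomic.
  const size_t page = static_cast<size_t>(sysconf(_SC_PAGESIZE));
  volatile char* p = static_cast<volatile char*>(start);
  for (size_t off = 0; off < bytes; off += page) {
    p[off] = p[off];
  }
  return false;
}
```

Either path leaves the memory contents untouched, so a caller does not need to know which one ran; the return value is only there for logging or testing.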
Liming Liu has updated the pull request incrementally with one additional commit since the last revision: Try to add a thread to use memory ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/e1f844f8..f5a8c446 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=17 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=16-17 Stats: 47 lines in 1 file changed: 43 ins; 0 del; 4 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From fyang at openjdk.org Fri Dec 15 04:07:48 2023 From: fyang at openjdk.org (Fei Yang) Date: Fri, 15 Dec 2023 04:07:48 GMT Subject: RFR: 8322154: RISC-V: JDK-8315743 missed change in MacroAssembler::load_reserved In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 03:56:04 GMT, Gui Cao wrote: > The fix for JDK-8315743 touches MacroAssembler::load_reserved replacing `t0` with `dst`. But it missed change for the third case (that is `uint32`) if the switch in this assember function. We should also replace `t0` used in `zero_extend` with `dst`. @robehn can you help confirm this? > > - [ ] Run tier1 tests on qemu 8.1.50 with UseRVV (release) Good catch! I think we should also fix this for JDK-22. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17117#pullrequestreview-1783089797 From duke at openjdk.org Fri Dec 15 04:11:16 2023 From: duke at openjdk.org (Liming Liu) Date: Fri, 15 Dec 2023 04:11:16 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v19] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). 
> > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
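> | Kernel | `-XX:-TransparentHugePages`, unpatched | `-XX:-TransparentHugePages`, patched | `-XX:+TransparentHugePages`, unpatched | `-XX:+TransparentHugePages`, patched |
> |--------|---------|---------|---------|---------|
> | 4.18 | 11.30 | 11.30 | 0.25 | 0.25 |
> | 5.13 | 0.22 | 0.22 | 3.42 | 3.42 |
> | 6.1 | 0.27 | 0.33 | 3.54 | 0.33 |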
Liming Liu has updated the pull request incrementally with one additional commit since the last revision: Fix type errors ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/f5a8c446..aec67985 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=18 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=17-18 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From dholmes at openjdk.org Fri Dec 15 05:34:38 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 15 Dec 2023 05:34:38 GMT Subject: RFR: 8321892: Typo in log message logged by src/hotspot/share/nmt/virtualMemoryTracker.cpp In-Reply-To: References: Message-ID: On Thu, 7 Dec 2023 18:22:32 GMT, Steven Schlansker wrote: > Discovered while deep in an InternalError debugging session... Looks fine and trivial (only 1 review needed). Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17021#pullrequestreview-1783147614 From duke at openjdk.org Fri Dec 15 06:19:59 2023 From: duke at openjdk.org (Liming Liu) Date: Fri, 15 Dec 2023 06:19:59 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v20] In-Reply-To: References: Message-ID: > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. 
Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
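> | Kernel | `-XX:-TransparentHugePages`, unpatched | `-XX:-TransparentHugePages`, patched | `-XX:+TransparentHugePages`, unpatched | `-XX:+TransparentHugePages`, patched |
> |--------|---------|---------|---------|---------|
> | 4.18 | 11.30 | 11.30 | 0.25 | 0.25 |
> | 5.13 | 0.22 | 0.22 | 3.42 | 3.42 |
> | 6.1 | 0.27 | 0.33 | 3.54 | 0.33 |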
Liming Liu has updated the pull request incrementally with one additional commit since the last revision: Remove the deletion ------------- Changes: - all: https://git.openjdk.org/jdk/pull/15781/files - new: https://git.openjdk.org/jdk/pull/15781/files/aec67985..ae9f6f3a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=19 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=18-19 Stats: 4 lines in 1 file changed: 0 ins; 4 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/15781.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781 PR: https://git.openjdk.org/jdk/pull/15781 From dholmes at openjdk.org Fri Dec 15 06:25:40 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 15 Dec 2023 06:25:40 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 09:05:22 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. >> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust macOS coding I had forgotten about the periodic checks related to `-Xcheck:jni`. Yes we probably could do something here. 
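
The check described in the quoted summary (look at the floating-point environment before and after loading a library, and correct it if the library changed it) can be illustrated with a small sketch. It is deliberately narrower than the HotSpot code: it looks only at the rounding mode via `<cfenv>`, and `run_and_restore_rounding` is a hypothetical name that stands in for the dlopen helper.

```cpp
#include <cfenv>

// Hypothetical helper (not HotSpot code): runs `fn`, which stands in for
// loading a native library, and reports whether it left the IEEE rounding
// mode changed. If so, the previous mode is restored before returning.
template <typename Fn>
static bool run_and_restore_rounding(Fn fn) {
  const int before = std::fegetround();
  fn();
  const int after = std::fegetround();
  if (after != before) {
    std::fesetround(before);  // undo the change made by `fn`
    return true;              // caller can log this or record it in an event
  }
  return false;
}
```

The boolean result is the piece of information the discussion wants to surface: a caller could attach it to whatever event or log line describes the library load.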
------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1857343823 From rehn at openjdk.org Fri Dec 15 06:38:37 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 15 Dec 2023 06:38:37 GMT Subject: RFR: 8322154: RISC-V: JDK-8315743 missed change in MacroAssembler::load_reserved In-Reply-To: References: Message-ID: <4YqN6hzgxxgkcj1CnVjo-1vP-hVbQFdk-42uRF_Dl2U=.49874f94-a86c-4252-9d0b-81d9593d0699@github.com> On Fri, 15 Dec 2023 03:56:04 GMT, Gui Cao wrote: > The fix for https://bugs.openjdk.org/browse/JDK-8315743 touches MacroAssembler::load_reserved replacing `t0` with `dst`. But it missed change for the third case (that is `uint32`) of the switch in this assember function. We should also replace `t0` used in `zero_extend` with `dst`. @robehn can you help confirm this? > > - [x] Run tier1 tests on qemu 8.1.50 with UseRVV (release) Good catch, thank you! ------------- Marked as reviewed by rehn (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17117#pullrequestreview-1783205759 From gcao at openjdk.org Fri Dec 15 07:26:49 2023 From: gcao at openjdk.org (Gui Cao) Date: Fri, 15 Dec 2023 07:26:49 GMT Subject: RFR: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr [v3] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 14:35:57 GMT, Gui Cao wrote: >> MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header is non-trivial on linux-riscv64 platform. Passing t0(aka x5) as temporary register to this functions can also be error prone. As a reserved scratch register, t0 is implicitly clobberred by various assembler functions. @robehn can you help review this PR? >> This issue is used to track avoid passing t0 as a temporary register in the following cases: >> 1. avoid passing t0 as temp register to MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header. >> 2. avoid passing t0 as temp register to x_load_barrier and x_load_barrier_slow_path function in x_riscv.ad >> 3. 
avoid passing t0 as temp register to z_store_barrier and z_color function in z_riscv.ad >> >> Note that I didn't touch MacroAssembler::cmpxchg because it seems to me that this function is designed that it allows t0 to be used as the result register. As the result register will be set on exits, there should be no risk when using t0 for receiving the result. >> https://github.com/openjdk/jdk/blob/e44d4b24ed794957c47c140ab6f15544efa2b278/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L2910-L2925 >> >> ### Testing: >> - [x] Run tier1-3 tests on qemu 8.1.50 with UseRVV (release) >> - [x] Run tier1-3 tests with SiFive unmatched (release) > > Gui Cao has updated the pull request incrementally with one additional commit since the last revision: > > Remove unneeded kill cr Thanks all for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16880#issuecomment-1857396208 From gcao at openjdk.org Fri Dec 15 07:26:51 2023 From: gcao at openjdk.org (Gui Cao) Date: Fri, 15 Dec 2023 07:26:51 GMT Subject: Integrated: 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr In-Reply-To: References: Message-ID: On Wed, 29 Nov 2023 11:58:31 GMT, Gui Cao wrote: > MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header is non-trivial on linux-riscv64 platform. Passing t0(aka x5) as temporary register to this functions can also be error prone. As a reserved scratch register, t0 is implicitly clobberred by various assembler functions. @robehn can you help review this PR? > This issue is used to track avoid passing t0 as a temporary register in the following cases: > 1. avoid passing t0 as temp register to MacroAssembler::cmpxchg/cmpxchgptr/cmpxchg_obj_header. > 2. avoid passing t0 as temp register to x_load_barrier and x_load_barrier_slow_path function in x_riscv.ad > 3. 
avoid passing t0 as temp register to z_store_barrier and z_color function in z_riscv.ad > > Note that I didn't touch MacroAssembler::cmpxchg because it seems to me that this function is designed that it allows t0 to be used as the result register. As the result register will be set on exits, there should be no risk when using t0 for receiving the result. > https://github.com/openjdk/jdk/blob/e44d4b24ed794957c47c140ab6f15544efa2b278/src/hotspot/cpu/riscv/macroAssembler_riscv.cpp#L2910-L2925 > > ### Testing: > - [x] Run tier1-3 tests on qemu 8.1.50 with UseRVV (release) > - [x] Run tier1-3 tests with SiFive unmatched (release) This pull request has now been integrated. Changeset: 0be0775a Author: Gui Cao Committer: Fei Yang URL: https://git.openjdk.org/jdk/commit/0be0775a762edbefacf4188b4787b039153fe670 Stats: 78 lines in 5 files changed: 4 ins; 4 del; 70 mod 8320397: RISC-V: Avoid passing t0 as temp register to MacroAssembler:: cmpxchg_obj_header/cmpxchgptr Reviewed-by: rehn, fyang ------------- PR: https://git.openjdk.org/jdk/pull/16880 From stuefe at openjdk.org Fri Dec 15 07:38:58 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 15 Dec 2023 07:38:58 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Tue, 12 Dec 2023 14:05:48 GMT, Joachim Kern wrote: >> On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. >> >> This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. 
It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). >> >> We propose a different, cleaner way of handling this: >> >> - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. >> - Cache dl handles; repeated opening of a library should return the cached handle. >> - Increase handle-local ref counter on open, Decrease it on close >> - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). >> >> This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > followed the proposals Is this libpath parsing code copied from the R3 kernel? If yes, pls make sure there are no licensing issues. src/hotspot/os/aix/os_aix.cpp line 206: > 204: constexpr int max_handletable = 1024; > 205: static int g_handletable_used = 0; > 206: static struct handletableentry g_handletable[max_handletable] = {{0, 0, 0, 0}}; I would move all that new and clearly delineated dlopen stuff into an own file, e.g. dlopen_aix.cpp or porting_aix.cpp (in porting_aix.cpp, we already have wrappers for other functions). os_aix.cpp is already massive. src/hotspot/os/aix/os_aix.cpp line 1129: > 1127: > 1128: // get the library search path burned in to the executable file during linking > 1129: // If the libpath cannot be retrieved return an empty path This is new. Is this complexity needed, if yes, why? Don't see a comment, may have missed it. 
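
The handle-caching scheme proposed in the PR summary quoted above (cache dl handles, return the cached handle on repeated opens, count references on open and close) might look roughly like the sketch below. It is not the code under review, which uses a fixed-size `g_handletable` in os_aix.cpp. Here `HandleCache`, the path-keyed map, and the injected open/close callbacks are all hypothetical choices made so the example stays portable; a real caller would pass wrappers around `dlopen` and `dlclose`.

```cpp
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical sketch: makes repeated opens of the same library return the
// same handle, and only closes the library when the last user unloads it.
class HandleCache {
 public:
  using OpenFn = std::function<void*(const std::string&)>;
  using CloseFn = std::function<void(void*)>;

  HandleCache(OpenFn open_fn, CloseFn close_fn)
      : _open_fn(std::move(open_fn)), _close_fn(std::move(close_fn)) {}

  // Returns the cached handle if `path` is already open (bumping its
  // reference count); otherwise opens it and caches the new handle.
  void* load(const std::string& path) {
    std::lock_guard<std::mutex> guard(_lock);
    auto it = _entries.find(path);
    if (it != _entries.end()) {
      it->second.refcount++;
      return it->second.handle;
    }
    void* handle = _open_fn(path);
    if (handle != nullptr) {
      _entries.emplace(path, Entry{handle, 1});
    }
    return handle;
  }

  // Drops one reference. The underlying close runs only when the count
  // reaches zero. Returns false if the handle is not known to the cache.
  bool unload(void* handle) {
    std::lock_guard<std::mutex> guard(_lock);
    for (auto it = _entries.begin(); it != _entries.end(); ++it) {
      if (it->second.handle == handle) {
        if (--it->second.refcount == 0) {
          _close_fn(handle);
          _entries.erase(it);
        }
        return true;
      }
    }
    return false;
  }

 private:
  struct Entry {
    void* handle;
    int refcount;
  };

  OpenFn _open_fn;
  CloseFn _close_fn;
  std::mutex _lock;
  std::map<std::string, Entry> _entries;
};
```

Keying by path is the simplest thing to show; it would treat two different names for the same file as different libraries, which is one reason a real implementation may prefer another key.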
src/hotspot/os/aix/os_aix.cpp line 1131:

> 1129: // If the libpath cannot be retrieved return an empty path
> 1130: static const char* rtv_linkedin_libpath() {
> 1131: static char buffer[4096];

This coding has some issues:

- A generic char buffer is not a good idea. It forces you to do casts all over the place and introduces alignment issues with an unaligned char buffer, which I assume is the reason for all the separate memcpy-into-structures below. I would just read into the structures directly.
- You need to check the return codes of fread to make sure you read the number of bytes expected, lest you work with uninitialized memory, and maybe to handle sporadic EINTR.
- I don't get all the separate "SZ" macros. They must be equal to sizeof(structure), right? Otherwise you get buffer overruns or work with uninitialized memory.

Proposal: add a local wrapper function like this:

    template <typename T>
    static bool my_checked_fread(FILE* f, T* out) {
      // Read sizeof(T) bytes from f.
      // Check the return code.
      // Return true if sizeof(T) bytes were read.
      // E.g. in a very trivial form (fread returns the number of items read, here 0 or 1):
      size_t items_read = fread(out, sizeof(T), 1, f);
      return items_read == 1;
    }

and use it in your code like this:

    struct xcoff64 the_xcoff64;
    struct scn64 the_scn64;
    struct ldr64 the_ldr64;

    if (!my_checked_fread(f, &the_xcoff64)) { assert? }
    ...
    if (!my_checked_fread(f, &the_ldr64)) { .. handle error }

src/hotspot/os/aix/os_aix.cpp line 1132:

> 1130: static const char* rtv_linkedin_libpath() {
> 1131: static char buffer[4096];
> 1132: static const char* libpath = 0;

If your intent is to return an empty buffer if there is no contained libpath, I would just:

    static const char* libpath = "";

then you can always just return libpath.

src/hotspot/os/aix/os_aix.cpp line 1135:

> 1133:
> 1134: if (libpath)
> 1135: return libpath;

{ }

src/hotspot/os/aix/os_aix.cpp line 1137:

> 1135: return libpath;
> 1136:
> 1137: char pgmpath[32+1];

Will overflow if pid_t is 64bit.
Give it a larger size; after all, you are giving buffer 4K above, so you are not overly concerned with saving stack space.

src/hotspot/os/aix/os_aix.cpp line 1146:

> 1144: fread(buffer, 1, FILHSZ_64 + _AOUTHSZ_EXEC_64, f);
> 1145:
> 1146: if (((struct filehdr*)buffer)->f_magic == U802TOCMAGIC ) {

As stated above, I don't think this section is needed.

src/hotspot/os/aix/os_aix.cpp line 1170:

> 1168: else if (((struct filehdr*)buffer)->f_magic == U64_TOCMAGIC ) {
> 1169: // __XCOFF64__
> 1170: struct _S_(xcoffhdr) xcoff64;

What's with the `_S_`?

src/hotspot/os/aix/os_aix.cpp line 1174:

> 1172: struct _S_(ldhdr) ldr64;
> 1173: memcpy((char*)&xcoff64, buffer, FILHSZ_64 + _AOUTHSZ_EXEC_64);
> 1174: int ldroffset = FILHSZ_64 + xcoff64.filehdr.f_opthdr + (xcoff64.aouthdr.o_snloader -1)*SCNHSZ_64;

Why the -1? I assume that's the section number? Is it 1-based? How odd.

src/hotspot/os/aix/os_aix.cpp line 1187:

> 1185: fread(buffer, 1, LDHDRSZ_64, f);
> 1186: memcpy((char*)&ldr64, buffer, LDHDRSZ_64);
> 1187: fseek (f, scn64.s_scnptr + ldr64.l_impoff, SEEK_SET);

Nit: please use consistent spacing according to hotspot rules. Here, remove the space.

src/hotspot/os/aix/os_aix.cpp line 1191:

> 1189: }
> 1190: else
> 1191: buffer[0] = 0;

{}

src/hotspot/os/aix/os_aix.cpp line 1234:

> 1232:
> 1233: stringStream Libpath;
> 1234: if (env == nullptr) {

Proposal for a shorter version not needing string assembly:

    const char* paths[2] = { env, rtv_linkedin_libpath() };
    for (int i = 0; i < 2; i++) {
      const char* this_libpath = paths[i];
      if (this_libpath == nullptr) {
        continue;
      }
      ... do the token thing...
    }

-------------

Changes requested by stuefe (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16920#pullrequestreview-1783187856 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427593949 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427597791 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427606275 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427632243 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427594255 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427604761 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427610156 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427610650 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427622550 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427635888 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427633296 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427640624 From stuefe at openjdk.org Fri Dec 15 07:38:58 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 15 Dec 2023 07:38:58 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 06:22:39 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> followed the proposals > > src/hotspot/os/aix/os_aix.cpp line 1129: > >> 1127: >> 1128: // get the library search path burned in to the executable file during linking >> 1129: // If the libpath cannot be retrieved return an empty path > > This is new. Is this complexity needed, if yes, why? Don't see a comment, may have missed it. Also, why are we parsing xcoff32 headers in there? AIX OpenJDK will always be 64-bit. So, you can replace the whole xcoff32 section with assert( f_magic == U802TOCMAGIC, ..). 
The function becomes a lot simpler then. > src/hotspot/os/aix/os_aix.cpp line 1132: > >> 1130: static const char* rtv_linkedin_libpath() { >> 1131: static char buffer[4096]; >> 1132: static const char* libpath = 0; > > If your intent is to return an empty buffer if there is no contained libpath, I would just: > > > static const char* libpath = ""; > > then you can always just return libpath. But looking at the using code, returning NULL in case there is no contained libpath would be actually easier, see below. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427609926 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427639138 From kbarrett at openjdk.org Fri Dec 15 07:53:43 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 15 Dec 2023 07:53:43 GMT Subject: RFR: 8314488: Compile the JDK as C++17 In-Reply-To: References: Message-ID: <7D2C1yM2AS-G86oZF15MzMSrO0-pFMwXBVtVgzQ-_-Q=.0a5e2f6b-9aaa-483a-97c6-d75fee5f197d@github.com> On Mon, 24 Jul 2023 01:41:16 GMT, Julian Waters wrote: > Implementation of [JEP draft: Compile the JDK as C++17](https://bugs.openjdk.org/browse/JDK-8310260) Nearly ready. Still need to figure out the minimum compiler versions to require. src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp line 77: > 75: #define read_csr(csr) \ > 76: ({ \ > 77: unsigned long __v; \ Can this change be made separately? I'd like to have the C++17 switch be as clean as possible. src/hotspot/share/memory/allocation.cpp line 114: > 112: // > 113: > 114: void* AnyObj::operator new(size_t size, Arena *arena) { This change was recently made as part of JDK-8317132. ------------- Changes requested by kbarrett (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/14988#pullrequestreview-1783283088 PR Review Comment: https://git.openjdk.org/jdk/pull/14988#discussion_r1427655730 PR Review Comment: https://git.openjdk.org/jdk/pull/14988#discussion_r1427654886 From jwaters at openjdk.org Fri Dec 15 08:08:10 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 15 Dec 2023 08:08:10 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: References: Message-ID: > Implementation of [JEP draft: Compile the JDK as C++17](https://bugs.openjdk.org/browse/JDK-8310260) Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: - Merge branch 'openjdk:master' into patch-7 - Revert vm_version_linux_riscv.cpp - vm_version_linux_riscv.cpp - allocation.cpp - 8310260 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14988/files - new: https://git.openjdk.org/jdk/pull/14988/files/09cecd7e..a1f21bbd Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14988&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14988&range=00-01 Stats: 1027205 lines in 9267 files changed: 298949 ins; 587677 del; 140579 mod Patch: https://git.openjdk.org/jdk/pull/14988.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14988/head:pull/14988 PR: https://git.openjdk.org/jdk/pull/14988 From jwaters at openjdk.org Fri Dec 15 08:08:11 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 15 Dec 2023 08:08:11 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: <7D2C1yM2AS-G86oZF15MzMSrO0-pFMwXBVtVgzQ-_-Q=.0a5e2f6b-9aaa-483a-97c6-d75fee5f197d@github.com> References: <7D2C1yM2AS-G86oZF15MzMSrO0-pFMwXBVtVgzQ-_-Q=.0a5e2f6b-9aaa-483a-97c6-d75fee5f197d@github.com> Message-ID: On Fri, 15 Dec 2023 07:48:46 GMT, Kim Barrett wrote: >> Julian Waters has updated the pull 
request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into patch-7 >> - Revert vm_version_linux_riscv.cpp >> - vm_version_linux_riscv.cpp >> - allocation.cpp >> - 8310260 > > src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp line 77: > >> 75: #define read_csr(csr) \ >> 76: ({ \ >> 77: unsigned long __v; \ > > Can this change be made separately? I'd like to have the C++17 switch be as clean as possible. No problem! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14988#discussion_r1427669495 From mbaesken at openjdk.org Fri Dec 15 08:10:46 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 15 Dec 2023 08:10:46 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 15 Dec 2023 06:23:04 GMT, David Holmes wrote: > I had forgotten about the periodic checks related to `-Xcheck:jni`. Yes we probably could do something here. Hi David, what periodic JNI checks are you talking about ? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1857450740 From jwaters at openjdk.org Fri Dec 15 08:15:40 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 15 Dec 2023 08:15:40 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: References: <7D2C1yM2AS-G86oZF15MzMSrO0-pFMwXBVtVgzQ-_-Q=.0a5e2f6b-9aaa-483a-97c6-d75fee5f197d@github.com> Message-ID: <-sioWQwLXv1R5crwmt49RTy4FLTqLaHTqDxPpEIlPQY=.73c20e2d-fdcd-49a6-9851-c8198b20efc9@github.com> On Fri, 15 Dec 2023 08:03:45 GMT, Julian Waters wrote: >> src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp line 77: >> >>> 75: #define read_csr(csr) \ >>> 76: ({ \ >>> 77: unsigned long __v; \ >> >> Can this change be made separately? I'd like to have the C++17 switch be as clean as possible. > > No problem! There are, strangely, many more register keywords in the JDK codebase than just this one, but none of them throw the same errors, only this one does ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14988#discussion_r1427676677 From azafari at openjdk.org Fri Dec 15 08:30:40 2023 From: azafari at openjdk.org (Afshin Zafari) Date: Fri, 15 Dec 2023 08:30:40 GMT Subject: RFR: 8321892: Typo in log message logged by src/hotspot/share/nmt/virtualMemoryTracker.cpp In-Reply-To: References: Message-ID: <1Bl904Ss5bDJlOR29d4TJmRX6yk_D_zCOh4uM3oxpRU=.76d242ee-d471-4f92-8317-f16ee46c79e6@github.com> On Thu, 7 Dec 2023 18:22:32 GMT, Steven Schlansker wrote: > Discovered while deep in an InternalError debugging session... Thanks for fixing it. ------------- Marked as reviewed by azafari (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/17021#pullrequestreview-1783345227 From gcao at openjdk.org Fri Dec 15 08:55:46 2023 From: gcao at openjdk.org (Gui Cao) Date: Fri, 15 Dec 2023 08:55:46 GMT Subject: RFR: 8322154: RISC-V: JDK-8315743 missed change in MacroAssembler::load_reserved In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 03:56:04 GMT, Gui Cao wrote: > The fix for https://bugs.openjdk.org/browse/JDK-8315743 touches MacroAssembler::load_reserved, replacing `t0` with `dst`. But it missed the change for the third case (that is, `uint32`) of the switch in this assembler function. We should also replace the `t0` used in `zero_extend` with `dst`. @robehn can you help confirm this? > > - [x] Run tier1 tests on qemu 8.1.50 with UseRVV (release) Thanks all for the review. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17117#issuecomment-1857507829 From alanb at openjdk.org Fri Dec 15 09:00:43 2023 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 15 Dec 2023 09:00:43 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v6] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 22:57:53 GMT, Serguei Spitsyn wrote: >> This fix is for JDK 23 but the intention is to backport it to 22 in the RDP-1 time frame. >> It fixes a deadlock between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. >> The deadlocking scenario is well described by Patricio in a bug report comment. >> In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. >> >> The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about the begin/end of the critical section. 
>> This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. >> >> Some of the new notifications with the `notifyJvmtiSync()` method are on a performance-critical path, which is why this method has been intrinsified. >> >> New test was developed by Patricio: >> `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> The test is very nice as it reliably reproduces the deadlock 100% of the time without the fix. >> The test never fails with this fix. >> >> Testing: >> - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: moved a couple of comments out of try blocks src/hotspot/share/prims/jvm.cpp line 4019: > 4017: return; > 4018: } > 4019: assert(thread->is_disable_suspend() != (bool)enter, "recursive disable suspend is not allowed"); This is an important assert, the message should probably say nested or unbalanced enter/exit not allowed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427719197 From kbarrett at openjdk.org Fri Dec 15 09:08:40 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 15 Dec 2023 09:08:40 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: <-sioWQwLXv1R5crwmt49RTy4FLTqLaHTqDxPpEIlPQY=.73c20e2d-fdcd-49a6-9851-c8198b20efc9@github.com> References: <7D2C1yM2AS-G86oZF15MzMSrO0-pFMwXBVtVgzQ-_-Q=.0a5e2f6b-9aaa-483a-97c6-d75fee5f197d@github.com> <-sioWQwLXv1R5crwmt49RTy4FLTqLaHTqDxPpEIlPQY=.73c20e2d-fdcd-49a6-9851-c8198b20efc9@github.com> Message-ID: On Fri, 15 Dec 2023 08:12:47 GMT, Julian Waters wrote: >> No problem! 
> > There are, strangely, many more register keywords in the JDK codebase than just this one, but none of them throw the same errors, only this one does Looks like this change has also already been made, by JDK-8319440. All of the other non-comment uses of "register" I found in HotSpot are gcc local variable register specifications: https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Local-Register-Variables.html So they are a different thing and not affected by the deprecation/removal of the C++ "register" keyword. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14988#discussion_r1427728697 From rehn at openjdk.org Fri Dec 15 09:19:38 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 15 Dec 2023 09:19:38 GMT Subject: RFR: 8322154: RISC-V: JDK-8315743 missed change in MacroAssembler::load_reserved In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 03:56:04 GMT, Gui Cao wrote: > The fix for https://bugs.openjdk.org/browse/JDK-8315743 touches MacroAssembler::load_reserved, replacing `t0` with `dst`. But it missed the change for the third case (that is, `uint32`) of the switch in this assembler function. We should also replace the `t0` used in `zero_extend` with `dst`. @robehn can you help confirm this? > > - [x] Run tier1 tests on qemu 8.1.50 with UseRVV (release) I notice some of the RISC-V PRs go a bit fast. Note https://openjdk.org/guide/ -> "Allow enough time for review". Hence I can sponsor, but not now. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17117#issuecomment-1857540340 From alanb at openjdk.org Fri Dec 15 09:29:41 2023 From: alanb at openjdk.org (Alan Bateman) Date: Fri, 15 Dec 2023 09:29:41 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v6] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 22:57:53 GMT, Serguei Spitsyn wrote: >> This fix is for JDK 23 but the intention is to backport it to 22 in the RDP-1 time frame. 
>> It fixes a deadlock between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. >> The deadlocking scenario is well described by Patricio in a bug report comment. >> In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. >> >> The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about the begin/end of the critical section. >> This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. >> >> Some of the new notifications with the `notifyJvmtiSync()` method are on a performance-critical path, which is why this method has been intrinsified. >> >> New test was developed by Patricio: >> `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> The test is very nice as it reliably reproduces the deadlock 100% of the time without the fix. >> The test never fails with this fix. >> >> Testing: >> - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: moved a couple of comments out of try blocks I think this is okay; I don't have any other comments. ------------- Marked as reviewed by alanb (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17011#pullrequestreview-1783445183 From jkern at openjdk.org Fri Dec 15 09:59:40 2023 From: jkern at openjdk.org (Joachim Kern) Date: Fri, 15 Dec 2023 09:59:40 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 06:44:03 GMT, Thomas Stuefe wrote: >> src/hotspot/os/aix/os_aix.cpp line 1129: >> >>> 1127: >>> 1128: // get the library search path burned in to the executable file during linking >>> 1129: // If the libpath cannot be retrieved return an empty path >> >> This is new. Is this complexity needed, if yes, why? Don't see a comment, may have missed it. > > Also, why are we parsing xcoff32 headers in there? AIX OpenJDK will always be 64-bit. So, you can replace the whole xcoff32 section with assert( f_magic == U802TOCMAGIC, ..). The function becomes a lot simpler then. I found a leak in my previous implementation. It is more or less academic, but this solution is the complete one. I would prefer this complete solution even if it is complex, because if dlopen follows a slightly different algorithm in resolving the library we will surely get into trouble. If we omit the xcoff32 handling we have to ensure that no xcoff32 executable file comes into play. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427782107 From jkern at openjdk.org Fri Dec 15 10:21:44 2023 From: jkern at openjdk.org (Joachim Kern) Date: Fri, 15 Dec 2023 10:21:44 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 06:15:15 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> followed the proposals > > src/hotspot/os/aix/os_aix.cpp line 206: > >> 204: constexpr int max_handletable = 1024; >> 205: static int g_handletable_used = 0; >> 206: static struct handletableentry g_handletable[max_handletable] = {{0, 0, 0, 0}}; > > I would move all that new and clearly delineated dlopen stuff into its own file, e.g. dlopen_aix.cpp or porting_aix.cpp (in porting_aix.cpp, we already have wrappers for other functions). os_aix.cpp is already massive. I moved the static variable declarations and the functions `Aix_dlopen()`, `search_file_in_LIBPATH()`, `rtv_linkedin_libpath()` and `os::pd_dll_unload()` to porting_aix.cpp. This links, but in my opinion `os::pd_dll_unload()` should reside in os_aix.cpp, because it is a member of the os class. But there it will not compile anymore if the static variables are moved away. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427803856 From stuefe at openjdk.org Fri Dec 15 10:21:46 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 15 Dec 2023 10:21:46 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 09:57:19 GMT, Joachim Kern wrote: > If we omit the xcoff32 handling we have to ensure that no xcoff32 executable file comes into play. xcoff32 is for 32-bit binaries. 
The AIX port only exists for 64-bit, and there will never be a 32-bit AIX port, so there is no reason for handling 32-bit xcoff headers. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427803763 From stuefe at openjdk.org Fri Dec 15 10:29:40 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 15 Dec 2023 10:29:40 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 10:18:53 GMT, Joachim Kern wrote: >> src/hotspot/os/aix/os_aix.cpp line 206: >> >>> 204: constexpr int max_handletable = 1024; >>> 205: static int g_handletable_used = 0; >>> 206: static struct handletableentry g_handletable[max_handletable] = {{0, 0, 0, 0}}; >> >> I would move all that new and clearly delineated dlopen stuff into its own file, e.g. dlopen_aix.cpp or porting_aix.cpp (in porting_aix.cpp, we already have wrappers for other functions). os_aix.cpp is already massive. > > I moved the static variable declarations and the functions `Aix_dlopen()`, `search_file_in_LIBPATH()`, `rtv_linkedin_libpath()` and `os::pd_dll_unload()` to porting_aix.cpp. This links, but in my opinion `os::pd_dll_unload()` should reside in os_aix.cpp, because it is a member of the os class. But there it will not compile anymore if the static variables are moved away. No, what I meant was to provide a "libc-like" equivalent for dlopen, similar to what we do with dladdr (see https://github.com/openjdk/jdk/blob/b7676822886eac21f61ff361a32928a966d8fe31/src/hotspot/os/aix/porting_aix.cpp#L306). But never mind; I am also fine with moving os::pd_dlopen into a different cpp file, e.g. "dlopen_aix.cpp". Just move it out of os_aix.cpp, since that is already massive and you are adding more than 300 lines of code and more dependencies. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427812795 From sspitsyn at openjdk.org Fri Dec 15 10:49:56 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 15 Dec 2023 10:49:56 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v7] In-Reply-To: References: Message-ID: <4iSULgKTef_C2q4AJpEKB64tZh_QDIB77Ov2rwZ78nY=.39d3c708-4175-42b5-8eb9-58684e131ccf@github.com> > This fix is for JDK 23 but the intention is to backport it to 22 in the RDP-1 time frame. > It fixes a deadlock between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. > The deadlocking scenario is well described by Patricio in a bug report comment. > In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. > > The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about the begin/end of the critical section. > This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. > > Some of the new notifications with the `notifyJvmtiSync()` method are on a performance-critical path, which is why this method has been intrinsified. > > New test was developed by Patricio: > `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > The test is very nice as it reliably reproduces the deadlock 100% of the time without the fix. > The test never fails with this fix. 
> > Testing: > - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: review: improve an assert message ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17011/files - new: https://git.openjdk.org/jdk/pull/17011/files/917dc724..6f8cdf06 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17011&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17011&range=05-06 Stats: 2 lines in 1 file changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17011.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17011/head:pull/17011 PR: https://git.openjdk.org/jdk/pull/17011 From sspitsyn at openjdk.org Fri Dec 15 10:50:01 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Fri, 15 Dec 2023 10:50:01 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v6] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 08:57:45 GMT, Alan Bateman wrote: >> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: >> >> review: moved a couple of comments out of try blocks > > src/hotspot/share/prims/jvm.cpp line 4019: > >> 4017: return; >> 4018: } >> 4019: assert(thread->is_disable_suspend() != (bool)enter, "recursive disable suspend is not allowed"); > > This is an important assert, the message should probably say nested or unbalanced enter/exit not allowed. Thanks. Updated the assert message as suggested. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1427830694 From rehn at openjdk.org Fri Dec 15 11:12:09 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 15 Dec 2023 11:12:09 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v5] In-Reply-To: References: Message-ID: > Hi, please consider. 
> > Main author is @luhenry, I only fixed some minor things and tested it. > > Such as: > test/hotspot/jtreg/compiler/intrinsics/sha/ > test/jdk/java/security/MessageDigest/ > test/jdk/jdk/security/ > tier1 > > And still running some test. Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Index load, other comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16562/files - new: https://git.openjdk.org/jdk/pull/16562/files/fdb17d1b..283f186f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=03-04 Stats: 34 lines in 4 files changed: 10 ins; 17 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/16562.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16562/head:pull/16562 PR: https://git.openjdk.org/jdk/pull/16562 From rehn at openjdk.org Fri Dec 15 11:12:12 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 15 Dec 2023 11:12:12 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v2] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 13:31:43 GMT, Ludovic Henry wrote: >> src/hotspot/cpu/riscv/macroAssembler_riscv.hpp line 1359: >> >>> 1357: } >>> 1358: >>> 1359: inline void vmsltu_vi(VectorRegister Vd, VectorRegister Vs2, int32_t imm, VectorMask vm = unmasked) { >> >> Seems this function is not used in the code? >> And, when `imm` == 0, seems it will output unexpected value? > > We can keep it as it's a good complement to existing methods and the cost is very low. Good point on `imm == 0`, we should probably have an assert for that case or have a `splat(false)` operation. 
Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1427851962 From rehn at openjdk.org Fri Dec 15 11:12:12 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 15 Dec 2023 11:12:12 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v5] In-Reply-To: References: Message-ID: On Tue, 28 Nov 2023 14:14:45 GMT, Robbin Ehn wrote: >> Yeah. Why not consider something simpler if there is no known big difference in performance numbers? And this is the first version, when RVV-1.0 compatible hardware is not popular yet :-) > > Not yet addressed. Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1427850585 From rehn at openjdk.org Fri Dec 15 11:12:13 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 15 Dec 2023 11:12:13 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v5] In-Reply-To: References: <9x5sC6aXWG2OUYXdS97o-fJgjhNODf-mVC69bQNSSjI=.6425f2fc-d793-4b49-bf97-1ea55d0fd443@github.com> <_LkvimbOKKuIZon0Ajv9lKReO19xQjFI2VH2b4hsCE4=.89f5725a-150c-4a03-a6c2-a71a2f5fe3b6@github.com> <1rTN32en51Pjpr-mdaDjw3UzQnf7W4J8JQTf-CMG04s=.904657b9-7a3a-46e3-8936-cf0f16b5c7b9@github.com> <5ydUXSyM7-XcGRH86bvVH4LJM94sAY7rahyUeqcrkBk=.e237d328-06f4-4919-af88-ea6f56d0b202@github.com> Message-ID: On Tue, 28 Nov 2023 14:16:37 GMT, Robbin Ehn wrote: >> We don't have such hardware either, we simulate via gem5. >> Ventana v2 should have a 15-wide pipeline with RVV 1.0; who knows how this will execute on such :) >> >> As we don't know, I think you are correct that we should write the most readable version first. >> And later we can adapt these for the hwprobe triplet of vendor/arch/impl if we think that it's worth it. > > Not yet addressed. As I don't have access to these instructions in the performance simulator, I'll leave this as is, since now 256/512 are identical in that regard. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1427851619 From rehn at openjdk.org Fri Dec 15 11:12:15 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 15 Dec 2023 11:12:15 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v5] In-Reply-To: References: Message-ID: On Tue, 28 Nov 2023 14:14:57 GMT, Robbin Ehn wrote: >> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4215: >> >>> 4213: __ vslidedown_vi(v16, v27, 2); // v16 = {_,_,e,f} >>> 4214: // Merge elements [3..2] of v26 ({a,b}) into elements [3..2] of v16 >>> 4215: __ vmerge_vvm(v16, v26, v16); // v16 = {a,b,e,f} >> >> Similar here. Can we make use of index-load and index-store to simplify the code for the 512 case too? > > Not yet addressed. Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1427850711 From rehn at openjdk.org Fri Dec 15 11:16:45 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 15 Dec 2023 11:16:45 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v4] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 15:11:02 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: > > - Merge branch 'master' into sha256 > - Materialize constants address once > - Removed template > - Flag fixes > - Merge branch 'master' into sha256 > - Share code > - SHA-2 Hi all, I have addressed all comments. The only code change I didn't do was register caching of constants. This is because I don't have access to sha2 in the performance simulator. 
Without it 256 and 512 have 'identical' paths. I'll create a jira for that, so I can revisit it once I have access. I hope that is okay @RealFYang? (i.e. ship this and do a follow-up) Also @VladimirKempik the flag issue is not resolved. For now we use this experimental flag, which is in line with the other flags. Any other things to address, new or that I missed? (passes compiler/intrinsics/sha/) ------------- PR Comment: https://git.openjdk.org/jdk/pull/16562#issuecomment-1857702779 From jkern at openjdk.org Fri Dec 15 11:24:44 2023 From: jkern at openjdk.org (Joachim Kern) Date: Fri, 15 Dec 2023 11:24:44 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 06:15:56 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> followed the proposals > > src/hotspot/os/aix/os_aix.cpp line 1135: > >> 1133: >> 1134: if (libpath) >> 1135: return libpath; > > { } Done. > src/hotspot/os/aix/os_aix.cpp line 1137: > >> 1135: return libpath; >> 1136: >> 1137: char pgmpath[32+1]; > > Will overflow if pid_t is 64bit. Give it a larger size; after all, you are giving buffer 4K above, so you are not overly concerned with saving stack space. Adopted; now using buffer instead of pgmpath. > src/hotspot/os/aix/os_aix.cpp line 1146: > >> 1144: fread(buffer, 1, FILHSZ_64 + _AOUTHSZ_EXEC_64, f); >> 1145: >> 1146: if (((struct filehdr*)buffer)->f_magic == U802TOCMAGIC ) { > > as stated above, I don't think this section is needed. Completely rewritten; only xcoff64 is handled. > src/hotspot/os/aix/os_aix.cpp line 1170: > >> 1168: else if (((struct filehdr*)buffer)->f_magic == U64_TOCMAGIC ) { >> 1169: // __XCOFF64__ >> 1170: struct _S_(xcoffhdr) xcoff64; > > what's with the `_S_`? 
Not needed any more, because only xcoff64 is handled. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427862523 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427862370 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427863562 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427864005 From jwaters at openjdk.org Fri Dec 15 11:27:42 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 15 Dec 2023 11:27:42 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: References: <7D2C1yM2AS-G86oZF15MzMSrO0-pFMwXBVtVgzQ-_-Q=.0a5e2f6b-9aaa-483a-97c6-d75fee5f197d@github.com> <-sioWQwLXv1R5crwmt49RTy4FLTqLaHTqDxPpEIlPQY=.73c20e2d-fdcd-49a6-9851-c8198b20efc9@github.com> Message-ID: On Fri, 15 Dec 2023 09:05:37 GMT, Kim Barrett wrote: >> There are, strangely, many more register keywords in the JDK codebase than just this one, but none of them throw the same errors, only this one does > > Looks like this change has also already been made, by JDK-8319440. > > All of the other non-comment uses of "register" I found in HotSpot are gcc local variable register specifications: > https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Local-Register-Variables.html > So they are a different thing and not affected by the deprecation/removal of the C++ "register" keyword. Ah, I was not aware that the asm specifications for explicit registers override the warning about the register keyword, thanks! 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14988#discussion_r1427866744 From jkern at openjdk.org Fri Dec 15 11:27:47 2023 From: jkern at openjdk.org (Joachim Kern) Date: Fri, 15 Dec 2023 11:27:47 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 07:01:06 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> followed the proposals > > src/hotspot/os/aix/os_aix.cpp line 1174: > >> 1172: struct _S_(ldhdr) ldr64; >> 1173: memcpy((char*)&xcoff64, buffer, FILHSZ_64 + _AOUTHSZ_EXEC_64); >> 1174: int ldroffset = FILHSZ_64 + xcoff64.filehdr.f_opthdr + (xcoff64.aouthdr.o_snloader -1)*SCNHSZ_64; > > why the -1? I assume that's the section number? is it 1 based? how odd.. Yes, the section numbers are 1-based, e.g. the beginning of section 4 has an offset of 3 section sizes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427866203 From jkern at openjdk.org Fri Dec 15 11:31:41 2023 From: jkern at openjdk.org (Joachim Kern) Date: Fri, 15 Dec 2023 11:31:41 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 07:20:47 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> followed the proposals > > src/hotspot/os/aix/os_aix.cpp line 1187: > >> 1185: fread(buffer, 1, LDHDRSZ_64, f); >> 1186: memcpy((char*)&ldr64, buffer, LDHDRSZ_64); >> 1187: fseek (f, scn64.s_scnptr + ldr64.l_impoff, SEEK_SET); > > nit: please use consistent spacing according to hotspot rules. here, remove space. Do you mean the space in `fseek (`? Done. > src/hotspot/os/aix/os_aix.cpp line 1191: > >> 1189: } >> 1190: else >> 1191: buffer[0] = 0; > > {} Done, due to the complete rewrite. See above. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427869786 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427870433 From vkempik at openjdk.org Fri Dec 15 11:34:42 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Fri, 15 Dec 2023 11:34:42 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 11:11:59 GMT, Robbin Ehn wrote: > Hi all, I have addressed all comments. > > The only code change I didn't do was register caching of constants. This is because I don't have access to sha2 in the performance simulator. Without it 256 and 512 have 'identical' paths. I'll create a jira for that, so I can revisit it once I have access. I hope that is okay @RealFYang? (i.e. ship this and do a follow-up) > > Also @VladimirKempik the flag issue is not resolved. For now we use this experimental flag, which is in line with the other flags. > > Any other things to address, new or that I missed? > > (passes compiler/intrinsics/sha/) It was mostly a wish that we look at the flags later and simplify them. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16562#issuecomment-1857731028 From luhenry at openjdk.org Fri Dec 15 11:38:40 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Fri, 15 Dec 2023 11:38:40 GMT Subject: RFR: 8322154: RISC-V: JDK-8315743 missed change in MacroAssembler::load_reserved In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 03:56:04 GMT, Gui Cao wrote: > The fix for https://bugs.openjdk.org/browse/JDK-8315743 touches MacroAssembler::load_reserved, replacing `t0` with `dst`. But it missed the change for the third case (that is, `uint32`) of the switch in this assembler function. We should also replace the `t0` used in `zero_extend` with `dst`. @robehn can you help confirm this? > > - [x] Run tier1 tests on qemu 8.1.50 with UseRVV (release) Marked as reviewed by luhenry (Committer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/17117#pullrequestreview-1783684654 From jkern at openjdk.org Fri Dec 15 11:39:41 2023 From: jkern at openjdk.org (Joachim Kern) Date: Fri, 15 Dec 2023 11:39:41 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 07:27:14 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> followed the proposals > > src/hotspot/os/aix/os_aix.cpp line 1234: > >> 1232: >> 1233: stringStream Libpath; >> 1234: if (env == nullptr) { > > Proposal for shorter version not needing string assembly: > > const char* paths [2] = { env, rtv_linkedin_libpath() }; > for (int i = 0; i < 2; i++) { > const char* this_libpath = paths[i]; > if (this_libpath == nullptr) { > continue; > } > ... do the token thing... > } > } Sorry, I do not clearly understand how this should work. The mystery must be in _... do the token thing ..._ ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1427877337 From luhenry at openjdk.org Fri Dec 15 11:40:45 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Fri, 15 Dec 2023 11:40:45 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v5] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 11:12:09 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Index load, other comment Marked as reviewed by luhenry (Committer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/16562#pullrequestreview-1783688607 From jkern at openjdk.org Fri Dec 15 11:57:51 2023 From: jkern at openjdk.org (Joachim Kern) Date: Fri, 15 Dec 2023 11:57:51 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v5] In-Reply-To: References: Message-ID: > On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. > > This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). > > We propose a different, cleaner way of handling this: > > - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. > - Cache dl handles; repeated opening of a library should return the cached handle. > - Increase handle-local ref counter on open, Decrease it on close > - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). > > This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. 
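The caching and ref-counting scheme listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: the class name `HandleCache`, keying the cache by the path string, and the injected `Opener`/`Closer` callbacks (standing in for the platform `dlopen`/`dlclose` calls) are all hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical sketch: cache dl handles so that repeated loads of the same
// library return the same handle, and only the last unload really closes it.
class HandleCache {
 public:
  using Opener = std::function<void*(const std::string&)>;  // stands in for dlopen
  using Closer = std::function<void(void*)>;                // stands in for dlclose

  HandleCache(Opener open, Closer close)
      : _open(std::move(open)), _close(std::move(close)) {}

  // Returns the cached handle if the library is already open; otherwise opens
  // it. Every successful call adds one reference.
  void* load(const std::string& path) {
    auto it = _entries.find(path);
    if (it != _entries.end()) {
      it->second.refcount++;
      return it->second.handle;
    }
    void* h = _open(path);
    if (h != nullptr) {
      _entries.emplace(path, Entry{h, 1});
    }
    return h;
  }

  // Drops one reference. Returns true only if this call closed the library.
  bool unload(void* handle) {
    for (auto it = _entries.begin(); it != _entries.end(); ++it) {
      if (it->second.handle == handle) {
        assert(it->second.refcount > 0);
        if (--it->second.refcount == 0) {
          _close(handle);
          _entries.erase(it);
          return true;
        }
        return false;
      }
    }
    return false;  // unknown handle: nothing to do in this sketch
  }

 private:
  struct Entry {
    void* handle;
    int refcount;
  };
  Opener _open;
  Closer _close;
  std::unordered_map<std::string, Entry> _entries;
};
```

With an opener that hands out a fresh handle on every call (as described for AIX), two `load` calls for the same path still yield one handle, and only the second `unload` reaches the closer.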
Joachim Kern has updated the pull request incrementally with two additional commits since the last revision: - trailing whitespace - Following most of Thomas proposals ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16920/files - new: https://git.openjdk.org/jdk/pull/16920/files/b7676822..18d9d2b0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=03-04 Stats: 562 lines in 3 files changed: 272 ins; 290 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16920.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16920/head:pull/16920 PR: https://git.openjdk.org/jdk/pull/16920 From jkern at openjdk.org Fri Dec 15 11:57:53 2023 From: jkern at openjdk.org (Joachim Kern) Date: Fri, 15 Dec 2023 11:57:53 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Tue, 12 Dec 2023 14:05:48 GMT, Joachim Kern wrote: >> On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. >> >> This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). 
>> >> We propose a different, cleaner way of handling this: >> >> - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. >> - Cache dl handles; repeated opening of a library should return the cached handle. >> - Increase handle-local ref counter on open, Decrease it on close >> - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). >> >> This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > followed the proposals The libpath parsing code is from me, so no license problems. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16920#issuecomment-1857762912 From kbarrett at openjdk.org Fri Dec 15 12:34:41 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 15 Dec 2023 12:34:41 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: References: <7D2C1yM2AS-G86oZF15MzMSrO0-pFMwXBVtVgzQ-_-Q=.0a5e2f6b-9aaa-483a-97c6-d75fee5f197d@github.com> <-sioWQwLXv1R5crwmt49RTy4FLTqLaHTqDxPpEIlPQY=.73c20e2d-fdcd-49a6-9851-c8198b20efc9@github.com> Message-ID: On Fri, 15 Dec 2023 11:25:24 GMT, Julian Waters wrote: >> Looks like this change has also already been made, by JDK-8319440. >> >> All of the other non-comment uses of "register" I found in HotSpot are gcc local variable register specifications: >> https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Local-Register-Variables.html >> So are a different thing and not affected by the deprecation/removal of the C++ "register" keyword. > > Ah, I was not aware that the asm specifications for explicit registers overrides the warning about the register keyword, thanks! It's not that it overrides the warning. They are different syntactic constructs that just happen to have a word in common. The joys of parsing C++ ... 
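To make the distinction concrete, here is a small sketch of the GNU local register variable form, which is a different construct from the removed C++ `register` storage class. The function name `through_r12` and the choice of `r12` are illustrative assumptions; the extension is only used when compiling with GCC for x86-64, and other configurations take the plain fallback path.

```cpp
// Returns its argument unchanged. With GCC on x86-64 the value is routed
// through r12 using a GNU local register variable; elsewhere it is a plain copy.
long through_r12(long x) {
#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__)
  // GNU extension: 'register' followed by asm("reg") names a specific
  // hardware register for use as an inline-asm operand.
  register long v asm("r12") = x;
  asm volatile("" : "+r"(v));  // empty asm that takes v as a register operand
  return v;
#else
  long v = x;  // portable fallback, no extension involved
  return v;
#endif
}
```

A plain `register long v = x;` without the `asm("...")` part is the form that C++17 removed.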
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14988#discussion_r1427925300 From dholmes at openjdk.org Fri Dec 15 12:43:40 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 15 Dec 2023 12:43:40 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 09:05:22 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. >> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust macOS coding I previously stated "JNI checking is for checking actual JNI API functions" but Thomas reminded me above that it also enables periodic checks e.g. for the signal handlers, so we already go beyond just checking JNI API function calls. Hence there is scope for performing these FP checks under -Xcheck:jni. 
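As a rough illustration of the kind of floating-point environment check being discussed, the sketch below saves the environment, runs a load step, and restores the environment if the rounding mode changed. It is not the HotSpot code: `load_with_fenv_guard` is a hypothetical name, the callback stands in for the `dlopen` call, and only the rounding mode is compared here.

```cpp
#include <cfenv>
#include <functional>

// Runs `load` (standing in for a library load) and reports whether it left the
// rounding mode changed. If so, the saved floating-point environment is restored.
bool load_with_fenv_guard(const std::function<void()>& load) {
  std::fenv_t saved;
  std::fegetenv(&saved);                      // remember the environment
  const int mode_before = std::fegetround();  // remember the rounding mode
  load();
  const bool corrupted = (std::fegetround() != mode_before);
  if (corrupted) {
    std::fesetenv(&saved);                    // undo what the library did
  }
  return corrupted;
}
```

The boolean result is the piece of information that could then be logged or attached to an event, whichever reporting path is chosen.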
------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1857822793 From jwaters at openjdk.org Fri Dec 15 12:58:39 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 15 Dec 2023 12:58:39 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: References: <7D2C1yM2AS-G86oZF15MzMSrO0-pFMwXBVtVgzQ-_-Q=.0a5e2f6b-9aaa-483a-97c6-d75fee5f197d@github.com> <-sioWQwLXv1R5crwmt49RTy4FLTqLaHTqDxPpEIlPQY=.73c20e2d-fdcd-49a6-9851-c8198b20efc9@github.com> Message-ID: On Fri, 15 Dec 2023 12:31:51 GMT, Kim Barrett wrote: >> Ah, I was not aware that the asm specifications for explicit registers overrides the warning about the register keyword, thanks! > > It's not that it overrides the warning. They are different syntactic constructs that just happen to have > a word in common. The joys of parsing C++ ... The keyword also happens to go in the same location as well. How coincidental... ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14988#discussion_r1427947238 From jwaters at openjdk.org Fri Dec 15 13:08:38 2023 From: jwaters at openjdk.org (Julian Waters) Date: Fri, 15 Dec 2023 13:08:38 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: References: <7D2C1yM2AS-G86oZF15MzMSrO0-pFMwXBVtVgzQ-_-Q=.0a5e2f6b-9aaa-483a-97c6-d75fee5f197d@github.com> <-sioWQwLXv1R5crwmt49RTy4FLTqLaHTqDxPpEIlPQY=.73c20e2d-fdcd-49a6-9851-c8198b20efc9@github.com> Message-ID: On Fri, 15 Dec 2023 12:56:07 GMT, Julian Waters wrote: >> It's not that it overrides the warning. They are different syntactic constructs that just happen to have >> a word in common. The joys of parsing C++ ... > > The keyword also happens to go in the same location as well. How coincidental... I also realized that this uses a gcc statement expression currently, I wonder if this could use a lambda expression instead in another change? 
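For readers unfamiliar with the two forms, a minimal sketch of the idea follows. The code under discussion is not shown in this thread, so the function `clamp_non_negative` and its body are purely hypothetical; the GNU statement expression appears only in a comment, and the compiled code uses the standard immediately invoked lambda.

```cpp
// GNU statement expression (non-standard), shown only as a comment:
//   const int v = ({ const int t = input * 2; t < 0 ? 0 : t; });
//
// Standard C++ equivalent: an immediately invoked lambda.
int clamp_non_negative(int input) {
  const int v = [&] {
    const int t = input * 2;  // stand-in for a multi-statement computation
    return t < 0 ? 0 : t;     // the lambda's return value becomes v
  }();
  return v;
}
```

One difference to keep in mind: a `return` inside the lambda only leaves the lambda, whereas a statement expression can contain control flow that leaves the enclosing function, so such uses would need restructuring.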
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14988#discussion_r1427955879 From kcr at openjdk.org Fri Dec 15 13:18:44 2023 From: kcr at openjdk.org (Kevin Rushforth) Date: Fri, 15 Dec 2023 13:18:44 GMT Subject: [jdk22] RFR: 8321542: C2: Missing ChaCha20 stub for x86_32 leads to crashes In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 18:05:18 GMT, Vladimir Kozlov wrote: > Why it needs review if backport is clean? Because that's the policy for the feature release during stabilization. See the [Integrating fixes and enhancements](https://openjdk.org/jeps/3#Integrating-fixes-and-enhancements) section of JEP 3. ------------- PR Comment: https://git.openjdk.org/jdk22/pull/14#issuecomment-1857866844 From fyang at openjdk.org Fri Dec 15 13:39:41 2023 From: fyang at openjdk.org (Fei Yang) Date: Fri, 15 Dec 2023 13:39:41 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v5] In-Reply-To: References: Message-ID: On Tue, 12 Dec 2023 21:09:45 GMT, Ludovic Henry wrote: >> 8315856: RISC-V: Use Zacas extension for cmpxchg > > Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: > > Fix narrow compxchg Two minor comments remain. What about the test coverage? src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 2969: > 2967: seqz(result, t0); > 2968: } else { > 2969: if (result != expected) { The same check is already there in `mv` [1]. [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp#L470 src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3096: > 3094: #undef ATOMIC_CAS > 3095: > 3096: #define ATOMIC_CASU(OP1, OP2) \ Nit: Need one extra space to keep the `\` aligned.
------------- PR Review: https://git.openjdk.org/jdk/pull/16910#pullrequestreview-1783057607 PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1427503033 PR Review Comment: https://git.openjdk.org/jdk/pull/16910#discussion_r1427652620 From mdoerr at openjdk.org Fri Dec 15 13:54:47 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 15 Dec 2023 13:54:47 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 08:08:10 GMT, Julian Waters wrote: >> Implementation of [JEP draft: Compile the JDK as C++17](https://bugs.openjdk.org/browse/JDK-8310260) > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-7 > - Revert vm_version_linux_riscv.cpp > - vm_version_linux_riscv.cpp > - allocation.cpp > - 8310260 In case you want to update the required compiler versions as part of this PR: We have tested -TOOLCHAIN_MINIMUM_VERSION_xlc="16.1.0.0011" +TOOLCHAIN_MINIMUM_VERSION_xlc="17.1.1.4" (Already discussed with Kim.) ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1857916995 From rehn at openjdk.org Fri Dec 15 14:02:49 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 15 Dec 2023 14:02:49 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions Message-ID: Hi, these are the instructions for zcb. Due to our lack of infrastructure, having multiple extension-dependent instructions does not fit well. Some of these compressed instructions are also missing a 1-to-1 mapping, e.g. now we have a compressed not, but the corresponding uncompressed instruction is still xor. I think we need to do some rework here. I also don't like the macro expansion, as it is hopeless in debuggers and 'IDE's (vim+rtags for me).
(macro stuff was originally done when templates were blacklisted in hotspot) And I don't want an option for this, as zcb is coming in hwprobe; if you have compressed on, you get them if they are supported (may depend on e.g. zbb). I have done some modifications since it passed tier1, so I'm running stuff over the weekend. ------------- Commit messages: - zcb instruction set Changes: https://git.openjdk.org/jdk/pull/17122/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17122&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8320069 Stats: 318 lines in 5 files changed: 277 ins; 0 del; 41 mod Patch: https://git.openjdk.org/jdk/pull/17122.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17122/head:pull/17122 PR: https://git.openjdk.org/jdk/pull/17122 From fyang at openjdk.org Fri Dec 15 14:04:47 2023 From: fyang at openjdk.org (Fei Yang) Date: Fri, 15 Dec 2023 14:04:47 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 11:11:59 GMT, Robbin Ehn wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: >> >> - Merge branch 'master' into sha256 >> - Materialize constants address once >> - Removed template >> - Flag fixes >> - Merge branch 'master' into sha256 >> - Share code >> - SHA-2 > > Hi all, I have address all comments. > > The only code change I didn't do was register caching of constants. > This is because I don't have access to sha2 in performance simulator. > Without it 256 and 512 have 'identical' path. > I'll create a jira for that, so I can revisit it once I have access. > I hope that is okay @RealFYang ? (i.e. ship this and do a follow-up) > > Also @VladimirKempik the flag issue is not resolved. > For now we use this experimental flag which is inline with the other flags.
> > Any other things to address, new or that I missed? > > (passes compiler/intrinsics/sha/) > > REF: https://bugs.openjdk.org/browse/JDK-8322177 @robehn : Thanks for the update. I am OK to leave aside caching of constants for some while when we can compare the numbers. I think I can take another look early next week. But I doubt about the need of saving & restoring of `t2` in the latest version. According to the ABI, `t2` is a caller-save register and is supposed to be saved by the call is it is alive. Also, the vector registers used in this stub are also caller-save registers, we don't save them either on stub entry. (BTW: Seems this file should not be there in this PR: src/hotspot/cpu/riscv/.macroAssembler_riscv.cpp.swp) ------------- PR Comment: https://git.openjdk.org/jdk/pull/16562#issuecomment-1857925864 From rehn at openjdk.org Fri Dec 15 14:10:44 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 15 Dec 2023 14:10:44 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 11:11:59 GMT, Robbin Ehn wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains seven additional commits since the last revision: >> >> - Merge branch 'master' into sha256 >> - Materialize constants address once >> - Removed template >> - Flag fixes >> - Merge branch 'master' into sha256 >> - Share code >> - SHA-2 > > Hi all, I have address all comments. > > The only code change I didn't do was register caching of constants. > This is because I don't have access to sha2 in performance simulator. > Without it 256 and 512 have 'identical' path. > I'll create a jira for that, so I can revisit it once I have access. > I hope that is okay @RealFYang ? (i.e. ship this and do a follow-up) > > Also @VladimirKempik the flag issue is not resolved. 
> For now we use this experimental flag which is inline with the other flags. > > Any other things to address, new or that I missed? > > (passes compiler/intrinsics/sha/) > > REF: https://bugs.openjdk.org/browse/JDK-8322177 > @robehn : Thanks for the update. I am OK to leave aside caching of constants for some while when we can compare the numbers. I think I can take another look early next week. But I doubt about the need of saving & restoring of `t2` in the latest version. According to the ABI, `t2` is a caller-save register and is supposed to be saved by the call is it is alive. Also, the vector registers used in this stub are also caller-save registers. > > (BTW: Seems this file should not be there in this PR: src/hotspot/cpu/riscv/.macroAssembler_riscv.cpp.swp) This stubroutine should be inlined via "LibraryCallKit::try_to_inline", meaning there is no call here. Hence why you need to use "enter" to setup a frame. If there is no call, nothing is caller saved, AFAICT we pass in live registers via the Nodes. (no argument passing) ciMethod* callee = kit.callee(); const int bci = kit.bci(); // Try to inline the intrinsic. if (callee->check_intrinsic_candidate() && kit.try_to_inline(_last_predicate)) { Correct me if I'm wrong? (and if there is no call CR must be kept alive, which is a bit scary, e.g. must spill t1 to make sure we don't kill CR) ------------- PR Comment: https://git.openjdk.org/jdk/pull/16562#issuecomment-1857940451 From rehn at openjdk.org Fri Dec 15 14:17:01 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 15 Dec 2023 14:17:01 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v6] In-Reply-To: References: Message-ID: > Hi, please consider. > > Main author is @luhenry, I only fixed some minor things and tested it. > > Such as: > test/hotspot/jtreg/compiler/intrinsics/sha/ > test/jdk/java/security/MessageDigest/ > test/jdk/jdk/security/ > tier1 > > And still running some test. 
Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: Removed swap file ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16562/files - new: https://git.openjdk.org/jdk/pull/16562/files/283f186f..c92975e0 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=04-05 Stats: 0 lines in 1 file changed: 0 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16562.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16562/head:pull/16562 PR: https://git.openjdk.org/jdk/pull/16562 From luhenry at openjdk.org Fri Dec 15 14:46:43 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Fri, 15 Dec 2023 14:46:43 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 14:06:04 GMT, Robbin Ehn wrote: > This stubroutine should be inlined via "LibraryCallKit::try_to_inline", meaning there is no call here. It is not inlined because it's a stub. If it were to be inlined (which it shouldn't given how big it is), it should be declared in `macroAssembler_riscv.*` with the corresponding C2 Node. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16562#issuecomment-1857994269 From luhenry at openjdk.org Fri Dec 15 14:51:06 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Fri, 15 Dec 2023 14:51:06 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v6] In-Reply-To: References: Message-ID: > 8315856: RISC-V: Use Zacas extension for cmpxchg Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16910/files - new: https://git.openjdk.org/jdk/pull/16910/files/0b0363da..9dbad9ac Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16910&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16910&range=04-05 Stats: 4 lines in 1 file changed: 0 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16910.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16910/head:pull/16910 PR: https://git.openjdk.org/jdk/pull/16910 From luhenry at openjdk.org Fri Dec 15 14:51:08 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Fri, 15 Dec 2023 14:51:08 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v5] In-Reply-To: References: Message-ID: <4OazKD1uEYF1YKskIjZw63n00yhAuGh-OP9hwhcUmVc=.c165cc70-f521-47f1-91bd-f56ff95bc949@github.com> On Fri, 15 Dec 2023 13:36:42 GMT, Fei Yang wrote: > What about the test coverage? it's been tested on QEMU running hotspot tier1, everything is passing. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16910#issuecomment-1857995595 From fyang at openjdk.org Fri Dec 15 14:52:48 2023 From: fyang at openjdk.org (Fei Yang) Date: Fri, 15 Dec 2023 14:52:48 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 14:06:04 GMT, Robbin Ehn wrote: > Correct me if I'm wrong? > > (and if there is no call CR must be kept alive, which is a bit scary, e.g. 
must spill t1 to make sure we don't kill CR) I haven't checked how the stub are invoked. But I just took a look at aarch64's version of `generate_sha256_implCompress`. Here is what I see. The `ofs` register alias `c_rarg2` and is updated at [1], but it is not saved & restored. Another case is the CR flag register, it is clobberd by the compare at [2], but still not saved & restored. So do the caller-save vector registers used in this stub. [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp#L3798 [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/stubGenerator_aarch64.cpp#L3799 > (I don't see what you mean by vector are saved?, only t2 is push to stack ?) I mean shouldn't we and other platforms like aarch64 also save & restore those caller-save vector registers if it is necessary? Why just `t2`? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16562#issuecomment-1858000854 From duke at openjdk.org Fri Dec 15 15:41:50 2023 From: duke at openjdk.org (Lei Zaakjyu) Date: Fri, 15 Dec 2023 15:41:50 GMT Subject: RFR: 8274051: remove supports_vtime() Message-ID: <35gxEbiDfghFA2QnjBjp1PXw-D6mGyZA-eTT-k_kGgM=.a3f8337e-7142-42bc-81af-91afe8bf517e@github.com> 8274051: remove supports_vtime() ------------- Commit messages: - remove 'supports_vtime' Changes: https://git.openjdk.org/jdk/pull/17125/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17125&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8274051 Stats: 10 lines in 4 files changed: 0 ins; 9 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17125.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17125/head:pull/17125 PR: https://git.openjdk.org/jdk/pull/17125 From kbarrett at openjdk.org Fri Dec 15 15:54:47 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 15 Dec 2023 15:54:47 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 08:08:10 GMT, Julian Waters 
wrote: >> Implementation of [JEP draft: Compile the JDK as C++17](https://bugs.openjdk.org/browse/JDK-8310260) > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-7 > - Revert vm_version_linux_riscv.cpp > - vm_version_linux_riscv.cpp > - allocation.cpp > - 8310260 In conjunction with changing to C++17, I suggest the following changes to the minimum compiler versions, as indicated in make/autoconf/toolchain.m4 old: make/autoconf/toolchain.m4 TOOLCHAIN_MINIMUM_VERSION_clang="3.5" TOOLCHAIN_MINIMUM_VERSION_gcc="6.0" TOOLCHAIN_MINIMUM_VERSION_microsoft="19.28.0.0" # VS2019 16.8, aka MSVC 14.28 TOOLCHAIN_MINIMUM_VERSION_xlc="16.1.0.0011" proposed new: TOOLCHAIN_MINIMUM_VERSION_clang="13.0" TOOLCHAIN_MINIMUM_VERSION_gcc="9.0" TOOLCHAIN_MINIMUM_VERSION_microsoft="19.28.0.0" # VS2019 16.8, aka MSVC 14.28 TOOLCHAIN_MINIMUM_VERSION_xlc="17.1.1.4" Here's the rationale for each of these: ----- gcc: https://gcc.gnu.org/gcc-9/changes.html "The C++17 implementation is no longer experimental." ----- open xl c++ for aix https://www.ibm.com/docs/en/openxl-c-and-cpp-aix/17.1.1?topic=features-supported-language-levels supports C17 and C++17, with experimental support for C++20 17.1.0 docs explicitly says __clang_version__ is 13.0.0, with the other version macros set accordingly. 17.1.1 just describes the version macros, but doesn't say what their values are. But the __VERSION__ macro description includes "Clang 15.0.0" in the string. Note that there is now a 17.1.2 version, but the aix-ppc porters haven't proposed going that far. ----- Visual Studio https://learn.microsoft.com/en-gb/cpp/overview/visual-cpp-language-conformance?view=msvc-170 We already require VS2019 16.8, which covers all of C++17 features listed on that page. 
----- clang https://clang.llvm.org/cxx_status.html c++17 - Clang 5 However, there is a critical bug for which we really want a fix. Using [[noreturn]] seems to be buggy and leads to crashes. This has been seen with clang 12. It appears to be fixed with clang 13.0.0 (Xcode 13.0). There may also be a bug somewhere in the 13.x release series with the handling of noexcept. See discussion in https://bugs.openjdk.org/browse/JDK-8255082. Oracle is currently using Xcode 14.3.1 (clang 14.0.3), so I think we wouldn't object to something between 13.0.0 and 14.0.3 as the minimum version. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1858097523 From kbarrett at openjdk.org Fri Dec 15 16:22:48 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 15 Dec 2023 16:22:48 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 08:08:10 GMT, Julian Waters wrote: >> Implementation of [JEP draft: Compile the JDK as C++17](https://bugs.openjdk.org/browse/JDK-8310260) > > Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: > > - Merge branch 'openjdk:master' into patch-7 > - Revert vm_version_linux_riscv.cpp > - vm_version_linux_riscv.cpp > - allocation.cpp > - 8310260 I agree that before throwing this switch, we need to look at some specific issues that might need to be addressed, discuss the benefits, and also the costs. As was discussed for the change to C++14, there is *never* a good time to start introducing the use of new language features as far as backporting is concerned, unless one is going to backport the language change too. We didn't do that for C++14, and I don't think we are going to (nor should) do it for C++17 either. 
But backporting concerns can't be all powerful, as that will forever prevent potentially significant improvements. I started to make a list of new language features that seem particularly beneficial or otherwise important. I was going to write style guide updates for these, but haven't gotten very far with that yet.

P0035R4: Dynamic memory allocation for over-aligned data
P0135R1: Guaranteed copy elision
P0145R3: Refining Expression Evaluation Order for Idiomatic C++
P0292R2: constexpr if
P0091R3/P0512R0: Template argument deduction for class templates

Here are some others that might be of interest to us.

N4268: Allow constant evaluation for all non-type template arguments
N3928: Extending static_assert
P0188R1: [[fallthrough]] attribute
P0189R1: [[nodiscard]] attribute
P0212R1: [[maybe_unused]] attribute
P0170R1: Wording for constexpr lambda
P0283R2: Ignoring unsupported non-standard attributes
P0061R1: __has_include for C++17
P0386R2: Inline variables

------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1858136247
From mli at openjdk.org Fri Dec 15 17:52:44 2023 From: mli at openjdk.org (Hamlin Li) Date: Fri, 15 Dec 2023 17:52:44 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic Message-ID:
Hi, Can you review this patch to implement the SHA-1 intrinsic for riscv? Thanks!

## Test
### Functionality
tests under `test/hotspot/jtreg/compiler/intrinsics/sha`
tests found via `find test/jdk -iname "*SHA1*.java"`
### Performance
tested on `T-HEAD Light Lichee Pi 4A`
benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`.

**when intrinsic is enabled**
o.o.b.java.security.GetMessageDigest.cloneInstance N/A N/A SHA-1 N/A N/A avgt 10 489.860 ± 6.277 ns/op
o.o.b.java.security.GetMessageDigest.getInstance N/A N/A SHA-1 N/A N/A avgt 10 3477.197 ± 204.203 ns/op
o.o.b.java.security.GetMessageDigest.getInstanceWithProvider N/A N/A SHA-1 N/A N/A avgt 10 4111.164 ± 108.861 ns/op
o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 10 3454.207 ± 53.924 ns/op
o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 10 184063.834 ± 677.635 ns/op
o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 10 8260.011 ± 150.045 ns/op
o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 10 191325.246 ± 3298.882 ns/op
o.o.b.javax.crypto.full.MacBench.mac HmacSHA1 128 N/A N/A avgt 10 8220.886 ± 53.684 ns/op
o.o.b.javax.crypto.full.MacBench.mac HmacSHA1 1024 N/A N/A avgt 10 18006.955 ± 92.432 ns/op
o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 10 11688843.558 ± 34924.678 ns/op

**when intrinsic is disabled**
o.o.b.java.security.GetMessageDigest.cloneInstance N/A N/A SHA-1 N/A N/A avgt 10 496.890 ± 6.695 ns/op
o.o.b.java.security.GetMessageDigest.getInstance N/A N/A SHA-1 N/A N/A avgt 10 3832.145 ± 178.196 ns/op
o.o.b.java.security.GetMessageDigest.getInstanceWithProvider N/A N/A SHA-1 N/A N/A avgt 10 3625.522 ± 170.757 ns/op
o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 10 12026.787 ± 221.032 ns/op
o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 10 1307913.534 ± 5527.527 ns/op
o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 10 17707.156 ± 378.556 ns/op
o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 10 1379660.864 ± 49441.834 ns/op
o.o.b.javax.crypto.full.MacBench.mac HmacSHA1 128 N/A N/A avgt 10 34101.577 ± 116.905 ns/op
o.o.b.javax.crypto.full.MacBench.mac HmacSHA1 1024 N/A N/A avgt 10 107906.128 ± 966.146 ns/op
o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 10 82834313.280 ± 311513.127 ns/op

------------- Commit messages: - Initial commit Changes: https://git.openjdk.org/jdk/pull/17130/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17130&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322179 Stats: 323 lines in 4 files changed: 317 ins; 1 del; 5 mod Patch: https://git.openjdk.org/jdk/pull/17130.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17130/head:pull/17130 PR: https://git.openjdk.org/jdk/pull/17130
From cslucas at openjdk.org Fri Dec 15 18:14:00 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Fri, 15 Dec 2023 18:14:00 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v7] In-Reply-To: References: Message-ID:
> ### Description
>
> Many, if not most, allocation merges (Phis) are nullable because they join object allocations with "NULL", or objects returned from method calls, etc. Please review this Pull Request that improves the Reduce Allocation Merge implementation so that it can reduce at least some of these allocation merges.
>
> Overall, the improvements are related to 1) making rematerialization of merges able to represent "NULL" objects, and 2) being able to reduce merges used by CmpP/N and CastPP.
>
> The approach to reducing CmpP/N and CastPP is pretty similar to that used in the `MemNode::split_through_phi` method: a clone of the node being split is added on each input of the Phi. I make use of `optimize_ptr_compare` and some type information to remove redundant CmpP and CastPP nodes. I added a bunch of ASCII diagrams illustrating what some of the more important methods are doing.
>
> ### Benchmarking
>
> **Note:** In some of these tests no reduction happens. I left them in to validate that no perf. regression happens in that case.
> **Note 2:** Margin of error was negligible.
>
> | Benchmark                            | No RAM (ms/op)   | Yes RAM (ms/op)   |
> |--------------------------------------|------------------|-------------------|
> | TestTrapAfterMerge                   | 19.515           | 13.386            |
> | TestArgEscape                        | 33.165           | 33.254            |
> | TestCallTwoSide                      | 70.547           | 69.427            |
> | TestCmpAfterMerge                    | 16.400           | 2.984             |
> | TestCmpMergeWithNull_Second          | 27.204           | 27.293            |
> | TestCmpMergeWithNull                 | 8.248            | 4.920             |
> | TestCondAfterMergeWithAllocate       | 12.890           | 5.252             |
> | TestCondAfterMergeWithNull           | 6.265            | 5.078             |
> | TestCondLoadAfterMerge               | 12.713           | 5.163             |
> | TestConsecutiveSimpleMerge           | 30.863           | 4.068             |
> | TestDoubleIfElseMerge                | 16.069           | 2.444             |
> | TestEscapeInCallAfterMerge           | 23.111           | 22.924            |
> | TestGlobalEscape                     | 14.459           | 14.425            |
> | TestIfElseInLoop                     | 246.061          | 42.786            |
> | TestLoadAfterLoopAlias               | 45.808           | 45.812            |
> | TestLoadAfterTrap                    | 28.370           | ...

Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision:

  Fix broken build.

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/15825/files
  - new: https://git.openjdk.org/jdk/pull/15825/files/eec5b669..1b088401

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=15825&range=06
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15825&range=05-06

  Stats: 31 lines in 1 file changed: 0 ins; 31 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/15825.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/15825/head:pull/15825

PR: https://git.openjdk.org/jdk/pull/15825

From cslucas at openjdk.org Fri Dec 15 18:21:47 2023
From: cslucas at openjdk.org (Cesar Soares Lucas)
Date: Fri, 15 Dec 2023 18:21:47 GMT
Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6]
In-Reply-To:
References:
Message-ID:

On Fri, 15 Dec 2023 01:47:26 GMT, Ludovic Henry wrote:

>> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains seven commits: >> >> - Merge with origin/master >> - Fix build, copyright dates, m4 files. >> - Fix merge >> - Catch up with master branch. >> >> Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Some inst_mark fixes; Catch up with master. >> - Catch up with changes on master >> - Reuse same C2_MacroAssembler object to emit instructions. > > I've verified it works on riscv64, passing hotspot tier1 and tier2 tests. Thank you for testing @luhenry . Which OS did you run the tests? @offamitkumar, @TheRealMDoerr - can you please re-run the tests on the platforms convenient for you? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-1858306441 From luhenry at openjdk.org Fri Dec 15 18:33:43 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Fri, 15 Dec 2023 18:33:43 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 18:19:02 GMT, Cesar Soares Lucas wrote: > Which OS did you run the tests? I ran it on Linux. Ubuntu 22.04 to be specific. Right now, riscv64 is only supported on Linux (no macOS or Windows for RISC-V in sight). 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-1858319485 From cjplummer at openjdk.org Sat Dec 16 01:20:41 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Sat, 16 Dec 2023 01:20:41 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC [v2] In-Reply-To: <11q5GvzTPts9R6r7B1-KNh0me5AJwnSuqPP-J-LTuRc=.66380e47-6f34-4803-b522-9b22395466cb@github.com> References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> <11q5GvzTPts9R6r7B1-KNh0me5AJwnSuqPP-J-LTuRc=.66380e47-6f34-4803-b522-9b22395466cb@github.com> Message-ID: <8LPvhH3PtUVqRGTTOj05yDoPMATSyJLTL_h6tkk_lXs=.5f4e9703-9126-458b-ac10-f0b61984304b@github.com> On Wed, 6 Dec 2023 05:50:02 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this patch? >> >> In the current implementation, HeapDumpBeforeFullGC/AfterFullGC will generate dumps for every FGC, increasing the risk of disk full. >> >> This patch introduces a new option 'FullGCHeapDumpLimit' to limit the number of dumps triggered by HeapDumpBeforeFullGC/AfterFullGC to enhance production-friendliness. >> >> Best, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > change the location of test src/hotspot/share/runtime/globals.hpp line 557: > 555: "Limit the number of heap dumps triggered by " \ > 556: "HeapDumpBeforeFullGC or HeapDumpAfterFullGC " \ > 557: "(0 means no limit)" ) \ nit: should be no space before the last paren. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16976#discussion_r1428625595 From cjplummer at openjdk.org Sat Dec 16 01:20:42 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Sat, 16 Dec 2023 01:20:42 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC [v2] In-Reply-To: <8LPvhH3PtUVqRGTTOj05yDoPMATSyJLTL_h6tkk_lXs=.5f4e9703-9126-458b-ac10-f0b61984304b@github.com> References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> <11q5GvzTPts9R6r7B1-KNh0me5AJwnSuqPP-J-LTuRc=.66380e47-6f34-4803-b522-9b22395466cb@github.com> <8LPvhH3PtUVqRGTTOj05yDoPMATSyJLTL_h6tkk_lXs=.5f4e9703-9126-458b-ac10-f0b61984304b@github.com> Message-ID: On Sat, 16 Dec 2023 01:09:16 GMT, Chris Plummer wrote: >> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: >> >> change the location of test > > src/hotspot/share/runtime/globals.hpp line 557: > >> 555: "Limit the number of heap dumps triggered by " \ >> 556: "HeapDumpBeforeFullGC or HeapDumpAfterFullGC " \ >> 557: "(0 means no limit)" ) \ > > nit: should be no space before the last paren. I was wondering if it is worth having HeapDumpBeforeFullGC and HeapDumpAfterFullGC also mention FullGCHeapDumpLimit. Just something simple like "Also see FullGCHeapDumpLimit." Not necessary, but if you think it would be useful then please add. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16976#discussion_r1428626556 From duke at openjdk.org Sat Dec 16 01:42:48 2023 From: duke at openjdk.org (Steven Schlansker) Date: Sat, 16 Dec 2023 01:42:48 GMT Subject: Integrated: 8321892: Typo in log message logged by src/hotspot/share/nmt/virtualMemoryTracker.cpp In-Reply-To: References: Message-ID: On Thu, 7 Dec 2023 18:22:32 GMT, Steven Schlansker wrote: > Discovered while deep in an InternalError debugging session... This pull request has now been integrated. 
Changeset: 34351b7a Author: Steven Schlansker Committer: Jaikiran Pai URL: https://git.openjdk.org/jdk/commit/34351b7a7950a3b563748f40f2619374f62f9b16 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8321892: Typo in log message logged by src/hotspot/share/nmt/virtualMemoryTracker.cpp Reviewed-by: dholmes, azafari ------------- PR: https://git.openjdk.org/jdk/pull/17021 From dholmes at openjdk.org Sat Dec 16 03:54:39 2023 From: dholmes at openjdk.org (David Holmes) Date: Sat, 16 Dec 2023 03:54:39 GMT Subject: RFR: 8274051: remove supports_vtime() In-Reply-To: <35gxEbiDfghFA2QnjBjp1PXw-D6mGyZA-eTT-k_kGgM=.a3f8337e-7142-42bc-81af-91afe8bf517e@github.com> References: <35gxEbiDfghFA2QnjBjp1PXw-D6mGyZA-eTT-k_kGgM=.a3f8337e-7142-42bc-81af-91afe8bf517e@github.com> Message-ID: On Fri, 15 Dec 2023 15:34:50 GMT, Lei Zaakjyu wrote: > 8274051: remove supports_vtime() Hi @LizBing , This issue was already assigned to @fbredber - did you discuss taking it over with them? src/hotspot/share/runtime/os.hpp line 284: > 282: // this functionality for the current thread, and if so the second > 283: // returns the elapsed virtual time for the current thread. > 284: static bool supports_vtime(); You need to update the comment appropriately too. ------------- PR Review: https://git.openjdk.org/jdk/pull/17125#pullrequestreview-1785117562 PR Review Comment: https://git.openjdk.org/jdk/pull/17125#discussion_r1428669299 From duke at openjdk.org Sat Dec 16 04:07:41 2023 From: duke at openjdk.org (Lei Zaakjyu) Date: Sat, 16 Dec 2023 04:07:41 GMT Subject: RFR: 8274051: remove supports_vtime() In-Reply-To: References: <35gxEbiDfghFA2QnjBjp1PXw-D6mGyZA-eTT-k_kGgM=.a3f8337e-7142-42bc-81af-91afe8bf517e@github.com> Message-ID: On Sat, 16 Dec 2023 03:51:58 GMT, David Holmes wrote: > Hi @LizBing , > > This issue was already assigned to @fbredber - did you discuss taking it over with them? I didn't. However, I saw this issue hadn't been solved for a long time, so I tried handling it. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17125#issuecomment-1858706752 From duke at openjdk.org Sat Dec 16 04:16:00 2023 From: duke at openjdk.org (Lei Zaakjyu) Date: Sat, 16 Dec 2023 04:16:00 GMT Subject: RFR: 8274051: remove supports_vtime() [v2] In-Reply-To: <35gxEbiDfghFA2QnjBjp1PXw-D6mGyZA-eTT-k_kGgM=.a3f8337e-7142-42bc-81af-91afe8bf517e@github.com> References: <35gxEbiDfghFA2QnjBjp1PXw-D6mGyZA-eTT-k_kGgM=.a3f8337e-7142-42bc-81af-91afe8bf517e@github.com> Message-ID: > 8274051: remove supports_vtime() Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: modify the comment ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17125/files - new: https://git.openjdk.org/jdk/pull/17125/files/c0d89b4e..bb6f0cba Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17125&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17125&range=00-01 Stats: 4 lines in 1 file changed: 0 ins; 2 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17125.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17125/head:pull/17125 PR: https://git.openjdk.org/jdk/pull/17125 From ddong at openjdk.org Sat Dec 16 04:36:57 2023 From: ddong at openjdk.org (Denghui Dong) Date: Sat, 16 Dec 2023 04:36:57 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC [v3] In-Reply-To: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> Message-ID: <8GGPQMjfU6YWa1i0yjk7SvrJ-lnZu6TxG8zPcbWN3jE=.1a4bb16e-dfc6-46ed-84e1-f2ed3d911699@github.com> > Hi, > > Could I have a review of this patch? > > In the current implementation, HeapDumpBeforeFullGC/AfterFullGC will generate dumps for every FGC, increasing the risk of disk full. 
> > This patch introduces a new option 'FullGCHeapDumpLimit' to limit the number of dumps triggered by HeapDumpBeforeFullGC/AfterFullGC to enhance production-friendliness. > > Best, > Denghui Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: refine description ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16976/files - new: https://git.openjdk.org/jdk/pull/16976/files/442b7f47..a0cae5c2 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16976&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16976&range=01-02 Stats: 5 lines in 1 file changed: 2 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16976.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16976/head:pull/16976 PR: https://git.openjdk.org/jdk/pull/16976 From ddong at openjdk.org Sat Dec 16 04:36:58 2023 From: ddong at openjdk.org (Denghui Dong) Date: Sat, 16 Dec 2023 04:36:58 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC [v2] In-Reply-To: References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> <11q5GvzTPts9R6r7B1-KNh0me5AJwnSuqPP-J-LTuRc=.66380e47-6f34-4803-b522-9b22395466cb@github.com> <8LPvhH3PtUVqRGTTOj05yDoPMATSyJLTL_h6tkk_lXs=.5f4e9703-9126-458b-ac10-f0b61984304b@github.com> Message-ID: On Sat, 16 Dec 2023 01:11:44 GMT, Chris Plummer wrote: >> src/hotspot/share/runtime/globals.hpp line 557: >> >>> 555: "Limit the number of heap dumps triggered by " \ >>> 556: "HeapDumpBeforeFullGC or HeapDumpAfterFullGC " \ >>> 557: "(0 means no limit)" ) \ >> >> nit: should be no space before the last paren. > > I was wondering if it is worth having HeapDumpBeforeFullGC and HeapDumpAfterFullGC also mention FullGCHeapDumpLimit. Just something simple like "Also see FullGCHeapDumpLimit." Not necessary, but if you think it would be useful then please add. > nit: should be no space before the last paren. 
fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16976#discussion_r1428685031 From ddong at openjdk.org Sat Dec 16 04:36:59 2023 From: ddong at openjdk.org (Denghui Dong) Date: Sat, 16 Dec 2023 04:36:59 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC [v2] In-Reply-To: References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> <11q5GvzTPts9R6r7B1-KNh0me5AJwnSuqPP-J-LTuRc=.66380e47-6f34-4803-b522-9b22395466cb@github.com> <8LPvhH3PtUVqRGTTOj05yDoPMATSyJLTL_h6tkk_lXs=.5f4e9703-9126-458b-ac10-f0b61984304b@github.com> Message-ID: On Sat, 16 Dec 2023 04:33:48 GMT, Denghui Dong wrote: >> I was wondering if it is worth having HeapDumpBeforeFullGC and HeapDumpAfterFullGC also mention FullGCHeapDumpLimit. Just something simple like "Also see FullGCHeapDumpLimit." Not necessary, but if you think it would be useful then please add. > >> nit: should be no space before the last paren. > > fixed. > I was wondering if it is worth having HeapDumpBeforeFullGC and HeapDumpAfterFullGC also mention FullGCHeapDumpLimit. Just something simple like "Also see FullGCHeapDumpLimit." Not necessary, but if you think it would be useful then please add. Make sense. Added. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16976#discussion_r1428685126 From amitkumar at openjdk.org Sat Dec 16 05:05:43 2023 From: amitkumar at openjdk.org (Amit Kumar) Date: Sat, 16 Dec 2023 05:05:43 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission In-Reply-To: References: Message-ID: On Wed, 8 Nov 2023 04:34:03 GMT, Amit Kumar wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. 
However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. >> >> # Testing status >> >> ## tier1 >> >> | | Win | Mac | Linux | >> |----------|---------|---------|---------| >> | ARM64 | | ? | | >> | ARM32 | | | | >> | x86 | | | ? | >> | x64 | | | ? | >> | PPC64 | | | | >> | S390x | | | | >> | RiscV | n/a | n/a | ? | > > `s390x` also run into assert failure: `assert(masm->inst_mark() == nullptr) failed: should be.` > > > V [libjvm.so+0xfb0938] PhaseOutput::fill_buffer(C2_MacroAssembler*, unsigned int*)+0x2370 (output.cpp:1812) > V [libjvm.so+0xfb21ce] PhaseOutput::Output()+0xcae (output.cpp:362) > V [libjvm.so+0x6a90a8] Compile::Code_Gen()+0x460 (compile.cpp:2989) > V [libjvm.so+0x6ad848] Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x1738 (compile.cpp:887) > V [libjvm.so+0x4fb932] C2Compiler::compile_method(ciEnv*, ciMethod*, int, bool, DirectiveSet*)+0x14a (c2compiler.cpp:119) > V [libjvm.so+0x6b81a2] CompileBroker::invoke_compiler_on_method(CompileTask*)+0xd9a (compileBroker.cpp:2282) > V [libjvm.so+0x6b8eaa] CompileBroker::compiler_thread_loop()+0x5a2 (compileBroker.cpp:1943) >@offamitkumar, @TheRealMDoerr - can you please re-run the tests on the platforms convenient for you? I run build for fastdebug & release VMs and tier1 test for fastdebug VM. Everything seems good. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-1858721643 From aph at openjdk.org Sat Dec 16 12:46:42 2023 From: aph at openjdk.org (Andrew Haley) Date: Sat, 16 Dec 2023 12:46:42 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 19:44:09 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. >> >> # Testing status >> >> ## tier1 >> >> | | Win | Mac | Linux | >> |----------|---------|---------|---------| >> | ARM64 | | ? | | >> | ARM32 | | | | >> | x86 | | | ? | >> | x64 | | | ? | >> | PPC64 | | | | >> | S390x | | | | >> | RiscV | n/a | n/a | ? | > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Merge with origin/master > - Fix build, copyright dates, m4 files. > - Fix merge > - Catch up with master branch. > > Merge remote-tracking branch 'origin/master' into reuse-macroasm > - Some inst_mark fixes; Catch up with master. > - Catch up with changes on master > - Reuse same C2_MacroAssembler object to emit instructions. It seems odd to me that this substantial and complex patch lacks any justification. As far as I can tell, the decision to make class MacroAssembler very lightweight so that new instances could be created as needed was deliberate. Why change now? Is it performance, or something else? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-1858810100 From aph at openjdk.org Sat Dec 16 12:52:37 2023 From: aph at openjdk.org (Andrew Haley) Date: Sat, 16 Dec 2023 12:52:37 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: <8IhchE1DjBZ_tCNQkxxpshcR3ntMHBJqUYaBHYwB2PQ=.e99c550d-40db-4537-97e9-95211326a860@github.com> On Wed, 6 Dec 2023 07:25:49 GMT, Matthias Baesken wrote: > > > /label jfr > > > I'm not sure I understand the issue, but adding a field to an event because of a GCC bug seems excessive. > > > > > > It's a nasty hard-to-find bug that breaks Java compatibility. Some people have wondered if this is a real-world problem, and the answer is that it's happening, right now, in Oracle's CI testing. > > Interesting, do you have some details about the 'Oracle CI testing' occurrence ? If so, what lib caused it ? It was here: https://bugs.openjdk.org/browse/JDK-8320151 > Do you think it would be beneficial to have it in the JFR for this particular case (maybe as a separate event if this is prefered over the current suggestion) ? Yes. I think it would be strange to have such a nasty bug and have JFR not record the fact. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1858811084 From aph at openjdk.org Sat Dec 16 13:04:38 2023 From: aph at openjdk.org (Andrew Haley) Date: Sat, 16 Dec 2023 13:04:38 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: References: <7D2C1yM2AS-G86oZF15MzMSrO0-pFMwXBVtVgzQ-_-Q=.0a5e2f6b-9aaa-483a-97c6-d75fee5f197d@github.com> <-sioWQwLXv1R5crwmt49RTy4FLTqLaHTqDxPpEIlPQY=.73c20e2d-fdcd-49a6-9851-c8198b20efc9@github.com> Message-ID: On Fri, 15 Dec 2023 13:05:32 GMT, Julian Waters wrote: >> The keyword also happens to go in the same location as well. How coincidental... 
> > I also realized that this uses a gcc statement expression currently, I wonder if this could use a lambda expression instead in another change? That's the standard idiom for accessing CSR, as used in libc and elsewhere. I think that avoiding divergence with such other sources is probably more important than some notion of standard purity. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14988#discussion_r1428804231 From rehn at openjdk.org Sat Dec 16 14:56:43 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Sat, 16 Dec 2023 14:56:43 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v4] In-Reply-To: References: Message-ID: <3nrDvWwhOl64vqG8kWOeWmEaHMFB3abs4x_6nqE1esg=.50bfb65f-c855-4e1c-8c5f-f0274edb0113@github.com> On Fri, 15 Dec 2023 14:43:58 GMT, Ludovic Henry wrote: > > This stubroutine should be inlined via "LibraryCallKit::try_to_inline", meaning there is no call here. > > It is not inlined because it's a stub. If it were to be inlined (which it shouldn't given how big it is), it should be declared in `macroAssembler_riscv.*` with the corresponding C2 Node. C2 asks if it should inline the method call to: "private void implCompress0(byte[] buf, int ofs)". As we don't want to call that method we say it is now inlined, by adding a runtime call to a piece of out-of-line code. For all purposes C2 needs to know this method is now part of this graph. That is what inlined means in this context, no Java call to that method is generated. 
-------------

PR Comment: https://git.openjdk.org/jdk/pull/16562#issuecomment-1858836815

From lmesnik at openjdk.org Sun Dec 17 20:30:39 2023
From: lmesnik at openjdk.org (Leonid Mesnik)
Date: Sun, 17 Dec 2023 20:30:39 GMT
Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v7]
In-Reply-To: <4iSULgKTef_C2q4AJpEKB64tZh_QDIB77Ov2rwZ78nY=.39d3c708-4175-42b5-8eb9-58684e131ccf@github.com>
References: <4iSULgKTef_C2q4AJpEKB64tZh_QDIB77Ov2rwZ78nY=.39d3c708-4175-42b5-8eb9-58684e131ccf@github.com>
Message-ID:

On Fri, 15 Dec 2023 10:49:56 GMT, Serguei Spitsyn wrote:

>> This fix is for JDK 23, but the intention is to backport it to 22 in the RDP-1 time frame.
>> It fixes a deadlock between `VirtualThread` class critical sections guarded by the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and the JVMTI `Suspend/Resume` mechanisms.
>> The deadlocking scenario is well described by Patricio in a bug report comment.
>> In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections.
>>
>> The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about the beginning and end of the critical section.
>> This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section.
>>
>> Some of the new notifications made with the `notifyJvmtiSync()` method are on a performance-critical path, which is why this method has been intrinsified.
>>
>> A new test was developed by Patricio:
>> `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock`
>> The test is very nice, as without the fix it reliably reproduces the deadlock 100% of the time.
>> The test never fails with this fix.
>> >> Testing: >> - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: improve an assert message Marked as reviewed by lmesnik (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17011#pullrequestreview-1785514646 From dholmes at openjdk.org Sun Dec 17 22:17:38 2023 From: dholmes at openjdk.org (David Holmes) Date: Sun, 17 Dec 2023 22:17:38 GMT Subject: RFR: 8274051: remove supports_vtime() [v2] In-Reply-To: References: <35gxEbiDfghFA2QnjBjp1PXw-D6mGyZA-eTT-k_kGgM=.a3f8337e-7142-42bc-81af-91afe8bf517e@github.com> Message-ID: <9KfWaTB4OAe-EMcvy68zl5kzahdNjDIunBY-jnDJ5Iw=.c5400d8c-b317-4d8c-a853-457a2f77dd6b@github.com> On Sat, 16 Dec 2023 04:16:00 GMT, Lei Zaakjyu wrote: >> 8274051: remove supports_vtime() > > Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: > > modify the comment Marked as reviewed by dholmes (Reviewer). Reminder: hotspot changes require 2 reviews ------------- PR Review: https://git.openjdk.org/jdk/pull/17125#pullrequestreview-1785528538 PR Comment: https://git.openjdk.org/jdk/pull/17125#issuecomment-1859301283 From dholmes at openjdk.org Mon Dec 18 00:38:39 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 18 Dec 2023 00:38:39 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC [v3] In-Reply-To: <8GGPQMjfU6YWa1i0yjk7SvrJ-lnZu6TxG8zPcbWN3jE=.1a4bb16e-dfc6-46ed-84e1-f2ed3d911699@github.com> References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> <8GGPQMjfU6YWa1i0yjk7SvrJ-lnZu6TxG8zPcbWN3jE=.1a4bb16e-dfc6-46ed-84e1-f2ed3d911699@github.com> Message-ID: On Sat, 16 Dec 2023 04:36:57 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this patch? 
>>
>> In the current implementation, HeapDumpBeforeFullGC/AfterFullGC will generate dumps for every FGC, increasing the risk of disk full.
>>
>> This patch introduces a new option 'FullGCHeapDumpLimit' to limit the number of dumps triggered by HeapDumpBeforeFullGC/AfterFullGC to enhance production-friendliness.
>>
>> Best,
>> Denghui
>
> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision:
>
>   refine description

Update looks good.

-------------

Marked as reviewed by dholmes (Reviewer).

PR Review: https://git.openjdk.org/jdk/pull/16976#pullrequestreview-1785576985

From fyang at openjdk.org Mon Dec 18 06:30:45 2023
From: fyang at openjdk.org (Fei Yang)
Date: Mon, 18 Dec 2023 06:30:45 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v7]
In-Reply-To:
References:
Message-ID:

On Fri, 8 Dec 2023 22:46:45 GMT, Olga Mikhaltsova wrote:

>> Please review this implementation of the roundD/roundF intrinsics for the RISC-V platform.
>>
>> The table below shows that a NaN argument has to be handled as a special case.
>>
>>                                          RISC-V                    Java
>>                                          (FCVT.W.S)  (FCVT.L.D)    (long round(double a))  (int round(float a))
>> Minimum valid input (after rounding)     -2^31       -2^63         Long.MIN_VALUE          Integer.MIN_VALUE
>> Maximum valid input (after rounding)     2^31 - 1    2^63 - 1      Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for out-of-range negative input   -2^31       -2^63         Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for -∞                            -2^31       -2^63         Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for out-of-range positive input   2^31 - 1    2^63 - 1      Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for +∞                            2^31 - 1    2^63 - 1      Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for NaN                           2^31 - 1    2^63 - 1      0                       0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score     Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555 ±   0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760 ±   0.103  ops/ms
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score     Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956 ±   0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947 ±   0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
>   Optimization against regression on SiFive

So I tried your test on both Unmatched and Licheepi-4A boards. I see that the C2 JIT code is exercised and that the test passes over the full 32-bit range. Several minor comments though.

src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4264:

> 4262: void MacroAssembler::java_round_float(Register dst, FloatRegister src, FloatRegister ftmp) {
> 4263:   Label done;
> 4264:   mv(dst, zr);

I see a slight improvement on both platforms when moving `mv` between `feq_s`/`feq_d` and `beqz`.
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4265: > 4263: Label done; > 4264: mv(dst, zr); > 4265: li(t0, 0x3f000000); Suggestion: `mv(t0, jint_cast(0.5f));` src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4282: > 4280: Label done; > 4281: mv(dst, zr); > 4282: li(t0, 0x3fe0000000000000); Suggestion: `mv(t0, julong_cast(0.5));` src/hotspot/cpu/riscv/riscv.ad line 8367: > 8365: %} > 8366: > 8367: ins_pipe(pipe_class_default); Suggestion: `ins_pipe(pipe_slow);` src/hotspot/cpu/riscv/riscv.ad line 8381: > 8379: %} > 8380: > 8381: ins_pipe(pipe_class_default); Suggestion: `ins_pipe(pipe_slow);` ------------- PR Review: https://git.openjdk.org/jdk/pull/16382#pullrequestreview-1785876685 PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1429517638 PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1429518186 PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1429518536 PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1429546156 PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1429546238 From epeter at openjdk.org Mon Dec 18 07:20:05 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 18 Dec 2023 07:20:05 GMT Subject: RFR: 8319115: GrowableArray: Do not initialize up to capacity Message-ID: Before this patch, we always initialized the GrowableArray up to its `capacity`, and not just up to `length`. This is problematic for a few reasons: - It is not expected. `std::vector` also only initializes the elements up to its size, and not to capacity. - It requires a default-constructor for the element type. And the default-constructor is then used to fill in the elements between length and capacity. If the elements do any allocation themselves, then this is a waste of resources. - The implementation also required the copy-assignment-operator for the element type. This is a lesser restriction. 
But the copy-assignment-operator was used in cases like `append` (where placement copy-construct would be expected), and not just in true assignment kinds of cases like `at_put`. For this reason, I reworked a lot of the methods to ensure that only the "slots" up to `length` are ever initialized, and the space between `length` and `capacity` is always garbage. ----- Also, before this patch, one can CHeap allocate both with `GrowableArray` and `GrowableArrayCHeap`. This is unnecessary. It required more complex verification in `GrowableArray` to deal with all cases. And `GrowableArrayCHeap` is already explicitly a smaller object, and should hence be preferred. Hence I changed all CHeap allocating cases of `GrowableArray` to `GrowableArrayCHeap`. This also allows for a clear separation: - `GrowableArray` only deals with arena / resource area allocation. These are arrays that are regularly abandoned at the end of their use, rather than deleted or even cleared. - `GrowableArrayCHeap` only deals with CHeap allocated memory. We expect that the destructor for it is called eventually, either when it goes out of scope or when `delete` is explicitly called. We expect that the elements could be allocating resources internally, and hence rely on the destructors for the elements being called, which may free up those internally allocated resources. Therefore, we now only allow `GrowableArrayCHeap` to have element types with non-trivial destructors, but `GrowableArray` checks that element types do not have non-trivial destructors (since it is common practice to just abandon arena / resource area allocated arrays, rather than calling the destructor or clearing the array, which also destructs all elements). This more clearly separates the two worlds: clean-up your own mess (CHeap) vs abandon your mess (arena / resource area). 
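As a minimal sketch of the "initialize only up to `length`" idea described above, assuming a simplified C-heap-backed array rather than the actual HotSpot classes: `SketchArray`, `Counted` and `live_after_appends` are hypothetical names introduced here for illustration only. Slots are created with placement new on `append`, growing "moves" elements by copy-constructing into the new buffer and destroying the old ones, and the space between `length` and `capacity` is never constructed. The element type needs neither a default constructor nor a copy-assignment operator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical element type that counts constructions and destructions,
// so that live = constructed - destructed can be checked.
struct Counted {
  static int constructed;
  static int destructed;
  int value;
  explicit Counted(int v) : value(v) { ++constructed; }  // no default constructor
  Counted(const Counted& other) : value(other.value) { ++constructed; }
  Counted& operator=(const Counted&) = delete;           // no copy assignment
  ~Counted() { ++destructed; }
  static int live() { return constructed - destructed; }
};
int Counted::constructed = 0;
int Counted::destructed = 0;

// Hypothetical stand-in for a C-heap-backed growable array (not HotSpot code).
template <typename E>
class SketchArray {
  E* _data = nullptr;
  std::size_t _length = 0;
  std::size_t _capacity = 0;

  void grow(std::size_t new_capacity) {
    E* new_data = static_cast<E*>(std::malloc(new_capacity * sizeof(E)));
    if (new_data == nullptr) std::abort();
    for (std::size_t i = 0; i < _length; i++) {
      // Simulated move: copy-construct into the new slot, then destroy the old one.
      ::new (static_cast<void*>(&new_data[i])) E(_data[i]);
      _data[i].~E();
    }
    std::free(_data);
    _data = new_data;
    _capacity = new_capacity;
  }

 public:
  SketchArray() = default;
  SketchArray(const SketchArray&) = delete;
  SketchArray& operator=(const SketchArray&) = delete;
  ~SketchArray() {
    // Only slots below _length were ever constructed.
    for (std::size_t i = 0; i < _length; i++) _data[i].~E();
    std::free(_data);
  }
  void append(const E& e) {
    if (_length == _capacity) grow(_capacity == 0 ? 2 : _capacity * 2);
    ::new (static_cast<void*>(&_data[_length])) E(e);  // placement copy-construct
    _length++;
  }
  std::size_t length() const { return _length; }
};

// Appends n elements and reports how many Counted objects are alive
// while the array still exists.
int live_after_appends(int n) {
  SketchArray<Counted> a;
  for (int i = 0; i < n; i++) {
    a.append(Counted(i));
  }
  assert(a.length() == static_cast<std::size_t>(n));
  return Counted::live();
}
```

Calling `live_after_appends(5)` should report exactly five live elements even though the buffer has grown past five slots, and the count should drop back to zero once the array is destroyed.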
----- I also completely refactored and improved the tests for `GrowableArray(CHeap)`: https://github.com/openjdk/jdk/blob/e5eb36010355b444a719da6bdcd8c5de3145b961/test/hotspot/gtest/utilities/test_growableArray.cpp#L29-L60 The main improvement is that now **all** `GrowableArray` methods are tested, and that we test them with many different element types (including types without a default constructor or copy-assignment operator). We also check that the number of live elements is correct, which we can compute as `live = constructed - destructed`. This is especially valuable because I heavily refactored the use of constructors/destructors in order to switch from initializing up to `capacity` to initializing only up to `length`. ---- **Note on move-semantics** Since move semantics is currently not allowed by the style guide, we have to "simulate" a move by placement new with the copy-constructor, and then destruct the old element. See this example, where we need to grow an array and move the elements from the old data to the new data: https://github.com/openjdk/jdk/blob/e5eb36010355b444a719da6bdcd8c5de3145b961/src/hotspot/share/utilities/growableArray.hpp#L530-L563 Of course this is nothing new with my change here. I just want to record that we are doing it this way, and in fact have to do so without any move-semantics. The problem with this: If you use nested `GrowableArray<GrowableArray<E>>`, then the inner arrays get copy-constructed when we re-allocate the outer array. We now have two choices for how `GrowableArray` could copy (currently it is a shallow-copy): - shallow-copy: works well for reallocating outer arrays, since the inner array is now just shallow-copied to the new data, and the destructor for the old inner arrays does nothing. Shallow-copy of course would not work for `GrowableArrayCHeap`, since it would deallocate the data of the old inner arrays, and the new inner array would still have a pointer to that deallocated data (bad!).
But shallow-copy is generally dangerous, since the copy-constructor may be used in non-obvious cases:

    ResourceMark rm;
    GrowableArray<GrowableArray<int>> outer;
    outer.at_grow(100); // default argument calls default constructor, and (shallow) copy-constructs it to all elements
    outer.at(0).at_put_grow(0, 42);
    outer.at(1).at_put_grow(0, 666); // inner array at position 1 has reference to same data as inner array 0
    ASSERT_EQ(outer.at(0).at(0), 42); // fails, we see 666 instead of 42
    ASSERT_EQ(outer.at(1).at(0), 666);

- deep-copy: This ensures correctness, since we never have two arrays with the same underlying data. But that also means that when we re-allocate an outer array, we now (deep) copy-construct all new elements from the old elements. And that seems quite wasteful, both for the memory and the time needed to deep-copy everything over.

Neither of these options is good. This is exactly why move-semantics were introduced in `C++11`. We should therefore discuss the introduction of move-semantics, and weigh it against the additional complexity that it introduces.

-----

Testing: tier1-3 and stress testing. Running.

------------- Commit messages: - whitespaces - fix issue with clang, need explicit copy constructor if assignment operator deleted - Merge branch 'master' into JDK-8319115 - fix small comment - fix some comments - test shallow assign / copy - 2 negative tests: nested ra, insert_before to itself - refactor and test swap - find_sorted and insert_sorted - test 2 versions of sort - ...
and 44 more: https://git.openjdk.org/jdk/compare/b31454e3...1ed037fd Changes: https://git.openjdk.org/jdk/pull/16918/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16918&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8319115 Stats: 3439 lines in 127 files changed: 2150 ins; 493 del; 796 mod Patch: https://git.openjdk.org/jdk/pull/16918.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16918/head:pull/16918 PR: https://git.openjdk.org/jdk/pull/16918 From dholmes at openjdk.org Mon Dec 18 07:51:43 2023 From: dholmes at openjdk.org (David Holmes) Date: Mon, 18 Dec 2023 07:51:43 GMT Subject: RFR: 8319115: GrowableArray: Do not initialize up to capacity In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 07:56:04 GMT, Emanuel Peter wrote: > Before this patch, we always initialized the GrowableArray up to its `capacity`, and not just up to `length`. This is problematic for a few reasons: > > - It is not expected. `std::vector` also only initializes the elements up to its size, and not to capacity. > - It requires a default-constructor for the element type. And the default-constructor is then used to fill in the elements between length and capacity. If the elements do any allocation themselves, then this is a waste of resources. > - The implementation also required the copy-assignment-operator for the element type. This is a lesser restriction. But the copy-assignment-operator was used in cases like `append` (where placement copy-construct would be expected), and not just in true assignment kinds of cases like `at_put`. > > For this reason, I reworked a lot of the methods to ensure that only the "slots" up to `length` are ever initialized, and the space between `length` and `capacity` is always garbage. > > ----- > > Also, before this patch, one can CHeap allocate both with `GrowableArray` and `GrowableArrayCHeap`. This is unnecessary. It required more complex verification in `GrowableArray` to deal with all cases. 
And `GrowableArrayCHeap` is already explicitly a smaller object, and should hence be preferred. Hence I changed all CHeap allocating cases of `GrowableArray` to `GrowableArrayCHeap`. This also allows for a clear separation: > - `GrowableArray` only deals with arena / resource area allocation. These are arrays that are regularly abandoned at the end of their use, rather than deleted or even cleared. > - `GrowableArrayCHeap` only deals with CHeap allocated memory. We expect that the destructor for it is called eventually, either when it goes out of scope or when `delete` is explicitly called. We expect that the elements could be allocating resources internally, and hence rely on the destructors for the elements being called, which may free up those internally allocated resources. > > Therefore, we now only allow `GrowableArrayCHeap` to have element types with non-trivial destructors, but `GrowableArray` checks that element types do not have non-trivial destructors (since it is common practice to just abandon arena / resource area allocated arrays, rather than calling the destructor or clearing the array, which also destructs all elements). This more clearly separates the two worlds: clean-up your own mess (CHeap) vs abandon your mess (arena / resource area). > > ----- > > I also completely refactored and improved ... @eme64 Is it feasible to split this up to solve each of the problems you identify in stages? There is also overlap here with JDK-8319709 IIUC. Thanks. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16918#issuecomment-1859711325 From stuefe at openjdk.org Mon Dec 18 09:09:40 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 18 Dec 2023 09:09:40 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 09:05:22 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. >> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust macOS coding I vote for doing both; the checkjni thing can be done in a separate RFE. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1859852901 From epeter at openjdk.org Mon Dec 18 09:14:45 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Mon, 18 Dec 2023 09:14:45 GMT Subject: RFR: 8319115: GrowableArray: Do not initialize up to capacity In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 07:49:04 GMT, David Holmes wrote: >> Before this patch, we always initialized the GrowableArray up to its `capacity`, and not just up to `length`. This is problematic for a few reasons: >> >> - It is not expected. `std::vector` also only initializes the elements up to its size, and not to capacity. >> - It requires a default-constructor for the element type. 
And the default-constructor is then used to fill in the elements between length and capacity. If the elements do any allocation themselves, then this is a waste of resources. >> - The implementation also required the copy-assignment-operator for the element type. This is a lesser restriction. But the copy-assignment-operator was used in cases like `append` (where placement copy-construct would be expected), and not just in true assignment kinds of cases like `at_put`. >> >> For this reason, I reworked a lot of the methods to ensure that only the "slots" up to `length` are ever initialized, and the space between `length` and `capacity` is always garbage. >> >> ----- >> >> Also, before this patch, one can CHeap allocate both with `GrowableArray` and `GrowableArrayCHeap`. This is unnecessary. It required more complex verification in `GrowableArray` to deal with all cases. And `GrowableArrayCHeap` is already explicitly a smaller object, and should hence be preferred. Hence I changed all CHeap allocating cases of `GrowableArray` to `GrowableArrayCHeap`. This also allows for a clear separation: >> - `GrowableArray` only deals with arena / resource area allocation. These are arrays that are regularly abandoned at the end of their use, rather than deleted or even cleared. >> - `GrowableArrayCHeap` only deals with CHeap allocated memory. We expect that the destructor for it is called eventually, either when it goes out of scope or when `delete` is explicitly called. We expect that the elements could be allocating resources internally, and hence rely on the destructors for the elements being called, which may free up those internally allocated resources. 
>> >> Therefore, we now only allow `GrowableArrayCHeap` to have element types with non-trivial destructors, but `GrowableArray` checks that element types do not have non-trivial destructors (since it is common practice to just abandon arena / resource area allocated arrays, rather than calling the destructor or clearing the array, which also destructs all elements). This more clearly separates the two worlds: clean-up your own mess (CHeap) vs abandon your mess (arena / resource area). >> >> ----- >> >> I al... > > @eme64 Is it feasible to split this up to solve each of the problems you identify in stages? There is also overlap here with JDK-8319709 IIUC. Thanks. @dholmes-ora These are the "parts": 1. initialize up to capacity vs length 2. update the test to verify this (complete refactoring) 3. remove CHeap use of GrowableArray -> use GrowableArrayCHeap instead The first 2 items are inseparable: I cannot make substantial changes to many GrowableArray methods without there even being tests for them. And the tests would not pass before the changes for item 1, since the tests also verify which elements of the array are initialized. So adding the tests first would not be very feasible. The 3rd item could maybe be split off and done before the rest. But it would also require lots of changes to the test, which I would then have to completely refactor with items 1+2 anyway. And the items are related conceptually, which is why I felt OK pushing them together. It is all about when (item 1) and what kinds of (item 3) constructors / destructors are called for the elements of the arrays, and verifying that thoroughly (item 2). Hence: probably feasible, but with a lot of work overhead. Do you think it is worth it?
I am aware of JDK-8319709, and in conversation with @jdksjolen - I basically took this item over for him ;) ------------- PR Comment: https://git.openjdk.org/jdk/pull/16918#issuecomment-1859859939 From stuefe at openjdk.org Mon Dec 18 09:18:43 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 18 Dec 2023 09:18:43 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 09:05:22 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. >> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust macOS coding src/hotspot/os/bsd/os_bsd.cpp line 1007: > 1005: assert(rtn == 0, "fegetenv must succeed"); > 1006: #endif // IA32 > 1007: It's difficult to see what exactly changed on macOS. Is this restructuring necessary? src/hotspot/os/bsd/os_bsd.hpp line 73: > 71: static void clock_init(void); > 72: > 73: static void *dlopen_helper(const char *path, int mode, char *ebuf, int ebuflen); Pre-existing, but these helpers should not have been exposed via os::Bsd or os::Linux; they should be file-scope static.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16903#discussion_r1429750794 PR Review Comment: https://git.openjdk.org/jdk/pull/16903#discussion_r1429749909 From kbarrett at openjdk.org Mon Dec 18 09:22:41 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 18 Dec 2023 09:22:41 GMT Subject: RFR: 8274051: remove supports_vtime() [v2] In-Reply-To: References: <35gxEbiDfghFA2QnjBjp1PXw-D6mGyZA-eTT-k_kGgM=.a3f8337e-7142-42bc-81af-91afe8bf517e@github.com> Message-ID: On Sat, 16 Dec 2023 04:16:00 GMT, Lei Zaakjyu wrote: >> 8274051: remove supports_vtime() > > Lei Zaakjyu has updated the pull request incrementally with one additional commit since the last revision: > > modify the comment I don't think this change should be made, for reasons discussed in JBS and in email back when the JBS issues was first opened. ------------- Changes requested by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17125#pullrequestreview-1786270040 From egahlin at openjdk.org Mon Dec 18 09:56:40 2023 From: egahlin at openjdk.org (Erik Gahlin) Date: Mon, 18 Dec 2023 09:56:40 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 09:05:22 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. >> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . 
> > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust macOS coding I like to better understand the problem. Do we expect this to be an issue in 10-15 years, or is it more of a temporary thing? Would we be comfortable adding a method to a MXBean to check if this has happened? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1859969731 From stuefe at openjdk.org Mon Dec 18 10:29:50 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 18 Dec 2023 10:29:50 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v5] In-Reply-To: References: Message-ID: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> On Fri, 15 Dec 2023 11:57:51 GMT, Joachim Kern wrote: >> On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. >> >> This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). >> >> We propose a different, cleaner way of handling this: >> >> - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. >> - Cache dl handles; repeated opening of a library should return the cached handle. 
>> - Increase handle-local ref counter on open, Decrease it on close >> - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). >> >> This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. > > Joachim Kern has updated the pull request incrementally with two additional commits since the last revision: > > - trailing whitespace > - Following most of Thomas proposals I like this, this is good. Small nits remain. src/hotspot/os/aix/os_aix.cpp line 30: > 28: #pragma alloca > 29: > 30: please remove whitespace change src/hotspot/os/aix/os_aix.cpp line 193: > 191: // local variables > 192: > 193: please remove whitespace change src/hotspot/os/aix/os_aix.cpp line 1113: > 1111: } > 1112: > 1113: please remove whitespace change src/hotspot/os/aix/porting_aix.cpp line 934: > 932: struct scnhdr the_scn; > 933: struct ldhdr the_ldr; > 934: size_t sz = FILHSZ + _AOUTHSZ_EXEC; please rename to xcoffsz, and make constexpr: `constexpr size_t xcoffsz = ...` src/hotspot/os/aix/porting_aix.cpp line 990: > 988: if (env == nullptr) { > 989: // no LIBPATH, try with LD_LIBRARY_PATH > 990: env = getenv("LD_LIBRARY_PATH"); Is LD_LIBRARY_PATH a thing on AIX? I thought it is only used on non-AIX. src/hotspot/os/aix/porting_aix.cpp line 1005: > 1003: // LIBPATH or LD_LIBRARY_PATH and second with burned in libpath. > 1004: // No check against current working directory > 1005: Libpath.print("%s:%s", env, rtv_linkedin_libpath()); Are you sure libpath env var has precedence over the baked-in libpath? 
src/hotspot/os/aix/porting_aix.cpp line 1097:

> 1095: }
> 1096:
> 1097: pthread_mutex_lock(&g_handletable_mutex);

You can make your life a lot easier by defining an RAII object at the start of the file:

    struct TableLocker {
      TableLocker() { pthread_mutex_lock(&g_handletable_mutex); }
      ~TableLocker() { pthread_mutex_unlock(&g_handletable_mutex); }
    };

and just place this at the beginning of your two functions:

    TableLocker lock;
    ...

There is no need to manually unlock then, which avoids the danger of missing a return.

src/hotspot/os/aix/porting_aix.cpp line 1101:

> 1099: for (i = 0; i < g_handletable_used; i++) {
> 1100: if (g_handletable[i].handle == libhandle) {
> 1101: // handle found, decrease refcount

`assert(refcount > 0, "Sanity")`

src/hotspot/os/aix/porting_aix.cpp line 1143:

> 1141: // entry of the array to the place of the entry we want to remove and overwrite it
> 1142: if (i < g_handletable_used) {
> 1143: g_handletable[i] = g_handletable[g_handletable_used];

To be super careful, I would zero out at least the handle of the moved item like this: `g_handletable[g_handletable_used].handle = nullptr`

------------- Changes requested by stuefe (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16920#pullrequestreview-1786400492 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429870755 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429870833 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429870885 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429849403 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429858465 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429859923 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429868182 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429863665 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429870057 From stuefe at openjdk.org Mon Dec 18 10:29:52 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 18 Dec 2023 10:29:52 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v5] In-Reply-To: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> References: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> Message-ID: On Mon, 18 Dec 2023 10:06:34 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with two additional commits since the last revision: >> >> - trailing whitespace >> - Following most of Thomas proposals > > src/hotspot/os/aix/porting_aix.cpp line 934: > >> 932: struct scnhdr the_scn; >> 933: struct ldhdr the_ldr; >> 934: size_t sz = FILHSZ + _AOUTHSZ_EXEC; > > please rename to xcoffsz, and make constexpr: `constexpr size_t xcoffsz = ...` Also, can you please add STATIC_ASSERT(sizeof(the_xcoff) == xcoffsz); STATIC_ASSERT(sizeof(the_scn) == SCNHSZ); STATIC_ASSERT(sizeof(the_ldr) == LDHDRSZ); ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429853292 From jkern at 
openjdk.org Mon Dec 18 10:38:42 2023 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 18 Dec 2023 10:38:42 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v5] In-Reply-To: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> References: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> Message-ID: On Mon, 18 Dec 2023 10:14:42 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with two additional commits since the last revision: >> >> - trailing whitespace >> - Following most of Thomas proposals > > src/hotspot/os/aix/porting_aix.cpp line 990: > >> 988: if (env == nullptr) { >> 989: // no LIBPATH, try with LD_LIBRARY_PATH >> 990: env = getenv("LD_LIBRARY_PATH"); > > Is LD_LIBRARY_PATH a thing on AIX? I thought it is only used on non-AIX. Yes it is, It's the fallback if LIBPATH is not defined ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429891049 From jkern at openjdk.org Mon Dec 18 11:02:45 2023 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 18 Dec 2023 11:02:45 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v5] In-Reply-To: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> References: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> Message-ID: On Mon, 18 Dec 2023 10:19:24 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with two additional commits since the last revision: >> >> - trailing whitespace >> - Following most of Thomas proposals > > src/hotspot/os/aix/porting_aix.cpp line 1101: > >> 1099: for (i = 0; i < g_handletable_used; i++) { >> 1100: if (g_handletable[i].handle == libhandle) { >> 1101: // handle found, decrease refcount > > `assert(refcount > 0, "Sanity"))` Done ------------- PR 
Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429931831 From aph at openjdk.org Mon Dec 18 10:44:40 2023 From: aph at openjdk.org (Andrew Haley) Date: Mon, 18 Dec 2023 10:44:40 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Mon, 18 Dec 2023 09:54:18 GMT, Erik Gahlin wrote: > I like to better understand the problem. Do we expect this to be an issue in 10-15 years, or is it more of a temporary thing? It's been a problem for at least 15 years. Rounding modes are an endless game of whack-a-mole. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1860081331 From stuefe at openjdk.org Mon Dec 18 10:57:46 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 18 Dec 2023 10:57:46 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v5] In-Reply-To: References: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> Message-ID: On Mon, 18 Dec 2023 10:35:48 GMT, Joachim Kern wrote: >> src/hotspot/os/aix/porting_aix.cpp line 990: >> >>> 988: if (env == nullptr) { >>> 989: // no LIBPATH, try with LD_LIBRARY_PATH >>> 990: env = getenv("LD_LIBRARY_PATH"); >> >> Is LD_LIBRARY_PATH a thing on AIX? I thought it is only used on non-AIX. > > Yes it is, It's the fallback if LIBPATH is not defined In that case there may be errors in other places, since so far we assumed it's either one or the other, but not both. Example: https://github.com/openjdk/jdk/blob/a247d0c74bea50f11d24fb5f3576947c6901e567/src/java.base/unix/native/libjli/java_md.c#L43C1-L47 Maybe you need to take a look here, in case LD_LIBRARY_PATH needs to be handled in addition to LIBPATH?
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429917901 From jkern at openjdk.org Mon Dec 18 10:57:48 2023 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 18 Dec 2023 10:57:48 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v5] In-Reply-To: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> References: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> Message-ID: On Mon, 18 Dec 2023 10:16:07 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with two additional commits since the last revision: >> >> - trailing whitespace >> - Following most of Thomas proposals > > src/hotspot/os/aix/porting_aix.cpp line 1005: > >> 1003: // LIBPATH or LD_LIBRARY_PATH and second with burned in libpath. >> 1004: // No check against current working directory >> 1005: Libpath.print("%s:%s", env, rtv_linkedin_libpath()); > > Are you sure libpath env var has precedence over the baked-in libpath? Yes, that was the outcome of my experiments, although the IBM documentation says the opposite: _"Specifies that the library path used at process exec time should be prepended to any library path specified in the load call (either as an argument or environment variable). It is recommended that this flag be specified in all calls to the load subroutine."_ My experiment showed: LIBPATH=libpath; baked-in-libpath=baked-in-libpath; mylib.so is in both paths. After dlopen(mylib.so) a map call shows the library was loaded from libpath. Then I remove the LIBPATH envvar and repeat. Now after dlopen(mylib.so) a map call shows the library was loaded from baked-in-libpath. So the LIBPATH envvar has precedence over the baked-in-libpath.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429919510 From jkern at openjdk.org Mon Dec 18 11:19:47 2023 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 18 Dec 2023 11:19:47 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v5] In-Reply-To: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> References: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> Message-ID: On Mon, 18 Dec 2023 10:25:50 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with two additional commits since the last revision: >> >> - trailing whitespace >> - Following most of Thomas proposals > > src/hotspot/os/aix/os_aix.cpp line 30: > >> 28: #pragma alloca >> 29: >> 30: > > please remove whitespace change Done > src/hotspot/os/aix/os_aix.cpp line 193: > >> 191: // local variables >> 192: >> 193: > > please remove whitespace change Done > src/hotspot/os/aix/porting_aix.cpp line 1097: > >> 1095: } >> 1096: >> 1097: pthread_mutex_lock(&g_handletable_mutex); > > You can make your life a lot easier by defining an RAII object at the start of the file: > > struct TableLocker { > TableLocker() { pthread_mutex_lock(&g_handletable_mutex); } > ~TableLocker() { pthread_mutex_unlock(&g_handletable_mutex); } > }; > > and just place this at the beginning of your two functions > > TableLocker lock; > ... > > > no need to manually unlock then, with the danger of missing a return. Great, thank you. This was one of the things I thought about, but I was not sure, because I did not fully understand the MutexLocker class and the difference between Monitor and Mutex.
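For readers following the review, here is a minimal sketch of how an RAII locker of the kind suggested above can guard a ref-counted handle table. It is not the code from the PR: `g_table`, `table_acquire`, `table_release`, `table_size` and the fixed table size are hypothetical names and choices made for illustration. Only the locker shape, the refcount sanity check, the move-last-entry-into-the-gap removal and the zeroing of the vacated slot come from the discussion.

```cpp
#include <cassert>
#include <pthread.h>

static pthread_mutex_t g_table_mutex = PTHREAD_MUTEX_INITIALIZER;

// RAII locker: locks in the constructor, unlocks in the destructor,
// so every return path releases the mutex.
struct TableLocker {
  TableLocker()  { pthread_mutex_lock(&g_table_mutex); }
  ~TableLocker() { pthread_mutex_unlock(&g_table_mutex); }
  TableLocker(const TableLocker&) = delete;
  TableLocker& operator=(const TableLocker&) = delete;
};

// Hypothetical table entry and storage (fixed size to keep the sketch short).
struct Entry { void* handle; int refcount; };
static const int kMaxEntries = 16;
static Entry g_table[kMaxEntries];
static int   g_used = 0;

// Registers a handle or bumps its refcount.
// Returns the new refcount, or -1 if the table is full.
int table_acquire(void* handle) {
  TableLocker lock;
  for (int i = 0; i < g_used; i++) {
    if (g_table[i].handle == handle) return ++g_table[i].refcount;
  }
  if (g_used == kMaxEntries) return -1;
  g_table[g_used].handle = handle;
  g_table[g_used].refcount = 1;
  g_used++;
  return 1;
}

// Drops one reference. Returns the remaining refcount, or -1 if unknown.
int table_release(void* handle) {
  TableLocker lock;
  for (int i = 0; i < g_used; i++) {
    if (g_table[i].handle == handle) {
      assert(g_table[i].refcount > 0);  // sanity check from the review
      int remaining = --g_table[i].refcount;
      if (remaining == 0) {
        // Remove by moving the last used entry into the gap,
        // then zero out the vacated slot to be careful.
        g_used--;
        if (i < g_used) g_table[i] = g_table[g_used];
        g_table[g_used].handle = nullptr;
        g_table[g_used].refcount = 0;
      }
      return remaining;
    }
  }
  return -1;
}

int table_size() {
  TableLocker lock;
  return g_used;
}
```

Note that in this sketch the counter is decremented before the last entry is moved, so `g_table[g_used]` refers to the last used entry rather than to the slot one past it.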
> src/hotspot/os/aix/porting_aix.cpp line 1143: > >> 1141: // entry of the array to the place of the entry we want to remove and overwrite it >> 1142: if (i < g_handletable_used) { >> 1143: g_handletable[i] = g_handletable[g_handletable_used]; > > To be super careful, I would zero out at least the handle of the moved item like this: > `g_handletable[g_handletable_used].handle = nullptr` Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429950832 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429951237 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429946043 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1429947950 From jkern at openjdk.org Mon Dec 18 11:30:59 2023 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 18 Dec 2023 11:30:59 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v6] In-Reply-To: References: Message-ID: > On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. > > This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). 
> > We propose a different, cleaner way of handling this: > > - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. > - Cache dl handles; repeated opening of a library should return the cached handle. > - Increase handle-local ref counter on open, Decrease it on close > - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). > > This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: Followed Thomas proposals ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16920/files - new: https://git.openjdk.org/jdk/pull/16920/files/18d9d2b0..978ed33c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=04-05 Stats: 79 lines in 2 files changed: 19 ins; 21 del; 39 mod Patch: https://git.openjdk.org/jdk/pull/16920.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16920/head:pull/16920 PR: https://git.openjdk.org/jdk/pull/16920 From avoitylov at openjdk.org Mon Dec 18 12:01:01 2023 From: avoitylov at openjdk.org (Aleksei Voitylov) Date: Mon, 18 Dec 2023 12:01:01 GMT Subject: [jdk22] RFR: 8321515: ARM32: Move method resolution information out of the cpCache properly Message-ID: Hi all, This pull request contains a backport of commit [f573f6d2](https://github.com/openjdk/jdk/commit/f573f6d233d5ea1657018c3c806fee0fac382ac3) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Aleksei Voitylov on 13 Dec 2023 and was reviewed by Aleksey Shipilev. Thanks! 
-------------

Commit messages:
 - Backport f573f6d233d5ea1657018c3c806fee0fac382ac3

Changes: https://git.openjdk.org/jdk22/pull/17/files
Webrev: https://webrevs.openjdk.org/?repo=jdk22&pr=17&range=00
Issue: https://bugs.openjdk.org/browse/JDK-8321515
Stats: 33 lines in 3 files changed: 14 ins; 4 del; 15 mod
Patch: https://git.openjdk.org/jdk22/pull/17.diff
Fetch: git fetch https://git.openjdk.org/jdk22.git pull/17/head:pull/17

PR: https://git.openjdk.org/jdk22/pull/17

From jkern at openjdk.org Mon Dec 18 12:53:43 2023
From: jkern at openjdk.org (Joachim Kern)
Date: Mon, 18 Dec 2023 12:53:43 GMT
Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v5]
In-Reply-To:
References: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com>
Message-ID:

On Mon, 18 Dec 2023 10:54:31 GMT, Thomas Stuefe wrote:

>> Yes it is. It's the fallback if LIBPATH is not defined
>
> In that case there may be errors in other places, since so far we assumed it's either one or the other, but not both. Example:
>
> https://github.com/openjdk/jdk/blob/a247d0c74bea50f11d24fb5f3576947c6901e567/src/java.base/unix/native/libjli/java_md.c#L43C1-L47
>
> Maybe you need to take a look here, in case LD_LIBRARY_PATH needs to be handled in addition to LIBPATH?

Yes, it's one or the other. If the LIBPATH envvar exists (even as an empty string), LD_LIBRARY_PATH is ignored. So, no problems.
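The precedence rule stated in the reply above (LIBPATH wins whenever it is set, even to the empty string; LD_LIBRARY_PATH is only a fallback) can be sketched as follows. The helper name `effective_library_path` is hypothetical and does not appear in the PR; the sketch only illustrates the rule and makes no claim about how porting_aix.cpp implements it.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper illustrating the precedence described in the thread:
// if LIBPATH exists in the environment (even as an empty string), it is used
// and LD_LIBRARY_PATH is ignored; otherwise LD_LIBRARY_PATH is the fallback.
static std::string effective_library_path() {
  const char* libpath = std::getenv("LIBPATH");
  if (libpath != nullptr) {
    return libpath;  // set, possibly empty: LD_LIBRARY_PATH is not consulted
  }
  const char* ld_library_path = std::getenv("LD_LIBRARY_PATH");
  return ld_library_path != nullptr ? ld_library_path : "";
}
```

The key point is the `nullptr` check rather than an emptiness check: an empty LIBPATH still counts as defined.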
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1430106335 From shade at openjdk.org Mon Dec 18 13:15:40 2023 From: shade at openjdk.org (Aleksey Shipilev) Date: Mon, 18 Dec 2023 13:15:40 GMT Subject: [jdk22] RFR: 8321515: ARM32: Move method resolution information out of the cpCache properly In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 11:54:53 GMT, Aleksei Voitylov wrote: > Hi all, > > This pull request contains a backport of commit [f573f6d2](https://github.com/openjdk/jdk/commit/f573f6d233d5ea1657018c3c806fee0fac382ac3) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Aleksei Voitylov on 13 Dec 2023 and was reviewed by Aleksey Shipilev. > > Thanks! Marked as reviewed by shade (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk22/pull/17#pullrequestreview-1786869600 From stuefe at openjdk.org Mon Dec 18 13:36:44 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 18 Dec 2023 13:36:44 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v6] In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 11:30:59 GMT, Joachim Kern wrote: >> On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. >> >> This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). 
As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). >> >> We propose a different, cleaner way of handling this: >> >> - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. >> - Cache dl handles; repeated opening of a library should return the cached handle. >> - Increase handle-local ref counter on open, Decrease it on close >> - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). >> >> This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > Followed Thomas proposals Well done. Releasing the mutex before asserting is not necessary; we don't pull the handle table lock as part of error reporting. ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16920#pullrequestreview-1786905733 From mli at openjdk.org Mon Dec 18 14:04:52 2023 From: mli at openjdk.org (Hamlin Li) Date: Mon, 18 Dec 2023 14:04:52 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v2] In-Reply-To: References: Message-ID: > Hi, > Can you review this patch to implement SHA-1 intrinsic for riscv? > Thanks! > > > ## Test > > ### Functionality > > tests under `test/hotspot/jtreg/compiler/intrinsics/sha` > tests found via `find test/jdk -iname "*SHA1*.java"` > > ### Performance > > tested on `T-HEAD Light Lichee Pi 4A` > > benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`. > > **when intrinsic is enabled** > > o.o.b.java.security.GetMessageDigest.cloneInstance N/A N/A SHA-1 N/A N/A avgt 10 489.860 ? 
6.277 ns/op
> o.o.b.java.security.GetMessageDigest.getInstance N/A N/A SHA-1 N/A N/A avgt 10 3477.197 ± 204.203 ns/op
> o.o.b.java.security.GetMessageDigest.getInstanceWithProvider N/A N/A SHA-1 N/A N/A avgt 10 4111.164 ± 108.861 ns/op
> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 10 3454.207 ± 53.924 ns/op
> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 10 184063.834 ± 677.635 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 10 8260.011 ± 150.045 ns/op
> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 10 191325.246 ± 3298.882 ns/op
> o.o.b.javax.crypto.full.MacBench.mac HmacSHA1 128 N/A N/A avgt 10 8220.886 ± 53.684 ns/op
> o.o.b.javax.crypto.full.MacBench.mac HmacSHA1 1024 N/A N/A avgt 10 18006.955 ± 92.432 ns/op
> o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 10 11688843.558 ± 34924.678 ns/op
>
>
> **when intrinsic is disabled**
>
> o.o.b.java.security.GetMessageDigest.cloneInstance N/A N/A SHA-1 N/A N/A avgt 10 496.890 ± 6.695 ns/op
> o.o.b.java.security.GetMessageDigest.getInstance N/A N/A SHA-1 N/A N/A avgt ...
Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: Add some comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17130/files - new: https://git.openjdk.org/jdk/pull/17130/files/505eca03..c4dc07be Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17130&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17130&range=00-01 Stats: 12 lines in 1 file changed: 11 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17130.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17130/head:pull/17130 PR: https://git.openjdk.org/jdk/pull/17130 From mdoerr at openjdk.org Mon Dec 18 14:46:42 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 18 Dec 2023 14:46:42 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 19:44:09 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. >> >> # Testing status >> >> ## tier1 >> >> | | Win | Mac | Linux | >> |----------|---------|---------|---------| >> | ARM64 | | ? | | >> | ARM32 | | | | >> | x86 | | | ? | >> | x64 | | | ? | >> | PPC64 | | | | >> | S390x | | | | >> | RiscV | n/a | n/a | ? | > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Merge with origin/master > - Fix build, copyright dates, m4 files. 
>  - Fix merge
>  - Catch up with master branch.
>
>    Merge remote-tracking branch 'origin/master' into reuse-macroasm
>  - Some inst_mark fixes; Catch up with master.
>  - Catch up with changes on master
>  - Reuse same C2_MacroAssembler object to emit instructions.

Seems to work on PPC64 now. I agree with Andrew. It should be clear what the benefits are.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-1860685005

From vkempik at openjdk.org Mon Dec 18 15:12:39 2023
From: vkempik at openjdk.org (Vladimir Kempik)
Date: Mon, 18 Dec 2023 15:12:39 GMT
Subject: RFR: 8320069: RISC-V: Add Zcb instructions
In-Reply-To:
References:
Message-ID: <4s2uV2ELJ2B28JUN1lIepQVVg9Gbg3DfY5g37sNyuuM=.222b5dd0-3029-41b4-815e-d5dd883e71c0@github.com>

On Fri, 15 Dec 2023 13:50:14 GMT, Robbin Ehn wrote:

> Hi, these are the instructions for zcb.
>
> Due to our lack of infrastructure, having multiple extension-dependent instructions does not fit well.
> Some of these compressed instructions are also missing a 1-to-1 mapping, e.g. now we have a compressed not, but the corresponding uncompressed instruction is still xor.
> I think we need to do some rework here.
>
> I also don't like the macro expansion, as it is hopeless in debuggers and IDEs (vim+rtags for me).
> (The macro stuff was originally done when templates were blacklisted in hotspot.)
>
> And I don't want an option for this, as zcb is coming in hwprobe; if you have compressed on, you get them if they are supported (may depend on e.g. zbb).
>
> I have made some modifications since it passed tier1, so I'm running stuff over the weekend.

We already have macros for loads and stores in macroAssembler_riscv.hpp. What's the reason to make the compression decision in assembler_riscv.hpp instead (not saying it's wrong)?
https://github.com/openjdk/jdk/blob/38d94725a1a85156e30b72b325886b0e25d4db03/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp#L880 ------------- PR Comment: https://git.openjdk.org/jdk/pull/17122#issuecomment-1860765932 From duke at openjdk.org Mon Dec 18 15:21:49 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Mon, 18 Dec 2023 15:21:49 GMT Subject: Integrated: 8316197: Make tracing of inline cache available in unified logging In-Reply-To: References: Message-ID: On Thu, 7 Dec 2023 23:07:21 GMT, Yi-Fan Tsai wrote: > This removes develop flag `TraceICs` and makes the logs available via `-Xlog`. > > Example: > > % java -Xlog:inlinecache=trace -version > [0.061s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739001d17: set_to_interpreted java.lang.StringLatin1.hashCode([B)I > [0.078s][trace][inlinecache] IC at 0x00007f3739004a87: monomorphic to compiled (rcvr klass = nullptr) > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739005dff: set_to_interpreted jdk.internal.util.ArraysSupport.vectorizedHashCode(Ljava/lang/Object;IIII)I > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900502f: set_to_interpreted jdk.internal.org.objectweb.asm.ByteVector.enlarge(I)V > [0.079s][trace][inlinecache] IC at 0x00007f373900502f: monomorphic to interpreter: {method} {0x00007f36f03e6318} 'enlarge' '(I)V' in 'jdk/internal/org/objectweb/asm/ByteVector' > [0.079s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f3739006b0f: set_to_compiled 0x00007f3739002120 > [0.083s][trace][inlinecache] CompiledDirectStaticCall at 0x00007f373900928f: set_to_interpreted java.lang.AbstractStringBuilder.newCapacity(I)I > [0.083s][trace][inlinecache] IC at 0x00007f373900928f: monomorphic to interpreter: {method} {0x00007f36f00cd170} 'newCapacity' '(I)I' in 'java/lang/AbstractStringBuilder' > ... This pull request has now been integrated. 
Changeset: c0a3b769 Author: Yi-Fan Tsai Committer: Paul Hohensee URL: https://git.openjdk.org/jdk/commit/c0a3b76958bd6766b18cab31b461c1b0ac2c65cd Stats: 35 lines in 10 files changed: 6 ins; 3 del; 26 mod 8316197: Make tracing of inline cache available in unified logging Reviewed-by: kvn, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/17026 From duke at openjdk.org Mon Dec 18 15:23:54 2023 From: duke at openjdk.org (Yi-Fan Tsai) Date: Mon, 18 Dec 2023 15:23:54 GMT Subject: Integrated: 8314029: Add file name parameter to Compiler.perfmap In-Reply-To: References: Message-ID: On Thu, 21 Sep 2023 20:43:56 GMT, Yi-Fan Tsai wrote: > `jcmd Compiler.perfmap` uses the hard-coded file name for a perf map: `/tmp/perf-%d.map`. This change adds an optional argument for specifying a file name. > > `jcmd PID help Compiler.perfmap` shows the following usage. > > > Compiler.perfmap > Write map file for Linux perf tool. > > Impact: Low > > Syntax : Compiler.perfmap [] > > Arguments: > filename : [optional] Name of the map file (STRING, no default value) > > > The following section of man page is also updated. (`man -l src/jdk.jcmd/share/man/jcmd.1`) > > > Compiler.perfmap [arguments] (Linux only) > Write map file for Linux perf tool. > > Impact: Low > > arguments: > > ? filename: (Optional) Name of the map file (STRING, no default value) > > If filename is not specified, a default file name is chosen using the pid of the target JVM process. For example, if the pid is 12345, then > the default filename will be /tmp/perf-12345.map. This pull request has now been integrated. 
Changeset: a5122d7f
Author: Yi-Fan Tsai
Committer: Paul Hohensee
URL: https://git.openjdk.org/jdk/commit/a5122d7f6c36a4c98ea4bea7a7c8081e2a4dadca
Stats: 56 lines in 6 files changed: 37 ins; 3 del; 16 mod

8314029: Add file name parameter to Compiler.perfmap

Reviewed-by: cjplummer, eastigeevich

-------------

PR: https://git.openjdk.org/jdk/pull/15871

From omikhaltcova at openjdk.org Mon Dec 18 16:05:54 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Mon, 18 Dec 2023 16:05:54 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v8]
In-Reply-To:
References:
Message-ID:

> Please review this implementation of the roundD/roundF intrinsics for the RISC-V platform.
>
> The table below shows that a NaN argument must be handled as a special case.
>
>                                          RISC-V                  Java
>                                          (FCVT.W.S)  (FCVT.L.D)  (long round(double a))  (int round(float a))
> Minimum valid input (after rounding)     -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
> Maximum valid input (after rounding)     2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
> Output for out-of-range negative input   -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
> Output for -∞                            -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
> Output for out-of-range positive input   2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
> Output for +∞                            2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
> Output for NaN                           2^31 - 1    2^63 - 1    0                       0
>
> The benchmark run with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>
> **Before**
>
> Benchmark                              (TESTSIZE)   Mode  Cnt    Score    Error   Units
> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555 ±  0.179  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760 ±  0.103  ops/ms
>
>
> **After**
>
> Benchmark                              (TESTSIZE)   Mode  Cnt    Score    Error   Units
> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956 ±  0.186  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947 ±  0.122  ops/ms

Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:

  Used jint_cast/julong_cast; moved mv between feq and beqz

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/16382/files
  - new: https://git.openjdk.org/jdk/pull/16382/files/d60488fa..df70bcba

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=07
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=06-07

Stats: 8 lines in 2 files changed: 2 ins; 2 del; 4 mod
Patch: https://git.openjdk.org/jdk/pull/16382.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/16382/head:pull/16382

PR: https://git.openjdk.org/jdk/pull/16382

From omikhaltcova at openjdk.org Mon Dec 18 16:06:00 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Mon, 18 Dec 2023 16:06:00 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v7]
In-Reply-To:
References:
Message-ID:

On Mon, 18 Dec 2023 05:25:06 GMT, Fei Yang wrote:

>> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Optimization against regression on SiFive
>
> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4264:
>
>> 4262: void MacroAssembler::java_round_float(Register dst, FloatRegister src, FloatRegister ftmp) {
>> 4263:   Label done;
>> 4264:   mv(dst, zr);
>
> I see a slight improvement on both platforms when moving
`mv` into between `feq_s`/`feq_d` and `beqz`. Thx! Fixed. > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4265: > >> 4263: Label done; >> 4264: mv(dst, zr); >> 4265: li(t0, 0x3f000000); > > Suggestion: `mv(t0, jint_cast(0.5f));` Thx! Fixed. > src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4282: > >> 4280: Label done; >> 4281: mv(dst, zr); >> 4282: li(t0, 0x3fe0000000000000); > > Suggestion: `mv(t0, julong_cast(0.5));` Thx! Fixed. > src/hotspot/cpu/riscv/riscv.ad line 8367: > >> 8365: %} >> 8366: >> 8367: ins_pipe(pipe_class_default); > > Suggestion: `ins_pipe(pipe_slow);` Thx! Fixed. > src/hotspot/cpu/riscv/riscv.ad line 8381: > >> 8379: %} >> 8380: >> 8381: ins_pipe(pipe_class_default); > > Suggestion: `ins_pipe(pipe_slow);` Thx! Fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1430349432 PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1430350571 PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1430349683 PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1430350023 PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1430349863 From omikhaltcova at openjdk.org Mon Dec 18 16:17:45 2023 From: omikhaltcova at openjdk.org (Olga Mikhaltsova) Date: Mon, 18 Dec 2023 16:17:45 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v6] In-Reply-To: <7GqjvcQqsvlQZw4-4iKvUElpeZz717Nf8uTd_YY_LBk=.1bfd1f5a-5670-4962-9620-9c93a192a033@github.com> References: <7GqjvcQqsvlQZw4-4iKvUElpeZz717Nf8uTd_YY_LBk=.1bfd1f5a-5670-4962-9620-9c93a192a033@github.com> Message-ID: On Tue, 5 Dec 2023 03:33:52 GMT, Fei Yang wrote: >> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: >> >> Replaced tmp with t0 > > Unfortunately, I witnessed performance regression on sifive unmatched board. > > Before: > > FpRoundingBenchmark.test_ceil 2048 thrpt 15 39.243 ? 
0.506 ops/ms
> FpRoundingBenchmark.test_floor 2048 thrpt 15 39.448 ± 0.076 ops/ms
> FpRoundingBenchmark.test_rint 2048 thrpt 15 39.411 ± 0.134 ops/ms
> FpRoundingBenchmark.test_round_double 2048 thrpt 15 31.329 ± 0.085 ops/ms
> FpRoundingBenchmark.test_round_float 2048 thrpt 15 31.328 ± 0.031 ops/ms
>
> After:
>
> FpRoundingBenchmark.test_ceil 2048 thrpt 15 39.375 ± 0.125 ops/ms
> FpRoundingBenchmark.test_floor 2048 thrpt 15 39.407 ± 0.076 ops/ms
> FpRoundingBenchmark.test_rint 2048 thrpt 15 39.387 ± 0.235 ops/ms
> FpRoundingBenchmark.test_round_double 2048 thrpt 15 23.940 ± 0.025 ops/ms
> FpRoundingBenchmark.test_round_float 2048 thrpt 15 30.629 ± 0.021 ops/ms

@RealFYang thank you for these suggestions! I've fixed everything mentioned above and re-ran the benchmark on VisionFive 2 and T-Head. The results are as follows:

**VisionFive 2**

Benchmark                              (TESTSIZE)   Mode  Cnt    Score    Error   Units
FpRoundingBenchmark.test_round_double        2048  thrpt   15   38.855 ±  0.117  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15   50.301 ±  0.028  ops/ms

**T-Head**

Benchmark                              (TESTSIZE)   Mode  Cnt    Score    Error   Units
FpRoundingBenchmark.test_round_double        2048  thrpt   15  117.959 ±  1.533  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15  121.091 ±  0.267  ops/ms

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1860934621

From mdoerr at openjdk.org Mon Dec 18 16:22:46 2023
From: mdoerr at openjdk.org (Martin Doerr)
Date: Mon, 18 Dec 2023 16:22:46 GMT
Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v6]
In-Reply-To:
References:
Message-ID:

On Mon, 18 Dec 2023 11:30:59 GMT, Joachim Kern wrote:

>> On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles.
>>
>> This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler.
That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). >> >> We propose a different, cleaner way of handling this: >> >> - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. >> - Cache dl handles; repeated opening of a library should return the cached handle. >> - Increase handle-local ref counter on open, Decrease it on close >> - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). >> >> This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > Followed Thomas proposals I like getting rid of `#ifdef AIX` in shared code. The change is not simple, but looks basically good to me. I'll take a closer look when I find more time. I have some coding style requests. Please also see https://wiki.openjdk.org/display/HotSpot/StyleGuide (especially section Whitespace). src/hotspot/os/aix/porting_aix.cpp line 964: > 962: > 963: return libpath; > 964: Empty line could get removed. 
src/hotspot/os/aix/porting_aix.cpp line 985: > 983: if (strchr(path2, '/')) { > 984: stringStream combined; > 985: if (*path2 == '/' || *path2 == '.') We usually use `{` and `}` unless for extremely simple substatements on the same line src/hotspot/share/runtime/os.hpp line 1068: > 1066: static bool pd_dll_unload(void* libhandle, char* ebuf, int ebuflen); > 1067: > 1068: Please remove empty lines. ------------- PR Review: https://git.openjdk.org/jdk/pull/16920#pullrequestreview-1787236876 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1430362306 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1430366154 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1430367434 From mdoerr at openjdk.org Mon Dec 18 16:22:50 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 18 Dec 2023 16:22:50 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v5] In-Reply-To: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> References: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> Message-ID: On Mon, 18 Dec 2023 10:25:57 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with two additional commits since the last revision: >> >> - trailing whitespace >> - Following most of Thomas proposals > > src/hotspot/os/aix/os_aix.cpp line 1113: > >> 1111: } >> 1112: >> 1113: > > please remove whitespace change +1 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1430352685 From stuefe at openjdk.org Mon Dec 18 16:31:43 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 18 Dec 2023 16:31:43 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v5] In-Reply-To: References: <6kfjEICoOee2rRHe1OsqY2xHvvd_Cab0ZQCpp41VfNk=.0fb32b43-56ed-4d81-ab1d-8d40dfb58e4f@github.com> Message-ID: On Mon, 18 Dec 2023 
11:12:23 GMT, Joachim Kern wrote: >> src/hotspot/os/aix/porting_aix.cpp line 1097: >> >>> 1095: } >>> 1096: >>> 1097: pthread_mutex_lock(&g_handletable_mutex); >> >> You can make your life a lot easier by defining an RAII object at the start of the file: >> >> struct TableLocker { >> TableLocker() { pthread_mutex_lock(&g_handletable_mutex); } >> ~TableLocker() { pthread_mutex_unlock(&g_handletable_mutex); } >> }; >> >> and just place this at the beginning of your two functions >> >> TableLocker lock: >> ... >> >> >> no need to manually unlock then, with the danger of missing a return. > > Great, thank you. This was one of the things I thought about, but was not sure, because I did not fully understood the MutexLocker class and the difference between Monitor and Mutex. In hindsight, pthread mutex is the better choice anyway: it avoids dependencies to the VM lifecycle (VM mutexes are only available after VM initialization, so we could not call dlopen() before that). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1430380082 From sspitsyn at openjdk.org Mon Dec 18 16:35:40 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 18 Dec 2023 16:35:40 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v7] In-Reply-To: <4iSULgKTef_C2q4AJpEKB64tZh_QDIB77Ov2rwZ78nY=.39d3c708-4175-42b5-8eb9-58684e131ccf@github.com> References: <4iSULgKTef_C2q4AJpEKB64tZh_QDIB77Ov2rwZ78nY=.39d3c708-4175-42b5-8eb9-58684e131ccf@github.com> Message-ID: On Fri, 15 Dec 2023 10:49:56 GMT, Serguei Spitsyn wrote: >> This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame. >> It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. 
>> The deadlocking scenario is well described by Patricio in a bug report comment. >> In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. >> >> The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about begin/end of critical section. >> This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. >> >> Some of new notifications with `notifyJvmtiSync()` method is on a performance critical path. It is why this method has been intrincified. >> >> New test was developed by Patricio: >> `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> The test is very nice as it reliably in 100% reproduces the deadlock without the fix. >> The test is never failing with this fix. >> >> Testing: >> - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision: > > review: improve an assert message Alan and Leonid, thank you for review! Will push after the final mach5 testing is completed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17011#issuecomment-1860980377 From tonyp at openjdk.org Mon Dec 18 16:43:46 2023 From: tonyp at openjdk.org (Antonios Printezis) Date: Mon, 18 Dec 2023 16:43:46 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v2] In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 14:04:52 GMT, Hamlin Li wrote: >> Hi, >> Can you review this patch to implement SHA-1 intrinsic for riscv? >> Thanks! 
>> >> >> ## Test >> >> ### Functionality >> >> tests under `test/hotspot/jtreg/compiler/intrinsics/sha` >> tests found via `find test/jdk -iname "*SHA1*.java"` >> >> ### Performance >> >> tested on `T-HEAD Light Lichee Pi 4A` >> >> benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`. >> >> **when intrinsic is enabled** >> >> o.o.b.java.security.GetMessageDigest.cloneInstance N/A N/A SHA-1 N/A N/A avgt 10 489.860 ? 6.277 ns/op >> o.o.b.java.security.GetMessageDigest.getInstance N/A N/A SHA-1 N/A N/A avgt 10 3477.197 ? 204.203 ns/op >> o.o.b.java.security.GetMessageDigest.getInstanceWithProvider N/A N/A SHA-1 N/A N/A avgt 10 4111.164 ? 108.861 ns/op >> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 10 3454.207 ? 53.924 ns/op >> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 10 184063.834 ? 677.635 ns/op >> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 10 8260.011 ? 150.045 ns/op >> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 10 191325.246 ? 3298.882 ns/op >> o.o.b.javax.crypto.full.MacBench.mac HmacSHA1 128 N/A N/A avgt 10 8220.886 ? 53.684 ns/op >> o.o.b.javax.crypto.full.MacBench.mac HmacSHA1 1024 N/A N/A avgt 10 18006.955 ? 92.432 ns/op >> o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 10 11688843.558 ? 34924.678 ns/op >> >> >> **when intrinsic is disabled** >> >> o.o.b.java.security.GetMessageDigest.cloneInstance N/A N/A SHA-1 N/A N/A avgt 10 496.890 ? 6.695 ns/op >> o.o.b.java.security.GetMessageDigest.getInstance ... > > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > Add some comments I'll finish the review tomorrow. But posting the comments I have so far. 

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4442:

> 4440: // M't, 0 <= t <= 15
> 4441: // ROTL'1(W't-3 ^ W't-8 ^ W't-14 ^ W't-16), 16 <= t <= 79
> 4442: void sha1_prepare_w(int round, Register cur_w, Register ws[], Register buf, Register tmp) {

I'd just use `t0` directly instead of passing the temp register as an arg.

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4448:

> 4446:
> 4447: if (round%2 == 0) {
> 4448: __ ld(ws[round/2], Address(buf, 0));

Instead of incrementing `buf` 8 times, could you just increment the offset (0, 8, 16, etc.) and only increment `buf` once per loop iteration?

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4585:

> 4583: }
> 4584:
> 4585: void sha1_reserve_prev_abcde(Register a, Register b, Register c, Register d, Register e,

I think it's safe to just use t0 and t1 for intermediate results without passing them as args. I used to pass them as args too, but I changed that for md5.

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4592:

> 4590:
> 4591: __ slli(tmp1, b, 32);
> 4592: __ andi(prev_ab, a, mask32, tmp2);

I think this will materialize `mask32` in `tmp2` twice, once per `andi`, given that the value won't work as an immediate. I'd do `__ mv(tmp2, mask32)` and use `__ andr(prev_ab, a, tmp2)` and `__ andr(prev_cd, c, tmp2)`. I think it will save 2-3 instructions here. No idea how performance-critical this section is, though! I assume not much?

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4627:

> 4625:
> 4626: // c_rarg0 - c_rarg3: x10 - x13
> 4627: Register buf = c_rarg0;

You could copy the four arguments to a different set of registers and use a0 -> a3 for some of the other values to see if you can increase the number of compressed instructions that can be used. Unclear whether it's worth it or not.

src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4664:

> 4662:
> 4663: RegSet saved_regs = RegSet::range(x18, x27);
> 4664: saved_regs += RegSet::of(t2);

Do you need to save t2?
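
For illustration only, the `mask32` suggestion in the comment on line 4592 above can be modelled in plain C++. This is a sketch, not MacroAssembler code: `pack_pair`, `PrevState` and `reserve_prev_abcd` are hypothetical names, and it assumes (the quoted lines do not show it) that the shifted and masked halves are then OR-ed into one 64-bit register. The point is that the mask is loaded into a "register" once and reused for both pairs.

```cpp
#include <cassert>
#include <cstdint>

// Plain C++ model of packing two 32-bit SHA-1 state words into one 64-bit
// value. All names here are hypothetical; this is not HotSpot code.
constexpr uint64_t mask32 = 0xffffffffULL;

// Models: slli(tmp1, hi, 32); andr(dst, lo, mask_reg); then an assumed OR
// of the two halves.
inline uint64_t pack_pair(uint64_t lo, uint64_t hi, uint64_t mask_reg) {
  const uint64_t tmp1 = hi << 32;   // upper bits of hi are shifted out
  return (lo & mask_reg) | tmp1;    // lower 32 bits of lo, upper 32 from hi
}

struct PrevState {
  uint64_t prev_ab;
  uint64_t prev_cd;
};

inline PrevState reserve_prev_abcd(uint64_t a, uint64_t b,
                                   uint64_t c, uint64_t d) {
  // Models a single `mv(tmp2, mask32)`: the constant is materialized once
  // and the same register is used for both AND operations.
  const uint64_t tmp2 = mask32;
  return PrevState{pack_pair(a, b, tmp2), pack_pair(c, d, tmp2)};
}
```

Whether the saved instructions matter depends on how hot this part of the stub is, which the review leaves open.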
src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4673: > 4671: __ srli(d, c, 32); > 4672: __ lw(e, Address(state, 16)); > 4673: (nit) extra whitespace src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4713: > 4711: __ sd(c, Address(state, 8)); > 4712: __ sw(e, Address(state, 16)); > 4713: (nit) extra whitespace ------------- PR Review: https://git.openjdk.org/jdk/pull/17130#pullrequestreview-1786898719 PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430181504 PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430361118 PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430162672 PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430175000 PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430156869 PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430146051 PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430147125 PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1430148290 From jkern at openjdk.org Mon Dec 18 16:57:55 2023 From: jkern at openjdk.org (Joachim Kern) Date: Mon, 18 Dec 2023 16:57:55 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v7] In-Reply-To: References: Message-ID: > On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. > > This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. 
It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). > > We propose a different, cleaner way of handling this: > > - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. > - Cache dl handles; repeated opening of a library should return the cached handle. > - Increase the handle-local ref counter on open, decrease it on close. > - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). > > This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: cosmetic changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16920/files - new: https://git.openjdk.org/jdk/pull/16920/files/978ed33c..f79c89da Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=05-06 Stats: 7 lines in 3 files changed: 1 ins; 4 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16920.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16920/head:pull/16920 PR: https://git.openjdk.org/jdk/pull/16920 From stuefe at openjdk.org Mon Dec 18 17:02:40 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Mon, 18 Dec 2023 17:02:40 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v3] In-Reply-To: References: <3pfgWe1NIoMrOXlGqLsyJCsgPgMZ6AJtlxSy64o76o8=.ecc470d4-12c2-4b1b-9da9-1155ceb8329e@github.com> Message-ID: On Mon, 11 Dec 2023 06:02:29 GMT, David Holmes wrote: > > I cannot just use scanf with %f since that would also parse values without decimal point that are meant to be absolute.
> > 0.0 -.999... == % else absolute ? Hi David, you have read my misgivings about using dot in command line arguments? E.g. on machines with a german locale, one would have to type a comma instead. That makes documentation cumbersome to write and conflicts with how we normally handle percentages at the command line. If you insist this is the best way, I will do this to get this PR to proceed; just thought I ask again. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16938#issuecomment-1861060140 From cjplummer at openjdk.org Mon Dec 18 17:07:59 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 18 Dec 2023 17:07:59 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v8] In-Reply-To: References: Message-ID: On Mon, 11 Dec 2023 22:41:56 GMT, Yi-Fan Tsai wrote: >> `jcmd Compiler.perfmap` uses the hard-coded file name for a perf map: `/tmp/perf-%d.map`. This change adds an optional argument for specifying a file name. >> >> `jcmd PID help Compiler.perfmap` shows the following usage. >> >> >> Compiler.perfmap >> Write map file for Linux perf tool. >> >> Impact: Low >> >> Syntax : Compiler.perfmap [] >> >> Arguments: >> filename : [optional] Name of the map file (STRING, no default value) >> >> >> The following section of man page is also updated. (`man -l src/jdk.jcmd/share/man/jcmd.1`) >> >> >> Compiler.perfmap [arguments] (Linux only) >> Write map file for Linux perf tool. >> >> Impact: Low >> >> arguments: >> >> ? filename: (Optional) Name of the map file (STRING, no default value) >> >> If filename is not specified, a default file name is chosen using the pid of the target JVM process. For example, if the pid is 12345, then >> the default filename will be /tmp/perf-12345.map. > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright of PerfMapTest @phohensee Although this PR had 2 reviews, there was still some unresolved discussion. 
It should not have been pushed. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15871#issuecomment-1861074850 From sspitsyn at openjdk.org Mon Dec 18 17:09:59 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Mon, 18 Dec 2023 17:09:59 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v8] In-Reply-To: References: Message-ID: > This fix is for JDK 23 but the intention is to backport it to 22 in the RDP-1 time frame. > It fixes a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. > The deadlocking scenario is well described by Patricio in a bug report comment. > In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. > > The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about the begin/end of the critical section. > This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. > > Some of the new notifications with the `notifyJvmtiSync()` method are on a performance-critical path, which is why this method has been intrinsified. > > A new test was developed by Patricio: > `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > The test is very nice as it reliably reproduces the deadlock 100% of the time without the fix. > The test never fails with this fix. > > Testing: > - tested with the newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > - tested with mach5 tiers 1-6 Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase.
The pull request now contains 10 commits: - Merge - review: improve an assert message - review: moved a couple of comments out of try blocks - review: moved notifyJvmtiDisableSuspend(true) out of try-block - review: 1) replace CriticalLock with DisableSuspend; 2) minor tweaks - review: (1) rename notifyJvmti method; (2) add try-final statements to VirtualThread methods - Resolved merge conflict in VirtualThread.java - added @summary to new test SuspendWithInterruptLock.java - add new test SuspendWithInterruptLock.java - 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable ------------- Changes: https://git.openjdk.org/jdk/pull/17011/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17011&range=07 Stats: 229 lines in 15 files changed: 196 ins; 0 del; 33 mod Patch: https://git.openjdk.org/jdk/pull/17011.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17011/head:pull/17011 PR: https://git.openjdk.org/jdk/pull/17011 From cjplummer at openjdk.org Mon Dec 18 17:49:39 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 18 Dec 2023 17:49:39 GMT Subject: RFR: 8321404: Limit the number of heap dumps triggered by HeapDumpBeforeFullGC/AfterFullGC [v3] In-Reply-To: <8GGPQMjfU6YWa1i0yjk7SvrJ-lnZu6TxG8zPcbWN3jE=.1a4bb16e-dfc6-46ed-84e1-f2ed3d911699@github.com> References: <0q_yL6Q90R3L0R2-m94w1cCdbkOwapo0hLn_x_QAIVc=.f7a3b73f-9015-485f-9e9c-b4585ca84dd9@github.com> <8GGPQMjfU6YWa1i0yjk7SvrJ-lnZu6TxG8zPcbWN3jE=.1a4bb16e-dfc6-46ed-84e1-f2ed3d911699@github.com> Message-ID: On Sat, 16 Dec 2023 04:36:57 GMT, Denghui Dong wrote: >> Hi, >> >> Could I have a review of this patch? >> >> In the current implementation, HeapDumpBeforeFullGC/AfterFullGC will generate dumps for every FGC, increasing the risk of disk full. >> >> This patch introduces a new option 'FullGCHeapDumpLimit' to limit the number of dumps triggered by HeapDumpBeforeFullGC/AfterFullGC to enhance production-friendliness. 
>> >> Best, >> Denghui > > Denghui Dong has updated the pull request incrementally with one additional commit since the last revision: > > refine description Marked as reviewed by cjplummer (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16976#pullrequestreview-1787426650 From cslucas at openjdk.org Mon Dec 18 18:19:43 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 18 Dec 2023 18:19:43 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6] In-Reply-To: References: Message-ID: On Sat, 16 Dec 2023 12:44:12 GMT, Andrew Haley wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: >> >> - Merge with origin/master >> - Fix build, copyright dates, m4 files. >> - Fix merge >> - Catch up with master branch. >> >> Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Some inst_mark fixes; Catch up with master. >> - Catch up with changes on master >> - Reuse same C2_MacroAssembler object to emit instructions. > > It seems odd to me that this substantial and complex patch lacks any justification. As far as I can tell, the decision to make class MacroAssembler very lightweight so that new instances could be created as needed was deliberate. Why change now? Is it performance, or something else? @theRealAph , @TheRealMDoerr - I just picked a JBS work item that seemed important (P3..) and started working on it. To me the refactoring made a lot of sense as well - why just create thousands of objects if just a few would do. If this is something that doesn't need to be done, please let me know. It already took substantial effort as you said. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-1861244791 From cslucas at openjdk.org Mon Dec 18 18:36:41 2023 From: cslucas at openjdk.org (Cesar Soares Lucas) Date: Mon, 18 Dec 2023 18:36:41 GMT Subject: RFR: JDK-8316991: Reduce nullable allocation merges [v4] In-Reply-To: <9ruRW2rZxYXBEPWwT7s9bfsfipjfC-sddzxomBiOHNI=.b1a27ab4-d214-4091-a90c-a276b01587f7@github.com> References: <9ruRW2rZxYXBEPWwT7s9bfsfipjfC-sddzxomBiOHNI=.b1a27ab4-d214-4091-a90c-a276b01587f7@github.com> Message-ID: <5TdDDfxiKWzbaRrbRRUTgh3_yXh_YRNiVyWSQCr_xPM=.1155f87c-b3f3-47cf-b836-669854c45f64@github.com> On Mon, 13 Nov 2023 07:23:55 GMT, Tobias Hartmann wrote: >> Cesar Soares Lucas has updated the pull request incrementally with one additional commit since the last revision: >> >> Ammend previous fix & add repro tests. > > All tests passed. I'll provide a review later this week. @TobiHartmann - please let me know if there is anything I can do to make reviewing easier. @vnkozlov @iwanowww - could you also please take a look at this patch? ------------- PR Comment: https://git.openjdk.org/jdk/pull/15825#issuecomment-1861285843 From phh at openjdk.org Mon Dec 18 18:56:54 2023 From: phh at openjdk.org (Paul Hohensee) Date: Mon, 18 Dec 2023 18:56:54 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v8] In-Reply-To: References: Message-ID: <5jNzyVdKTTdgaMGoKhHaqoRBnLHkXRys3gCIRvjHIdE=.bf6c28df-ebfe-4486-a146-dbfa55cffd33@github.com> On Mon, 11 Dec 2023 22:41:56 GMT, Yi-Fan Tsai wrote: >> `jcmd Compiler.perfmap` uses the hard-coded file name for a perf map: `/tmp/perf-%d.map`. This change adds an optional argument for specifying a file name. >> >> `jcmd PID help Compiler.perfmap` shows the following usage. >> >> >> Compiler.perfmap >> Write map file for Linux perf tool. 
>> >> Impact: Low >> >> Syntax : Compiler.perfmap [] >> >> Arguments: >> filename : [optional] Name of the map file (STRING, no default value) >> >> >> The following section of man page is also updated. (`man -l src/jdk.jcmd/share/man/jcmd.1`) >> >> >> Compiler.perfmap [arguments] (Linux only) >> Write map file for Linux perf tool. >> >> Impact: Low >> >> arguments: >> >> ? filename: (Optional) Name of the map file (STRING, no default value) >> >> If filename is not specified, a default file name is chosen using the pid of the target JVM process. For example, if the pid is 12345, then >> the default filename will be /tmp/perf-12345.map. > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright of PerfMapTest Shall I revert it? ------------- PR Comment: https://git.openjdk.org/jdk/pull/15871#issuecomment-1861334362 From mdoerr at openjdk.org Mon Dec 18 20:22:44 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Mon, 18 Dec 2023 20:22:44 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6] In-Reply-To: References: Message-ID: <4Gr1LLsOrG-7sJDE0mlR_x9QxrvQBMFzDe-atrmFAPs=.bf32dd9d-0abc-4de1-8ab7-3f12377e5098@github.com> On Thu, 14 Dec 2023 19:44:09 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. >> >> # Testing status >> >> ## tier1 >> >> | | Win | Mac | Linux | >> |----------|---------|---------|---------| >> | ARM64 | | ? 
| | >> | ARM32 | | | | >> | x86 | | | ? | >> | x64 | | | ? | >> | PPC64 | | | | >> | S390x | | | | >> | RiscV | n/a | n/a | ? | > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Merge with origin/master > - Fix build, copyright dates, m4 files. > - Fix merge > - Catch up with master branch. > > Merge remote-tracking branch 'origin/master' into reuse-macroasm > - Some inst_mark fixes; Catch up with master. > - Catch up with changes on master > - Reuse same C2_MacroAssembler object to emit instructions. Cleanup is not bad. Fewer objects and a bit shorter code at some places are an advantage. Maybe Vladimir had some more reasons in mind when filing the issue. It's linked to https://bugs.openjdk.org/browse/JDK-8239472. It'd be nice if you or Vladimir could add a bit of motivation to the description of the PR or the JBS issue. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-1861538259 From cjplummer at openjdk.org Mon Dec 18 20:33:54 2023 From: cjplummer at openjdk.org (Chris Plummer) Date: Mon, 18 Dec 2023 20:33:54 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v8] In-Reply-To: References: Message-ID: On Mon, 11 Dec 2023 22:41:56 GMT, Yi-Fan Tsai wrote: >> `jcmd Compiler.perfmap` uses the hard-coded file name for a perf map: `/tmp/perf-%d.map`. This change adds an optional argument for specifying a file name. >> >> `jcmd PID help Compiler.perfmap` shows the following usage. >> >> >> Compiler.perfmap >> Write map file for Linux perf tool. >> >> Impact: Low >> >> Syntax : Compiler.perfmap [] >> >> Arguments: >> filename : [optional] Name of the map file (STRING, no default value) >> >> >> The following section of man page is also updated. (`man -l src/jdk.jcmd/share/man/jcmd.1`) >> >> >> Compiler.perfmap [arguments] (Linux only) >> Write map file for Linux perf tool. >> >> Impact: Low >> >> arguments: >> >> ? 
filename: (Optional) Name of the map file (STRING, no default value) >> >> If filename is not specified, a default file name is chosen using the pid of the target JVM process. For example, if the pid is 12345, then >> the default filename will be /tmp/perf-12345.map. > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright of PerfMapTest If we do settle on some additional changes, I think probably a follow-up CR would be cleaner than a BACKOUT and REDO. ------------- PR Comment: https://git.openjdk.org/jdk/pull/15871#issuecomment-1861563261 From phh at openjdk.org Mon Dec 18 21:27:50 2023 From: phh at openjdk.org (Paul Hohensee) Date: Mon, 18 Dec 2023 21:27:50 GMT Subject: RFR: 8314029: Add file name parameter to Compiler.perfmap [v8] In-Reply-To: References: Message-ID: On Mon, 11 Dec 2023 22:41:56 GMT, Yi-Fan Tsai wrote: >> `jcmd Compiler.perfmap` uses the hard-coded file name for a perf map: `/tmp/perf-%d.map`. This change adds an optional argument for specifying a file name. >> >> `jcmd PID help Compiler.perfmap` shows the following usage. >> >> >> Compiler.perfmap >> Write map file for Linux perf tool. >> >> Impact: Low >> >> Syntax : Compiler.perfmap [] >> >> Arguments: >> filename : [optional] Name of the map file (STRING, no default value) >> >> >> The following section of man page is also updated. (`man -l src/jdk.jcmd/share/man/jcmd.1`) >> >> >> Compiler.perfmap [arguments] (Linux only) >> Write map file for Linux perf tool. >> >> Impact: Low >> >> arguments: >> >> ? filename: (Optional) Name of the map file (STRING, no default value) >> >> If filename is not specified, a default file name is chosen using the pid of the target JVM process. For example, if the pid is 12345, then >> the default filename will be /tmp/perf-12345.map. > > Yi-Fan Tsai has updated the pull request incrementally with one additional commit since the last revision: > > Update copyright of PerfMapTest Thanks. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/15871#issuecomment-1861689670 From kbarrett at openjdk.org Mon Dec 18 22:50:44 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Mon, 18 Dec 2023 22:50:44 GMT Subject: RFR: 8319115: GrowableArray: Do not initialize up to capacity In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 07:49:04 GMT, David Holmes wrote: >> Before this patch, we always initialized the GrowableArray up to its `capacity`, and not just up to `length`. This is problematic for a few reasons: >> >> - It is not expected. `std::vector` also only initializes the elements up to its size, and not to capacity. >> - It requires a default-constructor for the element type. And the default-constructor is then used to fill in the elements between length and capacity. If the elements do any allocation themselves, then this is a waste of resources. >> - The implementation also required the copy-assignment-operator for the element type. This is a lesser restriction. But the copy-assignment-operator was used in cases like `append` (where placement copy-construct would be expected), and not just in true assignment kinds of cases like `at_put`. >> >> For this reason, I reworked a lot of the methods to ensure that only the "slots" up to `length` are ever initialized, and the space between `length` and `capacity` is always garbage. >> >> ----- >> >> Also, before this patch, one can CHeap allocate both with `GrowableArray` and `GrowableArrayCHeap`. This is unnecessary. It required more complex verification in `GrowableArray` to deal with all cases. And `GrowableArrayCHeap` is already explicitly a smaller object, and should hence be preferred. Hence I changed all CHeap allocating cases of `GrowableArray` to `GrowableArrayCHeap`. This also allows for a clear separation: >> - `GrowableArray` only deals with arena / resource area allocation. These are arrays that are regularly abandoned at the end of their use, rather than deleted or even cleared. 
>> - `GrowableArrayCHeap` only deals with CHeap allocated memory. We expect that the destructor for it is called eventually, either when it goes out of scope or when `delete` is explicitly called. We expect that the elements could be allocating resources internally, and hence rely on the destructors for the elements being called, which may free up those internally allocated resources. >> >> Therefore, we now only allow `GrowableArrayCHeap` to have element types with non-trivial destructors, but `GrowableArray` checks that element types do not have non-trivial destructors (since it is common practice to just abandon arena / resource area allocated arrays, rather than calling the destructor or clearing the array, which also destructs all elements). This more clearly separates the two worlds: clean-up your own mess (CHeap) vs abandon your mess (arena / resource area). >> >> ----- >> >> I al... > > @eme64 Is it feasible to split this up to solve each of the problems you identify in stages? There is also overlap here with JDK-8319709 IIUC. Thanks. > @dholmes-ora These are the "parts": > > 1. initialize up to capacity vs length > > 2. update the test to verify this (complete refactoring) > > 3. remove cheap use of GrowableArray -> use GrowableArrayCHeap instead > > > The first 2 items are inseparable, I cannot make substantial changes to many GrowableArray methods without there even being tests for them. And the tests would not pass before the changes for item 1, since the tests also verify what elements of the array are initialized. So adding the tests first would not be very feasible. > > The 3rd item could maybe be split, and be done before the rest. Though it would also require lots of changes to the test, which then I would have to completely refactor with items 1+2 anyway. > > And the items are related conceptually, that is why I would felt ok pushing them together. 
It is all about when (item 1) and what kinds of (item 3) constructors / destructors are called for the elements of the arrays, and verifying that thoroughly (item 2). > > Hence: feasible probably, but lots of work overhead. Do you think it is worth it? I too would prefer that it be split up. It's very easy to miss important details in amongst all the mostly relatively simple renamings. That is, I think 3 should be separate from the other changes. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16918#issuecomment-1861812983 From dholmes at openjdk.org Tue Dec 19 01:04:39 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 19 Dec 2023 01:04:39 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v3] In-Reply-To: References: <3pfgWe1NIoMrOXlGqLsyJCsgPgMZ6AJtlxSy64o76o8=.ecc470d4-12c2-4b1b-9da9-1155ceb8329e@github.com> Message-ID: On Mon, 18 Dec 2023 16:59:26 GMT, Thomas Stuefe wrote: >>> I cannot just use scanf with %f since that would also parse values without decimal point that are meant to be absolute. >> >> 0.0 -.999... == % else absolute ? > >> > I cannot just use scanf with %f since that would also parse values without decimal point that are meant to be absolute. >> >> 0.0 -.999... == % else absolute ? > > Hi David, > > you have read my misgivings about using dot in command line arguments? E.g. on machines with a german locale, one would have to type a comma instead. That makes documentation cumbersome to write and conflicts with how we normally handle percentages at the command line. > > If you insist this is the best way, I will do this to get this PR to proceed; just thought I ask again. Sorry @tstuefe I'm swamped at the moment. I don't understand the locale issue. We have double flags already and it is not considered an issue. But in any case I don't have the time to continue the debate so stick with your multiple flags. 
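
As a rough illustration of the "0.0 -.999... == % else absolute" idea quoted in this thread, the following standalone C++ sketch parses one value and classifies it by range. `RssThreshold` and `parse_rss_threshold` are hypothetical names and do not come from the PR; the sketch also assumes the "C" locale, since `strtod` takes its decimal-point character from the current `LC_NUMERIC` setting, which is the kind of locale dependence raised in the discussion.

```cpp
#include <cerrno>
#include <cmath>
#include <cstdlib>
#include <string>

// Hypothetical type and function names, for illustration only.
struct RssThreshold {
  bool   valid;        // false if the text could not be parsed
  bool   is_fraction;  // true: value in [0, 1) read as a fraction;
                       // false: value read as an absolute amount
  double value;
};

inline RssThreshold parse_rss_threshold(const std::string& text) {
  RssThreshold result{false, false, 0.0};
  const char* begin = text.c_str();
  char* end = nullptr;
  errno = 0;
  // strtod honours the decimal-point character of the current LC_NUMERIC
  // locale; a C++ program starts in the "C" locale, where that is '.'.
  const double v = std::strtod(begin, &end);
  if (end == begin || *end != '\0' || errno == ERANGE) {
    return result;  // nothing parsed, trailing characters, or out of range
  }
  if (!std::isfinite(v) || v < 0.0) {
    return result;  // reject "nan", "inf" and negative values
  }
  result.valid = true;
  result.is_fraction = (v < 1.0);  // the range heuristic from the thread
  result.value = v;
  return result;
}
```

With this reading, an input such as `0` is classified as a fraction even if an absolute value was meant, which shows why a single flag with a range heuristic can be ambiguous.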
------------- PR Comment: https://git.openjdk.org/jdk/pull/16938#issuecomment-1861932847 From fyang at openjdk.org Tue Dec 19 01:57:42 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 19 Dec 2023 01:57:42 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v8] In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 16:05:54 GMT, Olga Mikhaltsova wrote:

>> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform.
>>
>> In the table below it is shown that NaN argument should be processed as a special case.
>>
>>                                         RISC-V                  Java
>>                                         (FCVT.W.S)  (FCVT.L.D)  (long round(double a))  (int round(float a))
>> Minimum valid input (after rounding)    -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
>> Maximum valid input (after rounding)    2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for out-of-range negative input  -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for -∞                           -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for out-of-range positive input  2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for +∞                           2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for NaN                          2^31 - 1    2^63 - 1    0                       0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)  Mode   Cnt  Score    Error  Units
>> FpRoundingBenchmark.test_round_double  2048        thrpt  15   59.555   0.179  ops/ms
>> FpRoundingBenchmark.test_round_float   2048        thrpt  15   49.760   0.103  ops/ms
>>
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)  Mode   Cnt  Score    Error  Units
>> FpRoundingBenchmark.test_round_double  2048        thrpt  15   110.956  0.186  ops/ms
>> FpRoundingBenchmark.test_round_float   2048        thrpt  15   115.947  0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
> Used jint_cast/julong_cast; moved mv between feq and beqz

Thanks. Looks fine to me except for two nits.
I guess we can follow the design decisions of RISC-V about dynamic and static rounding mode from the ISA spec and keep an eye on how this may affect new hardware implementations coming out. The C99 language standard effectively mandates the provision of a dynamic rounding mode register. In typical implementations, writes to the dynamic rounding mode CSR state will serialize the pipeline. Static rounding modes are used to implement specialized arithmetic operations that often have to switch frequently between different rounding modes src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4264: > 4262: void MacroAssembler::java_round_float(Register dst, FloatRegister src, FloatRegister ftmp) { > 4263: Label done; > 4264: li(t0, jint_cast(0.5f)); Nit: Can you change this `li` into `mv`? That will be consistent with other places where we move an immediate. src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4281: > 4279: void MacroAssembler::java_round_double(Register dst, FloatRegister src, FloatRegister ftmp) { > 4280: Label done; > 4281: li(t0, julong_cast(0.5)); Same as above here. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16382#pullrequestreview-1787934518 PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1430796146 PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1430796295 From kbarrett at openjdk.org Tue Dec 19 02:08:54 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Tue, 19 Dec 2023 02:08:54 GMT Subject: RFR: 8319115: GrowableArray: Do not initialize up to capacity In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 07:56:04 GMT, Emanuel Peter wrote: > Before this patch, we always initialized the GrowableArray up to its `capacity`, and not just up to `length`. This is problematic for a few reasons: > > - It is not expected. `std::vector` also only initializes the elements up to its size, and not to capacity. 
> - It requires a default-constructor for the element type. And the default-constructor is then used to fill in the elements between length and capacity. If the elements do any allocation themselves, then this is a waste of resources. > - The implementation also required the copy-assignment-operator for the element type. This is a lesser restriction. But the copy-assignment-operator was used in cases like `append` (where placement copy-construct would be expected), and not just in true assignment kinds of cases like `at_put`. > > For this reason, I reworked a lot of the methods to ensure that only the "slots" up to `length` are ever initialized, and the space between `length` and `capacity` is always garbage. > > ----- > > Also, before this patch, one can CHeap allocate both with `GrowableArray` and `GrowableArrayCHeap`. This is unnecessary. It required more complex verification in `GrowableArray` to deal with all cases. And `GrowableArrayCHeap` is already explicitly a smaller object, and should hence be preferred. Hence I changed all CHeap allocating cases of `GrowableArray` to `GrowableArrayCHeap`. This also allows for a clear separation: > - `GrowableArray` only deals with arena / resource area allocation. These are arrays that are regularly abandoned at the end of their use, rather than deleted or even cleared. > - `GrowableArrayCHeap` only deals with CHeap allocated memory. We expect that the destructor for it is called eventually, either when it goes out of scope or when `delete` is explicitly called. We expect that the elements could be allocating resources internally, and hence rely on the destructors for the elements being called, which may free up those internally allocated resources. 
> > Therefore, we now only allow `GrowableArrayCHeap` to have element types with non-trivial destructors, but `GrowableArray` checks that element types do not have non-trivial destructors (since it is common practice to just abandon arena / resource area allocated arrays, rather than calling the destructor or clearing the array, which also destructs all elements). This more clearly separates the two worlds: clean-up your own mess (CHeap) vs abandon your mess (arena / resource area). > > ----- > > I also completely refactored and improved ... That's it for today. I'll continue looking at this tomorrow. src/hotspot/share/utilities/bitMap.hpp line 191: > 189: verify_size(size_in_bits); > 190: } > 191: ~BitMap() {} This change is incorrect. This destructor is intentionally declared protected, to prevent slicing through it. It would be reasonable to change it to have a `= default` definition though, rather than the empty body definition it currently has. Note that BitMap copying has the same shallow-copying problems as GrowableArray. src/hotspot/share/utilities/growableArray.hpp line 87: > 85: } > 86: > 87: ~GrowableArrayBase() {} Another incorrect removal of an intentionally protected destructor. src/hotspot/share/utilities/growableArray.hpp line 124: > 122: GrowableArrayBase(capacity, initial_len), _data(data) {} > 123: > 124: ~GrowableArrayView() {} Another incorrect removal of an intentionally protected destructor. src/hotspot/share/utilities/growableArray.hpp line 294: > 292: void remove_range(int start, int end) { > 293: assert(0 <= start, "illegal start index %d", start); > 294: assert(start < end && end <= _len, "erase called with invalid range (%d, %d) for length %d", start, end, _len); pre-existing: I think start == end should be permitted. There's no reason to forbid an empty range, and there are algorithms that are simpler if empty ranges are permitted. 
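To illustrate the point about empty ranges, here is a minimal sketch, not the HotSpot code: it uses `std::vector` as a stand-in for GrowableArray, and both `remove_range` (as a free function) and `drop_leading` are hypothetical helpers introduced only for this example. It assumes the range is half-open, i.e. `[start, end)`, and relaxes the check to `start <= end` so that an empty range is a no-op.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for GrowableArray::remove_range, written against
// std::vector so the example is self-contained. Removes the half-open
// range [start, end). An empty range (start == end) is accepted.
template <typename E>
void remove_range(std::vector<E>& data, int start, int end) {
  assert(0 <= start && "illegal start index");
  assert(start <= end && end <= static_cast<int>(data.size()) && "invalid range");
  data.erase(data.begin() + start, data.begin() + end);
}

// An algorithm that gets simpler when empty ranges are allowed: drop the
// leading elements matching a predicate, with no special case for
// "nothing to drop".
template <typename E, typename Pred>
void drop_leading(std::vector<E>& data, Pred pred) {
  int n = 0;
  while (n < static_cast<int>(data.size()) && pred(data[n])) {
    n++;
  }
  remove_range(data, 0, n);  // n may be 0
}
```

With the stricter `start < end` check, `drop_leading` would need an extra `if (n > 0)` guard at every such call site.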
src/hotspot/share/utilities/growableArray.hpp line 319: > 317: ::new ((void*)&this->_data[index]) E(_data[_len]); > 318: // Destruct last element > 319: this->_data[_len].~E(); Must not do the copy/destruct if index designated the last element. src/hotspot/share/utilities/growableArray.hpp line 327: > 325: // sort by fixed-stride sub arrays: > 326: void sort(int f(E*, E*), int stride) { > 327: qsort(_data, length() / stride, sizeof(E) * stride, (_sort_Fn)f); pre-existing: Use of qsort presumes E is trivially copyable/assignable. Use QuickSort::sort instead. src/hotspot/share/utilities/growableArray.hpp line 398: > 396: } > 397: > 398: ~GrowableArrayWithAllocator() {} Another incorrect removal of an intentionally protected destructor. src/hotspot/share/utilities/growableArray.hpp line 414: > 412: // Assignment would be wrong, as it assumes the destination > 413: // was already initialized. > 414: ::new ((void*)&this->_data[idx]) E(elem); I don't think the cast to void* is needed, and just adds clutter. There are many more of these. ------------- Changes requested by kbarrett (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16918#pullrequestreview-1787825840 PR Review Comment: https://git.openjdk.org/jdk/pull/16918#discussion_r1430721171 PR Review Comment: https://git.openjdk.org/jdk/pull/16918#discussion_r1430728218 PR Review Comment: https://git.openjdk.org/jdk/pull/16918#discussion_r1430728372 PR Review Comment: https://git.openjdk.org/jdk/pull/16918#discussion_r1430760537 PR Review Comment: https://git.openjdk.org/jdk/pull/16918#discussion_r1430759359 PR Review Comment: https://git.openjdk.org/jdk/pull/16918#discussion_r1430762966 PR Review Comment: https://git.openjdk.org/jdk/pull/16918#discussion_r1430734644 PR Review Comment: https://git.openjdk.org/jdk/pull/16918#discussion_r1430747894 From dholmes at openjdk.org Tue Dec 19 05:35:43 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 19 Dec 2023 05:35:43 GMT Subject: RFR: 8319115: GrowableArray: Do not initialize up to capacity In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 07:56:04 GMT, Emanuel Peter wrote: > Before this patch, we always initialized the GrowableArray up to its `capacity`, and not just up to `length`. This is problematic for a few reasons: > > - It is not expected. `std::vector` also only initializes the elements up to its size, and not to capacity. > - It requires a default-constructor for the element type. And the default-constructor is then used to fill in the elements between length and capacity. If the elements do any allocation themselves, then this is a waste of resources. > - The implementation also required the copy-assignment-operator for the element type. This is a lesser restriction. But the copy-assignment-operator was used in cases like `append` (where placement copy-construct would be expected), and not just in true assignment kinds of cases like `at_put`. 
> > For this reason, I reworked a lot of the methods to ensure that only the "slots" up to `length` are ever initialized, and the space between `length` and `capacity` is always garbage. > > ----- > > Also, before this patch, one can CHeap allocate both with `GrowableArray` and `GrowableArrayCHeap`. This is unnecessary. It required more complex verification in `GrowableArray` to deal with all cases. And `GrowableArrayCHeap` is already explicitly a smaller object, and should hence be preferred. Hence I changed all CHeap allocating cases of `GrowableArray` to `GrowableArrayCHeap`. This also allows for a clear separation: > - `GrowableArray` only deals with arena / resource area allocation. These are arrays that are regularly abandoned at the end of their use, rather than deleted or even cleared. > - `GrowableArrayCHeap` only deals with CHeap allocated memory. We expect that the destructor for it is called eventually, either when it goes out of scope or when `delete` is explicitly called. We expect that the elements could be allocating resources internally, and hence rely on the destructors for the elements being called, which may free up those internally allocated resources. > > Therefore, we now only allow `GrowableArrayCHeap` to have element types with non-trivial destructors, but `GrowableArray` checks that element types do not have non-trivial destructors (since it is common practice to just abandon arena / resource area allocated arrays, rather than calling the destructor or clearing the array, which also destructs all elements). This more clearly separates the two worlds: clean-up your own mess (CHeap) vs abandon your mess (arena / resource area). > > ----- > > I also completely refactored and improved ... Splitting out part 3 would have been preferable IMO. The CHeap changes are unrelated to the capacity issue and should have their own JBS issue. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16918#issuecomment-1862149245 From fyang at openjdk.org Tue Dec 19 05:56:39 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 19 Dec 2023 05:56:39 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v6] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 14:51:06 GMT, Ludovic Henry wrote: >> 8315856: RISC-V: Use Zacas extension for cmpxchg > > Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: > > review Thanks for the update. Seems fine given that it is an experimental feature for now. We will need more thorough tests (jcstress, benchmark, etc.) on real hardware when turning this into a product option. ------------- Marked as reviewed by fyang (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16910#pullrequestreview-1788143980 From epeter at openjdk.org Tue Dec 19 06:46:47 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 19 Dec 2023 06:46:47 GMT Subject: RFR: 8319115: GrowableArray: Do not initialize up to capacity In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 22:48:18 GMT, Kim Barrett wrote: >> @eme64 Is it feasible to split this up to solve each of the problems you identify in stages? There is also overlap here with JDK-8319709 IIUC. Thanks. > >> @dholmes-ora These are the "parts": >> >> 1. initialize up to capacity vs length >> >> 2. update the test to verify this (complete refactoring) >> >> 3. remove cheap use of GrowableArray -> use GrowableArrayCHeap instead >> >> >> The first 2 items are inseparable, I cannot make substantial changes to many GrowableArray methods without there even being tests for them. And the tests would not pass before the changes for item 1, since the tests also verify what elements of the array are initialized. So adding the tests first would not be very feasible. >> >> The 3rd item could maybe be split, and be done before the rest. 
Though it would also require lots of changes to the test, which then I would have to completely refactor with items 1+2 anyway. >> >> And the items are related conceptually, that is why I felt ok pushing them together. It is all about when (item 1) and what kinds of (item 3) constructors / destructors are called for the elements of the arrays, and verifying that thoroughly (item 2). >> >> Hence: feasible probably, but lots of work overhead. Do you think it is worth it? > > I too would prefer that it be split up. It's very easy to miss important details in amongst all the mostly relatively > simple renamings. That is, I think 3 should be separate from the other changes. @kimbarrett @dholmes-ora I will try to split out the GrowableArray cheap -> GrowableArrayCHeap changes. And thanks for the feedback you already gave, Kim! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16918#issuecomment-1862209037 From rehn at openjdk.org Tue Dec 19 07:44:40 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 19 Dec 2023 07:44:40 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v6] In-Reply-To: References: Message-ID: <5OceXYITf9roiT2TKwiBh8SuLCgrc973uPZzOQ3X6Ew=.7192440a-dd25-489c-9f8e-c93c73155893@github.com> On Fri, 15 Dec 2023 14:51:06 GMT, Ludovic Henry wrote: >> 8315856: RISC-V: Use Zacas extension for cmpxchg > > Ludovic Henry has updated the pull request incrementally with one additional commit since the last revision: > > review Thanks ------------- Marked as reviewed by rehn (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16910#pullrequestreview-1788270314 From gcao at openjdk.org Tue Dec 19 07:48:46 2023 From: gcao at openjdk.org (Gui Cao) Date: Tue, 19 Dec 2023 07:48:46 GMT Subject: Integrated: 8322154: RISC-V: JDK-8315743 missed change in MacroAssembler::load_reserved In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 03:56:04 GMT, Gui Cao wrote: > The fix for https://bugs.openjdk.org/browse/JDK-8315743 touches MacroAssembler::load_reserved, replacing `t0` with `dst`. But it missed the change for the third case (that is `uint32`) of the switch in this assembler function. We should also replace `t0` used in `zero_extend` with `dst`. @robehn can you help confirm this? > > - [x] Run tier1 tests on qemu 8.1.50 with UseRVV (release) This pull request has now been integrated. Changeset: 59073fa3 Author: Gui Cao Committer: Robbin Ehn URL: https://git.openjdk.org/jdk/commit/59073fa3eb7d04d9e0f08fbef70c9db6ffde296a Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8322154: RISC-V: JDK-8315743 missed change in MacroAssembler::load_reserved Reviewed-by: fyang, rehn, luhenry ------------- PR: https://git.openjdk.org/jdk/pull/17117 From iklam at openjdk.org Tue Dec 19 08:00:01 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 19 Dec 2023 08:00:01 GMT Subject: RFR: 8322321: Add man page doc for -XX:+VerifySharedSpaces Message-ID: `VerifySharedSpaces` was disabled by default in [JDK-8221478](https://bugs.openjdk.org/browse/JDK-8221478). We should add an entry in the "java" man page about the intended use for this flag.
------------- Commit messages: - 8322321: Add man page doc for -XX:+VerifySharedSpaces Changes: https://git.openjdk.org/jdk/pull/17152/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17152&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322321 Stats: 9 lines in 1 file changed: 9 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17152.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17152/head:pull/17152 PR: https://git.openjdk.org/jdk/pull/17152 From rehn at openjdk.org Tue Dec 19 08:01:43 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 19 Dec 2023 08:01:43 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions In-Reply-To: <4s2uV2ELJ2B28JUN1lIepQVVg9Gbg3DfY5g37sNyuuM=.222b5dd0-3029-41b4-815e-d5dd883e71c0@github.com> References: <4s2uV2ELJ2B28JUN1lIepQVVg9Gbg3DfY5g37sNyuuM=.222b5dd0-3029-41b4-815e-d5dd883e71c0@github.com> Message-ID: On Mon, 18 Dec 2023 15:10:13 GMT, Vladimir Kempik wrote: > We already have "macroses" for load and stores in macroAssembler_riscv.hpp, what's the reason to do compression decision in assembler_riscv.hpp instead ( not saying it's wrong) ? > > https://github.com/openjdk/jdk/blob/38d94725a1a85156e30b72b325886b0e25d4db03/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp#L880 No, you are correct I also think this is not optimal. I don't know the background, but it seems like this is the easiest way to add compressed transparently. But to fully utilize C instruction we should favor the x8->x15, we often don't get C due to e.g. BCP is in x22. I think to be able to better utilize C we can't have it so transparent. So here I just try to follow the current code, see how lw is changed to c_lw. 
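As a rough illustration of why register choice matters here, the following sketch checks whether a load could use the compressed `c.lw` form at all. It is a hypothetical helper, not HotSpot code: the function names are made up for this example, registers are given as plain x-register numbers, and only the register-based `c.lw` form is modelled (not the stack-pointer form `c.lwsp`). It assumes the constraints from the RISC-V C extension: both registers must be in x8..x15, and the offset is an unsigned multiple of 4 no larger than 124.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not HotSpot code. c.lw encodes each register in
// 3 bits, so only x8..x15 are reachable.
constexpr bool is_compressible_reg(int xreg) {
  return xreg >= 8 && xreg <= 15;
}

// True if "lw rd, offset(rs1)" could be emitted as c.lw: both registers in
// x8..x15 and the offset an unsigned multiple of 4 in [0, 124].
constexpr bool can_use_c_lw(int rd, int rs1, int32_t offset) {
  return is_compressible_reg(rd) && is_compressible_reg(rs1) &&
         offset >= 0 && offset <= 124 && (offset & 3) == 0;
}
```

Under these assumptions a load with x22 as the base register, as in the BCP case mentioned above, can never be compressed, whatever the offset: `can_use_c_lw(10, 22, 8)` is false while `can_use_c_lw(10, 11, 8)` is true.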
------------- PR Comment: https://git.openjdk.org/jdk/pull/17122#issuecomment-1862283903 From stuefe at openjdk.org Tue Dec 19 08:27:40 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 19 Dec 2023 08:27:40 GMT Subject: RFR: JDK-8321266: Add diagnostic RSS threshold [v3] In-Reply-To: References: <3pfgWe1NIoMrOXlGqLsyJCsgPgMZ6AJtlxSy64o76o8=.ecc470d4-12c2-4b1b-9da9-1155ceb8329e@github.com> Message-ID: On Mon, 18 Dec 2023 16:59:26 GMT, Thomas Stuefe wrote: >>> I cannot just use scanf with %f since that would also parse values without decimal point that are meant to be absolute. >> >> 0.0 -.999... == % else absolute ? > >> > I cannot just use scanf with %f since that would also parse values without decimal point that are meant to be absolute. >> >> 0.0 -.999... == % else absolute ? > > Hi David, > > you have read my misgivings about using dot in command line arguments? E.g. on machines with a german locale, one would have to type a comma instead. That makes documentation cumbersome to write and conflicts with how we normally handle percentages at the command line. > > If you insist this is the best way, I will do this to get this PR to proceed; just thought I ask again. > Sorry @tstuefe I'm swamped at the moment. I don't understand the locale issue. We have double flags already and it is not considered an issue. But in any case I don't have the time to continue the debate so stick with your multiple flags. No problem at all, @dholmes-ora, and thanks for your relentless review work. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16938#issuecomment-1862316785 From gcao at openjdk.org Tue Dec 19 08:36:02 2023 From: gcao at openjdk.org (Gui Cao) Date: Tue, 19 Dec 2023 08:36:02 GMT Subject: [jdk22] RFR: 8322154: RISC-V: JDK-8315743 missed change in MacroAssembler::load_reserved Message-ID: Clean backport which adds back missing code change in MacroAssembler::load_reserved in file src/hotspot/cpu/riscv/macroAssembler_riscv.cpp for https://bugs.openjdk.org/browse/JDK-8315743. This is a riscv-specific change, risk is low. ------------- Commit messages: - Backport 59073fa3eb7d04d9e0f08fbef70c9db6ffde296a Changes: https://git.openjdk.org/jdk22/pull/19/files Webrev: https://webrevs.openjdk.org/?repo=jdk22&pr=19&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322154 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk22/pull/19.diff Fetch: git fetch https://git.openjdk.org/jdk22.git pull/19/head:pull/19 PR: https://git.openjdk.org/jdk22/pull/19 From mbaesken at openjdk.org Tue Dec 19 09:05:41 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 19 Dec 2023 09:05:41 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: <4ZjWc8mVi3t9FRtCtyExPV2GeS7hDWKw7aD-R7G5FvU=.04ad00b2-bca0-4672-b97b-1159f8621b37@github.com> On Mon, 18 Dec 2023 09:07:28 GMT, Thomas Stuefe wrote: > I vote for doing both; the checkjni thing can be done in a separate RFE. Okay let's do both. 
For the JNI check I created this additional JBS issue https://bugs.openjdk.org/browse/JDK-8322366 8322366: Add IEEE rounding mode corruption check to JNI checks ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1862371140 From mbaesken at openjdk.org Tue Dec 19 09:16:47 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 19 Dec 2023 09:16:47 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Mon, 18 Dec 2023 09:16:07 GMT, Thomas Stuefe wrote: >> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: >> >> Adjust macOS coding > > src/hotspot/os/bsd/os_bsd.cpp line 1007: > >> 1005: assert(rtn == 0, "fegetenv must succeed"); >> 1006: #endif // IA32 >> 1007: > > It's difficult to see what exactly changed on MacOS. Is this restructuring necessary? I wanted to bring the placement of the JFR event and also the UL logging of Linux (dlopen_helper) and BSD/macOS closer together. With the new structure it is more 'uniform'. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16903#discussion_r1431122434 From tonyp at openjdk.org Tue Dec 19 09:21:41 2023 From: tonyp at openjdk.org (Antonios Printezis) Date: Tue, 19 Dec 2023 09:21:41 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v2] In-Reply-To: References: Message-ID: <-lnr9q8mIdLUgSU-pmItpKy3b1Ur-B_B4p6SZxoq-UM=.0a5b570f-ac73-4e58-b6d7-aba08c8e1986@github.com> On Mon, 18 Dec 2023 14:04:52 GMT, Hamlin Li wrote: >> Hi, >> Can you review this patch to implement SHA-1 intrinsic for riscv? >> Thanks!
>> >> >> ## Test >> >> ### Functionality >> >> tests under `test/hotspot/jtreg/compiler/intrinsics/sha` >> tests found via `find test/jdk -iname "*SHA1*.java"` >> >> ### Performance >> >> tested on `T-HEAD Light Lichee Pi 4A` >> >> benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`. >> >> **when intrinsic is enabled** >> >> o.o.b.java.security.GetMessageDigest.cloneInstance N/A N/A SHA-1 N/A N/A avgt 10 489.860 ± 6.277 ns/op >> o.o.b.java.security.GetMessageDigest.getInstance N/A N/A SHA-1 N/A N/A avgt 10 3477.197 ± 204.203 ns/op >> o.o.b.java.security.GetMessageDigest.getInstanceWithProvider N/A N/A SHA-1 N/A N/A avgt 10 4111.164 ± 108.861 ns/op >> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 10 3454.207 ± 53.924 ns/op >> o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 10 184063.834 ± 677.635 ns/op >> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 10 8260.011 ± 150.045 ns/op >> o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 10 191325.246 ± 3298.882 ns/op >> o.o.b.javax.crypto.full.MacBench.mac HmacSHA1 128 N/A N/A avgt 10 8220.886 ± 53.684 ns/op >> o.o.b.javax.crypto.full.MacBench.mac HmacSHA1 1024 N/A N/A avgt 10 18006.955 ± 92.432 ns/op >> o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 10 11688843.558 ± 34924.678 ns/op >> >> >> **when intrinsic is disabled** >> >> o.o.b.java.security.GetMessageDigest.cloneInstance N/A N/A SHA-1 N/A N/A avgt 10 496.890 ± 6.695 ns/op >> o.o.b.java.security.GetMessageDigest.getInstance ...
> > Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: > > Add some comments src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4513: > 4511: // Maj(x, y, z) = (x & y) ^ (x & z) ^ (y & z) , 40 <= t <= 59 > 4512: // Parity(x, y, z) = x ^ y ^ z , 60 <= t <= 79 > 4513: void sha1_f(int round, Register dst, Register x, Register y, Register z, Register tmp) { Ditto re: tmp (just use t0 or t1 directly) src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4542: > 4540: // b = a > 4541: // a = T > 4542: void sha1_process_round(int round, Register a, Register b, Register c, Register d, Register e, Ditto re: tmp1 / tmp2 (use t0 / t1) src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4714: > 4712: } > 4713: > 4714: int64_t mask32 = 0xffffffff; See earlier comment. You can save a few instructions by assigning mask32 to a register and use `andr`. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1431060923 PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1431061207 PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1431073155 From vkempik at openjdk.org Tue Dec 19 09:27:37 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Tue, 19 Dec 2023 09:27:37 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions In-Reply-To: References: <4s2uV2ELJ2B28JUN1lIepQVVg9Gbg3DfY5g37sNyuuM=.222b5dd0-3029-41b4-815e-d5dd883e71c0@github.com> Message-ID: On Tue, 19 Dec 2023 07:58:52 GMT, Robbin Ehn wrote: > > We already have "macroses" for load and stores in macroAssembler_riscv.hpp, what's the reason to do compression decision in assembler_riscv.hpp instead ( not saying it's wrong) ? > > https://github.com/openjdk/jdk/blob/38d94725a1a85156e30b72b325886b0e25d4db03/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp#L880 > > No, you are correct I also think this is not optimal. 
I don't know the background, but it seems like this is the easiest way to add compressed transparently. But to fully utilize C instruction we should favor the x8->x15, we often don't get C due to e.g. BCP is in x22. I think to be able to better utilize C we can't have it so transparent. > > So here I just try to follow the current code, see how lw is changed to c_lw. Not exactly related to this PR, but I also saw a strange behaviour from MacroAssembler's lwu. it was generating lw + and ( a kind of lwu emulation) instead of lwu an example 0.44% ? 0x0000003fa46a86c8: slli t3,t3,0x20 0.48% ? 0x0000003fa46a86ca: addi t3,t3,-1 .... 3.11% ? 0x0000003fa46a86dc: lw a0,0(t1) 5.34% ? 0x0000003fa46a86e0: and a0,a0,t3 Using Assembler::lwu directly resulted in a correctly generated lwu ------------- PR Comment: https://git.openjdk.org/jdk/pull/17122#issuecomment-1862404200 From egahlin at openjdk.org Tue Dec 19 09:58:41 2023 From: egahlin at openjdk.org (Erik Gahlin) Date: Tue, 19 Dec 2023 09:58:41 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 09:05:22 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. >> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . 
> > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust macOS coding If we believe this to be a reoccurring problem in the foreseeable future, I'm fine with adding the fields. I'm also in favour of adding logging, which I think is better suited for backports than changes to event metadata. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1862450328 From rehn at openjdk.org Tue Dec 19 10:01:39 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Tue, 19 Dec 2023 10:01:39 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions In-Reply-To: References: <4s2uV2ELJ2B28JUN1lIepQVVg9Gbg3DfY5g37sNyuuM=.222b5dd0-3029-41b4-815e-d5dd883e71c0@github.com> Message-ID: On Tue, 19 Dec 2023 09:25:19 GMT, Vladimir Kempik wrote: > > > We already have "macroses" for load and stores in macroAssembler_riscv.hpp, what's the reason to do compression decision in assembler_riscv.hpp instead ( not saying it's wrong) ? > > > https://github.com/openjdk/jdk/blob/38d94725a1a85156e30b72b325886b0e25d4db03/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp#L880 > > > > > > No, you are correct I also think this is not optimal. I don't know the background, but it seems like this is the easiest way to add compressed transparently. But to fully utilize C instruction we should favor the x8->x15, we often don't get C due to e.g. BCP is in x22. I think to be able to better utilize C we can't have it so transparent. > > So here I just try to follow the current code, see how lw is changed to c_lw. > > Not exactly related to this PR, but I also saw a strange behaviour from MacroAssembler's lwu. it was generating lw + and ( a kind of lwu emulation) instead of lwu > > an example > > ``` > 0.44% ? 0x0000003fa46a86c8: slli t3,t3,0x20 > 0.48% ? 0x0000003fa46a86ca: addi t3,t3,-1 > .... > 3.11% ? 0x0000003fa46a86dc: lw a0,0(t1) > 5.34% ? 
0x0000003fa46a86e0: and a0,a0,t3 > ``` > > Using Assembler::lwu directly resulted in a correctly generated lwu Yes, I have seen similar things. 0x00002aaabc9464fc: addiw ra,ra,-1365 # 0x00000000000aaaab 0x00002aaabc946500: slli ra,ra,0xd 0x00002aaabc946502: addi ra,ra,-929 0x00002aaabc946506: slli ra,ra,0xd 0x00002aaabc946508: addi ra,ra,456 0x00002aaabc94650c: jalr ra As "111001000" would fit in the signed 12imm to jalr I think this is sub-optimal. I can go over and fix them, I'll create jira. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17122#issuecomment-1862454161 From stuefe at openjdk.org Tue Dec 19 10:02:40 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 19 Dec 2023 10:02:40 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 09:05:22 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. >> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust macOS coding Looks good to me, provided @egahlin is okay with this. ------------- Marked as reviewed by stuefe (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16903#pullrequestreview-1788510441 From tschatzl at openjdk.org Tue Dec 19 10:16:55 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Tue, 19 Dec 2023 10:16:55 GMT Subject: RFR: 8319548: Unexpected internal name for Filler array klass causes error in VisualVM Message-ID: Hi all, please review this change that changes the filler array class name (again) after user feedback. In particular, the previous name `Ljdk/internal/vm/FillerArray;` confuses some tools (https://github.com/oracle/visualvm/issues/523). I.e. it's not an array, but still variable sized. This change adds the `[` array bracket, and renames the element name to not have `Array` inside to not try to pretend that the element is some other kind of array. Testing: tier1-6 Thanks, Thomas ------------- Commit messages: - final name change? - Fix test - different attempt - suggestion Changes: https://git.openjdk.org/jdk/pull/17155/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17155&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8319548 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17155.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17155/head:pull/17155 PR: https://git.openjdk.org/jdk/pull/17155 From ayang at openjdk.org Tue Dec 19 11:01:41 2023 From: ayang at openjdk.org (Albert Mingkun Yang) Date: Tue, 19 Dec 2023 11:01:41 GMT Subject: RFR: 8319548: Unexpected internal name for Filler array klass causes error in VisualVM In-Reply-To: References: Message-ID: <04pGtzKg8nb0VVmPooIYvEh9S9ljS_ABctMEwMznH6w=.209a5bbe-a1e4-4180-ae02-51d013ca8dbf@github.com> On Tue, 19 Dec 2023 10:08:14 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that changes the filler array class name (again) after user feedback. > > In particular, the previous name `Ljdk/internal/vm/FillerArray;` confuses some tools (https://github.com/oracle/visualvm/issues/523). I.e. 
it's not an array, but still variable sized. > This change adds the `[` array bracket, and renames the element name to not have `Array` inside to not try to pretend that the element is some other kind of array. > > Testing: tier1-6 > > Thanks, > Thomas Marked as reviewed by ayang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17155#pullrequestreview-1788619431 From sroy at openjdk.org Tue Dec 19 12:40:51 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 19 Dec 2023 12:40:51 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 11:51:43 GMT, Joachim Kern wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> followed the proposals > > The libpath parsing code is from me, so no license problems. Hi @JoKern65 Is this good to integrate now ? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16920#issuecomment-1862684040 From jkern at openjdk.org Tue Dec 19 12:44:52 2023 From: jkern at openjdk.org (Joachim Kern) Date: Tue, 19 Dec 2023 12:44:52 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 12:37:33 GMT, Suchismith Roy wrote: >> The libpath parsing code is from me, so no license problems. > > Hi @JoKern65 Is this good to integrate now ? Hi @suchismith1993, I'm waiting for a second review. Complex hotspot changes should be reviewed twice. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16920#issuecomment-1862690708 From stuefe at openjdk.org Tue Dec 19 12:49:42 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 19 Dec 2023 12:49:42 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v6] In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 13:33:46 GMT, Thomas Stuefe wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> Followed Thomas proposals > > Well done. > > Releasing the mutex before asserting is not necessary; we don't pull the handle table lock as part of error reporting. > @tstuefe Sorry to tag you. Can you review the code. Once this code goes in I can push in my changes. We are targeting the fix for January. > Hi @JoKern65 Is this good to integrate now ? @suchismith1993 Please don't put pressure on patch authors and developers. There is zero reason why this patch should be rushed. > Hi @suchismith1993, I'm waiting for a second review. Complex hotspot changes should be reviewed twice. Not only that, hotspot changes *need* to be reviewed by at least two reviewers. That is not optional. See OpenJDK bylaws. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16920#issuecomment-1862695052 From stuefe at openjdk.org Tue Dec 19 12:54:43 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 19 Dec 2023 12:54:43 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 12:37:33 GMT, Suchismith Roy wrote: >> The libpath parsing code is from me, so no license problems. > > Hi @JoKern65 Is this good to integrate now ? @suchismith1993 > Once this code goes in I can push in my changes. We are targeting the fix for January. If you talk about https://github.com/openjdk/jdk/pull/16604, no, you cannot push that even if Joachim finishes his work. 
Your patch has not even a single review, is quite controversial, and none of the issues the reviewers have mentioned are addressed. This needs a lot more discussion time. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16920#issuecomment-1862704694 From sroy at openjdk.org Tue Dec 19 13:36:51 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 19 Dec 2023 13:36:51 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 12:37:33 GMT, Suchismith Roy wrote: >> The libpath parsing code is from me, so no license problems. > > Hi @JoKern65 Is this good to integrate now ? > @suchismith1993 > > > Once this code goes in I can push in my changes. We are targeting the fix for January. > > If you talk about #16604, no, you cannot push that even if Joachim finishes his work. > > Your patch has not even a single review, is quite controversial, and none of the issues the reviewers have mentioned are addressed. This needs a lot more discussion time. I have the patch ready based on the changes in this patch, as I take the diff and apply. But I cannot push since it will end up adding the entire file. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16920#issuecomment-1862768974 From sroy at openjdk.org Tue Dec 19 13:43:51 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 19 Dec 2023 13:43:51 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v4] In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 12:52:23 GMT, Thomas Stuefe wrote: >> Hi @JoKern65 Is this good to integrate now ? > > @suchismith1993 > >> Once this code goes in I can push in my changes. We are targeting the fix for January. > > If you talk about https://github.com/openjdk/jdk/pull/16604, no, you cannot push that even if Joachim finishes his work. 
> > Your patch has not even a single review, is quite controversial, and none of the issues the reviewers have mentioned are addressed. This needs a lot more discussion time. > > @tstuefe Sorry to tag you. Can you review the code. Once this code goes in I can push in my changes. > > We are targeting the fix for January. > > > Hi @JoKern65 Is this good to integrate now ? > > @suchismith1993 Please don't put pressure on patch authors and developers. There is zero reason why this patch should be rushed. > > > Hi @suchismith1993, I'm waiting for a second review. Complex hotspot changes should be reviewed twice. > > Not only that, hotspot changes _need_ to be reviewed by at least two reviewers. That is not optional. See OpenJDK bylaws. Sorry about that. The fix was critical for the adoptium builds and hence was looking to fix this soon. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16920#issuecomment-1862776678 From goetz at openjdk.org Tue Dec 19 13:50:53 2023 From: goetz at openjdk.org (Goetz Lindenmaier) Date: Tue, 19 Dec 2023 13:50:53 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v2] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> <5RZicS1WS5xiFzcJMhxg_Gjrtdc2I1c4vNMMb37OK-4=.e4ba7692-b18a-4b91-9b35-e444710e38b1@github.com> <7FfEZmI1lotj-z6P6mJtk-jH7vfiq_mO0EYtW2iHuGI=.033a826a-7083-48dc-882a-2ded7b8b0da1@github.com> Message-ID: On Tue, 28 Nov 2023 12:59:01 GMT, Suchismith Roy wrote: >>> > >>> >>> @tstuefe 3rd parameter to pass the either of 2 things: >>> >>> 1. The JvmTiAgent object "agent", so that after shifting the save_library_signature to os_aix,we can still access the agent object.-> For this i tried importing jvm/prims header file, but i get segmentation faults during build . Not sure if i am doing it the right way. >>> >>> 2. 
Pass a character buffer(and not const char*) where we copy the modified filename back to it and then use it in jvmAgent. code. >> >> Does not sound really appealing tbh. We pile one hack atop of another. >> >> Please synchronize with @JoKern65 at SAP. He will rewrite the JVMTI handler code, which will make this point moot. See https://bugs.openjdk.org/browse/JDK-8320890. > >> > > >> > >> > >> > @tstuefe 3rd parameter to pass the either of 2 things: >> > ``` >> > 1. The JvmTiAgent object "agent", so that after shifting the save_library_signature to os_aix,we can still access the agent object.-> For this i tried importing jvm/prims header file, but i get segmentation faults during build . Not sure if i am doing it the right way. >> > >> > 2. Pass a character buffer(and not const char*) where we copy the modified filename back to it and then use it in jvmAgent. code. >> > ``` >> >> Does not sound really appealing tbh. We pile one hack atop of another. >> >> Please synchronize with @JoKern65 at SAP. He will rewrite the JVMTI handler code, which will make this point moot. See https://bugs.openjdk.org/browse/JDK-8320890. > > Hi @tstuefe Should i then wait for this code to be integrated and then rewrite the .a handling ? > I mean this PR shall remain open then right ? > @JoKern65 Are you even handling the .a handling case ? i would like this PR to stay open. Maybe i can wait for the design change that you are working on. Hi @suchismith1993 , you can place this change on top of #16920 by comparing it with branch origin/pr/16920 instead of master. This way you might be able to proceed with your change. But as Thomas says you can only push if you have appropriate reviews. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1862791022 From sroy at openjdk.org Tue Dec 19 13:57:49 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 19 Dec 2023 13:57:49 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v2] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> <5RZicS1WS5xiFzcJMhxg_Gjrtdc2I1c4vNMMb37OK-4=.e4ba7692-b18a-4b91-9b35-e444710e38b1@github.com> <7FfEZmI1lotj-z6P6mJtk-jH7vfiq_mO0EYtW2iHuGI=.033a826a-7083-48dc-882a-2ded7b8b0da1@github.com> Message-ID: <4Owrf3BOOYfX2TRr_umNoiMCTAblzSc4Es44GUnC5Vc=.9fc5e71c-2b50-4617-b82f-fc3afeb9db21@github.com> On Tue, 28 Nov 2023 12:59:01 GMT, Suchismith Roy wrote: >>> > >>> >>> @tstuefe 3rd parameter to pass the either of 2 things: >>> >>> 1. The JvmTiAgent object "agent", so that after shifting the save_library_signature to os_aix,we can still access the agent object.-> For this i tried importing jvm/prims header file, but i get segmentation faults during build . Not sure if i am doing it the right way. >>> >>> 2. Pass a character buffer(and not const char*) where we copy the modified filename back to it and then use it in jvmAgent. code. >> >> Does not sound really appealing tbh. We pile one hack atop of another. >> >> Please synchronize with @JoKern65 at SAP. He will rewrite the JVMTI handler code, which will make this point moot. See https://bugs.openjdk.org/browse/JDK-8320890. > >> > > >> > >> > >> > @tstuefe 3rd parameter to pass the either of 2 things: >> > ``` >> > 1. The JvmTiAgent object "agent", so that after shifting the save_library_signature to os_aix,we can still access the agent object.-> For this i tried importing jvm/prims header file, but i get segmentation faults during build . Not sure if i am doing it the right way. >> > >> > 2. 
Pass a character buffer(and not const char*) where we copy the modified filename back to it and then use it in jvmAgent. code. >> > ``` >> >> Does not sound really appealing tbh. We pile one hack atop of another. >> >> Please synchronize with @JoKern65 at SAP. He will rewrite the JVMTI handler code, which will make this point moot. See https://bugs.openjdk.org/browse/JDK-8320890. > > Hi @tstuefe Should i then wait for this code to be integrated and then rewrite the .a handling ? > I mean this PR shall remain open then right ? > @JoKern65 Are you even handling the .a handling case ? i would like this PR to stay open. Maybe i can wait for the design change that you are working on. > Hi @suchismith1993 , you can place this change on top of #16920 by comparing it with branch origin/pr/16920 instead of master. This way you might be able to proceed with your change. But as Thomas says you can only push if you have appropriate reviews. Hi @GoeLin I am not sure how to do that . Could you tell me in brief ? Do I run the checkout command on the other PR and then place my change of top of it ? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1862802768 From fyang at openjdk.org Tue Dec 19 14:13:48 2023 From: fyang at openjdk.org (Fei Yang) Date: Tue, 19 Dec 2023 14:13:48 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions In-Reply-To: References: <4s2uV2ELJ2B28JUN1lIepQVVg9Gbg3DfY5g37sNyuuM=.222b5dd0-3029-41b4-815e-d5dd883e71c0@github.com> Message-ID: On Tue, 19 Dec 2023 09:58:15 GMT, Robbin Ehn wrote: > > > We already have "macroses" for load and stores in macroAssembler_riscv.hpp, what's the reason to do compression decision in assembler_riscv.hpp instead ( not saying it's wrong) ? > > > https://github.com/openjdk/jdk/blob/38d94725a1a85156e30b72b325886b0e25d4db03/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp#L880 > > > > > > No, you are correct I also think this is not optimal. 
I don't know the background, but it seems like this is the easiest way to add compressed transparently. But to fully utilize C instruction we should favor the x8->x15, we often don't get C due to e.g. BCP is in x22. I think to be able to better utilize C we can't have it so transparent. > > So here I just try to follow the current code, see how lw is changed to c_lw. > > Not exactly related to this PR, but I also saw a strange behaviour from MacroAssembler's lwu. it was generating lw + and ( a kind of lwu emulation) instead of lwu > > an example > > ``` > 0.44% ? 0x0000003fa46a86c8: slli t3,t3,0x20 > 0.48% ? 0x0000003fa46a86ca: addi t3,t3,-1 > .... > 3.11% ? 0x0000003fa46a86dc: lw a0,0(t1) > 5.34% ? 0x0000003fa46a86e0: and a0,a0,t3 > ``` > > Using Assembler::lwu directly resulted in a correctly generated lwu Interesting. This does not seem to reflect on the code of `MacroAssembler's lwu`. I wonder how could that happens. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17122#issuecomment-1862828341 From luhenry at openjdk.org Tue Dec 19 14:18:56 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Tue, 19 Dec 2023 14:18:56 GMT Subject: RFR: 8315856: RISC-V: Use Zacas extension for cmpxchg [v6] In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 05:53:50 GMT, Fei Yang wrote: > Thanks for the update. Seems fine given that it is an experimental feature for now. We will need more thorough tests (jcstress, benchmark, etc.) on real hardware when turning this into a product option. I'm sure we'll be the first one to thoroughly test it and make sure everything works as expected! 
Thanks ------------- PR Comment: https://git.openjdk.org/jdk/pull/16910#issuecomment-1862835558 From luhenry at openjdk.org Tue Dec 19 14:18:57 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Tue, 19 Dec 2023 14:18:57 GMT Subject: Integrated: 8315856: RISC-V: Use Zacas extension for cmpxchg In-Reply-To: References: Message-ID: On Thu, 30 Nov 2023 17:48:11 GMT, Ludovic Henry wrote: > 8315856: RISC-V: Use Zacas extension for cmpxchg This pull request has now been integrated. Changeset: 6313223b Author: Ludovic Henry URL: https://git.openjdk.org/jdk/commit/6313223bcd525aabf180813af76d500cf60893d3 Stats: 189 lines in 5 files changed: 160 ins; 4 del; 25 mod 8315856: RISC-V: Use Zacas extension for cmpxchg Reviewed-by: rehn, fyang ------------- PR: https://git.openjdk.org/jdk/pull/16910 From vkempik at openjdk.org Tue Dec 19 14:27:46 2023 From: vkempik at openjdk.org (Vladimir Kempik) Date: Tue, 19 Dec 2023 14:27:46 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions In-Reply-To: References: <4s2uV2ELJ2B28JUN1lIepQVVg9Gbg3DfY5g37sNyuuM=.222b5dd0-3029-41b4-815e-d5dd883e71c0@github.com> Message-ID: <36AfaIsKlwYLkXYHg4QFA7c-aSP3Tvvy4amp9Ayg5PQ=.caf6e7bd-fdcb-40ef-a542-76258f646cb7@github.com> On Tue, 19 Dec 2023 14:10:13 GMT, Fei Yang wrote: > > > > We already have "macroses" for load and stores in macroAssembler_riscv.hpp, what's the reason to do compression decision in assembler_riscv.hpp instead ( not saying it's wrong) ? > > > > https://github.com/openjdk/jdk/blob/38d94725a1a85156e30b72b325886b0e25d4db03/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp#L880 > > > > > > > > > No, you are correct I also think this is not optimal. I don't know the background, but it seems like this is the easiest way to add compressed transparently. But to fully utilize C instruction we should favor the x8->x15, we often don't get C due to e.g. BCP is in x22. I think to be able to better utilize C we can't have it so transparent. 
> > > So here I just try to follow the current code, see how lw is changed to c_lw. > > > > > > Not exactly related to this PR, but I also saw a strange behaviour from MacroAssembler's lwu. it was generating lw + and ( a kind of lwu emulation) instead of lwu > > an example > > ``` > > 0.44% ? 0x0000003fa46a86c8: slli t3,t3,0x20 > > 0.48% ? 0x0000003fa46a86ca: addi t3,t3,-1 > > .... > > 3.11% ? 0x0000003fa46a86dc: lw a0,0(t1) > > 5.34% ? 0x0000003fa46a86e0: and a0,a0,t3 > > ``` > > > > > > > > > > > > > > > > > > > > > > > > Using Assembler::lwu directly resulted in a correctly generated lwu > > Interesting. This does not seem to reflect on the code of `MacroAssembler's lwu`. I wonder how could that happen. If you take this PR https://github.com/openjdk/jdk/pull/17046/files#diff-7a5c3ed05b6f3f06ed1c59f5fc2a14ec566a6a5bd1d09606115767daa99115bdR3717 and change explicit Assembler::lwu() to lwu() then you are likely to see this issue ------------- PR Comment: https://git.openjdk.org/jdk/pull/17122#issuecomment-1862857449 From mbaesken at openjdk.org Tue Dec 19 14:51:50 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Tue, 19 Dec 2023 14:51:50 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 09:05:22 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. 
>> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust macOS coding Hi Thomas, thanks for the review ! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1862899945 From goetz at openjdk.org Tue Dec 19 16:08:51 2023 From: goetz at openjdk.org (Goetz Lindenmaier) Date: Tue, 19 Dec 2023 16:08:51 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v2] In-Reply-To: <4Owrf3BOOYfX2TRr_umNoiMCTAblzSc4Es44GUnC5Vc=.9fc5e71c-2b50-4617-b82f-fc3afeb9db21@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> <5RZicS1WS5xiFzcJMhxg_Gjrtdc2I1c4vNMMb37OK-4=.e4ba7692-b18a-4b91-9b35-e444710e38b1@github.com> <7FfEZmI1lotj-z6P6mJtk-jH7vfiq_mO0EYtW2iHuGI=.033a826a-7083-48dc-882a-2ded7b8b0da1@github.com> <4Owrf3BOOYfX2TRr_umNoiMCTAblzSc4Es44GUnC5Vc=.9fc5e71c-2b50-4617-b82f-fc3afeb9db21@github.com> Message-ID: On Tue, 19 Dec 2023 13:55:09 GMT, Suchismith Roy wrote: > > Hi @suchismith1993 , you can place this change on top of #16920 by comparing it with branch origin/pr/16920 instead of master. This way you might be able to proceed with your change. But as Thomas says you can only push if you have appropriate reviews. > > Hi @GoeLin I am not sure how to do that . Could you tell me in brief ? Do I run the checkout command on the other PR and then place my change of top of it ? Yes, you can do that. You can also change the branch in this pr. Click edit on the top right. 
Choose an alternative for "openjdk:master" ------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1863051404 From epeter at openjdk.org Tue Dec 19 16:13:53 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Tue, 19 Dec 2023 16:13:53 GMT Subject: RFR: 8319115: GrowableArray: Do not initialize up to capacity In-Reply-To: References: Message-ID: On Fri, 1 Dec 2023 07:56:04 GMT, Emanuel Peter wrote: > Before this patch, we always initialized the GrowableArray up to its `capacity`, and not just up to `length`. This is problematic for a few reasons: > > - It is not expected. `std::vector` also only initializes the elements up to its size, and not to capacity. > - It requires a default-constructor for the element type. And the default-constructor is then used to fill in the elements between length and capacity. If the elements do any allocation themselves, then this is a waste of resources. > - The implementation also required the copy-assignment-operator for the element type. This is a lesser restriction. But the copy-assignment-operator was used in cases like `append` (where placement copy-construct would be expected), and not just in true assignment kinds of cases like `at_put`. > > For this reason, I reworked a lot of the methods to ensure that only the "slots" up to `length` are ever initialized, and the space between `length` and `capacity` is always garbage. > > ----- > > Also, before this patch, one can CHeap allocate both with `GrowableArray` and `GrowableArrayCHeap`. This is unnecessary. It required more complex verification in `GrowableArray` to deal with all cases. And `GrowableArrayCHeap` is already explicitly a smaller object, and should hence be preferred. Hence I changed all CHeap allocating cases of `GrowableArray` to `GrowableArrayCHeap`. This also allows for a clear separation: > - `GrowableArray` only deals with arena / resource area allocation. 
These are arrays that are regularly abandoned at the end of their use, rather than deleted or even cleared. > - `GrowableArrayCHeap` only deals with CHeap allocated memory. We expect that the destructor for it is called eventually, either when it goes out of scope or when `delete` is explicitly called. We expect that the elements could be allocating resources internally, and hence rely on the destructors for the elements being called, which may free up those internally allocated resources. > > Therefore, we now only allow `GrowableArrayCHeap` to have element types with non-trivial destructors, but `GrowableArray` checks that element types do not have non-trivial destructors (since it is common practice to just abandon arena / resource area allocated arrays, rather than calling the destructor or clearing the array, which also destructs all elements). This more clearly separates the two worlds: clean-up your own mess (CHeap) vs abandon your mess (arena / resource area). > > ----- > > I also completely refactored and improved ... Filed: JDK-8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap Will work on that first, and then come back here later. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16918#issuecomment-1863061669 From rkennke at openjdk.org Tue Dec 19 16:26:05 2023 From: rkennke at openjdk.org (Roman Kennke) Date: Tue, 19 Dec 2023 16:26:05 GMT Subject: RFR: 8322383: G1: Only preserve marks on objects that are actually moved Message-ID: The G1 full-GC preserves marks during marking, for all live objects in compaction region. However, not all live objects do actually move. In particular, the start of a compaction chain may have a sediment of all-live objects which would not move, and thus don't need to have their marks preserved. The problem can easily be solved by preserving marks during forwarding. That also seems a more natural place to do that. 
Testing: - [x] hotspot_gc - [ ] tier1 - [ ] tier2 ------------- Commit messages: - 8322383: G1: Only preserve marks on objects that are actually moved Changes: https://git.openjdk.org/jdk/pull/17159/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17159&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322383 Stats: 41 lines in 9 files changed: 17 ins; 13 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/17159.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17159/head:pull/17159 PR: https://git.openjdk.org/jdk/pull/17159 From sroy at openjdk.org Tue Dec 19 16:28:53 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 19 Dec 2023 16:28:53 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v2] In-Reply-To: <4Owrf3BOOYfX2TRr_umNoiMCTAblzSc4Es44GUnC5Vc=.9fc5e71c-2b50-4617-b82f-fc3afeb9db21@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> <5RZicS1WS5xiFzcJMhxg_Gjrtdc2I1c4vNMMb37OK-4=.e4ba7692-b18a-4b91-9b35-e444710e38b1@github.com> <7FfEZmI1lotj-z6P6mJtk-jH7vfiq_mO0EYtW2iHuGI=.033a826a-7083-48dc-882a-2ded7b8b0da1@github.com> <4Owrf3BOOYfX2TRr_umNoiMCTAblzSc4Es44GUnC5Vc=.9fc5e71c-2b50-4617-b82f-fc3afeb9db21@github.com> Message-ID: <8KgUJ7tjDJSJthb_9T1pAdJuRr88z_C9B714r5ZqJIE=.2443ed76-72b6-4e14-9821-6eea8f26c24d@github.com> On Tue, 19 Dec 2023 13:55:09 GMT, Suchismith Roy wrote: >>> > > >>> > >>> > >>> > @tstuefe 3rd parameter to pass the either of 2 things: >>> > ``` >>> > 1. The JvmTiAgent object "agent", so that after shifting the save_library_signature to os_aix,we can still access the agent object.-> For this i tried importing jvm/prims header file, but i get segmentation faults during build . Not sure if i am doing it the right way. >>> > >>> > 2. Pass a character buffer(and not const char*) where we copy the modified filename back to it and then use it in jvmAgent. code. >>> > ``` >>> >>> Does not sound really appealing tbh. 
We pile one hack atop of another. >>> >>> Please synchronize with @JoKern65 at SAP. He will rewrite the JVMTI handler code, which will make this point moot. See https://bugs.openjdk.org/browse/JDK-8320890. >> >> Hi @tstuefe Should i then wait for this code to be integrated and then rewrite the .a handling ? >> I mean this PR shall remain open then right ? >> @JoKern65 Are you even handling the .a handling case ? i would like this PR to stay open. Maybe i can wait for the design change that you are working on. > >> Hi @suchismith1993 , you can place this change on top of #16920 by comparing it with branch origin/pr/16920 instead of master. This way you might be able to proceed with your change. But as Thomas says you can only push if you have appropriate reviews. > > Hi @GoeLin I am not sure how to do that . Could you tell me in brief ? > Do I run the checkout command on the other PR and then place my change of top of it ? > > > Hi @suchismith1993 , you can place this change on top of #16920 by comparing it with branch origin/pr/16920 instead of master. This way you might be able to proceed with your change. But as Thomas says you can only push if you have appropriate reviews. > > > > > > Hi @GoeLin I am not sure how to do that . Could you tell me in brief ? Do I run the checkout command on the other PR and then place my change of top of it ? > > Yes, you can do that. You can also change the branch in this pr. Click edit on the top right. Choose an alternative for "openjdk:master" I see. So If I do that, will it reflect on my command line in local machine ? I mean I need run a rebase command with origin as pr/16920 ? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1863086746 From goetz at openjdk.org Tue Dec 19 16:50:52 2023 From: goetz at openjdk.org (Goetz Lindenmaier) Date: Tue, 19 Dec 2023 16:50:52 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v2] In-Reply-To: <5RZicS1WS5xiFzcJMhxg_Gjrtdc2I1c4vNMMb37OK-4=.e4ba7692-b18a-4b91-9b35-e444710e38b1@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> <5RZicS1WS5xiFzcJMhxg_Gjrtdc2I1c4vNMMb37OK-4=.e4ba7692-b18a-4b91-9b35-e444710e38b1@github.com> Message-ID: On Wed, 22 Nov 2023 16:24:24 GMT, Suchismith Roy wrote: >> J2SE agent does not start and throws error when it tries to find the shared library ibm_16_am. >> After searching for ibm_16_am.so ,the jvm agent throws and error as dll_load fails.It fails to identify the shared library ibm_16_am.a shared archive file on AIX. >> Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. > > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > change macro position try it! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1863123599 From sroy at openjdk.org Tue Dec 19 17:06:02 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 19 Dec 2023 17:06:02 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v3] In-Reply-To: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: > J2SE agent does not start and throws error when it tries to find the shared library ibm_16_am. 
> After searching for ibm_16_am.so, the jvm agent throws an error as dll_load fails. It fails to identify the shared library ibm_16_am.a shared archive file on AIX. > Hence we are providing a function which will additionally search for .a file on AIX, when the search for .so file fails. Suchismith Roy has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains eight commits: - Merge branch 'pr/16920' into jvmagent - change macro position - Adapt hotspot coding style - Improve comments and coding style. - Remove macro for file extension. - Move mapping function to aix specific file. - Introduce new macro for AIX archives. - Add support for .a extension in jvm agent. 1. Add support to load archive files and shared objects in jvm agent for AIX. ------------- Changes: https://git.openjdk.org/jdk/pull/16604/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=02 Stats: 13 lines in 3 files changed: 13 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16604.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16604/head:pull/16604 PR: https://git.openjdk.org/jdk/pull/16604 From kim.barrett at oracle.com Tue Dec 19 17:23:38 2023 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 19 Dec 2023 17:23:38 +0000 Subject: Use of C++ dynamic global object initialization with thread guards In-Reply-To: <87lea7d2o7.fsf@oldenburg.str.redhat.com> References: <87fs0izasf.fsf@oldenburg.str.redhat.com> <514F5A61-E3C2-4B24-A567-EF19C4292989@oracle.com> <87lea7d2o7.fsf@oldenburg.str.redhat.com> Message-ID: > On Dec 6, 2023, at 5:51 AM, Florian Weimer wrote: > > * Kim Barrett: > >>> The implementation of __cxa_guard_acquire is not entirely trivial >>> because it detects recursive initialization and throws >>> __gnu_cxx::recursive_init_error, which means that it pulls in the C++ >>> unwinder (at least with a traditional GNU/Linux build of libstdc++.a). >> >> Does it? Seems like it shouldn't.
We build with -fno-exceptions, and >> the definition of throw_recursive_init_exception is conditionalized on >> __cpp_exceptions, only throwing when that macro is defined. It calls >> __builtin_trap() if that macro isn't defined. > > With upstream GCC (and presumably most distributions), there's one > libstdc++.a with one implementation of __cxa_guard_acquire, and it's > built with exception support. > > It's supposed to be possible to build libstdc++ without exception > support, but upstream GCC doesn't do this automatically for you if the > target supports exception handling. In principle, the GCC specs > mechanism allows you to treat -fno-exceptions as a linker flag and link > against a custom no-exceptions build of libstdc++.a. > > Maybe this is what your toolchain is doing if you don't see the unwinder > symbols in your builds? It should be easy enough to check if you have a > build with a symbol table: look for a call in __cxa_throw in the > disassembly of __cxa_guard_acquire.cold or __cxa_guard_acquire. One of > our builds looks like this: I've verified that the same is happening in Oracle builds. We don't build an exception-disabled libstdc++ as part of our devkit either. So my next question is, exactly what is the harm, and how serious is it? So far, I don't know of anyone noticing a problem arising from this. Obviously, if someone writes an initializer that can lead to recursive entry, that would lead to an attempt to throw an exception. That's likely to have pretty bad consequences. OTOH, this doesn't seem like a problem we have in practice. I'm not sure I've ever seen such a problem arise (not just in HotSpot); after all, it's UB to do so. The relevant throwing code seems to be tagged as cold, so maybe it doesn't even get mapped into memory unless it gets invoked. And while it's true that we try to minimize reliance on the C++ runtime library, it's not a strict rule. It might not even be feasible to completely avoid.
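For readers following along, here is a minimal sketch of the construct this thread is about: a function-local static with a dynamic initializer. It is not HotSpot code. The names `Table`, `table()`, `init_calls` and `race_first_use()` are made up for illustration. With GCC or Clang on an Itanium-ABI target, and unless `-fno-threadsafe-statics` is given, the compiler is expected to wrap the initializer in calls to `__cxa_guard_acquire` and `__cxa_guard_release`, which is where the libstdc++ dependency discussed above comes from.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Counts how often the dynamic initializer below actually runs.
std::atomic<int> init_calls{0};

// Hypothetical type with a non-trivial (dynamic) initializer.
struct Table {
  int value;
  Table() : value(42) { init_calls.fetch_add(1); }
};

// Function-local static with dynamic initialization. The language
// guarantees that concurrent first callers wait until one thread has
// completed the construction; the guard calls emitted by the compiler
// implement that guarantee.
Table& table() {
  static Table t;
  return t;
}

// Hypothetical helper: race several threads on the first use of table()
// and report how many times the initializer ran in total.
int race_first_use(int nthreads) {
  std::vector<std::thread> threads;
  for (int i = 0; i < nthreads; i++) {
    threads.emplace_back([] { (void)table().value; });
  }
  for (auto& t : threads) {
    t.join();
  }
  return init_calls.load();
}
```

Calling `race_first_use(8)` should report a single initialization regardless of how the threads interleave. The recursive case mentioned above (an initializer that re-enters `table()` on the same thread) is deliberately not shown, since it is undefined behaviour.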
And we do periodically have discussions about permitting the use of additional C++ Standard Library features. Dependency on the runtime library is one of the things that comes up in those discussions. > With > increasing use of libstdc++ facilities in Hotspot, the libstdc++.a > variant may be the only feasible long-term approach that is both > maintainable on the GCC side and truly avoids an unwinder dependency. So long as we stay away from APIs that can throw (or uses that can throw), I'm not sure there's a problem? For example, if we were to permit the use of std::vector, we'd likely forbid the use of std::vector::at. We already avoid throwing std::bad_alloc from operator new implementations, instead either terminating or returning nullptr. > (I don't want to turn this into a Restaurant Sketch scenario -- there is > non-trivial libstdc++ usage beyond __cxa_guard_acquire in Hotspot. I > just wanted to start with a fairly simple example.) It would be interesting to know what else is there. From sspitsyn at openjdk.org Tue Dec 19 17:29:56 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Tue, 19 Dec 2023 17:29:56 GMT Subject: Integrated: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable In-Reply-To: References: Message-ID: On Thu, 7 Dec 2023 06:28:43 GMT, Serguei Spitsyn wrote: > This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame. > It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. > The deadlocking scenario is well described by Patricio in a bug report comment.
> In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. > > The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about begin/end of critical section. > This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. > > Some of the new notifications made with the `notifyJvmtiSync()` method are on a performance-critical path, which is why this method has been intrinsified. > > New test was developed by Patricio: > `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > The test is very nice as it reliably reproduces the deadlock 100% of the time without the fix. > The test never fails with this fix. > > Testing: > - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` > - tested with mach5 tiers 1-6 This pull request has now been integrated. Changeset: 0f8e4e0a Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/0f8e4e0a81257c678e948c341a241dc0b810494f Stats: 229 lines in 15 files changed: 196 ins; 0 del; 33 mod 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable Reviewed-by: lmesnik, alanb ------------- PR: https://git.openjdk.org/jdk/pull/17011 From stuefe at openjdk.org Tue Dec 19 17:34:21 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 19 Dec 2023 17:34:21 GMT Subject: RFR: JDK-8322475: Extend printing for System.map Message-ID: This is an expansion on the new `System.map` command introduced with JDK-8318636. We now print valuable information per memory region, such as: - the actual resident set size - the actual number of huge pages - the actual used page size - the THP state of the region (was advised, is eligible, uses THP, ...)
- whether the region is shared - whether the region had been committed (backed by swap) - whether the region has been swapped out. Example output: from to size rss hugetlb pgsz prot notes vm info/file 0x00000000c0000000 - 0x00000000ffe00000 1071644672 0 4194304 2M rw-p huge JAVAHEAP /anon_hugepage 0x00000000ffe00000 - 0x0000000100000000 2097152 0 0 2M rw-p huge JAVAHEAP /anon_hugepage 0x0000558016b67000 - 0x0000558016b68000 4096 4096 0 4K r--p /shared/projects/openjdk/jdk-jdk/output-fastdebug/images/jdk/bin/java 0x0000558016b68000 - 0x0000558016b69000 4096 4096 0 4K r-xp /shared/projects/openjdk/jdk-jdk/output-fastdebug/images/jdk/bin/java 0x00007f3a749f2000 - 0x00007f3a74c62000 2555904 2555904 0 4K rwxp CODE(CodeHeap 'profiled nmethods') 0x00007f3a74c62000 - 0x00007f3a7be51000 119468032 0 0 4K ---p nores CODE(CodeHeap 'profiled nmethods') 0x00007f3a7be51000 - 0x00007f3a7c1c1000 3604480 3604480 0 4K rwxp CODE(CodeHeap 'profiled nmethods') 0x00007f3a7c1c1000 - 0x00007f3a7c592000 4001792 0 0 4K ---p nores CODE(CodeHeap 'non-nmethods') 0x00007f3a7c592000 - 0x00007f3a7c802000 2555904 2555904 0 4K rwxp CODE(CodeHeap 'non-profiled nmethods') 0x00007f3a7c802000 - 0x00007f3a839f2000 119472128 0 0 4K ---p nores CODE(CodeHeap 'non-profiled nmethods') The summary section shows: - number of mappings - total vsize - total rss - total size of hugetlb memory - how much memory got merged to THPs - how much memory had been swapped out - used (dirty) pages by page size Example: (the machine uses THP mode "always" and the VM was started with +UseLargePages, therefore we see both static huge pages and THPs being used): Number of mappings: 334 vsize: 8649248768 (8248M) rss: 3318468608 (3164M) committed: 1431310336 (1365M) shared: 32768 (32768B) swapped out: 409600 (400K) using thp: 12582912 (12288K) hugetlb: 572522496 (546M) By page size: 4K: 810173 pages, 3318468608 bytes (3164M) 2M: 273 pages, 572522496 bytes (546M) ------------ Patch: - I simplified the back-and-forth between the 
OS-agnostic part of the printing and the OS-dependent part of the printing. - I removed the "human readable" option of the commands, since it was not of much use. ------------- Commit messages: - Extend System.map on Linux Changes: https://git.openjdk.org/jdk/pull/17158/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17158&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322475 Stats: 398 lines in 8 files changed: 271 ins; 85 del; 42 mod Patch: https://git.openjdk.org/jdk/pull/17158.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17158/head:pull/17158 PR: https://git.openjdk.org/jdk/pull/17158 From stuefe at openjdk.org Tue Dec 19 17:34:21 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Tue, 19 Dec 2023 17:34:21 GMT Subject: RFR: JDK-8322475: Extend printing for System.map In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 15:48:58 GMT, Thomas Stuefe wrote: > This is an expansion on the new `System.map` command introduced with JDK-8318636. > > We now print valuable information per memory region, such as: > > - the actual resident set size > - the actual number of huge pages > - the actual used page size > - the THP state of the region (was advised, is eligible, uses THP, ...) > - whether the region is shared > - whether the region had been committed (backed by swap) > - whether the region has been swapped out.
> > Example output: > > > from to size rss hugetlb pgsz prot notes vm info/file > 0x00000000c0000000 - 0x00000000ffe00000 1071644672 0 4194304 2M rw-p huge JAVAHEAP /anon_hugepage > 0x00000000ffe00000 - 0x0000000100000000 2097152 0 0 2M rw-p huge JAVAHEAP /anon_hugepage > 0x0000558016b67000 - 0x0000558016b68000 4096 4096 0 4K r--p /shared/projects/openjdk/jdk-jdk/output-fastdebug/images/jdk/bin/java > 0x0000558016b68000 - 0x0000558016b69000 4096 4096 0 4K r-xp /shared/projects/openjdk/jdk-jdk/output-fastdebug/images/jdk/bin/java > 0x00007f3a749f2000 - 0x00007f3a74c62000 2555904 2555904 0 4K rwxp CODE(CodeHeap 'profiled nmethods') > 0x00007f3a74c62000 - 0x00007f3a7be51000 119468032 0 0 4K ---p nores CODE(CodeHeap 'profiled nmethods') > 0x00007f3a7be51000 - 0x00007f3a7c1c1000 3604480 3604480 0 4K rwxp CODE(CodeHeap 'profiled nmethods') > 0x00007f3a7c1c1000 - 0x00007f3a7c592000 4001792 0 0 4K ---p nores CODE(CodeHeap 'non-nmethods') > 0x00007f3a7c592000 - 0x00007f3a7c802000 2555904 2555904 0 4K rwxp CODE(CodeHeap 'non-profiled nmethods') > 0x00007f3a7c802000 - 0x00007f3a839f200... @stefank this could interest you ------------- PR Comment: https://git.openjdk.org/jdk/pull/17158#issuecomment-1863203622 From sroy at openjdk.org Tue Dec 19 17:38:52 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 19 Dec 2023 17:38:52 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v2] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> <5RZicS1WS5xiFzcJMhxg_Gjrtdc2I1c4vNMMb37OK-4=.e4ba7692-b18a-4b91-9b35-e444710e38b1@github.com> Message-ID: On Tue, 19 Dec 2023 16:47:33 GMT, Goetz Lindenmaier wrote: > try it! error: failed to push some refs to 'github.com:suchismith1993/jdk.git' I tried the fork instructions . However I think I need to do a force push. Will that be fine since the PR is not in draft state ? 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1863210891 From mli at openjdk.org Tue Dec 19 17:40:51 2023 From: mli at openjdk.org (Hamlin Li) Date: Tue, 19 Dec 2023 17:40:51 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v2] In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 16:12:24 GMT, Antonios Printezis wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Add some comments > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4448: > >> 4446: >> 4447: if (round%2 == 0) { >> 4448: __ ld(ws[round/2], Address(buf, 0)); > > Instead of incrementing `buf` 8 times, could you just increment the offset (0, 8, 16, etc.) and only increment `buf` once per loop iteration? Good suggestion. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1431741088 From mli at openjdk.org Tue Dec 19 17:45:48 2023 From: mli at openjdk.org (Hamlin Li) Date: Tue, 19 Dec 2023 17:45:48 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v2] In-Reply-To: References: Message-ID: <5prBgYngQDbhsc7G2KphYH2VYKxeOcuDH6C8NDgGyvY=.99f6dea8-93b3-4ca2-bde3-3ff49b2f5c3f@github.com> On Mon, 18 Dec 2023 13:56:42 GMT, Antonios Printezis wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> Add some comments > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4592: > >> 4590: >> 4591: __ slli(tmp1, b, 32); >> 4592: __ andi(prev_ab, a, mask32, tmp2); > > I think this will materialize `mask32` in `tmp2` twice, once per `andi`, given that the value won't work as an intermediate. I'd do `__ mv(tmp2, mask32)` and use `__ andr(prev_ab, a, tmp2)` and `__ andr(prev_cd, c, tmp2)`. I think it will save 2-3 instructions here. No idea how performance-critical this section is, though! I assume not much? Good suggestion. 
Also, I now use a dedicated register to hold 0xffffffff, as the value is needed multiple times in the busy loop. > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4664: > >> 4662: >> 4663: RegSet saved_regs = RegSet::range(x18, x27); >> 4664: saved_regs += RegSet::of(t2); > > Do you need to save t2? Deleted. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1431743579 PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1431746005 From mli at openjdk.org Tue Dec 19 17:54:51 2023 From: mli at openjdk.org (Hamlin Li) Date: Tue, 19 Dec 2023 17:54:51 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v3] In-Reply-To: References: Message-ID: <2LzKv6TzZ3ZJDuLOm1GpNcgoCCfZgOqEOtWDNRQs7O0=.2ced11c4-bba6-4e15-bfa1-f0ca06d53610@github.com> > Hi, > Can you review this patch to implement SHA-1 intrinsic for riscv? > Thanks! > > > ## Test > > ### Functionality > > tests under `test/hotspot/jtreg/compiler/intrinsics/sha` > tests found via `find test/jdk -iname "*SHA1*.java"` > > ### Performance > > tested on `T-HEAD Light Lichee Pi 4A` > > benchmark tests `MessageDigests.java GetMessageDigest.java MessageDigestBench.java MacBench.java` which are under `test/micro/org/openjdk/bench/`. > > **when intrinsic is enabled** > > o.o.b.java.security.GetMessageDigest.cloneInstance N/A N/A SHA-1 N/A N/A avgt 10 489.860 ± 6.277 ns/op > o.o.b.java.security.GetMessageDigest.getInstance N/A N/A SHA-1 N/A N/A avgt 10 3477.197 ± 204.203 ns/op > o.o.b.java.security.GetMessageDigest.getInstanceWithProvider N/A N/A SHA-1 N/A N/A avgt 10 4111.164 ± 108.861 ns/op > o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 64 DEFAULT avgt 10 3454.207 ± 53.924 ns/op > o.o.b.java.security.MessageDigests.digest N/A N/A SHA-1 16384 DEFAULT avgt 10 184063.834 ± 677.635 ns/op > o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 64 DEFAULT avgt 10 8260.011 ±
150.045 ns/op > o.o.b.java.security.MessageDigests.getAndDigest N/A N/A SHA-1 16384 DEFAULT avgt 10 191325.246 ± 3298.882 ns/op > o.o.b.javax.crypto.full.MacBench.mac HmacSHA1 128 N/A N/A avgt 10 8220.886 ± 53.684 ns/op > o.o.b.javax.crypto.full.MacBench.mac HmacSHA1 1024 N/A N/A avgt 10 18006.955 ± 92.432 ns/op > o.o.b.javax.crypto.small.MessageDigestBench.digest SHA1 1048576 N/A N/A avgt 10 11688843.558 ± 34924.678 ns/op > > > **when intrinsic is disabled** > > o.o.b.java.security.GetMessageDigest.cloneInstance N/A N/A SHA-1 N/A N/A avgt 10 496.890 ± 6.695 ns/op > o.o.b.java.security.GetMessageDigest.getInstance N/A N/A SHA-1 N/A N/A avgt ... Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: round 1 review ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17130/files - new: https://git.openjdk.org/jdk/pull/17130/files/c4dc07be..42f838a9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17130&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17130&range=01-02 Stats: 54 lines in 1 file changed: 19 ins; 7 del; 28 mod Patch: https://git.openjdk.org/jdk/pull/17130.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17130/head:pull/17130 PR: https://git.openjdk.org/jdk/pull/17130 From mli at openjdk.org Tue Dec 19 17:54:53 2023 From: mli at openjdk.org (Hamlin Li) Date: Tue, 19 Dec 2023 17:54:53 GMT Subject: RFR: 8322179: RISC-V: Implement SHA-1 intrinsic [v3] In-Reply-To: References: Message-ID: <4x0jxhIkMvpcegIm_agHJUwzFLGvTe-hR_c-xampMfs=.05bd423b-51da-42fd-8250-cd1aa8af9c74@github.com> On Mon, 18 Dec 2023 13:45:36 GMT, Antonios Printezis wrote: >> Hamlin Li has updated the pull request incrementally with one additional commit since the last revision: >> >> round 1 review > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4585: > >> 4583: } >> 4584: >> 4585: void sha1_reserve_prev_abcde(Register a, Register b, Register c, Register d, Register e, > > I think it's safe to
just use t0 and t1 for intermediate results without passing them as args. I used to pass them as args too, but I changed that for md5. Yes, it is. But I found that it's not that clear in some situations, and it is error-prone, e.g. when the code is a bit complicated. So my policy here is to use built-in register names only in generate_sha1_implCompress; in all other places, I use the renamed ones explicitly. > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4627: > >> 4625: >> 4626: // c_rarg0 - c_rarg3: x10 - x13 >> 4627: Register buf = c_rarg0; > > You could copy the four arguments to a different set of registers and use a0 -> a3 for some of the other values to see if you can increase the number of compressed instructions that can be used. Unclear whether it's worth it or not. Yeah, I'm not sure if we should take this approach. The upside might be some code size reduction; the downside might be that the code is a bit confusing to read and maintain. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1431750068 PR Review Comment: https://git.openjdk.org/jdk/pull/17130#discussion_r1431757773 From sroy at openjdk.org Tue Dec 19 17:59:17 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 19 Dec 2023 17:59:17 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v4] In-Reply-To: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: > J2SE agent does not start and throws an error when it tries to find the shared library ibm_16_am. > After searching for ibm_16_am.so, the JVM agent throws an error as dll_load fails. It fails to identify the shared library ibm_16_am.a, a shared archive file on AIX. > Hence we are providing a function which will additionally search for the .a file on AIX when the search for the .so file fails.
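The fallback order described above (try the `.so` name first, then the `.a` shared archive) can be sketched as follows. This is only an illustrative Java sketch under stated assumptions, not the HotSpot change itself, which is AIX-specific C++ code: the class `LibraryNameFallback`, its methods, and the `tryLoad` predicate are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical helper: not part of the JDK or of the pull request.
final class LibraryNameFallback {
    private static final String SO_SUFFIX = ".so";
    private static final String ARCHIVE_SUFFIX = ".a";

    // Returns the names to try, in order: the requested name first,
    // then the ".a" variant if the requested name ends with ".so".
    static List<String> candidates(String libName) {
        List<String> names = new ArrayList<>();
        names.add(libName);
        if (libName.endsWith(SO_SUFFIX)) {
            String stem = libName.substring(0, libName.length() - SO_SUFFIX.length());
            names.add(stem + ARCHIVE_SUFFIX);
        }
        return names;
    }

    // Returns the first candidate accepted by tryLoad (a stand-in for the
    // real load attempt), or an empty Optional if every attempt fails.
    static Optional<String> firstLoadable(String libName, Predicate<String> tryLoad) {
        for (String name : candidates(libName)) {
            if (tryLoad.test(name)) {
                return Optional.of(name);
            }
        }
        return Optional.empty();
    }
}
```

The point of the sketch is that the `.a` name is attempted only after the `.so` attempt has failed, so the behaviour is unchanged wherever the `.so` file exists.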
Suchismith Roy has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains eight new commits since the last revision: - merge pr/16920 - change macro position - Adapt hotspot coding style - Improve comments and coding style. - Remove macro for file extension. - Move mapping function to aix specific file. - Introduce new macro for AIX archives. - Add support for .a extension in jvm agent. 1. Add support to load archive files and shared objects in jvm agent for AIX. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16604/files - new: https://git.openjdk.org/jdk/pull/16604/files/151f6c20..eb09224d Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=02-03 Stats: 3 lines in 2 files changed: 0 ins; 3 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16604.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16604/head:pull/16604 PR: https://git.openjdk.org/jdk/pull/16604 From duke at openjdk.org Tue Dec 19 18:04:50 2023 From: duke at openjdk.org (duke) Date: Tue, 19 Dec 2023 18:04:50 GMT Subject: Withdrawn: 8314644: Change "Rvalue references and move semantics" into an accepted feature In-Reply-To: References: Message-ID: On Tue, 22 Aug 2023 12:16:49 GMT, Johan Sjölen wrote: > Hi, > > I'd like to propose that rvalue references and move semantics are now considered permitted in the style guide. This change would allow for move constructors to be written. This enables more performant code, if the move ctr is less expensive than the copy ctr, but also more correct code. For the latter part, look at "8314571: GrowableArray should move its old data and not copy it".
Here we can avoid using copy assignment, instead using move constructors, which more accurately reflects what is happening: The old elements are in fact moved, and not copied. > > Two useful std functions will become available to us with this change: > > 1. `std::move`, for explicitly moving a value. This is a slightly more powerful `static_cast(T)`, in that it also handles `T&` correctly. > 2. `std::forward`, which simplifies the usage of perfect forwarding. Perfect forwarding is a technique wherein copying is minimized. To quote Scott Meyers ( https://cppandbeyond.com/2011/04/25/session-announcement-adventures-in-perfect-forwarding/ ): > >> Perfect forwarding is an important C++0x technique built atop rvalue references. It allows move semantics to be automatically applied, even when the source and the destination of a move are separated by intervening function calls. Common examples include constructors and setter functions that forward arguments they receive to the data members of the class they are initializing or setting, as well as standard library functions like make_shared, which "perfect-forwards" its arguments to the class constructor of whatever object the to-be-created shared_ptr is to point to. > > Looking forward to your feedback, thank you. > Johan This pull request has been closed without being integrated. ------------- PR: https://git.openjdk.org/jdk/pull/15386 From sgibbons at openjdk.org Tue Dec 19 18:42:19 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Tue, 19 Dec 2023 18:42:19 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v4] In-Reply-To: References: Message-ID: <6xRfGFR2RIYln31ivz1ZITs5G5bzQ5BlnPdsQeyVbX0=.2797afc7-40ec-482e-9c64-b3ba746d2bd7@github.com> > Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2.
The benchmark numbers: > > > Benchmark Score Latest > StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x > StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x > StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x > StringIndexOf.constantPattern 9.361 11.906 1.271872663x > StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x > StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x > StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x > StringIndexOf.success 9.186 9.713 1.057369911x > StringIndexOf.successBig 14.341 46.343 3.231504079x > StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x > StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x > StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x > StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x > StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x > StringIndexOfChar.latin1_Short_char 16013.389 16162.437 1.009307711x > StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x > StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Fix for JDK-8321599 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16753/files - new: https://git.openjdk.org/jdk/pull/16753/files/5e03173e..48088348 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=02-03 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/16753.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16753/head:pull/16753 PR: https://git.openjdk.org/jdk/pull/16753 From kvn at openjdk.org Tue Dec 19 19:03:51 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Tue, 19 Dec 2023 19:03:51 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes 
during code emission [v6] In-Reply-To: <4Gr1LLsOrG-7sJDE0mlR_x9QxrvQBMFzDe-atrmFAPs=.bf32dd9d-0abc-4de1-8ab7-3f12377e5098@github.com> References: <4Gr1LLsOrG-7sJDE0mlR_x9QxrvQBMFzDe-atrmFAPs=.bf32dd9d-0abc-4de1-8ab7-3f12377e5098@github.com> Message-ID: <6oWKE1GfvYTKQzDEptukFLwdxm1hTWOsMoieglnJgbg=.375a0b2a-fccb-4c18-b85f-4d831129d6e2@github.com> On Mon, 18 Dec 2023 20:19:49 GMT, Martin Doerr wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: >> >> - Merge with origin/master >> - Fix build, copyright dates, m4 files. >> - Fix merge >> - Catch up with master branch. >> >> Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Some inst_mark fixes; Catch up with master. >> - Catch up with changes on master >> - Reuse same C2_MacroAssembler object to emit instructions. > > Cleanup is not bad. Fewer objects and a bit shorter code at some places are an advantage. > Maybe Vladimir had some more reasons in mind when filing the issue. It's linked to https://bugs.openjdk.org/browse/JDK-8239472. It'd be nice if you or Vladimir could add a bit of motivation to the description of the PR or the JBS issue. @TheRealMDoerr motivation is to reduce memory consumption and speed up C2. `C2_MacroAssembler` is based on `ResourceObj` which allocates in compiler arena. Each small `C2_MacroAssembler masm()` will add allocation to arena until we finish compilation. Also speedup because we don't need to do such allocations. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-1863321348 From luhenry at openjdk.org Tue Dec 19 19:06:42 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Tue, 19 Dec 2023 19:06:42 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v8] In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 16:05:54 GMT, Olga Mikhaltsova wrote: >> Please review this implementation of the roundD/roundF intrinsics for the RISC-V platform. >> >> The table below shows that a NaN argument must be processed as a special case. >> >> RISC-V Java >> (FCVT.W.S) (FCVT.L.D) (long round(double a)) (int round(float a)) >> Minimum valid input (after rounding) -2^31 -2^63 Long.MIN_VALUE Integer.MIN_VALUE >> Maximum valid input (after rounding) 2^31 - 1 2^63 - 1 Long.MAX_VALUE Integer.MAX_VALUE >> Output for out-of-range negative input -2^31 -2^63 Long.MIN_VALUE Integer.MIN_VALUE >> Output for -∞ -2^31 -2^63 Long.MIN_VALUE Integer.MIN_VALUE >> Output for out-of-range positive input 2^31 - 1 2^63 - 1 Long.MAX_VALUE Integer.MAX_VALUE >> Output for +∞ 2^31 - 1 2^63 - 1 Long.MAX_VALUE Integer.MAX_VALUE >> Output for NaN 2^31 -
1 2^63 - 1 0 0 >> >> The benchmark run with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement: >> >> **Before** >> >> Benchmark (TESTSIZE) Mode Cnt Score Error Units >> FpRoundingBenchmark.test_round_double 2048 thrpt 15 59.555 0.179 ops/ms >> FpRoundingBenchmark.test_round_float 2048 thrpt 15 49.760 0.103 ops/ms >> >> >> **After** >> >> Benchmark (TESTSIZE) Mode Cnt Score Error Units >> FpRoundingBenchmark.test_round_double 2048 thrpt 15 110.956 0.186 ops/ms >> FpRoundingBenchmark.test_round_float 2048 thrpt 15 115.947 0.122 ops/ms > > Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: > > Used jint_cast/julong_cast; moved mv between feq and beqz Given the check over the entire f32 bit space, happy to see it's giving the right results. On the static vs dynamic rounding mode, AFAICT here it's using static rounding mode as the rounding mode gets embedded into the instruction. Did I misunderstand something here? And static mode will generally be better than dynamic mode for all the reasons outlined in the RVI spec. ------------- Marked as reviewed by luhenry (Committer). PR Review: https://git.openjdk.org/jdk/pull/16382#pullrequestreview-1789544966 From sroy at openjdk.org Tue Dec 19 19:17:03 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Tue, 19 Dec 2023 19:17:03 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v5] In-Reply-To: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: > J2SE agent does not start and throws an error when it tries to find the shared library ibm_16_am. > After searching for ibm_16_am.so, the JVM agent throws an error as dll_load fails. It fails to identify the shared library ibm_16_am.a, a shared archive file on AIX.
> Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. Suchismith Roy has updated the pull request incrementally with four additional commits since the last revision: - Change return type - Change dll load function signature that does dlopen - Remove AIX macros - Add wrapper function to check extension before dlopen ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16604/files - new: https://git.openjdk.org/jdk/pull/16604/files/eb09224d..cd7e0e64 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=03-04 Stats: 34 lines in 2 files changed: 22 ins; 11 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16604.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16604/head:pull/16604 PR: https://git.openjdk.org/jdk/pull/16604 From avoitylov at openjdk.org Tue Dec 19 21:34:53 2023 From: avoitylov at openjdk.org (Aleksei Voitylov) Date: Tue, 19 Dec 2023 21:34:53 GMT Subject: [jdk22] RFR: 8321515: ARM32: Move method resolution information out of the cpCache properly In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 11:54:53 GMT, Aleksei Voitylov wrote: > Hi all, > > This pull request contains a backport of commit [f573f6d2](https://github.com/openjdk/jdk/commit/f573f6d233d5ea1657018c3c806fee0fac382ac3) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. > > The commit being backported was authored by Aleksei Voitylov on 13 Dec 2023 and was reviewed by Aleksey Shipilev. > > Thanks! Thank you Aleksey! 
------------- PR Comment: https://git.openjdk.org/jdk22/pull/17#issuecomment-1863500591 From matsaave at openjdk.org Tue Dec 19 21:53:11 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Tue, 19 Dec 2023 21:53:11 GMT Subject: RFR: 8320276: Improve class initialization barrier in TemplateTable::_new Message-ID: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> The class initialization barrier in the TemplateTable::_new fast path check ensures that the class being instantiated is fully initialized. It can be improved by introducing an additional fast path check for the case where the current thread is the initializer thread, as MacroAssembler::clinit_barrier() does. This avoids repeated calls into the interpreter runtime for classes that are being initialized. This patch adds the optimization for x86 and aarch64. Verified with tier 1-5 tests. ------------- Commit messages: - Added aarch64 - Merge branch 'master' into class_init_new_8320276 - 8320276: Improve class initialization barrier in TemplateTable::_new Changes: https://git.openjdk.org/jdk/pull/17006/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17006&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8320276 Stats: 15 lines in 2 files changed: 6 ins; 2 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/17006.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17006/head:pull/17006 PR: https://git.openjdk.org/jdk/pull/17006 From dholmes at openjdk.org Tue Dec 19 22:22:48 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 19 Dec 2023 22:22:48 GMT Subject: RFR: 8322321: Add man page doc for -XX:+VerifySharedSpaces In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 06:19:15 GMT, Ioi Lam wrote: > `VerifySharedSpaces` was disabled by default in [JDK-8221478](https://bugs.openjdk.org/browse/JDK-8221478). We should add an entry in the "java" man page about the intended use for this flag. Text is generally fine. A couple of typos need fixing.
src/java.base/share/man/java.1 line 1584: > 1582: \f[V]-XX:+VerifySharedSpaces\f[R] > 1583: If this option is specified, the JVM will load a CDS archive file only > 1584: if it passes an integrity check based on CRC32 checkums. typo: checkums src/java.base/share/man/java.1 line 1585: > 1583: If this option is specified, the JVM will load a CDS archive file only > 1584: if it passes an integrity check based on CRC32 checkums. > 1585: The purpose of this flag is to check for unintentional damages of CDS s/damages of/damage to/ ------------- Changes requested by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17152#pullrequestreview-1789795401 PR Review Comment: https://git.openjdk.org/jdk/pull/17152#discussion_r1431990128 PR Review Comment: https://git.openjdk.org/jdk/pull/17152#discussion_r1431990558 From omikhaltcova at openjdk.org Tue Dec 19 22:45:13 2023 From: omikhaltcova at openjdk.org (Olga Mikhaltsova) Date: Tue, 19 Dec 2023 22:45:13 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v9] In-Reply-To: References: Message-ID: <4z1nbsMARrwj1y1o53A0huOQj6rxGZ6oUGIA_BFu8jI=.5238ee8f-5575-4b55-b811-b790312c22a4@github.com> > Please review this implementation of the roundD/roundF intrinsics for the RISC-V platform. > > The table below shows that a NaN argument must be processed as a special case. > > RISC-V Java > (FCVT.W.S) (FCVT.L.D) (long round(double a)) (int round(float a)) > Minimum valid input (after rounding) -2^31 -2^63 Long.MIN_VALUE Integer.MIN_VALUE > Maximum valid input (after rounding) 2^31 - 1 2^63 - 1 Long.MAX_VALUE Integer.MAX_VALUE > Output for out-of-range negative input -2^31 -2^63 Long.MIN_VALUE Integer.MIN_VALUE > Output for -∞ -2^31 -2^63 Long.MIN_VALUE Integer.MIN_VALUE > Output for out-of-range positive input 2^31 - 1 2^63 - 1 Long.MAX_VALUE Integer.MAX_VALUE > Output for +∞ 2^31 - 1 2^63 - 1 Long.MAX_VALUE Integer.MAX_VALUE > Output for NaN 2^31 -
1 2^63 - 1 0 0 > > The benchmark run with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement: > > **Before** > > Benchmark (TESTSIZE) Mode Cnt Score Error Units > FpRoundingBenchmark.test_round_double 2048 thrpt 15 59.555 0.179 ops/ms > FpRoundingBenchmark.test_round_float 2048 thrpt 15 49.760 0.103 ops/ms > > > **After** > > Benchmark (TESTSIZE) Mode Cnt Score Error Units > FpRoundingBenchmark.test_round_double 2048 thrpt 15 110.956 0.186 ops/ms > FpRoundingBenchmark.test_round_float 2048 thrpt 15 115.947 0.122 ops/ms Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: Replaced li with mv ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16382/files - new: https://git.openjdk.org/jdk/pull/16382/files/df70bcba..392671c1 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=07-08 Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16382.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16382/head:pull/16382 PR: https://git.openjdk.org/jdk/pull/16382 From iklam at openjdk.org Tue Dec 19 23:12:09 2023 From: iklam at openjdk.org (Ioi Lam) Date: Tue, 19 Dec 2023 23:12:09 GMT Subject: RFR: 8322321: Add man page doc for -XX:+VerifySharedSpaces [v2] In-Reply-To: References: Message-ID: > `VerifySharedSpaces` was disabled by default in [JDK-8221478](https://bugs.openjdk.org/browse/JDK-8221478). We should add an entry in the "java" man page about the intended use for this flag.
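As a rough illustration of what an integrity check based on CRC32 checksums catches (accidental damage, not deliberate tampering), here is a minimal Java sketch using the standard `java.util.zip.CRC32` class. It makes no claim about how the CDS archive check is implemented in HotSpot; the class `Crc32IntegrityCheck` and its methods are hypothetical names chosen for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical illustration: not the CDS implementation.
final class Crc32IntegrityCheck {
    // Computes the CRC32 checksum of the given bytes.
    static long crc32(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // True if the data still matches the checksum recorded earlier.
    static boolean matches(byte[] data, long expectedCrc) {
        return crc32(data) == expectedCrc;
    }

    public static void main(String[] args) {
        byte[] region = "123456789".getBytes(StandardCharsets.US_ASCII);
        long stored = crc32(region);                  // checksum recorded when the data was written
        System.out.println(matches(region, stored));  // true: data is intact
        region[0] ^= 0x01;                            // simulate accidental single-bit damage
        System.out.println(matches(region, stored));  // false: damage is detected
    }
}
```

A CRC is cheap to compute and reliably detects accidental corruption such as flipped bits, but anyone who can modify the data can also recompute the checksum, which is consistent with the flag being documented as a check for unintentional damage.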
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision:

  @dholmes-ora review - fixed typos

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/17152/files
  - new: https://git.openjdk.org/jdk/pull/17152/files/011a7990..c5a0288a

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=17152&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17152&range=00-01

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/17152.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17152/head:pull/17152

PR: https://git.openjdk.org/jdk/pull/17152

From omikhaltcova at openjdk.org Tue Dec 19 23:30:51 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Tue, 19 Dec 2023 23:30:51 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v9]
In-Reply-To: <4z1nbsMARrwj1y1o53A0huOQj6rxGZ6oUGIA_BFu8jI=.5238ee8f-5575-4b55-b811-b790312c22a4@github.com>
References: <4z1nbsMARrwj1y1o53A0huOQj6rxGZ6oUGIA_BFu8jI=.5238ee8f-5575-4b55-b811-b790312c22a4@github.com>
Message-ID:

On Tue, 19 Dec 2023 22:45:13 GMT, Olga Mikhaltsova wrote:

>> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform.
>>
>> In the table below it is shown that NaN argument should be processed as a special case.
>>
>>                                           RISC-V                    Java
>>                                           (FCVT.W.S)  (FCVT.L.D)    (long round(double a))  (int round(float a))
>> Minimum valid input (after rounding)      −2^31       −2^63         Long.MIN_VALUE          Integer.MIN_VALUE
>> Maximum valid input (after rounding)      2^31 − 1    2^63 − 1      Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for out-of-range negative input    −2^31       −2^63         Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for −∞                             −2^31       −2^63         Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for out-of-range positive input    2^31 − 1    2^63 - 1      Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for +∞                             2^31 − 1    2^63 - 1      Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for NaN                            2^31 − 1    2^63 - 1      0                       0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement::
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555  0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760  0.103  ops/ms
>>
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956  0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947  0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
>   Replaced li with mv

Yes, we use here static rounding (RDN = Round Down (towards −∞)). I feel like not fully getting the above mentioned concerning the dynamic rounding.

There is obviously no need, but to keep accurate results I've remeasured them for the final version:

**VisionFive2**

Before

Benchmark                              (TESTSIZE)   Mode  Cnt   Score   Error   Units
FpRoundingBenchmark.test_round_double        2048  thrpt   15  39.351 ± 0.150  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15  39.323 ± 0.192  ops/ms

After

Benchmark                              (TESTSIZE)   Mode  Cnt   Score   Error   Units
FpRoundingBenchmark.test_round_double        2048  thrpt   15  38.917 ± 0.129  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15  50.300 ± 0.032  ops/ms

**T-Head**

Before

Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.853  0.227  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.889  0.145  ops/ms

After

Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
FpRoundingBenchmark.test_round_double        2048  thrpt   15  117.243  4.554  ops/ms
FpRoundingBenchmark.test_round_float         2048  thrpt   15  121.064  0.274  ops/ms

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1863611977

From omikhaltcova at openjdk.org Tue Dec 19 23:30:53 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Tue, 19 Dec 2023 23:30:53 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v8]
In-Reply-To:
References:
Message-ID:

On Tue, 19 Dec 2023 01:03:54 GMT, Fei Yang wrote:

>> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Used jint_cast/julong_cast; moved mv between feq and beqz
>
> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4264:
>
>> 4262: void MacroAssembler::java_round_float(Register dst, FloatRegister src, FloatRegister ftmp) {
>> 4263:   Label done;
>> 4264:   li(t0, jint_cast(0.5f));
>
> Nit: Can you change this `li` into `mv`? That will be consistent with other places where we move an immediate.

Fixed. Thanks!

> src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 4281:
>
>> 4279: void MacroAssembler::java_round_double(Register dst, FloatRegister src, FloatRegister ftmp) {
>> 4280:   Label done;
>> 4281:   li(t0, julong_cast(0.5));
>
> Same as above here.

Fixed. Thanks!
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1432008377 PR Review Comment: https://git.openjdk.org/jdk/pull/16382#discussion_r1432008488 From dholmes at openjdk.org Tue Dec 19 23:37:48 2023 From: dholmes at openjdk.org (David Holmes) Date: Tue, 19 Dec 2023 23:37:48 GMT Subject: RFR: 8322321: Add man page doc for -XX:+VerifySharedSpaces [v2] In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 23:12:09 GMT, Ioi Lam wrote: >> `VerifySharedSpaces` was disabled in [JDK-8221478](https://bugs.openjdk.org/browse/JDK-8221478) by default. We should add an entry in the "java" man page about the intended use for this flag. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @dholmes-ora review - fixed typos src/java.base/share/man/java.1 line 1585: > 1583: If this option is specified, the JVM will load a CDS archive file only > 1584: if it passes an integrity check based on CRC32 checksums. > 1585: The purpose of this flag is to check for unintentional damages to CDS s/damages/damage/ ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17152#discussion_r1432036908 From luhenry at openjdk.org Wed Dec 20 00:22:50 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Wed, 20 Dec 2023 00:22:50 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v9] In-Reply-To: <4z1nbsMARrwj1y1o53A0huOQj6rxGZ6oUGIA_BFu8jI=.5238ee8f-5575-4b55-b811-b790312c22a4@github.com> References: <4z1nbsMARrwj1y1o53A0huOQj6rxGZ6oUGIA_BFu8jI=.5238ee8f-5575-4b55-b811-b790312c22a4@github.com> Message-ID: <1H--7S6qBSaaKuyPmxSoepwJNna8HxFNZwjnmJSjIP8=.c5868375-acf0-4503-9698-122131c25341@github.com> On Tue, 19 Dec 2023 22:45:13 GMT, Olga Mikhaltsova wrote: >> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform. >> >> In the table below it is shown that NaN argument should be processed as a special case. 
>>
>>                                           RISC-V                    Java
>>                                           (FCVT.W.S)  (FCVT.L.D)    (long round(double a))  (int round(float a))
>> Minimum valid input (after rounding)      −2^31       −2^63         Long.MIN_VALUE          Integer.MIN_VALUE
>> Maximum valid input (after rounding)      2^31 − 1    2^63 − 1      Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for out-of-range negative input    −2^31       −2^63         Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for −∞                             −2^31       −2^63         Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for out-of-range positive input    2^31 − 1    2^63 - 1      Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for +∞                             2^31 − 1    2^63 - 1      Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for NaN                            2^31 − 1    2^63 - 1      0                       0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement::
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555  0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760  0.103  ops/ms
>>
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956  0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947  0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
>   Replaced li with mv

Marked as reviewed by luhenry (Committer).
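
The NaN row of the quoted table is the case the intrinsic has to special-case: the hardware conversion alone yields the maximum value for NaN, whereas Java's `Math.round` returns 0. A minimal portable sketch of the Java-side semantics follows. It is not the RISC-V assembly from the patch: the name `java_round_float_ref` is made up for illustration, and it widens to `double` rather than using the RDN rounding mode mentioned in the thread.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical reference model of Java's Math.round(float); not HotSpot code.
int32_t java_round_float_ref(float x) {
  if (std::isnan(x)) {
    return 0;  // Java: NaN rounds to 0 (the FCVT result alone would be 2^31 - 1)
  }
  // Add 0.5 in double precision so the sum is not rounded up across an
  // integer boundary (e.g. for 8388609.0f), then round towards -infinity.
  const double r = std::floor(static_cast<double>(x) + 0.5);
  // Saturate like Java does for out-of-range values and infinities.
  if (r <= static_cast<double>(std::numeric_limits<int32_t>::min())) {
    return std::numeric_limits<int32_t>::min();
  }
  if (r >= static_cast<double>(std::numeric_limits<int32_t>::max())) {
    return std::numeric_limits<int32_t>::max();
  }
  return static_cast<int32_t>(r);
}
```

Ties go towards positive infinity in this model, so 2.5f maps to 3 and -2.5f maps to -2, which is what the "add 0.5, then round down" scheme is meant to produce.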
------------- PR Review: https://git.openjdk.org/jdk/pull/16382#pullrequestreview-1789892497 From fyang at openjdk.org Wed Dec 20 00:46:48 2023 From: fyang at openjdk.org (Fei Yang) Date: Wed, 20 Dec 2023 00:46:48 GMT Subject: [jdk22] RFR: 8322154: RISC-V: JDK-8315743 missed change in MacroAssembler::load_reserved In-Reply-To: References: Message-ID: <6N7VbvE9dr0-08boMhEdpvvMMvurL_e-xUD1kp7uSVY=.77cf21f4-fa12-4c9a-bb9a-9d5d9b794ecd@github.com> On Tue, 19 Dec 2023 08:30:43 GMT, Gui Cao wrote: > Clean backport which adds back missing code change in MacroAssembler::load_reserved in file src/hotspot/cpu/riscv/macroAssembler_riscv.cpp for https://bugs.openjdk.org/browse/JDK-8315743. This is a riscv-specific change, risk is low. > > - [x] Run tier1 tests on qemu 8.1.50 with UseRVV (release) Marked as reviewed by fyang (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk22/pull/19#pullrequestreview-1789906535 From iklam at openjdk.org Wed Dec 20 00:57:02 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 20 Dec 2023 00:57:02 GMT Subject: RFR: 8322321: Add man page doc for -XX:+VerifySharedSpaces [v3] In-Reply-To: References: Message-ID: > `VerifySharedSpaces` was disabled in [JDK-8221478](https://bugs.openjdk.org/browse/JDK-8221478) by default. We should add an entry in the "java" man page about the intended use for this flag. 
Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: @dholmes-ora review #2 - fixed typos ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17152/files - new: https://git.openjdk.org/jdk/pull/17152/files/c5a0288a..ca07c362 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17152&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17152&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17152.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17152/head:pull/17152 PR: https://git.openjdk.org/jdk/pull/17152 From dholmes at openjdk.org Wed Dec 20 00:57:03 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 20 Dec 2023 00:57:03 GMT Subject: RFR: 8322321: Add man page doc for -XX:+VerifySharedSpaces [v3] In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 00:54:06 GMT, Ioi Lam wrote: >> `VerifySharedSpaces` was disabled in [JDK-8221478](https://bugs.openjdk.org/browse/JDK-8221478) by default. We should add an entry in the "java" man page about the intended use for this flag. > > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @dholmes-ora review #2 - fixed typos Looks good. Only one review needed for this doc change. Thanks. ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17152#pullrequestreview-1789912194 From ccheung at openjdk.org Wed Dec 20 01:06:47 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Wed, 20 Dec 2023 01:06:47 GMT Subject: RFR: 8322321: Add man page doc for -XX:+VerifySharedSpaces [v3] In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 00:57:02 GMT, Ioi Lam wrote: >> `VerifySharedSpaces` was disabled in [JDK-8221478](https://bugs.openjdk.org/browse/JDK-8221478) by default. We should add an entry in the "java" man page about the intended use for this flag. 
> > Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: > > @dholmes-ora review #2 - fixed typos Marked as reviewed by ccheung (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17152#pullrequestreview-1789919151 From gcao at openjdk.org Wed Dec 20 02:37:37 2023 From: gcao at openjdk.org (Gui Cao) Date: Wed, 20 Dec 2023 02:37:37 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions In-Reply-To: <36AfaIsKlwYLkXYHg4QFA7c-aSP3Tvvy4amp9Ayg5PQ=.caf6e7bd-fdcb-40ef-a542-76258f646cb7@github.com> References: <4s2uV2ELJ2B28JUN1lIepQVVg9Gbg3DfY5g37sNyuuM=.222b5dd0-3029-41b4-815e-d5dd883e71c0@github.com> <36AfaIsKlwYLkXYHg4QFA7c-aSP3Tvvy4amp9Ayg5PQ=.caf6e7bd-fdcb-40ef-a542-76258f646cb7@github.com> Message-ID: On Tue, 19 Dec 2023 14:25:27 GMT, Vladimir Kempik wrote: > > > > > We already have "macroses" for load and stores in macroAssembler_riscv.hpp, what's the reason to do compression decision in assembler_riscv.hpp instead ( not saying it's wrong) ? > > > > > https://github.com/openjdk/jdk/blob/38d94725a1a85156e30b72b325886b0e25d4db03/src/hotspot/cpu/riscv/macroAssembler_riscv.hpp#L880 > > > > > > > > > > > > No, you are correct I also think this is not optimal. I don't know the background, but it seems like this is the easiest way to add compressed transparently. But to fully utilize C instruction we should favor the x8->x15, we often don't get C due to e.g. BCP is in x22. I think to be able to better utilize C we can't have it so transparent. > > > > So here I just try to follow the current code, see how lw is changed to c_lw. > > > > > > > > > Not exactly related to this PR, but I also saw a strange behaviour from MacroAssembler's lwu. it was generating lw + and ( a kind of lwu emulation) instead of lwu > > > an example > > > ``` > > > 0.44% ? 0x0000003fa46a86c8: slli t3,t3,0x20 > > > 0.48% ? 0x0000003fa46a86ca: addi t3,t3,-1 > > > .... > > > 3.11% ? 0x0000003fa46a86dc: lw a0,0(t1) > > > 5.34% ? 
0x0000003fa46a86e0: and a0,a0,t3 > > > ``` > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Using Assembler::lwu directly resulted in a correctly generated lwu > > > > > > Interesting. This does not seem to reflect on the code of `MacroAssembler's lwu`. I wonder how could that happen. > > If you take this PR https://github.com/openjdk/jdk/pull/17046/files#diff-7a5c3ed05b6f3f06ed1c59f5fc2a14ec566a6a5bd1d09606115767daa99115bdR3717 and change explicit Assembler::lwu() to lwu() then you are likely to see this issue Hi, I have tried to use MacroAssembler::lwu instead, and I see low difference in stub code emitted. I have added comment [1]. let's discuss on that PR. [1] https://github.com/openjdk/jdk/pull/17046/files#r1432182715 ------------- PR Comment: https://git.openjdk.org/jdk/pull/17122#issuecomment-1863759405 From gcao at openjdk.org Wed Dec 20 02:42:36 2023 From: gcao at openjdk.org (Gui Cao) Date: Wed, 20 Dec 2023 02:42:36 GMT Subject: RFR: 8321972: test runtime/Unsafe/InternalErrorTest.java timeout on linux-riscv64 platform In-Reply-To: References: <9rOe1C_eoD2fz22nqlfzaK5kJ_gxYZBsVVRe4hQwhaw=.abf30b11-de1f-456e-baa4-208755e136ee@github.com> Message-ID: On Thu, 14 Dec 2023 09:50:54 GMT, Fei Yang wrote: >> As described on the JBS issue, JDK-8320886 extended InternalErrorTest.java adding extra test for Unsafe_SetMemory0 trying to access next page after truncation. This triggers SIGBUS error and control flow is transfered to JVM signal handler [1]. But the current logic doesn't consider 16-bit compressed instructions when calculating next_pc. It always add NativeCall::instruction_size which is 4 to pc and use the result as next_pc. This is not correct as the memset invoked in this case contains compressed instructions and it is those instructions that are triggering the SIGBUS error. >> >> The proposed fix is similar with other platform with variable-length instruction encoding like x86. 
>> The encoding of the instruction triggering the SIGBUS error is checked to see if it is a compressed instruction and then calculate next_pc based on that. The test case can now pass normally with this fix. >> >> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/linux_riscv/os_linux_riscv.cpp#L274 >> >> ### Testing: >> - [x] Run tier1-3 tests on qemu 8.1.50 with UseRVV (release) > > Looks reasonable to me. I find that the native GNU compiler toolchain on both my unmatched and licheepi-4a boards are compiling with RVC by default, which means native JDK builds on those hardware platforms will also have compressed instructions. @RealFYang : Thanks for taking a look. If there is no other comment, I will proceed to integrate. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17103#issuecomment-1863763236 From dholmes at openjdk.org Wed Dec 20 02:49:50 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 20 Dec 2023 02:49:50 GMT Subject: RFR: 8319548: Unexpected internal name for Filler array klass causes error in VisualVM In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 10:08:14 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that changes the filler array class name (again) after user feedback. > > In particular, the previous name `Ljdk/internal/vm/FillerArray;` confuses some tools (https://github.com/oracle/visualvm/issues/523). I.e. it's not an array, but still variable sized. > This change adds the `[` array bracket, and renames the element name to not have `Array` inside to not try to pretend that the element is some other kind of array. > > Testing: tier1-6 > > Thanks, > Thomas I'm still struggling with what we are doing with this filler array stuff. IIUC the underlying type is actually `int[]` but we pretend it is `FillerElement[]`. Won't exposing this fake type just lead to further problems if tools try to inspect one of these arrays as-if it were an `Object[]` ?? 
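
For the 8321972 signal-handler fix quoted earlier in this digest, the instruction-length check can be sketched as below. This is a standalone illustration, not the code from that patch: the helper names are hypothetical, and it relies only on the RISC-V encoding rule that a 16-bit compressed instruction has its two lowest bits different from `0b11`.

```cpp
#include <cstdint>

// Hypothetical helper: true if the first 16-bit parcel belongs to a compressed (RVC) instruction.
inline bool is_compressed_insn(uint16_t first_parcel) {
  return (first_parcel & 0x3) != 0x3;
}

// Hypothetical helper: address of the instruction following the one at pc.
// The parcel is assembled byte by byte, so the result does not depend on host endianness.
inline const uint8_t* next_pc_after(const uint8_t* pc) {
  const uint16_t parcel = static_cast<uint16_t>(pc[0] | (pc[1] << 8));
  return pc + (is_compressed_insn(parcel) ? 2 : 4);
}
```

A handler that always adds 4 would land in the middle of the following instruction whenever the faulting one is compressed, which is the failure mode described in that thread.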
------------- PR Comment: https://git.openjdk.org/jdk/pull/17155#issuecomment-1863767089 From dholmes at openjdk.org Wed Dec 20 03:04:38 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 20 Dec 2023 03:04:38 GMT Subject: RFR: 8320276: Improve class initialization barrier in TemplateTable::_new In-Reply-To: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> References: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> Message-ID: On Wed, 6 Dec 2023 22:02:19 GMT, Matias Saavedra Silva wrote: > The class initialization barrier in TemplateTable::_new fast path check ensures that the class being instantiated is fully initialized. It can be improved by introducing additional fast path check when current thread is initializer thread as MacroAssembler::clinit_barrier() does. It avoids repeated calls into interpreter runtime for classes being initialized. This patch adds a the optimization for x86 and aarch64. Verified with tier 1-5 tests. src/hotspot/cpu/x86/templateTable_x86.cpp line 4052: > 4050: > 4051: // make sure klass is initialized > 4052: if (VM_Version::supports_fast_class_init_checks()) { Maybe this should use conditional compilation and an assert rather than a dynamic runtime check, as we expect this to always, and only, be true on 64-bit. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17006#discussion_r1432197640 From apangin at openjdk.org Wed Dec 20 03:39:43 2023 From: apangin at openjdk.org (Andrei Pangin) Date: Wed, 20 Dec 2023 03:39:43 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v15] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 15:29:06 GMT, Dmitry Chuyko wrote: >> Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). 
The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. >> >> A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. >> >> It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). >> >> Natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method letting compile broker to perform it taking new directives stack into account. Re-compilation helps to prevent hot methods from execution in the interpreter. >> >> A new flag `-r` has beed introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. 
There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. >> >> In addition, a new diagnostic command `Compiler.replace_directives... > > Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 33 commits: > > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - Merge branch 'openjdk:master' into compiler-directives-force-update > - ... and 23 more: https://git.openjdk.org/jdk/compare/fde5b168...44d680cd src/hotspot/share/code/codeCache.cpp line 1409: > 1407: while(iter.next()) { > 1408: CompiledMethod* nm = iter.method(); > 1409: methodHandle mh(thread, nm->method()); If there are two CompiledMethods for the same Java method, will it be scheduled for recompilation twice? Related question: if `nm` is an OSR method, does it make sense to go directly for deoptimization rather than compiling a non-OSR version? src/hotspot/share/code/codeCache.cpp line 1413: > 1411: ResourceMark rm; > 1412: // Try the max level and let the directives be applied during the compilation. 
> 1413: int complevel = CompLevel::CompLevel_full_optimization; Should the highest level depend on the configuration instead of the hard-coded constant? Perhaps, needs to be `highest_compile_level()` src/hotspot/share/compiler/compilerDirectives.cpp line 750: > 748: if (!dir->is_default_directive() && dir->match(method)) { > 749: match_found = true; > 750: break; `match_found` is redundant: for better readability, you may just return true. Curly braces around MutexLocker won't be needed either. src/hotspot/share/oops/method.hpp line 820: > 818: // Clear the flags related to compiler directives that were set by the compilerBroker, > 819: // because the directives can be updated. > 820: void clear_method_flags() { The function name is a bit misleading - it clears only flags related to directives. src/hotspot/share/oops/methodFlags.hpp line 61: > 59: status(has_loops_flag_init , 1 << 14) /* The loop flag has been initialized */ \ > 60: status(on_stack_flag , 1 << 15) /* RedefineClasses support to keep Metadata from being cleaned */ \ > 61: status(has_matching_directives , 1 << 16) /* The method has matching directives */ \ It's worth noting that the flag is temporary and is valid only during DCmd execution. 
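
The `match_found` remark in this review could look roughly like the sketch below. The types are stand-ins invented for the example (HotSpot's directive classes and `methodHandle` are not reproduced), and a `std::lock_guard` plays the role of the `MutexLocker`. With an early return the lock scope is simply the function body, so no extra braces or flag variable are needed.

```cpp
#include <mutex>
#include <string>
#include <vector>

// Stand-in for a compiler directive; not the HotSpot class.
struct Directive {
  bool is_default;
  std::string pattern;
  bool match(const std::string& method) const { return method == pattern; }
};

// Stand-in for the directives stack guarded by a lock.
struct DirectiveStack {
  std::mutex lock;
  std::vector<Directive> directives;

  // Returns true as soon as a non-default directive matches the method.
  bool has_match(const std::string& method) {
    std::lock_guard<std::mutex> guard(lock);  // released on every return path
    for (const Directive& dir : directives) {
      if (!dir.is_default && dir.match(method)) {
        return true;
      }
    }
    return false;
  }
};
```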
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1432195677 PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1432187571 PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1432200716 PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1432210229 PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1432212171 From dholmes at openjdk.org Wed Dec 20 04:47:58 2023 From: dholmes at openjdk.org (David Holmes) Date: Wed, 20 Dec 2023 04:47:58 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v8] In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 17:09:59 GMT, Serguei Spitsyn wrote: >> This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame. >> It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. >> The deadlocking scenario is well described by Patricio in a bug report comment. >> In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. >> >> The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about begin/end of critical section. >> This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. >> >> Some of new notifications with `notifyJvmtiSync()` method is on a performance critical path. It is why this method has been intrincified. >> >> New test was developed by Patricio: >> `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> The test is very nice as it reliably in 100% reproduces the deadlock without the fix. 
>> The test is never failing with this fix. >> >> Testing: >> - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: > > - Merge > - review: improve an assert message > - review: moved a couple of comments out of try blocks > - review: moved notifyJvmtiDisableSuspend(true) out of try-block > - review: 1) replace CriticalLock with DisableSuspend; 2) minor tweaks > - review: (1) rename notifyJvmti method; (2) add try-final statements to VirtualThread methods > - Resolved merge conflict in VirtualThread.java > - added @summary to new test SuspendWithInterruptLock.java > - add new test SuspendWithInterruptLock.java > - 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable src/hotspot/share/prims/jvm.cpp line 4024: > 4022: #else > 4023: fatal("Should only be called with JVMTI enabled"); > 4024: #endif You can't do this! The Java code knows nothing about JVM TI being enabled/disabled and will call this function unconditionally. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1432241016 From jwaters at openjdk.org Wed Dec 20 04:52:27 2023 From: jwaters at openjdk.org (Julian Waters) Date: Wed, 20 Dec 2023 04:52:27 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v3] In-Reply-To: References: Message-ID: > Compile the JDK as C++17, enabling the use of all C++17 language features Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains seven additional commits since the last revision: - Compiler versions in toolchain.m4 - Merge branch 'openjdk:master' into patch-7 - Merge branch 'openjdk:master' into patch-7 - Revert vm_version_linux_riscv.cpp - vm_version_linux_riscv.cpp - allocation.cpp - 8310260 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14988/files - new: https://git.openjdk.org/jdk/pull/14988/files/a1f21bbd..477f6b94 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14988&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14988&range=01-02 Stats: 2377 lines in 158 files changed: 1669 ins; 268 del; 440 mod Patch: https://git.openjdk.org/jdk/pull/14988.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14988/head:pull/14988 PR: https://git.openjdk.org/jdk/pull/14988 From iklam at openjdk.org Wed Dec 20 05:53:52 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 20 Dec 2023 05:53:52 GMT Subject: RFR: 8322321: Add man page doc for -XX:+VerifySharedSpaces [v3] In-Reply-To: References: Message-ID: <0inLU7r1LTV0l_uM_080MiSdU4LK2ycYtDStH13inLs=.ccd1ea78-3999-42c0-89c6-101e8a4da535@github.com> On Wed, 20 Dec 2023 00:52:47 GMT, David Holmes wrote: >> Ioi Lam has updated the pull request incrementally with one additional commit since the last revision: >> >> @dholmes-ora review #2 - fixed typos > > Looks good. > > Only one review needed for this doc change. > > Thanks. Thanks @dholmes-ora @calvinccheung for the review ------------- PR Comment: https://git.openjdk.org/jdk/pull/17152#issuecomment-1863894143 From iklam at openjdk.org Wed Dec 20 05:53:54 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 20 Dec 2023 05:53:54 GMT Subject: Integrated: 8322321: Add man page doc for -XX:+VerifySharedSpaces In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 06:19:15 GMT, Ioi Lam wrote: > `VerifySharedSpaces` was disabled in [JDK-8221478](https://bugs.openjdk.org/browse/JDK-8221478) by default. 
We should add an entry in the "java" man page about the intended use for this flag.

This pull request has now been integrated.

Changeset: f7dc257a
Author: Ioi Lam
URL: https://git.openjdk.org/jdk/commit/f7dc257a206d3104d6d24c2079ef1fe349368c49
Stats: 9 lines in 1 file changed: 9 ins; 0 del; 0 mod

8322321: Add man page doc for -XX:+VerifySharedSpaces

Reviewed-by: dholmes, ccheung

-------------

PR: https://git.openjdk.org/jdk/pull/17152

From jwaters at openjdk.org Wed Dec 20 06:02:59 2023
From: jwaters at openjdk.org (Julian Waters)
Date: Wed, 20 Dec 2023 06:02:59 GMT
Subject: RFR: 8314644: Change "Rvalue references and move semantics" into an accepted feature [v2]
In-Reply-To: <1umNl-OT5s7nHhWAsg1mUXwg6kPc_kOF_NQZmn6j8ik=.b46d688e-fc9e-40c4-8064-34d47da49d37@github.com>
References: <1umNl-OT5s7nHhWAsg1mUXwg6kPc_kOF_NQZmn6j8ik=.b46d688e-fc9e-40c4-8064-34d47da49d37@github.com>
Message-ID:

On Tue, 24 Oct 2023 10:37:43 GMT, Johan Sjölen wrote:

>> Hi,
>>
>> I'd like to propose that rvalue references and move semantics are now considered permitted in the style guide. This change would allow for move constructors to be written. This enables more performant code, if the move ctr is less expensive than the copy ctr, but also more correct code. For the latter part, look at "8314571: GrowableArray should move its old data and not copy it". Here we can avoid using copy assignment, instead using move constructors, which more accurately reflects what is happening: The old elements are in fact moved, and not copied.
>>
>> Two useful std functions will become available to us with this change:
>>
>> 1. `std::move`, for explicitly moving a value. This is a slightly more powerful `static_cast(T)`, in that it also handles `T&` correctly.
>> 2. `std::forward`, which simplifies the usage of perfect forwarding. Perfect forwarding is a technique wherein copying is minimized. To quote Scott Meyers ( https://cppandbeyond.com/2011/04/25/session-announcement-adventures-in-perfect-forwarding/ ):
>>
>>> Perfect forwarding is an important C++0x technique built atop rvalue references. It allows move semantics to be automatically applied, even when the source and the destination of a move are separated by intervening function calls. Common examples include constructors and setter functions that forward arguments they receive to the data members of the class they are initializing or setting, as well as standard library functions like make_shared, which "perfect-forwards" its arguments to the class constructor of whatever object the to-be-created shared_ptr is to point to.
>>
>> Looking forward to your feedback, thank you.
>> Johan
>
> Johan Sjölen has updated the pull request incrementally with two additional commits since the last revision:
>
>  - Add a (admittedly clunky) single asterisk
>  - Expand on the feature

Shame that this never made it in, this sounds pretty useful

-------------

PR Comment: https://git.openjdk.org/jdk/pull/15386#issuecomment-1863901085

From rehn at openjdk.org Wed Dec 20 06:52:32 2023
From: rehn at openjdk.org (Robbin Ehn)
Date: Wed, 20 Dec 2023 06:52:32 GMT
Subject: RFR: 8320069: RISC-V: Add Zcb instructions [v2]
In-Reply-To:
References:
Message-ID:

> Hi, these are the instructions for zcb.
>
> Due to our lack of infrastructure, having multiple extension-dependent instructions does not fit well.
> Some of these compressed instructions are also missing a 1-to-1 mapping, e.g. now we have a compressed not, but the corresponding uncompressed instruction is still xor.
> I think we need to do some rework here.
>
> I also don't like the macro expansion, as it is hopeless in debuggers and 'IDE's (vim+rtags for me).
> (macro stuff was originally done when templates where blacklisted in hotspot) > > And I don't want an option for this, as zcb is coming in hwprobe, if you have compressed on you get them if they are supported (may depend on e.g. zbb). > > I have done some modification since it passed tier1, so I'm running stuff over the weekend. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains two additional commits since the last revision: - Merge branch 'master' into zcb - zcb instruction set ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17122/files - new: https://git.openjdk.org/jdk/pull/17122/files/38d94725..f0206e57 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17122&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17122&range=00-01 Stats: 2228 lines in 156 files changed: 1524 ins; 268 del; 436 mod Patch: https://git.openjdk.org/jdk/pull/17122.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17122/head:pull/17122 PR: https://git.openjdk.org/jdk/pull/17122 From alanb at openjdk.org Wed Dec 20 08:05:06 2023 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 20 Dec 2023 08:05:06 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v8] In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 04:44:35 GMT, David Holmes wrote: > You can't do this! The Java code knows nothing about JVM TI being enabled/disabled and will call this function unconditionally. Indeed. I wonder if anyone is testing minimal builds to catch issues like this. 
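
One way to read the objection in the 8311218 review: because the Java side calls the entry point unconditionally, a build without JVMTI would need the call to be a no-op rather than a `fatal`. A minimal sketch under that assumption follows. `INCLUDE_JVMTI` mirrors the HotSpot build macro, while `ThreadState` and `notify_disable_suspend` are hypothetical names, not the code in the PR.

```cpp
#ifndef INCLUDE_JVMTI
#define INCLUDE_JVMTI 0  // assumption for this example: a build without JVMTI
#endif

// Hypothetical stand-in for the per-thread state touched by the notification.
struct ThreadState {
  bool suspend_disabled = false;
};

// Imagined entry point that the Java code would call unconditionally.
inline void notify_disable_suspend(ThreadState& thread, bool enter) {
#if INCLUDE_JVMTI
  thread.suspend_disabled = enter;  // record entry to / exit from the critical section
#else
  (void)thread;  // nothing to record without JVMTI, and the call must not abort
  (void)enter;
#endif
}
```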
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1432377494 From stuefe at openjdk.org Wed Dec 20 08:15:55 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 20 Dec 2023 08:15:55 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 16:20:02 GMT, Kim Barrett wrote: >> Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into patch-7 >> - Revert vm_version_linux_riscv.cpp >> - vm_version_linux_riscv.cpp >> - allocation.cpp >> - 8310260 > > I agree that before throwing this switch, we need to look at some specific > issues that might need to be addressed, discuss the benefits, and also the > costs. > > As was discussed for the change to C++14, there is *never* a good time to > start introducing the use of new language features as far as backporting is > concerned, unless one is going to backport the language change too. We didn't > do that for C++14, and I don't think we are going to (nor should) do it for > C++17 either. But backporting concerns can't be all powerful, as that will > forever prevent potentially significant improvements. > > I started to make a list of new language features that seem particularly > beneficial or otherwise important. I was going to write style guide updates > for these, but haven't gotten very far with that yet. > > P0035R4: Dynamic memory allocation for over-aligned data > P0135R1: Guaranteed copy elision > P0145R3: Refining Expression Evaluation Order for Idiomatic C++ > P0292R2: constexpr if > P0091R3/P0512R0: Template argument deduction for class templates > > Here are some others that might be of interest to us. 
> N4268: Allow constant evaluation for all non-type template arguments > N3928: Extending static_assert > P0118R1: [[fallthrough]] attribute > P0189R1: [[nodiscard]] attribute > P0212R1: [[maybe_unused]] attribute > P0170R1: Wording for constexpr lambda > P0283R2: Ignoring unsupported non-standard attributes > P0061R1: __has_include for C++17 > P0386R2: Inline variables @kimbarrett > P0035R4: Dynamic memory allocation for over-aligned data Do we really need this? I ask because, in the end, this will result in something like `posix_memalign` to be called, and I remember it being notorious for causing large footprint overhead depending on how smart the underlying allocator is about using alignment waste. It will also be non-trivial to implement in hotspot since NMT uses malloc headers. Barring a rewrite of NMT malloc metadata tracking (e.g. using a hash map, which would be more costly both in terms of performance and, probably, footprint), malloc headers would have to be revised. Probably would need to be dynamic-sized. This is the reason we did not bother wrapping posix_memalign. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1864039399 From rehn at openjdk.org Wed Dec 20 08:21:14 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 20 Dec 2023 08:21:14 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v7] In-Reply-To: References: Message-ID: > Hi, please consider. > > Main author is @luhenry, I only fixed some minor things and tested it. > > Such as: > test/hotspot/jtreg/compiler/intrinsics/sha/ > test/jdk/java/security/MessageDigest/ > test/jdk/jdk/security/ > tier1 > > And still running some test. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 11 additional commits since the last revision: - t2 caller saved, no need to push/pop - Merge branch 'master' into sha256 - Removed swap file - Index load, other comment - Merge branch 'master' into sha256 - Materialize constants address once - Removed template - Flag fixes - Merge branch 'master' into sha256 - Share code - ... and 1 more: https://git.openjdk.org/jdk/compare/7132e44a...be46fe4f ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16562/files - new: https://git.openjdk.org/jdk/pull/16562/files/c92975e0..be46fe4f Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=05-06 Stats: 2823 lines in 219 files changed: 1770 ins; 348 del; 705 mod Patch: https://git.openjdk.org/jdk/pull/16562.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16562/head:pull/16562 PR: https://git.openjdk.org/jdk/pull/16562 From rehn at openjdk.org Wed Dec 20 08:21:16 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 20 Dec 2023 08:21:16 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v6] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 14:17:01 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > Removed swap file Great we could sort that out. Removed t2 and merge with master. Passes my testing, doing some extra testing. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16562#issuecomment-1864045440 From sroy at openjdk.org Wed Dec 20 08:36:08 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Wed, 20 Dec 2023 08:36:08 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v2] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> <5RZicS1WS5xiFzcJMhxg_Gjrtdc2I1c4vNMMb37OK-4=.e4ba7692-b18a-4b91-9b35-e444710e38b1@github.com> Message-ID: On Tue, 28 Nov 2023 11:27:33 GMT, Suchismith Roy wrote: > > > i would have to repeat the line 1132 and 1139 in os_aix.cpp again , if the condition fails for .so files, because i have to reload it again and check if the .a exists. In the shared code i had repeat less number of lines i believe. Do you suggest moving lines 1132 to 1139 to another function then ? > > > > > > @tstuefe Any suggestion on this ? > > ``` > --- a/src/hotspot/os/aix/os_aix.cpp > +++ b/src/hotspot/os/aix/os_aix.cpp > @@ -1108,7 +1108,7 @@ bool os::dll_address_to_library_name(address addr, char* buf, > return true; > } > > -void *os::dll_load(const char *filename, char *ebuf, int ebuflen) { > +static void* dll_load_inner(const char *filename, char *ebuf, int ebuflen) { > > log_info(os)("attempting shared library load of %s", filename); > > @@ -1158,6 +1158,35 @@ void *os::dll_load(const char *filename, char *ebuf, int ebuflen) { > return nullptr; > } > > +void* os::dll_load(const char *filename, char *ebuf, int ebuflen) { > + > + void* result = nullptr; > + > + // First try using *.so suffix; failing that, retry with *.a suffix. > + const size_t len = strlen(filename); > + constexpr size_t safety = 3 + 1; > + constexpr size_t bufsize = len + safety; > + char* buf = NEW_C_HEAP_ARRAY(char, bufsize, mtInternal); > + strcpy(buf, filename); > + char* const dot = strrchr(buf, '.'); > + > + assert(dot != nullptr, "Attempting to load a shared object without extension? 
%s", filename); > + assert(strcmp(dot, ".a") == 0 || strcmp(dot, ".so") == 0, > + "Attempting to load a shared object that is neither *.so nor *.a", filename); > + > + sprintf(dot, ".so"); > + result = dll_load_inner(buf, ebuf, ebuflen); > + > + if (result == nullptr) { > + sprintf(dot, ".a"); > + result = dll_load_inner(buf, ebuf, ebuflen); > + } > + > + FREE_C_HEAP_ARRAY(char, buf); > + > + return result; > +} > + > ``` Hi Thomas May I know what is the reason to use constexpr over regular datatypes ? Also, I have used strcpy to avoid buffer overflow.(Though we have calculated the exact length). Would that be fine ? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1864063261 From sroy at openjdk.org Wed Dec 20 08:36:06 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Wed, 20 Dec 2023 08:36:06 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v6] In-Reply-To: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: <_pW-qCHFEz6FfVSqHf2DVilu60sQNs5Lb8iREf1KJO4=.0942e492-1fb0-4e7b-9507-f406e4fb509e@github.com> > J2SE agent does not start and throws error when it tries to find the shared library ibm_16_am. > After searching for ibm_16_am.so ,the jvm agent throws and error as dll_load fails.It fails to identify the shared library ibm_16_am.a shared archive file on AIX. > Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. Suchismith Roy has updated the pull request incrementally with two additional commits since the last revision: - Restore lines - Remove trailing spaces. 
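
The lookup order described in this pull request (try the `*.so` name first, then retry with `*.a`) can be sketched outside of HotSpot as follows. This is only an illustration under stated assumptions: `with_suffix`, `Loader` and `load_with_fallback` are hypothetical names, and the loader is passed in as a callback instead of calling any real AIX or HotSpot loading function.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper: replace the extension of the file name in `path`
// with `suffix`. If the file name has no extension, `suffix` is appended.
std::string with_suffix(const std::string& path, const std::string& suffix) {
  const std::string::size_type slash = path.rfind('/');
  const std::string::size_type dot = path.rfind('.');
  // Only treat the dot as an extension if it belongs to the file name,
  // not to a directory component.
  const bool has_ext = dot != std::string::npos &&
                       (slash == std::string::npos || dot > slash);
  return (has_ext ? path.substr(0, dot) : path) + suffix;
}

// Hypothetical loader type: returns a non-null handle on success.
using Loader = std::function<void*(const std::string&)>;

// Try the *.so name first; if that fails, retry with the *.a name.
void* load_with_fallback(const std::string& path, const Loader& load) {
  void* handle = load(with_suffix(path, ".so"));
  if (handle == nullptr) {
    handle = load(with_suffix(path, ".a"));
  }
  return handle;
}
```

Using `std::string` avoids manual buffer sizing here. In a version that does size a C buffer from `strlen(filename)`, that size is only known at run time, so it would be an ordinary `const` value; a `constexpr` variable needs a compile-time constant initializer.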
------------- Changes: - all: https://git.openjdk.org/jdk/pull/16604/files - new: https://git.openjdk.org/jdk/pull/16604/files/cd7e0e64..9df8c2c8 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=05 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=04-05 Stats: 4 lines in 1 file changed: 1 ins; 1 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/16604.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16604/head:pull/16604 PR: https://git.openjdk.org/jdk/pull/16604 From tschatzl at openjdk.org Wed Dec 20 08:40:48 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 20 Dec 2023 08:40:48 GMT Subject: RFR: 8319548: Unexpected internal name for Filler array klass causes error in VisualVM In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 02:46:41 GMT, David Holmes wrote: > I'm still struggling with what we are doing with this filler array stuff. IIUC the underlying type is actually `int[]` but we pretend it is `FillerElement[]`. The reason for having dedicated filler array klasses is detecting internally in the GC that we are currently looking at a dead object. That the VM can distinguish dead objects from live objects is required for some functionality (like heap dumping only live classes, but also verification) without undue performance (basically need to redo marking every time this is needed) or memory usage hit (1.5% of maximum Java heap size) when invoking some APIs. E.g. any reference from a live object to a dead object will immediately assert. There is also improved debugability due to that - i.e. if you see a reference to any such klass in a hs_err file, it is 99% an error with a reference where the barrier has been forgotten. The reason for having a custom array type and not having a klass referencing `int` elements is because Hotspot does not support that. > Won't exposing this fake type just lead to further problems if tools try to inspect one of these arrays as-if it were an `Object[]` ?? 
We do not expose these arrays to regular APIs (e.g. we filter them out when iterating available klasses), and they are never referenced by other live objects by definition. So it is impossible for a Java application to get a reference to such objects or these klasses. The only way they are exposed is in a heap dump also containing dead objects (and related functionality like the `jcmd` class histogram). Somebody at some time thought that this would be an interesting feature so we need to support it. Previously one would get the same dead objects in that output just intermingled with other `int[]` arrays. Imo this discrimination is preferable and much more useful for heap analysis. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17155#issuecomment-1864069245 From stuefe at openjdk.org Wed Dec 20 09:19:51 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 20 Dec 2023 09:19:51 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v2] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 10:27:58 GMT, Aleksey Shipilev wrote: >> [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro. >> >> The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there. 
>> >> Additional testing: >> - [x] Large build matrix of server/zero builds >> - [x] Linux AArch64 server fastdebug, `tier{1,2}` >> - [x] Linux x86_64 server fastdebug, `tier{1,2}` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Better verbiage for *2 adjustment for x86_64 > - Merge branch 'master' into JDK-8237842-cache-line-padding-defs > - Work I need this too. Lets ship this. ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16973#pullrequestreview-1790479882 From sroy at openjdk.org Wed Dec 20 09:24:43 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Wed, 20 Dec 2023 09:24:43 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v2] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> <5RZicS1WS5xiFzcJMhxg_Gjrtdc2I1c4vNMMb37OK-4=.e4ba7692-b18a-4b91-9b35-e444710e38b1@github.com> Message-ID: <1iM9tYJea_gUItPqqZ32s3lwfYXLZTbb9tlmKUS8OkY=.18b42be6-6c62-4a08-bc37-4ffd8ea5295d@github.com> On Tue, 19 Dec 2023 16:47:33 GMT, Goetz Lindenmaier wrote: > try it! I got the instructions to replicate in my local repo later, so wasn't sure to proceed. Thanks for the suggestion. I think this makes it easier to keep in sync with the other change. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1864127477 From tschatzl at openjdk.org Wed Dec 20 09:36:40 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 20 Dec 2023 09:36:40 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v2] In-Reply-To: References: Message-ID: <1fieS0UmIZ9zIuRkWTbJ4v4vJlzH_CSabLuCulkFWd0=.9ea47be2-7c8b-46d3-8c69-c2c5d32dee33@github.com> On Wed, 6 Dec 2023 10:27:58 GMT, Aleksey Shipilev wrote: >> [JDK-8321137](https://bugs.openjdk.org/browse/JDK-8321137) needs a clean separation between cache line sizes and padding sizes. At least on x86, there is a wrinkle with "assuming" the cache line size is 128 bytes to cater for prefetchers. Cleanly separating cache line size and padding size resolves this. I rewrote uses of `DEFAULT_CACHE_LINE_SIZE` in padding contexts to new macro. >> >> The goal for this patch is to avoid actual values changes as much as possible. One of the changes come from cleaning up some of the old cases in x86 definition, thus simplifying the definition. I think the LP64 split is still useful there. >> >> Additional testing: >> - [x] Large build matrix of server/zero builds >> - [x] Linux AArch64 server fastdebug, `tier{1,2}` >> - [x] Linux x86_64 server fastdebug, `tier{1,2}` > > Aleksey Shipilev has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains three additional commits since the last revision: > > - Better verbiage for *2 adjustment for x86_64 > - Merge branch 'master' into JDK-8237842-cache-line-padding-defs > - Work Marked as reviewed by tschatzl (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk/pull/16973#pullrequestreview-1790510412 From rehn at openjdk.org Wed Dec 20 09:57:10 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Wed, 20 Dec 2023 09:57:10 GMT Subject: RFR: 8320069: RISC-V: Add Zcb instructions [v3] In-Reply-To: References: Message-ID: > Hi, this is the instructions for zcb. > > Due to over lack of infrastructure having multiple extension dependent instruction does not fit well. > Some of these compressed instructions are also missing 1 to 1 mapping, e.g. now we have a compressed not, but the corresponding instruction in uncompressed is still xor. > I think we need to do some rework here. > > I also I don't like the macro expansion as it hopeless in debugger and 'IDE's (vim+rtags for me). > (macro stuff was originally done when templates where blacklisted in hotspot) > > And I don't want an option for this, as zcb is coming in hwprobe, if you have compressed on you get them if they are supported (may depend on e.g. zbb). > > I have done some modification since it passed tier1, so I'm running stuff over the weekend. Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains three additional commits since the last revision: - Merge branch 'master' into zcb - Merge branch 'master' into zcb - zcb instruction set ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17122/files - new: https://git.openjdk.org/jdk/pull/17122/files/f0206e57..4fa46f3c Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17122&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17122&range=01-02 Stats: 2 lines in 1 file changed: 2 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17122.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17122/head:pull/17122 PR: https://git.openjdk.org/jdk/pull/17122 From tschatzl at openjdk.org Wed Dec 20 10:18:49 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Wed, 20 Dec 2023 10:18:49 GMT Subject: RFR: 8319548: Unexpected internal name for Filler array klass causes error in VisualVM In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 10:08:14 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that changes the filler array class name (again) after user feedback. > > In particular, the previous name `Ljdk/internal/vm/FillerArray;` confuses some tools (https://github.com/oracle/visualvm/issues/523). I.e. it's not an array, but still variable sized. > This change adds the `[` array bracket, and renames the element name to not have `Array` inside to not try to pretend that the element is some other kind of array. > > Testing: tier1-6 > > Thanks, > Thomas Fwiw, the original issue introducing details the advantages too https://bugs.openjdk.org/browse/JDK-8284435; it does not particularly point out how much memory this saves, but it mentions that it removes the need for keeping around an extra mark bitmap covering the whole Java heap (that 1.5%). 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17155#issuecomment-1864209478 From dchuyko at openjdk.org Wed Dec 20 10:20:05 2023 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Wed, 20 Dec 2023 10:20:05 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v16] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > Natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method letting compile broker to perform it taking new directives stack into account. 
Re-compilation helps to prevent hot methods from execution in the interpreter.
>
> A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them.
>
> In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ...

Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 34 commits:

- Merge branch 'openjdk:master' into compiler-directives-force-update
- Merge branch 'openjdk:master' into compiler-directives-force-update
- Merge branch 'openjdk:master' into compiler-directives-force-update
- Merge branch 'openjdk:master' into compiler-directives-force-update
- Merge branch 'openjdk:master' into compiler-directives-force-update
- Merge branch 'openjdk:master' into compiler-directives-force-update
- Merge branch 'openjdk:master' into compiler-directives-force-update
- Merge branch 'openjdk:master' into compiler-directives-force-update
- Merge branch 'openjdk:master' into compiler-directives-force-update
- Merge branch 'openjdk:master' into compiler-directives-force-update
- ...
and 24 more: https://git.openjdk.org/jdk/compare/14dab319...e337e56b ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=15 Stats: 372 lines in 15 files changed: 339 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From sspitsyn at openjdk.org Wed Dec 20 10:42:59 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 20 Dec 2023 10:42:59 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v8] In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 08:02:14 GMT, Alan Bateman wrote: >> src/hotspot/share/prims/jvm.cpp line 4024: >> >>> 4022: #else >>> 4023: fatal("Should only be called with JVMTI enabled"); >>> 4024: #endif >> >> You can't do this! The Java code knows nothing about JVM TI being enabled/disabled and will call this function unconditionally. > >> You can't do this! The Java code knows nothing about JVM TI being enabled/disabled and will call this function unconditionally. > > Indeed. I wonder if anyone is testing minimal builds to catch issues like this. Good catch, David! Filed a cleanup bug: https://bugs.openjdk.org/browse/JDK-8322538 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1432548911 From mdoerr at openjdk.org Wed Dec 20 11:13:51 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 20 Dec 2023 11:13:51 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 19:44:09 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. 
However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. >> >> # Testing status >> >> ## tier1 >> >> | | Win | Mac | Linux | >> |----------|---------|---------|---------| >> | ARM64 | | ? | | >> | ARM32 | | | | >> | x86 | | | ? | >> | x64 | | | ? | >> | PPC64 | | | | >> | S390x | | | | >> | RiscV | n/a | n/a | ? | > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Merge with origin/master > - Fix build, copyright dates, m4 files. > - Fix merge > - Catch up with master branch. > > Merge remote-tracking branch 'origin/master' into reuse-macroasm > - Some inst_mark fixes; Catch up with master. > - Catch up with changes on master > - Reuse same C2_MacroAssembler object to emit instructions. Thanks for the explanation! Makes sense. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-1864290083 From sroy at openjdk.org Wed Dec 20 11:16:03 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Wed, 20 Dec 2023 11:16:03 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v7] In-Reply-To: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: > J2SE agent does not start and throws error when it tries to find the shared library ibm_16_am. > After searching for ibm_16_am.so ,the jvm agent throws and error as dll_load fails.It fails to identify the shared library ibm_16_am.a shared archive file on AIX. 
> Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: Spaces fix ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16604/files - new: https://git.openjdk.org/jdk/pull/16604/files/9df8c2c8..ffcbf786 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=06 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16604&range=05-06 Stats: 3 lines in 1 file changed: 0 ins; 0 del; 3 mod Patch: https://git.openjdk.org/jdk/pull/16604.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16604/head:pull/16604 PR: https://git.openjdk.org/jdk/pull/16604 From kbarrett at openjdk.org Wed Dec 20 12:23:41 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 20 Dec 2023 12:23:41 GMT Subject: RFR: 8314488: Compile the JDK as C++17 In-Reply-To: References: Message-ID: On Sat, 19 Aug 2023 07:45:50 GMT, Andrew Haley wrote: > Is it impractical to drop the obsolete features of C++11, working in the common subset of C++11 and C++17? I'm not sure what is being suggested. Maybe some examples would help. ------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1864383047 From kbarrett at openjdk.org Wed Dec 20 12:23:39 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 20 Dec 2023 12:23:39 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 13:52:28 GMT, Martin Doerr wrote: > In case you want to update the required compiler versions as part of this PR: We have tested -TOOLCHAIN_MINIMUM_VERSION_xlc="16.1.0.0011" +TOOLCHAIN_MINIMUM_VERSION_xlc="17.1.1.4" (Already discussed with Kim.) Also discussed with the aix-ppc port maintainers at IBM. 
------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1864380101 From kbarrett at openjdk.org Wed Dec 20 12:18:54 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 20 Dec 2023 12:18:54 GMT Subject: RFR: 8314488: Compile the JDK as C++17 [v2] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 16:20:02 GMT, Kim Barrett wrote: >> Julian Waters has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains five additional commits since the last revision: >> >> - Merge branch 'openjdk:master' into patch-7 >> - Revert vm_version_linux_riscv.cpp >> - vm_version_linux_riscv.cpp >> - allocation.cpp >> - 8310260 > > I agree that before throwing this switch, we need to look at some specific > issues that might need to be addressed, discuss the benefits, and also the > costs. > > As was discussed for the change to C++14, there is *never* a good time to > start introducing the use of new language features as far as backporting is > concerned, unless one is going to backport the language change too. We didn't > do that for C++14, and I don't think we are going to (nor should) do it for > C++17 either. But backporting concerns can't be all powerful, as that will > forever prevent potentially significant improvements. > > I started to make a list of new language features that seem particularly > beneficial or otherwise important. I was going to write style guide updates > for these, but haven't gotten very far with that yet. > > P0035R4: Dynamic memory allocation for over-aligned data > P0135R1: Guaranteed copy elision > P0145R3: Refining Expression Evaluation Order for Idiomatic C++ > P0292R2: constexpr if > P0091R3/P0512R0: Template argument deduction for class templates > > Here are some others that might be of interest to us. 
> N4268: Allow constant evaluation for all non-type template arguments > N3928: Extending static_assert > P0118R1: [[fallthrough]] attribute > P0189R1: [[nodiscard]] attribute > P0212R1: [[maybe_unused]] attribute > P0170R1: Wording for constexpr lambda > P0283R2: Ignoring unsupported non-standard attributes > P0061R1: __has_include for C++17 > P0386R2: Inline variables > @kimbarrett > > > P0035R4: Dynamic memory allocation for over-aligned data > > Do we really need this? I ask because, in the end, this will result in something like `posix_memalign` to be called, and I remember it being notorious for causing large footprint overhead depending on how smart the underlying allocator is about using alignment waste. > > It will also be non-trivial to implement in hotspot since NMT uses malloc headers. Barring a rewrite of NMT malloc metadata tracking (e.g. using a hash map, which would be more costly both in terms of performance and, probably, footprint), malloc headers would have to be revised. Probably would need to be dynamic-sized. This is the reason we did not bother wrapping posix_memalign. We already have code that (incorrectly) expects dynamic allocation to support overalignment. There are several classes that overalign (often cache align) a member to avoid false sharing, but are dynamically allocated. See most (all?) uses of ZCACHE_ALIGNED for some examples. One way to fix this would be to give those classes their own operator new to perform aligned allocation somehow. That's what was done for OopStorage::Block, but it's clumsy and likely wasteful of memory. And it's easy to forget. A general solution would probably be better. But yes, NMT malloc headers certainly make the general problem challenging. The approach currently used for OopStorage::Block can be generalized and hooked into the standard mechanism. But maybe there are (possibly non-portable) alternatives that avoid the memory waste? 
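
The class-specific `operator new` approach mentioned above can be sketched as follows. This is a minimal illustration, not HotSpot code: `PaddedCounter`, `kCacheLine` (assumed here to be 64 bytes) and `is_aligned` are hypothetical names, and plain `malloc`/`free` stand in for whatever allocator a real implementation would use. The sketch over-allocates and aligns by hand, which also shows where the memory waste comes from.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Assumed cache line size for this sketch only.
constexpr std::size_t kCacheLine = 64;

// Hypothetical over-aligned class that supplies its own allocation functions.
struct alignas(kCacheLine) PaddedCounter {
  long value = 0;

  static void* operator new(std::size_t size) {
    // Over-allocate: alignment slack plus one slot to remember the raw pointer.
    const std::size_t total = size + kCacheLine + sizeof(void*);
    void* raw = std::malloc(total);
    if (raw == nullptr) throw std::bad_alloc();
    const std::uintptr_t start =
        reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    const std::uintptr_t aligned =
        (start + kCacheLine - 1) & ~(static_cast<std::uintptr_t>(kCacheLine) - 1);
    // Store the raw pointer just below the aligned block for operator delete.
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
  }

  static void operator delete(void* p) {
    if (p == nullptr) return;
    std::free(static_cast<void**>(p)[-1]);
  }
};

// Hypothetical helper: true if `p` is a multiple of `alignment`.
bool is_aligned(const void* p, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}
```

Each allocation in this sketch pays `kCacheLine + sizeof(void*)` extra bytes. With C++17 (P0035R4), a plain `new` expression on an over-aligned type that has no class-specific `operator new` would instead call the `std::align_val_t` overload of the global `operator new`.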
------------- PR Comment: https://git.openjdk.org/jdk/pull/14988#issuecomment-1864375915 From epeter at openjdk.org Wed Dec 20 12:57:10 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 20 Dec 2023 12:57:10 GMT Subject: RFR: 8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap Message-ID: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> [JDK-8247755](https://bugs.openjdk.org/browse/JDK-8247755) introduced the `GrowableArrayCHeap`. This duplicates the current C-Heap allocation capability in `GrowableArray`. I now remove that from `GrowableArray` and move all usages to `GrowableArrayCHeap`. This has a few advantages: - Clear separation between arena (and resource area) allocating array and C-heap allocating array. - We can prevent assigning / copying between arrays of different allocation strategies already at compile time, and not only with asserts at runtime. - We should not have multiple implementations of the same thing (C-Heap backed array). - `GrowableArrayCHeap` is NONCOPYABLE. This is a nice restriction, we now know that C-Heap backed arrays do not get copied unknowingly. **Bonus** We can now restrict `GrowableArray` element type `E` to be `std::is_trivially_destructible::value == true`. The idea is that arena / resource allocated arrays get abandoned, often without being even cleared. Hence, the elements in the array are never destructed. But if we only use elements that are trivially destructible, then it makes no difference if the destructors are ever called, or the elements simply abandoned. For `GrowableArrayCHeap`, we expect that the user eventually calls the destructor for the array, which in turn calls the destructors of the remaining elements. Hence, it is up to the user to ensure the cleanup. And so we can allow non-trivial destructors. 
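As an illustration of the element-type restriction described above, the following is a rough sketch and not the actual HotSpot change. `ArenaArraySketch`, `Plain` and `Owning` are hypothetical names. The point is only that a `static_assert` on `std::is_trivially_destructible` rejects, at compile time, element types whose destructors would be silently skipped when an arena-backed array is abandoned.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for an arena / resource-area backed array.
// Elements are never destructed, so only trivially destructible
// element types are accepted.
template <typename E>
class ArenaArraySketch {
  static_assert(std::is_trivially_destructible<E>::value,
                "arena-backed arrays abandon their elements without destructing them");

 public:
  int length() const { return _len; }

 private:
  int _len = 0;
};

// Trivially destructible: fine to abandon.
struct Plain {
  int x;
};

// User-provided destructor: not trivially destructible, so
// ArenaArraySketch<Owning> would fail to compile.
struct Owning {
  ~Owning() {}
};
```

A C-heap backed array has no such restriction in this scheme, because its owner is expected to run the array's destructor, which destructs the remaining elements.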
**Testing** Tier1-3 + stress testing: pending ------------- Commit messages: - fix comment about trivial elements - check for trivial destructors - improve comment - add an explicit to constructor - improve comments - remove cheap internals from GrowableArray and fix verification - remove constructors for GrowableArray with Cheap backing - 8322476 Changes: https://git.openjdk.org/jdk/pull/17160/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17160&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322476 Stats: 774 lines in 127 files changed: 108 ins; 204 del; 462 mod Patch: https://git.openjdk.org/jdk/pull/17160.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17160/head:pull/17160 PR: https://git.openjdk.org/jdk/pull/17160 From epeter at openjdk.org Wed Dec 20 12:58:53 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 20 Dec 2023 12:58:53 GMT Subject: RFR: 8319115: GrowableArray: Do not initialize up to capacity In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 22:48:18 GMT, Kim Barrett wrote: >> @eme64 Is it feasible to split this up to solve each of the problems you identify in stages? There is also overlap here with JDK-8319709 IIUC. Thanks. > >> @dholmes-ora These are the "parts": >> >> 1. initialize up to capacity vs length >> >> 2. update the test to verify this (complete refactoring) >> >> 3. remove cheap use of GrowableArray -> use GrowableArrayCHeap instead >> >> >> The first 2 items are inseparable, I cannot make substantial changes to many GrowableArray methods without there even being tests for them. And the tests would not pass before the changes for item 1, since the tests also verify what elements of the array are initialized. So adding the tests first would not be very feasible. >> >> The 3rd item could maybe be split, and be done before the rest. Though it would also require lots of changes to the test, which then I would have to completely refactor with items 1+2 anyway. 
>> >> And the items are related conceptually, that is why I would felt ok pushing them together. It is all about when (item 1) and what kinds of (item 3) constructors / destructors are called for the elements of the arrays, and verifying that thoroughly (item 2). >> >> Hence: feasible probably, but lots of work overhead. Do you think it is worth it? > > I too would prefer that it be split up. It's very easy to miss important details in amongst all the mostly relatively > simple renamings. That is, I think 3 should be separate from the other changes. @kimbarrett @dholmes-ora I just published this: https://github.com/openjdk/jdk/pull/17160 It removes the C-Heap capability from `GrowableArray`, and replaces usages with `GrowableArrayCHeap`. Bonus: we can now check that all element types of `GrowableArray` should be trivially destructible (that way it is always ok to abandon elements on the array, when the arena or ResourceMark go out of scope). ------------- PR Comment: https://git.openjdk.org/jdk/pull/16918#issuecomment-1864429000 From jbechberger at openjdk.org Wed Dec 20 13:46:50 2023 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 20 Dec 2023 13:46:50 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Fri, 1 Dec 2023 09:05:22 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. 
>> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust macOS coding src/hotspot/share/jfr/metadata/metadata.xml line 961: > 959: > 960: > 961: minor grammatical issue: "Stores the result in the case of an FP environment correction" is better ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16903#discussion_r1432732460 From mbaesken at openjdk.org Wed Dec 20 13:52:01 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 20 Dec 2023 13:52:01 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v3] In-Reply-To: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: > [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. > However the information is not added to the JFR events, and this should be enhanced. > The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Adjust description ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16903/files - new: https://git.openjdk.org/jdk/pull/16903/files/c7e63a27..c2d67a66 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16903&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16903&range=01-02 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16903.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16903/head:pull/16903 PR: https://git.openjdk.org/jdk/pull/16903 From mbaesken at openjdk.org Wed Dec 20 13:52:04 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 20 Dec 2023 13:52:04 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v2] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: <__hGycPPeLvAdjKvvpFnTzpo_3iQVuEnJMjuFGiOy1M=.46329fd7-1fd4-44d8-b196-f45a89d8cab4@github.com> On Wed, 20 Dec 2023 13:43:42 GMT, Johannes Bechberger wrote: >> Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: >> >> Adjust macOS coding > > src/hotspot/share/jfr/metadata/metadata.xml line 961: > >> 959: >> 960: >> 961: > > minor grammatical issue: "Stores the result in the case of an FP environment correction" is better Thanks, I adjusted the description. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16903#discussion_r1432737259 From jkern at openjdk.org Wed Dec 20 13:52:46 2023 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 20 Dec 2023 13:52:46 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v7] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: On Wed, 20 Dec 2023 11:16:03 GMT, Suchismith Roy wrote: >> J2SE agent does not start and throws error when it tries to find the shared library ibm_16_am. >> After searching for ibm_16_am.so ,the jvm agent throws and error as dll_load fails.It fails to identify the shared library ibm_16_am.a shared archive file on AIX. >> Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. > > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > Spaces fix Only some minor suggestions. src/hotspot/os/aix/os_aix.cpp line 1168: > 1166: int extension_length = 3; > 1167: char* file_path = NEW_C_HEAP_ARRAY(char, buffer_length + extension_length + 1, mtInternal); > 1168: strncpy(file_path,filename, buffer_length + 1); Why not using `char* file_path = os::strdup (filename);` which would replace lines 1167+1168 and use the corresponding `os::free (file_path);` at the end src/hotspot/os/aix/os_aix.cpp line 1174: > 1172: result = dll_load_library(file_path, ebuf, ebuflen); > 1173: // If the load fails,we try to reload by changing the extension to .a for .so files only. > 1174: if(result == nullptr) { Space between if and ( also next line ------------- Changes requested by jkern (Author). 
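The extension swap under review can be sketched in isolation as below. This is an illustration only: it uses `std::string` rather than HotSpot's C-heap helpers (`NEW_C_HEAP_ARRAY`, `os::strdup`), and `archive_variant` is a hypothetical name that does not appear in the patch. It returns the `.a` variant of a path that ends in `.so`, or an empty string for any other path, matching the "for .so files only" rule in the quoted comment.

```cpp
#include <cassert>
#include <string>

// Returns `path` with a trailing ".so" replaced by ".a".
// Returns an empty string if `path` does not end in ".so"
// (or consists of nothing but the extension).
inline std::string archive_variant(const std::string& path) {
  const std::string so = ".so";
  if (path.size() <= so.size() ||
      path.compare(path.size() - so.size(), so.size(), so) != 0) {
    return std::string();
  }
  return path.substr(0, path.size() - so.size()) + ".a";
}
```

A caller would try the original path first and only retry with the result of `archive_variant` when the first load fails and the result is non-empty.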
PR Review: https://git.openjdk.org/jdk/pull/16604#pullrequestreview-1790895382 PR Review Comment: https://git.openjdk.org/jdk/pull/16604#discussion_r1432716207 PR Review Comment: https://git.openjdk.org/jdk/pull/16604#discussion_r1432738451 From jbechberger at openjdk.org Wed Dec 20 13:58:50 2023 From: jbechberger at openjdk.org (Johannes Bechberger) Date: Wed, 20 Dec 2023 13:58:50 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v3] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Wed, 20 Dec 2023 13:52:01 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. >> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust description Looks good to me ------------- Marked as reviewed by jbechberger (Committer). 
PR Review: https://git.openjdk.org/jdk/pull/16903#pullrequestreview-1790946180 From mbaesken at openjdk.org Wed Dec 20 14:12:48 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 20 Dec 2023 14:12:48 GMT Subject: RFR: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library [v3] In-Reply-To: References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Wed, 20 Dec 2023 13:52:01 GMT, Matthias Baesken wrote: >> [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. >> However the information is not added to the JFR events, and this should be enhanced. >> The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Adjust description Hi Johannes, thanks for the review ! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16903#issuecomment-1864539299 From alanb at openjdk.org Wed Dec 20 14:19:04 2023 From: alanb at openjdk.org (Alan Bateman) Date: Wed, 20 Dec 2023 14:19:04 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v8] In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 17:09:59 GMT, Serguei Spitsyn wrote: >> This fix is for JDK 23 but the intention is to back port it to 22 in RDP-1 time frame. 
>> It is fixing a deadlock issue between `VirtualThread` class critical sections with the `interruptLock` (in methods: `unpark()`, `interrupt()`, `getAndClearInterrupt()`, `threadState()`, `toString()`), `JvmtiVTMSTransitionDisabler` and JVMTI `Suspend/Resume` mechanisms. >> The deadlocking scenario is well described by Patricio in a bug report comment. >> In simple words, a virtual thread should not be suspended during 'interruptLock' critical sections. >> >> The fix is to record that a virtual thread is in a critical section (`JavaThread`'s `_in_critical_section` bit) by notifying the VM/JVMTI about begin/end of critical section. >> This bit is used in `HandshakeState::get_op_for_self()` to filter out any `HandshakeOperation` if a target `JavaThread` is in a critical section. >> >> Some of new notifications with `notifyJvmtiSync()` method is on a performance critical path. It is why this method has been intrincified. >> >> New test was developed by Patricio: >> `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> The test is very nice as it reliably in 100% reproduces the deadlock without the fix. >> The test is never failing with this fix. >> >> Testing: >> - tested with newly added test: `test/hotspot/jtreg/serviceability/jvmti/vthread/SuspendWithInterruptLock` >> - tested with mach5 tiers 1-6 > > Serguei Spitsyn has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains 10 commits: > > - Merge > - review: improve an assert message > - review: moved a couple of comments out of try blocks > - review: moved notifyJvmtiDisableSuspend(true) out of try-block > - review: 1) replace CriticalLock with DisableSuspend; 2) minor tweaks > - review: (1) rename notifyJvmti method; (2) add try-final statements to VirtualThread methods > - Resolved merge conflict in VirtualThread.java > - added @summary to new test SuspendWithInterruptLock.java > - add new test SuspendWithInterruptLock.java > - 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable src/hotspot/share/runtime/javaThread.hpp line 652: > 650: > 651: bool is_disable_suspend() const { return _is_disable_suspend; } > 652: void toggle_is_disable_suspend() { _is_disable_suspend = !_is_disable_suspend; }; Looking at this again then I don't think it can be a bit that is toggled on and off will work. Consider the case where several threads attempt to poll the state of a virtual Thread with Thread::getState at the same time. This can't work without an atomic counter and further coordination. So I think further work is required on this issue. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1432770204 From epeter at openjdk.org Wed Dec 20 14:33:03 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 20 Dec 2023 14:33:03 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v8] In-Reply-To: References: Message-ID: > I'm making sure that `allocate_bci_to_data` is only called when holding the `extra_data_lock`, so that no concurrent calls of it can ever occur. > > Testing: tier1-3 and stress. 
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: fix conflicts with tty lock ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/e0fc8d1b..70327d38 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=06-07 Stats: 43 lines in 4 files changed: 15 ins; 0 del; 28 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From stuefe at openjdk.org Wed Dec 20 14:32:55 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Wed, 20 Dec 2023 14:32:55 GMT Subject: RFR: JDK-8320005 : Native library suffix impact on hotspot code in AIX [v7] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: On Wed, 20 Dec 2023 11:16:03 GMT, Suchismith Roy wrote: >> J2SE agent does not start and throws error when it tries to find the shared library ibm_16_am. >> After searching for ibm_16_am.so ,the jvm agent throws and error as dll_load fails.It fails to identify the shared library ibm_16_am.a shared archive file on AIX. >> Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. > > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > Spaces fix Hi, some requests and questions: - Please modify the JBS title, PR title, and JBS issue text to reflect that this adds an alternative shared object loading path for shared objects on AIX. Something like "Allow loading shared objects with .a extension on AIX". Please describe the new logic in the JBS issue text. - Does this really have to be handled in the OpenJDK? What does J9 on AIX do? 
Could this be done in a simpler way outside OpenJDK, e.g. by providing an *.so variant of the library in question? Where does this library come from? - What happens if we accidentally attempt to load a "real" static library, which is also named *.a? Would dlopen() then crash? What would happen? - What happens if the original path handed to os::dll_load is already a *.a file? Should the logic then be reversed? - We really need regression tests for this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1864572287 From epeter at openjdk.org Wed Dec 20 14:38:07 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Wed, 20 Dec 2023 14:38:07 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v9] In-Reply-To: References: Message-ID: > I'm making sure that `allocate_bci_to_data` is only called when holding the `extra_data_lock`, so that no concurrent calls of it can ever occur. > > Testing: tier1-3 and stress. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: remove more locking ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/70327d38..30e5aebc Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=07-08 Stats: 15 lines in 1 file changed: 1 ins; 7 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From jkern at openjdk.org Wed Dec 20 14:53:06 2023 From: jkern at openjdk.org (Joachim Kern) Date: Wed, 20 Dec 2023 14:53:06 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v8] In-Reply-To: References: Message-ID: <1jQHeWyTuhMCLg2f-3-oMUECl3sjvkJNXmQoHYwxu1c=.d3355a3c-fe6f-46bb-bcff-968401a830f6@github.com> > On AIX, repeated 
calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. > > This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). > > We propose a different, cleaner way of handling this: > > - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. > - Cache dl handles; repeated opening of a library should return the cached handle. > - Increase handle-local ref counter on open, Decrease it on close > - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). > > This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. 
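The caching scheme in the list above could look roughly like the sketch below. This is not the os_aix.cpp implementation. `HandleCacheSketch`, `fake_open` and `fake_close` are hypothetical. Keying the cache by path string is an assumption made for brevity, and there is no locking, which real code would need. The fake loader hands out a fresh handle on every call, mimicking the AIX behaviour described above, so the effect of the cache is observable.

```cpp
#include <cassert>
#include <map>
#include <string>

using RawOpen = void* (*)(const char*);
using RawClose = void (*)(void*);

// Hypothetical ref-counted cache of library handles, keyed by path.
class HandleCacheSketch {
 public:
  HandleCacheSketch(RawOpen open_fn, RawClose close_fn)
      : _open(open_fn), _close(close_fn) {}

  // Returns the cached handle if the library is already open,
  // otherwise opens it and caches the result with refcount 1.
  void* load(const std::string& path) {
    auto it = _entries.find(path);
    if (it != _entries.end()) {
      it->second.refcount++;
      return it->second.handle;
    }
    void* h = _open(path.c_str());
    if (h == nullptr) return nullptr;
    _entries.emplace(path, Entry{h, 1});
    return h;
  }

  // Drops one reference; really closes the library when the count hits zero.
  // Returns false if the handle is not known to the cache.
  bool unload(void* handle) {
    for (auto it = _entries.begin(); it != _entries.end(); ++it) {
      if (it->second.handle == handle) {
        if (--it->second.refcount == 0) {
          _close(handle);
          _entries.erase(it);
        }
        return true;
      }
    }
    return false;
  }

 private:
  struct Entry {
    void* handle;
    int refcount;
  };
  RawOpen _open;
  RawClose _close;
  std::map<std::string, Entry> _entries;
};

// Test doubles: every open returns a different address, like dlopen on AIX.
static int g_open_calls = 0;
static int g_close_calls = 0;
void* fake_open(const char*) {
  static char slots[8];
  return &slots[g_open_calls++ % 8];
}
void fake_close(void*) { ++g_close_calls; }
```

With this shape, callers that compare handles for equality see the same value for repeated loads of one library, and every `load` has to be matched by an `unload`, as the last list item requires.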
Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: improve error handling ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16920/files - new: https://git.openjdk.org/jdk/pull/16920/files/f79c89da..7486ddb9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=06-07 Stats: 14 lines in 3 files changed: 1 ins; 6 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/16920.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16920/head:pull/16920 PR: https://git.openjdk.org/jdk/pull/16920 From mbaesken at openjdk.org Wed Dec 20 17:35:57 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Wed, 20 Dec 2023 17:35:57 GMT Subject: Integrated: JDK-8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library In-Reply-To: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> References: <2gUnFY5JoLg3EALGuXTJwhv2oyMm02zL3ckZQ3vkFck=.e5a60d88-99f9-4e97-8bcd-6dee3bf6f208@github.com> Message-ID: On Thu, 30 Nov 2023 14:44:03 GMT, Matthias Baesken wrote: > [JDK-8295159](https://bugs.openjdk.org/browse/JDK-8295159) added some IEEE conformance checks and corrections of the floating point environment on Linux and macOS/BSD, and later some UL logging was added too. > However the information is not added to the JFR events, and this should be enhanced. > The already existing NativeLibraryLoad event can be used for storing the additional information, because the IEEE conformance check and fenv get/set is placed in the HS dlopen_helper , where already the NativeLibraryLoad event objects are created/commited . This pull request has now been integrated. 
Changeset: e2042421 Author: Matthias Baesken URL: https://git.openjdk.org/jdk/commit/e2042421187dafc1aea75ffe15caf8beb824205b Stats: 109 lines in 6 files changed: 42 ins; 45 del; 22 mod 8321017: Record in JFR that IEEE rounding mode was corrupted by loading a library Reviewed-by: stuefe, jbechberger ------------- PR: https://git.openjdk.org/jdk/pull/16903 From kvn at openjdk.org Wed Dec 20 19:26:43 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 20 Dec 2023 19:26:43 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6] In-Reply-To: References: Message-ID: On Thu, 14 Dec 2023 19:44:09 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. >> >> # Testing status >> >> ## tier1 >> >> | | Win | Mac | Linux | >> |----------|---------|---------|---------| >> | ARM64 | | ? | | >> | ARM32 | | | | >> | x86 | | | ? | >> | x64 | | | ? | >> | PPC64 | | | | >> | S390x | | | | >> | RiscV | n/a | n/a | ? | > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: > > - Merge with origin/master > - Fix build, copyright dates, m4 files. > - Fix merge > - Catch up with master branch. > > Merge remote-tracking branch 'origin/master' into reuse-macroasm > - Some inst_mark fixes; Catch up with master. > - Catch up with changes on master > - Reuse same C2_MacroAssembler object to emit instructions. 
src/hotspot/cpu/x86/x86_32.ad line 1541: > 1539: // in the MacroAssembler. Should go away once all "instruct" are > 1540: // patched to emit bytes only using methods in MacroAssembler. > 1541: enc_class SetInstMark %{ Do you have separate RFE for that? ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16484#discussion_r1433098730 From matsaave at openjdk.org Wed Dec 20 19:53:10 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Wed, 20 Dec 2023 19:53:10 GMT Subject: RFR: 8320276: Improve class initialization barrier in TemplateTable::_new [v2] In-Reply-To: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> References: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> Message-ID: > The class initialization barrier in TemplateTable::_new fast path check ensures that the class being instantiated is fully initialized. It can be improved by introducing additional fast path check when current thread is initializer thread as MacroAssembler::clinit_barrier() does. It avoids repeated calls into interpreter runtime for classes being initialized. This patch adds a the optimization for x86 and aarch64. Verified with tier 1-5 tests. 
Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: David comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17006/files - new: https://git.openjdk.org/jdk/pull/17006/files/924f3e76..dc9e5ae3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17006&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17006&range=00-01 Stats: 8 lines in 1 file changed: 0 ins; 1 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/17006.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17006/head:pull/17006 PR: https://git.openjdk.org/jdk/pull/17006 From kvn at openjdk.org Wed Dec 20 20:04:54 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 20 Dec 2023 20:04:54 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6] In-Reply-To: References: Message-ID: <8ZkfLag-QKsU8stjY8amUxO_0-yTr8SlXOSh6nhQ1PM=.6efb1eef-b8a9-437a-9a1f-f5aa74009f0f@github.com> On Thu, 14 Dec 2023 19:44:09 GMT, Cesar Soares Lucas wrote: >> # Description >> >> Please review this PR with a patch to re-use the same C2_MacroAssembler object to emit all instructions in the same compilation unit. >> >> Overall, the change is pretty simple. However, due to the renaming of the variable to access C2_MacroAssembler, from `_masm.` to `masm->`, and also some method prototype changes, the patch became quite large. >> >> # Help Needed for Testing >> >> I don't have access to all platforms necessary to test this. I hope some other folks can help with testing on `S390`, `RISC-V` and `PPC`. >> >> # Testing status >> >> ## tier1 >> >> | | Win | Mac | Linux | >> |----------|---------|---------|---------| >> | ARM64 | | ? | | >> | ARM32 | | | | >> | x86 | | | ? | >> | x64 | | | ? | >> | PPC64 | | | | >> | S390x | | | | >> | RiscV | n/a | n/a | ? | > > Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. 
The pull request now contains seven commits: > > - Merge with origin/master > - Fix build, copyright dates, m4 files. > - Fix merge > - Catch up with master branch. > > Merge remote-tracking branch 'origin/master' into reuse-macroasm > - Some inst_mark fixes; Catch up with master. > - Catch up with changes on master > - Reuse same C2_MacroAssembler object to emit instructions. This looks good. Thank you for such detail work. I will run our testing. src/hotspot/cpu/x86/assembler_x86.cpp line 4248: > 4246: void Assembler::vpermb(XMMRegister dst, XMMRegister nds, XMMRegister src, int vector_len) { > 4247: assert(VM_Version::supports_avx512_vbmi(), ""); > 4248: InstructionMark im(this); May be add short comment why you need `InstructionMark` in these instructions but not in others. ------------- PR Review: https://git.openjdk.org/jdk/pull/16484#pullrequestreview-1791577633 PR Review Comment: https://git.openjdk.org/jdk/pull/16484#discussion_r1433133251 From kvn at openjdk.org Wed Dec 20 20:13:49 2023 From: kvn at openjdk.org (Vladimir Kozlov) Date: Wed, 20 Dec 2023 20:13:49 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6] In-Reply-To: References: Message-ID: On Mon, 18 Dec 2023 18:16:45 GMT, Cesar Soares Lucas wrote: >> It seems odd to me that this substantial and complex patch lacks any justification. As far as I can tell, the decision to make class MacroAssembler very lightweight so that new instances could be created as needed was deliberate. Why change now? Is it performance, or something else? > > @theRealAph , @TheRealMDoerr - I just picked a JBS work item that seemed important (P3..) and started working on it. To me the refactoring made a lot of sense as well - why just create thousands of objects if just a few would do. > > If this is something that doesn't need to be done, please let me know. It already took substantial effort as you said. @JohnTortugo please, merge latest JDK. 
Patch did not apply for RISC-V sources (x_riscv.ad and z_riscv.ad). ------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-1865069941 From sspitsyn at openjdk.org Wed Dec 20 21:06:59 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 20 Dec 2023 21:06:59 GMT Subject: RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable [v8] In-Reply-To: References: Message-ID: <91xGNVFk3N8kP6sdAHhaGp9HhSaGS52_D6xW3aDivSY=.3196af09-9998-462d-9fe0-f91b2afd184a@github.com> On Wed, 20 Dec 2023 14:15:48 GMT, Alan Bateman wrote: > Update: ignore this I mis-read that it updates the current thread's suspend value, not the thread's suspend value. Thanks, Alan. I've also got confused with this and even filed a follow up bug. :) Yes, the initial design was the `_is_disable_suspend` is set/modified/accessed on the current thread only. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17011#discussion_r1433182764 From kbarrett at openjdk.org Wed Dec 20 21:13:53 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 20 Dec 2023 21:13:53 GMT Subject: RFR: 8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap In-Reply-To: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> References: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> Message-ID: <-cpYKTYXVhVfplPdXUKYt0ZcT-ql497vy0UZRrGREtk=.8fd5f503-ae50-4e8e-a5d8-65824465e5fd@github.com> On Tue, 19 Dec 2023 16:59:05 GMT, Emanuel Peter wrote: > [JDK-8247755](https://bugs.openjdk.org/browse/JDK-8247755) introduced the `GrowableArrayCHeap`. This duplicates the current C-Heap allocation capability in `GrowableArray`. I now remove that from `GrowableArray` and move all usages to `GrowableArrayCHeap`. > > This has a few advantages: > - Clear separation between arena (and resource area) allocating array and C-heap allocating array. 
> - We can prevent assigning / copying between arrays of different allocation strategies already at compile time, and not only with asserts at runtime. > - We should not have multiple implementations of the same thing (C-Heap backed array). > - `GrowableArrayCHeap` is NONCOPYABLE. This is a nice restriction, we now know that C-Heap backed arrays do not get copied unknowingly. > > **Bonus** > We can now restrict `GrowableArray` element type `E` to be `std::is_trivially_destructible::value == true`. The idea is that arena / resource allocated arrays get abandoned, often without being even cleared. Hence, the elements in the array are never destructed. But if we only use elements that are trivially destructible, then it makes no difference if the destructors are ever called, or the elements simply abandoned. > > For `GrowableArrayCHeap`, we expect that the user eventually calls the destructor for the array, which in turn calls the destructors of the remaining elements. Hence, it is up to the user to ensure the cleanup. And so we can allow non-trivial destructors. > > **Testing** > Tier1-3 + stress testing: pending pre-existing: There are a lot of non-static class data members that are pointers to GrowableArray that seem like they would be better as direct, e.g. non-pointers. pre-existing: There are a lot of iterations over GrowableArray's that would be simplified by using range-based-for. I'm not a fan of the additional clutter in APIs that the static memory types add. If we had a variant of GrowableArrayCHeap that was not itself dynamically allocatable and took a memory type to use internally as a constructor argument, then I think a lot of that clutter could be eliminated. It could be used for ordinary data members that are direct GAs rather than pointers to GAs. I think there is a way to do something similar for static data members that are pointers that are dynamically allocated later, though that probably requires more work. 
I've not yet reviewed the changes to growableArray.[ch]pp, nor the test changes. But I've run out of time and energy for this for today.

src/hotspot/share/cds/dumpTimeClassInfo.hpp line 162:

> 160: private:
> 161:   template
> 162:   static int array_length_or_zero(GrowableArrayCHeap* array) {

Argument could be `GrowableArrayView*`, removing the coupling on the memory type. Also, pre-existing: the argument should be const.

src/hotspot/share/cds/metaspaceShared.cpp line 441:

> 439:
> 440:   void dump_java_heap_objects(GrowableArrayCHeap* klasses) NOT_CDS_JAVA_HEAP_RETURN;
> 441:   void dump_shared_symbol_table(GrowableArrayView* symbols) {

pre-existing: Perhaps the arguments to these should be const.

src/hotspot/share/cds/metaspaceShared.cpp line 840:

> 838:
> 839: #if INCLUDE_CDS_JAVA_HEAP
> 840: void VM_PopulateDumpSharedSpace::dump_java_heap_objects(GrowableArrayCHeap* klasses) {

pre-existing: Perhaps the argument should be const.

src/hotspot/share/classfile/compactHashtable.cpp line 54:

> 52:
> 53:   _num_entries_written = 0;
> 54:   _buckets = NEW_C_HEAP_ARRAY(EntryBucket*, _num_buckets, mtSymbol);

pre-existing: It seems like the code could be simpler if the type of _buckets was GrowableArrayCHeap.

src/hotspot/share/classfile/javaClasses.cpp line 1824:

> 1822:     // Pick minimum length that will cover most cases
> 1823:     int init_length = 64;
> 1824:     _methods = new GrowableArrayCHeap(init_length);

Consider renaming init_length => init_capacity.

src/hotspot/share/code/codeCache.hpp line 92:

> 90:  private:
> 91:   // CodeHeaps of the cache
> 92:   typedef GrowableArrayCHeap CodeHeapArray;

pre-existing: Consider moving CodeHeapArray to namespace scope and prefer using it to the long-form. If not at namespace scope, it could at least be public in this class and used throughout, including in public APIs.

src/hotspot/share/memory/arena.hpp line 209:

> 207:
> 208: #ifdef ASSERT
> 209: bool Arena_contains(const Arena* arena, const void* ptr);

This function doesn't seem necessary.
Directly calling arena->contains(ptr) in the one place it's being seems like it should suffice. src/hotspot/share/memory/heapInspection.cpp line 282: > 280: KlassInfoHisto::KlassInfoHisto(KlassInfoTable* cit) : > 281: _cit(cit) { > 282: _elements = new GrowableArrayCHeap(_histo_initial_size); pre-existing: Why is this initialization separate from the ctor-initializer? And this looks like an example of where it would be better as a direct GA member rather than a pointer to GA. ------------- Changes requested by kbarrett (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17160#pullrequestreview-1790925376 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1432733463 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433109448 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433110835 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433129906 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433133733 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433140564 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433160909 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433164132 From kbarrett at openjdk.org Wed Dec 20 21:13:54 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Wed, 20 Dec 2023 21:13:54 GMT Subject: RFR: 8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap In-Reply-To: <-cpYKTYXVhVfplPdXUKYt0ZcT-ql497vy0UZRrGREtk=.8fd5f503-ae50-4e8e-a5d8-65824465e5fd@github.com> References: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> <-cpYKTYXVhVfplPdXUKYt0ZcT-ql497vy0UZRrGREtk=.8fd5f503-ae50-4e8e-a5d8-65824465e5fd@github.com> Message-ID: On Wed, 20 Dec 2023 19:37:52 GMT, Kim Barrett wrote: >> [JDK-8247755](https://bugs.openjdk.org/browse/JDK-8247755) introduced the `GrowableArrayCHeap`. 
This duplicates the current C-Heap allocation capability in `GrowableArray`. I now remove that from `GrowableArray` and move all usages to `GrowableArrayCHeap`. >> >> This has a few advantages: >> - Clear separation between arena (and resource area) allocating array and C-heap allocating array. >> - We can prevent assigning / copying between arrays of different allocation strategies already at compile time, and not only with asserts at runtime. >> - We should not have multiple implementations of the same thing (C-Heap backed array). >> - `GrowableArrayCHeap` is NONCOPYABLE. This is a nice restriction, we now know that C-Heap backed arrays do not get copied unknowingly. >> >> **Bonus** >> We can now restrict `GrowableArray` element type `E` to be `std::is_trivially_destructible::value == true`. The idea is that arena / resource allocated arrays get abandoned, often without being even cleared. Hence, the elements in the array are never destructed. But if we only use elements that are trivially destructible, then it makes no difference if the destructors are ever called, or the elements simply abandoned. >> >> For `GrowableArrayCHeap`, we expect that the user eventually calls the destructor for the array, which in turn calls the destructors of the remaining elements. Hence, it is up to the user to ensure the cleanup. And so we can allow non-trivial destructors. >> >> **Testing** >> Tier1-3 + stress testing: pending > > src/hotspot/share/cds/metaspaceShared.cpp line 840: > >> 838: >> 839: #if INCLUDE_CDS_JAVA_HEAP >> 840: void VM_PopulateDumpSharedSpace::dump_java_heap_objects(GrowableArrayCHeap* klasses) { > > pre-existing: Perhaps the argument should be const. pre-existing and can't attach comment to line#50: `int i;` is dead variable. 
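
To make the **Bonus** point quoted above concrete, here is a minimal sketch of the two policies. `ArenaArraySketch`, `CHeapArraySketch` and `Counted` are hypothetical names used only for illustration; they are not HotSpot's `GrowableArray` / `GrowableArrayCHeap`, and storage handling is reduced to a caller-supplied buffer and plain `malloc`.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <type_traits>

// Hypothetical stand-in for an arena/resource-allocated array.
// The backing storage is simply abandoned, so element destructors never run;
// the static_assert rejects element types for which that would matter.
template <typename E>
class ArenaArraySketch {
  static_assert(std::is_trivially_destructible<E>::value,
                "arena-backed elements are abandoned, never destructed");
  E*  _data;
  int _len;
  int _cap;
 public:
  // 'storage' is assumed to be suitably aligned and owned by some arena.
  ArenaArraySketch(void* storage, int cap)
    : _data(static_cast<E*>(storage)), _len(0), _cap(cap) {}
  bool append(const E& e) {
    if (_len == _cap) return false;          // no growing in this sketch
    ::new (static_cast<void*>(_data + _len)) E(e);
    ++_len;
    return true;
  }
  int length() const { return _len; }
  const E& at(int i) const { return _data[i]; }
  // Intentionally no destructor: nothing to clean up.
};

// Hypothetical stand-in for a C-heap-backed array.
// It is non-copyable and its destructor destroys the remaining elements,
// so non-trivial element destructors are acceptable here.
template <typename E>
class CHeapArraySketch {
  E*  _data;
  int _len;
  int _cap;
 public:
  explicit CHeapArraySketch(int cap)
    : _data(static_cast<E*>(std::malloc(sizeof(E) * cap))), _len(0), _cap(cap) {}
  CHeapArraySketch(const CHeapArraySketch&) = delete;
  CHeapArraySketch& operator=(const CHeapArraySketch&) = delete;
  ~CHeapArraySketch() {
    for (int i = 0; i < _len; i++) _data[i].~E();
    std::free(_data);
  }
  bool append(const E& e) {
    if (_data == nullptr || _len == _cap) return false;
    ::new (static_cast<void*>(_data + _len)) E(e);
    ++_len;
    return true;
  }
  int length() const { return _len; }
};

// Element type with a non-trivial destructor, used to observe cleanup.
struct Counted {
  static int live;
  Counted() { ++live; }
  Counted(const Counted&) { ++live; }
  ~Counted() { --live; }
};
int Counted::live = 0;
```

With these definitions, `ArenaArraySketch<Counted>` fails to compile, while `CHeapArraySketch<Counted>` compiles and runs the element destructors when the array itself is destroyed.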
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433116120 From sspitsyn at openjdk.org Wed Dec 20 21:34:19 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Wed, 20 Dec 2023 21:34:19 GMT Subject: [jdk22] RFR: 8311218: fatal error: stuck in JvmtiVTMSTransitionDisabler::VTMS_transition_disable Message-ID: Hi all, This pull request contains a backport of commit [0f8e4e0a](https://github.com/openjdk/jdk/commit/0f8e4e0a81257c678e948c341a241dc0b810494f) from the [openjdk/jdk](https://git.openjdk.org/jdk) repository. The commit being backported was authored by Serguei Spitsyn on 19 Dec 2023 and was reviewed by Leonid Mesnik and Alan Bateman. Thanks! ------------- Commit messages: - Backport 0f8e4e0a81257c678e948c341a241dc0b810494f Changes: https://git.openjdk.org/jdk22/pull/23/files Webrev: https://webrevs.openjdk.org/?repo=jdk22&pr=23&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8311218 Stats: 229 lines in 15 files changed: 196 ins; 0 del; 33 mod Patch: https://git.openjdk.org/jdk22/pull/23.diff Fetch: git fetch https://git.openjdk.org/jdk22.git pull/23/head:pull/23 PR: https://git.openjdk.org/jdk22/pull/23 From dcubed at openjdk.org Wed Dec 20 22:04:48 2023 From: dcubed at openjdk.org (Daniel D. Daugherty) Date: Wed, 20 Dec 2023 22:04:48 GMT Subject: RFR: 8237842: Separate definitions for default cache line and padding sizes [v2] In-Reply-To: References: Message-ID: On Wed, 6 Dec 2023 22:34:24 GMT, Vladimir Kozlov wrote: > @dcubed-ojdk, as author of [JDK-8049737](https://bugs.openjdk.org/browse/JDK-8049737) changes, do you remember why we use double cacheline for padding? Sorry, I've been pretty much off the air w.r.t e-mail due to problems with my MBP13. We used double cache line size way back then because some hardware fetched two cache lines worth. I believe that was true of some Intel versions and SPARCV9. I got the two cache line worth request from Dave Dice. 
Way back in the time frame of JDK-8049737 I wrote some micro benchmarks that we tested with no padding, with one cache line worth of padding and two cache lines worth of padding and we saw performance improvements on linux-x64, solaris-sparc64, solaris-x64 and I think smaller improvements on windows-x64. We tried to convert my microbenchmarks into JMHs without success. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16973#issuecomment-1865199986 From sviswanathan at openjdk.org Wed Dec 20 22:35:43 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Wed, 20 Dec 2023 22:35:43 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v4] In-Reply-To: <6xRfGFR2RIYln31ivz1ZITs5G5bzQ5BlnPdsQeyVbX0=.2797afc7-40ec-482e-9c64-b3ba746d2bd7@github.com> References: <6xRfGFR2RIYln31ivz1ZITs5G5bzQ5BlnPdsQeyVbX0=.2797afc7-40ec-482e-9c64-b3ba746d2bd7@github.com> Message-ID: <1nXsCTf2xdJslW020JbSxgLRbdcz3VHBbFRw_SuT2bo=.12349ba4-d5bd-4709-a9fc-5f184b54a6e3@github.com> On Tue, 19 Dec 2023 18:42:19 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. 
The benchmark numbers:
>>
>>
>> Benchmark                                  Score      Latest     Latest/Score
>> StringIndexOf.advancedWithMediumSub        343.573    317.934    0.925375393x
>> StringIndexOf.advancedWithShortSub1        1039.081   1053.96    1.014319384x
>> StringIndexOf.advancedWithShortSub2        55.828     110.541    1.980027943x
>> StringIndexOf.constantPattern              9.361      11.906     1.271872663x
>> StringIndexOf.searchCharLongSuccess        4.216      4.218      1.000474383x
>> StringIndexOf.searchCharMediumSuccess      3.133      3.216      1.02649218x
>> StringIndexOf.searchCharShortSuccess       3.76       3.761      1.000265957x
>> StringIndexOf.success                      9.186      9.713      1.057369911x
>> StringIndexOf.successBig                   14.341     46.343     3.231504079x
>> StringIndexOfChar.latin1_AVX2_String       6220.918   12154.52   1.953814533x
>> StringIndexOfChar.latin1_AVX2_char         5503.556   5540.044   1.006629895x
>> StringIndexOfChar.latin1_SSE4_String       6978.854   6818.689   0.977049957x
>> StringIndexOfChar.latin1_SSE4_char         5657.499   5474.624   0.967675646x
>> StringIndexOfChar.latin1_Short_String      7132.541   6863.359   0.962260014x
>> StringIndexOfChar.latin1_Short_char        16013.389  16162.437  1.009307711x
>> StringIndexOfChar.latin1_mixed_String      7386.123   14771.622  1.999915517x
>> StringIndexOfChar.latin1_mixed_char        9901.671   9782.245   0.987938803
>
> Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision:
>
>   Fix for JDK-8321599

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2543:

> 2541:
> 2542:     // Strip pad characters, if any, and adjust length and mask
> 2543:     __ addq(length, start_offset);

This change is unrelated to this PR.

src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 2548:

> 2546:
> 2547:     __ BIND(L_donePadding);
> 2548:     __ subq(length, start_offset);

This change is unrelated to this PR.

src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 3:

> 1: /*
> 2:  * Copyright (c) 2023, Intel Corporation. All rights reserved.
> 3:  * Intel Math Library (LIBM) Source Code

Please remove the line "Intel Math Library ..." as this is not from there.
test/jdk/java/lang/StringBuffer/IndexOf.java line 45: > 43: System.err.println(gg); > 44: > 45: } else { Some changes in this test file looks like a leftover from debugging. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1433245292 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1433245471 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1433245843 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1433247486 From sviswanathan at openjdk.org Wed Dec 20 22:35:46 2023 From: sviswanathan at openjdk.org (Sandhya Viswanathan) Date: Wed, 20 Dec 2023 22:35:46 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v2] In-Reply-To: References: Message-ID: On Wed, 29 Nov 2023 15:01:32 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark Score Latest >> StringIndexOf.advancedWithMediumSub 343.573 317.934 0.925375393x >> StringIndexOf.advancedWithShortSub1 1039.081 1053.96 1.014319384x >> StringIndexOf.advancedWithShortSub2 55.828 110.541 1.980027943x >> StringIndexOf.constantPattern 9.361 11.906 1.271872663x >> StringIndexOf.searchCharLongSuccess 4.216 4.218 1.000474383x >> StringIndexOf.searchCharMediumSuccess 3.133 3.216 1.02649218x >> StringIndexOf.searchCharShortSuccess 3.76 3.761 1.000265957x >> StringIndexOf.success 9.186 9.713 1.057369911x >> StringIndexOf.successBig 14.341 46.343 3.231504079x >> StringIndexOfChar.latin1_AVX2_String 6220.918 12154.52 1.953814533x >> StringIndexOfChar.latin1_AVX2_char 5503.556 5540.044 1.006629895x >> StringIndexOfChar.latin1_SSE4_String 6978.854 6818.689 0.977049957x >> StringIndexOfChar.latin1_SSE4_char 5657.499 5474.624 0.967675646x >> StringIndexOfChar.latin1_Short_String 7132.541 6863.359 0.962260014x >> StringIndexOfChar.latin1_Short_char 
16013.389 16162.437 1.009307711x >> StringIndexOfChar.latin1_mixed_String 7386.123 14771.622 1.999915517x >> StringIndexOfChar.latin1_mixed_char 9901.671 9782.245 0.987938803 > > Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: > > Only use optimization when EnableX86ECoreOpts is true src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 42: > 40: // 2. Broadcast the last byte of the needle to a different ymm register > 41: // 3. Compare the first-byte ymm register to the first 32 bytes of the haystack > 42: // 4. Compare the last-byte register to the 32 bytes of the haystack at the (k-1)st position It would be good to mention that k is the length of the needle. src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 117: > 115: /******************************************************************************/ > 116: > 117: void StubGenerator::loop_helper(int size, Label& bailout, Label& loop_top) { Let us name this method as string_indexof_loop_helper() instead of just loop_helper(). ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1409626570 PR Review Comment: https://git.openjdk.org/jdk/pull/16753#discussion_r1409623608 From duke at openjdk.org Wed Dec 20 23:20:58 2023 From: duke at openjdk.org (duke) Date: Wed, 20 Dec 2023 23:20:58 GMT Subject: Withdrawn: 8317466: Enable interpreter oopMapCache for concurrent GCs In-Reply-To: References: Message-ID: On Fri, 6 Oct 2023 13:25:27 GMT, Zhengyu Gu wrote: > Interpreter oop maps are computed lazily during GC root scan and they are expensive to compute. > > GCs uses a small hash table per instance class to cache computed oop maps during STW root scan, but not for concurrent root scan. > > This patch is intended to enable `OopMapCache` for concurrent GCs. > > Test: > tier1 and tier2 fastdebug and release on MacOSX, Linux 86_84 and Linux 86_32. This pull request has been closed without being integrated. 
------------- PR: https://git.openjdk.org/jdk/pull/16074 From iklam at openjdk.org Wed Dec 20 23:30:09 2023 From: iklam at openjdk.org (Ioi Lam) Date: Wed, 20 Dec 2023 23:30:09 GMT Subject: [jdk22] RFR: 8322321: Add man page doc for -XX:+VerifySharedSpaces Message-ID: <1x5AhIZGNf73hGbQpCgSAm4MxuUv-iaDZCIgmBXNw9I=.f7986b45-5d89-4de6-8761-3392e416c3ca@github.com> Clean backport. ------------- Commit messages: - 8322321: Add man page doc for -XX:+VerifySharedSpaces Changes: https://git.openjdk.org/jdk22/pull/24/files Webrev: https://webrevs.openjdk.org/?repo=jdk22&pr=24&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322321 Stats: 9 lines in 1 file changed: 9 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk22/pull/24.diff Fetch: git fetch https://git.openjdk.org/jdk22.git pull/24/head:pull/24 PR: https://git.openjdk.org/jdk22/pull/24 From mdoerr at openjdk.org Thu Dec 21 00:13:56 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 21 Dec 2023 00:13:56 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v8] In-Reply-To: <1jQHeWyTuhMCLg2f-3-oMUECl3sjvkJNXmQoHYwxu1c=.d3355a3c-fe6f-46bb-bcff-968401a830f6@github.com> References: <1jQHeWyTuhMCLg2f-3-oMUECl3sjvkJNXmQoHYwxu1c=.d3355a3c-fe6f-46bb-bcff-968401a830f6@github.com> Message-ID: On Wed, 20 Dec 2023 14:53:06 GMT, Joachim Kern wrote: >> On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. >> >> This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. 
It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). >> >> We propose a different, cleaner way of handling this: >> >> - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. >> - Cache dl handles; repeated opening of a library should return the cached handle. >> - Increase handle-local ref counter on open, Decrease it on close >> - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). >> >> This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > improve error handling A pretty complex solution, but I couldn't spot any real bug. Please consider my suggestions. src/hotspot/os/aix/porting_aix.cpp line 25: > 23: */ > 24: // needs to be defined first, so that the implicit loaded xcoff.h header defines > 25: // the right structures to analyze the loader header of 32 and 64 Bit executable files I don't think we support 32 bit executables. src/hotspot/os/aix/porting_aix.cpp line 916: > 914: constexpr int max_handletable = 1024; > 915: static int g_handletable_used = 0; > 916: static struct handletableentry g_handletable[max_handletable] = {{0, 0, 0, 0}}; Wouldn't `ConcurrentHashTable` be a better data structure? It is already used in hotspot, can grow dynamically and doesn't need linear search. src/hotspot/os/aix/porting_aix.cpp line 921: > 919: // If the libpath cannot be retrieved return an empty path > 920: static const char* rtv_linkedin_libpath() { > 921: static char buffer[4096]; Maybe define a constant for the buffer size? 
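
For readers following the design discussion: the scheme proposed in the PR description (cache dl handles, return the cached handle on repeated opens, keep a per-handle reference count) could be sketched roughly as below. This is only an illustration under stated assumptions, not the code under review: `HandleCacheSketch`, `fake_open` and `fake_close` are hypothetical names, the cache key is just the path string, `std::unordered_map` stands in for whatever table the real code uses, and there is no locking.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical ref-counted handle cache. The underlying open/close functions
// are injected so the sketch does not depend on dlopen/dlclose.
class HandleCacheSketch {
 public:
  typedef void* (*OpenFn)(const char*);
  typedef int (*CloseFn)(void*);

  HandleCacheSketch(OpenFn open_fn, CloseFn close_fn)
    : _open(open_fn), _close(close_fn) {}

  // Returns the cached handle if 'key' was opened before, bumping its
  // reference count; otherwise opens it and caches the new handle.
  void* load(const std::string& key) {
    auto it = _entries.find(key);
    if (it != _entries.end()) {
      it->second.refcount++;
      return it->second.handle;
    }
    void* h = _open(key.c_str());
    if (h == nullptr) return nullptr;
    Entry e = { h, 1 };
    _entries.emplace(key, e);
    return h;
  }

  // Drops one reference; the underlying close only happens when the count
  // reaches zero. Returns false for an unknown handle or a failed close.
  // The reverse lookup by handle is linear in this sketch.
  bool unload(void* handle) {
    for (auto it = _entries.begin(); it != _entries.end(); ++it) {
      if (it->second.handle == handle) {
        if (--it->second.refcount > 0) return true;
        bool ok = (_close(handle) == 0);
        _entries.erase(it);
        return ok;
      }
    }
    return false;
  }

 private:
  struct Entry {
    void* handle;
    int   refcount;
  };
  OpenFn  _open;
  CloseFn _close;
  std::unordered_map<std::string, Entry> _entries;
};

// Fake loader that hands out a different handle on every call, mimicking the
// behaviour described for AIX in the PR description.
int g_fake_slots[8];
int g_fake_opens = 0;
int g_fake_closes = 0;
void* fake_open(const char*) { return &g_fake_slots[g_fake_opens++ % 8]; }
int fake_close(void*) { ++g_fake_closes; return 0; }
```

Even though `fake_open` never returns the same handle twice, callers of `load` see equal handles for equal keys, and the underlying close happens only once the last reference is dropped.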
src/hotspot/os/aix/porting_aix.cpp line 927: > 925: // let libpath point to buffer, which then contains a valid libpath > 926: // or an empty string > 927: if (libpath) { `!= nullptr` is common in hotspot. src/hotspot/os/aix/porting_aix.cpp line 934: > 932: // to open it > 933: snprintf(buffer, 100, "/proc/%ld/object/a.out", (long)getpid()); > 934: FILE* f = 0; Should be nullptr. src/hotspot/os/aix/porting_aix.cpp line 990: > 988: } > 989: ret = (0 == stat64x(combined.base(), stat)); > 990: os::free (path2); Please remove the extra whitespace. src/hotspot/os/aix/porting_aix.cpp line 1026: > 1024: > 1025: os::free (libpath); > 1026: os::free (path2); Same here. ------------- PR Review: https://git.openjdk.org/jdk/pull/16920#pullrequestreview-1791807521 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433267331 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433283111 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433273616 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433270399 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433289382 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433290839 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433291127 From gcao at openjdk.org Thu Dec 21 01:31:59 2023 From: gcao at openjdk.org (Gui Cao) Date: Thu, 21 Dec 2023 01:31:59 GMT Subject: Integrated: 8321972: test runtime/Unsafe/InternalErrorTest.java timeout on linux-riscv64 platform In-Reply-To: <9rOe1C_eoD2fz22nqlfzaK5kJ_gxYZBsVVRe4hQwhaw=.abf30b11-de1f-456e-baa4-208755e136ee@github.com> References: <9rOe1C_eoD2fz22nqlfzaK5kJ_gxYZBsVVRe4hQwhaw=.abf30b11-de1f-456e-baa4-208755e136ee@github.com> Message-ID: On Thu, 14 Dec 2023 08:28:42 GMT, Gui Cao wrote: > As described on the JBS issue, JDK-8320886 extended InternalErrorTest.java adding extra test for Unsafe_SetMemory0 trying to access next page after truncation. 
This triggers SIGBUS error and control flow is transfered to JVM signal handler [1]. But the current logic doesn't consider 16-bit compressed instructions when calculating next_pc. It always add NativeCall::instruction_size which is 4 to pc and use the result as next_pc. This is not correct as the memset invoked in this case contains compressed instructions and it is those instructions that are triggering the SIGBUS error. > > The proposed fix is similar with other platform with variable-length instruction encoding like x86. > The encoding of the instruction triggering the SIGBUS error is checked to see if it is a compressed instruction and then calculate next_pc based on that. The test case can now pass normally with this fix. > > [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/linux_riscv/os_linux_riscv.cpp#L274 > > ### Testing: > - [x] Run tier1-3 tests on qemu 8.1.50 with UseRVV (release) This pull request has now been integrated. Changeset: e8768ae0 Author: Gui Cao Committer: Fei Yang URL: https://git.openjdk.org/jdk/commit/e8768ae08dbee9c3e1ed01934142c03ffad5f349 Stats: 13 lines in 2 files changed: 11 ins; 0 del; 2 mod 8321972: test runtime/Unsafe/InternalErrorTest.java timeout on linux-riscv64 platform Co-authored-by: Fei Yang Reviewed-by: fyang ------------- PR: https://git.openjdk.org/jdk/pull/17103 From ccheung at openjdk.org Thu Dec 21 01:45:50 2023 From: ccheung at openjdk.org (Calvin Cheung) Date: Thu, 21 Dec 2023 01:45:50 GMT Subject: [jdk22] RFR: 8322321: Add man page doc for -XX:+VerifySharedSpaces In-Reply-To: <1x5AhIZGNf73hGbQpCgSAm4MxuUv-iaDZCIgmBXNw9I=.f7986b45-5d89-4de6-8761-3392e416c3ca@github.com> References: <1x5AhIZGNf73hGbQpCgSAm4MxuUv-iaDZCIgmBXNw9I=.f7986b45-5d89-4de6-8761-3392e416c3ca@github.com> Message-ID: On Wed, 20 Dec 2023 23:21:41 GMT, Ioi Lam wrote: > Clean backport. Marked as reviewed by ccheung (Reviewer). 
------------- PR Review: https://git.openjdk.org/jdk22/pull/24#pullrequestreview-1791988134 From gcao at openjdk.org Thu Dec 21 02:11:51 2023 From: gcao at openjdk.org (Gui Cao) Date: Thu, 21 Dec 2023 02:11:51 GMT Subject: [jdk22] RFR: 8322154: RISC-V: JDK-8315743 missed change in MacroAssembler::load_reserved In-Reply-To: <6N7VbvE9dr0-08boMhEdpvvMMvurL_e-xUD1kp7uSVY=.77cf21f4-fa12-4c9a-bb9a-9d5d9b794ecd@github.com> References: <6N7VbvE9dr0-08boMhEdpvvMMvurL_e-xUD1kp7uSVY=.77cf21f4-fa12-4c9a-bb9a-9d5d9b794ecd@github.com> Message-ID: On Wed, 20 Dec 2023 00:43:40 GMT, Fei Yang wrote: >> Clean backport which adds back missing code change in MacroAssembler::load_reserved in file src/hotspot/cpu/riscv/macroAssembler_riscv.cpp for https://bugs.openjdk.org/browse/JDK-8315743. This is a riscv-specific change, risk is low. >> >> - [x] Run tier1 tests on qemu 8.1.50 with UseRVV (release) > > Marked as reviewed by fyang (Reviewer). @RealFYang : Thanks for taking a look. ------------- PR Comment: https://git.openjdk.org/jdk22/pull/19#issuecomment-1865375238 From gcao at openjdk.org Thu Dec 21 02:11:52 2023 From: gcao at openjdk.org (Gui Cao) Date: Thu, 21 Dec 2023 02:11:52 GMT Subject: [jdk22] Integrated: 8322154: RISC-V: JDK-8315743 missed change in MacroAssembler::load_reserved In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 08:30:43 GMT, Gui Cao wrote: > Clean backport which adds back missing code change in MacroAssembler::load_reserved in file src/hotspot/cpu/riscv/macroAssembler_riscv.cpp for https://bugs.openjdk.org/browse/JDK-8315743. This is a riscv-specific change, risk is low. > > - [x] Run tier1 tests on qemu 8.1.50 with UseRVV (release) This pull request has now been integrated. 
Changeset: c249229b Author: Gui Cao Committer: Fei Yang URL: https://git.openjdk.org/jdk22/commit/c249229b3cdcdac81187cd9cd99267cc2ced64ea Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod 8322154: RISC-V: JDK-8315743 missed change in MacroAssembler::load_reserved Reviewed-by: fyang Backport-of: 59073fa3eb7d04d9e0f08fbef70c9db6ffde296a ------------- PR: https://git.openjdk.org/jdk22/pull/19 From dholmes at openjdk.org Thu Dec 21 02:24:48 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 21 Dec 2023 02:24:48 GMT Subject: RFR: 8319548: Unexpected internal name for Filler array klass causes error in VisualVM In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 10:08:14 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that changes the filler array class name (again) after user feedback. > > In particular, the previous name `Ljdk/internal/vm/FillerArray;` confuses some tools (https://github.com/oracle/visualvm/issues/523). I.e. it's not an array, but still variable sized. > This change adds the `[` array bracket, and renames the element name to not have `Array` inside to not try to pretend that the element is some other kind of array. > > Testing: tier1-6 > > Thanks, > Thomas Looks good. Thanks ------------- Marked as reviewed by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17155#pullrequestreview-1792011042 From dholmes at openjdk.org Thu Dec 21 02:24:49 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 21 Dec 2023 02:24:49 GMT Subject: RFR: 8319548: Unexpected internal name for Filler array klass causes error in VisualVM In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 10:15:44 GMT, Thomas Schatzl wrote: >> Hi all, >> >> please review this change that changes the filler array class name (again) after user feedback. >> >> In particular, the previous name `Ljdk/internal/vm/FillerArray;` confuses some tools (https://github.com/oracle/visualvm/issues/523). I.e. 
it's not an array, but still variable sized. >> This change adds the `[` array bracket, and renames the element name to not have `Array` inside to not try to pretend that the element is some other kind of array. >> >> Testing: tier1-6 >> >> Thanks, >> Thomas > > Fwiw, the original issue introducing details the advantages too https://bugs.openjdk.org/browse/JDK-8284435; it does not particularly point out how much memory this saves, but it mentions that it removes the need for keeping around an extra mark bitmap covering the whole Java heap (that 1.5%). Thanks for clarifying @tschatzl that these pretend arrays are only exposed in this one case. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17155#issuecomment-1865383374 From iklam at openjdk.org Thu Dec 21 02:27:46 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 21 Dec 2023 02:27:46 GMT Subject: [jdk22] RFR: 8322321: Add man page doc for -XX:+VerifySharedSpaces In-Reply-To: References: <1x5AhIZGNf73hGbQpCgSAm4MxuUv-iaDZCIgmBXNw9I=.f7986b45-5d89-4de6-8761-3392e416c3ca@github.com> Message-ID: On Thu, 21 Dec 2023 01:42:57 GMT, Calvin Cheung wrote: >> Clean backport. > > Marked as reviewed by ccheung (Reviewer). Thanks @calvinccheung ------------- PR Comment: https://git.openjdk.org/jdk22/pull/24#issuecomment-1865385844 From iklam at openjdk.org Thu Dec 21 02:27:48 2023 From: iklam at openjdk.org (Ioi Lam) Date: Thu, 21 Dec 2023 02:27:48 GMT Subject: [jdk22] Integrated: 8322321: Add man page doc for -XX:+VerifySharedSpaces In-Reply-To: <1x5AhIZGNf73hGbQpCgSAm4MxuUv-iaDZCIgmBXNw9I=.f7986b45-5d89-4de6-8761-3392e416c3ca@github.com> References: <1x5AhIZGNf73hGbQpCgSAm4MxuUv-iaDZCIgmBXNw9I=.f7986b45-5d89-4de6-8761-3392e416c3ca@github.com> Message-ID: On Wed, 20 Dec 2023 23:21:41 GMT, Ioi Lam wrote: > Clean backport. This pull request has now been integrated. 
Changeset: ea6d79ff
Author:    Ioi Lam
URL:       https://git.openjdk.org/jdk22/commit/ea6d79ff94b029dbcc7162556cc3e1f470ffbd3e
Stats:     9 lines in 1 file changed: 9 ins; 0 del; 0 mod

8322321: Add man page doc for -XX:+VerifySharedSpaces

Reviewed-by: ccheung
Backport-of: f7dc257a206d3104d6d24c2079ef1fe349368c49

-------------

PR: https://git.openjdk.org/jdk22/pull/24

From sspitsyn at openjdk.org Thu Dec 21 02:36:05 2023
From: sspitsyn at openjdk.org (Serguei Spitsyn)
Date: Thu, 21 Dec 2023 02:36:05 GMT
Subject: RFR: 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI
Message-ID:

The macro `#else` branch conditions of `#if INCLUDE_JVMTI` in the `JVM_VirtualThread*` methods/functions (see `jvm.cpp`) are incorrect and have to be removed.
For example, the lines 4022-4023 have to be removed from the fragment below:

4013 JVM_ENTRY(void, JVM_VirtualThreadDisableSuspend(JNIEnv* env, jobject vthread, jboolean enter))
4014 #if INCLUDE_JVMTI
4015   if (!DoJVMTIVirtualThreadTransitions) {
4016     assert(!JvmtiExport::can_support_virtual_threads(), "sanity check");
4017     return;
4018   }
4019   assert(thread->is_disable_suspend() != (bool)enter,
4020          "nested or unbalanced monitor enter/exit is not allowed");
4021   thread->toggle_is_disable_suspend();
4022 #else
4023   fatal("Should only be called with JVMTI enabled");
4024 #endif
4025 JVM_END

-------------

Commit messages:
 - 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI

Changes: https://git.openjdk.org/jdk/pull/17174/files
 Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17174&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8322538
  Stats: 12 lines in 1 file changed: 0 ins; 12 del; 0 mod
  Patch: https://git.openjdk.org/jdk/pull/17174.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/17174/head:pull/17174

PR: https://git.openjdk.org/jdk/pull/17174

From dholmes at openjdk.org Thu Dec 21 02:54:48 2023
From: dholmes at openjdk.org (David Holmes)
Date: Thu, 21 Dec 2023 02:54:48 GMT
Subject: RFR:
8320276: Improve class initialization barrier in TemplateTable::_new [v2] In-Reply-To: References: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> Message-ID: On Wed, 20 Dec 2023 19:53:10 GMT, Matias Saavedra Silva wrote: >> The class initialization barrier in the TemplateTable::_new fast path check ensures that the class being instantiated is fully initialized. It can be improved by introducing an additional fast path check for the case where the current thread is the initializer thread, as MacroAssembler::clinit_barrier() does. It avoids repeated calls into the interpreter runtime for classes being initialized. This patch adds the optimization for x86 and aarch64. Verified with tier 1-5 tests. > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > David comments The code itself seems fine. But I don't see any supporting evidence of how this optimisation performs. IIUC the only case it benefits is when a static initializer creates an instance of the class being initialized - how typical is that? The definitions of `supports_fast_class_init_checks` all have the comment: // ... supports fast class initialization checks for static methods. which needs updating now it is not just for static methods. (Actually the comments became out-dated with [JDK-8223216](https://bugs.openjdk.org/browse/JDK-8223216) but it would be nice to fix them all.) Would be nice to see this applied across all platforms too - or at least those that support fast_class_init - ppc and s390. Perhaps file follow-up RFEs for those platforms so this is not forgotten. I note that RISCV64 doesn't support fast_class_init at all yet. Thanks ------------- Changes requested by dholmes (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/17006#pullrequestreview-1792072903 From dholmes at openjdk.org Thu Dec 21 05:25:48 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 21 Dec 2023 05:25:48 GMT Subject: RFR: 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 02:29:33 GMT, Serguei Spitsyn wrote: > The macro `#else` branch conditions of `#if INCLUDE_JVMTI` in the `JVM_VirtualThread*` methods/functions (see `jvm.cpp`) are incorrect and have to be removed. > For example, the lines 4022-4023 have to be removed from the fragment below: > > 4013 JVM_ENTRY(void, JVM_VirtualThreadDisableSuspend(JNIEnv* env, jobject vthread, jboolean enter)) > 4014 #if INCLUDE_JVMTI > 4015 if (!DoJVMTIVirtualThreadTransitions) { > 4016 assert(!JvmtiExport::can_support_virtual_threads(), "sanity check"); > 4017 return; > 4018 } > 4019 assert(thread->is_disable_suspend() != (bool)enter, > 4020 "nested or unbalanced monitor enter/exit is not allowed"); > 4021 thread->toggle_is_disable_suspend(); > 4022 #else > 4023 fatal("Should only be called with JVMTI enabled"); > 4024 #endif > 4025 JVM_END Looks good. I had not realized this code pattern was pre-existing! Thanks. ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17174#pullrequestreview-1792213697 From epeter at openjdk.org Thu Dec 21 06:13:48 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 21 Dec 2023 06:13:48 GMT Subject: RFR: 8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap In-Reply-To: <-cpYKTYXVhVfplPdXUKYt0ZcT-ql497vy0UZRrGREtk=.8fd5f503-ae50-4e8e-a5d8-65824465e5fd@github.com> References: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> <-cpYKTYXVhVfplPdXUKYt0ZcT-ql497vy0UZRrGREtk=.8fd5f503-ae50-4e8e-a5d8-65824465e5fd@github.com> Message-ID: On Wed, 20 Dec 2023 20:35:05 GMT, Kim Barrett wrote: >> [JDK-8247755](https://bugs.openjdk.org/browse/JDK-8247755) introduced the `GrowableArrayCHeap`. This duplicates the current C-Heap allocation capability in `GrowableArray`. I now remove that from `GrowableArray` and move all usages to `GrowableArrayCHeap`. >> >> This has a few advantages: >> - Clear separation between arena (and resource area) allocating array and C-heap allocating array. >> - We can prevent assigning / copying between arrays of different allocation strategies already at compile time, and not only with asserts at runtime. >> - We should not have multiple implementations of the same thing (C-Heap backed array). >> - `GrowableArrayCHeap` is NONCOPYABLE. This is a nice restriction, we now know that C-Heap backed arrays do not get copied unknowingly. >> >> **Bonus** >> We can now restrict `GrowableArray` element type `E` to be `std::is_trivially_destructible::value == true`. The idea is that arena / resource allocated arrays get abandoned, often without being even cleared. Hence, the elements in the array are never destructed. But if we only use elements that are trivially destructible, then it makes no difference if the destructors are ever called, or the elements simply abandoned. 
>> >> For `GrowableArrayCHeap`, we expect that the user eventually calls the destructor for the array, which in turn calls the destructors of the remaining elements. Hence, it is up to the user to ensure the cleanup. And so we can allow non-trivial destructors. >> >> **Testing** >> Tier1-3 + stress testing: pending > > src/hotspot/share/memory/arena.hpp line 209: > >> 207: >> 208: #ifdef ASSERT >> 209: bool Arena_contains(const Arena* arena, const void* ptr); > > This function doesn't seem necessary. Directly calling arena->contains(ptr) in the one place it's being used seems > like it should suffice. @kimbarrett the reason was that I need to call this from the hpp file, and I encountered a circular dependency I could not resolve. So I needed to move something off to the cpp files. Either I put it in arena.cpp, or in growableArray.cpp. But if I move things from the GrowableArray class off to growableArray.cpp, then the templates will not be instantiated easily, so that is not possible. Hence I have to put it into arena.cpp ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433582008 From gcao at openjdk.org Thu Dec 21 06:22:50 2023 From: gcao at openjdk.org (Gui Cao) Date: Thu, 21 Dec 2023 06:22:50 GMT Subject: RFR: 8320276: Improve class initialization barrier in TemplateTable::_new [v2] In-Reply-To: References: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> Message-ID: On Thu, 21 Dec 2023 02:51:41 GMT, David Holmes wrote: > Would be nice to see this applied across all platforms too - or at least those that support fast_class_init - ppc and s390. Perhaps file follow-up RFE's for those platforms so this is not forgotten. I note that RISCV64 doesn't support fast_class_init at all yet. Hi, Regarding support for fast_class_init on the linux-riscv64 platform, I took a quick look. It seems that the code for this is there, but not enabled by default.
I will double check and see if we can further enable this feature on this platform. And I have created an issue to that [1]. [1] https://bugs.openjdk.org/browse/JDK-8322583 ------------- PR Comment: https://git.openjdk.org/jdk/pull/17006#issuecomment-1865582796 From stuefe at openjdk.org Thu Dec 21 06:39:53 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 21 Dec 2023 06:39:53 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v8] In-Reply-To: <1jQHeWyTuhMCLg2f-3-oMUECl3sjvkJNXmQoHYwxu1c=.d3355a3c-fe6f-46bb-bcff-968401a830f6@github.com> References: <1jQHeWyTuhMCLg2f-3-oMUECl3sjvkJNXmQoHYwxu1c=.d3355a3c-fe6f-46bb-bcff-968401a830f6@github.com> Message-ID: <8CW_22cMoJAT7fewoZOfo5IwtdVxu1-zHktzkX_8fb4=.c4643f5c-0ba2-4771-ab83-c481bc8857f9@github.com> On Wed, 20 Dec 2023 14:53:06 GMT, Joachim Kern wrote: >> On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. >> >> This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). >> >> We propose a different, cleaner way of handling this: >> >> - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. >> - Cache dl handles; repeated opening of a library should return the cached handle. 
>> - Increase handle-local ref counter on open, Decrease it on close >> - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). >> >> This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > improve error handling still ok, small nit inside src/hotspot/os/aix/porting_aix.cpp line 1033: > 1031: // filled by os::dll_load(). This way we mimic dl handle equality for a library > 1032: // opened a second time, as it is implemented on other platforms. > 1033: void* Aix_dlopen(const char* filename, int Flags, const char** error_report) { add assert for error_report != nullptr ------------- Marked as reviewed by stuefe (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/16920#pullrequestreview-1792301031 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433606032 From epeter at openjdk.org Thu Dec 21 06:45:48 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 21 Dec 2023 06:45:48 GMT Subject: RFR: 8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap In-Reply-To: <-cpYKTYXVhVfplPdXUKYt0ZcT-ql497vy0UZRrGREtk=.8fd5f503-ae50-4e8e-a5d8-65824465e5fd@github.com> References: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> <-cpYKTYXVhVfplPdXUKYt0ZcT-ql497vy0UZRrGREtk=.8fd5f503-ae50-4e8e-a5d8-65824465e5fd@github.com> Message-ID: On Wed, 20 Dec 2023 21:11:09 GMT, Kim Barrett wrote: >> [JDK-8247755](https://bugs.openjdk.org/browse/JDK-8247755) introduced the `GrowableArrayCHeap`. This duplicates the current C-Heap allocation capability in `GrowableArray`. I now remove that from `GrowableArray` and move all usages to `GrowableArrayCHeap`. 
>> >> This has a few advantages: >> - Clear separation between arena (and resource area) allocating array and C-heap allocating array. >> - We can prevent assigning / copying between arrays of different allocation strategies already at compile time, and not only with asserts at runtime. >> - We should not have multiple implementations of the same thing (C-Heap backed array). >> - `GrowableArrayCHeap` is NONCOPYABLE. This is a nice restriction, we now know that C-Heap backed arrays do not get copied unknowingly. >> >> **Bonus** >> We can now restrict `GrowableArray` element type `E` to be `std::is_trivially_destructible::value == true`. The idea is that arena / resource allocated arrays get abandoned, often without being even cleared. Hence, the elements in the array are never destructed. But if we only use elements that are trivially destructible, then it makes no difference if the destructors are ever called, or the elements simply abandoned. >> >> For `GrowableArrayCHeap`, we expect that the user eventually calls the destructor for the array, which in turn calls the destructors of the remaining elements. Hence, it is up to the user to ensure the cleanup. And so we can allow non-trivial destructors. >> >> **Testing** >> Tier1-3 + stress testing: pending > > pre-existing: There are a lot of non-static class data members that are pointers to > GrowableArray that seem like they would be better as direct, e.g. non-pointers. > > pre-existing: There are a lot of iterations over GrowableArray's that would be > simplified by using range-based-for. > > I'm not a fan of the additional clutter in APIs that the static memory types add. > If we had a variant of GrowableArrayCHeap that was not itself dynamically allocatable > and took a memory type to use internally as a constructor argument, then I think a > lot of that clutter could be eliminated. It could be used for ordinary data members > that are direct GAs rather than pointers to GAs. 
I think there is a way to do something > similar for static data members that are pointers that are dynamically allocated later, > though that probably requires more work. > > I've not yet reviewed the changes to growableArray.[ch]pp yet, nor the test changes. > But I've run out of time and energy for this for today. @kimbarrett Thanks for looking at the PR! I see you address a lot of "pre-existing" issues. And you would like GrowableArrayCHeap not have the MEMFLAGS in the template argument but maybe as a constructor argument instead. Or maybe a GACH version that only allocates once, though I guess that would limit what kinds of methods you could call on it... Can we address these issues as separate RFE's? ------------- PR Comment: https://git.openjdk.org/jdk/pull/17160#issuecomment-1865639695 From alanb at openjdk.org Thu Dec 21 07:24:36 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 21 Dec 2023 07:24:36 GMT Subject: RFR: 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI In-Reply-To: References: Message-ID: <_NLAhKr98u9T2JMWSx8_KJjiR-ZGsKGbyEE9weEuQhk=.b38f970c-3793-441e-8b4f-e30df9f988d6@github.com> On Thu, 21 Dec 2023 02:29:33 GMT, Serguei Spitsyn wrote: > The macro `#else` branch conditions of `#if INCLUDE_JVMTI` in the `JVM_VirtualThread*` methods/functions (see `jvm.cpp`) are incorrect and have to be removed. 
> For example, the lines 4022-4023 have to be removed from the fragment below: > > 4013 JVM_ENTRY(void, JVM_VirtualThreadDisableSuspend(JNIEnv* env, jobject vthread, jboolean enter)) > 4014 #if INCLUDE_JVMTI > 4015 if (!DoJVMTIVirtualThreadTransitions) { > 4016 assert(!JvmtiExport::can_support_virtual_threads(), "sanity check"); > 4017 return; > 4018 } > 4019 assert(thread->is_disable_suspend() != (bool)enter, > 4020 "nested or unbalanced monitor enter/exit is not allowed"); > 4021 thread->toggle_is_disable_suspend(); > 4022 #else > 4023 fatal("Should only be called with JVMTI enabled"); > 4024 #endif > 4025 JVM_END Marked as reviewed by alanb (Reviewer). ------------- PR Review: https://git.openjdk.org/jdk/pull/17174#pullrequestreview-1792366053 From alanb at openjdk.org Thu Dec 21 07:24:38 2023 From: alanb at openjdk.org (Alan Bateman) Date: Thu, 21 Dec 2023 07:24:38 GMT Subject: RFR: 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 05:22:47 GMT, David Holmes wrote: > Looks good. I had not realized this code pattern was pre-existing! I think this raises the question as to whether there is an any testing with minimal builds as the call to fatal when !INCLUDE_JVMTI are there since JDK 19. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17174#issuecomment-1865734975 From dholmes at openjdk.org Thu Dec 21 07:48:37 2023 From: dholmes at openjdk.org (David Holmes) Date: Thu, 21 Dec 2023 07:48:37 GMT Subject: RFR: 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 07:21:46 GMT, Alan Bateman wrote: > I think this raises the question as to whether there is an any testing with minimal builds as the call to fatal when !INCLUDE_JVMTI are there since JDK 19. We do not test any minimal builds. I don't know if GHA does, but if so it seems no virtual thread tests are run. 
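For readers following the thread, the pattern under discussion can be sketched in isolation. This is only an illustration under stated assumptions, not the HotSpot code: `FakeThread` and `virtual_thread_disable_suspend` are hypothetical stand-ins, and `INCLUDE_JVMTI` is defined locally as 0 to model a minimal build without JVMTI.

```cpp
#include <cassert>

// Hypothetical stand-in for the HotSpot build flag; 0 models a minimal build.
#ifndef INCLUDE_JVMTI
#define INCLUDE_JVMTI 0
#endif

// Hypothetical stand-in for the thread state that the real function toggles.
struct FakeThread {
  bool disable_suspend = false;
};

// Returns true if the flag was toggled, false if the call was a no-op.
bool virtual_thread_disable_suspend(FakeThread* thread, bool enter) {
#if INCLUDE_JVMTI
  assert(thread->disable_suspend != enter &&
         "nested or unbalanced enter/exit is not allowed");
  thread->disable_suspend = !thread->disable_suspend;
  return true;
#else
  // No fatal() in this branch: without JVMTI the call simply does nothing,
  // so callers that invoke it unconditionally keep working.
  (void)thread;
  (void)enter;
  return false;
#endif
}
```

With an `#else` branch that aborted instead, any build compiled with the flag off would crash on the first call, which is presumably why testing minimal builds would have caught the issue earlier.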
------------- PR Comment: https://git.openjdk.org/jdk/pull/17174#issuecomment-1865795735 From sroy at openjdk.org Thu Dec 21 08:19:53 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Thu, 21 Dec 2023 08:19:53 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v7] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: On Wed, 20 Dec 2023 14:30:18 GMT, Thomas Stuefe wrote: > Hi, > > some requests and questions: > > * Please modify the JBS title, PR title, and JBS issue text to reflect that this adds an alternative shared object loading path for shared objects on AIX. Something like "Allow loading shared objects with .a extension on AIX". Please describe the new logic in the JBS issue text. > * Does this really have to be handled in the OpenJDK? What does J9 on AIX do? Could this be done in a simpler way outside OpenJDK, e.g. by providing an *.so variant of the library in question? Where does this library come from? > * What happens if we accidentally attempt to load a "real" static library, which is also named *.a? Would dlopen() then crash? What would happen? > * What happens if the original path handed to os::dll_load is already a *.a file? Should the logic then be reversed? > * We really need regression tests for this. For some of the question I need to consult the folk working on J9. I will answer a few of them if that gives some clarity. > * Please modify the JBS title, PR title, and JBS issue text to reflect that this adds an alternative shared object loading path for shared objects on AIX. Something like "Allow loading shared objects with .an extension on AIX". Please describe the new logic in the JBS issue text. Sure working on it. > * Does this really have to be handled in the OpenJDK? What does J9 on AIX do? Could this be done in a simpler way outside OpenJDK, e.g. by providing an *.so variant of the library in question? 
Where does this library come from? I am not sure how J9 handles this. I would have to consult . However as per current observation, this issue does not show up on Semuru. This issue is only happening on Adoptium. The team that release these file has always released *.a files which work fine for Semuru. > * What happens if we accidentally attempt to load a "real" static library, which is also named *.a? Would dlopen() then crash? What would happen? I don't think the problem is with *.a . They would load as the default behaviour of the dlopen. It is only when the dlopen fails for *.so , we give another chance to check for .a file with the same name. > * What happens if the original path handed to os::dll_load is already a *.a file? Should the logic then be reversed? I don't think so. We are not modifying the behaviour to handle *.a files here. We are just adding extra checks for *.so files if they fail to load. In the logic , when a load fails, I just check if it is a .so file and perform the loading again. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1865837275 From stuefe at openjdk.org Thu Dec 21 08:35:57 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 21 Dec 2023 08:35:57 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v7] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: On Thu, 21 Dec 2023 08:16:22 GMT, Suchismith Roy wrote: >> What happens if we accidentally attempt to load a "real" static library, which is also named *.a? Would dlopen() then crash? What would happen? > I don't think the problem is with *.a . They would load as the default behaviour of the dlopen. It is only when the dlopen fails for *.so , we give another chance to check for .a file with the same name. 
No, what I meant, and what must be clarified before going forward with this solution, is the following: - is *every* `*.a` object on AIX loadable with `dlopen`, and will the result be the same as when loading a `*.so` object - or, if we present arbitrary `*.a` files to dlopen, is there a chance for dlopen to crash or misbehave. Reason is that I was under the impression that *.a libraries are static libraries and cannot be loaded dynamically. This is what you now try to do. If we cannot safely answer this question, I would opt for a more narrow solution by hard-wiring known alternative names. So, do the second *.a attempt only for your `ibm_16_am.a` which you know works. That could also be done in a reasonably maintainable manner. >> Does this really have to be handled in the OpenJDK? What does J9 on AIX do? Could this be done in a simpler way outside OpenJDK, e.g. by providing an *.so variant of the library in question? Where does this library come from? > I am not sure how J9 handles this. I would have to consult . J9 is Open Source, can't you just look? :) > However as per current observation, this issue does not show up on Semuru. This issue is only happening on Adoptium. The team that release these file has always released *.a files which work fine for Semuru. I don't know what Semuru is. What is the context, is that a different VM? Also OpenJDK? J9 derived? 
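The narrower approach suggested above (retry with `.a` only for names known to work) might look roughly like the sketch below. All names here (`ends_with`, `archive_fallback_name`, `is_known_archive`, `load_with_fallback`) are hypothetical, and the opener is passed in as a callable, so nothing in it claims to show how `os::dll_load` or the AIX `dlopen` actually behave.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: true if s ends with the given suffix.
static bool ends_with(const std::string& s, const std::string& suffix) {
  return s.size() >= suffix.size() &&
         s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: maps "foo.so" to "foo.a"; returns an empty string
// when the path does not end in ".so", so no fallback is attempted.
std::string archive_fallback_name(const std::string& path) {
  const std::string so_ext = ".so";
  if (!ends_with(path, so_ext)) return std::string();
  return path.substr(0, path.size() - so_ext.size()) + ".a";
}

// Hypothetical allow-list: only archives known to be loadable are retried.
bool is_known_archive(const std::string& path) {
  return ends_with(path, "ibm_16_am.a");
}

// Tries the original path first, then the allow-listed ".a" alternative.
// open_library is any callable taking a path and returning a handle or nullptr.
template <typename Opener>
void* load_with_fallback(const std::string& path, Opener open_library) {
  void* handle = open_library(path);
  if (handle != nullptr) return handle;
  const std::string alt = archive_fallback_name(path);
  if (alt.empty() || !is_known_archive(alt)) return nullptr;
  return open_library(alt);
}
```

Keeping the allow-list in one small function is one way to make the hard-wired names maintainable; whether arbitrary `*.a` files are safe to hand to `dlopen` remains the open question in the thread.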
------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1865859345 From mli at openjdk.org Thu Dec 21 09:00:52 2023 From: mli at openjdk.org (Hamlin Li) Date: Thu, 21 Dec 2023 09:00:52 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v9] In-Reply-To: <4z1nbsMARrwj1y1o53A0huOQj6rxGZ6oUGIA_BFu8jI=.5238ee8f-5575-4b55-b811-b790312c22a4@github.com> References: <4z1nbsMARrwj1y1o53A0huOQj6rxGZ6oUGIA_BFu8jI=.5238ee8f-5575-4b55-b811-b790312c22a4@github.com> Message-ID: On Tue, 19 Dec 2023 22:45:13 GMT, Olga Mikhaltsova wrote: >> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform. >> >> In the table below it is shown that NaN argument should be processed as a special case.
>>
>>                                          RISC-V                    Java
>>                                          (FCVT.W.S)  (FCVT.L.D)    (long round(double a))  (int round(float a))
>> Minimum valid input (after rounding)     -2^31       -2^63         Long.MIN_VALUE          Integer.MIN_VALUE
>> Maximum valid input (after rounding)     2^31 - 1    2^63 - 1      Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for out-of-range negative input   -2^31       -2^63         Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for -∞                            -2^31       -2^63         Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for out-of-range positive input   2^31 - 1    2^63 - 1      Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for +∞                            2^31 - 1    2^63 - 1      Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for NaN                           2^31 - 1    2^63 - 1      0                       0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555  0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760  0.103  ops/ms
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956  0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947  0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: > > Replaced li with mv I think usage of static rounding mode here is fine for riscv. In typical implementations, writes to the dynamic rounding mode CSR state will serialize the pipeline. Static rounding modes are used to implement specialized arithmetic operations that often have to switch frequently between different rounding modes. -- from `"F" Standard Extension` Looks good. Can you add some comments for java_round_float(or double)? As there were lots of discussion here, but all this information is not in the code. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1865886113 PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1865889081 From mbaesken at openjdk.org Thu Dec 21 09:19:08 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Thu, 21 Dec 2023 09:19:08 GMT Subject: RFR: JDK-8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886 Message-ID: <-F8PPWW4OFqhFo3SDcATXD31yertxXeAhzJxoM-DlOg=.5d084da4-3e35-49ae-ac57-52074681468c@github.com> We notice failures/crashes on Alpine Linux, maybe after [JDK-8320886](https://bugs.openjdk.org/browse/JDK-8320886).
test runtime/Unsafe/InternalErrorTest.java crashes on Alpine (works fine on other test OS/CPU platforms) : # # SIGSEGV (0xb) at pc=0x00007fd3c080064f, pid=7075, tid=7161 # # JRE version: OpenJDK Runtime Environment (23.0) (build 23-internal-adhoc.jenkinsi.jdk) # Java VM: OpenJDK 64-Bit Server VM (23-internal-adhoc.jenkinsi.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) # Problematic frame: # C [ld-musl-x86_64.so.1+0x5464f] memset+0xa7 # Looks like the Alpine memset triggers unexpected SIGSEGV (not the expected SIGBUS). So we switch to a loop instead of memset. However I noticed that on Linux aarch64 the test starts to fail when the loop is used instead of the memset, so I keep the old coding on this platform . ------------- Commit messages: - JDK-8322163 Changes: https://git.openjdk.org/jdk/pull/17175/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17175&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322163 Stats: 10 lines in 1 file changed: 10 ins; 0 del; 0 mod Patch: https://git.openjdk.org/jdk/pull/17175.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17175/head:pull/17175 PR: https://git.openjdk.org/jdk/pull/17175 From tschatzl at openjdk.org Thu Dec 21 09:20:47 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 21 Dec 2023 09:20:47 GMT Subject: RFR: 8319548: Unexpected internal name for Filler array klass causes error in VisualVM In-Reply-To: <04pGtzKg8nb0VVmPooIYvEh9S9ljS_ABctMEwMznH6w=.209a5bbe-a1e4-4180-ae02-51d013ca8dbf@github.com> References: <04pGtzKg8nb0VVmPooIYvEh9S9ljS_ABctMEwMznH6w=.209a5bbe-a1e4-4180-ae02-51d013ca8dbf@github.com> Message-ID: On Tue, 19 Dec 2023 10:59:09 GMT, Albert Mingkun Yang wrote: >> Hi all, >> >> please review this change that changes the filler array class name (again) after user feedback. >> >> In particular, the previous name `Ljdk/internal/vm/FillerArray;` confuses some tools (https://github.com/oracle/visualvm/issues/523). I.e. 
it's not an array, but still variable sized. >> This change adds the `[` array bracket, and renames the element name to not have `Array` inside to not try to pretend that the element is some other kind of array. >> >> Testing: tier1-6 >> >> Thanks, >> Thomas > > Marked as reviewed by ayang (Reviewer). Thanks @albertnetymk @dholmes-ora for your reviews ------------- PR Comment: https://git.openjdk.org/jdk/pull/17155#issuecomment-1865916220 From tschatzl at openjdk.org Thu Dec 21 09:20:49 2023 From: tschatzl at openjdk.org (Thomas Schatzl) Date: Thu, 21 Dec 2023 09:20:49 GMT Subject: Integrated: 8319548: Unexpected internal name for Filler array klass causes error in VisualVM In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 10:08:14 GMT, Thomas Schatzl wrote: > Hi all, > > please review this change that changes the filler array class name (again) after user feedback. > > In particular, the previous name `Ljdk/internal/vm/FillerArray;` confuses some tools (https://github.com/oracle/visualvm/issues/523). I.e. it's not an array, but still variable sized. > This change adds the `[` array bracket, and renames the element name to not have `Array` inside to not try to pretend that the element is some other kind of array. > > Testing: tier1-6 > > Thanks, > Thomas This pull request has now been integrated. Changeset: 05745e3f Author: Thomas Schatzl URL: https://git.openjdk.org/jdk/commit/05745e3f1d56f71d7647e81fa5933c9f4ed18430 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod 8319548: Unexpected internal name for Filler array klass causes error in VisualVM Co-authored-by: Tomáš
Hůrka Reviewed-by: ayang, dholmes ------------- PR: https://git.openjdk.org/jdk/pull/17155 From jkern at openjdk.org Thu Dec 21 09:30:48 2023 From: jkern at openjdk.org (Joachim Kern) Date: Thu, 21 Dec 2023 09:30:48 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v8] In-Reply-To: References: <1jQHeWyTuhMCLg2f-3-oMUECl3sjvkJNXmQoHYwxu1c=.d3355a3c-fe6f-46bb-bcff-968401a830f6@github.com> Message-ID: On Wed, 20 Dec 2023 23:10:29 GMT, Martin Doerr wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> improve error handling > > src/hotspot/os/aix/porting_aix.cpp line 25: > >> 23: */ >> 24: // needs to be defined first, so that the implicit loaded xcoff.h header defines >> 25: // the right structures to analyze the loader header of 32 and 64 Bit executable files > > I don't think we support 32 bit executables. Originally my code worked for 32 & 64 Bit executables, but Thomas mentioned that we have only 64 Bit executables. So I removed the 32 Bit implementation, but this comment was an artefact. I removed the 32 Bit reference now. > src/hotspot/os/aix/porting_aix.cpp line 921: > >> 919: // If the libpath cannot be retrieved return an empty path >> 920: static const char* rtv_linkedin_libpath() { >> 921: static char buffer[4096]; > > Maybe define a constant for the buffer size? Done > src/hotspot/os/aix/porting_aix.cpp line 927: > >> 925: // let libpath point to buffer, which then contains a valid libpath >> 926: // or an empty string >> 927: if (libpath) { > > `!= nullptr` is common in hotspot.
Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433797348 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433801010 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433798833 From sroy at openjdk.org Thu Dec 21 09:40:58 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Thu, 21 Dec 2023 09:40:58 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v7] In-Reply-To: References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> Message-ID: <_WrW-iHHdU-IgC7Z1b6oe_Qh0dkC6P3KJAdl7J2S1Do=.712dd065-6207-4632-a82f-8e12ad023cd5@github.com> On Wed, 20 Dec 2023 11:16:03 GMT, Suchismith Roy wrote: >> J2SE agent does not start and throws error when it tries to find the shared library ibm_16_am. >> After searching for ibm_16_am.so ,the jvm agent throws and error as dll_load fails.It fails to identify the shared library ibm_16_am.a shared archive file on AIX. >> Hence we are providing a function which will additionally search for .a file on AIX ,when the search for .so file fails. > > Suchismith Roy has updated the pull request incrementally with one additional commit since the last revision: > > Spaces fix > > > What happens if we accidentally attempt to load a "real" static library, which is also named *.a? Would dlopen() then crash? What would happen? > > > I don't think the problem is with *.a . They would load as the default behaviour of the dlopen. It is only when the dlopen fails for *.so , we give another chance to check for .a file with the same name. > > No, what I meant, and what must be clarified before going forward with this solution, is the following: > > * is _every_ `*.a` object on AIX loadable with `dlopen`, and will the result be the same as when loading a `*.so` object > * or, if we present arbitrary `*.a` files to dlopen, is there a chance for dlopen to crash or misbehave. 
> > Reason is that I was under the impression that *.a libraries are static libraries and cannot be loaded dynamically. This is what you now try to do. > If we cannot safely answer this question, I would opt for a more narrow solution by hard-wiring known alternative names. So, do the second *.a attempt only for your `ibm_16_am.a` which you know works. That could also be done in a reasonably maintainable manner. > On AIX, both static and dynamic libraries can have the *.a extension, and AIX also supports *.so files. Basically, shared objects on AIX can have either the *.a or the *.so extension. Hence we need to implement this logic. How dlopen would behave if we specifically try to load a static archive is probably something @JoKern65 can answer? > > > Does this really have to be handled in the OpenJDK? What does J9 on AIX do? Could this be done in a simpler way outside OpenJDK, e.g. by providing an *.so variant of the library in question? Where does this library come from? > > > I am not sure how J9 handles this. I would have to consult . > > J9 is Open Source, can't you just look? :) I did try comparing the file structures, and I do not see a similar file structure over there. I am unable to find the jvmTiAgent code or the os_aix file, so I am not sure which functions over there provide the same functionality. Do you have any suggestion on how I can check and correlate? > > > However as per current observation, this issue does not show up on Semuru. This issue is only happening on Adoptium. The team that release these file has always released *.a files which work fine for Semuru. > > I don't know what Semuru is. What is the context, is that a different VM? Also OpenJDK? J9 derived? Semuru is J9 derived.
------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1865945011 From jkern at openjdk.org Thu Dec 21 09:42:08 2023 From: jkern at openjdk.org (Joachim Kern) Date: Thu, 21 Dec 2023 09:42:08 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v8] In-Reply-To: References: <1jQHeWyTuhMCLg2f-3-oMUECl3sjvkJNXmQoHYwxu1c=.d3355a3c-fe6f-46bb-bcff-968401a830f6@github.com> Message-ID: On Wed, 20 Dec 2023 23:45:16 GMT, Martin Doerr wrote: >> Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: >> >> improve error handling > > src/hotspot/os/aix/porting_aix.cpp line 916: > >> 914: constexpr int max_handletable = 1024; >> 915: static int g_handletable_used = 0; >> 916: static struct handletableentry g_handletable[max_handletable] = {{0, 0, 0, 0}}; > > Wouldn't `ConcurrentHashTable` be a better data structure? It is already used in hotspot, can grow dynamically and doesn't need linear search. There will be only few libraries in the list. With this assumption Thomas suggested to use just a simple array. > src/hotspot/os/aix/porting_aix.cpp line 990: > >> 988: } >> 989: ret = (0 == stat64x(combined.base(), stat)); >> 990: os::free (path2); > > Please remove the extra whitespace. Done > src/hotspot/os/aix/porting_aix.cpp line 1026: > >> 1024: >> 1025: os::free (libpath); >> 1026: os::free (path2); > > Same here. 
Done ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433813137 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433814446 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433814755 From aph at openjdk.org Thu Dec 21 09:44:52 2023 From: aph at openjdk.org (Andrew Haley) Date: Thu, 21 Dec 2023 09:44:52 GMT Subject: RFR: JDK-8241503: C2: Share MacroAssembler between mach nodes during code emission [v6] In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 11:11:07 GMT, Martin Doerr wrote: >> Cesar Soares Lucas has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains seven commits: >> >> - Merge with origin/master >> - Fix build, copyright dates, m4 files. >> - Fix merge >> - Catch up with master branch. >> >> Merge remote-tracking branch 'origin/master' into reuse-macroasm >> - Some inst_mark fixes; Catch up with master. >> - Catch up with changes on master >> - Reuse same C2_MacroAssembler object to emit instructions. > > Thanks for the explanation! Makes sense. > @TheRealMDoerr motivation is to reduce memory consumption and speed up C2. `C2_MacroAssembler` is based on `ResourceObj` which allocates in compiler arena. Each small `C2_MacroAssembler masm()` will add allocation to arena until we finish compilation. Also speedup because we don't need to do such allocations. I see. I updated the bug entry because we need an audit trail for such decisions. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16484#issuecomment-1865950233 From jkern at openjdk.org Thu Dec 21 09:55:04 2023 From: jkern at openjdk.org (Joachim Kern) Date: Thu, 21 Dec 2023 09:55:04 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v9] In-Reply-To: References: Message-ID: > On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. 
In this it differs from typical libc implementations, which cache dl handles. > > This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). > > We propose a different, cleaner way of handling this: > > - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. > - Cache dl handles; repeated opening of a library should return the cached handle. > - Increase a handle-local ref counter on open, decrease it on close. > - Make sure calls to os::dll_load are matched to os::dll_unload (see [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). > > This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load.
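(Editor's sketch, not code from the pull request.) The caching scheme listed above can be illustrated roughly as follows. Everything here is hypothetical: the names `HandleCache` and `HandleEntry`, the use of a path string as the lookup key (the actual patch may identify libraries differently), and the injected open/close callbacks that stand in for the platform's dlopen/dlclose. Only the fixed table size of 1024 is taken from the review thread, and locking is omitted entirely.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical table entry: one cached dl handle plus its reference count.
struct HandleEntry {
  std::string key;          // assumption: a string key; a real port might use file identity
  void* handle = nullptr;
  int refcount = 0;
};

// Hypothetical sketch of a fixed-size, linearly scanned handle cache.
class HandleCache {
 public:
  using Opener = std::function<void*(const std::string&)>;  // stand-in for dlopen
  using Closer = std::function<bool(void*)>;                // stand-in for dlclose

  HandleCache(Opener open_fn, Closer close_fn)
      : _open(std::move(open_fn)), _close(std::move(close_fn)) {}

  // Returns the cached handle if the library was opened before, otherwise opens it.
  void* load(const std::string& key) {
    for (int i = 0; i < _used; i++) {
      if (_table[i].key == key) {
        _table[i].refcount++;
        return _table[i].handle;
      }
    }
    if (_used == max_entries) return nullptr;  // table full
    void* h = _open(key);
    if (h == nullptr) return nullptr;
    _table[_used].key = key;
    _table[_used].handle = h;
    _table[_used].refcount = 1;
    _used++;
    return h;
  }

  // Drops one reference; the underlying close happens only when the count reaches zero.
  bool unload(void* handle) {
    for (int i = 0; i < _used; i++) {
      if (_table[i].handle == handle) {
        if (--_table[i].refcount > 0) return true;
        bool ok = _close(handle);
        _table[i] = _table[_used - 1];     // keep the table dense
        _table[_used - 1] = HandleEntry();
        _used--;
        return ok;
      }
    }
    return false;  // unknown handle
  }

  int used() const { return _used; }

 private:
  static constexpr int max_entries = 1024;  // limit discussed in the review
  HandleEntry _table[max_entries];
  int _used = 0;
  Opener _open;
  Closer _close;
};
```

With an opener that returns a fresh handle on every call, two `load` calls for the same key still yield the same pointer, and only the second matching `unload` reaches the closer. That is the "handle equality" property the description asks for.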
Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: cosmetic changes ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16920/files - new: https://git.openjdk.org/jdk/pull/16920/files/7486ddb9..359080d3 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=07-08 Stats: 10 lines in 1 file changed: 2 ins; 0 del; 8 mod Patch: https://git.openjdk.org/jdk/pull/16920.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16920/head:pull/16920 PR: https://git.openjdk.org/jdk/pull/16920 From stuefe at openjdk.org Thu Dec 21 10:00:44 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 21 Dec 2023 10:00:44 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v8] In-Reply-To: References: <1jQHeWyTuhMCLg2f-3-oMUECl3sjvkJNXmQoHYwxu1c=.d3355a3c-fe6f-46bb-bcff-968401a830f6@github.com> Message-ID: On Thu, 21 Dec 2023 09:37:57 GMT, Joachim Kern wrote: >> src/hotspot/os/aix/porting_aix.cpp line 916: >> >>> 914: constexpr int max_handletable = 1024; >>> 915: static int g_handletable_used = 0; >>> 916: static struct handletableentry g_handletable[max_handletable] = {{0, 0, 0, 0}}; >> >> Wouldn't `ConcurrentHashTable` be a better data structure? It is already used in hotspot, can grow dynamically and doesn't need linear search. > > There will be only few libraries in the list. With this assumption Thomas suggested to use just a simple array. Let's keep it simple. A linear array of only a few items is easily scanned, probably faster than pointer hopping hash table entries. Not that it matters in any way for the few calls to dlopen. Also, avoiding hotspot structures preserves layer integrity (porting_aix does not pull anything from hotspot so far) and prevents initialisation time dependencies. 
Not sure whether ConcurrentHashTable works before VM init, but with Joachim's current solution, we can call dlopen at any time in VM life. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433839119 From stuefe at openjdk.org Thu Dec 21 10:03:55 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 21 Dec 2023 10:03:55 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v7] In-Reply-To: <_WrW-iHHdU-IgC7Z1b6oe_Qh0dkC6P3KJAdl7J2S1Do=.712dd065-6207-4632-a82f-8e12ad023cd5@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> <_WrW-iHHdU-IgC7Z1b6oe_Qh0dkC6P3KJAdl7J2S1Do=.712dd065-6207-4632-a82f-8e12ad023cd5@github.com> Message-ID: <2sMyJ8mZ6EIULC67tK1IcI4uNnJMvpCzw1BKEDUaIms=.c90f1101-236e-4a80-869c-feca6abd3dc3@github.com> On Thu, 21 Dec 2023 09:37:55 GMT, Suchismith Roy wrote: > > > > What happens if we accidentally attempt to load a "real" static library, which is also named *.a? Would dlopen() then crash? What would happen? > > > > > > > I don't think the problem is with *.a. They would load as the default behaviour of the dlopen. It is only when the dlopen fails for *.so that we give another chance to check for a .a file with the same name. > > > > > > No, what I meant, and what must be clarified before going forward with this solution, is the following: > > > > * is _every_ `*.a` object on AIX loadable with `dlopen`, and will the result be the same as when loading a `*.so` object > > * or, if we present arbitrary `*.a` files to dlopen, is there a chance for dlopen to crash or misbehave. > > > > Reason is that I was under the impression that *.a libraries are static libraries and cannot be loaded dynamically. This is what you now try to do. > > If we cannot safely answer this question, I would opt for a more narrow solution by hard-wiring known alternative names.
So, do the second *.a attempt only for your `ibm_16_am.a` which you know works. That could also be done in a reasonably maintainable manner. > > In AIX, both static and dynamic libraries have the *.a extension, and AIX also supports *.so files. Basically, shared objects in AIX can have either the *.a or the *.so extension. Hence we need to implement this logic. How dlopen would behave if we specifically try to load a static archive is something @JoKern65 can probably answer? Rather, this is a question you have to ask your colleagues at IBM that develop the AIX libc. Since AIX libc is not open source, we cannot look for ourselves, nor can Joachim (he works at SAP). > > > > > Does this really have to be handled in the OpenJDK? What does J9 on AIX do? Could this be done in a simpler way outside OpenJDK, e.g. by providing an *.so variant of the library in question? Where does this library come from? > > > > > > > I am not sure how J9 handles this. I would have to consult. > > > > > > J9 is Open Source, can't you just look? :) > > I did try comparing the file structures, and I do not see a similar file structure over there. I am unable to find the jvmTiAgent code or the os_aix file, so I am not sure which functions over there provide the same functionality. Do you have any suggestion on how I can check and correlate? Someone must implement LoadLibrary. Try looking for places where dlopen() is called. > > > > However, as per current observation, this issue does not show up on Semeru. This issue is only happening on Adoptium. The team that releases these files has always released *.a files, which work fine for Semeru. > > > > > > I don't know what Semeru is. What is the context, is that a different VM? Also OpenJDK? J9 derived? > > Semeru is J9 derived.
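(Editor's sketch, not code from the pull request.) The fallback order discussed in this thread, i.e. try the `*.so` name first and only then the `*.a` name, might look roughly like this. The helper names `archive_variant` and `load_with_fallback` are made up for illustration, and the actual load is delegated to a caller-supplied callback so that the sketch makes no claims about how AIX `dlopen` must be invoked for archives. It also does not answer the open question above, namely whether every `*.a` file is safe to hand to `dlopen`.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper: maps "name.so" to "name.a"; returns "" if the name
// does not end in ".so" (or consists of nothing but the suffix).
std::string archive_variant(const std::string& name) {
  const std::string so = ".so";
  if (name.size() <= so.size() ||
      name.compare(name.size() - so.size(), so.size(), so) != 0) {
    return "";
  }
  return name.substr(0, name.size() - so.size()) + ".a";
}

// Hypothetical helper: tries the given name, then its ".a" twin.
// try_load stands in for the platform loader and returns nullptr on failure.
void* load_with_fallback(const std::string& name,
                         const std::function<void*(const std::string&)>& try_load) {
  void* h = try_load(name);
  if (h != nullptr) return h;
  const std::string alt = archive_variant(name);
  return alt.empty() ? nullptr : try_load(alt);
}
```

The narrower variant suggested in the review (hard-wiring known names such as `ibm_16_am.a`) would replace `archive_variant` with a lookup in a short allow-list.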
------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1865977132 From sspitsyn at openjdk.org Thu Dec 21 10:10:59 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 21 Dec 2023 10:10:59 GMT Subject: Integrated: 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 02:29:33 GMT, Serguei Spitsyn wrote: > The macro `#else` branch conditions of `#if INCLUDE_JVMTI` in the `JVM_VirtualThread*` methods/functions (see `jvm.cpp`) are incorrect and have to be removed. > For example, the lines 4022-4023 have to be removed from the fragment below: > > 4013 JVM_ENTRY(void, JVM_VirtualThreadDisableSuspend(JNIEnv* env, jobject vthread, jboolean enter)) > 4014 #if INCLUDE_JVMTI > 4015 if (!DoJVMTIVirtualThreadTransitions) { > 4016 assert(!JvmtiExport::can_support_virtual_threads(), "sanity check"); > 4017 return; > 4018 } > 4019 assert(thread->is_disable_suspend() != (bool)enter, > 4020 "nested or unbalanced monitor enter/exit is not allowed"); > 4021 thread->toggle_is_disable_suspend(); > 4022 #else > 4023 fatal("Should only be called with JVMTI enabled"); > 4024 #endif > 4025 JVM_END This pull request has now been integrated. 
Changeset: aff659aa Author: Serguei Spitsyn URL: https://git.openjdk.org/jdk/commit/aff659aaf7c73ff8eb903fd3e426e1b42ea6d95a Stats: 12 lines in 1 file changed: 0 ins; 12 del; 0 mod 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI Reviewed-by: dholmes, alanb ------------- PR: https://git.openjdk.org/jdk/pull/17174 From sspitsyn at openjdk.org Thu Dec 21 10:10:57 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 21 Dec 2023 10:10:57 GMT Subject: RFR: 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI In-Reply-To: References: Message-ID: <52ooFlZzTtY1W-02L7z-WnIa6R2iIH8irDoPwczVtEk=.e27b0d25-b714-45fb-8633-6509a914918a@github.com> On Thu, 21 Dec 2023 07:21:46 GMT, Alan Bateman wrote: >> Looks good. I had not realized this code pattern was pre-existing! >> >> Thanks. > >> Looks good. I had not realized this code pattern was pre-existing! > > I think this raises the question as to whether there is any testing with minimal builds, as the calls to fatal when !INCLUDE_JVMTI have been there since JDK 19. @AlanBateman said: > I think this raises the question as to whether there is any testing with minimal builds, as the calls to fatal when !INCLUDE_JVMTI have been there since JDK 19. @dholmes-ora said: > We do not test any minimal builds. I don't know if GHA does, but if so it seems no virtual thread tests are run. My plan is to check with Leonid on this. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17174#issuecomment-1865984172 From sspitsyn at openjdk.org Thu Dec 21 10:10:58 2023 From: sspitsyn at openjdk.org (Serguei Spitsyn) Date: Thu, 21 Dec 2023 10:10:58 GMT Subject: RFR: 8322538: remove fatal from JVM_VirtualThread functions for !INCLUDE_JVMTI In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 02:29:33 GMT, Serguei Spitsyn wrote: > The macro `#else` branch conditions of `#if INCLUDE_JVMTI` in the `JVM_VirtualThread*` methods/functions (see `jvm.cpp`) are incorrect and have to be removed.
> For example, the lines 4022-4023 have to be removed from the fragment below: > > 4013 JVM_ENTRY(void, JVM_VirtualThreadDisableSuspend(JNIEnv* env, jobject vthread, jboolean enter)) > 4014 #if INCLUDE_JVMTI > 4015 if (!DoJVMTIVirtualThreadTransitions) { > 4016 assert(!JvmtiExport::can_support_virtual_threads(), "sanity check"); > 4017 return; > 4018 } > 4019 assert(thread->is_disable_suspend() != (bool)enter, > 4020 "nested or unbalanced monitor enter/exit is not allowed"); > 4021 thread->toggle_is_disable_suspend(); > 4022 #else > 4023 fatal("Should only be called with JVMTI enabled"); > 4024 #endif > 4025 JVM_END Alan and David, thank you for the review! ------------- PR Comment: https://git.openjdk.org/jdk/pull/17174#issuecomment-1865984694 From rehn at openjdk.org Thu Dec 21 10:12:52 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 21 Dec 2023 10:12:52 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v4] In-Reply-To: References: Message-ID: On Fri, 15 Dec 2023 11:32:26 GMT, Vladimir Kempik wrote: >> Hi all, I have addressed all comments. >> >> The only code change I didn't do was register caching of constants. >> This is because I don't have access to sha2 in performance simulator. >> Without it 256 and 512 have 'identical' path. >> I'll create a jira for that, so I can revisit it once I have access. >> I hope that is okay @RealFYang ? (i.e. ship this and do a follow-up) >> >> Also @VladimirKempik the flag issue is not resolved. >> For now we use this experimental flag which is inline with the other flags. >> >> Any other things to address, new or that I missed? >> >> (passes compiler/intrinsics/sha/) >> >> REF: https://bugs.openjdk.org/browse/JDK-8322177 > >> Hi all, I have addressed all comments. >> >> The only code change I didn't do was register caching of constants. This is because I don't have access to sha2 in performance simulator. Without it 256 and 512 have 'identical' path. I'll create a jira for that, so I can revisit it once I have access.
I hope that is okay @RealFYang ? (i.e. ship this and do a follow-up) >> >> Also @VladimirKempik the flag issue is not resolved. For now we use this experimental flag which is inline with the other flags. >> >> Any other things to address, new or that I missed? >> >> (passes compiler/intrinsics/sha/) > > It was mostly a wish we look at flags later and simplify it @VladimirKempik @RealFYang @Hamlin-Li are we all good now? Could I get some more approval if so :) ------------- PR Comment: https://git.openjdk.org/jdk/pull/16562#issuecomment-1865988689 From mdoerr at openjdk.org Thu Dec 21 10:19:45 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 21 Dec 2023 10:19:45 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v8] In-Reply-To: References: <1jQHeWyTuhMCLg2f-3-oMUECl3sjvkJNXmQoHYwxu1c=.d3355a3c-fe6f-46bb-bcff-968401a830f6@github.com> Message-ID: On Thu, 21 Dec 2023 09:57:08 GMT, Thomas Stuefe wrote: >> There will be only few libraries in the list. With this assumption Thomas suggested to use just a simple array. > > Let's keep it simple. A linear array of only a few items is easily scanned, probably faster than pointer hopping hash table entries. Not that it matters in any way for the few calls to dlopen. > > Also, avoiding hotspot structures preserves layer integrity (porting_aix does not pull anything from hotspot so far) and prevents initialisation time dependencies. Not sure whether ConcurrentHashTable works before VM init, but with Joachimes current solution, we can call dlopen at any time in VM life. I don't like introducing unnecessary limitations. Are we sure nobody will ever need more than 1024 handles? Can't we at least use a GrowableArray or something like that? 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433865222 From luhenry at openjdk.org Thu Dec 21 10:30:47 2023 From: luhenry at openjdk.org (Ludovic Henry) Date: Thu, 21 Dec 2023 10:30:47 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v7] In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 08:21:14 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 11 additional commits since the last revision: > > - t2 caller saved, no need to push/pop > - Merge branch 'master' into sha256 > - Removed swap file > - Index load, other comment > - Merge branch 'master' into sha256 > - Materialize constants address once > - Removed template > - Flag fixes > - Merge branch 'master' into sha256 > - Share code > - ... and 1 more: https://git.openjdk.org/jdk/compare/137b5648...be46fe4f Marked as reviewed by luhenry (Committer). ------------- PR Review: https://git.openjdk.org/jdk/pull/16562#pullrequestreview-1792689226 From dchuyko at openjdk.org Thu Dec 21 11:13:25 2023 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 21 Dec 2023 11:13:25 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v17] In-Reply-To: References: Message-ID: <-u99xleA_S6Se_2wET4iFO7jzqN-GmaiC8fBsAzzcQs=.cf28f08a-7de2-4073-a71b-8fda1620e402@github.com> > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). 
The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug and issues such a directive, but this does not affect the application behavior. In such a case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it while taking the new directives stack into account. Re-compilation helps to prevent hot methods from executing in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization.
There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. > > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 35 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - ... 
and 25 more: https://git.openjdk.org/jdk/compare/aff659aa...fbedf276 ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=16 Stats: 372 lines in 15 files changed: 339 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From jsjolen at openjdk.org Thu Dec 21 11:13:56 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Thu, 21 Dec 2023 11:13:56 GMT Subject: RFR: 8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap In-Reply-To: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> References: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> Message-ID: On Tue, 19 Dec 2023 16:59:05 GMT, Emanuel Peter wrote: > [JDK-8247755](https://bugs.openjdk.org/browse/JDK-8247755) introduced the `GrowableArrayCHeap`. This duplicates the current C-Heap allocation capability in `GrowableArray`. I now remove that from `GrowableArray` and move all usages to `GrowableArrayCHeap`. > > This has a few advantages: > - Clear separation between arena (and resource area) allocating array and C-heap allocating array. > - We can prevent assigning / copying between arrays of different allocation strategies already at compile time, and not only with asserts at runtime. > - We should not have multiple implementations of the same thing (C-Heap backed array). > - `GrowableArrayCHeap` is NONCOPYABLE. This is a nice restriction, we now know that C-Heap backed arrays do not get copied unknowingly. > > **Bonus** > We can now restrict `GrowableArray` element type `E` to be `std::is_trivially_destructible::value == true`. The idea is that arena / resource allocated arrays get abandoned, often without being even cleared. 
Hence, the elements in the array are never destructed. But if we only use elements that are trivially destructible, then it makes no difference if the destructors are ever called, or the elements simply abandoned. > > For `GrowableArrayCHeap`, we expect that the user eventually calls the destructor for the array, which in turn calls the destructors of the remaining elements. Hence, it is up to the user to ensure the cleanup. And so we can allow non-trivial destructors. > > **Testing** > Tier1-3 + stress testing: pending Wow! Thank you for this Emanuel. I went through your changes and I am happy with them. There are some spelling issues and my opinions on how to write the doc strings. I also asked for some "length"/"size" naming to be changed to "capacity", you don't have to do this as it's pre-existing, but it would make that code clearer. src/hotspot/share/jfr/leakprofiler/chains/edgeStore.cpp line 287: > 285: assert(edge != nullptr, "invariant"); > 286: if (_leak_context_edges == nullptr) { > 287: _leak_context_edges = new GrowableArrayCHeap(initial_size); Pre-existing: `initial_capacity` is a better name. 
src/hotspot/share/jfr/leakprofiler/checkpoint/objectSampleCheckpoint.cpp line 55: > 53: template > 54: static GrowableArrayCHeap* c_heap_allocate_array(int size = initial_array_size) { > 55: return new GrowableArrayCHeap(size); `capacity` instead of `size` src/hotspot/share/jfr/recorder/checkpoint/types/jfrThreadGroup.cpp line 266: > 264: > 265: JfrThreadGroup::JfrThreadGroup() : > 266: _list(new GrowableArrayCHeap(initial_array_size)) {} `capacity` src/hotspot/share/jfr/recorder/jfrRecorder.cpp line 151: > 149: assert(length >= 1, "invariant"); > 150: assert(dcmd_recordings_array == nullptr, "invariant"); > 151: dcmd_recordings_array = new GrowableArrayCHeap(length); `capacity` src/hotspot/share/jfr/support/jfrKlassUnloading.cpp line 38: > 36: > 37: template > 38: static GrowableArrayCHeap* c_heap_allocate_array(int size = initial_array_size) { `capacity` src/hotspot/share/utilities/growableArray.hpp line 618: > 616: > 617: // The GrowableArray internal data is allocated from either: > 618: // - Resrouce area (default) Spelling src/hotspot/share/utilities/growableArray.hpp line 621: > 619: // - Arena > 620: // > 621: // Itself, it can be embedded, on stack, resource_arena or arena allocated. "Itself can be allocated on stack, resource area or arena allocated." src/hotspot/share/utilities/growableArray.hpp line 629: > 627: // For C-Heap allocation use GrowableArrayCHeap. > 628: // > 629: // Note, that with GrowableArray does not deallocate the allocated memory from "that the" not "that with" src/hotspot/share/utilities/growableArray.hpp line 638: > 636: // GrowableArray is copyable, but it only creates a shallow copy. Hence, one has > 637: // to be careful not to duplicate the state and then diverge while sharing the > 638: // underlying data. 
Sad but true :-( src/hotspot/share/utilities/growableArray.hpp line 644: > 642: friend class GrowableArrayWithAllocator >; > 643: > 644: // Since GrowableArray is arena / resource area allocated, it is a custom to "it is a custom to" can basically be removed "Since Growable array is arena/resource area allocated it does not destruct its elements. Therefore, ..." is sufficient. ------------- PR Review: https://git.openjdk.org/jdk/pull/17160#pullrequestreview-1792708591 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433894629 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433894947 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433895133 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433895741 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433896219 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433901496 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433904262 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433916326 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433918209 PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433920731 From jsjolen at openjdk.org Thu Dec 21 11:13:57 2023 From: jsjolen at openjdk.org (Johan =?UTF-8?B?U2rDtmxlbg==?=) Date: Thu, 21 Dec 2023 11:13:57 GMT Subject: RFR: 8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap In-Reply-To: <-cpYKTYXVhVfplPdXUKYt0ZcT-ql497vy0UZRrGREtk=.8fd5f503-ae50-4e8e-a5d8-65824465e5fd@github.com> References: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> <-cpYKTYXVhVfplPdXUKYt0ZcT-ql497vy0UZRrGREtk=.8fd5f503-ae50-4e8e-a5d8-65824465e5fd@github.com> Message-ID: On Wed, 20 Dec 2023 20:39:27 GMT, Kim Barrett wrote: >> [JDK-8247755](https://bugs.openjdk.org/browse/JDK-8247755) introduced the 
`GrowableArrayCHeap`. This duplicates the current C-Heap allocation capability in `GrowableArray`. I now remove that from `GrowableArray` and move all usages to `GrowableArrayCHeap`. >> >> This has a few advantages: >> - Clear separation between arena (and resource area) allocating array and C-heap allocating array. >> - We can prevent assigning / copying between arrays of different allocation strategies already at compile time, and not only with asserts at runtime. >> - We should not have multiple implementations of the same thing (C-Heap backed array). >> - `GrowableArrayCHeap` is NONCOPYABLE. This is a nice restriction, we now know that C-Heap backed arrays do not get copied unknowingly. >> >> **Bonus** >> We can now restrict `GrowableArray` element type `E` to be `std::is_trivially_destructible::value == true`. The idea is that arena / resource allocated arrays get abandoned, often without being even cleared. Hence, the elements in the array are never destructed. But if we only use elements that are trivially destructible, then it makes no difference if the destructors are ever called, or the elements simply abandoned. >> >> For `GrowableArrayCHeap`, we expect that the user eventually calls the destructor for the array, which in turn calls the destructors of the remaining elements. Hence, it is up to the user to ensure the cleanup. And so we can allow non-trivial destructors. >> >> **Testing** >> Tier1-3 + stress testing: pending > > src/hotspot/share/memory/heapInspection.cpp line 282: > >> 280: KlassInfoHisto::KlassInfoHisto(KlassInfoTable* cit) : >> 281: _cit(cit) { >> 282: _elements = new GrowableArrayCHeap(_histo_initial_size); > > pre-existing: Why is this initialization separate from the ctor-initializer? And this looks like an example of > where it would be better as a direct GA member rather than a pointer to GA. Can name be changed to `_histo_initial_capacity`? 
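(Editor's sketch, not code from the pull request.) The "Bonus" restriction quoted above, that an arena or resource-area backed array should only hold trivially destructible elements because its elements are abandoned rather than destructed, can be expressed as a compile-time check. `ArenaArraySketch`, `PlainEdge` and `allowed_in_arena_array` are hypothetical stand-ins, not HotSpot types.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for an arena-backed array. Its storage is simply
// abandoned, so element destructors would never run; the static_assert
// rejects element types for which that would matter.
template <typename E>
class ArenaArraySketch {
  static_assert(std::is_trivially_destructible<E>::value,
                "arena-backed arrays never run element destructors");
 public:
  int length() const { return 0; }  // storage omitted in this sketch
};

// A plain aggregate: trivially destructible, so it is accepted.
struct PlainEdge {
  int from;
  int to;
};

// Helper that exposes the same condition as a value, for illustration.
template <typename E>
constexpr bool allowed_in_arena_array() {
  return std::is_trivially_destructible<E>::value;
}
```

Instantiating `ArenaArraySketch<std::string>` would fail to compile, which is the point: the mistake is caught at build time instead of by a runtime assert. A C-heap backed array whose owner is expected to run the destructor would not need this restriction.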
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433897055 From jsjolen at openjdk.org Thu Dec 21 11:13:58 2023 From: jsjolen at openjdk.org (Johan Sjölen) Date: Thu, 21 Dec 2023 11:13:58 GMT Subject: RFR: 8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap In-Reply-To: References: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> Message-ID: On Thu, 21 Dec 2023 10:48:30 GMT, Johan Sjölen wrote: >> [JDK-8247755](https://bugs.openjdk.org/browse/JDK-8247755) introduced the `GrowableArrayCHeap`. This duplicates the current C-Heap allocation capability in `GrowableArray`. I now remove that from `GrowableArray` and move all usages to `GrowableArrayCHeap`. >> >> This has a few advantages: >> - Clear separation between arena (and resource area) allocating array and C-heap allocating array. >> - We can prevent assigning / copying between arrays of different allocation strategies already at compile time, and not only with asserts at runtime. >> - We should not have multiple implementations of the same thing (C-Heap backed array). >> - `GrowableArrayCHeap` is NONCOPYABLE. This is a nice restriction, we now know that C-Heap backed arrays do not get copied unknowingly. >> >> **Bonus** >> We can now restrict `GrowableArray` element type `E` to be `std::is_trivially_destructible::value == true`. The idea is that arena / resource allocated arrays get abandoned, often without being even cleared. Hence, the elements in the array are never destructed. But if we only use elements that are trivially destructible, then it makes no difference if the destructors are ever called, or the elements simply abandoned. >> >> For `GrowableArrayCHeap`, we expect that the user eventually calls the destructor for the array, which in turn calls the destructors of the remaining elements. Hence, it is up to the user to ensure the cleanup.
And so we can allow non-trivial destructors. >> >> **Testing** >> Tier1-3 + stress testing: pending > > src/hotspot/share/utilities/growableArray.hpp line 621: > >> 619: // - Arena >> 620: // >> 621: // Itself, it can be embedded, on stack, resource_arena or arena allocated. > > "Itself can be allocated on stack, resource area or arena allocated." That it can be embedded into another class/struct is a given, imho. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1433904533 From jkern at openjdk.org Thu Dec 21 11:26:44 2023 From: jkern at openjdk.org (Joachim Kern) Date: Thu, 21 Dec 2023 11:26:44 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v8] In-Reply-To: References: <1jQHeWyTuhMCLg2f-3-oMUECl3sjvkJNXmQoHYwxu1c=.d3355a3c-fe6f-46bb-bcff-968401a830f6@github.com> Message-ID: On Thu, 21 Dec 2023 10:17:18 GMT, Martin Doerr wrote: >> Let's keep it simple. A linear array of only a few items is easily scanned, probably faster than pointer hopping hash table entries. Not that it matters in any way for the few calls to dlopen. >> >> Also, avoiding hotspot structures preserves layer integrity (porting_aix does not pull anything from hotspot so far) and prevents initialisation time dependencies. Not sure whether ConcurrentHashTable works before VM init, but with Joachim's current solution, we can call dlopen at any time in VM life. > > I don't like introducing unnecessary limitations. Are we sure nobody will ever need more than 1024 handles? > Can't we at least use a GrowableArray or something like that? In principle you are right, but in my opinion 1024 is an academic limit. I have never seen processes with more than a few dozen loaded libraries.
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433942205 From stuefe at openjdk.org Thu Dec 21 11:49:44 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 21 Dec 2023 11:49:44 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v8] In-Reply-To: References: <1jQHeWyTuhMCLg2f-3-oMUECl3sjvkJNXmQoHYwxu1c=.d3355a3c-fe6f-46bb-bcff-968401a830f6@github.com> Message-ID: On Thu, 21 Dec 2023 11:23:46 GMT, Joachim Kern wrote: >> I don't like introducing unnecessary limitations. Are we sure nobody will ever need more than 1024 handles? >> Can't we at least use a GrowableArray or something like that? > > In principle you are right, but in my opinion 1024 is an academic limit. I have never seen processes with more than a few dozen loaded libraries. Dynamic allocation also opens us up to potential initialization issues, unless we explicitly use raw ::malloc. It should work, but I think it's better avoided unless we really need it. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433963889 From mdoerr at openjdk.org Thu Dec 21 11:56:53 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 21 Dec 2023 11:56:53 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v8] In-Reply-To: References: <1jQHeWyTuhMCLg2f-3-oMUECl3sjvkJNXmQoHYwxu1c=.d3355a3c-fe6f-46bb-bcff-968401a830f6@github.com> Message-ID: On Thu, 21 Dec 2023 11:45:36 GMT, Thomas Stuefe wrote: >> In principle you are right, but in my opinion 1024 is an academic limit. I have never seen processes with more than a few dozen loaded libraries. > > Dynamic allocation also opens us up to potential initialization issues, unless we explicitly use raw ::malloc. It should work, but I think it's better avoided unless we really need it. Well, we're fixing an academic issue by introducing another one? Doesn't make sense to me.
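To make the trade-off in this thread concrete, the following is a minimal sketch of a linearly scanned handle table that grows with plain `realloc` up to a hard cap, so it needs no HotSpot allocation infrastructure. All names (`HandleTable`, `HandleEntry`, `max_entries`) are made up for illustration. This is not the code in the PR, and locking is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical entry: an opaque handle plus a reference count.
struct HandleEntry {
  void* handle;
  int   refcount;
};

class HandleTable {
  HandleEntry* _entries = nullptr;
  std::size_t  _len = 0;
  std::size_t  _cap = 0;
  const std::size_t _max;  // hard upper limit on the number of entries
public:
  explicit HandleTable(std::size_t max_entries) : _max(max_entries) {}
  HandleTable(const HandleTable&) = delete;
  HandleTable& operator=(const HandleTable&) = delete;
  ~HandleTable() { std::free(_entries); }

  // Linear scan over the entries; returns nullptr if the handle is unknown.
  HandleEntry* find(void* h) {
    for (std::size_t i = 0; i < _len; i++) {
      if (_entries[i].handle == h) return &_entries[i];
    }
    return nullptr;
  }

  // Adds a handle or bumps its refcount. Returns false if the cap is
  // reached or realloc fails (the old block stays valid in that case).
  bool add(void* h) {
    if (HandleEntry* e = find(h)) {
      e->refcount++;
      return true;
    }
    if (_len == _cap) {
      if (_cap >= _max) return false;
      std::size_t new_cap = (_cap == 0) ? 4 : _cap * 2;
      if (new_cap > _max) new_cap = _max;
      void* p = std::realloc(_entries, new_cap * sizeof(HandleEntry));
      if (p == nullptr) return false;
      _entries = static_cast<HandleEntry*>(p);
      _cap = new_cap;
    }
    _entries[_len].handle = h;
    _entries[_len].refcount = 1;
    _len++;
    return true;
  }

  std::size_t length() const { return _len; }
};
```

The cap keeps a runaway caller from growing the table without bound, while the doubling step means a process with only a few loaded libraries never allocates more than a handful of entries.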
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433972207 From stuefe at openjdk.org Thu Dec 21 12:16:54 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Thu, 21 Dec 2023 12:16:54 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v8] In-Reply-To: References: <1jQHeWyTuhMCLg2f-3-oMUECl3sjvkJNXmQoHYwxu1c=.d3355a3c-fe6f-46bb-bcff-968401a830f6@github.com> Message-ID: On Thu, 21 Dec 2023 11:54:17 GMT, Martin Doerr wrote: >> Dynamic allocation also opens us up to potential initialization issues, unless we explicitly use raw ::malloc. It should work, but I think its better avoided unless we really need it. > > Well we're fixing an academic issue by introducing another one? Doesn't make sense to me. Okay, I butt out, I don't care enough. Up to you both to decide what to do. My recommendation would still be to avoid hotspot infrastructure that relies on os::malloc and friends; other than that, rewriting this table to make it growable using realloc should be trivial. Note that we need *some* sort of limit though. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1433996779 From fyang at openjdk.org Thu Dec 21 12:48:55 2023 From: fyang at openjdk.org (Fei Yang) Date: Thu, 21 Dec 2023 12:48:55 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v7] In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 08:21:14 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. > > Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 11 additional commits since the last revision: > > - t2 caller saved, no need to push/pop > - Merge branch 'master' into sha256 > - Removed swap file > - Index load, other comment > - Merge branch 'master' into sha256 > - Materialize constants address once > - Removed template > - Flag fixes > - Merge branch 'master' into sha256 > - Share code > - ... and 1 more: https://git.openjdk.org/jdk/compare/1ebc5220...be46fe4f src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4039: > 4037: __ vslideup_vi(v16, v27, 2); // v16 = {d,c,_,_} > 4038: // Merge elements [1..0] of v26 ({a,b}) into elements [1..0] of v16 > 4039: __ vmerge_vvm(v16, v16, v26); // v16 = {d,c,b,a} Hi, Great to see that we are switching to use index-load to get {f,e,b,a},{h,g,d,c} pre-loop. But I was also expecting to use index-store to put {f,e,b,a},{h,g,d,c} back to {a,b,c,d},{e,f,g,h} post-loop as I mentioned in my previous comment [1]. Seems that was missed? I would prefer to have this change before we go. [1] https://github.com/openjdk/jdk/pull/16562#discussion_r1393767555 ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434027709 From dchuyko at openjdk.org Thu Dec 21 13:38:15 2023 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Thu, 21 Dec 2023 13:38:15 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v18] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. 
If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long-running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive, but this does not affect the application behavior. In such a case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (thus bypassing inlined methods). > > A natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method, letting the compile broker perform it while taking the new directives stack into account. Re-compilation helps to prevent hot methods from executing in the interpreter. > > A new flag `-r` has been introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them.
> > In addition, a new diagnostic command `Compiler.replace_directives`, has been added for ... Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 36 commits: - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - Merge branch 'openjdk:master' into compiler-directives-force-update - ... and 26 more: https://git.openjdk.org/jdk/compare/6de23bf3...b348ebed ------------- Changes: https://git.openjdk.org/jdk/pull/14111/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=17 Stats: 372 lines in 15 files changed: 339 ins; 3 del; 30 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From rehn at openjdk.org Thu Dec 21 13:45:55 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 21 Dec 2023 13:45:55 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v7] In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 12:44:25 GMT, Fei Yang wrote: >> Robbin Ehn has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. 
The pull request contains 11 additional commits since the last revision: >> >> - t2 caller saved, no need to push/pop >> - Merge branch 'master' into sha256 >> - Removed swap file >> - Index load, other comment >> - Merge branch 'master' into sha256 >> - Materialize constants address once >> - Removed template >> - Flag fixes >> - Merge branch 'master' into sha256 >> - Share code >> - ... and 1 more: https://git.openjdk.org/jdk/compare/2ecad50c...be46fe4f > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4039: > >> 4037: __ vslideup_vi(v16, v27, 2); // v16 = {d,c,_,_} >> 4038: // Merge elements [1..0] of v26 ({a,b}) into elements [1..0] of v16 >> 4039: __ vmerge_vvm(v16, v16, v26); // v16 = {d,c,b,a} > > Hi, Great to see that we are switching to use index-load to get {f,e,b,a},{h,g,d,c} pre-loop. But I was also expecting to use index-store to put {f,e,b,a},{h,g,d,c} back to {a,b,c,d},{e,f,g,h} post-loop as I mentioned in my previous comment [1]. Seems that was missed? I would prefer to have this change before we go. > > [1] https://github.com/openjdk/jdk/pull/16562#discussion_r1393767555 Yes, I missed that, thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434094296 From rehn at openjdk.org Thu Dec 21 14:41:06 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 21 Dec 2023 14:41:06 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v8] In-Reply-To: References: Message-ID: > Hi, please consider. > > Main author is @luhenry, I only fixed some minor things and tested it. > > Such as: > test/hotspot/jtreg/compiler/intrinsics/sha/ > test/jdk/java/security/MessageDigest/ > test/jdk/jdk/security/ > tier1 > > And still running some test. 
Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: index store state back ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16562/files - new: https://git.openjdk.org/jdk/pull/16562/files/be46fe4f..f4c511c7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=07 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=06-07 Stats: 32 lines in 2 files changed: 7 ins; 18 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/16562.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16562/head:pull/16562 PR: https://git.openjdk.org/jdk/pull/16562 From rehn at openjdk.org Thu Dec 21 14:44:55 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 21 Dec 2023 14:44:55 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v7] In-Reply-To: References: Message-ID: <8vUx-AzSEcm_M5qkOI53f_0bZXnERFmqdfdXIei_-xs=.ed1886a2-6408-4bc3-bca5-9294da701182@github.com> On Thu, 21 Dec 2023 13:42:47 GMT, Robbin Ehn wrote: >> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 4039: >> >>> 4037: __ vslideup_vi(v16, v27, 2); // v16 = {d,c,_,_} >>> 4038: // Merge elements [1..0] of v26 ({a,b}) into elements [1..0] of v16 >>> 4039: __ vmerge_vvm(v16, v16, v26); // v16 = {d,c,b,a} >> >> Hi, Great to see that we are switching to use index-load to get {f,e,b,a},{h,g,d,c} pre-loop. But I was also expecting to use index-store to put {f,e,b,a},{h,g,d,c} back to {a,b,c,d},{e,f,g,h} post-loop as I mentioned in my previous comment [1]. Seems that was missed? I would prefer to have this change before we go. >> >> [1] https://github.com/openjdk/jdk/pull/16562#discussion_r1393767555 > > Yes, I missed that, thanks! 
Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434160069 From rehn at openjdk.org Thu Dec 21 14:44:54 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Thu, 21 Dec 2023 14:44:54 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v8] In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 14:41:06 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > index store state back Passing compiler/intrinsics/sha/ ------------- PR Comment: https://git.openjdk.org/jdk/pull/16562#issuecomment-1866394076 From omikhaltcova at openjdk.org Thu Dec 21 14:49:57 2023 From: omikhaltcova at openjdk.org (Olga Mikhaltsova) Date: Thu, 21 Dec 2023 14:49:57 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v10] In-Reply-To: References: Message-ID: > Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform. > > In the table below it is shown that NaN argument should be processed as a special case.
>
>                                         RISC-V                   Java
>                                         (FCVT.W.S)  (FCVT.L.D)   (long round(double a))  (int round(float a))
> Minimum valid input (after rounding)    -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
> Maximum valid input (after rounding)    2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
> Output for out-of-range negative input  -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
> Output for -∞                           -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
> Output for out-of-range positive input  2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
> Output for +∞                           2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
> Output for NaN                          2^31 - 1    2^63 - 1     0                       0
>
> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:: > > **Before**
>
> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555  0.179  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760  0.103  ops/ms
>
> **After**
>
> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956  0.186  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947  0.122  ops/ms
Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: Added comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16382/files - new: https://git.openjdk.org/jdk/pull/16382/files/392671c1..a8aa809e Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=08-09 Stats: 11 lines in 1 file changed: 4 ins; 0 del; 7 mod Patch: https://git.openjdk.org/jdk/pull/16382.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16382/head:pull/16382 PR: https://git.openjdk.org/jdk/pull/16382 From omikhaltcova at openjdk.org Thu Dec 21 14:54:04 2023 From: omikhaltcova at openjdk.org (Olga Mikhaltsova) Date: Thu, 21 Dec 2023 14:54:04 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v11] In-Reply-To: References: Message-ID: > Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform. > > In the table below it is shown that NaN argument should be processed as a special case.
>
>                                         RISC-V                   Java
>                                         (FCVT.W.S)  (FCVT.L.D)   (long round(double a))  (int round(float a))
> Minimum valid input (after rounding)    -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
> Maximum valid input (after rounding)    2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
> Output for out-of-range negative input  -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
> Output for -∞                           -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
> Output for out-of-range positive input  2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
> Output for +∞                           2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
> Output for NaN                          2^31 - 1    2^63 - 1     0                       0
>
> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:: > > **Before**
>
> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555  0.179  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760  0.103  ops/ms
>
> **After**
>
> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956  0.186  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947  0.122  ops/ms
Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: Fix comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16382/files - new: https://git.openjdk.org/jdk/pull/16382/files/a8aa809e..72744a0a Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=09-10 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16382.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16382/head:pull/16382 PR: https://git.openjdk.org/jdk/pull/16382 From omikhaltcova at openjdk.org Thu Dec 21 15:01:52 2023 From: omikhaltcova at openjdk.org (Olga Mikhaltsova) Date: Thu, 21 Dec 2023 15:01:52 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v9] In-Reply-To: References: <4z1nbsMARrwj1y1o53A0huOQj6rxGZ6oUGIA_BFu8jI=.5238ee8f-5575-4b55-b811-b790312c22a4@github.com> Message-ID: On Thu, 21 Dec 2023 08:56:12 GMT, Hamlin Li wrote: >> Olga Mikhaltsova
has updated the pull request incrementally with one additional commit since the last revision: >> >> Replaced li with mv > > Looks good. > Can you add some more comments for java_round_float(or double)? As there were lots of discussion here, but all this information is not in the code. @Hamlin-Li I've just left some comments. Is this what was required? Could you take a look, please?! ------------- PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1866427379 From sgibbons at openjdk.org Thu Dec 21 15:21:08 2023 From: sgibbons at openjdk.org (Scott Gibbons) Date: Thu, 21 Dec 2023 15:21:08 GMT Subject: RFR: JDK-8320448 Accelerate IndexOf using AVX2 [v5] In-Reply-To: References: Message-ID: > Re-write the IndexOf code without the use of the pcmpestri instruction, only using AVX2 instructions. This change accelerates String.IndexOf on average 1.3x for AVX2. The benchmark numbers:
>
> Benchmark                                   Score     Latest  Latest/Score
> StringIndexOf.advancedWithMediumSub       343.573    317.934  0.925375393x
> StringIndexOf.advancedWithShortSub1      1039.081    1053.96  1.014319384x
> StringIndexOf.advancedWithShortSub2        55.828    110.541  1.980027943x
> StringIndexOf.constantPattern               9.361     11.906  1.271872663x
> StringIndexOf.searchCharLongSuccess         4.216      4.218  1.000474383x
> StringIndexOf.searchCharMediumSuccess       3.133      3.216  1.02649218x
> StringIndexOf.searchCharShortSuccess         3.76      3.761  1.000265957x
> StringIndexOf.success                       9.186      9.713  1.057369911x
> StringIndexOf.successBig                   14.341     46.343  3.231504079x
> StringIndexOfChar.latin1_AVX2_String     6220.918   12154.52  1.953814533x
> StringIndexOfChar.latin1_AVX2_char       5503.556   5540.044  1.006629895x
> StringIndexOfChar.latin1_SSE4_String     6978.854   6818.689  0.977049957x
> StringIndexOfChar.latin1_SSE4_char       5657.499   5474.624  0.967675646x
> StringIndexOfChar.latin1_Short_String    7132.541   6863.359  0.962260014x
> StringIndexOfChar.latin1_Short_char     16013.389  16162.437  1.009307711x
> StringIndexOfChar.latin1_mixed_String    7386.123  14771.622  1.999915517x
> StringIndexOfChar.latin1_mixed_char      9901.671   9782.245  0.987938803x
Scott Gibbons has updated the pull request incrementally with one additional commit since the last revision: Addressing review comments. ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16753/files - new: https://git.openjdk.org/jdk/pull/16753/files/48088348..63db0961 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=04 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16753&range=03-04 Stats: 43 lines in 4 files changed: 1 ins; 11 del; 31 mod Patch: https://git.openjdk.org/jdk/pull/16753.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16753/head:pull/16753 PR: https://git.openjdk.org/jdk/pull/16753 From mli at openjdk.org Thu Dec 21 15:24:55 2023 From: mli at openjdk.org (Hamlin Li) Date: Thu, 21 Dec 2023 15:24:55 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v11] In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 14:54:04 GMT, Olga Mikhaltsova wrote: >> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform. >> >> In the table below it is shown that NaN argument should be processed as a special case.
>>
>>                                         RISC-V                   Java
>>                                         (FCVT.W.S)  (FCVT.L.D)   (long round(double a))  (int round(float a))
>> Minimum valid input (after rounding)    -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
>> Maximum valid input (after rounding)    2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for out-of-range negative input  -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for -∞                           -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for out-of-range positive input  2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for +∞                           2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for NaN                          2^31 - 1    2^63 - 1     0                       0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:: >> >> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555  0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760  0.103  ops/ms
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956  0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947  0.122  ops/ms
> > Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: > > Fix comments Thanks for updating. Yes, something like that makes it better. Maybe more comments about the trick of `+ 0.5`? You could refer to the comments at https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L5899. BTW, some minor comments: 1. Can you move the code added in src/hotspot/cpu/riscv/macroAssembler_riscv.cpp up to line 4241? Just to move it out of the block of a bunch of macro definitions. 2. And, comment style, maybe change from `/**/` back to `//`, which is consistent with other comments for non-macro code. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1866476680 From mdoerr at openjdk.org Thu Dec 21 16:07:49 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Thu, 21 Dec 2023 16:07:49 GMT Subject: RFR: JDK-8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886 In-Reply-To: <-F8PPWW4OFqhFo3SDcATXD31yertxXeAhzJxoM-DlOg=.5d084da4-3e35-49ae-ac57-52074681468c@github.com> References: <-F8PPWW4OFqhFo3SDcATXD31yertxXeAhzJxoM-DlOg=.5d084da4-3e35-49ae-ac57-52074681468c@github.com> Message-ID: On Thu, 21 Dec 2023 09:12:33 GMT, Matthias Baesken wrote: > We notice failures/crashes on Alpine Linux, maybe after [JDK-8320886](https://bugs.openjdk.org/browse/JDK-8320886).
> test runtime/Unsafe/InternalErrorTest.java crashes on Alpine (works fine on other test OS/CPU platforms) : > > > # > # SIGSEGV (0xb) at pc=0x00007fd3c080064f, pid=7075, tid=7161 > # > # JRE version: OpenJDK Runtime Environment (23.0) (build 23-internal-adhoc.jenkinsi.jdk) > # Java VM: OpenJDK 64-Bit Server VM (23-internal-adhoc.jenkinsi.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # Problematic frame: > # C [ld-musl-x86_64.so.1+0x5464f] memset+0xa7 > # > > > Looks like the Alpine memset triggers unexpected SIGSEGV (not the expected SIGBUS). So we switch to a loop instead of memset. However I noticed that on Linux aarch64 the test starts to fail when the loop is used instead of the memset, so I keep the old coding on this platform . I think it's very bad to let C code run into signals (SIGBUS or SIGSEGV). In addition, the signal handler just skips the faulting instruction and continues with the next one. That does not yield defined behavior! (On linux aarch64, the new loop doesn't work, because gcc generates an strb instruction with internal address increment. The signal prevents both, the store and the address increment, so we end up in an endless loop.) If we only want to do a simple fix for Alpine x86_64, I suggest to use the new code only on that platform. It's bad for other platforms, too, but it works because it gets tested. (Doesn't mean that it will always work with every compiler. So, there's still room for improvement.) ------------- Changes requested by mdoerr (Reviewer). 
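The endless-loop argument above can be illustrated with a toy model. The sketch below is not real signal handling and does not execute AArch64 code. It only simulates a fill loop in which a "handler" skips any store that faults, once with the store and the address increment fused into a single step (as with a post-indexed `strb`) and once with them as separate steps. The function name, its parameters and the step limit are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Simulates filling `n` bytes where the store to index `bad` always faults
// and the handler resumes at the next instruction. Returns the number of
// loop iterations executed, or `limit` if the loop did not finish in time.
std::size_t simulate_fill(std::size_t n, std::size_t bad, bool fused,
                          std::size_t limit) {
  std::size_t i = 0;      // models the address register
  std::size_t steps = 0;
  while (i < n && steps < limit) {
    steps++;
    bool faults = (i == bad);
    if (fused) {
      // One instruction performs store + increment; skipping it skips both,
      // so the address register never moves past the faulting byte.
      if (!faults) {
        i++;
      }
    } else {
      // The store is its own instruction (skipped on a fault); the
      // increment is a separate instruction and still executes.
      i++;
    }
  }
  return steps;
}
```

In this model the separate-step variant finishes after `n` iterations even though one store is silently lost, while the fused variant only stops because of the artificial `limit`.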
PR Review: https://git.openjdk.org/jdk/pull/17175#pullrequestreview-1793264095 From epeter at openjdk.org Thu Dec 21 16:29:02 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Thu, 21 Dec 2023 16:29:02 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v10] In-Reply-To: References: Message-ID: > I'm making sure that `allocate_bci_to_data` is only called when holding the `extra_data_lock`, so that no concurrent calls of it can ever occur. > > Testing: tier1-3 and stress. Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: removed some ttyl cases, which collided with the extra_data_lock ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16840/files - new: https://git.openjdk.org/jdk/pull/16840/files/30e5aebc..0ec53712 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16840&range=08-09 Stats: 28 lines in 7 files changed: 5 ins; 2 del; 21 mod Patch: https://git.openjdk.org/jdk/pull/16840.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16840/head:pull/16840 PR: https://git.openjdk.org/jdk/pull/16840 From matsaave at openjdk.org Thu Dec 21 17:47:37 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Thu, 21 Dec 2023 17:47:37 GMT Subject: RFR: 8320276: Improve class initialization barrier in TemplateTable::_new [v2] In-Reply-To: References: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> Message-ID: On Thu, 21 Dec 2023 02:51:41 GMT, David Holmes wrote: > The code itself seems fine. But I don't see any supporting evidence of how this optimisation performs. IIUC the only case it benefits is when a static initializer creates an instance of the class being initialized - how typical is that? > > The definitions of `supports_fast_class_init_checks` all have the comment: > > ``` > // ... 
supports fast class initialization checks for static methods. > ``` > > which needs updating now it is not just for static methods. (Actually the comments became out-dated with [JDK-8223216](https://bugs.openjdk.org/browse/JDK-8223216) but it would be nice to fix them all.) > > Would be nice to see this applied across all platforms too - or at least those that support fast_class_init - ppc and s390. Perhaps file follow-up RFE's for those platforms so this is not forgotten. I note that RISCV64 doesn't support fast_class_init at all yet. > > Thanks Thank you for pointing this out, I did mean to report some performance metrics. While I'm not sure how typical this case is, the performance results indicate a meaningful improvement. I will attach them to the description at the top of this PR. With regard to other platforms, I agree that it's a good idea to open RFE's for the remaining platforms. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17006#issuecomment-1866703207 From matsaave at openjdk.org Thu Dec 21 17:51:04 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Thu, 21 Dec 2023 17:51:04 GMT Subject: RFR: 8320276: Improve class initialization barrier in TemplateTable::_new [v3] In-Reply-To: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> References: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> Message-ID: > The class initialization barrier in TemplateTable::_new fast path check ensures that the class being instantiated is fully initialized. It can be improved by introducing additional fast path check when current thread is initializer thread as MacroAssembler::clinit_barrier() does. It avoids repeated calls into interpreter runtime for classes being initialized. This patch adds the optimization for x86 and aarch64. Verified with tier 1-5 tests. 
> > This change was tested with Spring Petclinic which reported the following startup times: > > Clean build: #### Booted and returned in 161941ms > Patched build: #### Booted and returned in 160657ms Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Fixed comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17006/files - new: https://git.openjdk.org/jdk/pull/17006/files/dc9e5ae3..51a0ce10 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17006&range=02 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17006&range=01-02 Stats: 2 lines in 2 files changed: 1 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17006.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17006/head:pull/17006 PR: https://git.openjdk.org/jdk/pull/17006 From omikhaltcova at openjdk.org Thu Dec 21 23:02:55 2023 From: omikhaltcova at openjdk.org (Olga Mikhaltsova) Date: Thu, 21 Dec 2023 23:02:55 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v12] In-Reply-To: References: Message-ID: <7sdld7t1oTEM_k08EV0expeUu3xKNhaYn9AqISytRSY=.7691d9ae-9edd-42d6-ba43-3f875aa25682@github.com> > Please review this implementation of the roundD/roundF intrinsics for the RISC-V platform. > > The table below shows that a NaN argument must be handled as a special case. >
>                                         RISC-V                  Java
>                                         (FCVT.W.S)  (FCVT.L.D)  (long round(double a))  (int round(float a))
> Minimum valid input (after rounding)    -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
> Maximum valid input (after rounding)    2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
> Output for out-of-range negative input  -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
> Output for -∞                           -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
> Output for out-of-range positive input  2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
> Output for +∞                           2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
> Output for NaN                          2^31 - 1    2^63 - 1    0                       0
>
> The benchmark run with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>
> **Before**
>
> Benchmark                              (TESTSIZE)  Mode   Cnt  Score    Error  Units
> FpRoundingBenchmark.test_round_double  2048        thrpt  15   59.555   0.179  ops/ms
> FpRoundingBenchmark.test_round_float   2048        thrpt  15   49.760   0.103  ops/ms
>
> **After**
>
> Benchmark                              (TESTSIZE)  Mode   Cnt  Score    Error  Units
> FpRoundingBenchmark.test_round_double  2048        thrpt  15   110.956  0.186  ops/ms
> FpRoundingBenchmark.test_round_float   2048        thrpt  15   115.947  0.122  ops/ms
Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: Moved the code up + comments ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16382/files - new: https://git.openjdk.org/jdk/pull/16382/files/72744a0a..4f56afb7 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=11 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=10-11 Stats: 54 lines in 1 file changed: 22 ins; 18 del; 14 mod Patch: https://git.openjdk.org/jdk/pull/16382.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16382/head:pull/16382 PR: https://git.openjdk.org/jdk/pull/16382 From omikhaltcova at openjdk.org Thu Dec 21 23:26:42 2023 From: omikhaltcova at openjdk.org (Olga Mikhaltsova) Date: Thu, 21 Dec 2023 23:26:42 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v11] In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 15:22:11 GMT, Hamlin Li wrote: >> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision: >> >> Fix comments > > Thanks for updating. > Yes, something like that makes it better. > > Maybe more comments about the trick of `+ 0.5`? You could refer to comments at https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp#L5899. > > BTW, some minor comments: > 1.
can you move the code added in src/hotspot/cpu/riscv/macroAssembler_riscv.cpp up to line 4241? Just to move it out of the block of a bunch of macro definitions. > 2. And, comment style, maybe change from `/**/` back to `//`, which is consistent with other comments for non-macro code. @Hamlin-Li Thanks for your advice! Fixed. IMHO this will be enough; otherwise, for a clearer understanding, a specific example would have to be given in the comments showing that rounding with ties toward positive infinity is equivalent to a sequential fadd with RDN followed by fcvt with RDN. ------------- PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1867046416 From dholmes at openjdk.org Fri Dec 22 00:32:47 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 22 Dec 2023 00:32:47 GMT Subject: RFR: 8320276: Improve class initialization barrier in TemplateTable::_new [v3] In-Reply-To: References: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> Message-ID: On Thu, 21 Dec 2023 17:51:04 GMT, Matias Saavedra Silva wrote: >> The class initialization barrier in TemplateTable::_new fast path check ensures that the class being instantiated is fully initialized. It can be improved by introducing additional fast path check when current thread is initializer thread as MacroAssembler::clinit_barrier() does. It avoids repeated calls into interpreter runtime for classes being initialized. This patch adds the optimization for x86 and aarch64. Verified with tier 1-5 tests. >> >> This change was tested with Spring Petclinic which reported the following startup times: >> >> Clean build: #### Booted and returned in 161941ms >> Patched build: #### Booted and returned in 160657ms > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Fixed comments Please update the comments on `supports_fast_class_init_checks` for all architectures so they are consistent.
As it just a comment there are no concerns about build/test. Thanks. ------------- Changes requested by dholmes (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17006#pullrequestreview-1793813583 From stevenschlansker at gmail.com Fri Dec 22 01:12:18 2023 From: stevenschlansker at gmail.com (Steven Schlansker) Date: Thu, 21 Dec 2023 17:12:18 -0800 Subject: CDS filemap fastdebug assert while loading Graal CE Polyglot in isolated classloader Message-ID: Hi hotspot-dev, I tried to submit a JVM crash report through bugreport.java.com, but my attempts to submit it are rejected with a 'Request method 'POST' not supported' error. So I will send it here as a backup. While debugging a still-unexplained 'IncompatibleClassChangeError: disagree on InnerClasses attribute', I tried to run our application in a fastdebug jvm with additional logging. Instead of reproducing the original issue, I hit an assertion error in CDS. # assert(ent->in_named_module()) failed: must be # Internal Error (/build/src/hotspot/share/cds/filemap.cpp:590), pid=633408, tid=633409 V [libjvm.so+0x1414d60] ModuleEntry::ModuleEntry(Handle, bool, Symbol*, Symbol*, Symbol*, ClassLoaderData*)+0x300 (moduleEntry.cpp:63) V [libjvm.so+0x1414fc1] ModuleEntryTable::locked_create_entry(Handle, bool, Symbol*, Symbol*, Symbol*, ClassLoaderData*)+0x1c1 (moduleEntry.cpp:619) V [libjvm.so+0x141c23d] Modules::define_module(Handle, unsigned char, _jstring*, _jstring*, _jobjectArray*, JavaThread*)+0x1f1d (modules.cpp:402) V [libjvm.so+0x101b099] JVM_DefineModule+0xb9 (jvm.cpp:1112) V [libjvm.so+0xc130e7] FileMapInfo::get_module_shared_path_index(Symbol*)+0x377 (filemap.cpp:590) Fedora 39 Linux 6.6.6 x86_64 Reproduced on java 21u26 and 23 (519ecd352a66633589f160db7390647d90e36b99) hs_err: https://gist.github.com/stevenschlansker/14b748af7758e4ea846ac22f12e53453 To reproduce, this is the source: import java.io.File; import java.io.UncheckedIOException; import java.net.MalformedURLException; import java.net.URL; import 
java.net.URLClassLoader; import java.util.stream.Stream; public class PolyglotBoom { public static void main(final String... args) throws Exception { final var cl = new URLClassLoader( Stream.of( "polyglot", "nativeimage", "truffle-api", "js-language", "word", "collections", "icu4j", "regex") .map(fn -> { try { return new File("tmp/" + fn + "-23.1.1.jar").toURL(); } catch (final MalformedURLException e) { throw new UncheckedIOException(e); } }) .toArray(URL[]::new)); final var engine = Class.forName("org.graalvm.polyglot.Engine", false, cl) .getMethod("create") .invoke(null); System.err.println("engine = " + engine); } } You must use a **fastdebug** build to 'make images' target to trigger the assertion You need the Graal CE 23.1.1 distribution placed in 'tmp/', I got it from https://repo1.maven.org/maven2/org/graalvm/ : polyglot-23.1.1.jar nativeimage-23.1.1.jar truffle-api-23.1.1.jar js-language-23.1.1.jar word-23.1.1.jar collections-23.1.1.jar icu4j-23.1.1.jar regex-23.1.1.jar Also, at least glassfish jaxb-runtime is needed also in 'tmp/' : https://repo1.maven.org/maven2/org/glassfish/jaxb/jaxb-runtime/2.3.1/jaxb-runtime-2.3.1.jar Run attached program: % javac PolyglotBoom.java % jar cf tmp.jar PolyglotBoom.class % ~/code/jdk/build/linux-x86_64-server-fastdebug/images/jdk/bin/java -cp tmp.jar:tmp/\* -XX:ArchiveClassesAtExit=archive.jsa PolyglotBoom Please let me know if I can provide any additional information. And it would be good to get the bug reporting tool fixed :) Thank you! 
From david.holmes at oracle.com Fri Dec 22 01:31:52 2023 From: david.holmes at oracle.com (David Holmes) Date: Fri, 22 Dec 2023 11:31:52 +1000 Subject: CDS filemap fastdebug assert while loading Graal CE Polyglot in isolated classloader In-Reply-To: References: Message-ID: <27cb1c7c-d2f2-4cf0-801a-68efd663ff08@oracle.com> Hi Steven, On 22/12/2023 11:12 am, Steven Schlansker wrote: > Hi hotspot-dev, > > I tried to submit a JVM crash report through bugreport.java.com, but > my attempts to submit it are rejected with a 'Request method 'POST' > not supported' error. So I will send it here as a backup. I've filed https://bugs.openjdk.org/browse/JDK-8322657 for you. Regards, David ----- > While debugging a still-unexplained 'IncompatibleClassChangeError: > disagree on InnerClasses attribute', I tried to run our application in > a fastdebug jvm with additional logging. Instead of reproducing the > original issue, I hit an assertion error in CDS. > > # assert(ent->in_named_module()) failed: must be > # Internal Error (/build/src/hotspot/share/cds/filemap.cpp:590), > pid=633408, tid=633409 > > V [libjvm.so+0x1414d60] ModuleEntry::ModuleEntry(Handle, bool, > Symbol*, Symbol*, Symbol*, ClassLoaderData*)+0x300 > (moduleEntry.cpp:63) > V [libjvm.so+0x1414fc1] ModuleEntryTable::locked_create_entry(Handle, > bool, Symbol*, Symbol*, Symbol*, ClassLoaderData*)+0x1c1 > (moduleEntry.cpp:619) > V [libjvm.so+0x141c23d] Modules::define_module(Handle, unsigned char, > _jstring*, _jstring*, _jobjectArray*, JavaThread*)+0x1f1d > (modules.cpp:402) > V [libjvm.so+0x101b099] JVM_DefineModule+0xb9 (jvm.cpp:1112) > V [libjvm.so+0xc130e7] > FileMapInfo::get_module_shared_path_index(Symbol*)+0x377 > (filemap.cpp:590) > > Fedora 39 Linux 6.6.6 x86_64 > Reproduced on java 21u26 and 23 (519ecd352a66633589f160db7390647d90e36b99) > > hs_err: https://gist.github.com/stevenschlansker/14b748af7758e4ea846ac22f12e53453 > > To reproduce, this is the source: > > import java.io.File; > import 
java.io.UncheckedIOException; > import java.net.MalformedURLException; > import java.net.URL; > import java.net.URLClassLoader; > import java.util.stream.Stream; > > public class PolyglotBoom { > public static void main(final String... args) throws Exception { > final var cl = new URLClassLoader( > Stream.of( > "polyglot", > "nativeimage", > "truffle-api", > "js-language", > "word", > "collections", > "icu4j", > "regex") > .map(fn -> { > try { > return new File("tmp/" + fn + > "-23.1.1.jar").toURL(); > } catch (final MalformedURLException e) { > throw new UncheckedIOException(e); > } > }) > .toArray(URL[]::new)); > final var engine = > Class.forName("org.graalvm.polyglot.Engine", false, cl) > .getMethod("create") > .invoke(null); > System.err.println("engine = " + engine); > } > } > > You must use a **fastdebug** build to 'make images' target to trigger > the assertion > > You need the Graal CE 23.1.1 distribution placed in 'tmp/', I got it from > https://repo1.maven.org/maven2/org/graalvm/ : > > polyglot-23.1.1.jar > nativeimage-23.1.1.jar > truffle-api-23.1.1.jar > js-language-23.1.1.jar > word-23.1.1.jar > collections-23.1.1.jar > icu4j-23.1.1.jar > regex-23.1.1.jar > > Also, at least glassfish jaxb-runtime is needed also in 'tmp/' : > https://repo1.maven.org/maven2/org/glassfish/jaxb/jaxb-runtime/2.3.1/jaxb-runtime-2.3.1.jar > > Run attached program: > % javac PolyglotBoom.java > % jar cf tmp.jar PolyglotBoom.class > % ~/code/jdk/build/linux-x86_64-server-fastdebug/images/jdk/bin/java > -cp tmp.jar:tmp/\* -XX:ArchiveClassesAtExit=archive.jsa PolyglotBoom > > Please let me know if I can provide any additional information. And it > would be good to get the bug reporting tool fixed :) > Thank you! 
From kbarrett at openjdk.org Fri Dec 22 02:00:50 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 22 Dec 2023 02:00:50 GMT Subject: RFR: 8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap In-Reply-To: References: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> Message-ID: On Thu, 21 Dec 2023 11:10:43 GMT, Johan Sjölen wrote: > ... I also asked for some "length"/"size" naming to be changed to "capacity", you don't have to do this as it's pre-existing, but it would make that code clearer. I think I only commented on one in my pass over the code, but I agree with all of @jdksjolen's suggestions for those. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17160#issuecomment-1867135604 From kbarrett at openjdk.org Fri Dec 22 02:25:47 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 22 Dec 2023 02:25:47 GMT Subject: RFR: 8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap In-Reply-To: References: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> <-cpYKTYXVhVfplPdXUKYt0ZcT-ql497vy0UZRrGREtk=.8fd5f503-ae50-4e8e-a5d8-65824465e5fd@github.com> Message-ID: On Thu, 21 Dec 2023 06:11:03 GMT, Emanuel Peter wrote: >> src/hotspot/share/memory/arena.hpp line 209: >> >>> 207: >>> 208: #ifdef ASSERT >>> 209: bool Arena_contains(const Arena* arena, const void* ptr); >> >> This function doesn't seem necessary. Directly calling arena->contains(ptr) in the one place it's being used seems >> like it should suffice. > > @kimbarrett the reason was that I need to call this from the hpp file, and I encountered some circular dependency I could not resolve. So I needed to move something off to the cpp files. Either I put it in arena.cpp, or in growableArray.cpp.
But If I put things off to growableArray.cpp from the GrowableArray class, then it will not easily instantiate the templates, so that is not possible then. Hence I have to put it into arena.cpp I don't think a global API is warranted to support that local debug-only implementation detail. The inclusion problem arises because the PR eliminates GrowableArrayMetadata (a non-templated class), thereby forcing the init_checks helper to be moved to GrowableArray (a class template). That forced moving the implementation from the .cpp to the .hpp. One solution might be to move the new version of init_checks to growableArray.inline.hpp. After all, breaking circularities is one of the reasons one might have an inline.hpp file. However, that file doesn't currently exist, and I think introducing it would have too much fannout. And it might not even work without even more effort; there might be .hpp files that contain allocations of GrowableArray. Much simpler, and I think probably better, is to keep GrowableArrayMetadata (perhaps under a different name - GrowableArrayArenaHolder?), now holding the allocation arena and providing init_checks (and anything else that seems appropriate), with init_checks defined similarly to the current definition in the .cpp file. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17160#discussion_r1434655989 From matsaave at openjdk.org Fri Dec 22 05:08:19 2023 From: matsaave at openjdk.org (Matias Saavedra Silva) Date: Fri, 22 Dec 2023 05:08:19 GMT Subject: RFR: 8320276: Improve class initialization barrier in TemplateTable::_new [v4] In-Reply-To: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> References: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> Message-ID: > The class initialization barrier in TemplateTable::_new fast path check ensures that the class being instantiated is fully initialized. 
It can be improved by introducing additional fast path check when current thread is initializer thread as MacroAssembler::clinit_barrier() does. It avoids repeated calls into interpreter runtime for classes being initialized. This patch adds the optimization for x86 and aarch64. Verified with tier 1-5 tests. > > This change was tested with Spring Petclinic which reported the following startup times: > > Clean build: #### Booted and returned in 161941ms > Patched build: #### Booted and returned in 160657ms Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: Added comment to remaining platforms ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17006/files - new: https://git.openjdk.org/jdk/pull/17006/files/51a0ce10..18f17cd9 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17006&range=03 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17006&range=02-03 Stats: 2 lines in 2 files changed: 0 ins; 0 del; 2 mod Patch: https://git.openjdk.org/jdk/pull/17006.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17006/head:pull/17006 PR: https://git.openjdk.org/jdk/pull/17006 From fyang at openjdk.org Fri Dec 22 05:47:53 2023 From: fyang at openjdk.org (Fei Yang) Date: Fri, 22 Dec 2023 05:47:53 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v8] In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 14:41:06 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > index store state back Thanks for the update. I still have several comments after a more closer look. 
src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3705: > 3703: // vl1reXX.v v15, ofs > 3704: // > 3705: // // Increment word contant address by stride (16/32 bytes, 4*4B/8B, 128b/256b) Nit: s/contant/constant/ src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3812: > 3810: if (vset_sew == Assembler::e64 && !multi_block) return "sha512_implCompress"; > 3811: if (vset_sew == Assembler::e64 && multi_block) return "sha512_implCompressMB"; > 3812: return "bad name lookup"; Maybe place a `ShouldNotReachHere();` immediately before the last return statement? src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3935: > 3933: // x0 is not written, we known the number of vector elements. > 3934: > 3935: __ vsetivli(x0, 4, vset_sew, Assembler::m1, Assembler::ma, Assembler::ta); Shouldn't we set `LMUL` to `Assembler::m2` when `vset_sew` is `e64`? A single 128-bit wide vector register won't be able to hold 4 `e64` elements. I guess you were testing against 256-bit wide vector register so that this issue won't trigger? This also means most of the code won't work for sha512 with 128-bit RVV as `Assembler::m2` means register group of two. This menifests in the corresponding openssl sha512 implementation [1]. [1] https://github.com/openssl/openssl/blob/master/crypto/sha/asm/sha512-riscv64-zvkb-zvknhb.pl src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3942: > 3940: > 3941: // Step-over a,b, so we are pointing to c. > 3942: // const_add is equal to 4x state variable, div by 2 is thus 2, a,b I don't quite understand this code comment. ------------- Changes requested by fyang (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/16562#pullrequestreview-1793885724 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434664804 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434718482 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434708758 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434729652 From dholmes at openjdk.org Fri Dec 22 06:39:38 2023 From: dholmes at openjdk.org (David Holmes) Date: Fri, 22 Dec 2023 06:39:38 GMT Subject: RFR: 8320276: Improve class initialization barrier in TemplateTable::_new [v4] In-Reply-To: References: <8gg5lfFYH3aKKYXqWpQ6AP9be_rW4eMEJv-Vx5yBJTU=.3eeabdc8-08cc-4e3d-8275-c68621c12302@github.com> Message-ID: <8tff1HjtQpBC-XjBpbp05a0gVVsAxc90dT9fPTHAPfE=.33c0d809-919c-4ae7-9eae-1e740683503d@github.com> On Fri, 22 Dec 2023 05:08:19 GMT, Matias Saavedra Silva wrote: >> The class initialization barrier in TemplateTable::_new fast path check ensures that the class being instantiated is fully initialized. It can be improved by introducing additional fast path check when current thread is initializer thread as MacroAssembler::clinit_barrier() does. It avoids repeated calls into interpreter runtime for classes being initialized. This patch adds the optimization for x86 and aarch64. Verified with tier 1-5 tests. >> >> This change was tested with Spring Petclinic which reported the following startup times: >> >> Clean build: #### Booted and returned in 161941ms >> Patched build: #### Booted and returned in 160657ms > > Matias Saavedra Silva has updated the pull request incrementally with one additional commit since the last revision: > > Added comment to remaining platforms Thanks ------------- Marked as reviewed by dholmes (Reviewer). 
PR Review: https://git.openjdk.org/jdk/pull/17006#pullrequestreview-1794034996 From rehn at openjdk.org Fri Dec 22 07:06:43 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 22 Dec 2023 07:06:43 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v8] In-Reply-To: References: Message-ID: On Fri, 22 Dec 2023 02:44:13 GMT, Fei Yang wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> index store state back > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3705: > >> 3703: // vl1reXX.v v15, ofs >> 3704: // >> 3705: // // Increment word contant address by stride (16/32 bytes, 4*4B/8B, 128b/256b) > > Nit: s/contant/constant/ Fixed ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434783000 From rehn at openjdk.org Fri Dec 22 07:12:44 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 22 Dec 2023 07:12:44 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v8] In-Reply-To: References: Message-ID: On Fri, 22 Dec 2023 05:10:11 GMT, Fei Yang wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> index store state back > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3942: > >> 3940: >> 3941: // Step-over a,b, so we are pointing to c. >> 3942: // const_add is equal to 4x state variable, div by 2 is thus 2, a,b > > I don't quite understand this code comment. State is: `{a,b,c,d,e,f,g,h}` The index used assumes the pointer is at 'c'. As we know const_add is equal to the size of 4 state variables, if we divide it by 2 we have the size of two state variables. If that make sense, can you suggest a comment change? 
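The offset arithmetic described above can be checked with a small worked example. This is only a sketch of the reasoning in the reply, not code from the patch: the class and method names are hypothetical, and it assumes the usual SHA-2 word sizes (4-byte state variables for SHA-256, 8-byte for SHA-512) together with the statement in the thread that const_add equals the size of 4 state variables.

```java
/** Illustrative arithmetic only; names are hypothetical and not taken from the patch. */
public class StateOffsetSketch {

    // Assumption from the thread: const_add equals the size of 4 state variables.
    // stateVarBytes is 4 for SHA-256 and 8 for SHA-512.
    static int constAdd(int stateVarBytes) {
        return 4 * stateVarBytes;
    }

    // The state is laid out as {a,b,c,d,e,f,g,h}. Stepping over a and b means
    // advancing by two state variables, which is half of const_add.
    static int offsetOfC(int stateVarBytes) {
        return constAdd(stateVarBytes) / 2;
    }

    public static void main(String[] args) {
        System.out.println("SHA-256: c starts at byte " + offsetOfC(4));
        System.out.println("SHA-512: c starts at byte " + offsetOfC(8));
    }
}
```

Under these assumptions the pointer lands 8 bytes into the state for SHA-256 and 16 bytes for SHA-512, which is one possible way to phrase the source comment: "const_add is the size of four state variables, so const_add / 2 skips a and b".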
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434785945 From mbaesken at openjdk.org Fri Dec 22 08:12:57 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 22 Dec 2023 08:12:57 GMT Subject: RFR: JDK-8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886 In-Reply-To: <-F8PPWW4OFqhFo3SDcATXD31yertxXeAhzJxoM-DlOg=.5d084da4-3e35-49ae-ac57-52074681468c@github.com> References: <-F8PPWW4OFqhFo3SDcATXD31yertxXeAhzJxoM-DlOg=.5d084da4-3e35-49ae-ac57-52074681468c@github.com> Message-ID: On Thu, 21 Dec 2023 09:12:33 GMT, Matthias Baesken wrote: > We notice failures/crashes on Alpine Linux, maybe after [JDK-8320886](https://bugs.openjdk.org/browse/JDK-8320886). > test runtime/Unsafe/InternalErrorTest.java crashes on Alpine (works fine on other test OS/CPU platforms) : > > > # > # SIGSEGV (0xb) at pc=0x00007fd3c080064f, pid=7075, tid=7161 > # > # JRE version: OpenJDK Runtime Environment (23.0) (build 23-internal-adhoc.jenkinsi.jdk) > # Java VM: OpenJDK 64-Bit Server VM (23-internal-adhoc.jenkinsi.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # Problematic frame: > # C [ld-musl-x86_64.so.1+0x5464f] memset+0xa7 > # > > > Looks like the Alpine memset triggers unexpected SIGSEGV (not the expected SIGBUS). So we switch to a loop instead of memset. However I noticed that on Linux aarch64 the test starts to fail when the loop is used instead of the memset, so I keep the old coding on this platform . Hi Martin, > If we only want to do a simple fix for Alpine x86_64, I suggest to use the new code only on that platform. I adjusted the code so that the loop is only done on Alpine/MUSL . 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17175#issuecomment-1867370002 From mbaesken at openjdk.org Fri Dec 22 08:12:56 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 22 Dec 2023 08:12:56 GMT Subject: RFR: JDK-8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886 [v2] In-Reply-To: <-F8PPWW4OFqhFo3SDcATXD31yertxXeAhzJxoM-DlOg=.5d084da4-3e35-49ae-ac57-52074681468c@github.com> References: <-F8PPWW4OFqhFo3SDcATXD31yertxXeAhzJxoM-DlOg=.5d084da4-3e35-49ae-ac57-52074681468c@github.com> Message-ID: > We notice failures/crashes on Alpine Linux, maybe after [JDK-8320886](https://bugs.openjdk.org/browse/JDK-8320886). > test runtime/Unsafe/InternalErrorTest.java crashes on Alpine (works fine on other test OS/CPU platforms) : > > > # > # SIGSEGV (0xb) at pc=0x00007fd3c080064f, pid=7075, tid=7161 > # > # JRE version: OpenJDK Runtime Environment (23.0) (build 23-internal-adhoc.jenkinsi.jdk) > # Java VM: OpenJDK 64-Bit Server VM (23-internal-adhoc.jenkinsi.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # Problematic frame: > # C [ld-musl-x86_64.so.1+0x5464f] memset+0xa7 > # > > > Looks like the Alpine memset triggers unexpected SIGSEGV (not the expected SIGBUS). So we switch to a loop instead of memset. However I noticed that on Linux aarch64 the test starts to fail when the loop is used instead of the memset, so I keep the old coding on this platform . 
Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: Do loop only on Alpine/MUSL ------------- Changes: - all: https://git.openjdk.org/jdk/pull/17175/files - new: https://git.openjdk.org/jdk/pull/17175/files/9e12df94..b1526976 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=17175&range=01 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=17175&range=00-01 Stats: 5 lines in 1 file changed: 2 ins; 2 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/17175.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17175/head:pull/17175 PR: https://git.openjdk.org/jdk/pull/17175 From epeter at openjdk.org Fri Dec 22 08:41:53 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 22 Dec 2023 08:41:53 GMT Subject: RFR: 8306767: Concurrent repacking of extra data in MethodData is potentially unsafe [v10] In-Reply-To: References: Message-ID: On Tue, 28 Nov 2023 14:29:45 GMT, Erik Österlund wrote: >> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision: >> >> removed some ttyl cases, which collided with the extra_data_lock > > Looks good! @fisk @tkrodriguez I continued working on this now, and updated the description. It seems to pass the tests. One open question is about `MethodData::bci_to_extra_data`: The idea used to be that we would first read the extra data without a lock (concurrent == true): `ProfileData* result = bci_to_extra_data_helper(bci, m, dp, true);` and then if the record was missing, we would take the `extra_data_lock`, and then with `concurrent = false`: `ProfileData* result = bci_to_extra_data_helper(bci, m, dp, false);` and then if the record is still missing, we would allocate a new one. Now, this code is pointless, there is no concurrent action any more because the whole method is under the `extra_data_lock`. Should I refactor the code there, or leave that to a cleanup-RFE? (for now I left a TODO in the patch).
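To make the open question concrete, here is a rough Java analogy of what the lookup might reduce to once the whole method runs under the lock. It is not HotSpot code: the class, the map, and the string records are hypothetical stand-ins for `MethodData`, the extra data section, and `ProfileData`. The point it tries to show is that the unlocked first probe and the locked second probe collapse into a single lookup-or-allocate step.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the extra-data lookup; not HotSpot code. */
public class ExtraDataSketch {
    // Plays the role of extra_data_lock in this analogy.
    private final Object extraDataLock = new Object();
    // Plays the role of the extra data section, keyed by bci.
    private final Map<Integer, String> records = new HashMap<>();

    // With every access under the lock there is no concurrent reader left,
    // so one lookup followed by an optional allocation is sufficient.
    public String bciToExtraData(int bci, boolean createIfMissing) {
        synchronized (extraDataLock) {
            String result = records.get(bci);
            if (result == null && createIfMissing) {
                result = "record@" + bci; // allocate a new record
                records.put(bci, result);
            }
            return result;
        }
    }
}
```

Whether the real helper can drop its `concurrent` parameter entirely depends on its other callers, which this sketch does not model.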
------------- PR Comment: https://git.openjdk.org/jdk/pull/16840#issuecomment-1867395966 From mli at openjdk.org Fri Dec 22 08:45:54 2023 From: mli at openjdk.org (Hamlin Li) Date: Fri, 22 Dec 2023 08:45:54 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v8] In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 14:41:06 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. > > Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: > > index store state back Some other comments src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3790: > 3788: // the cost of moving those vectors at the end of each quad-rounds. > 3789: void sha2_quad_round(Assembler::SEW vset_sew, VectorRegister rot1, VectorRegister rot2, VectorRegister rot3, VectorRegister rot4, > 3790: Register scalarconst, VectorRegister vtemp, VectorRegister vtemp2, VectorRegister vtemp3, VectorRegister vtemp4, maybe rename `vtemp3` -> `v_abef`, `vtemp4` -> `v_cdgh` src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3792: > 3790: Register scalarconst, VectorRegister vtemp, VectorRegister vtemp2, VectorRegister vtemp3, VectorRegister vtemp4, > 3791: bool gen_words = true, bool step_const = true) { > 3792: __ vl1reXX_v(vset_sew, vtemp, scalarconst); Seems we only incr `scalarconst ` conditionally, so load into `vtemp ` here could also be conditionally? src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3803: > 3801: } > 3802: if (gen_words) { > 3803: __ vsha2ms_vv(rot1, vtemp2, rot4); when `gen_words == false` && `step_const == true`, is it necessary to call `vmerge_vvm(vtemp2, rot3, rot2);` above? src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3935: > 3933: // x0 is not written, we known the number of vector elements. 
> 3934: > 3935: __ vsetivli(x0, 4, vset_sew, Assembler::m1, Assembler::ma, Assembler::ta); Currently, when MaxVectorSize < 16 UseRVV = false, so there are conditions when MaxVectorSize == 16 && UseRVV == true, in this case, `vsetivli` will not work as expected, and neither the following codes. And 128 bits is the common one? ------------- PR Review: https://git.openjdk.org/jdk/pull/16562#pullrequestreview-1793619266 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434480065 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434484955 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434480079 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434480184 From mdoerr at openjdk.org Fri Dec 22 09:29:38 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Fri, 22 Dec 2023 09:29:38 GMT Subject: RFR: JDK-8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886 [v2] In-Reply-To: References: <-F8PPWW4OFqhFo3SDcATXD31yertxXeAhzJxoM-DlOg=.5d084da4-3e35-49ae-ac57-52074681468c@github.com> Message-ID: On Fri, 22 Dec 2023 08:12:56 GMT, Matthias Baesken wrote: >> We notice failures/crashes on Alpine Linux, maybe after [JDK-8320886](https://bugs.openjdk.org/browse/JDK-8320886). >> test runtime/Unsafe/InternalErrorTest.java crashes on Alpine (works fine on other test OS/CPU platforms) : >> >> >> # >> # SIGSEGV (0xb) at pc=0x00007fd3c080064f, pid=7075, tid=7161 >> # >> # JRE version: OpenJDK Runtime Environment (23.0) (build 23-internal-adhoc.jenkinsi.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (23-internal-adhoc.jenkinsi.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) >> # Problematic frame: >> # C [ld-musl-x86_64.so.1+0x5464f] memset+0xa7 >> # >> >> >> Looks like the Alpine memset triggers unexpected SIGSEGV (not the expected SIGBUS). So we switch to a loop instead of memset. 
However I noticed that on Linux aarch64 the test starts to fail when the loop is used instead of the memset, so I keep the old coding on this platform . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Do loop only on Alpine/MUSL I think this is an acceptable solution. ------------- Marked as reviewed by mdoerr (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17175#pullrequestreview-1794210054 From dchuyko at openjdk.org Fri Dec 22 09:33:08 2023 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Fri, 22 Dec 2023 09:33:08 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v15] In-Reply-To: References: Message-ID: <8HNg975y467K6xKiocEvZQkjHB6GHeBRM-oKXgrmxOo=.2d0233a3-1887-4c14-b158-27d1402ee659@github.com> On Wed, 20 Dec 2023 02:40:40 GMT, Andrei Pangin wrote: >> Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 33 commits: >> >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - ... 
and 23 more: https://git.openjdk.org/jdk/compare/fde5b168...44d680cd > > src/hotspot/share/code/codeCache.cpp line 1413: > >> 1411: ResourceMark rm; >> 1412: // Try the max level and let the directives be applied during the compilation. >> 1413: int complevel = CompLevel::CompLevel_full_optimization; > > Should the highest level depend on the configuration instead of the hard-coded constant? Perhaps, needs to be `highest_compile_level()` Yes, changed to use `highest_compile_level()`. > src/hotspot/share/compiler/compilerDirectives.cpp line 750: > >> 748: if (!dir->is_default_directive() && dir->match(method)) { >> 749: match_found = true; >> 750: break; > > `match_found` is redundant: for better readability, you may just return true. Curly braces around MutexLocker won't be needed either. Thanks, that's indeed simpler. > src/hotspot/share/oops/method.hpp line 820: > >> 818: // Clear the flags related to compiler directives that were set by the compilerBroker, >> 819: // because the directives can be updated. >> 820: void clear_method_flags() { > > The function name is a bit misleading - it clears only flags related to directives. Changed to `clear_directive_flags`. > src/hotspot/share/oops/methodFlags.hpp line 61: > >> 59: status(has_loops_flag_init , 1 << 14) /* The loop flag has been initialized */ \ >> 60: status(on_stack_flag , 1 << 15) /* RedefineClasses support to keep Metadata from being cleaned */ \ >> 61: status(has_matching_directives , 1 << 16) /* The method has matching directives */ \ > > It's worth noting that the flag is temporary and is valid only during DCmd execution. Good point, updated the comment. This btw means that in another places this flag can be reused. 
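The `match_found` suggestion above can be sketched outside HotSpot as follows. This is a minimal illustration, assuming a scoped lock that behaves like `MutexLocker`; `ToyDirective`, `ToyDirectiveStack` and `has_matching_directive` are hypothetical names, and `std::lock_guard` stands in for HotSpot's locker. It is not the code from the PR.

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a compiler directive; not HotSpot's class.
struct ToyDirective {
  bool is_default;
  std::string pattern;
  bool match(const std::string& method) const { return method == pattern; }
};

// Hypothetical directive stack guarded by a mutex.
class ToyDirectiveStack {
  mutable std::mutex _lock;
  std::vector<ToyDirective> _stack;

 public:
  void push(ToyDirective d) {
    std::lock_guard<std::mutex> guard(_lock);
    _stack.push_back(std::move(d));
  }

  // Returns as soon as a non-default directive matches. The guard's
  // destructor releases the lock on every return path, so neither a
  // match_found flag nor an extra block scope around the guard is needed.
  bool has_matching_directive(const std::string& method) const {
    std::lock_guard<std::mutex> guard(_lock);
    for (const ToyDirective& d : _stack) {
      if (!d.is_default && d.match(method)) {
        return true;
      }
    }
    return false;
  }
};
```

The design point is only the control flow: with an RAII lock, an early `return` is as safe as breaking out of the loop and returning a flag afterwards.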
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1434883459 PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1434884163 PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1434884612 PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1434885291 From dchuyko at openjdk.org Fri Dec 22 09:33:06 2023 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Fri, 22 Dec 2023 09:33:06 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v19] In-Reply-To: References: Message-ID: > Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. It is also possible to clear all directives or remove the top from the stack. > > A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. > > It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). 
>
> The natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that, we can try to recompile the method, letting the compile broker perform it while taking the new directives stack into account. Recompilation helps to prevent hot methods from executing in the interpreter.
>
> A new flag `-r` has been introduced for some diagnostic commands related to compiler directives: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and queues methods that have any active non-default matching compiler directives for recompilation if possible; otherwise it marks them for deoptimization. There is currently no distinction as to which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them.
>
> In addition, a new diagnostic command `Compiler.replace_directives` has been added for ...
Dmitry Chuyko has updated the pull request incrementally with one additional commit since the last revision: Deopt osr, cleanups ------------- Changes: - all: https://git.openjdk.org/jdk/pull/14111/files - new: https://git.openjdk.org/jdk/pull/14111/files/b348ebed..d75daf64 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=18 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=14111&range=17-18 Stats: 41 lines in 4 files changed: 15 ins; 6 del; 20 mod Patch: https://git.openjdk.org/jdk/pull/14111.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/14111/head:pull/14111 PR: https://git.openjdk.org/jdk/pull/14111 From dchuyko at openjdk.org Fri Dec 22 09:35:53 2023 From: dchuyko at openjdk.org (Dmitry Chuyko) Date: Fri, 22 Dec 2023 09:35:53 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v15] In-Reply-To: References: Message-ID: On Wed, 20 Dec 2023 02:57:29 GMT, Andrei Pangin wrote: >> Dmitry Chuyko has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 33 commits: >> >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - Merge branch 'openjdk:master' into compiler-directives-force-update >> - ... 
and 23 more: https://git.openjdk.org/jdk/compare/fde5b168...44d680cd > > src/hotspot/share/code/codeCache.cpp line 1409: > >> 1407: while(iter.next()) { >> 1408: CompiledMethod* nm = iter.method(); >> 1409: methodHandle mh(thread, nm->method()); > > If there are two CompiledMethods for the same Java method, will it be scheduled for recompilation twice? Related question: if `nm` is an OSR method, does it make sense to go directly for deoptimization rather than compiling a non-OSR version? If there are multiple method versions, it will be recompiled several times. The alternative is to keep some additional information, which may complicate the code. OSR is a good catch; I changed their handling to deopt. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/14111#discussion_r1434887812 From mli at openjdk.org Fri Dec 22 09:37:43 2023 From: mli at openjdk.org (Hamlin Li) Date: Fri, 22 Dec 2023 09:37:43 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v12] In-Reply-To: <7sdld7t1oTEM_k08EV0expeUu3xKNhaYn9AqISytRSY=.7691d9ae-9edd-42d6-ba43-3f875aa25682@github.com> References: <7sdld7t1oTEM_k08EV0expeUu3xKNhaYn9AqISytRSY=.7691d9ae-9edd-42d6-ba43-3f875aa25682@github.com> Message-ID: On Thu, 21 Dec 2023 23:02:55 GMT, Olga Mikhaltsova wrote:
>> Please review this implementation of the roundD/roundF intrinsics for the RISC-V platform.
>>
>> The table below shows that a NaN argument has to be handled as a special case.
>>
>>                                         RISC-V                   Java
>>                                         (FCVT.W.S)  (FCVT.L.D)   (long round(double a))  (int round(float a))
>> Minimum valid input (after rounding)    -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
>> Maximum valid input (after rounding)    2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for out-of-range negative input  -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for -∞                           -2^31       -2^63        Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for out-of-range positive input  2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for +∞                           2^31 - 1    2^63 - 1     Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for NaN                          2^31 - 1    2^63 - 1     0                       0
>>
>> Running the benchmark with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555  0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760  0.103  ops/ms
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score  Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956  0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947  0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
> Moved the code up + comments

For normal cases, I guess `RUP` of riscv will work; but for some corner cases, we need the trick of `+0.5`, am I right? But all this information is just summarized as `some inputs produce incorrect results`, which is unclear for potential readers and maintainers in the future. So, in the comments, can you add some information about this corner case? A simple example will definitely help here.
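As an illustration of the kind of example requested above, here is a small sketch of two rounding pitfalls. The thread does not spell out which input the PR's comment refers to, so the values are assumptions: `0.49999999999999994` is a well-known problem case for "add one half, then round down", and the NaN case comes from the table in the PR description. `naive_round` and `java_round_model` are hypothetical helper names; the second is a plain C++ model of the documented `Math.round(double)` behavior (ties toward positive infinity, NaN gives 0, saturation at the `long` limits), not the intrinsic itself.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Naive rounding: add one half, then round down.
inline double naive_round(double a) {
  return std::floor(a + 0.5);
}

// Hypothetical reference model of Java's Math.round(double).
inline std::int64_t java_round_model(double a) {
  if (std::isnan(a)) {
    return 0;  // Java returns 0 for NaN; per the table, FCVT.L.D gives 2^63 - 1
  }
  const double f = std::floor(a);
  // Compare the fractional part instead of forming a + 0.5, which can round.
  const double r = (a - f >= 0.5) ? f + 1.0 : f;
  const double two63 = 9223372036854775808.0;  // 2^63, exactly representable
  if (r >= two63) return std::numeric_limits<std::int64_t>::max();
  if (r <= -two63) return std::numeric_limits<std::int64_t>::min();
  return static_cast<std::int64_t>(r);
}

inline void check_examples() {
  // Largest double below 0.5: adding 0.5 is a tie that rounds to exactly 1.0.
  const double below_half = 0.49999999999999994;
  assert(naive_round(below_half) == 1.0);
  assert(java_round_model(below_half) == 0);

  // Ties go toward positive infinity.
  assert(java_round_model(2.5) == 3);
  assert(java_round_model(-2.5) == -2);

  // Special values from the table.
  const double nan = std::numeric_limits<double>::quiet_NaN();
  const double inf = std::numeric_limits<double>::infinity();
  assert(java_round_model(nan) == 0);
  assert(java_round_model(inf) == std::numeric_limits<std::int64_t>::max());
  assert(java_round_model(-inf) == std::numeric_limits<std::int64_t>::min());
}
```

A one-line comment in the stub along the lines of "`0.49999999999999994 + 0.5` rounds to `1.0`" would probably be enough to tell future maintainers why the simple approach is insufficient, assuming that is indeed the case the PR guards against.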
------------- PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1867458169 From sroy at openjdk.org Fri Dec 22 10:14:55 2023 From: sroy at openjdk.org (Suchismith Roy) Date: Fri, 22 Dec 2023 10:14:55 GMT Subject: RFR: JDK-8320005 : Allow loading of shared objects with .a extension on AIX [v7] In-Reply-To: <2sMyJ8mZ6EIULC67tK1IcI4uNnJMvpCzw1BKEDUaIms=.c90f1101-236e-4a80-869c-feca6abd3dc3@github.com> References: <8-buFPL9W3149qcnluk_XqTQr-cJYqu_XvwU5ovyAIA=.396e5005-f896-48b9-919c-94164229d7bf@github.com> <_WrW-iHHdU-IgC7Z1b6oe_Qh0dkC6P3KJAdl7J2S1Do=.712dd065-6207-4632-a82f-8e12ad023cd5@github.com> <2sMyJ8mZ6EIULC67tK1IcI4uNnJMvpCzw1BKEDUaIms=.c90f1101-236e-4a80-869c-feca6abd3dc3@github.com> Message-ID: On Thu, 21 Dec 2023 10:01:04 GMT, Thomas Stuefe wrote: > > > > > What happens if we accidentally attempt to load a "real" static library, which is also named *.a? Would dlopen() then crash? What would happen? > > > > > > > > > > I don't think the problem is with *.a . They would load as the default behaviour of the dlopen. It is only when the dlopen fails for *.so , we give another chance to check for .a file with the same name. > > > > > > > > > No, what I meant, and what must be clarified before going forward with this solution, is the following: > > > > > > * is _every_ `*.a` object on AIX loadable with `dlopen`, and will the result be the same as when loading a `*.so` object > > > * or, if we present arbitrary `*.a` files to dlopen, is there a chance for dlopen to crash or misbehave. > > > > > > Reason is that I was under the impression that *.a libraries are static libraries and cannot be loaded dynamically. This is what you now try to do. > > > If we cannot safely answer this question, I would opt for a more narrow solution by hard-wiring known alternative names. So, do the second *.a attempt only for your `ibm_16_am.a` which you know works. That could also be done in a reasonably maintainable manner. 
> > > > > > In AIX, both static and dynamic libraries have the *.a extension. AIX also supports *.so files. Basically, shared objects in AIX can have either the *.a or the *.so extension. Hence we need to implement this logic. If we try loading a static archive specifically, how dlopen would behave is something @JoKern65 can probably answer? > > Rather, this is a question you have to ask your colleagues at IBM who develop the AIX libc. > > Since AIX libc is not open source, we cannot look for ourselves, nor can Joachim (he works at SAP). > > > > > > Does this really have to be handled in the OpenJDK? What does J9 on AIX do? Could this be done in a simpler way outside OpenJDK, e.g. by providing an *.so variant of the library in question? Where does this library come from? > > > > > > > > > > I am not sure how J9 handles this. I would have to consult . > > > > > > > > > J9 is Open Source, can't you just look? :) > > > > > > I did try comparing the file structures, and I do not see a similar file structure over there. I am unable to find the jvmTiAgent code or the os_aix file. So I am not sure which functions over there provide the same functionality. Do you have any suggestion on how I can check and correlate? > > Someone must implement LoadLibrary. Try looking for places where dlopen() is called. > > > > > However, as per current observation, this issue does not show up on Semeru. This issue is only happening on Adoptium. The team that releases these files has always released *.a files, which work fine for Semeru. > > > > > > > > > I don't know what Semeru is. What is the context, is that a different VM? Also OpenJDK? J9 derived? > > > > > > Semeru is J9 derived. OK, I was not able to find the right file yet. I will collaborate on this further once I am back from vacation, in January.
------------- PR Comment: https://git.openjdk.org/jdk/pull/16604#issuecomment-1867498645 From rehn at openjdk.org Fri Dec 22 11:58:53 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 22 Dec 2023 11:58:53 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v8] In-Reply-To: References: Message-ID: On Fri, 22 Dec 2023 04:39:43 GMT, Fei Yang wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> index store state back > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3812: > >> 3810: if (vset_sew == Assembler::e64 && !multi_block) return "sha512_implCompress"; >> 3811: if (vset_sew == Assembler::e64 && multi_block) return "sha512_implCompressMB"; >> 3812: return "bad name lookup"; > > Maybe place a `ShouldNotReachHere();` immediately before the last return statement? Fixed > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3935: > >> 3933: // x0 is not written, we known the number of vector elements. >> 3934: >> 3935: __ vsetivli(x0, 4, vset_sew, Assembler::m1, Assembler::ma, Assembler::ta); > > Shouldn't we set `LMUL` to `Assembler::m2` when `vset_sew` is `e64`? A single 128-bit wide vector register won't be able to hold 4 `e64` elements. I guess you were testing against 256-bit wide vector register so that this issue won't trigger? This also means most of the code won't work for sha512 with 128-bit RVV as `Assembler::m2` means register group of two. This menifests in the corresponding openssl sha512 implementation [1]. > > (With that obervation, should we have two independent code for each algorithm like the openssl version? That seems more readable and maintainable to me.) > > [1] https://github.com/openssl/openssl/blob/master/crypto/sha/asm/sha512-riscv64-zvkb-zvknhb.pl The sha instruction don't care about lmul, except only in that lmul*vlen == 256. So there is no code difference except just setting m2 in case of 128. These crypto uses 'EGW' element group width which is specified. 
As quad rounds depend on result from previous quad round, you can only do 4xe64(for 256 e32), no more no less. Yes, good catch! I must remember to tests multiple vlens. Thanks! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434995475 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434995108 From mli at openjdk.org Fri Dec 22 11:58:54 2023 From: mli at openjdk.org (Hamlin Li) Date: Fri, 22 Dec 2023 11:58:54 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v8] In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 20:13:15 GMT, Hamlin Li wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> index store state back > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3935: > >> 3933: // x0 is not written, we known the number of vector elements. >> 3934: >> 3935: __ vsetivli(x0, 4, vset_sew, Assembler::m1, Assembler::ma, Assembler::ta); > > Currently, when MaxVectorSize < 16 UseRVV = false, so there are conditions when MaxVectorSize == 16 && UseRVV == true, in this case, `vsetivli` will not work as expected, and neither the following codes. > > And 128 bits is the common one? I see this is also pointed by @RealFYang ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1434848969 From fyang at openjdk.org Fri Dec 22 12:43:52 2023 From: fyang at openjdk.org (Fei Yang) Date: Fri, 22 Dec 2023 12:43:52 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v8] In-Reply-To: References: Message-ID: On Fri, 22 Dec 2023 11:55:04 GMT, Robbin Ehn wrote: >> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3935: >> >>> 3933: // x0 is not written, we known the number of vector elements. >>> 3934: >>> 3935: __ vsetivli(x0, 4, vset_sew, Assembler::m1, Assembler::ma, Assembler::ta); >> >> Shouldn't we set `LMUL` to `Assembler::m2` when `vset_sew` is `e64`? A single 128-bit wide vector register won't be able to hold 4 `e64` elements. 
I guess you were testing against a 256-bit wide vector register, so that this issue won't trigger? This also means most of the code won't work for sha512 with 128-bit RVV, as `Assembler::m2` means a register group of two. This manifests in the corresponding openssl sha512 implementation [1]. >> >> (With that observation, should we have two independent pieces of code, one for each algorithm, like the openssl version? That seems more readable and maintainable to me.) >> >> [1] https://github.com/openssl/openssl/blob/master/crypto/sha/asm/sha512-riscv64-zvkb-zvknhb.pl > > The sha instructions don't care about lmul, except that lmul*vlen == 256. > So there is no code difference except for setting m2 in case of 128. > These crypto instructions use 'EGW', the element group width, which is specified. > As quad rounds depend on the result from the previous quad round, you can only do 4xe64(for 256 e32), no more, no less. > > Yes, good catch! I must remember to test multiple vlens. Thanks! I suppose we would need some more if conditions to distinguish the register usage for the sha512 case in places like indexed load/store, message block load, etc. Because we will work with vector register pairs as indicated by `m2` for this case. I am not sure if that's a good way. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1435022104 From mli at openjdk.org Fri Dec 22 12:50:51 2023 From: mli at openjdk.org (Hamlin Li) Date: Fri, 22 Dec 2023 12:50:51 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v8] In-Reply-To: References: Message-ID: On Fri, 22 Dec 2023 12:38:22 GMT, Fei Yang wrote: >> The sha instructions don't care about lmul, except that lmul*vlen == 256. >> So there is no code difference except for setting m2 in case of 128. >> These crypto instructions use 'EGW', the element group width, which is specified. >> As quad rounds depend on the result from the previous quad round, you can only do 4xe64(for 256 e32), no more, no less. >> >> Yes, good catch! I must remember to test multiple vlens. Thanks!
> > I suppose we would need some more if conditions to distinguish the register usage for the sha512 case in places like indexed load/store, message block load, etc. Because we will work with vector register pairs as indicated by `m2` for this case. I am not sure if that's a good way. Is it possible to use just one piece of code that works for both sha-256 and sha-512? I mean without conditions everywhere, as the difference between 256 and 512 is just some constants and the number of rounds. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1435028334 From clanger at openjdk.org Fri Dec 22 13:10:47 2023 From: clanger at openjdk.org (Christoph Langer) Date: Fri, 22 Dec 2023 13:10:47 GMT Subject: RFR: JDK-8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886 [v2] In-Reply-To: References: <-F8PPWW4OFqhFo3SDcATXD31yertxXeAhzJxoM-DlOg=.5d084da4-3e35-49ae-ac57-52074681468c@github.com> Message-ID: <0227botSu0oLZmCOR1jrGEOUj4bd2WAn1DpeLVRTheA=.ca28a75c-00a1-46a3-b0ac-2d2457898e0d@github.com> On Fri, 22 Dec 2023 08:12:56 GMT, Matthias Baesken wrote: >> We notice failures/crashes on Alpine Linux, maybe after [JDK-8320886](https://bugs.openjdk.org/browse/JDK-8320886). >> test runtime/Unsafe/InternalErrorTest.java crashes on Alpine (works fine on other test OS/CPU platforms) : >> >> >> # >> # SIGSEGV (0xb) at pc=0x00007fd3c080064f, pid=7075, tid=7161 >> # >> # JRE version: OpenJDK Runtime Environment (23.0) (build 23-internal-adhoc.jenkinsi.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (23-internal-adhoc.jenkinsi.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) >> # Problematic frame: >> # C [ld-musl-x86_64.so.1+0x5464f] memset+0xa7 >> # >> >> >> Looks like the Alpine memset triggers unexpected SIGSEGV (not the expected SIGBUS). So we switch to a loop instead of memset.
However I noticed that on Linux aarch64 the test starts to fail when the loop is used instead of the memset, so I keep the old coding on this platform . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Do loop only on Alpine/MUSL Sounds like a reasonable fix. ------------- Marked as reviewed by clanger (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/17175#pullrequestreview-1794458572 From stefank at openjdk.org Fri Dec 22 13:15:51 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 22 Dec 2023 13:15:51 GMT Subject: RFR: JDK-8322475: Extend printing for System.map In-Reply-To: References: Message-ID: On Tue, 19 Dec 2023 15:48:58 GMT, Thomas Stuefe wrote: > This is an expansion on the new `System.map` command introduced with JDK-8318636. > > We now print valuable information per memory region, such as: > > - the actual resident set size > - the actual number of huge pages > - the actual used page size > - the THP state of the region (was advised, is eligible, uses THP, ...) > - whether the region is shared > - whether the region had been committed (backed by swap) > - whether the region has been swapped out. 
> > Example output: > > > from to size rss hugetlb pgsz prot notes vm info/file > 0x00000000c0000000 - 0x00000000ffe00000 1071644672 0 4194304 2M rw-p huge JAVAHEAP /anon_hugepage > 0x00000000ffe00000 - 0x0000000100000000 2097152 0 0 2M rw-p huge JAVAHEAP /anon_hugepage > 0x0000558016b67000 - 0x0000558016b68000 4096 4096 0 4K r--p /shared/projects/openjdk/jdk-jdk/output-fastdebug/images/jdk/bin/java > 0x0000558016b68000 - 0x0000558016b69000 4096 4096 0 4K r-xp /shared/projects/openjdk/jdk-jdk/output-fastdebug/images/jdk/bin/java > 0x00007f3a749f2000 - 0x00007f3a74c62000 2555904 2555904 0 4K rwxp CODE(CodeHeap 'profiled nmethods') > 0x00007f3a74c62000 - 0x00007f3a7be51000 119468032 0 0 4K ---p nores CODE(CodeHeap 'profiled nmethods') > 0x00007f3a7be51000 - 0x00007f3a7c1c1000 3604480 3604480 0 4K rwxp CODE(CodeHeap 'profiled nmethods') > 0x00007f3a7c1c1000 - 0x00007f3a7c592000 4001792 0 0 4K ---p nores CODE(CodeHeap 'non-nmethods') > 0x00007f3a7c592000 - 0x00007f3a7c802000 2555904 2555904 0 4K rwxp CODE(CodeHeap 'non-profiled nmethods') > 0x00007f3a7c802000 - 0x00007f3a839f200... Just some initial comments after reading through the patch. src/hotspot/os/linux/memMapPrinter_linux.cpp line 31: > 29: #include "nmt/memMapPrinter.hpp" > 30: #include "utilities/align.hpp" > 31: #include "utilities/powerOfTwo.hpp" This include section is not sorted correctly. It would also be good to add a blank line between the hotspot includes and the system includes. src/hotspot/os/linux/memMapPrinter_linux.cpp line 206: > 204: _swapped_out(0), _hugetlb(0), _thp(0) {} > 205: > 206: void add_mapping(ProcMapsInfo& info) { `info` could be const ref. src/hotspot/os/linux/memMapPrinter_linux.cpp line 207: > 205: > 206: void add_mapping(ProcMapsInfo& info) { > 207: _num_mappings ++; We tend to not put a space before `++` in HotSpot code. 
src/hotspot/os/linux/memMapPrinter_linux.cpp line 250: > 248: st->print_cr(" thpel: mapping is eligible for THP"); > 249: st->print_cr(" thpadv: mapping is THP-madvised"); > 250: st->print_cr(" nothp: mapping will not THP"); "mapping will not THP" sounds a bit weird. src/hotspot/os/linux/memMapPrinter_linux.cpp line 276: > 274: } > 275: > 276: FILE* f = os::fopen("/proc/self/smaps", "r"); We have seen that reading the smaps file can be extremely bad for the latency of the process. We've seen multi-seconds hangs because of external tools reading smaps. If we add a tool like this it would be good to add a big warning somewhere. src/hotspot/share/nmt/memMapPrinter.cpp line 30: > 28: #ifdef LINUX > 29: > 30: #include "code/codeCache.hpp" The include lines below are weirdly sorted. Could you fix it while you are changing this file? ------------- PR Review: https://git.openjdk.org/jdk/pull/17158#pullrequestreview-1794452228 PR Review Comment: https://git.openjdk.org/jdk/pull/17158#discussion_r1435045106 PR Review Comment: https://git.openjdk.org/jdk/pull/17158#discussion_r1435037709 PR Review Comment: https://git.openjdk.org/jdk/pull/17158#discussion_r1435037943 PR Review Comment: https://git.openjdk.org/jdk/pull/17158#discussion_r1435040208 PR Review Comment: https://git.openjdk.org/jdk/pull/17158#discussion_r1435041926 PR Review Comment: https://git.openjdk.org/jdk/pull/17158#discussion_r1435042952 From stefank at openjdk.org Fri Dec 22 13:24:37 2023 From: stefank at openjdk.org (Stefan Karlsson) Date: Fri, 22 Dec 2023 13:24:37 GMT Subject: RFR: 8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap In-Reply-To: <-cpYKTYXVhVfplPdXUKYt0ZcT-ql497vy0UZRrGREtk=.8fd5f503-ae50-4e8e-a5d8-65824465e5fd@github.com> References: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> <-cpYKTYXVhVfplPdXUKYt0ZcT-ql497vy0UZRrGREtk=.8fd5f503-ae50-4e8e-a5d8-65824465e5fd@github.com> Message-ID: On Wed, 20 Dec 2023 
21:11:09 GMT, Kim Barrett wrote: > I'm not a fan of the additional clutter in APIs that the static memory types add. If we had a variant of GrowableArrayCHeap that was not itself dynamically allocatable and took a memory type to use internally as a constructor argument, then I think a lot of that clutter could be eliminated. It could be used for ordinary data members that are direct GAs rather than pointers to GAs. I think there is a way to do something similar for static data members that are pointers that are dynamically allocated later, though that probably requires more work. FWIW, I added the GrowableArrayCHeap and the static memory type. I did that because there was a perceived need to minimize the memory usage, because we were going to use an extreme amount of these arrays for one of our subsystems in ZGC. It later turned out that we really didn't need to squeeze out the last bit of memory for that use-case. I would really like to get rid of the the static memory type from GrowableArrayCHeap, and just add it as an instance member. ------------- PR Comment: https://git.openjdk.org/jdk/pull/17160#issuecomment-1867683570 From rehn at openjdk.org Fri Dec 22 13:31:53 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 22 Dec 2023 13:31:53 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v8] In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 20:20:38 GMT, Hamlin Li wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> index store state back > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3792: > >> 3790: Register scalarconst, VectorRegister vtemp, VectorRegister vtemp2, VectorRegister vtemp3, VectorRegister vtemp4, >> 3791: bool gen_words = true, bool step_const = true) { >> 3792: __ vl1reXX_v(vset_sew, vtemp, scalarconst); > > Seems we only incr `scalarconst ` conditionally, so load into `vtemp ` here could also be conditionally? 
We need to load the constants, but this is the last round so we don't need to step constant forward. Hence increment for next round. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1435057502 From mbaesken at openjdk.org Fri Dec 22 13:32:54 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 22 Dec 2023 13:32:54 GMT Subject: RFR: JDK-8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886 [v2] In-Reply-To: References: <-F8PPWW4OFqhFo3SDcATXD31yertxXeAhzJxoM-DlOg=.5d084da4-3e35-49ae-ac57-52074681468c@github.com> Message-ID: <1zfV58IbvvBnXKT3AUKBs1ifFvO2QVf6zvygzBeKWrk=.27090321-5121-4dd5-a45c-7dd7fe825e04@github.com> On Fri, 22 Dec 2023 08:12:56 GMT, Matthias Baesken wrote: >> We notice failures/crashes on Alpine Linux, maybe after [JDK-8320886](https://bugs.openjdk.org/browse/JDK-8320886). >> test runtime/Unsafe/InternalErrorTest.java crashes on Alpine (works fine on other test OS/CPU platforms) : >> >> >> # >> # SIGSEGV (0xb) at pc=0x00007fd3c080064f, pid=7075, tid=7161 >> # >> # JRE version: OpenJDK Runtime Environment (23.0) (build 23-internal-adhoc.jenkinsi.jdk) >> # Java VM: OpenJDK 64-Bit Server VM (23-internal-adhoc.jenkinsi.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) >> # Problematic frame: >> # C [ld-musl-x86_64.so.1+0x5464f] memset+0xa7 >> # >> >> >> Looks like the Alpine memset triggers unexpected SIGSEGV (not the expected SIGBUS). So we switch to a loop instead of memset. However I noticed that on Linux aarch64 the test starts to fail when the loop is used instead of the memset, so I keep the old coding on this platform . > > Matthias Baesken has updated the pull request incrementally with one additional commit since the last revision: > > Do loop only on Alpine/MUSL Hi Martin and Christoph, thanks for the reviews ! 
------------- PR Comment: https://git.openjdk.org/jdk/pull/17175#issuecomment-1867690832 From mbaesken at openjdk.org Fri Dec 22 13:32:55 2023 From: mbaesken at openjdk.org (Matthias Baesken) Date: Fri, 22 Dec 2023 13:32:55 GMT Subject: Integrated: JDK-8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886 In-Reply-To: <-F8PPWW4OFqhFo3SDcATXD31yertxXeAhzJxoM-DlOg=.5d084da4-3e35-49ae-ac57-52074681468c@github.com> References: <-F8PPWW4OFqhFo3SDcATXD31yertxXeAhzJxoM-DlOg=.5d084da4-3e35-49ae-ac57-52074681468c@github.com> Message-ID: <3vSL3Egpv0pWxKtCU_9zumsOw7OrPWcHRIrs3vCa4Hs=.5b6e5fd7-42d7-42cb-828e-65610f9e10ef@github.com> On Thu, 21 Dec 2023 09:12:33 GMT, Matthias Baesken wrote: > We notice failures/crashes on Alpine Linux, maybe after [JDK-8320886](https://bugs.openjdk.org/browse/JDK-8320886). > test runtime/Unsafe/InternalErrorTest.java crashes on Alpine (works fine on other test OS/CPU platforms) : > > > # > # SIGSEGV (0xb) at pc=0x00007fd3c080064f, pid=7075, tid=7161 > # > # JRE version: OpenJDK Runtime Environment (23.0) (build 23-internal-adhoc.jenkinsi.jdk) > # Java VM: OpenJDK 64-Bit Server VM (23-internal-adhoc.jenkinsi.jdk, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64) > # Problematic frame: > # C [ld-musl-x86_64.so.1+0x5464f] memset+0xa7 > # > > > Looks like the Alpine memset triggers unexpected SIGSEGV (not the expected SIGBUS). So we switch to a loop instead of memset. However I noticed that on Linux aarch64 the test starts to fail when the loop is used instead of the memset, so I keep the old coding on this platform . This pull request has now been integrated. 
Changeset: 12308533 Author: Matthias Baesken URL: https://git.openjdk.org/jdk/commit/1230853343c38787c90820d19d0626f0c37540dc Stats: 10 lines in 1 file changed: 10 ins; 0 del; 0 mod 8322163: runtime/Unsafe/InternalErrorTest.java fails on Alpine after JDK-8320886 Reviewed-by: mdoerr, clanger ------------- PR: https://git.openjdk.org/jdk/pull/17175 From epeter at openjdk.org Fri Dec 22 13:33:39 2023 From: epeter at openjdk.org (Emanuel Peter) Date: Fri, 22 Dec 2023 13:33:39 GMT Subject: RFR: 8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap In-Reply-To: References: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> <-cpYKTYXVhVfplPdXUKYt0ZcT-ql497vy0UZRrGREtk=.8fd5f503-ae50-4e8e-a5d8-65824465e5fd@github.com> Message-ID: On Fri, 22 Dec 2023 13:22:19 GMT, Stefan Karlsson wrote: >> pre-existing: There are a lot of non-static class data members that are pointers to >> GrowableArray that seem like they would be better as direct, e.g. non-pointers. >> >> pre-existing: There are a lot of iterations over GrowableArray's that would be >> simplified by using range-based-for. >> >> I'm not a fan of the additional clutter in APIs that the static memory types add. >> If we had a variant of GrowableArrayCHeap that was not itself dynamically allocatable >> and took a memory type to use internally as a constructor argument, then I think a >> lot of that clutter could be eliminated. It could be used for ordinary data members >> that are direct GAs rather than pointers to GAs. I think there is a way to do something >> similar for static data members that are pointers that are dynamically allocated later, >> though that probably requires more work. >> >> I've not yet reviewed the changes to growableArray.[ch]pp yet, nor the test changes. >> But I've run out of time and energy for this for today. > >> I'm not a fan of the additional clutter in APIs that the static memory types add. 
If we had a variant of GrowableArrayCHeap that was not itself dynamically allocatable and took a memory type to use internally as a constructor argument, then I think a lot of that clutter could be eliminated. It could be used for ordinary data members that are direct GAs rather than pointers to GAs. I think there is a way to do something similar for static data members that are pointers that are dynamically allocated later, though that probably requires more work.
>
> FWIW, I added the GrowableArrayCHeap and the static memory type. I did that because there was a perceived need to minimize the memory usage, because we were going to use an extreme amount of these arrays for one of our subsystems in ZGC. It later turned out that we really didn't need to squeeze out the last bit of memory for that use-case. I would really like to get rid of the static memory type from GrowableArrayCHeap, and just add it as an instance member.

@stefank Maybe it would then make sense to do that before this change here? Because we will really have to touch all these changes here again for that.

@kimbarrett @stefank @jdksjolen
Or alternatively, we just say that we want to keep the C-Heap functionality inside `GrowableArray`, and simply remove `GrowableArrayCHeap`? Though I honestly would prefer a world where every allocation strategy has its own `GrowableArray*` variant, so that it is clear statically that they cannot be assigned to each other. So my preference would even be to have these:

CHeapGrowableArray
ArenaGrowableArray
ResourceAreaGrowableArray

What do you think?

And how important is it that we do not use `virtual` with `GrowableArray`? Because the implementation and testing is much nastier without it. Is the V-table still such an overhead that we need to avoid it?
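For illustration only, here is a rough sketch of what "one type per allocation strategy, without `virtual`" could look like. It is not code from the PR or from HotSpot: `GrowableArrayBase`, `DemoArena`, and the bodies of `CHeapGrowableArray` and `ArenaGrowableArray` are assumptions made up for this example, and it only handles trivially copyable element types. The shared logic reaches the storage functions through CRTP, so the choice of allocator is made at compile time, and the two array types are unrelated as far as the type system is concerned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>

// Hypothetical stand-in for an arena: a fixed buffer with bump allocation.
struct DemoArena {
  alignas(std::max_align_t) char buf[1024];
  std::size_t used = 0;
  void* alloc(std::size_t bytes) {
    const std::size_t a = alignof(std::max_align_t);
    std::size_t start = (used + a - 1) / a * a;  // round up to the alignment
    assert(start + bytes <= sizeof(buf));
    used = start + bytes;
    return buf + start;
  }
};

// Shared logic. Storage calls are resolved on Derived at compile time (CRTP),
// so there is no vtable and no per-object memory-type field.
template <typename E, typename Derived>
class GrowableArrayBase {
  static_assert(std::is_trivially_copyable<E>::value,
                "this sketch only handles trivially copyable elements");
  E*  _data = nullptr;
  int _len = 0;
  int _capacity = 0;

  Derived* self() { return static_cast<Derived*>(this); }

  void grow() {
    int new_capacity = (_capacity == 0) ? 2 : _capacity * 2;
    E* new_data = self()->allocate(new_capacity);
    for (int i = 0; i < _len; i++) new_data[i] = _data[i];
    self()->deallocate(_data);
    _data = new_data;
    _capacity = new_capacity;
  }

 protected:
  GrowableArrayBase() = default;
  ~GrowableArrayBase() = default;
  void release() { self()->deallocate(_data); _data = nullptr; _len = _capacity = 0; }

 public:
  GrowableArrayBase(const GrowableArrayBase&) = delete;
  GrowableArrayBase& operator=(const GrowableArrayBase&) = delete;
  int length() const { return _len; }
  const E& at(int i) const { assert(0 <= i && i < _len); return _data[i]; }
  void append(const E& e) { if (_len == _capacity) grow(); _data[_len++] = e; }
};

template <typename E>
class CHeapGrowableArray : public GrowableArrayBase<E, CHeapGrowableArray<E>> {
  friend class GrowableArrayBase<E, CHeapGrowableArray<E>>;
  E* allocate(int n) {
    E* p = static_cast<E*>(std::malloc(sizeof(E) * static_cast<std::size_t>(n)));
    assert(p != nullptr);
    return p;
  }
  void deallocate(E* p) { std::free(p); }
 public:
  CHeapGrowableArray() = default;
  ~CHeapGrowableArray() { this->release(); }
};

template <typename E>
class ArenaGrowableArray : public GrowableArrayBase<E, ArenaGrowableArray<E>> {
  friend class GrowableArrayBase<E, ArenaGrowableArray<E>>;
  DemoArena* _arena;
  E* allocate(int n) {
    return static_cast<E*>(_arena->alloc(sizeof(E) * static_cast<std::size_t>(n)));
  }
  void deallocate(E*) {}  // arena memory goes away together with the arena
 public:
  explicit ArenaGrowableArray(DemoArena* arena) : _arena(arena) {}
};

// The two variants cannot be mixed up: neither pointers nor objects convert.
static_assert(!std::is_convertible<ArenaGrowableArray<int>*, CHeapGrowableArray<int>*>::value,
              "pointers to different variants are unrelated types");
static_assert(!std::is_assignable<CHeapGrowableArray<int>&, const ArenaGrowableArray<int>&>::value,
              "an arena array cannot be assigned to a C-heap array");
```

A `ResourceAreaGrowableArray` would presumably follow the same pattern with its own `allocate`/`deallocate` pair. The cost of this approach is that code which must accept any variant has to become a template itself, which is part of the trade-off being discussed in this thread.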
------------- PR Comment: https://git.openjdk.org/jdk/pull/17160#issuecomment-1867691744

From rehn at openjdk.org Fri Dec 22 13:39:56 2023
From: rehn at openjdk.org (Robbin Ehn)
Date: Fri, 22 Dec 2023 13:39:56 GMT
Subject: RFR: 8319716: RISC-V: Add SHA-2 [v8]
In-Reply-To: References: Message-ID:

On Fri, 22 Dec 2023 08:44:20 GMT, Hamlin Li wrote:

>> src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3935:
>>
>>> 3933: // x0 is not written, we known the number of vector elements.
>>> 3934:
>>> 3935: __ vsetivli(x0, 4, vset_sew, Assembler::m1, Assembler::ma, Assembler::ta);
>>
>> Currently, when MaxVectorSize < 16 UseRVV = false, so there are conditions when MaxVectorSize == 16 && UseRVV == true, in this case, `vsetivli` will not work as expected, and neither will the following code.
>>
>> And 128 bits is the common one?
>
> I see this is also pointed out by @RealFYang

SHA uses fixed-width instructions, so we don't actually care about lmul in that sense. We just need to set it correctly. Fixed.

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1435061598

From rehn at openjdk.org Fri Dec 22 13:39:57 2023
From: rehn at openjdk.org (Robbin Ehn)
Date: Fri, 22 Dec 2023 13:39:57 GMT
Subject: RFR: 8319716: RISC-V: Add SHA-2 [v8]
In-Reply-To: References: Message-ID:

On Fri, 22 Dec 2023 12:48:07 GMT, Hamlin Li wrote:

>> I suppose we would need some more if conditions to distinguish the register usage for the sha512 case in places like indexed load/store, message block load, etc. Because we will work with vector register pairs as indicated by `m2` for this case. I am not sure if that's a good way.
>
> Is it possible to just use one piece of code to work for both sha-256/sha-512? I mean without conditions everywhere. As the difference between 256/512 is just some constants and the number of rounds.

Yes, we can't use v10 and v11, as there are only two alternatives. With vlen=128 we need to use two registers, and with vlen>=256 we need one. Simple issue, fixed.
(not passed all testing yet, but :) ) ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1435062558 From rehn at openjdk.org Fri Dec 22 13:39:53 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 22 Dec 2023 13:39:53 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v8] In-Reply-To: References: Message-ID: On Thu, 21 Dec 2023 20:13:04 GMT, Hamlin Li wrote: >> Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: >> >> index store state back > > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3790: > >> 3788: // the cost of moving those vectors at the end of each quad-rounds. >> 3789: void sha2_quad_round(Assembler::SEW vset_sew, VectorRegister rot1, VectorRegister rot2, VectorRegister rot3, VectorRegister rot4, >> 3790: Register scalarconst, VectorRegister vtemp, VectorRegister vtemp2, VectorRegister vtemp3, VectorRegister vtemp4, > > maybe rename `vtemp3` -> `v_abef`, `vtemp4` -> `v_cdgh` Good, thanks, fixed. > src/hotspot/cpu/riscv/stubGenerator_riscv.cpp line 3803: > >> 3801: } >> 3802: if (gen_words) { >> 3803: __ vsha2ms_vv(rot1, vtemp2, rot4); > > when `gen_words == false` && `step_const == true`, is it necessary to call `vmerge_vvm(vtemp2, rot3, rot2);` above? Great find, thanks, fixed. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1435060694 PR Review Comment: https://git.openjdk.org/jdk/pull/16562#discussion_r1435060878 From rehn at openjdk.org Fri Dec 22 13:54:05 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 22 Dec 2023 13:54:05 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v9] In-Reply-To: References: Message-ID: > Hi, please consider. > > Main author is @luhenry, I only fixed some minor things and tested it. > > Such as: > test/hotspot/jtreg/compiler/intrinsics/sha/ > test/jdk/java/security/MessageDigest/ > test/jdk/jdk/security/ > tier1 > > And still running some test. 
Robbin Ehn has updated the pull request incrementally with three additional commits since the last revision: - remove merge, renames - Easier reg layout and 128/m2 - Minor update ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16562/files - new: https://git.openjdk.org/jdk/pull/16562/files/f4c511c7..cc4f2a83 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=08 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=07-08 Stats: 116 lines in 1 file changed: 16 ins; 8 del; 92 mod Patch: https://git.openjdk.org/jdk/pull/16562.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16562/head:pull/16562 PR: https://git.openjdk.org/jdk/pull/16562 From rehn at openjdk.org Fri Dec 22 14:10:13 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 22 Dec 2023 14:10:13 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v10] In-Reply-To: References: Message-ID: <-iUbCXbfByEaraLzkSAK34_EewpImIUYRPLztruHOv0=.a6b5cab8-70ab-4bc5-8271-9e64232acfcf@github.com> > Hi, please consider. > > Main author is @luhenry, I only fixed some minor things and tested it. > > Such as: > test/hotspot/jtreg/compiler/intrinsics/sha/ > test/jdk/java/security/MessageDigest/ > test/jdk/jdk/security/ > tier1 > > And still running some test. 
Robbin Ehn has updated the pull request incrementally with one additional commit since the last revision: fixed lmul ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16562/files - new: https://git.openjdk.org/jdk/pull/16562/files/cc4f2a83..eefcd269 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16562&range=08-09 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.org/jdk/pull/16562.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16562/head:pull/16562 PR: https://git.openjdk.org/jdk/pull/16562 From rehn at openjdk.org Fri Dec 22 14:10:13 2023 From: rehn at openjdk.org (Robbin Ehn) Date: Fri, 22 Dec 2023 14:10:13 GMT Subject: RFR: 8319716: RISC-V: Add SHA-2 [v9] In-Reply-To: References: Message-ID: On Fri, 22 Dec 2023 13:54:05 GMT, Robbin Ehn wrote: >> Hi, please consider. >> >> Main author is @luhenry, I only fixed some minor things and tested it. >> >> Such as: >> test/hotspot/jtreg/compiler/intrinsics/sha/ >> test/jdk/java/security/MessageDigest/ >> test/jdk/jdk/security/ >> tier1 >> >> And still running some test. 
> > Robbin Ehn has updated the pull request incrementally with three additional commits since the last revision: > > - remove merge, renames > - Easier reg layout and 128/m2 > - Minor update Passes compiler/intrinsics/sha/ with: QEMU_CPU=rv64,v=true,vext_spec=v1.0,vlen=128,elen=64,rvv_ma_all_1s=true,rvv_ta_all_1s=true,zvknhb=true QEMU_CPU=rv64,v=true,vext_spec=v1.0,vlen=256,elen=64,rvv_ma_all_1s=true,rvv_ta_all_1s=true,zvknhb=true ------------- PR Comment: https://git.openjdk.org/jdk/pull/16562#issuecomment-1867727508 From stuefe at openjdk.org Fri Dec 22 14:47:47 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 22 Dec 2023 14:47:47 GMT Subject: RFR: JDK-8322475: Extend printing for System.map In-Reply-To: References: Message-ID: On Fri, 22 Dec 2023 13:08:40 GMT, Stefan Karlsson wrote: >> This is an expansion on the new `System.map` command introduced with JDK-8318636. >> >> We now print valuable information per memory region, such as: >> >> - the actual resident set size >> - the actual number of huge pages >> - the actual used page size >> - the THP state of the region (was advised, is eligible, uses THP, ...) >> - whether the region is shared >> - whether the region had been committed (backed by swap) >> - whether the region has been swapped out. 
>> >> Example output: >> >> >> from to size rss hugetlb pgsz prot notes vm info/file >> 0x00000000c0000000 - 0x00000000ffe00000 1071644672 0 4194304 2M rw-p huge JAVAHEAP /anon_hugepage >> 0x00000000ffe00000 - 0x0000000100000000 2097152 0 0 2M rw-p huge JAVAHEAP /anon_hugepage >> 0x0000558016b67000 - 0x0000558016b68000 4096 4096 0 4K r--p /shared/projects/openjdk/jdk-jdk/output-fastdebug/images/jdk/bin/java >> 0x0000558016b68000 - 0x0000558016b69000 4096 4096 0 4K r-xp /shared/projects/openjdk/jdk-jdk/output-fastdebug/images/jdk/bin/java >> 0x00007f3a749f2000 - 0x00007f3a74c62000 2555904 2555904 0 4K rwxp CODE(CodeHeap 'profiled nmethods') >> 0x00007f3a74c62000 - 0x00007f3a7be51000 119468032 0 0 4K ---p nores CODE(CodeHeap 'profiled nmethods') >> 0x00007f3a7be51000 - 0x00007f3a7c1c1000 3604480 3604480 0 4K rwxp CODE(CodeHeap 'profiled nmethods') >> 0x00007f3a7c1c1000 - 0x00007f3a7c592000 4001792 0 0 4K ---p nores CODE(CodeHeap 'non-nmethods') >> 0x00007f3a7c592000 - 0x00007f3a7c802000 2555904 2555904 0 4K rwxp CODE(CodeHeap 'non-profiled nmethods') ... > > src/hotspot/os/linux/memMapPrinter_linux.cpp line 276: > >> 274: } >> 275: >> 276: FILE* f = os::fopen("/proc/self/smaps", "r"); > > We have seen that reading the smaps file can be extremely bad for the latency of the process. We've seen multi-seconds hangs because of external tools reading smaps. If we add a tool like this it would be good to add a big warning somewhere. That is a good point, and I was afraid of that myself. I think I will revise this coding and give the command a "detail" mode, and read smaps (with a warning in the "cost" jcmd category) only in details mode. In normal mode, "maps" is sufficient. And in summary mode I can get probably away with reading stat. 
------------- PR Review Comment: https://git.openjdk.org/jdk/pull/17158#discussion_r1435128384 From jkern at openjdk.org Fri Dec 22 14:50:16 2023 From: jkern at openjdk.org (Joachim Kern) Date: Fri, 22 Dec 2023 14:50:16 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v10] In-Reply-To: References: Message-ID: > On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. > > This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). > > We propose a different, cleaner way of handling this: > > - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. > - Cache dl handles; repeated opening of a library should return the cached handle. > - Increase handle-local ref counter on open, Decrease it on close > - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). > > This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. 
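As a rough illustration of the caching scheme listed above (cache handles, bump a per-handle reference count on open, drop it on close), here is a minimal sketch. It is not the code from the PR: `HandleCache`, `HandleEntry`, and the `fake_open`/`fake_close` stand-ins are hypothetical names invented for this example, the loader is injected as a function pointer so the sketch runs without a real `dlopen`, and there is no locking, which a real implementation inside `os::dll_load`/`os::dll_unload` would presumably need.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

typedef void* (*open_fn)(const char* path);
typedef int   (*close_fn)(void* handle);

struct HandleEntry {
  char*    path;      // key: the path the caller asked for
  void*    handle;    // handle returned by the underlying loader
  unsigned refcount;  // number of outstanding load() calls
};

class HandleCache {
  HandleEntry* _tab = nullptr;
  unsigned _used = 0;
  unsigned _max = 0;
  open_fn  _open;
  close_fn _close;

  static char* copy_string(const char* s) {
    std::size_t n = std::strlen(s) + 1;
    char* p = static_cast<char*>(std::malloc(n));
    if (p != nullptr) std::memcpy(p, s, n);
    return p;
  }

  // Grow with realloc only; realloc(nullptr, n) behaves like malloc(n).
  bool ensure_capacity() {
    if (_used < _max) return true;
    unsigned new_max = (_max == 0) ? 4 : _max * 2;
    HandleEntry* new_tab =
        static_cast<HandleEntry*>(std::realloc(_tab, sizeof(HandleEntry) * new_max));
    if (new_tab == nullptr) return false;  // the old table is still valid
    _tab = new_tab;
    _max = new_max;
    return true;
  }

 public:
  HandleCache(open_fn o, close_fn c) : _open(o), _close(c) {}
  HandleCache(const HandleCache&) = delete;
  HandleCache& operator=(const HandleCache&) = delete;
  ~HandleCache() {  // frees bookkeeping only; does not close open handles
    for (unsigned i = 0; i < _used; i++) std::free(_tab[i].path);
    std::free(_tab);
  }

  // Repeated loads of the same path return the same handle.
  void* load(const char* path) {
    for (unsigned i = 0; i < _used; i++) {
      if (std::strcmp(_tab[i].path, path) == 0) {
        _tab[i].refcount++;
        return _tab[i].handle;
      }
    }
    if (!ensure_capacity()) return nullptr;
    char* key = copy_string(path);
    if (key == nullptr) return nullptr;
    void* h = _open(path);
    if (h == nullptr) { std::free(key); return nullptr; }
    _tab[_used].path = key;
    _tab[_used].handle = h;
    _tab[_used].refcount = 1;
    _used++;
    return h;
  }

  // Closes the underlying handle only when the last reference goes away.
  bool unload(void* handle) {
    for (unsigned i = 0; i < _used; i++) {
      if (_tab[i].handle == handle) {
        assert(_tab[i].refcount > 0);
        if (--_tab[i].refcount > 0) return true;
        int rc = _close(handle);
        std::free(_tab[i].path);
        _tab[i] = _tab[_used - 1];  // fill the hole with the last entry
        _used--;
        return rc == 0;
      }
    }
    return false;  // unknown handle
  }
};

// Stand-in loader for the demo: every call hands out a new, distinct handle,
// which is the behaviour described for dlopen on AIX.
static char g_fake_slots[16];
static int  g_fake_opens = 0;
static int  g_fake_closes = 0;
static void* fake_open(const char*) {
  return g_fake_opens < 16 ? &g_fake_slots[g_fake_opens++] : nullptr;
}
static int fake_close(void*) { g_fake_closes++; return 0; }
```

With the stand-in loader, two `load("libfoo.so")` calls yield the same pointer and only one underlying open, and the underlying close happens on the second `unload`. Keying the cache on the path string is a simplification; the thread above points out that the name passed in is not always the file that ends up being opened (the *.so/*.a duality), so a real cache would have to key on whatever identifies the loaded module.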
Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: additional fix of sideeffect reported in JDK-8322691 ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16920/files - new: https://git.openjdk.org/jdk/pull/16920/files/359080d3..dc2ea51b Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=09 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=08-09 Stats: 44 lines in 1 file changed: 22 ins; 0 del; 22 mod Patch: https://git.openjdk.org/jdk/pull/16920.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16920/head:pull/16920 PR: https://git.openjdk.org/jdk/pull/16920 From jkern at openjdk.org Fri Dec 22 15:57:05 2023 From: jkern at openjdk.org (Joachim Kern) Date: Fri, 22 Dec 2023 15:57:05 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v11] In-Reply-To: References: Message-ID: <5LVpPG0ADhvALrcHghtZ95N1-SYcgrnrHl704svJStY=.e498d17a-3534-4ad9-8a74-004b40f7487a@github.com> > On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. > > This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). 
> > We propose a different, cleaner way of handling this: > > - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. > - Cache dl handles; repeated opening of a library should return the cached handle. > - Increase handle-local ref counter on open, Decrease it on close > - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). > > This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: No need for malloc ------------- Changes: - all: https://git.openjdk.org/jdk/pull/16920/files - new: https://git.openjdk.org/jdk/pull/16920/files/dc2ea51b..acf306d4 Webrevs: - full: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=10 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16920&range=09-10 Stats: 28 lines in 1 file changed: 3 ins; 14 del; 11 mod Patch: https://git.openjdk.org/jdk/pull/16920.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/16920/head:pull/16920 PR: https://git.openjdk.org/jdk/pull/16920 From stuefe at openjdk.org Fri Dec 22 15:57:06 2023 From: stuefe at openjdk.org (Thomas Stuefe) Date: Fri, 22 Dec 2023 15:57:06 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v10] In-Reply-To: References: Message-ID: On Fri, 22 Dec 2023 14:50:16 GMT, Joachim Kern wrote: >> On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. >> >> This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). 
However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). >> >> We propose a different, cleaner way of handling this: >> >> - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. >> - Cache dl handles; repeated opening of a library should return the cached handle. >> - Increase handle-local ref counter on open, Decrease it on close >> - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). >> >> This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > additional fix of sideeffect reported in JDK-8322691 src/hotspot/os/aix/porting_aix.cpp line 1071: > 1069: if (max_handletable == 0) { > 1070: // First time we allocate memory for 128 Entries > 1071: char* ptmp = (char*)::malloc(128 * sizeof(struct handletableentry)); No need for malloc. You can start with realloc, since realloc(NULL, ...) is malloc. static handletablentry* tab = nullptr; static unsigned max_handles = 0; ... 
if (need more handles) {
  unsigned new_max = MAX2(max_handles * 2, init_num_handles);
  handletablentry* new_tab = (handletablentry*) ::realloc(tab, sizeof(handletablentry) * new_max);
  if (new_tab != nullptr) {
    max_handles = new_max;
    tab = new_tab;
  }
}

------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1435139923

From daniel.daugherty at oracle.com Fri Dec 22 20:37:54 2023
From: daniel.daugherty at oracle.com (daniel.daugherty at oracle.com)
Date: Fri, 22 Dec 2023 15:37:54 -0500
Subject: Too many open files problem on MacOS 14.1
In-Reply-To: References: <6A33F43E-7121-475C-9DB2-4081D27AD6DF@wagenknecht.org>
Message-ID: <821ab42c-6874-4dea-917b-461074945cb4@oracle.com>

On 12/13/23 4:24 AM, Andrew Haley wrote:
> On 12/13/23 06:09, Gunnar Wagenknecht wrote:
>> Greetings,
>>
>> I'm reaching out because of an issue with the JVM on MacOS that is
>> hitting us at large scale. It started in MacOS 13 but got really worse
>> in 14.1. We basically now need to use -XX:-MaxFDLimit on MacOS for
>> everything with a classpath of more than 10k jars (monolith). That
>> applies to the Java app itself as well as any Java based IDE
>> (Eclipse, IntelliJ) or build tool.
>>
>> The reason is that the MaxFDLimit implementation on Mac is broken in
>> the JVM. The JVM is applying a lower limit to itself. We discovered
>> the -XX:-MaxFDLimit solution after our old workarounds of increasing
>> the open files on MacOS stopped working.
>
> Have a look at https://bugs.openjdk.org/browse/JDK-8291060, and the problem
> discussed there.
>
> Gerard Ziemski will read this, and I expect he'd like to comment.
>

This fix caused problems during testing:

    JDK-8291060 OPEN_MAX is no longer the max limit on macOS >= 10.6 for RLIMIT_NOFILE

and was backed out using:

    JDK-8300055 [BACKOUT] OPEN_MAX is no longer the max limit on macOS >= 10.6 for RLIMIT_NOFILE

I don't see the traditional [REDO] bug, but I do see this one:
    JDK-8300088 [IMPROVE] OPEN_MAX is no longer the max limit on macOS >= 10.6 for RLIMIT_NOFILE

So I suspect it would be best to add comments to JDK-8300088 instead of JDK-8291060.

Dan

From kim.barrett at oracle.com Fri Dec 22 21:32:53 2023
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 22 Dec 2023 21:32:53 +0000
Subject: Use of C++ dynamic global object initialization with thread guards
In-Reply-To: References: <87fs0izasf.fsf@oldenburg.str.redhat.com> <514F5A61-E3C2-4B24-A567-EF19C4292989@oracle.com> <87lea7d2o7.fsf@oldenburg.str.redhat.com>
Message-ID:

> On Dec 19, 2023, at 12:23 PM, Kim Barrett wrote:
>
>> On Dec 6, 2023, at 5:51 AM, Florian Weimer wrote:
>>
>> * Kim Barrett:
>>
>>>> The implementation of __cxa_guard_acquire is not entirely trivial
>>>> because it detects recursive initialization and throws
>>>> __gnu_cxx::recursive_init_error, which means that it pulls in the C++
>>>> unwinder (at least with a traditional GNU/Linux build of libstdc++.a).
>>>
>>> Does it? Seems like it shouldn't. We build with -fno-exceptions, and
>>> the definition of throw_recursive_init_exception is conditionalized on
>>> __cpp_exceptions, only throwing when that macro is defined. It calls
>>> __builtin_trap() if that macro isn't defined.
>>
>> With upstream GCC (and presumably most distributions), there's one
>> libstdc++.a with one implementation of __cxa_guard_acquire, and it's
>> built with exception support.
>>
>> It's supposed to be possible to build libstdc++ without exception
>> support, but upstream GCC doesn't do this automatically for you if the
>> target supports exception handling. In principle, the GCC specs
>> mechanism allows you to treat -fno-exceptions as a linker flag and link
>> against a custom no-exceptions build of libstdc++.a.
>>
>> Maybe this is what your toolchain is doing if you don't see the unwinder
>> symbols in your builds?
It should be easy enough to check if you have a >> build with a symbol table: look for a call in __cxa_throw in the >> disassembly of __cxa_guard_acquire.cold or __cxa_guard_acquire. One of >> our builds looks like this: > > I've verified that the same is happening in Oracle builds. We don't build an > exception-disabled libstdc++ as part of our devkit either. > > So my next question is, exactly what is the harm, and how serious is it? So > far, I don't know of anyone noticing a problem arising from this. > > Obviously, if someone writes an initializer that can lead to recursive entry, > that would lead to an attempt to throw an exception. That's likely to have > pretty bad consequences. OTOH, this doesn't seem like a problem we have in > practice. I'm not sure I've ever seen such a problem arise (not just in > HotSpot); after all, it's UB to do so. Empirically, a recursive initialization attempt doesn't make any attempt to throw. Rather, it blocks forever waiting for a futex signal from a thread that succeeds in the initialization. Which of course will never come. And that makes sense, now that I've looked at the code. In __cxa_guard_acquire, with _GLIBCXX_USE_FUTEX, if the guard indicates initialization hasn't yet been completed, then it goes into a while loop. This while loop tries to claim initialization. Failing that, it checks whether initialization is complete. Failing that, it does a SYS_futex syscall, waiting for some other thread to perform the initialization. There's nothing there to check for recursion. throw_recursive_init_exception is only called if single-threaded (either by configuration or at runtime). -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From kim.barrett at oracle.com Fri Dec 22 23:22:24 2023 From: kim.barrett at oracle.com (Kim Barrett) Date: Fri, 22 Dec 2023 23:22:24 +0000 Subject: RFR: 8322476: Remove GrowableArray C-Heap version, replace usages with GrowableArrayCHeap In-Reply-To: References: <7BF6OZ3vRH791MKbVeJqQ5foScHax_gLMFjSKkm3J68=.f29e5b1c-0751-4257-b253-261c6e20a7b9@github.com> <-cpYKTYXVhVfplPdXUKYt0ZcT-ql497vy0UZRrGREtk=.8fd5f503-ae50-4e8e-a5d8-65824465e5fd@github.com> Message-ID: <300914D8-7893-4859-A32F-3250CF7C3C84@oracle.com> [Kim Barret wrote:] >>> pre-existing: There are a lot of non-static class data members that are pointers to >>> GrowableArray that seem like they would be better as direct, e.g. non-pointers. >>> >>> pre-existing: There are a lot of iterations over GrowableArray's that would be >>> simplified by using range-based-for. >>> >>> I'm not a fan of the additional clutter in APIs that the static memory types add. >>> If we had a variant of GrowableArrayCHeap that was not itself dynamically allocatable >>> and took a memory type to use internally as a constructor argument, then I think a >>> lot of that clutter could be eliminated. It could be used for ordinary data members >>> that are direct GAs rather than pointers to GAs. I think there is a way to do something >>> similar for static data members that are pointers that are dynamically allocated later, >>> though that probably requires more work. >>> >> >> On Fri, 22 Dec 2023 13:22:19 GMT, Stefan Karlsson wrote: >> FWIW, I added the GrowableArrayCHeap and the static memory type. I did that because there was a perceived need to minimize the memory usage, because we were going to use an extreme amount of these arrays for one of our subsystems in ZGC. It later turned out that we really didn't need to squeeze out the last bit of memory for that use-case. 
I would really like to get rid of the static memory type from GrowableArrayCHeap, and just add it as an instance member.

Currently the basic overhead for a static-MEMTYPE GA is 16 bytes on 64-bit platforms (data pointer, int length and capacity). Adding the memory type adds 8 bytes (due to alignment), so a 50% increase. Though it may not matter for dynamically allocated GAs, since std::max_align_t is probably at least 32 bytes. Though I think we seriously overuse dynamic allocation for GAs.

Does it really matter? I don't know. StringDedup uses a lot of GAs. (Though it could use 1/2 as many with some work, since they are used in pairs that always have the same length and capacity. At the time I was working on it I was feeling lazy and just used pairs of GAs, with the intention of coming back to that at some point if it proved to be a significant problem.)

> On Dec 22, 2023, at 8:33 AM, Emanuel Peter wrote:
> @stefank Maybe it would then make sense to do that before this change here? Because we will really have to touch all these changes here again for that.
>
> @kimbarrett @stefank @jdksjolen
> Or alternatively, we just say that we want to keep the C-Heap functionality inside `GrowableArray`, and simply remove `GrowableArrayCHeap`? Though I honestly would prefer a world where every allocation strategy has its own `GrowableArray*` variant, so that it is clear statically that they cannot be assigned to each other. So my preference would even be to have these:
>
> CHeapGrowableArray
> ArenaGrowableArray
> ResourceAreaGrowableArray
>
> What do you think?

I think having distinct CHeap, Resource, and Arena allocated types would be an improvement. If we were using std::vector with associated allocators we'd have something like that. (The allocator type is a template parameter for std::vector.) Refactoring GA in that direction is certainly possible, and might be an improvement.
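As an aside on the size arithmetic earlier in this message, the "16 bytes, plus 8 for an instance memory type" figure can be checked with a small layout sketch. The struct and enum names below are hypothetical and only mirror the fields mentioned (data pointer, int length, int capacity); they are not HotSpot's actual class layout, and the numbers assume a typical 64-bit ABI with 8-byte pointers and 4-byte ints.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical one-byte memory tag, standing in for HotSpot's memory type.
enum class DemoMemTag : std::uint8_t { Other };

// Fields of an array whose memory type is a template parameter:
// nothing is stored per instance for the tag.
struct StaticTagFields {
  void* data;
  int   len;
  int   capacity;
};

// The same fields with the memory type stored in each instance.
struct InstanceTagFields {
  void*      data;
  int        len;
  int        capacity;
  DemoMemTag tag;  // one byte of payload, padded up to pointer alignment
};

// Extra bytes per array object caused by storing the tag in the instance.
constexpr std::size_t tag_overhead_bytes() {
  return sizeof(InstanceTagFields) - sizeof(StaticTagFields);
}
```

On such an ABI `sizeof(StaticTagFields)` comes out as 16 and `tag_overhead_bytes()` as 8, because the single tag byte forces padding up to the next multiple of the pointer alignment.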
We could also have two variants of CHeap GAs, one with the memtype being a constructor argument (and possibly different from the memtype used to dynamically allocate the GA, as is currently the case, although I think that feature is never used), and a more compact one with a static memtype. I did some work on HotSpot allocators for standard containers like std::vector a while ago that could be applied to GA. It includes support for dynamic and static memtypes, and shows that isn't hard to do. I thought about whether this PR should go ahead without some of that stuff in place. It touches a lot of places that would probably be touched again if we went in that sort of direction. OTOH, a lot of these same places would probably need to be touched repeatedly anyway, one way or another. For example, dealing with what appears to be a large amount of unnecessary dynamic allocation of GAs would also touch a lot of these places, and I'm suspicious of combining the type changes and the allocation changes in one PR. > And how important is it that we do not use `virtual` with `GrowableArray`? Because the implementation and testing is much nastier without it. Is the V-table still such a overhead that we need to avoid it? I don't think there is any need for virtual functions in GA. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: Message signed with OpenPGP URL: From gunnar at wagenknecht.org Sat Dec 23 14:56:58 2023 From: gunnar at wagenknecht.org (Gunnar Wagenknecht) Date: Sat, 23 Dec 2023 15:56:58 +0100 Subject: Too many open files problem on MacOS 14.1 In-Reply-To: <821ab42c-6874-4dea-917b-461074945cb4@oracle.com> References: <6A33F43E-7121-475C-9DB2-4081D27AD6DF@wagenknecht.org> <821ab42c-6874-4dea-917b-461074945cb4@oracle.com> Message-ID: <81718069-188A-48F1-85D5-CA3DB10019EA@wagenknecht.org> > On Dec 22, 2023, at 21:37, daniel.daugherty at oracle.com wrote: > I don't see the traditional [REDO] bug, but I do see this one: > > JDK-8300088 [IMPROVE] OPEN_MAX is no longer the max limit on macOS >= 10.6 for RLIMIT_NOFILE > > So I suspect it would be best to add comments to JDK-8300088 instead of JDK-8291060. Thanks Dan! Is it possible someone can add my email as comment there? I don't have an OpenJDK account. -- Gunnar Wagenknecht gunnar at wagenknecht.org, http://guw.io/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From forax at univ-mlv.fr Sat Dec 23 15:03:35 2023
From: forax at univ-mlv.fr (Remi Forax)
Date: Sat, 23 Dec 2023 16:03:35 +0100 (CET)
Subject: Too many open files problem on MacOS 14.1
In-Reply-To: <81718069-188A-48F1-85D5-CA3DB10019EA@wagenknecht.org>
References: <6A33F43E-7121-475C-9DB2-4081D27AD6DF@wagenknecht.org> <821ab42c-6874-4dea-917b-461074945cb4@oracle.com> <81718069-188A-48F1-85D5-CA3DB10019EA@wagenknecht.org>
Message-ID: <1298646367.89620160.1703343815593.JavaMail.zimbra@univ-eiffel.fr>

> From: "Gunnar Wagenknecht"
> To: "daniel daugherty", "Gerard Ziemski", "Andrew Haley"
> Cc: "hotspot-dev"
> Sent: Saturday, December 23, 2023 3:56:58 PM
> Subject: Re: Too many open files problem on MacOS 14.1

>> On Dec 22, 2023, at 21:37, daniel.daugherty at oracle.com wrote:
>> I don't see the traditional [REDO] bug, but I do see this one:
>> JDK-8300088 [IMPROVE] OPEN_MAX is no longer the max limit on macOS >= 10.6 for
>> RLIMIT_NOFILE
>> So I suspect it would be best to add comments to JDK-8300088 instead of
>> JDK-8291060.

> Thanks Dan! Is it possible someone can add my email as comment there? I don't
> have an OpenJDK account.

We can't. Only people with an OpenJDK account can watch a bug :(

> --
> Gunnar Wagenknecht
> gunnar at wagenknecht.org, http://guw.io/

regards,
Rémi

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From daniel.daugherty at oracle.com Sat Dec 23 15:08:28 2023
From: daniel.daugherty at oracle.com (daniel.daugherty at oracle.com)
Date: Sat, 23 Dec 2023 10:08:28 -0500
Subject: [External] : Re: Too many open files problem on MacOS 14.1
In-Reply-To: <81718069-188A-48F1-85D5-CA3DB10019EA@wagenknecht.org>
References: <6A33F43E-7121-475C-9DB2-4081D27AD6DF@wagenknecht.org> <821ab42c-6874-4dea-917b-461074945cb4@oracle.com> <81718069-188A-48F1-85D5-CA3DB10019EA@wagenknecht.org>
Message-ID: <6f1f1ab9-57cb-40da-ba8d-4e958724bc6c@oracle.com>

Done.
It looks like I cannot add your email as a watcher on this bug because you do not have an OpenJDK account. Dan On 12/23/23 9:56 AM, Gunnar Wagenknecht wrote: > >> On Dec 22, 2023, at 21:37, daniel.daugherty at oracle.com wrote: >> I don't see the traditional [REDO] bug, but I do see this one: >> >> JDK-8300088 [IMPROVE] OPEN_MAX is no longer the max limit on macOS >= >> 10.6 for RLIMIT_NOFILE >> >> So I suspect it would be best to add comments to JDK-8300088 instead >> of JDK-8291060. > > > Thanks Dan! Is it possible someone can add my email as comment there? > I don't have an OpenJDK account. > > -- > Gunnar Wagenknecht > gunnar at wagenknecht.org, http://guw.io/ > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mdoerr at openjdk.org Wed Dec 27 10:52:59 2023 From: mdoerr at openjdk.org (Martin Doerr) Date: Wed, 27 Dec 2023 10:52:59 GMT Subject: RFR: JDK-8320890: [AIX] Find a better way to mimic dl handle equality [v11] In-Reply-To: <5LVpPG0ADhvALrcHghtZ95N1-SYcgrnrHl704svJStY=.e498d17a-3534-4ad9-8a74-004b40f7487a@github.com> References: <5LVpPG0ADhvALrcHghtZ95N1-SYcgrnrHl704svJStY=.e498d17a-3534-4ad9-8a74-004b40f7487a@github.com> Message-ID: On Fri, 22 Dec 2023 15:57:05 GMT, Joachim Kern wrote: >> On AIX, repeated calls to dlopen referring to the same shared library may result in different, unique dl handles to be returned from libc. In that it differs from typical libc implementations that cache dl handles. >> >> This causes problems in the JVM with code that assumes equality of handles. One such problem is in the JVMTI agent handler. That problem was fixed with a local fix to said handler ([JDK-8315706](https://bugs.openjdk.org/browse/JDK-8315706)). However, this fix causes follow-up problems since it assumes that the file name passed to `os::dll_load()` is the file that has been opened. It prevents efficient, os_aix.cpp-local workarounds for other AIX issues like the *.so/*.a duality. 
See [JDK-8320005](https://bugs.openjdk.org/browse/JDK-8320005). As such, it is a hack that causes other, more uglier hacks to follow (see discussion of https://github.com/openjdk/jdk/pull/16604). >> >> We propose a different, cleaner way of handling this: >> >> - Handle this entirely inside the AIX versions of os::dll_load and os::dll_unload. >> - Cache dl handles; repeated opening of a library should return the cached handle. >> - Increase handle-local ref counter on open, Decrease it on close >> - Make sure calls to os::dll_load are matched to os::dll_unload (See [JDK-8320830](https://bugs.openjdk.org/browse/JDK-8320830)). >> >> This way we mimic dl handle equality as it is implemented on other platforms, and this works for all callers of os::dll_load. > > Joachim Kern has updated the pull request incrementally with one additional commit since the last revision: > > No need for malloc src/hotspot/os/aix/porting_aix.cpp line 975: > 973: return false; > 974: > 975: char* path2 = os::strdup (path); Whitespace between `os::strdup` and `(path)`. src/hotspot/os/aix/porting_aix.cpp line 1019: > 1017: } > 1018: > 1019: char* libpath = os::strdup (Libpath.base()); Whitespace! ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1436947479 PR Review Comment: https://git.openjdk.org/jdk/pull/16920#discussion_r1436947817 From apangin at openjdk.org Thu Dec 28 00:25:48 2023 From: apangin at openjdk.org (Andrei Pangin) Date: Thu, 28 Dec 2023 00:25:48 GMT Subject: RFR: 8309271: A way to align already compiled methods with compiler directives [v19] In-Reply-To: References: Message-ID: On Fri, 22 Dec 2023 09:33:06 GMT, Dmitry Chuyko wrote: >> Compiler Control (https://openjdk.org/jeps/165) provides method-context dependent control of the JVM compilers (C1 and C2). The active directive stack is built from the directive files passed with the `-XX:CompilerDirectivesFile` diagnostic command-line option and the Compiler.add_directives diagnostic command. 
It is also possible to clear all directives or remove the top from the stack. >> >> A matching directive will be applied at method compilation time when such compilation is started. If directives are added or changed, but compilation does not start, then the state of compiled methods doesn't correspond to the rules. This is not an error, and it happens in long running applications when directives are added or removed after compilation of methods that could be matched. For example, the user decides that C2 compilation needs to be disabled for some method due to a compiler bug, issues such a directive but this does not affect the application behavior. In such case, the target application needs to be restarted, and such an operation can have high costs and risks. Another goal is testing/debugging compilers. >> >> It would be convenient to optionally reconcile at least existing matching nmethods to the current stack of compiler directives (so bypass inlined methods). >> >> Natural way to eliminate the discrepancy between the result of compilation and the broken rule is to discard the compilation result, i.e. deoptimization. Prior to that we can try to re-compile the method letting compile broker to perform it taking new directives stack into account. Re-compilation helps to prevent hot methods from execution in the interpreter. >> >> A new flag `-r` has beed introduced for some directives related to compile commands: `Compiler.add_directives`, `Compiler.remove_directives`, `Compiler.clear_directives`. The default behavior has not changed (no flag). If the new flag is present, the command scans already compiled methods and puts methods that have any active non-default matching compiler directives to re-compilation if possible, otherwise marks them for deoptimization. There is currently no distinction which directives are found. In particular, this means that if there are rules for inlining into some method, it will be refreshed. 
On the other hand, if there are rules for a method and it was inlined, top-level methods won't be refreshed, but this can be achieved by having rules for them. >> >> In addition, a new diagnostic command `Compiler.replace_directives... > > Dmitry Chuyko has updated the pull request incrementally with one additional commit since the last revision: > > Deopt osr, cleanups The logic looks good to me now, thanks. ------------- Marked as reviewed by apangin (no project role). PR Review: https://git.openjdk.org/jdk/pull/14111#pullrequestreview-1797576900 From duke at openjdk.org Thu Dec 28 09:26:19 2023 From: duke at openjdk.org (Liming Liu) Date: Thu, 28 Dec 2023 09:26:19 GMT Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v21] In-Reply-To: References: Message-ID: <8zahBxsT8WsccyfYZvONLJvPcfTp66XN4_8TN8D_Z9o=.9715d373-611c-4e23-bcb8-21fb6d06977e@github.com> > As described at [JDK-8315923](https://bugs.openjdk.org/browse/JDK-8315923), this patch uses madvise with MADV_POPULATE_WRITE to pretouch memory when supported (since kernel 5.14). > > Ran the newly added jtreg test on 64c Neoverse-N1 machines with kernel 4.18, 5.13 and 6.1, and observed that transparent huge pages formed right after pretouch on kernel 6.1. Recorded the time spent on the test in *seconds* with `VERBOSE=time` as the table below, and got that the patch takes improvements when the system call is supported, while does not hurt if not supported: > > > > > > > > > > > >
> Kernel   -XX:-TransparentHugePages   -XX:+TransparentHugePages
>          Unpatched   Patched         Unpatched   Patched
> 4.18     11.30       11.30           0.25        0.25
> 5.13     0.22        0.22            3.42        3.42
> 6.1      0.27        0.33            3.54        0.33
Liming Liu has updated the pull request incrementally with one additional commit since the last revision:

  Use pthread instead

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/15781/files
  - new: https://git.openjdk.org/jdk/pull/15781/files/ae9f6f3a..f974a393

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=20
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=15781&range=19-20

  Stats: 25 lines in 1 file changed: 6 ins; 7 del; 12 mod
  Patch: https://git.openjdk.org/jdk/pull/15781.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/15781/head:pull/15781

PR: https://git.openjdk.org/jdk/pull/15781

From kbarrett at openjdk.org Thu Dec 28 20:20:01 2023
From: kbarrett at openjdk.org (Kim Barrett)
Date: Thu, 28 Dec 2023 20:20:01 GMT
Subject: RFR: 8315923: pretouch_memory by atomic-add-0 fragments huge pages unexpectedly [v17]
In-Reply-To: <-yGcrNxBa91rrdyLb4zNbgz_VRuht7MXBpnel_-WWxg=.6eec01fb-03e7-42d4-b07c-d5617f34bdc2@github.com>
References: <-yGcrNxBa91rrdyLb4zNbgz_VRuht7MXBpnel_-WWxg=.6eec01fb-03e7-42d4-b07c-d5617f34bdc2@github.com>
Message-ID:

On Thu, 14 Dec 2023 11:14:00 GMT, Johan Sjölen wrote:

>> Liming Liu has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Replace to char* when type casting
>
> test/hotspot/gtest/runtime/test_os_linux.cpp line 377:
>
>> 375: EXPECT_TRUE(os::release_memory(heap, 1 * G));
>> 376: UseTransparentHugePages = useThp;
>> 377: }
>
> This seems like it's concurrently running `madvise(..., MADV_POPULATE_WRITE)`, correct? This is not what I meant.
>
> What I meant was having at least 2 threads, where one thread is running `os::pretouch_memory` and another using the memory for something. For example, 1 thread pretouching, the other thread filling out the memory with an incrementing integer array `[0,1,2,3,4,...]`. I think this is what Kim meant also, or am I the one misunderstanding him?
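
The two-thread scenario in the quoted comment can be sketched as a rough Java analogue. This is only an illustration under stated assumptions: `AtomicIntegerArray` stands in for the reserved memory, `getAndAdd(i, 0)` stands in for the atomic-add-0 touch, and the names `ConcurrentPretouchSketch`, `PAGE_INTS`, `pretouch`, `fill` and `runOnce` are hypothetical. It does not call `os::pretouch_memory` or `madvise`, so it says nothing about huge-page behaviour; it only shows why adding zero atomically cannot clobber a concurrent writer.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

public class ConcurrentPretouchSketch {
    // Hypothetical stand-in for a page size: one touch per PAGE_INTS slots.
    static final int PAGE_INTS = 1024;

    // Pretouch analogue: atomically add 0 to one slot per "page".
    // Adding 0 leaves whatever value is there, old or freshly written.
    static void pretouch(AtomicIntegerArray mem) {
        for (int i = 0; i < mem.length(); i += PAGE_INTS) {
            mem.getAndAdd(i, 0);
        }
    }

    // User analogue: fill the memory with the incrementing sequence 0, 1, 2, ...
    static void fill(AtomicIntegerArray mem) {
        for (int i = 0; i < mem.length(); i++) {
            mem.set(i, i);
        }
    }

    // Runs both threads concurrently and returns the number of wrong slots.
    static int runOnce(int size) {
        AtomicIntegerArray mem = new AtomicIntegerArray(size);
        Thread toucher = new Thread(() -> pretouch(mem));
        Thread user = new Thread(() -> fill(mem));
        toucher.start();
        user.start();
        try {
            toucher.join();
            user.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        int mismatches = 0;
        for (int i = 0; i < size; i++) {
            if (mem.get(i) != i) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches = " + runOnce(1 << 20));
    }
}
```

Whichever thread reaches a slot first, the final value is the one the filling thread wrote, so the mismatch count stays at zero. A non-atomic read followed by a write-back of the same value would not give that guarantee.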
[Sorry, I lost track of this and didn't respond to the earlier comment from @jdksjolen.]

Yes, that's correct. The reason for adding the safe-for-concurrent-use pretouch mechanism was https://bugs.openjdk.org/browse/JDK-8260332. The idea is that presently, when a thread needs to expand the oldgen, it pretouches while holding the expansion lock. Any other threads that also need the oldgen to be expanded have to wait until the holder of that lock completes. Most of the work involved in expansion is quick and short, but not so much for pretouching. So it was found that we're sometimes blocking a bunch of threads for a long-ish time.

The original proposal there was to allow the otherwise waiting threads to cooperate in the pretouch. But the protocol involved was complicated and messy. A simpler approach was suggested: allow other threads to use the newly expanded memory concurrently with the expanding thread doing the pretouch. There's obviously some racing there, with the using threads possibly touching pages before the pretouching reaches them, but the thinking is that the pretouched wave-front will likely surge ahead of the using threads. And if not, then the using threads are effectively cooperating in the "pretouch".

That approach needed https://bugs.openjdk.org/browse/JDK-8272807 as a building block. But I discovered there were a bunch of places with similar problems, suggesting the need for some more general mechanism. I did a bit of prototyping in that direction, but got distracted by other work and haven't gotten back to it. (The idea is to record needed pretouching, deferring it up the call chain, to a point where other threads are not being blocked waiting for the expansion operation. A complicating factor is that some of those places may have multiple distinct memory ranges being allocated and needing pretouch, all within the same expansion operation.)

But that approach may interact poorly with the madvise approach.
It might be that the madvise _should_ be done down inside the expansion operation where the pretouches currently happen, rather than being deferred up the call chain and permitting the madvise to be concurrent with using threads that might introduce the same "shredding" problem the madvise is attempting to fix. That would be yet another complicating factor that my prototyping didn't address at all.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15781#discussion_r1437864972

From omikhaltcova at openjdk.org Thu Dec 28 23:01:51 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Thu, 28 Dec 2023 23:01:51 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v12]
In-Reply-To:
References: <7sdld7t1oTEM_k08EV0expeUu3xKNhaYn9AqISytRSY=.7691d9ae-9edd-42d6-ba43-3f875aa25682@github.com>
Message-ID:

On Fri, 22 Dec 2023 09:35:08 GMT, Hamlin Li wrote:

>> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Moved the code up + comments
>
> For normal cases, I guess `RUP` of riscv will work; but for some corner cases, we need the trick of `+0.5`, am I right?
> But all this information is just mentioned with `some inputs produce incorrect results`, which is unclear for potential readers and maintainers in the future.
> So, in the comments, can you add some information about this corner case, a simple example will definitely help here.

@Hamlin-Li No, that's wrong: RUP of riscv won't work, it'll give incorrect results.
Look at the rounding of some values below for example:

JAVA Math.round: src = 0.345200 dst = 0
JAVA Math.round: src = -0.555550 dst = -1
JAVA Math.round: src = -1.500000 dst = -1
JAVA Math.round: src = 1.500000 dst = 2
JAVA Math.round: src = -1.345000 dst = -1
JAVA Math.round: src = -1.450000 dst = -1
JAVA Math.round: src = -0.444460 dst = 0
JAVA Math.round: src = -0.999990 dst = -1
JAVA Math.round: src = 0.999999 dst = 1
JAVA Math.round: src = 0.000000 dst = 0
JAVA Math.round: src = 0.001000 dst = 0
JAVA Math.round: src = -0.001000 dst = 0

FCVT.W.S RUP: src = 0.345200 dst = 1
FCVT.W.S RUP: src = -0.555550 dst = 0
FCVT.W.S RUP: src = -1.500000 dst = -1
FCVT.W.S RUP: src = 1.500000 dst = 2
FCVT.W.S RUP: src = -1.345000 dst = -1
FCVT.W.S RUP: src = -1.450000 dst = -1
FCVT.W.S RUP: src = -0.444460 dst = 0
FCVT.W.S RUP: src = -0.999990 dst = 0
FCVT.W.S RUP: src = 0.999990 dst = 1
FCVT.W.S RUP: src = 0.000000 dst = 0
FCVT.W.S RUP: src = 0.001000 dst = 1
FCVT.W.S RUP: src = -0.001000 dst = 0

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1871612334

From omikhaltcova at openjdk.org Thu Dec 28 23:13:51 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Thu, 28 Dec 2023 23:13:51 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v12]
In-Reply-To: <7sdld7t1oTEM_k08EV0expeUu3xKNhaYn9AqISytRSY=.7691d9ae-9edd-42d6-ba43-3f875aa25682@github.com>
References: <7sdld7t1oTEM_k08EV0expeUu3xKNhaYn9AqISytRSY=.7691d9ae-9edd-42d6-ba43-3f875aa25682@github.com>
Message-ID:

On Thu, 21 Dec 2023 23:02:55 GMT, Olga Mikhaltsova wrote:

>> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform.
>>
>> In the table below it is shown that NaN argument should be processed as a special case.
>>
>>                                           RISC-V                     Java
>>                                           (FCVT.W.S)   (FCVT.L.D)    (long round(double a))   (int round(float a))
>> Minimum valid input (after rounding)      -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
>> Maximum valid input (after rounding)      2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
>> Output for out-of-range negative input    -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
>> Output for -∞                             -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
>> Output for out-of-range positive input    2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
>> Output for +∞                             2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
>> Output for NaN                            2^31 - 1     2^63 - 1      0                        0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555   0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760   0.103  ops/ms
>>
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956   0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947   0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
>   Moved the code up + comments

fadd_s requires setting the explicit rounding mode RDN (round down towards -∞) because adding 0.5f to some floats exceeds the precision limits for a float and therefore rounding takes place.
This leads to the incorrect results in case of the default rounding mode RNE (round to nearest, ties to even) for some inputs:

error: src = 8388609.000000 dst = 8388610 etalon = 8388609
error: src = 8388611.000000 dst = 8388612 etalon = 8388611
error: src = 8388613.000000 dst = 8388614 etalon = 8388613
error: src = 8388615.000000 dst = 8388616 etalon = 8388615
error: src = 8388617.000000 dst = 8388618 etalon = 8388617
error: src = 8388619.000000 dst = 8388620 etalon = 8388619
error: src = 8388621.000000 dst = 8388622 etalon = 8388621
error: src = 8388623.000000 dst = 8388624 etalon = 8388623
error: src = 8388625.000000 dst = 8388626 etalon = 8388625
error: src = 8388627.000000 dst = 8388628 etalon = 8388627
error: src = 8388629.000000 dst = 8388630 etalon = 8388629
error: src = 8388631.000000 dst = 8388632 etalon = 8388631
error: src = 8388633.000000 dst = 8388634 etalon = 8388633
error: src = 8388635.000000 dst = 8388636 etalon = 8388635
error: src = 8388637.000000 dst = 8388638 etalon = 8388637
error: src = 8388639.000000 dst = 8388640 etalon = 8388639
etc.
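
The first of these inputs can be reproduced without RISC-V hardware, as a minimal sketch. It assumes only that Java's `float` addition rounds to nearest with ties to even, so `src + 0.5f` behaves like an `fadd.s` under RNE, and `Math.floor` plays the role of the round-down conversion. `RoundHalfUpSketch` and `naiveRound` are hypothetical names; Java offers no way to select RDN for the addition, so only the failing variant is shown.

```java
public class RoundHalfUpSketch {
    // Naive rounding: add 0.5f (float addition rounds to nearest, ties to even),
    // then round the sum toward negative infinity.
    static int naiveRound(float x) {
        return (int) Math.floor(x + 0.5f);
    }

    public static void main(String[] args) {
        // 2^23 + 1: in this range adjacent floats are exactly 1.0 apart,
        // so src + 0.5 lies halfway between two representable values.
        float src = 8388609.0f;

        // The tie is resolved to the neighbour with the even significand.
        System.out.println(src + 0.5f == 8388610.0f); // prints true
        System.out.println(naiveRound(src));          // prints 8388610
        System.out.println(Math.round(src));          // prints 8388609
    }
}
```

Every odd integer between 2^23 and 2^24 hits the same tie, which matches the pattern of odd `src` values in the error list.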
Let's consider two of these inputs with RNE for fadd.s:

fadd.s rne (src + 0.5f): src = 8388609.000000 dst = 8388610.000000
fcvt.w.s rdn: src = 8388610.000000 dst = 8388610
RESULT: 8388610 (JAVA Math.round: 8388609)

fadd.s rne (src + 0.5f): src = 8388611.000000 dst = 8388612.000000
fcvt.w.s rdn: src = 8388612.000000 dst = 8388612
RESULT: 8388612 (JAVA Math.round: 8388611)

If RDN is set for fadd.s, then:

fadd.s rdn (src + 0.5f): src = 8388609.000000 dst = 8388609.000000
fcvt.w.s rdn: src = 8388609.000000 dst = 8388609
RESULT: 8388609 (JAVA Math.round: 8388609)

fadd.s rdn (src + 0.5f): src = 8388611.000000 dst = 8388611.000000
fcvt.w.s rdn: src = 8388611.000000 dst = 8388611
RESULT: 8388611 (JAVA Math.round: 8388611)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1871616845

From omikhaltcova at openjdk.org Thu Dec 28 23:34:42 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Thu, 28 Dec 2023 23:34:42 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v12]
In-Reply-To: <7sdld7t1oTEM_k08EV0expeUu3xKNhaYn9AqISytRSY=.7691d9ae-9edd-42d6-ba43-3f875aa25682@github.com>
References: <7sdld7t1oTEM_k08EV0expeUu3xKNhaYn9AqISytRSY=.7691d9ae-9edd-42d6-ba43-3f875aa25682@github.com>
Message-ID:

On Thu, 21 Dec 2023 23:02:55 GMT, Olga Mikhaltsova wrote:

>> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform.
>>
>> In the table below it is shown that NaN argument should be processed as a special case.
>>
>>                                           RISC-V                     Java
>>                                           (FCVT.W.S)   (FCVT.L.D)    (long round(double a))   (int round(float a))
>> Minimum valid input (after rounding)      -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
>> Maximum valid input (after rounding)      2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
>> Output for out-of-range negative input    -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
>> Output for -∞                             -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
>> Output for out-of-range positive input    2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
>> Output for +∞                             2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
>> Output for NaN                            2^31 - 1     2^63 - 1      0                        0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555   0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760   0.103  ops/ms
>>
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956   0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947   0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
>   Moved the code up + comments

In addition, some examples with RDN for fadd.s and fcvt.w.s:

fadd.s (src + 0.5f): src = 0.345200 dst = 0.845200
fcvt.w.s: src = 0.845200 dst = 0
RESULT: 0 (JAVA Math.round: 0)

fadd.s (src + 0.5f): src = -0.555550 dst = -0.055550
fcvt.w.s: src = -0.055550 dst = -1
RESULT: -1 (JAVA Math.round: -1)

fadd.s (src + 0.5f): src = -1.500000 dst = -1.000000
fcvt.w.s: src = -1.000000 dst = -1
RESULT: -1 (JAVA Math.round: -1)

fadd.s (src + 0.5f): src = 1.500000 dst = 2.000000
fcvt.w.s: src = 2.000000 dst = 2
RESULT: 2 (JAVA Math.round: 2)

fadd.s (src + 0.5f): src = -1.345000 dst = -0.845000
fcvt.w.s: src = -0.845000 dst = -1
RESULT: -1 (JAVA Math.round: -1)

fadd.s (src + 0.5f): src = -1.450000 dst = -0.950000
fcvt.w.s: src = -0.950000 dst = -1
RESULT: -1 (JAVA Math.round: -1)

fadd.s (src + 0.5f): src = -0.444460 dst = 0.055540
fcvt.w.s: src = 0.055540 dst = 0
RESULT: 0 (JAVA Math.round: 0)

fadd.s (src + 0.5f): src = -0.999990 dst = -0.499990
fcvt.w.s: src = -0.499990 dst = -1
RESULT: -1 (JAVA Math.round: -1)

fadd.s (src + 0.5f): src = 0.999990 dst = 1.499990
fcvt.w.s: src = 1.499990 dst = 1
RESULT: 1 (JAVA Math.round: 1)

fadd.s (src + 0.5f): src = 0.000000 dst = 0.500000
fcvt.w.s: src = 0.500000 dst = 0
RESULT: 0 (JAVA Math.round: 0)

fadd.s (src + 0.5f): src = 0.001000 dst = 0.501000
fcvt.w.s: src = 0.501000 dst = 0
RESULT: 0 (JAVA Math.round: 0)

fadd.s (src + 0.5f): src = -0.001000 dst = 0.499000
fcvt.w.s: src = 0.499000 dst = 0
RESULT: 0 (JAVA Math.round: 0)

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1871623924

From omikhaltcova at openjdk.org Fri Dec 29 00:12:14 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Fri, 29 Dec 2023 00:12:14 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v13]
In-Reply-To:
References:
Message-ID:

> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform.
>
> In the table below it is shown that NaN argument should be processed as a special case.
>
>                                           RISC-V                     Java
>                                           (FCVT.W.S)   (FCVT.L.D)    (long round(double a))   (int round(float a))
> Minimum valid input (after rounding)      -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
> Maximum valid input (after rounding)      2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
> Output for out-of-range negative input    -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
> Output for -∞                             -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
> Output for out-of-range positive input    2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
> Output for +∞                             2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
> Output for NaN                            2^31 - 1     2^63 - 1      0                        0
>
> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>
> **Before**
>
> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555   0.179  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760   0.103  ops/ms
>
>
> **After**
>
> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956   0.186  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947   0.122  ops/ms

Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:

  Comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/16382/files
  - new: https://git.openjdk.org/jdk/pull/16382/files/4f56afb7..0b5e73dc

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=12
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=11-12

  Stats: 12 lines in 1 file changed: 9 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/16382.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16382/head:pull/16382

PR: https://git.openjdk.org/jdk/pull/16382

From omikhaltcova at openjdk.org Fri Dec 29 00:18:54 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Fri, 29 Dec 2023 00:18:54 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v12]
In-Reply-To:
References: <7sdld7t1oTEM_k08EV0expeUu3xKNhaYn9AqISytRSY=.7691d9ae-9edd-42d6-ba43-3f875aa25682@github.com>
Message-ID:

On Fri, 22 Dec 2023 09:35:08 GMT, Hamlin Li wrote:

>> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Moved the code up + comments
>
> For normal cases, I guess `RUP` of riscv will work; but for some corner cases, we need the trick of `+0.5`, am I right?
> But all this information is just mentioned with `some inputs produce incorrect results`, which is unclear for potential readers and maintainers in the future. > So, in the comments, can you add some information about this corner case, a simple example will definitely help here. @Hamlin-Li @RealFYang Could you take a look once again, please, whether these comments are sufficient or something else is needed? ------------- PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1871639418 From kbarrett at openjdk.org Fri Dec 29 06:31:59 2023 From: kbarrett at openjdk.org (Kim Barrett) Date: Fri, 29 Dec 2023 06:31:59 GMT Subject: RFR: 8322765: Eliminate -Wparentheses warnings in runtime code Message-ID: Please review this change to eliminate some -Wparentheses warnings. This involved simply adding a few parentheses to make some implicit operator precedence explicit. Testing: mach5 tier1 Also ran mach5 tier1 with these changes in conjunction enabling -Wparentheses and other changes needed to make that work. ------------- Commit messages: - fix -Wparentheses warnings in runtime code Changes: https://git.openjdk.org/jdk/pull/17201/files Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=17201&range=00 Issue: https://bugs.openjdk.org/browse/JDK-8322765 Stats: 9 lines in 5 files changed: 0 ins; 0 del; 9 mod Patch: https://git.openjdk.org/jdk/pull/17201.diff Fetch: git fetch https://git.openjdk.org/jdk.git pull/17201/head:pull/17201 PR: https://git.openjdk.org/jdk/pull/17201 From mli at openjdk.org Fri Dec 29 13:26:52 2023 From: mli at openjdk.org (Hamlin Li) Date: Fri, 29 Dec 2023 13:26:52 GMT Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v13] In-Reply-To: References: Message-ID: <0pgEnkItCdvyME-U-cnoX_nFpnIwCHvUaLM2b_qB2e8=.554784fb-ea51-4158-b71d-339af0756a7d@github.com> On Fri, 29 Dec 2023 00:12:14 GMT, Olga Mikhaltsova wrote: >> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform. 
>>
>> In the table below it is shown that NaN argument should be processed as a special case.
>>
>>                                           RISC-V                     Java
>>                                           (FCVT.W.S)   (FCVT.L.D)    (long round(double a))   (int round(float a))
>> Minimum valid input (after rounding)      -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
>> Maximum valid input (after rounding)      2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
>> Output for out-of-range negative input    -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
>> Output for -∞                             -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
>> Output for out-of-range positive input    2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
>> Output for +∞                             2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
>> Output for NaN                            2^31 - 1     2^63 - 1      0                        0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555   0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760   0.103  ops/ms
>>
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956   0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947   0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
>   Comments

Thanks for updating.
In fact, per the Java API spec, the corner cases also include Integer/Long.MIN/MAX_VALUE.
Can you add some comments like below?

It also works for -2.1474836E9f, which corresponds to -2147483648 (Integer.MIN_VALUE), and for even smaller float values;
it also works for 2.1474836E9f, which corresponds to 2147483647 (Integer.MAX_VALUE), and for even greater float values;

BTW, a minor comment: I think you mean `java.lang.Math` or `j.l.Math` instead of `java.math`.

Otherwise it looks good to me.

-------------

Marked as reviewed by mli (Reviewer).
PR Review: https://git.openjdk.org/jdk/pull/16382#pullrequestreview-1798992551

From omikhaltcova at openjdk.org Fri Dec 29 15:37:26 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Fri, 29 Dec 2023 15:37:26 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v14]
In-Reply-To:
References:
Message-ID:

> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform.
>
> In the table below it is shown that NaN argument should be processed as a special case.
>
>                                           RISC-V                     Java
>                                           (FCVT.W.S)   (FCVT.L.D)    (long round(double a))   (int round(float a))
> Minimum valid input (after rounding)      -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
> Maximum valid input (after rounding)      2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
> Output for out-of-range negative input    -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
> Output for -∞                             -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
> Output for out-of-range positive input    2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
> Output for +∞                             2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
> Output for NaN                            2^31 - 1     2^63 - 1      0                        0
>
> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>
> **Before**
>
> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555   0.179  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760   0.103  ops/ms
>
>
> **After**
>
> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956   0.186  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947   0.122  ops/ms

Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:

  Replaced java.math.round with java.lang.Math.round in comments

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/16382/files
  - new: https://git.openjdk.org/jdk/pull/16382/files/0b5e73dc..ba6e21a8

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=13
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=16382&range=12-13

  Stats: 2 lines in 1 file changed: 0 ins; 0 del; 2 mod
  Patch: https://git.openjdk.org/jdk/pull/16382.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/16382/head:pull/16382

PR: https://git.openjdk.org/jdk/pull/16382

From omikhaltcova at openjdk.org Fri Dec 29 15:37:27 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Fri, 29 Dec 2023 15:37:27 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v12]
In-Reply-To:
References: <7sdld7t1oTEM_k08EV0expeUu3xKNhaYn9AqISytRSY=.7691d9ae-9edd-42d6-ba43-3f875aa25682@github.com>
Message-ID: <2N0m9C9R-EuSnQr4y3w5IOgwgFAhMosHdSkz8eSaohk=.6b2facc7-e929-460a-9963-1bb1c1086dd7@github.com>

On Fri, 22 Dec 2023 09:35:08 GMT, Hamlin Li wrote:

>> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Moved the code up + comments
>
> For normal cases, I guess `RUP` of riscv will work; but for some corner cases, we need the trick of `+0.5`, am I right?
> But all this information is just mentioned with `some inputs produce incorrect results`, which is unclear for potential readers and maintainers in the future.
> So, in the comments, can you add some information about this corner case, a simple example will definitely help here.

@Hamlin-Li thank you for reviewing!

You suggested writing comments similar to aarch64 https://github.com/openjdk/jdk/pull/16382#issuecomment-1866476680. IMHO `java.math.round` doesn't cause any confusion, but I've just replaced `java.math.round` with `java.lang.Math.round` in order to be more accurate.

Concerning comments about Integer.MIN_VALUE/Integer.MAX_VALUE, I don't think it's worth writing about because:
- no platform contains such comments;
- the other cases should be mentioned as well in this case, such as: +/-0, +/-subnormal numbers, signaling/quiet NaN, +/-inf;
- during this review the entire 32-bit range was tested against the current Java implementation and @RealFYang rechecked and confirmed it.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1872172421

From mli at openjdk.org Fri Dec 29 16:44:56 2023
From: mli at openjdk.org (Hamlin Li)
Date: Fri, 29 Dec 2023 16:44:56 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v14]
In-Reply-To:
References:
Message-ID:

On Fri, 29 Dec 2023 15:37:26 GMT, Olga Mikhaltsova wrote:

>> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform.
>>
>> In the table below it is shown that NaN argument should be processed as a special case.
>>
>>                                           RISC-V                     Java
>>                                           (FCVT.W.S)   (FCVT.L.D)    (long round(double a))   (int round(float a))
>> Minimum valid input (after rounding)      -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
>> Maximum valid input (after rounding)      2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
>> Output for out-of-range negative input    -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
>> Output for -∞                             -2^31        -2^63         Long.MIN_VALUE           Integer.MIN_VALUE
>> Output for out-of-range positive input    2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
>> Output for +∞                             2^31 - 1     2^63 - 1      Long.MAX_VALUE           Integer.MAX_VALUE
>> Output for NaN                            2^31 - 1     2^63 - 1      0                        0
>>
>> The benchmark running with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555   0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760   0.103  ops/ms
>>
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956   0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947   0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
>   Replaced java.math.round with java.lang.Math.round in comments

Just FYI, there is a `java.math` package.

For special cases,
1. We're implementing an intrinsic for a Java API whose Javadoc clearly defines which cases are special, so it is better to state clearly how we handle them.
2. For other special cases, unless RISC-V behaves differently from the Java spec, I don't think it's necessary to mention them either, but it does no harm if they are mentioned.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1872210852

From omikhaltcova at openjdk.org Fri Dec 29 18:27:45 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Fri, 29 Dec 2023 18:27:45 GMT
Subject: RFR: 8318158: RISC-V: implement roundD/roundF intrinsics [v14]
In-Reply-To:
References:
Message-ID:

On Fri, 29 Dec 2023 15:37:26 GMT, Olga Mikhaltsova wrote:

>> Please, review this Implementation of the roundD/roundF intrinsics for RISC-V platform.
>>
>> The table below shows that a NaN argument should be handled as a special case.
>>
>>                                         RISC-V                  Java
>>                                         (FCVT.W.S)  (FCVT.L.D)  (long round(double a))  (int round(float a))
>> Minimum valid input (after rounding)    -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
>> Maximum valid input (after rounding)    2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for out-of-range negative input  -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for -∞                           -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
>> Output for out-of-range positive input  2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for +∞                           2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
>> Output for NaN                          2^31 - 1    2^63 - 1    0                       0
>>
>> The benchmark run with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>>
>> **Before**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555 ± 0.179  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760 ± 0.103  ops/ms
>>
>> **After**
>>
>> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
>> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956 ± 0.186  ops/ms
>> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947 ± 0.122  ops/ms
>
> Olga Mikhaltsova has updated the pull request incrementally with one additional commit since the last revision:
>
>   Replaced java.math.round with java.lang.Math.round in comments

Thank you all very much for the review!
-------------

PR Comment: https://git.openjdk.org/jdk/pull/16382#issuecomment-1872262217

From omikhaltcova at openjdk.org Fri Dec 29 18:36:58 2023
From: omikhaltcova at openjdk.org (Olga Mikhaltsova)
Date: Fri, 29 Dec 2023 18:36:58 GMT
Subject: Integrated: 8318158: RISC-V: implement roundD/roundF intrinsics
In-Reply-To:
References:
Message-ID:

On Thu, 26 Oct 2023 17:20:49 GMT, Olga Mikhaltsova wrote:

> Please review this implementation of the roundD/roundF intrinsics for the RISC-V platform.
>
> The table below shows that a NaN argument should be handled as a special case.
>
>                                         RISC-V                  Java
>                                         (FCVT.W.S)  (FCVT.L.D)  (long round(double a))  (int round(float a))
> Minimum valid input (after rounding)    -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
> Maximum valid input (after rounding)    2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
> Output for out-of-range negative input  -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
> Output for -∞                           -2^31       -2^63       Long.MIN_VALUE          Integer.MIN_VALUE
> Output for out-of-range positive input  2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
> Output for +∞                           2^31 - 1    2^63 - 1    Long.MAX_VALUE          Integer.MAX_VALUE
> Output for NaN                          2^31 - 1    2^63 - 1    0                       0
>
> The benchmark run with the 2nd fixed implementation on the T-Head RVB-ICE board shows the following performance improvement:
>
> **Before**
>
> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
> FpRoundingBenchmark.test_round_double        2048  thrpt   15   59.555 ± 0.179  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15   49.760 ± 0.103  ops/ms
>
> **After**
>
> Benchmark                              (TESTSIZE)   Mode  Cnt    Score   Error   Units
> FpRoundingBenchmark.test_round_double        2048  thrpt   15  110.956 ± 0.186  ops/ms
> FpRoundingBenchmark.test_round_float         2048  thrpt   15  115.947 ± 0.122  ops/ms

This pull request has now been integrated.
Changeset: 19147f32
Author:    Olga Mikhaltsova
Committer: Vladimir Kempik
URL:       https://git.openjdk.org/jdk/commit/19147f326c6b0e78fe72f9a7e7100047f16a0921
Stats:     82 lines in 3 files changed: 82 ins; 0 del; 0 mod

8318158: RISC-V: implement roundD/roundF intrinsics

Co-authored-by: Vladimir Kempik
Reviewed-by: luhenry, fyang, mli

-------------

PR: https://git.openjdk.org/jdk/pull/16382

From vkempik at openjdk.org Sat Dec 30 18:00:47 2023
From: vkempik at openjdk.org (Vladimir Kempik)
Date: Sat, 30 Dec 2023 18:00:47 GMT
Subject: RFR: 8318227: RISC-V: C2 ConvHF2F [v3]
In-Reply-To:
References: <13Ot4D45ppGcgnXjlGP1xrYEcZ8LejbI5cxjRruUD4c=.4cd4ca6f-8e4f-4679-9706-59a86d867b6f@github.com>
Message-ID:

On Tue, 5 Dec 2023 11:12:50 GMT, Hamlin Li wrote:

> > Hi Hamlin, updated change looks good to me. Please wait a while for the kernel patch to land. Thanks.
>
> Sure, I will wait for the kernel patch to be merged. Thanks for reviewing!

But you were testing it on licheepi, which has an old 5.10 kernel, so perhaps Zfh autodetection can be added later.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/16802#issuecomment-1872574606
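
As a side note to the roundD/roundF review (8318158) earlier in this digest: the special cases in the Java columns of the table, and the kind of corner case behind the `+0.5` question, can be reproduced in plain Java. The sketch below is only an illustration under stated assumptions and is not code from the patch. The class name `RoundCornerCases` and the helper `naiveRound` are made up for this example, and `naiveRound` models a straightforward add-0.5-then-floor in float arithmetic with the default round-to-nearest-even, not whatever the integrated intrinsic does.

```java
public class RoundCornerCases {

    // Hypothetical helper (not from the patch): add 0.5 in float arithmetic,
    // using the default round-to-nearest-even, then take the floor.
    static int naiveRound(float a) {
        return (int) Math.floor(a + 0.5f);
    }

    public static void main(String[] args) {
        // Special cases from the Java columns of the table.
        System.out.println(Math.round(Float.NaN));                                     // 0
        System.out.println(Math.round(Double.NaN));                                    // 0
        System.out.println(Math.round(Float.POSITIVE_INFINITY) == Integer.MAX_VALUE);  // true
        System.out.println(Math.round(Double.NEGATIVE_INFINITY) == Long.MIN_VALUE);    // true

        // Largest float below 0.5: the exact sum a + 0.5 lies halfway between
        // two floats and is rounded to 1.0f, so the naive version is off by one.
        float belowHalf = Math.nextDown(0.5f);
        System.out.println(naiveRound(belowHalf));  // 1
        System.out.println(Math.round(belowHalf));  // 0

        // 2^23 + 1: floats of this magnitude are spaced 1 apart, so a + 0.5
        // is again a tie and is rounded to the even neighbour 8388610.
        float odd = 8388609f;
        System.out.println(naiveRound(odd));  // 8388610
        System.out.println(Math.round(odd));  // 8388609
    }
}
```

Both mismatches come from the float addition rounding before the floor is taken, which is one reason an intrinsic cannot simply add 0.5 under the default rounding mode and convert. How the integrated change avoids this is described in the patch itself, not in this thread.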